Commit Graph

1457 Commits

Author SHA1 Message Date
Linus Torvalds
a48fa7efaf drm fixes for 6.6-rc1
amdgpu:
 - Display replay fixes
 - Fixes for headless boards
 - Fix documentation breakage
 - RAS fixes
 - Handle newer IP discovery tables
 - SMU 13.0.6 fixes
 - SR-IOV fixes
 - Display vstartup fixes
 - NBIO 7.9 fixes
 - Display scaling mode fixes
 - Debugfs power reporting fix
 - GC 9.4.3 fixes
 - Dirty framebuffer fixes for fbcon
 - eDP fixes
 - DCN 3.1.5 fix
 - Display ODM fixes
 - GPU core dump fix
 - Re-enable zops property now that IGT test is fixed
 - Fix possible UAF in CS code
 - Cursor degamma fix
 
 amdkfd:
 - HMM fixes
 - Interrupt masking fix
 - GFX11 MQD fixes
 
 i915:
 - Mark requests for GuC virtual engines to avoid use-after-free
 
 nouveau:
 - Fix fence state in nouveau_fence_emit()
 
 ivpu:
 - replace strncpy
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmT6ihUACgkQDHTzWXnE
 hr6Vqg/+OGfbxx0qev5C93bLYpg8d4zbply0zTed2hU48zczARkOyDH+h2uYM4tP
 rmB6mK/0KRy3H8vkngKduR/IF6QBwzOnLDpS+C/TrHYQYqMDwvs3qEDVYqXh3V5H
 GPIFuu9sb2Nb/o9Fid70pvNACbgGsAUgqMUhQ/is3NjzOR8S/qjdBQ7wSdoUoOqx
 okTdlwuMK5SEYrihSyTZvhNcwpR/1L8JuxOUXXXUSQ0tRXBer/ZNF2lcEyYmQ0zs
 bZHKM4dNbdew/EhygbH6LVB5RjFaT5pGw08Xm8zJry+q5tXQV/NIXPQHL3vWqQoX
 i2QLbvGX/Uu8LJg9YNdsa1kPwNKADAxF64cW38Llv8ybsPHyva/I255j689/TvSG
 Se7HkTooURKS6GWFHPOkyMNC0Y+Fb/7WG5zUPSDVk9stqJz9pzx48okTsCWeuovD
 cBOssp8If1QsTyPvDq5A47l5z1oO3J1rdJ9fL0GnpOuXulPhgJGIoUSkftyI6lbw
 rhUoRd7w6VcgOsA9WIkAf+/325em0Y0AKZBgQnr2jfF56IE4iLa8yFuJNeReZ9oy
 W9yf14AB0orVm9+P4+PqATXBh+PdCTD1CPcB0MyK1SZAjh836Tc0HPKvJAUU6jp+
 8aw3BKXcaLjIP/dCyhSddXCnuuTPWreff5isktwgEXUtNFv9jx0=
 =fPeN
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2023-09-08' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Regular rounds of rc1 fixes, a large bunch for amdgpu since it's three
  weeks in one go, one i915, one nouveau and one ivpu.

  I think there might be a few more fixes in misc that I haven't pulled
  in yet, but we should get them all for rc2.

  amdgpu:
   - Display replay fixes
   - Fixes for headless boards
   - Fix documentation breakage
   - RAS fixes
   - Handle newer IP discovery tables
   - SMU 13.0.6 fixes
   - SR-IOV fixes
   - Display vstartup fixes
   - NBIO 7.9 fixes
   - Display scaling mode fixes
   - Debugfs power reporting fix
   - GC 9.4.3 fixes
   - Dirty framebuffer fixes for fbcon
   - eDP fixes
   - DCN 3.1.5 fix
   - Display ODM fixes
   - GPU core dump fix
   - Re-enable zops property now that IGT test is fixed
   - Fix possible UAF in CS code
   - Cursor degamma fix

  amdkfd:
   - HMM fixes
   - Interrupt masking fix
   - GFX11 MQD fixes

  i915:
   - Mark requests for GuC virtual engines to avoid use-after-free

  nouveau:
   - Fix fence state in nouveau_fence_emit()

  ivpu:
   - replace strncpy"

* tag 'drm-next-2023-09-08' of git://anongit.freedesktop.org/drm/drm: (51 commits)
  drm/amdgpu: Restrict bootloader wait to SMUv13.0.6
  drm/amd/display: prevent potential division by zero errors
  drm/amd/display: enable cursor degamma for DCN3+ DRM legacy gamma
  drm/amd/display: limit the v_startup workaround to ASICs older than DCN3.1
  Revert "drm/amd/display: Remove v_startup workaround for dcn3+"
  drm/amdgpu: fix amdgpu_cs_p1_user_fence
  Revert "Revert "drm/amd/display: Implement zpos property""
  drm/amdkfd: Add missing gfx11 MQD manager callbacks
  drm/amdgpu: Free ras cmd input buffer properly
  drm/amdgpu: Hide xcp partition sysfs under SRIOV
  drm/amdgpu: use read-modify-write mode for gfx v9_4_3 SQ setting
  drm/amdkfd: use mask to get v9 interrupt sq data bits correctly
  drm/amdgpu: Allocate coredump memory in a nonblocking way
  drm/amdgpu: Support query ecc cap for aqua_vanjaram
  drm/amdgpu: Add umc_info v4_0 structure
  drm/amd/display: always switch off ODM before committing more streams
  drm/amd/display: Remove wait while locked
  drm/amd/display: update blank state on ODM changes
  drm/amd/display: Add smu write msg id fail retry process
  drm/amdgpu: Add SMU v13.0.6 default reset methods
  ...
2023-09-07 19:47:04 -07:00
Jay Cornwall
e9dca969b2 drm/amdkfd: Add missing gfx11 MQD manager callbacks
mqd_stride function was introduced in commit 2f77b9a242
("drm/amdkfd: Update MQD management on multi XCC setup")
but not assigned for gfx11. Fixes a NULL dereference in debugfs.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.5.x
2023-08-31 18:14:28 -04:00
Alex Sierra
a1fe9e9f73 drm/amdkfd: use mask to get v9 interrupt sq data bits correctly
Interrupt sq data bits were not taken properly from contextid0 and contextid1.
Use macro KFD_CONTEXT_ID_GET_SQ_INT_DATA instead.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-31 18:10:28 -04:00
Alex Sierra
3aca8cca60 drm/amdkfd: retry after EBUSY is returned from hmm_ranges_get_pages
if hmm_range_get_pages returns EBUSY error during
svm_range_validate_and_map, within the context of a page fault
interrupt. This should retry through svm_range_restore_pages
callback. Therefore we treat this as EAGAIN error instead, and defer
it to restore pages fallback.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-31 17:54:41 -04:00
Linus Torvalds
461f35f014 drm for 6.6-rc1
core:
 - fix gfp flags in drmm_kmalloc
 
 gpuva:
 - add new generic GPU VA manager (for nouveau initially)
 
 syncobj:
 - add new DRM_IOCTL_SYNCOBJ_EVENTFD ioctl
 
 dma-buf:
 - acquire resv lock for mmap() in exporters
 - support dma-buf self import automatically
 - docs fixes
 
 backlight:
 - fix fbdev interactions
 
 atomic:
 - improve logging
 
 prime:
 - remove struct gem_prim_mmap plus driver updates
 
 gem:
 - drm_exec: add locking over multiple GEM objects
 - fix lockdep checking
 
 fbdev:
 - make fbdev userspace interfaces optional
 - use linux device instead of fbdev device
 - use deferred i/o helper macros in various drivers
 - Make FB core selectable without drivers
 - Remove obsolete flags FBINFO_DEFAULT and FBINFO_FLAG_DEFAULT
 - Add helper macros and Kconfig tokens for DMA-allocated framebuffer
 
 ttm:
 - support init_on_free
 - swapout fixes
 
 panel:
 - panel-edp: Support AUO B116XAB01.4
 - Support Visionox R66451 plus DT bindings
 - ld9040: Backlight support, magic improved,
           Kconfig fix
 - Convert to of_device_get_match_data()
 - Fix Kconfig dependencies
 - simple: Set bpc value to fix warning; Set connector type for AUO T215HVN01;
   Support Innolux G156HCE-L01 plus DT bindings
 - ili9881: Support TDO TL050HDV35 LCD panel plus DT bindings
 - startek: Support KD070FHFID015 MIPI-DSI panel plus DT bindings
 - sitronix-st7789v: Support Inanbo T28CP45TN89 plus DT bindings;
          Support EDT ET028013DMA plus DT bindings; Various cleanups
 - edp: Add timings for N140HCA-EAC
 - Allow panels and touchscreens to power sequence together
 - Fix Innolux G156HCE-L01 LVDS clock
 
 bridge:
 - debugfs for chains support
 - dw-hdmi: Improve support for YUV420 bus format
            CEC suspend/resume, update EDID on HDMI detect
 - dw-mipi-dsi: Fix enable/disable of DSI controller
 - lt9611uxc: Use MODULE_FIRMWARE()
 - ps8640: Remove broken EDID code
 - samsung-dsim: Fix command transfer
 - tc358764: Handle HS/VS polarity; Use BIT() macro; Various cleanups
 - adv7511: Fix low refresh rate
 - anx7625: Switch to macros instead of hardcoded values
            locking fixes
 - tc358767: fix hardware delays
 - sitronix-st7789v: Support panel orientation; Support rotation
                     property; Add support for Jasonic
                     JT240MHQS-HWT-EK-E3 plus DT bindings
 
 amdgpu:
 - SDMA 6.1.0 support
 - HDP 6.1 support
 - SMUIO 14.0 support
 - PSP 14.0 support
 - IH 6.1 support
 - Lots of checkpatch cleanups
 - GFX 9.4.3 updates
 - Add USB PD and IFWI flashing documentation
 - GPUVM updates
 - RAS fixes
 - DRR fixes
 - FAMS fixes
 - Virtual display fixes
 - Soft IH fixes
 - SMU13 fixes
 - Rework PSP firmware loading for other IPs
 - Kernel doc fixes
 - DCN 3.0.1 fixes
 - LTTPR fixes
 - DP MST fixes
 - DCN 3.1.6 fixes
 - SMU 13.x fixes
 - PSP 13.x fixes
 - SubVP fixes
 - GC 9.4.3 fixes
 - Display bandwidth calculation fixes
 - VCN4 secure submission fixes
 - Allow building DC on RISC-V
 - Add visible FB info to bo_print_info
 - HBR3 fixes
 - GFX9 MCBP fix
 - GMC10 vmhub index fix
 - GMC11 vmhub index fix
 - Create a new doorbell manager
 - SR-IOV fixes
 - initial freesync panel replay support
 - revert zpos properly until igt regression is fixeed
 - use TTM to manage doorbell BAR
 - Expose both current and average power via hwmon if supported
 
 amdkfd:
 - Cleanup CRIU dma-buf handling
 - Use KIQ to unmap HIQ
 - GFX 9.4.3 debugger updates
 - GFX 9.4.2 debugger fixes
 - Enable cooperative groups fof gfx11
 - SVM fixes
 - Convert older APUs to use dGPU path like newer APUs
 - Drop IOMMUv2 path as it is no longer used
 - TBA fix for aldebaran
 
 i915:
 - ICL+ DSI modeset sequence
 - HDCP improvements
 - MTL display fixes and cleanups
 - HSW/BDW PSR1 restored
 - Init DDI ports in VBT order
 - General display refactors
 - Start using plane scale factor for relative data rate
 - Use shmem for dpt objects
 - Expose RPS thresholds in sysfs
 - Apply GuC SLPC min frequency softlimit correctly
 - Extend Wa_14015795083 to TGL, RKL, DG1 and ADL
 - Fix a VMA UAF for multi-gt platform
 - Do not use stolen on MTL due to HW bug
 - Check HuC and GuC version compatibility on MTL
 - avoid infinite GPU waits due to premature release
   of request memory
 - Fixes and updates for GSC memory allocation
 - Display SDVO fixes
 - Take stolen handling out of FBC code
 - Make i915_coherent_map_type GT-centric
 - Simplify shmem_create_from_object map_type
 
 msm:
 - SM6125 MDSS support
 - DPU: SM6125 DPU support
 - DSI: runtime PM support, burst mode support
 - DSI PHY: SM6125 support in 14nm DSI PHY driver
 - GPU: prepare for a7xx
 - fix a690 firmware
 - disable relocs on a6xx and newer
 
 radeon:
 - Lots of checkpatch cleanups
 
 ast:
 - improve device-model detection
 - Represent BMV as virtual connector
 - Report DP connection status
 
 nouveau:
 - add new exec/bind interface to support Vulkan
 - document some getparam ioctls
 - improve VRAM detection
 - various fixes/cleanups
 - workraound DPCD issues
 
 ivpu:
 - MMU updates
 - debugfs support
 - Support vpu4
 
 virtio:
 - add sync object support
 
 atmel-hlcdc:
 - Support inverted pixclock polarity
 
 etnaviv:
 - runtime PM cleanups
 - hang handling fixes
 
 exynos:
 - use fbdev DMA helpers
 - fix possible NULL ptr dereference
 
 komeda:
 - always attach encoder
 
 omapdrm:
 - use fbdev DMA helpers
 ingenic:
 - kconfig regmap fixes
 
 loongson:
 - support display controller
 
 mediatek:
 - Small mtk-dpi cleanups
 - DisplayPort: support eDP and aux-bus
 - Fix coverity issues
 - Fix potential memory leak if vmap() fail
 
 mgag200:
 - minor fixes
 
 mxsfb:
 - support disabling overlay planes
 
 panfrost:
 - fix sync in IRQ handling
 
 ssd130x:
 - Support per-controller default resolution plus DT bindings
 - Reduce memory-allocation overhead
 - Improve intermediate buffer size computation
 - Fix allocation of temporary buffers
 - Fix pitch computation
 - Fix shadow plane allocation
 
 tegra:
 - use fbdev DMA helpers
 - Convert to devm_platform_ioremap_resource()
 - support bridge/connector
 - enable PM
 
 tidss:
 - Support TI AM625 plus DT bindings
 - Implement new connector model plus driver updates
 
 vkms:
 - improve write back support
 - docs fixes
 - support gamma LUT
 
 zynqmp-dpsub:
 - misc fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmTukSYACgkQDHTzWXnE
 hr6vnQ/+J7vBVkBr8JsaEV/twcZwzbNdpivsIagd8U83GQB50nDReVXbNx+Wo0/C
 WiGlrC6Sw3NVOGbkigd5IQ7fb5C/7RnBmzMi/iS7Qnk2uEqLqgV00VxfGwdm6wgr
 0gNB8zuu2xYphHz2K8LzwnmeQRdN+YUQpUa2wNzLO88IEkTvq5vx2rJEn5p9/3hp
 OxbbPBzpDRRPlkNFfVQCN8todbKdsPc4am81Eqgv7BJf21RFgQodPGW5koCYuv0w
 3m+PJh1KkfYAL974EsLr/pkY7yhhiZ6SlFLX8ssg4FyZl/Vthmc9bl14jRq/pqt4
 GBp8yrPq1XjrwXR8wv3MiwNEdANQ+KD9IoGlzLxqVgmEFRE+g4VzZZXeC3AIrTVP
 FPg4iLUrDrmj9RpJmbVqhq9X2jZs+EtRAFkJPrPbq2fItAD2a2dW4X3ISSnnTqDI
 6O2dVwuLCU6OfWnvN4bPW9p8CqRgR8Itqv1SI8qXooDy307YZu1eTUf5JAVwG/SW
 xbDEFVFlMPyFLm+KN5dv1csJKK21vWi9gLg8phK8mTWYWnqMEtJqbxbRzmdBEFmE
 pXKVu01P6ZqgBbaETpCljlOaEDdJnvO4W+o70MgBtpR2IWFMbMNO+iS0EmLZ6Vgj
 9zYZctpL+dMuHV0Of1GMkHFRHTMYEzW4tuctLIQfG13y4WzyczY=
 =CwV9
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2023-08-30' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "The drm core grew a new generic gpu virtual address manager, and new
  execution locking helpers. These are used by nouveau now to provide
  uAPI support for the userspace Vulkan driver. AMD had a bunch of new
  IP core support, loads of refactoring around fbdev, but mostly just
  the usual amount of stuff across the board.

  core:
   - fix gfp flags in drmm_kmalloc

  gpuva:
   - add new generic GPU VA manager (for nouveau initially)

  syncobj:
   - add new DRM_IOCTL_SYNCOBJ_EVENTFD ioctl

  dma-buf:
   - acquire resv lock for mmap() in exporters
   - support dma-buf self import automatically
   - docs fixes

  backlight:
   - fix fbdev interactions

  atomic:
   - improve logging

  prime:
   - remove struct gem_prim_mmap plus driver updates

  gem:
   - drm_exec: add locking over multiple GEM objects
   - fix lockdep checking

  fbdev:
   - make fbdev userspace interfaces optional
   - use linux device instead of fbdev device
   - use deferred i/o helper macros in various drivers
   - Make FB core selectable without drivers
   - Remove obsolete flags FBINFO_DEFAULT and FBINFO_FLAG_DEFAULT
   - Add helper macros and Kconfig tokens for DMA-allocated framebuffer

  ttm:
   - support init_on_free
   - swapout fixes

  panel:
   - panel-edp: Support AUO B116XAB01.4
   - Support Visionox R66451 plus DT bindings
   - ld9040:
      - Backlight support
      - magic improved
      - Kconfig fix
   - Convert to of_device_get_match_data()
   - Fix Kconfig dependencies
   - simple:
      - Set bpc value to fix warning
      - Set connector type for AUO T215HVN01
      - Support Innolux G156HCE-L01 plus DT bindings
   - ili9881: Support TDO TL050HDV35 LCD panel plus DT bindings
   - startek: Support KD070FHFID015 MIPI-DSI panel plus DT bindings
   - sitronix-st7789v:
      - Support Inanbo T28CP45TN89 plus DT bindings
      - Support EDT ET028013DMA plus DT bindings
      - Various cleanups
   - edp: Add timings for N140HCA-EAC
   - Allow panels and touchscreens to power sequence together
   - Fix Innolux G156HCE-L01 LVDS clock

  bridge:
   - debugfs for chains support
   - dw-hdmi:
      - Improve support for YUV420 bus format
      - CEC suspend/resume
      - update EDID on HDMI detect
   - dw-mipi-dsi: Fix enable/disable of DSI controller
   - lt9611uxc: Use MODULE_FIRMWARE()
   - ps8640: Remove broken EDID code
   - samsung-dsim: Fix command transfer
   - tc358764:
      - Handle HS/VS polarity
      - Use BIT() macro
      - Various cleanups
   - adv7511: Fix low refresh rate
   - anx7625:
      - Switch to macros instead of hardcoded values
      - locking fixes
   - tc358767: fix hardware delays
   - sitronix-st7789v:
      - Support panel orientation
      - Support rotation property
      - Add support for Jasonic JT240MHQS-HWT-EK-E3 plus DT bindings

  amdgpu:
   - SDMA 6.1.0 support
   - HDP 6.1 support
   - SMUIO 14.0 support
   - PSP 14.0 support
   - IH 6.1 support
   - Lots of checkpatch cleanups
   - GFX 9.4.3 updates
   - Add USB PD and IFWI flashing documentation
   - GPUVM updates
   - RAS fixes
   - DRR fixes
   - FAMS fixes
   - Virtual display fixes
   - Soft IH fixes
   - SMU13 fixes
   - Rework PSP firmware loading for other IPs
   - Kernel doc fixes
   - DCN 3.0.1 fixes
   - LTTPR fixes
   - DP MST fixes
   - DCN 3.1.6 fixes
   - SMU 13.x fixes
   - PSP 13.x fixes
   - SubVP fixes
   - GC 9.4.3 fixes
   - Display bandwidth calculation fixes
   - VCN4 secure submission fixes
   - Allow building DC on RISC-V
   - Add visible FB info to bo_print_info
   - HBR3 fixes
   - GFX9 MCBP fix
   - GMC10 vmhub index fix
   - GMC11 vmhub index fix
   - Create a new doorbell manager
   - SR-IOV fixes
   - initial freesync panel replay support
   - revert zpos properly until igt regression is fixeed
   - use TTM to manage doorbell BAR
   - Expose both current and average power via hwmon if supported

  amdkfd:
   - Cleanup CRIU dma-buf handling
   - Use KIQ to unmap HIQ
   - GFX 9.4.3 debugger updates
   - GFX 9.4.2 debugger fixes
   - Enable cooperative groups fof gfx11
   - SVM fixes
   - Convert older APUs to use dGPU path like newer APUs
   - Drop IOMMUv2 path as it is no longer used
   - TBA fix for aldebaran

  i915:
   - ICL+ DSI modeset sequence
   - HDCP improvements
   - MTL display fixes and cleanups
   - HSW/BDW PSR1 restored
   - Init DDI ports in VBT order
   - General display refactors
   - Start using plane scale factor for relative data rate
   - Use shmem for dpt objects
   - Expose RPS thresholds in sysfs
   - Apply GuC SLPC min frequency softlimit correctly
   - Extend Wa_14015795083 to TGL, RKL, DG1 and ADL
   - Fix a VMA UAF for multi-gt platform
   - Do not use stolen on MTL due to HW bug
   - Check HuC and GuC version compatibility on MTL
   - avoid infinite GPU waits due to premature release of request memory
   - Fixes and updates for GSC memory allocation
   - Display SDVO fixes
   - Take stolen handling out of FBC code
   - Make i915_coherent_map_type GT-centric
   - Simplify shmem_create_from_object map_type

  msm:
   - SM6125 MDSS support
   - DPU: SM6125 DPU support
   - DSI: runtime PM support, burst mode support
   - DSI PHY: SM6125 support in 14nm DSI PHY driver
   - GPU: prepare for a7xx
   - fix a690 firmware
   - disable relocs on a6xx and newer

  radeon:
   - Lots of checkpatch cleanups

  ast:
   - improve device-model detection
   - Represent BMV as virtual connector
   - Report DP connection status

  nouveau:
   - add new exec/bind interface to support Vulkan
   - document some getparam ioctls
   - improve VRAM detection
   - various fixes/cleanups
   - workraound DPCD issues

  ivpu:
   - MMU updates
   - debugfs support
   - Support vpu4

  virtio:
   - add sync object support

  atmel-hlcdc:
   - Support inverted pixclock polarity

  etnaviv:
   - runtime PM cleanups
   - hang handling fixes

  exynos:
   - use fbdev DMA helpers
   - fix possible NULL ptr dereference

  komeda:
   - always attach encoder

  omapdrm:
   - use fbdev DMA helpers
ingenic:
   - kconfig regmap fixes

  loongson:
   - support display controller

  mediatek:
   - Small mtk-dpi cleanups
   - DisplayPort: support eDP and aux-bus
   - Fix coverity issues
   - Fix potential memory leak if vmap() fail

  mgag200:
   - minor fixes

  mxsfb:
   - support disabling overlay planes

  panfrost:
   - fix sync in IRQ handling

  ssd130x:
   - Support per-controller default resolution plus DT bindings
   - Reduce memory-allocation overhead
   - Improve intermediate buffer size computation
   - Fix allocation of temporary buffers
   - Fix pitch computation
   - Fix shadow plane allocation

  tegra:
   - use fbdev DMA helpers
   - Convert to devm_platform_ioremap_resource()
   - support bridge/connector
   - enable PM

  tidss:
   - Support TI AM625 plus DT bindings
   - Implement new connector model plus driver updates

  vkms:
   - improve write back support
   - docs fixes
   - support gamma LUT

  zynqmp-dpsub:
   - misc fixes"

* tag 'drm-next-2023-08-30' of git://anongit.freedesktop.org/drm/drm: (1327 commits)
  drm/gpuva_mgr: remove unused prev pointer in __drm_gpuva_sm_map()
  drm/tests/drm_kunit_helpers: Place correct function name in the comment header
  drm/nouveau: uapi: don't pass NO_PREFETCH flag implicitly
  drm/nouveau: uvmm: fix unset region pointer on remap
  drm/nouveau: sched: avoid job races between entities
  drm/i915: Fix HPD polling, reenabling the output poll work as needed
  drm: Add an HPD poll helper to reschedule the poll work
  drm/i915: Fix TLB-Invalidation seqno store
  drm/ttm/tests: Fix type conversion in ttm_pool_test
  drm/msm/a6xx: Bail out early if setting GPU OOB fails
  drm/msm/a6xx: Move LLC accessors to the common header
  drm/msm/a6xx: Introduce a6xx_llc_read
  drm/ttm/tests: Require MMU when testing
  drm/panel: simple: Fix Innolux G156HCE-L01 LVDS clock
  Revert "Revert "drm/amdgpu/display: change pipe policy for DCN 2.0""
  drm/amdgpu: Add memory vendor information
  drm/amd: flush any delayed gfxoff on suspend entry
  drm/amdgpu: skip fence GFX interrupts disable/enable for S0ix
  drm/amdgpu: Remove gfxoff check in GFX v9.4.3
  drm/amd/pm: Update pci link speed for smu v13.0.6
  ...
2023-08-30 13:34:34 -07:00
Linus Torvalds
b96a3e9142 - Some swap cleanups from Ma Wupeng ("fix WARN_ON in add_to_avail_list")
- Peter Xu has a series (mm/gup: Unify hugetlb, speed up thp") which
   reduces the special-case code for handling hugetlb pages in GUP.  It
   also speeds up GUP handling of transparent hugepages.
 
 - Peng Zhang provides some maple tree speedups ("Optimize the fast path
   of mas_store()").
 
 - Sergey Senozhatsky has improved te performance of zsmalloc during
   compaction (zsmalloc: small compaction improvements").
 
 - Domenico Cerasuolo has developed additional selftest code for zswap
   ("selftests: cgroup: add zswap test program").
 
 - xu xin has doe some work on KSM's handling of zero pages.  These
   changes are mainly to enable the user to better understand the
   effectiveness of KSM's treatment of zero pages ("ksm: support tracking
   KSM-placed zero-pages").
 
 - Jeff Xu has fixes the behaviour of memfd's
   MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED sysctl ("mm/memfd: fix sysctl
   MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED").
 
 - David Howells has fixed an fscache optimization ("mm, netfs, fscache:
   Stop read optimisation when folio removed from pagecache").
 
 - Axel Rasmussen has given userfaultfd the ability to simulate memory
   poisoning ("add UFFDIO_POISON to simulate memory poisoning with UFFD").
 
 - Miaohe Lin has contributed some routine maintenance work on the
   memory-failure code ("mm: memory-failure: remove unneeded PageHuge()
   check").
 
 - Peng Zhang has contributed some maintenance work on the maple tree
   code ("Improve the validation for maple tree and some cleanup").
 
 - Hugh Dickins has optimized the collapsing of shmem or file pages into
   THPs ("mm: free retracted page table by RCU").
 
 - Jiaqi Yan has a patch series which permits us to use the healthy
   subpages within a hardware poisoned huge page for general purposes
   ("Improve hugetlbfs read on HWPOISON hugepages").
 
 - Kemeng Shi has done some maintenance work on the pagetable-check code
   ("Remove unused parameters in page_table_check").
 
 - More folioification work from Matthew Wilcox ("More filesystem folio
   conversions for 6.6"), ("Followup folio conversions for zswap").  And
   from ZhangPeng ("Convert several functions in page_io.c to use a
   folio").
 
 - page_ext cleanups from Kemeng Shi ("minor cleanups for page_ext").
 
 - Baoquan He has converted some architectures to use the GENERIC_IOREMAP
   ioremap()/iounmap() code ("mm: ioremap: Convert architectures to take
   GENERIC_IOREMAP way").
 
 - Anshuman Khandual has optimized arm64 tlb shootdown ("arm64: support
   batched/deferred tlb shootdown during page reclamation/migration").
 
 - Better maple tree lockdep checking from Liam Howlett ("More strict
   maple tree lockdep").  Liam also developed some efficiency improvements
   ("Reduce preallocations for maple tree").
 
 - Cleanup and optimization to the secondary IOMMU TLB invalidation, from
   Alistair Popple ("Invalidate secondary IOMMU TLB on permission
   upgrade").
 
 - Ryan Roberts fixes some arm64 MM selftest issues ("selftests/mm fixes
   for arm64").
 
 - Kemeng Shi provides some maintenance work on the compaction code ("Two
   minor cleanups for compaction").
 
 - Some reduction in mmap_lock pressure from Matthew Wilcox ("Handle most
   file-backed faults under the VMA lock").
 
 - Aneesh Kumar contributes code to use the vmemmap optimization for DAX
   on ppc64, under some circumstances ("Add support for DAX vmemmap
   optimization for ppc64").
 
 - page-ext cleanups from Kemeng Shi ("add page_ext_data to get client
   data in page_ext"), ("minor cleanups to page_ext header").
 
 - Some zswap cleanups from Johannes Weiner ("mm: zswap: three
   cleanups").
 
 - kmsan cleanups from ZhangPeng ("minor cleanups for kmsan").
 
 - VMA handling cleanups from Kefeng Wang ("mm: convert to
   vma_is_initial_heap/stack()").
 
 - DAMON feature work from SeongJae Park ("mm/damon/sysfs-schemes:
   implement DAMOS tried total bytes file"), ("Extend DAMOS filters for
   address ranges and DAMON monitoring targets").
 
 - Compaction work from Kemeng Shi ("Fixes and cleanups to compaction").
 
 - Liam Howlett has improved the maple tree node replacement code
   ("maple_tree: Change replacement strategy").
 
 - ZhangPeng has a general code cleanup - use the K() macro more widely
   ("cleanup with helper macro K()").
 
 - Aneesh Kumar brings memmap-on-memory to ppc64 ("Add support for memmap
   on memory feature on ppc64").
 
 - pagealloc cleanups from Kemeng Shi ("Two minor cleanups for pcp list
   in page_alloc"), ("Two minor cleanups for get pageblock migratetype").
 
 - Vishal Moola introduces a memory descriptor for page table tracking,
   "struct ptdesc" ("Split ptdesc from struct page").
 
 - memfd selftest maintenance work from Aleksa Sarai ("memfd: cleanups
   for vm.memfd_noexec").
 
 - MM include file rationalization from Hugh Dickins ("arch: include
   asm/cacheflush.h in asm/hugetlb.h").
 
 - THP debug output fixes from Hugh Dickins ("mm,thp: fix sloppy text
   output").
 
 - kmemleak improvements from Xiaolei Wang ("mm/kmemleak: use
   object_cache instead of kmemleak_initialized").
 
 - More folio-related cleanups from Matthew Wilcox ("Remove _folio_dtor
   and _folio_order").
 
 - A VMA locking scalability improvement from Suren Baghdasaryan
   ("Per-VMA lock support for swap and userfaults").
 
 - pagetable handling cleanups from Matthew Wilcox ("New page table range
   API").
 
 - A batch of swap/thp cleanups from David Hildenbrand ("mm/swap: stop
   using page->private on tail pages for THP_SWAP + cleanups").
 
 - Cleanups and speedups to the hugetlb fault handling from Matthew
   Wilcox ("Change calling convention for ->huge_fault").
 
 - Matthew Wilcox has also done some maintenance work on the MM subsystem
   documentation ("Improve mm documentation").
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZO1JUQAKCRDdBJ7gKXxA
 jrMwAP47r/fS8vAVT3zp/7fXmxaJYTK27CTAM881Gw1SDhFM/wEAv8o84mDenCg6
 Nfio7afS1ncD+hPYT8947UnLxTgn+ww=
 =Afws
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2023-08-28-18-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - Some swap cleanups from Ma Wupeng ("fix WARN_ON in
   add_to_avail_list")

 - Peter Xu has a series (mm/gup: Unify hugetlb, speed up thp") which
   reduces the special-case code for handling hugetlb pages in GUP. It
   also speeds up GUP handling of transparent hugepages.

 - Peng Zhang provides some maple tree speedups ("Optimize the fast path
   of mas_store()").

 - Sergey Senozhatsky has improved te performance of zsmalloc during
   compaction (zsmalloc: small compaction improvements").

 - Domenico Cerasuolo has developed additional selftest code for zswap
   ("selftests: cgroup: add zswap test program").

 - xu xin has doe some work on KSM's handling of zero pages. These
   changes are mainly to enable the user to better understand the
   effectiveness of KSM's treatment of zero pages ("ksm: support
   tracking KSM-placed zero-pages").

 - Jeff Xu has fixes the behaviour of memfd's
   MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED sysctl ("mm/memfd: fix sysctl
   MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED").

 - David Howells has fixed an fscache optimization ("mm, netfs, fscache:
   Stop read optimisation when folio removed from pagecache").

 - Axel Rasmussen has given userfaultfd the ability to simulate memory
   poisoning ("add UFFDIO_POISON to simulate memory poisoning with
   UFFD").

 - Miaohe Lin has contributed some routine maintenance work on the
   memory-failure code ("mm: memory-failure: remove unneeded PageHuge()
   check").

 - Peng Zhang has contributed some maintenance work on the maple tree
   code ("Improve the validation for maple tree and some cleanup").

 - Hugh Dickins has optimized the collapsing of shmem or file pages into
   THPs ("mm: free retracted page table by RCU").

 - Jiaqi Yan has a patch series which permits us to use the healthy
   subpages within a hardware poisoned huge page for general purposes
   ("Improve hugetlbfs read on HWPOISON hugepages").

 - Kemeng Shi has done some maintenance work on the pagetable-check code
   ("Remove unused parameters in page_table_check").

 - More folioification work from Matthew Wilcox ("More filesystem folio
   conversions for 6.6"), ("Followup folio conversions for zswap"). And
   from ZhangPeng ("Convert several functions in page_io.c to use a
   folio").

 - page_ext cleanups from Kemeng Shi ("minor cleanups for page_ext").

 - Baoquan He has converted some architectures to use the
   GENERIC_IOREMAP ioremap()/iounmap() code ("mm: ioremap: Convert
   architectures to take GENERIC_IOREMAP way").

 - Anshuman Khandual has optimized arm64 tlb shootdown ("arm64: support
   batched/deferred tlb shootdown during page reclamation/migration").

 - Better maple tree lockdep checking from Liam Howlett ("More strict
   maple tree lockdep"). Liam also developed some efficiency
   improvements ("Reduce preallocations for maple tree").

 - Cleanup and optimization to the secondary IOMMU TLB invalidation,
   from Alistair Popple ("Invalidate secondary IOMMU TLB on permission
   upgrade").

 - Ryan Roberts fixes some arm64 MM selftest issues ("selftests/mm fixes
   for arm64").

 - Kemeng Shi provides some maintenance work on the compaction code
   ("Two minor cleanups for compaction").

 - Some reduction in mmap_lock pressure from Matthew Wilcox ("Handle
   most file-backed faults under the VMA lock").

 - Aneesh Kumar contributes code to use the vmemmap optimization for DAX
   on ppc64, under some circumstances ("Add support for DAX vmemmap
   optimization for ppc64").

 - page-ext cleanups from Kemeng Shi ("add page_ext_data to get client
   data in page_ext"), ("minor cleanups to page_ext header").

 - Some zswap cleanups from Johannes Weiner ("mm: zswap: three
   cleanups").

 - kmsan cleanups from ZhangPeng ("minor cleanups for kmsan").

 - VMA handling cleanups from Kefeng Wang ("mm: convert to
   vma_is_initial_heap/stack()").

 - DAMON feature work from SeongJae Park ("mm/damon/sysfs-schemes:
   implement DAMOS tried total bytes file"), ("Extend DAMOS filters for
   address ranges and DAMON monitoring targets").

 - Compaction work from Kemeng Shi ("Fixes and cleanups to compaction").

 - Liam Howlett has improved the maple tree node replacement code
   ("maple_tree: Change replacement strategy").

 - ZhangPeng has a general code cleanup - use the K() macro more widely
   ("cleanup with helper macro K()").

 - Aneesh Kumar brings memmap-on-memory to ppc64 ("Add support for
   memmap on memory feature on ppc64").

 - pagealloc cleanups from Kemeng Shi ("Two minor cleanups for pcp list
   in page_alloc"), ("Two minor cleanups for get pageblock
   migratetype").

 - Vishal Moola introduces a memory descriptor for page table tracking,
   "struct ptdesc" ("Split ptdesc from struct page").

 - memfd selftest maintenance work from Aleksa Sarai ("memfd: cleanups
   for vm.memfd_noexec").

 - MM include file rationalization from Hugh Dickins ("arch: include
   asm/cacheflush.h in asm/hugetlb.h").

 - THP debug output fixes from Hugh Dickins ("mm,thp: fix sloppy text
   output").

 - kmemleak improvements from Xiaolei Wang ("mm/kmemleak: use
   object_cache instead of kmemleak_initialized").

 - More folio-related cleanups from Matthew Wilcox ("Remove _folio_dtor
   and _folio_order").

 - A VMA locking scalability improvement from Suren Baghdasaryan
   ("Per-VMA lock support for swap and userfaults").

 - pagetable handling cleanups from Matthew Wilcox ("New page table
   range API").

 - A batch of swap/thp cleanups from David Hildenbrand ("mm/swap: stop
   using page->private on tail pages for THP_SWAP + cleanups").

 - Cleanups and speedups to the hugetlb fault handling from Matthew
   Wilcox ("Change calling convention for ->huge_fault").

 - Matthew Wilcox has also done some maintenance work on the MM
   subsystem documentation ("Improve mm documentation").

* tag 'mm-stable-2023-08-28-18-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (489 commits)
  maple_tree: shrink struct maple_tree
  maple_tree: clean up mas_wr_append()
  secretmem: convert page_is_secretmem() to folio_is_secretmem()
  nios2: fix flush_dcache_page() for usage from irq context
  hugetlb: add documentation for vma_kernel_pagesize()
  mm: add orphaned kernel-doc to the rst files.
  mm: fix clean_record_shared_mapping_range kernel-doc
  mm: fix get_mctgt_type() kernel-doc
  mm: fix kernel-doc warning from tlb_flush_rmaps()
  mm: remove enum page_entry_size
  mm: allow ->huge_fault() to be called without the mmap_lock held
  mm: move PMD_ORDER to pgtable.h
  mm: remove checks for pte_index
  memcg: remove duplication detection for mem_cgroup_uncharge_swap
  mm/huge_memory: work on folio->swap instead of page->private when splitting folio
  mm/swap: inline folio_set_swap_entry() and folio_swap_entry()
  mm/swap: use dedicated entry for swap in folio
  mm/swap: stop using page->private on tail pages for THP_SWAP
  selftests/mm: fix WARNING comparing pointer to 0
  selftests: cgroup: fix test_kmem_memcg_deletion kernel mem check
  ...
2023-08-29 14:25:26 -07:00
Kefeng Wang
f7992bfaf3 drm/amdkfd: use vma_is_initial_stack() and vma_is_initial_heap()
Use the helpers to simplify code.

Link: https://lkml.kernel.org/r/20230728050043.59880-3-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Christian Göttsche <cgzones@googlemail.com>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-21 13:37:31 -07:00
James Zhu
b25fdc048c drm/amdgpu: skip xcp drm device allocation when out of drm resource
Return 0 when drm device alloc failed with -ENOSPC in
order to  allow amdgpu drive loading. But the xcp without
drm device node assigned won't be visiable in user space.
This helps amdgpu driver loading on system which has more
than 64 nodes, the current limitation.

The proposal to add more drm nodes is discussed in public,
which will support up to 2^20 nodes totally.
kernel drm:
https://lore.kernel.org/lkml/20230724211428.3831636-1-michal.winiarski@intel.com/T/
libdrm:
https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/305

Signed-off-by: James Zhu <James.Zhu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-16 15:46:39 -04:00
James Zhu
400a39f1ec drm/amdgpu: skip xcp drm device allocation when out of drm resource
Return 0 when drm device alloc failed with -ENOSPC in
order to  allow amdgpu drive loading. But the xcp without
drm device node assigned won't be visiable in user space.
This helps amdgpu driver loading on system which has more
than 64 nodes, the current limitation.

The proposal to add more drm nodes is discussed in public,
which will support up to 2^20 nodes totally.
kernel drm:
https://lore.kernel.org/lkml/20230724211428.3831636-1-michal.winiarski@intel.com/T/
libdrm:
https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/305

Signed-off-by: James Zhu <James.Zhu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-16 11:34:11 -04:00
Ruan Jinjie
275e37221b drm/amdkfd: Remove unnecessary NULL values
The NULL initialization of the pointers assigned by kzalloc() first is
not necessary, because if the kzalloc() failed, the pointers will be
assigned NULL, otherwise it works as usual. so remove it.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-15 18:08:27 -04:00
Jonathan Kim
8b4c350c4d drm/amdkfd: fix double assign skip process context clear
Remove redundant assignment when skipping process ctx clear.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-15 18:07:42 -04:00
Atul Raut
bd6040b0ea drm/amdkfd: Use memdup_user() rather than duplicating its implementation
To prevent its redundant implementation and streamline
code, use memdup_user.

This fixes warnings reported by Coccinelle:
./drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c:2811:13-20: WARNING opportunity for memdup_user

Signed-off-by: Atul Raut <rauji.raut@gmail.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-15 18:07:41 -04:00
Arnd Bergmann
475968fe4a drm/amdkfd: fix build failure without CONFIG_DYNAMIC_DEBUG
When CONFIG_DYNAMIC_DEBUG is disabled altogether, calling
_dynamic_func_call_no_desc() does not work:

drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c: In function 'svm_range_set_attr':
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c:52:9: error: implicit declaration of function '_dynamic_func_call_no_desc' [-Werror=implicit-function-declaration]
   52 |         _dynamic_func_call_no_desc("svm_range_dump", svm_range_debug_dump, svms)
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c:3564:9: note: in expansion of macro 'dynamic_svm_range_dump'
 3564 |         dynamic_svm_range_dump(svms);
      |         ^~~~~~~~~~~~~~~~~~~~~~

Add a compile-time conditional in addition to the runtime check.

Fixes: 8923137dbe ("drm/amdkfd: avoid svm dump when dynamic debug disabled")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-15 18:07:28 -04:00
Jay Cornwall
a34cab4409 drm/amdkfd: Add missing tba_hi programming on aldebaran
Previously asymptomatic because high 32 bits were zero.

Fixes: 96c211f1f9 ("drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole")
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-15 17:40:55 -04:00
Alex Deucher
80e28aaf93 drm/amdkfd: rename device_queue_manager_init_v10_navi10()
Drop the navi10 in the name for consistency with other
families.  All gfx10 parts use the same implementation.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-11 14:47:39 -04:00
Alex Deucher
c99a2e7ae2 drm/amdkfd: drop IOMMUv2 support
Now that we use the dGPU path for all APUs, drop the
IOMMUv2 support.

v2: drop the now unused queue manager functions for gfx7/8 APUs

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-11 14:47:25 -04:00
Alex Deucher
2b4adeb34f drm/amdkfd: disable IOMMUv2 support for Raven
Use the dGPU path instead.  There were a lot of platform
issues with IOMMU in general on these chips due to windows
not enabling IOMMU at the time.  The dGPU path has been
used for a long time with newer APUs and works fine.  This
also paves the way to simplify the driver significantly.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-11 14:47:14 -04:00
Alex Deucher
99c1501996 drm/amdkfd: disable IOMMUv2 support for KV/CZ
Use the dGPU path instead.  There were a lot of platform
issues with IOMMU in general on these chips due to windows
not enabling IOMMU at the time.  The dGPU path has been
used for a long time with newer APUs and works fine.  This
also paves the way to simplify the driver significantly.

v2: use the dGPU queue manager functions

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-11 14:47:03 -04:00
Alex Deucher
95979df25b drm/amdkfd: ignore crat by default
We are dropping the IOMMUv2 path, so no need to enable this.
It's often buggy on consumer platforms anyway.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-11 14:46:18 -04:00
Alex Deucher
091ae5473f drm/amdkfd: disable IOMMUv2 support for Raven
Use the dGPU path instead.  There were a lot of platform
issues with IOMMU in general on these chips due to windows
not enabling IOMMU at the time.  The dGPU path has been
used for a long time with newer APUs and works fine.  This
also paves the way to simplify the driver significantly.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-09 10:58:40 -04:00
Alex Deucher
616f92d188 drm/amdkfd: disable IOMMUv2 support for KV/CZ
Use the dGPU path instead.  There were a lot of platform
issues with IOMMU in general on these chips due to windows
not enabling IOMMU at the time.  The dGPU path has been
used for a long time with newer APUs and works fine.  This
also paves the way to simplify the driver significantly.

v2: use the dGPU queue manager functions

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-09 10:58:30 -04:00
Alex Deucher
a6dea2d64f drm/amdkfd: ignore crat by default
We are dropping the IOMMUv2 path, so no need to enable this.
It's often buggy on consumer platforms anyway.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-09 10:58:20 -04:00
Shashank Sharma
8da0d694a3 drm/amdgpu: remove unused functions and variables
This patch removes some variables and functions from KFD
doorbell handling code, which are no more required since
doorbell manager is handling doorbell calculations.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-07 17:14:07 -04:00
Shashank Sharma
2105a15a20 drm/amdgpu: use doorbell mgr for kfd process doorbells
This patch:
- adds a doorbell object in kfd pdd structure.
- allocates doorbells for a process while creating its queue.
- frees the doorbells with pdd destroy.
- moves doorbell bitmap init function to kfd_doorbell.c

PS: This patch ensures that we don't break the existing KFD
    functionality, but now KFD userspace library should also
    create doorbell pages as AMDGPU GEM objects using libdrm
    functions in userspace. The reference code for the same
    is available with AMDGPU Usermode queue libdrm MR. Once
    this is done, we will not need to create process doorbells
    in kernel.

V2: - Do not use doorbell wrapper API, use amdgpu_bo_create_kernel
      instead (Alex).
    - Do not use custom doorbell structure, instead use separate
      variables for bo and doorbell_bitmap (Alex)
V3:
   - Do not allocate doorbell page with PDD, delay doorbell process
     page allocation until really needed (Felix)

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felilx.Kuehling@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-07 17:14:07 -04:00
Shashank Sharma
c318666510 drm/amdgpu: use doorbell mgr for kfd kernel doorbells
This patch:
- adds a doorbell bo in kfd device structure.
- creates doorbell page for kfd kernel usages.
- updates the get_kernel_doorbell and free_kernel_doorbell functions
  accordingly

V2: Do not use wrapper API, use direct amdgpu_create_kernel(Alex)
V3:
 - Move single variable declaration below (Christian)
 - Add a to-do item to reuse the KGD kernel level doorbells for
   KFD for non-MES cases, instead of reserving one page (Felix)

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-07 17:14:07 -04:00
Jay Cornwall
05c899eacc drm/amdkfd: Sign-extend TMA address in trap handler
SMEM instructions can reach addresses above 47 bits but require
bit 47 to be sign-extended through bits [63:48].

This allows the TMA to be relocated in a following patch.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-07 17:14:06 -04:00
Jay Cornwall
96c211f1f9 drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole
The TBA and TMA, along with an unused IB allocation, reside at low
addresses in the VM address space. A stray VM fault which hits these
pages must be serviced by making their page table entries invalid.
The scheduler depends upon these pages being resident and fails,
preventing a debugger from inspecting the failure state.

By relocating these pages above 47 bits in the VM address space they
can only be reached when bits [63:48] are set to 1. This makes it much
less likely for a misbehaving program to generate accesses to them.
The current placement at VA (PAGE_SIZE*2) is readily hit by a NULL
access with a small offset.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-07 17:14:06 -04:00
Jay Cornwall
631ddc3553 drm/amdkfd: Sync trap handler binaries with source
Some changes have been lost during rebases. Rebuild sources.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-07 17:14:06 -04:00
Alex Sierra
ab3400eb94 drm/amdkfd: avoid unmap dma address when svm_ranges are split
DMA address reference within svm_ranges should be unmapped only after
the memory has been released from the system. In case of range
splitting, the DMA address information should be copied to the
corresponding range after this has split. But leaving dma mapping
intact.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-07 16:36:44 -04:00
Daniel Vetter
3d00c59d14 amd-drm-next-6.6-2023-07-28:
amdgpu:
 - Lots of checkpatch cleanups
 - GFX 9.4.3 updates
 - Add USB PD and IFWI flashing documentation
 - GPUVM updates
 - RAS fixes
 - DRR fixes
 - FAMS fixes
 - Virtual display fixes
 - Soft IH fixes
 - SMU13 fixes
 - Rework PSP firmware loading for other IPs
 - Kernel doc fixes
 - DCN 3.0.1 fixes
 - LTTPR fixes
 - DP MST fixes
 - DCN 3.1.6 fixes
 - SubVP fixes
 - Display bandwidth calculation fixes
 - VCN4 secure submission fixes
 - Allow building DC on RISC-V
 - Add visible FB info to bo_print_info
 - HBR3 fixes
 - Add PSP 14.0 support
 - GFX9 MCBP fix
 - GMC10 vmhub index fix
 - GMC11 vmhub index fix
 - Create a new doorbell manager
 - SR-IOV fixes
 
 amdkfd:
 - Cleanup CRIU dma-buf handling
 - Use KIQ to unmap HIQ
 - GFX 9.4.3 debugger updates
 - GFX 9.4.2 debugger fixes
 - Enable cooperative groups fof gfx11
 - SVM fixes
 
 radeon:
 - Lots of checkpatch cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZMQ0vAAKCRC93/aFa7yZ
 2EOOAQCrsNf1IEynXVj0gVYOWFDpBCdaDkw+gXR73nOlwBeZzgD8DAoismXYDY95
 pkKlx/HL5O8qyZ25Lc9ZlgsJnTpnpw4=
 =c/Jk
 -----END PGP SIGNATURE-----

Merge tag 'amd-drm-next-6.6-2023-07-28' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-6.6-2023-07-28:

amdgpu:
- Lots of checkpatch cleanups
- GFX 9.4.3 updates
- Add USB PD and IFWI flashing documentation
- GPUVM updates
- RAS fixes
- DRR fixes
- FAMS fixes
- Virtual display fixes
- Soft IH fixes
- SMU13 fixes
- Rework PSP firmware loading for other IPs
- Kernel doc fixes
- DCN 3.0.1 fixes
- LTTPR fixes
- DP MST fixes
- DCN 3.1.6 fixes
- SubVP fixes
- Display bandwidth calculation fixes
- VCN4 secure submission fixes
- Allow building DC on RISC-V
- Add visible FB info to bo_print_info
- HBR3 fixes
- Add PSP 14.0 support
- GFX9 MCBP fix
- GMC10 vmhub index fix
- GMC11 vmhub index fix
- Create a new doorbell manager
- SR-IOV fixes

amdkfd:
- Cleanup CRIU dma-buf handling
- Use KIQ to unmap HIQ
- GFX 9.4.3 debugger updates
- GFX 9.4.2 debugger fixes
- Enable cooperative groups fof gfx11
- SVM fixes

radeon:
- Lots of checkpatch cleanups

Merge conflicts:
- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
	The switch to drm eu helpers in 8a206685d3 ("drm/amdgpu: use
	drm_exec for GEM and CSA handling v2") clashed with the
	cosmetic cleanups from 30953c4d00 ("drm/amdgpu: Fix style
	issues in amdgpu_gem.c"). I
	kept the former since the cleanup up code is gone.
- drivers/gpu/drm/amd/amdgpu/atom.c.
	adf64e2142 ("drm/amd: Avoid reading the VBIOS part number
	twice") removed code that 992b8fe106 ("drm/radeon: Replace
	all non-returning strlcpy with strscpy") polished.

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230728214228.8102-1-alexander.deucher@amd.com
[sima: some merge conflict wrangling as noted]
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2023-08-04 11:10:18 +02:00
Jonathan Kim
fc7f1d9697 drm/amdkfd: fix and enable ttmp setup for gfx11
The MES cached process context must be cleared on adding any queue for
the first time.

For proper debug support, the MES will clear it's cached process context
on the first call to SET_SHADER_DEBUGGER.

This allows TTMPs to be pesistently enabled in a safe manner.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Eric Huang <jinhuieric@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-27 14:59:29 -04:00
Ramesh Errabolu
6d67b681f9 drm/amdgpu: Checkpoint and Restore VRAM BOs without VA
Extend checkpoint logic to allow inclusion of VRAM BOs that
do not have a VA attached

Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-27 14:48:07 -04:00
Jonathan Kim
602816c3ee drm/amdkfd: fix trap handling work around for debugging
Update the list of devices that require the cwsr trap handling
workaround for debugging use cases.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Acked-by: Ruili Ji <ruili.ji@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-25 16:13:58 -04:00
Alex Sierra
8923137dbe drm/amdkfd: avoid svm dump when dynamic debug disabled
Set dynamic_svm_range_dump macro to avoid iterating over SVM lists
from svm_range_debug_dump when dynamic debug is disabled. Otherwise,
it could drop performance, specially with big number of SVM ranges.
Make sure both svm_range_set_attr and svm_range_debug_dump functions
are dynamically enabled to print svm_range_debug_dump debug traces.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Tested-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-25 13:44:05 -04:00
Jonathan Kim
7a1c5c6753 drm/amdkfd: enable cooperative groups for gfx11
MES can concurrently schedule queues on the device that require
exclusive device access if marked exclusively_scheduled without the
requirement of GWS.  Similar to the F32 HWS, MES will manage
quality of service for these queues.
Use this for cooperative groups since cooperative groups are device
occupancy limited.

Since some GFX11 devices can only be debugged with partial CUs, do not
allow the debugging of cooperative groups on these devices as the CU
occupancy limit will change on attach.

In addition, zero initialize the MES add queue submission vector for MES
initialization tests as we do not want these to be cooperative
dispatches.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-25 13:35:43 -04:00
Jonathan Kim
cef600e1fd drm/amdkfd: fix trap handling work around for debugging
Update the list of devices that require the cwsr trap handling
workaround for debugging use cases.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Acked-by: Ruili Ji <ruili.ji@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-21 16:52:25 -04:00
Daniel Vetter
6c7f27441d drm-misc-next for v6.6:
UAPI Changes:
 
  * fbdev:
    * Make fbdev userspace interfaces optional; only leaves the
      framebuffer console active
 
  * prime:
    * Support dma-buf self-import for all drivers automatically: improves
      support for many userspace compositors
 
 Cross-subsystem Changes:
 
  * backlight:
    * Fix interaction with fbdev in several drivers
 
  * base: Convert struct platform.remove to return void; part of a larger,
    tree-wide effort
 
  * dma-buf: Acquire reservation lock for mmap() in exporters; part
    of an on-going effort to simplify locking around dma-bufs
 
  * fbdev:
    * Use Linux device instead of fbdev device in many places
    * Use deferred-I/O helper macros in various drivers
 
  * i2c: Convert struct i2c from .probe_new to .probe; part of a larger,
    tree-wide effort
 
  * video:
    * Avoid including <linux/screen_info.h>
 
 Core Changes:
 
  * atomic:
    * Improve logging
 
  * prime:
    * Remove struct drm_driver.gem_prime_mmap plus driver updates: all
      drivers now implement this callback with drm_gem_prime_mmap()
 
  * gem:
    * Support execution contexts: provides locking over multiple GEM
      objects
 
  * ttm:
    * Support init_on_free
    * Swapout fixes
 
 Driver Changes:
 
  * accel:
    * ivpu: MMU updates; Support debugfs
 
  * ast:
    * Improve device-model detection
    * Cleanups
 
  * bridge:
    * dw-hdmi: Improve support for YUV420 bus format
    * dw-mipi-dsi: Fix enable/disable of DSI controller
    * lt9611uxc: Use MODULE_FIRMWARE()
    * ps8640: Remove broken EDID code
    * samsung-dsim: Fix command transfer
    * tc358764: Handle HS/VS polarity; Use BIT() macro; Various cleanups
    * Cleanups
 
  * ingenic:
    * Kconfig REGMAP fixes
 
  * loongson:
    * Support display controller
 
  * mgag200:
    * Minor fixes
 
  * mxsfb:
    * Support disabling overlay planes
 
  * nouveau:
    * Improve VRAM detection
    * Various fixes and cleanups
 
  * panel:
    * panel-edp: Support AUO B116XAB01.4
    * Support Visionox R66451 plus DT bindings
    * Cleanups
 
  * ssd130x:
    * Support per-controller default resolution plus DT bindings
    * Reduce memory-allocation overhead
    * Cleanups
 
  * tidss:
    * Support TI AM625 plus DT bindings
    * Implement new connector model plus driver updates
 
  * vkms
    * Improve write-back support
    * Documentation fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmSvvRAACgkQaA3BHVML
 eiNpGQgAs8jq1XjN9t8jZsdgXnoCbkZyVUI2NO0HwoVwpRCLgbXp5AX5qq2oRciE
 TBhe4Fceh/ZsYqHTZQahnguxgRKM5JgXwbI4Z0iiOVcqasNbycaKAqipxJJ7kdo1
 qPhGCbgQFVX7oIq2xjfXehh6O0SYX+R9r88X8dMJxMYv/pcLwOHG74kS040WOcQq
 uATgcnobOf/D8ZmlqvfKGAeTUoFo/RSR2Uhlauka58qgeUbicrTELZT2barY9d+k
 as6U5vv4wx2zMklTkjrlkMpAT1ZpbB9d3jGHwL27VEnjlfd3wV2bdH7Dzn9qZRf/
 gn0ALg/b3u5yBWk/k7YBvijXyNcH6Q==
 =bBuG
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2023-07-13' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for v6.6:

UAPI Changes:

 * fbdev:
   * Make fbdev userspace interfaces optional; only leaves the
     framebuffer console active

 * prime:
   * Support dma-buf self-import for all drivers automatically: improves
     support for many userspace compositors

Cross-subsystem Changes:

 * backlight:
   * Fix interaction with fbdev in several drivers

 * base: Convert struct platform.remove to return void; part of a larger,
   tree-wide effort

 * dma-buf: Acquire reservation lock for mmap() in exporters; part
   of an on-going effort to simplify locking around dma-bufs

 * fbdev:
   * Use Linux device instead of fbdev device in many places
   * Use deferred-I/O helper macros in various drivers

 * i2c: Convert struct i2c from .probe_new to .probe; part of a larger,
   tree-wide effort

 * video:
   * Avoid including <linux/screen_info.h>

Core Changes:

 * atomic:
   * Improve logging

 * prime:
   * Remove struct drm_driver.gem_prime_mmap plus driver updates: all
     drivers now implement this callback with drm_gem_prime_mmap()

 * gem:
   * Support execution contexts: provides locking over multiple GEM
     objects

 * ttm:
   * Support init_on_free
   * Swapout fixes

Driver Changes:

 * accel:
   * ivpu: MMU updates; Support debugfs

 * ast:
   * Improve device-model detection
   * Cleanups

 * bridge:
   * dw-hdmi: Improve support for YUV420 bus format
   * dw-mipi-dsi: Fix enable/disable of DSI controller
   * lt9611uxc: Use MODULE_FIRMWARE()
   * ps8640: Remove broken EDID code
   * samsung-dsim: Fix command transfer
   * tc358764: Handle HS/VS polarity; Use BIT() macro; Various cleanups
   * Cleanups

 * ingenic:
   * Kconfig REGMAP fixes

 * loongson:
   * Support display controller

 * mgag200:
   * Minor fixes

 * mxsfb:
   * Support disabling overlay planes

 * nouveau:
   * Improve VRAM detection
   * Various fixes and cleanups

 * panel:
   * panel-edp: Support AUO B116XAB01.4
   * Support Visionox R66451 plus DT bindings
   * Cleanups

 * ssd130x:
   * Support per-controller default resolution plus DT bindings
   * Reduce memory-allocation overhead
   * Cleanups

 * tidss:
   * Support TI AM625 plus DT bindings
   * Implement new connector model plus driver updates

 * vkms
   * Improve write-back support
   * Documentation fixes

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20230713090830.GA23281@linux-uq9g
2023-07-17 15:37:57 +02:00
Jonathan Kim
8e43632695 drm/amdkfd: report dispatch id always saved in ttmps after gc9.4.2
The feature to save the dispatch ID in trap temporaries 6 & 7 on context
save is unconditionally enabled during MQD initialization.

Now that TTMPs are always setup regardless of debug mode for GC 9.4.3, we
should report that the dispatch ID is always available for debug/trap
handling.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-12 12:22:52 -04:00
Mukul Joshi
1879e009a4 drm/amdkfd: Update CWSR grace period for GFX9.4.3
For GFX9.4.3, setup a reduced default CWSR grace period equal to
1000 cycles instead of 64000 cycles.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-12 11:12:09 -04:00
Jonathan Kim
41b8a08109 drm/amdkfd: add multi-process debugging support for GC v9.4.3
Similar to GC v9.4.2, GC v9.4.3 should use the 5-Dword extended
MAP_PROCESS packet to support multi-process debugging.  Update the
mutli-process debug support list so that the KFD updates the runlist
on debug mode setting and that it allocates enough GTT memory during
KFD device initialization.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-12 11:12:09 -04:00
Jonathan Kim
7a93cc579c drm/amdkfd: enable watch points globally for gfx943
Set watch points for all xcc instances on GFX943.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-12 11:12:08 -04:00
Jonathan Kim
567db9e070 drm/amdkfd: restore debugger additional info for gfx v9_4_3
The additional information that the KFD reports to the debugger was
destroyed when the following commit was merged:
"drm/amdkfd: convert switches to IP version checking"

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Acked-by: Amber Lin <amber.lin@amd.com>
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-12 11:10:19 -04:00
Eric Huang
036e348fdc drm/amdkfd: add kfd2kgd debugger callbacks for GC v9.4.3
Implement the similarities as GC v9.4.2, and the difference
for GC v9.4.3 HW spec, i.e. xcc instance.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-12 10:58:01 -04:00
Christian König
8abc1eb298 drm/amdkfd: switch over to using drm_exec v3
Avoids quite a bit of logic and kmalloc overhead.

v2: fix multiple problems pointed out by Felix
v3: two more nit picks from Felix fixed

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230711133122.3710-4-christian.koenig@amd.com
2023-07-12 14:14:26 +02:00
Philip Yang
8c45b31909 drm/amdkfd: Skip handle mapping SVM range with no GPU access
If the SVM range has no GPU access nor access-in-place attribute,
validate and map to GPU should skip the range.

Add NULL pointer check if find_first_bit(ctx->bitmap, MAX_GPU_INSTANCE)
returns MAX_GPU_INSTANCE as gpuidx if ctx->bitmap is empty.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-07 13:51:48 -04:00
Mukul Joshi
9041b53a59 drm/amdkfd: Use KIQ to unmap HIQ
Currently, we unmap HIQ by directly writing to HQD
registers. This doesn't work for GFX9.4.3. Instead,
use KIQ to unmap HIQ, similar to how we use KIQ to
map HIQ. Using KIQ to unmap HIQ works for all GFX
series post GFXv9.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-07 13:51:48 -04:00
Ramesh Errabolu
95de7f26b5 drm/amdkfd: Access gpuvm_export_dmabuf() API to get Dmabuf
Directly invoking the function amdgpu_gem_prime_export() from within
KFD is not correct. By utilizing the KFD API to obtain Dmabuf, the
implementation can prevent the creation of multiple instances of
struct dma_buf.

Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: David Francis <David.Francis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-07 13:51:47 -04:00
Mukul Joshi
d4300362a6 drm/amdkfd: Update interrupt handling for GFX 9.4.3
For GFX 9.4.3, interrupt handling needs to be updated for:
- Interrupt cookie will have a NodeId field. Each KFD
  node needs to check the NodeId before processing the
  interrupt.
- For CPX mode, there are additional checks of client ID
  needed to process the interrupt.
- Add NodeId to the process drain interrupt.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-30 13:11:35 -04:00
Mukul Joshi
fc133acc43 drm/amdkfd: Enable GWS on GFX9.4.3
Enable GWS capable queue creation for forward
progress gaurantee on GFX 9.4.3.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-30 13:06:49 -04:00
Alex Sierra
03d400e760 drm/amdkfd: set coherent host access capability flag
This flag determines whether the host possesses coherent access to
the memory of the device.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-23 15:35:34 -04:00