Commit Graph

34 Commits

Author SHA1 Message Date
Linus Torvalds
260f6f4fda drm for 6.17-rc1
non-drm:
 rust:
 - make ETIMEDOUT available
 - add size constants up to SZ_2G
 - add DMA coherent allocation bindings
 mtd:
 - driver for Intel GPU non-volatile storage
 i2c
 - designware quirk for Intel xe
 
 core:
 - atomic helpers: tune enable/disable sequences
 - add task info to wedge API
 - refactor EDID quirks
 - connector: move HDR sink to drm_display_info
 - fourcc: half-float and 32-bit float formats
 - mode_config: pass format info to simplify
 
 dma-buf:
 - heaps: Give CMA heap a stable name
 
 ci:
 - add device tree validation and kunit
 
 displayport:
 - change AUX DPCD access probe address
 - add quirk for DPCD probe
 - add panel replay definitions
 - backlight control helpers
 
 fbdev:
 - make CONFIG_FIRMWARE_EDID available on all arches
 
 fence:
 - fix UAF issues
 
 format-helper:
 - improve tests
 
 gpusvm:
 - introduce devmem only flag for allocation
 - add timeslicing support to GPU SVM
 
 ttm:
 - improve eviction
 
 sched:
 - tracing improvements
 - kunit improvements
 - memory leak fixes
 - reset handling improvements
 
 color mgmt:
 - add hardware gamma LUT handling helpers
 
 bridge:
 - add destroy hook
 - switch to reference counted drm_bridge allocations
 - tc358767: convert to devm_drm_bridge_alloc
 - improve CEC handling
 
 panel:
 - switch to reference counter drm_panel allocations
 - fwnode panel lookup
 - Huiling hl055fhv028c support
 - Raspberry Pi 7" 720x1280 support
 - edp: KDC KD116N3730A05, N160JCE-ELL CMN, N116BCJ-EAK
 - simple: AUO P238HAN01
 - st7701: Winstar wf40eswaa6mnn0
 - visionox: rm69299-shift
 - Renesas R61307, Renesas R69328 support
 - DJN HX83112B
 
 hdmi:
 - add CEC handling
 - YUV420 output support
 
 xe:
 - WildCat Lake support
 - Enable PanthorLake by default
 - mark BMG as SRIOV capable
 - update firmware recommendations
 - Expose media OA units
 - aux-bux support for non-volatile memory
 - MTD intel-dg driver for non-volatile memory
 - Expose fan control and voltage regulator in sysfs
 - restructure migration for multi-device
 - Restore GuC submit UAF fix
 - make GEM shrinker drm managed
 - SRIOV VF Post-migration recovery of GGTT nodes
 - W/A additions/reworks
 - Prefetch support for svm ranges
 - Don't allocate managed BO for each policy change
 - HWMON fixes for BMG
 - Create LRC BO without VM
 - PCI ID updates
 - make SLPC debugfs files optional
 - rework eviction rejection of bound external BOs
 - consolidate PAT programming logic for pre/post Xe2
 - init changes for flicker-free boot
 - Enable GuC Dynamic Inhibit Context switch
 
 i915:
 - drm_panic support for i915/xe
 - initial flip queue off by default for LNL/PNL
 - Wildcat Lake Display support
 - Support for DSC fractional link bpp
 - Support for simultaneous Panel Replay and Adaptive sync
 - Support for PTL+ double buffer LUT
 - initial PIPEDMC event handling
 - drm_panel_follower support
 - DPLL interface renames
 - allocate struct intel_display dynamically
 - flip queue preperation
 - abstract DRAM detection better
 - avoid GuC scheduling stalls
 - remove DG1 force probe requirement
 - fix MEI interrupt handler on RT kernels
 - use backlight control helpers for eDP
 - more shared display code refactoring
 
 amdgpu:
 - add userq slot to INFO ioctl
 - SR-IOV hibernation support
 - Suspend improvements
 - Backlight improvements
 - Use scaling for non-native eDP modes
 - cleaner shader updates for GC 9.x
 - Remove fence slab
 - SDMA fw checks for userq support
 - RAS updates
 - DMCUB updates
 - DP tunneling fixes
 - Display idle D3 support
 - Per queue reset improvements
 - initial smartmux support
 
 amdkfd:
 - enable KFD on loongarch
 - mtype fix for ext coherent system memory
 
 radeon:
 - CS validation additional GL extensions
 - drop console lock during suspend/resume
 - bump driver version
 
 msm:
 - VM BIND support
 - CI: infrastructure updates
 - UBWC single source of truth
 - decouple GPU and KMS support
 - DP: rework I/O accessors
 - DPU: SM8750 support
 - DSI: SM8750 support
 - GPU: X1-45 support and speedbin support for X1-85
 - MDSS: SM8750 support
 
 nova:
 - register! macro improvements
 - DMA object abstraction
 - VBIOS parser + fwsec lookup
 - sysmem flush page support
 - falcon: generic falcon boot code and HAL
 - FWSEC-FRTS: fb setup and load/execute
 
 ivpu:
 - Add Wildcat Lake support
 - Add turbo flag
 
 ast:
 - improve hardware generations implementation
 
 imx:
 - IMX8qxq Display Controller support
 
 lima:
 - Rockchip RK3528 GPU support
 
 nouveau:
 - fence handling cleanup
 
 panfrost:
 - MT8370 support
 - bo labeling
 - 64-bit register access
 
 qaic:
 - add RAS support
 
 rockchip:
 - convert inno_hdmi to a bridge
 
 rz-du:
 - add RZ/V2H(P) support
 - MIPI-DSI DCS support
 
 sitronix:
 - ST7567 support
 
 sun4i:
 - add H616 support
 
 tidss:
 - add TI AM62L support
 - AM65x OLDI bridge support
 
 bochs:
 - drm panic support
 
 vkms:
 - YUV and R* format support
 - use faux device
 
 vmwgfx:
 - fence improvements
 
 hyperv:
 - move out of simple
 - add drm_panic support
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmiJM/0ACgkQDHTzWXnE
 hr6MpA/+JJKGdSdrE95QkaMcOZh/3e3areGXZ0V/RrrJXdB4/DoAfQSHhF0H7m7y
 MhBGVLGNMXq7KHrz28p1MjLHrE1mwmvJ6hZ4J076ed4u9naoCD0m6k5w5wiue+KL
 HyPR54ADxN0BYmgV0l/B0wj42KsHyTO4x4hdqPJu02V9Dtmx6FCh2ujkOF3p9nbK
 GMwWDttl4KEKljD0IvQ9YIYJ66crYGx/XmZi7JoWRrS104K/h1u8qZuXBp5jVKTy
 OZRAVyLdmJqdTOLH7l599MBBcEd/bNV37/LVwF4T5iFunEKOAiyN0QY0OR+IeRVh
 ZfOv2/gp4UNyIfyahQ7LKLgEilNPGHoPitvDJPvBZxW2UjwXVNvA1QfdK5DAlVRS
 D5NoFRjlFFCz8/c2hQwlKJ9o7eVgH3/pK0mwR7SPGQTuqzLFCrAfCuzUvg/gV++6
 JFqmGKMHeCoxO2o4GMrwjFttStP41usxtV/D+grcbPteNO9UyKJS4C38n4eamJXM
 a9Sy9APuAb6F0w5+yMItEF7TQifgmhIbm5AZHlxE1KoDQV6TdiIf1Gou5LeDGoL6
 OACbXHJPL52tUnfCRpbfI4tE/IVyYsfL01JnvZ5cZZWItXfcIz76ykJri+E0G60g
 yRl/zkimHKO4B0l/HSzal5xROXr+3VzeWehEiz/ot1VriP5OesA=
 =n9MO
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2025-07-30' of https://gitlab.freedesktop.org/drm/kernel

Pull drm updates from Dave Airlie:
 "Highlights:

   - Intel xe enable Panthor Lake, started adding WildCat Lake

   - amdgpu has a bunch of reset improvments along with the usual IP
     updates

   - msm got VM_BIND support which is important for vulkan sparse memory

   - more drm_panic users

   - gpusvm common code to handle a bunch of core SVM work outside
     drivers.

  Detail summary:

  Changes outside drm subdirectory:
   - 'shrink_shmem_memory()' for better shmem/hibernate interaction
   - Rust support infrastructure:
      - make ETIMEDOUT available
      - add size constants up to SZ_2G
      - add DMA coherent allocation bindings
   - mtd driver for Intel GPU non-volatile storage
   - i2c designware quirk for Intel xe

  core:
   - atomic helpers: tune enable/disable sequences
   - add task info to wedge API
   - refactor EDID quirks
   - connector: move HDR sink to drm_display_info
   - fourcc: half-float and 32-bit float formats
   - mode_config: pass format info to simplify

  dma-buf:
   - heaps: Give CMA heap a stable name

  ci:
   - add device tree validation and kunit

  displayport:
   - change AUX DPCD access probe address
   - add quirk for DPCD probe
   - add panel replay definitions
   - backlight control helpers

  fbdev:
   - make CONFIG_FIRMWARE_EDID available on all arches

  fence:
   - fix UAF issues

  format-helper:
   - improve tests

  gpusvm:
   - introduce devmem only flag for allocation
   - add timeslicing support to GPU SVM

  ttm:
   - improve eviction

  sched:
   - tracing improvements
   - kunit improvements
   - memory leak fixes
   - reset handling improvements

  color mgmt:
   - add hardware gamma LUT handling helpers

  bridge:
   - add destroy hook
   - switch to reference counted drm_bridge allocations
   - tc358767: convert to devm_drm_bridge_alloc
   - improve CEC handling

  panel:
   - switch to reference counter drm_panel allocations
   - fwnode panel lookup
   - Huiling hl055fhv028c support
   - Raspberry Pi 7" 720x1280 support
   - edp: KDC KD116N3730A05, N160JCE-ELL CMN, N116BCJ-EAK
   - simple: AUO P238HAN01
   - st7701: Winstar wf40eswaa6mnn0
   - visionox: rm69299-shift
   - Renesas R61307, Renesas R69328 support
   - DJN HX83112B

  hdmi:
   - add CEC handling
   - YUV420 output support

  xe:
   - WildCat Lake support
   - Enable PanthorLake by default
   - mark BMG as SRIOV capable
   - update firmware recommendations
   - Expose media OA units
   - aux-bux support for non-volatile memory
   - MTD intel-dg driver for non-volatile memory
   - Expose fan control and voltage regulator in sysfs
   - restructure migration for multi-device
   - Restore GuC submit UAF fix
   - make GEM shrinker drm managed
   - SRIOV VF Post-migration recovery of GGTT nodes
   - W/A additions/reworks
   - Prefetch support for svm ranges
   - Don't allocate managed BO for each policy change
   - HWMON fixes for BMG
   - Create LRC BO without VM
   - PCI ID updates
   - make SLPC debugfs files optional
   - rework eviction rejection of bound external BOs
   - consolidate PAT programming logic for pre/post Xe2
   - init changes for flicker-free boot
   - Enable GuC Dynamic Inhibit Context switch

  i915:
   - drm_panic support for i915/xe
   - initial flip queue off by default for LNL/PNL
   - Wildcat Lake Display support
   - Support for DSC fractional link bpp
   - Support for simultaneous Panel Replay and Adaptive sync
   - Support for PTL+ double buffer LUT
   - initial PIPEDMC event handling
   - drm_panel_follower support
   - DPLL interface renames
   - allocate struct intel_display dynamically
   - flip queue preperation
   - abstract DRAM detection better
   - avoid GuC scheduling stalls
   - remove DG1 force probe requirement
   - fix MEI interrupt handler on RT kernels
   - use backlight control helpers for eDP
   - more shared display code refactoring

  amdgpu:
   - add userq slot to INFO ioctl
   - SR-IOV hibernation support
   - Suspend improvements
   - Backlight improvements
   - Use scaling for non-native eDP modes
   - cleaner shader updates for GC 9.x
   - Remove fence slab
   - SDMA fw checks for userq support
   - RAS updates
   - DMCUB updates
   - DP tunneling fixes
   - Display idle D3 support
   - Per queue reset improvements
   - initial smartmux support

  amdkfd:
   - enable KFD on loongarch
   - mtype fix for ext coherent system memory

  radeon:
   - CS validation additional GL extensions
   - drop console lock during suspend/resume
   - bump driver version

  msm:
   - VM BIND support
   - CI: infrastructure updates
   - UBWC single source of truth
   - decouple GPU and KMS support
   - DP: rework I/O accessors
   - DPU: SM8750 support
   - DSI: SM8750 support
   - GPU: X1-45 support and speedbin support for X1-85
   - MDSS: SM8750 support

  nova:
   - register! macro improvements
   - DMA object abstraction
   - VBIOS parser + fwsec lookup
   - sysmem flush page support
   - falcon: generic falcon boot code and HAL
   - FWSEC-FRTS: fb setup and load/execute

  ivpu:
   - Add Wildcat Lake support
   - Add turbo flag

  ast:
   - improve hardware generations implementation

  imx:
   - IMX8qxq Display Controller support

  lima:
   - Rockchip RK3528 GPU support

  nouveau:
   - fence handling cleanup

  panfrost:
   - MT8370 support
   - bo labeling
   - 64-bit register access

  qaic:
   - add RAS support

  rockchip:
   - convert inno_hdmi to a bridge

  rz-du:
   - add RZ/V2H(P) support
   - MIPI-DSI DCS support

  sitronix:
   - ST7567 support

  sun4i:
   - add H616 support

  tidss:
   - add TI AM62L support
   - AM65x OLDI bridge support

  bochs:
   - drm panic support

  vkms:
   - YUV and R* format support
   - use faux device

  vmwgfx:
   - fence improvements

  hyperv:
   - move out of simple
   - add drm_panic support"

* tag 'drm-next-2025-07-30' of https://gitlab.freedesktop.org/drm/kernel: (1479 commits)
  drm/tidss: oldi: convert to devm_drm_bridge_alloc() API
  drm/tidss: encoder: convert to devm_drm_bridge_alloc()
  drm/amdgpu: move reset support type checks into the caller
  drm/amdgpu/sdma7: re-emit unprocessed state on ring reset
  drm/amdgpu/sdma6: re-emit unprocessed state on ring reset
  drm/amdgpu/sdma5.2: re-emit unprocessed state on ring reset
  drm/amdgpu/sdma5: re-emit unprocessed state on ring reset
  drm/amdgpu/gfx12: re-emit unprocessed state on ring reset
  drm/amdgpu/gfx11: re-emit unprocessed state on ring reset
  drm/amdgpu/gfx10: re-emit unprocessed state on ring reset
  drm/amdgpu/gfx9.4.3: re-emit unprocessed state on kcq reset
  drm/amdgpu/gfx9: re-emit unprocessed state on kcq reset
  drm/amdgpu: Add WARN_ON to the resource clear function
  drm/amd/pm: Use cached metrics data on SMUv13.0.6
  drm/amd/pm: Use cached data for min/max clocks
  gpu: nova-core: fix bounds check in PmuLookupTableEntry::new
  drm/amdgpu: Replace HQD terminology with slots naming
  drm/amdgpu: Add user queue instance count in HW IP info
  drm/amd/amdgpu: Add helper functions for isp buffers
  drm/amd/amdgpu: Initialize swnode for ISP MFD device
  ...
2025-07-30 19:26:49 -07:00
Lucas De Marchi
99e91521ce drm/xe: Fix build without debugfs
When CONFIG_DEBUG_FS is off, drivers/gpu/drm/xe/xe_gt_debugfs.o
is not built and build fails on some setups with:

	ld: drivers/gpu/drm/xe/xe_gt.o: in function `xe_fault_inject_gt_reset':
	drivers/gpu/drm/xe/xe_gt.h:27:(.text+0x1659): undefined reference to `gt_reset_failure'
	ld: drivers/gpu/drm/xe/xe_gt.h:27:(.text+0x1c16): undefined reference to `gt_reset_failure'
	collect2: error: ld returned 1 exit status

Do not use the gt_reset_failure attribute if debugfs is not enabled.

Fixes: 8f3013e0b2 ("drm/xe: Introduce fault injection for gt reset")
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://lore.kernel.org/r/20250722-xe-fix-build-fault-v1-1-157384d50987@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 4d3bbe9dd28c0a4ca119e4b8823c5f5e9cb3ff90)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-07-24 21:06:52 +02:00
Michal Wajdeczko
ffab82b062 drm/xe: Introduce xe_gt_is_main_type helper
Instead of checking for not being a media type GT provide a small
helper to explicitly express our intentions.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://lore.kernel.org/r/20250713103625.1964-5-michal.wajdeczko@intel.com
2025-07-14 18:19:29 +02:00
Maarten Lankhorst
11bf0f0b3a drm/xe: Split init of xe_gt_init_hwconfig to xe_gt_init and *_early
Now that we added the separate step of initialising GUC in
xe_gt_init_early, it should be ok to initialise the minimum during early
init, and the rest after allocations are allowed.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250619104858.418440-20-dev@lankhorst.se
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-06-26 22:11:35 +02:00
Lucas De Marchi
ff6cd29b69 drm/xe: Cleanup unwind of gt initialization
The only thing in xe_gt_remove() that really needs to happen on the
device remove callback is the xe_uc_remove(). That's because of the
following call chain:

	xe_gt_remove()
	  xe_uc_remove()
	    xe_gsc_remove()
	      xe_gsc_proxy_remove()

Move xe_gsc_proxy_remove() to be handled as a xe_device_remove_action,
so it's recorded when it should run during device removal. The rest can
be handled normally by devm infra.

Besides removing the deep call chain above, xe_device_probe() doesn't
have to unwind the gt loop and it's also more in line with the
xe_device_probe() style.

Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250213192909.996148-7-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-14 11:42:55 -08:00
Maciej Patelczyk
155c77f45f drm/xe: introduce xe_gt_reset and xe_gt_wait_for_reset
Add synchronous version gt reset as there are few places where it
is expected.
Also add a wait helper to wait until gt reset is done.

Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Fixes: f3bc5bb4d5 ("drm/xe: Allow userspace to configure CCS mode")
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241211111727.1481476-2-maciej.patelczyk@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
2025-01-13 10:44:43 +01:00
Nitin Gote
75fd04f276 drm/xe: Fix all typos in xe
Fix all typos in files of xe, reported by codespell tool.

Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250106102646.1400146-2-nitin.r.gote@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
2025-01-09 17:58:09 +01:00
Thomas Hellström
b88132ceb3 Merge drm/drm-next into drm-xe-next
Backmerging to resolve a conflict with core locally.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-10-04 11:46:30 +02:00
Matt Roper
58548b9110 drm/xe: Defer gt->mmio initialization until after multi-tile setup
With the recent xe_mmio redesign, tiles and GTs each have their own MMIO
accessor, with the GT inheriting some of the information (such as the
iomap pointer) from their containing tile.  Given that non-root tiles
get initialized later than the root tile (and currently after the point
at which GT MMIO is initialized for _all_ GTs), we wind up incorrectly
inheriting uninitialized pointers for the initialization of GT MMIO for
GTs that reside on non-root tiles.  This causes a driver crash on
multi-tile PVC platforms.

With the general xe_mmio redesign, it's now only necessary to do the
GT-level MMIO setup before the point we start reading/writing GT
registers.  Move initialization of gt->mmio out of xe_info_init (which
runs before non-root tiles are initialized) and to the beginning of
where we start actually accessing the GTs themselves.

The high-level initialization flow now boils down to:
 - General device init, software-only setup
 - (no register access possible yet)
 - Root tile initialization
 - (access to device/tile0 registers possible via xe_root_tile_mmio())
 - Initialization of non-root tiles
 - (access to any tile's registers possible via tile->mmio)
 - GT MMIO initialization, inheriting iomap from each GT's tile
 - (access to any GT's registers possible via gt->mmio)

Fixes: fa599b8c95 ("drm/xe: Populate GT's mmio iomap from tile during init")
Reported-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Sai Teja Pottumuttu <sai.teja.pottumuttu@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240917221615.875962-2-matthew.d.roper@intel.com
2024-09-18 12:52:53 -07:00
Maarten Lankhorst
501d799a47 drm/xe: Wire up device shutdown handler
The system is turning off, and we should probably put the device
in a safe power state. We don't need to evict VRAM or suspend running
jobs to a safe state, as the device is rebooted anyway.

This does not imply the system is necessarily reset, as we can
kexec into a new kernel. Without shutting down, things like
USB Type-C may mysteriously start failing.

References: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/3500
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
[mlankhorst: Add !xe_driver_flr_disabled assert]
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240905150052.174895-4-maarten.lankhorst@linux.intel.com
2024-09-11 19:07:57 +02:00
Jani Nikula
ccbfd2df30 drm/xe: clean up fault injection usage
With the proper stubs in place in linux/fault-inject.h, we can remove a
bunch of conditional compilation for CONFIG_FAULT_INJECTION=n.

Link: https://lkml.kernel.org/r/20240813121237.2382534-3-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Abhinav Kumar <quic_abhinavk@quicinc.com>
Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Cc: Rob Clark <robdclark@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:33 -07:00
Matthew Brost
c9474b726b
drm/xe: Wedge the entire device
Wedge the entire device, not just GT which may have triggered the wedge.
To implement this, cleanup the layering so xe_device_declare_wedged()
calls into the lower layers (GT) to ensure entire device is wedged.

While we are here, also signal any pending GT TLB invalidations upon
wedging device.

Lastly, short circuit reset wait if device is wedged.

v2:
 - Short circuit reset wait if device is wedged (Local testing)

Fixes: 8ed9aaae39 ("drm/xe: Force wedged state and block GT reset upon any GPU hang")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240716063902.1390130-1-matthew.brost@intel.com
(cherry picked from commit 7dbe8af13c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-07-18 10:25:33 -04:00
Matthew Brost
7dbe8af13c drm/xe: Wedge the entire device
Wedge the entire device, not just GT which may have triggered the wedge.
To implement this, cleanup the layering so xe_device_declare_wedged()
calls into the lower layers (GT) to ensure entire device is wedged.

While we are here, also signal any pending GT TLB invalidations upon
wedging device.

Lastly, short circuit reset wait if device is wedged.

v2:
 - Short circuit reset wait if device is wedged (Local testing)

Fixes: 8ed9aaae39 ("drm/xe: Force wedged state and block GT reset upon any GPU hang")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240716063902.1390130-1-matthew.brost@intel.com
2024-07-17 11:58:26 -07:00
Vinay Belgaumkar
3b1592fb78
drm/xe/lnl: Apply Wa_22019338487
This WA requires us to limit media GT frequency requests to a certain
cap value during driver load. Freq limits are restored after load
completes, so perf will not be affected during normal operations.

During normal driver operation, this WA requires dummy writes to media
offset 0x380D8C after every ~63 GGTT writes. This will ensure completion
of the LMEM writes originating from Gunit.

During driver unload(before FLR), the WA requires that we set requested
frequency to the cap value again.

v3: Do not use WA number in function name. Call WA wrapper from xe_device.
Rename some variables, check for locks in the correct function (Rodrigo).
Ensure reset path is also covered for this WA.

v4: Fix BAT failure

v5: Add a function pointer for ggtt_ops (Michal W)

v6: Fix name collision and use static function (Rodrigo)

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240620224928.3986377-2-vinay.belgaumkar@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-06-26 18:23:45 -04:00
Daniele Ceraolo Spurio
1c4324793e Revert "drm/xe: make gt_remove use devm"
This reverts commit cd506a33b0.

The gt_remove function was explicitly added as part of the remove flow
instead of using drmm/devm automatic cleanup due to it being illegal
to remove a component after the driver has been detached from the pci
device; the GSC proxy component is removed as part of gt_remove, so we
need to do it in the pci cleanup flow. The function already has a
comment above it to explain this.

Note that the change to use the devm also caused an invalid pointer
deref in the gsc_proxy unbind function, but I didn't bother to debug
which pointer was bad since we shouldn't be calling the unbind that
late anyway and this revert fixes it.

Both issue were not seen in CI because the GSC loading is temporarily
disabled due to a critical bug, which means we're not binding the
component.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240528182354.1200424-1-daniele.ceraolospurio@intel.com
2024-05-30 13:24:36 -07:00
Matthew Auld
cd506a33b0 drm/xe: make gt_remove use devm
No need to hand roll the onion unwind here, just move gt_remove over to
devm which will already have the correct ordering.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240522102143.128069-31-matthew.auld@intel.com
2024-05-22 13:22:40 +01:00
Lucas De Marchi
6aa18d7436 drm/xe: Add helper to return any available hw engine
Get the first available engine from a gt, which helps in the case any
engine serves as a context, like when reading RING_TIMESTAMP.

Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240517204310.88854-8-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-05-21 06:33:40 -07:00
Lucas De Marchi
baa1486552 drm/xe: Cache data about user-visible engines
gt->info.engine_mask used to indicate the available engines, but that
is not always true anymore: some engines are reserved to kernel and some
may be exposed as a single engine (e.g. with ccs_mode).

Runtime changes only happen when no clients exist, so it's safe to cache
the list of engines in the gt and update that when it's needed. This
will help implementing per client engine utilization so this (mostly
constant) information doesn't need to be re-calculated on every query.

Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240517204310.88854-7-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-05-21 06:33:40 -07:00
Niranjana Vishwanathapura
d6219e1cd5 drm/xe: Add Indirect Ring State support
When Indirect Ring State is enabled, the Ring Buffer state and
Batch Buffer state are context save/restored to/from Indirect
Ring State instead of the LRC. The Indirect Ring State is a 4K
page mapped in global GTT at a 4K aligned address. This address
is programmed in the INDIRECT_RING_STATE register of the
corresponding context's LRC.

v2: Fix kernel-doc, add bspec reference
v3: Fix typo in commit text

Bspec: 67296, 67139
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240507224255.5059-3-niranjana.vishwanathapura@intel.com
2024-05-08 14:48:30 -07:00
Michał Winiarski
bf8ec3c3e8 drm/xe: Initialize GuC earlier during probe
SR-IOV VF has limited access to MMIO registers. Fortunately, it is able
to access a curated subset that is needed to initialize the driver by
communicating with SR-IOV PF using GuC CT.
Initialize GuC earlier in order to keep the unified probe ordering
between VF and PF modes.

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240219130530.1406044-4-michal.winiarski@intel.com
2024-02-20 14:13:47 -05:00
Daniele Ceraolo Spurio
997a55caa1 drm/xe/gsc: Initialize GSC proxy
The GSC uC needs to communicate with the CSME to perform certain
operations. Since the GSC can't perform this communication directly on
platforms where it is integrated in GT, the graphics driver needs to
transfer the messages from GSC to CSME and back. The proxy flow must be
manually started after the GSC is loaded to signal to GSC that we're
ready to handle its messages and allow it to query its init data from
CSME.

Note that the component must be removed before the pci_remove call
completes, so we can't use a drmm helper for it and we need to instead
perform the cleanup as part of the removal flow.

v2: add function documentation, more targeted memory clear, clearer logs
and variable names (Alan)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Suraj Kandpal <suraj.kandpal@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240117182621.2653049-2-daniele.ceraolospurio@intel.com
2024-01-18 11:04:34 -08:00
Rodrigo Vivi
e157f0f762 drm/xe: Fix build without CONFIG_FAULT_INJECTION
Ideally this header could be included without the CONFIG_FAULT_INJECTION
and it would take care itself for the includes it needs.
So, let's temporary workaround this by moving this below and including
only when CONFIG_FAULT_INJECTION is selected to avoid build breakages.

Another solution would be us including the linux/types.h as well, but
this creates unnecessary cases.

Reference: https://lore.kernel.org/all/20230816134748.979231-1-himal.prasad.ghimiray@intel.com/
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Oded Gabbay <ogabbay@kernel.org>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2023-12-21 11:47:38 -05:00
Lucas De Marchi
5a92da34dd drm/xe: Rename info.supports_* to info.has_*
Rename supports_mmio_ext and supports_usm to use a has_ prefix so the
flags are grouped together. This settles on just one variant for
positive info matching ("has_") and one for negative ("skip_").

Also make sure the has_* flags are grouped together in xe_pci.c.

Reviewed-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20231205145235.2114761-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:45:27 -05:00
Niranjana Vishwanathapura
0d97ecce16 drm/xe: Enable Fixed CCS mode setting
Disable dynamic HW load balancing of compute resource assignment
to engines and instead enabled fixed mode of mapping compute
resources to engines on all platforms with more than one compute
engine.

By default enable only one CCS engine with all compute slices
assigned to it. This is the desired configuration for common
workloads.

PVC platform supports only the fixed CCS mode (workaround 16016805146).

v2: Rebase, make it platform agnostic
v3: Minor code refactoring

Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:45:26 -05:00
Himal Prasad Ghimiray
8f3013e0b2 drm/xe: Introduce fault injection for gt reset
To trigger gt reset failure:
 echo 100 >  /sys/kernel/debug/dri/<cardX>/fail_gt_reset/probability
 echo 2 >  /sys/kernel/debug/dri/<cardX>/fail_gt_reset/times

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:15 -05:00
Francois Dugast
3e8e7ee6a3 drm/xe: Cleanup style warnings
Reduce the number of warnings reported by checkpatch.pl from 118 to 48 by
addressing those warnings types:

  LEADING_SPACE
  LINE_SPACING
  BRACES
  TRAILING_SEMICOLON
  CONSTANT_COMPARISON
  BLOCK_COMMENT_STYLE
  RETURN_VOID
  ONE_SEMICOLON
  SUSPECT_CODE_INDENT
  LINE_CONTINUATIONS
  UNNECESSARY_ELSE
  UNSPECIFIED_INT
  UNNECESSARY_INT
  MISORDERED_TYPE

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:31 -05:00
Francois Dugast
4cd6d49259 drm/xe: Cleanup SPACING style issues
Remove almost all existing style issues of type SPACING reported
by checkpatch.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:30 -05:00
Matt Roper
f6929e80cd drm/xe: Allocate GT dynamically
In preparation for re-adding media GT support, switch the primary GT
within the tile to a dynamic allocation.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-19-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:34:15 -05:00
Matt Roper
08dea76745 drm/xe: Move migration from GT to tile
Migration primarily focuses on the memory associated with a tile, so it
makes more sense to track this at the tile level (especially since the
driver was already skipping migration operations on media GTs).

Note that the blitter engine used to perform the migration always lives
in the tile's primary GT today.  In theory that could change if media
GTs ever start including blitter engines in the future, but we can
extend the design if/when that happens in the future.

v2:
 - Fix kunit test build
 - Kerneldoc parameter name update
v3:
 - Removed leftover prototype for removed function.  (Gustavo)
 - Remove unrelated / unwanted error handling change.  (Gustavo)

Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-15-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:34:15 -05:00
Matt Roper
ebd288cba7 drm/xe: Move VRAM from GT to tile
On platforms with VRAM, the VRAM is associated with the tile, not the
GT.

v2:
 - Unsquash the GGTT handling back into its own patch.
 - Fix kunit test build
v3:
 - Tweak the "FIXME" comment to clarify that this function will be
   completely gone by the end of the series.  (Lucas)
v4:
 - Move a few changes that were supposed to be part of the GGTT patch
   back to that commit.  (Gustavo)
v5:
 - Kerneldoc parameter name fix.

Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-11-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:34:11 -05:00
Matt Roper
f79ee3013a drm/xe: Add backpointer from gt to tile
Rather than a backpointer to the xe_device, a GT should have a
backpointer to its tile (which can then be used to lookup the device if
necessary).

The gt_to_xe() helper macro (which moves from xe_gt.h to xe_gt_types.h)
can and should still be used to jump directly from an xe_gt to
xe_device.

v2:
 - Fix kunit test build
 - Move a couple changes to the previous patch. (Lucas)

Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-4-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:34:11 -05:00
Lucas De Marchi
8846ffb457 drm/xe: Allow const propagation in gt_to_xe()
Replace the inline function with a _Generic() so gt_to_xe() can work
with a const struct xe_gt*, which leads to a const struct xe *.
This allows a const gt being passed around and when the xe device is
needed, compiler won't issue a warning that calling gt_to_xe() would
discard the const. Rather, just propagate the const to the xe pointer
being returned.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:45 -05:00
Matthew Brost
da3799c975 drm/xe: Use GuC to do GGTT invalidations for the GuC firmware
Only the GuC should be issuing TLB invalidations if it is enabled. Part
of this patch is sanitize the device on driver unload to ensure we do
not send GuC based TLB invalidations during driver unload.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
2023-12-19 18:27:47 -05:00
Matthew Brost
dd08ebf6c3 drm/xe: Introduce a new DRM driver for Intel GPUs
Xe, is a new driver for Intel GPUs that supports both integrated and
discrete platforms starting with Tiger Lake (first Intel Xe Architecture).

The code is at a stage where it is already functional and has experimental
support for multiple platforms starting from Tiger Lake, with initial
support implemented in Mesa (for Iris and Anv, our OpenGL and Vulkan
drivers), as well as in NEO (for OpenCL and Level0).

The new Xe driver leverages a lot from i915.

As for display, the intent is to share the display code with the i915
driver so that there is maximum reuse there. But it is not added
in this patch.

This initial work is a collaboration of many people and unfortunately
the big squashed patch won't fully honor the proper credits. But let's
get some git quick stats so we can at least try to preserve some of the
credits:

Co-developed-by: Matthew Brost <matthew.brost@intel.com>
Co-developed-by: Matthew Auld <matthew.auld@intel.com>
Co-developed-by: Matt Roper <matthew.d.roper@intel.com>
Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Co-developed-by: Francois Dugast <francois.dugast@intel.com>
Co-developed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Co-developed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Co-developed-by: Philippe Lecluse <philippe.lecluse@intel.com>
Co-developed-by: Nirmoy Das <nirmoy.das@intel.com>
Co-developed-by: Jani Nikula <jani.nikula@intel.com>
Co-developed-by: José Roberto de Souza <jose.souza@intel.com>
Co-developed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Co-developed-by: Dave Airlie <airlied@redhat.com>
Co-developed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Co-developed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Co-developed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
2023-12-12 14:05:48 -05:00