Commit Graph

13 Commits

Author SHA1 Message Date
John Harrison
16b7e65d29 drm/xe/guc: Track FAST_REQ H2Gs to report where errors came from
Most H2G messages are FAST_REQ which means no synchronous response is
expected. The messages are sent as fire-and-forget with no tracking.
However, errors can still be returned when something goes unexpectedly
wrong. That leads to confusion due to not being able to match up the
error response to the originating H2G.

So add support for tracking the FAST_REQ H2Gs and matching up an error
response to its originator. This is only enabled in XE_DEBUG builds
given that such errors should never happen in a working system and
there is an overhead for the tracking.

Further, if XE_DEBUG_GUC is enabled then even more memory and time is
used to record the call stack of each H2G and report that with the
error. That makes it much easier to work out where a specific H2G came
from if there are multiple code paths that can send it.

v2: Some re-wording of comments and prints, more consistent use of #if
vs stub functions - review feedback from Daniele & Michal).
v3: Split config change to separate patch, improve a debug print
(review feedback from Michal).
v4: Bunch of minor tweaks (review feedback from Michal).

Original-i915-code: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://lore.kernel.org/r/20250512215324.1457009-5-John.C.Harrison@Intel.com
2025-05-15 12:27:37 -07:00
John Harrison
d7d97890e2 drm/xe/guc: Rename CONFIG_XE_LARGE_GUC_BUFFER
Rename XE_LARGE_GUC_BUFFER to XE_DEBUG_GUC to allow for more debug
only code (in subsequent patch) without adding more config defines
that each control only a single thing.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://lore.kernel.org/r/20250512215324.1457009-4-John.C.Harrison@Intel.com
2025-05-15 12:27:36 -07:00
Nitin Gote
75fd04f276 drm/xe: Fix all typos in xe
Fix all typos in files of xe, reported by codespell tool.

Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250106102646.1400146-2-nitin.r.gote@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
2025-01-09 17:58:09 +01:00
Ilia Levi
b46afdac45 drm/xe: Introduce dedicated config for memirq debug
Separate config for debugging memory based interrupts (memirq)
infrastructure.

Signed-off-by: Ilia Levi <ilia.levi@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240918053942.1331811-2-illevi@habana.ai
2024-09-19 10:03:38 +02:00
José Roberto de Souza
83ee002df0
drm/xe: Nuke simple error capture
This error capture prints into dmesg HW state when a gpu hang happens.
It was useful when we did not had devcoredump, now it is a incompleted
version of devcoredump that has potential to flood dmesg.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240522203431.191594-1-jose.souza@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-05-23 13:38:26 -04:00
Arnd Bergmann
0e6fec6da2 drm/xe/kunit: fix link failure with built-in xe
When the driver is built-in but the tests are in loadable modules,
the helpers don't actually get put into the driver:

ERROR: modpost: "xe_kunit_helper_alloc_xe_device" [drivers/gpu/drm/xe/tests/xe_test.ko] undefined!

Change the Makefile to ensure they are always part of the driver
even when the rest of the kunit tests are in loadable modules.

Fixes: 5095d13d75 ("drm/xe/kunit: Define helper functions to allocate fake xe device")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240226124736.1272949-1-arnd@kernel.org
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-02-28 11:38:12 -08:00
Michal Wajdeczko
ed750833f1 drm/xe: Define DRM_XE_DEBUG_SRIOV config
We will be using extra logs during enabling of the SR-IOV features
or when adding support for new platforms. Define separate config
flag to keep that low level logs disabled if we're not debugging.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20231128151507.1015-2-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:45:27 -05:00
Lucas De Marchi
4cc0440229 drm/xe: Add basic unit tests for rtp
Add some basic unit tests for rtp. This is intended to prove the
functionality of the rtp itself, like coalescing entries, rejecting
non-disjoint values, etc.

Contrary to the other tests in xe, this is a unit test to test the
sw-side only, so it can be executed on any machine - it doesn't interact
with the real hardware. Running it produces the following output:

	$ ./tools/testing/kunit/kunit.py run --raw_output-kunit  \
		--kunitconfig drivers/gpu/drm/xe/.kunitconfig xe_rtp
	...
	[01:26:27] Starting KUnit Kernel (1/1)...
	KTAP version 1
	1..1
	    KTAP version 1
	    # Subtest: xe_rtp
	    1..1
		KTAP version 1
		# Subtest: xe_rtp_process_tests
		ok 1 coalesce-same-reg
		ok 2 no-match-no-add
		ok 3 no-match-no-add-multiple-rules
		ok 4 two-regs-two-entries
		ok 5 clr-one-set-other
		ok 6 set-field
	[drm:xe_reg_sr_add] *ERROR* Discarding save-restore reg 0001 (clear: 00000001, set: 00000001, masked: no): ret=-22
		ok 7 conflict-duplicate
	[drm:xe_reg_sr_add] *ERROR* Discarding save-restore reg 0001 (clear: 00000003, set: 00000000, masked: no): ret=-22
		ok 8 conflict-not-disjoint
	[drm:xe_reg_sr_add] *ERROR* Discarding save-restore reg 0001 (clear: 00000002, set: 00000002, masked: no): ret=-22
	[drm:xe_reg_sr_add] *ERROR* Discarding save-restore reg 0001 (clear: 00000001, set: 00000001, masked: yes): ret=-22
		ok 9 conflict-reg-type
	    # xe_rtp_process_tests: pass:9 fail:0 skip:0 total:9
	    ok 1 xe_rtp_process_tests
	# Totals: pass:9 fail:0 skip:0 total:9
	ok 1 xe_rtp
	...

Note that the ERRORs in the kernel log are expected since it's testing
incompatible entries.

v2:
  - Use parameterized table for tests  (Michał Winiarski)
  - Move everything to the xe_rtp_test.ko and only add a few exports to the
    right namespace
  - Add more tests to cover FIELD_SET, CLR, partially true rules, etc

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Maarten Lankhorst<maarten.lankhorst@linux.intel.com> # v1
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://lore.kernel.org/r/20230401085151.1786204-7-lucas.demarchi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:32 -05:00
Matthew Auld
2a8477f761 drm/xe: s/lmem/vram/
This seems to be the preferred nomenclature in xe. Currently we are
intermixing vram and lmem, which is confusing.

v2 (Gwan-gyeong Mun & Lucas):
  - Rather apply to the entire driver

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:45 -05:00
Thomas Hellström
282c683a56 drm/xe/tests: Remove CONFIG_FB dependency
We currently don't have any tests that explicitly depends on this
config option, so remove that build dependency.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:44 -05:00
Mauro Carvalho Chehab
a4c75c0fd6 drm/xe: KUnit tests depend on CONFIG_DRM_FBDEV_EMULATION
ERROR:root:../drivers/gpu/drm/xe/display/intel_fbdev.c:585:5: error: redefinition of ‘intel_fbdev_init’
  585 | int intel_fbdev_init(struct drm_device *dev)
      |     ^~~~~~~~~~~~~~~~
In file included from ../drivers/gpu/drm/xe/display/intel_fbdev.c:55:
../drivers/gpu/drm/xe/display/intel_fbdev.h:26:19: note: previous definition of ‘intel_fbdev_init’ with type ‘int(struct drm_device *)’
   26 | static inline int intel_fbdev_init(struct drm_device *dev)
      |                   ^~~~~~~~~~~~~~~~
../drivers/gpu/drm/xe/display/intel_fbdev.c:626:6: error: redefinition of ‘intel_fbdev_initial_config_async’
  626 | void intel_fbdev_initial_config_async(struct drm_device *dev)
      |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/xe/display/intel_fbdev.h:31:20: note: previous definition of ‘intel_fbdev_initial_config_async’ with type ‘void(struct drm_device *)’
   31 | static inline void intel_fbdev_initial_config_async(struct drm_device *dev)
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/xe/display/intel_fbdev.c:646:6: error: redefinition of ‘intel_fbdev_unregister’
  646 | void intel_fbdev_unregister(struct drm_i915_private *dev_priv)
      |      ^~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/xe/display/intel_fbdev.h:35:20: note: previous definition of ‘intel_fbdev_unregister’ with type ‘void(struct xe_device *)’
   35 | static inline void intel_fbdev_unregister(struct drm_i915_private *dev_priv)
      |                    ^~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/xe/display/intel_fbdev.c:661:6: error: redefinition of ‘intel_fbdev_fini’
  661 | void intel_fbdev_fini(struct drm_i915_private *dev_priv)
      |      ^~~~~~~~~~~~~~~~
../drivers/gpu/drm/xe/display/intel_fbdev.h:39:20: note: previous definition of ‘intel_fbdev_fini’ with type ‘void(struct xe_device *)’
   39 | static inline void intel_fbdev_fini(struct drm_i915_private *dev_priv)
      |                    ^~~~~~~~~~~~~~~~
../drivers/gpu/drm/xe/display/intel_fbdev.c:692:6: error: redefinition of ‘intel_fbdev_set_suspend’
  692 | void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous)
      |      ^~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/xe/display/intel_fbdev.h:43:20: note: previous definition of ‘intel_fbdev_set_suspend’ with type ‘void(struct drm_device *, int,  bool)’ {aka ‘void(struct drm_device *, int,  _Bool)’}
   43 | static inline void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous)
      |                    ^~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/xe/display/intel_fbdev.c:751:6: error: redefinition of ‘intel_fbdev_output_poll_changed’
  751 | void intel_fbdev_output_poll_changed(struct drm_device *dev)
      |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/xe/display/intel_fbdev.h:47:20: note: previous definition of ‘intel_fbdev_output_poll_changed’ with type ‘void(struct drm_device *)’
   47 | static inline void intel_fbdev_output_poll_changed(struct drm_device *dev)
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/xe/display/intel_fbdev.c:770:6: error: redefinition of ‘intel_fbdev_restore_mode’
  770 | void intel_fbdev_restore_mode(struct drm_device *dev)
      |      ^~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/xe/display/intel_fbdev.h:51:20: note: previous definition of ‘intel_fbdev_restore_mode’ with type ‘void(struct drm_device *)’
   51 | static inline void intel_fbdev_restore_mode(struct drm_device *dev)
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/xe/display/intel_fbdev.c:785:27: error: redefinition of ‘intel_fbdev_framebuffer’
  785 | struct intel_framebuffer *intel_fbdev_framebuffer(struct intel_fbdev *fbdev)
      |                           ^~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/xe/display/intel_fbdev.h:54:41: note: previous definition of ‘intel_fbdev_framebuffer’ with type ‘struct intel_framebuffer *(struct intel_fbdev *)’
   54 | static inline struct intel_framebuffer *intel_fbdev_framebuffer(struct intel_fbdev *fbdev)
      |                                         ^~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:27:43 -05:00
Mauro Carvalho Chehab
1598955dfc drm/xe/Kconfig.debug: select DEBUG_FS for KUnit runs
KUnit reuquires debugfs, as otherwise, it won't build:

$ make ARCH=x86_64 O=.kunit --jobs=8
ERROR:root:../drivers/gpu/drm/xe/display/intel_display_debugfs.c:1612:6: error: redefinition of ‘intel_display_debugfs_register’
 1612 | void intel_display_debugfs_register(struct drm_i915_private *i915)
      |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from ../drivers/gpu/drm/xe/display/intel_display_debugfs.c:18:
../drivers/gpu/drm/xe/display/intel_display_debugfs.h:18:20: note: previous definition of ‘intel_display_debugfs_register’ with type ‘void(struct xe_device *)’
   18 | static inline void intel_display_debugfs_register(struct drm_i915_private *i915) {}
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/xe/display/intel_display_debugfs.c:1935:6: error: redefinition of ‘intel_connector_debugfs_add’
 1935 | void intel_connector_debugfs_add(struct intel_connector *intel_connector)
      |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/xe/display/intel_display_debugfs.h:19:20: note: previous definition of ‘intel_connector_debugfs_add’ with type ‘void(struct intel_connector *)’
   19 | static inline void intel_connector_debugfs_add(struct intel_connector *connector) {}
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/xe/display/intel_display_debugfs.c:1993:6: error: redefinition of ‘intel_crtc_debugfs_add’
 1993 | void intel_crtc_debugfs_add(struct drm_crtc *crtc)
      |      ^~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/xe/display/intel_display_debugfs.h:20:20: note: previous definition of ‘intel_crtc_debugfs_add’ with type ‘void(struct drm_crtc *)’
   20 | static inline void intel_crtc_debugfs_add(struct drm_crtc *crtc) {}
      |                    ^~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:27:43 -05:00
Matthew Brost
dd08ebf6c3 drm/xe: Introduce a new DRM driver for Intel GPUs
Xe, is a new driver for Intel GPUs that supports both integrated and
discrete platforms starting with Tiger Lake (first Intel Xe Architecture).

The code is at a stage where it is already functional and has experimental
support for multiple platforms starting from Tiger Lake, with initial
support implemented in Mesa (for Iris and Anv, our OpenGL and Vulkan
drivers), as well as in NEO (for OpenCL and Level0).

The new Xe driver leverages a lot from i915.

As for display, the intent is to share the display code with the i915
driver so that there is maximum reuse there. But it is not added
in this patch.

This initial work is a collaboration of many people and unfortunately
the big squashed patch won't fully honor the proper credits. But let's
get some git quick stats so we can at least try to preserve some of the
credits:

Co-developed-by: Matthew Brost <matthew.brost@intel.com>
Co-developed-by: Matthew Auld <matthew.auld@intel.com>
Co-developed-by: Matt Roper <matthew.d.roper@intel.com>
Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Co-developed-by: Francois Dugast <francois.dugast@intel.com>
Co-developed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Co-developed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Co-developed-by: Philippe Lecluse <philippe.lecluse@intel.com>
Co-developed-by: Nirmoy Das <nirmoy.das@intel.com>
Co-developed-by: Jani Nikula <jani.nikula@intel.com>
Co-developed-by: José Roberto de Souza <jose.souza@intel.com>
Co-developed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Co-developed-by: Dave Airlie <airlied@redhat.com>
Co-developed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Co-developed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Co-developed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
2023-12-12 14:05:48 -05:00