Commit Graph

56 Commits

Author SHA1 Message Date
Michal Wajdeczko
e497957fee drm/xe/pf: Invalidate LMTT during LMEM unprovisioning
Invalidate LMTT immediately after removing VF's LMTT page tables
and clearing root PTE in the LMTT PD to avoid any invalid access
by the hardware (and VF) due to stale data.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250711193316.1920-6-michal.wajdeczko@intel.com
2025-07-15 13:05:20 +02:00
Matt Roper
457123d5a0 drm/xe: Don't compare GT ID to GT count when determining valid GTs
On current platforms with multiple GTs, all of the GT IDs are
consecutive; as a result we know that the GT IDs range from 0 to
gt_count-1 and can determine if a GT ID is valid by comparing against
the count.  The consecutive nature of GT IDs may not hold true on future
platforms if/when we have platforms that are both multi-tile and have
multiple GTs within each tile.  Once such platforms exist, it's quite
possible that we could wind up with something like a GT list composed of
IDs 0, 2, and 3 with no GT 1 (which would be a 2-tile platform with
media only on the second tile).

To future-proof the code we should stop comparing against the GT count
to determine whether a GT ID is valid or not.  Instead we should do an
actual lookup of the ID to determine whether the GT exists.  This also
means that our GT loop macro should not end at the GT count, but should
rather examine the entire space up to (# of tiles) * (max GT per tile)
to ensure it doesn't stop prematurely.

Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://lore.kernel.org/r/20250701201320.2514369-15-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-07-02 16:08:54 -07:00
Matt Roper
f8e0f4c526 drm/xe: Track maximum GTs per tile on a per-platform basis
Today all of our platforms fall into one of three cases:
 * Single tile platforms with a single (primary) GT
 * Single tile platforms with two GTs (primary + media)
 * Two-tile platforms with a single GT (primary) in each

Our numbering of GTs has been a bit inconsistent between platforms
(e.g., GT1 is the media GT on some platforms, but the second tile's
primary GT on others).  In the future we'll likely have platforms that
are both multi-tile and multi-GT, which will make the situation more
confusing.  We could also wind up with more than just two types of GTs
at some point in the future.

Going forward we should standardize the way we assign uapi GT IDs to
internal GT structures.  Let's declare that for userspace GT ID n,

  GT[n]'s tile             = n / (max gt per tile)
  GT[n]'s slot within tile = n % (max gt per tile)

We don't want the GT numbering to change for any of our current
platforms since the current IDs are part of our ABI contract with
userspace so this means we should track the 'max gt per tile' value on a
per-platform basis rather than just using a single value across the
driver.  Encode this into device descriptors in xe_pci.c and use the
per-platform number for various checks in the code.  Constant
XE_MAX_GT_PER_TILE will remain just as the maximum across all platforms
for easy of sizing array allocations.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250701201320.2514369-12-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-07-02 16:08:54 -07:00
Satyanarayana K V P
87c648c313 drm/xe: Add helper function to inject fault into ct_dead_capture()
When injecting fault to xe_guc_ct_send_recv() & xe_guc_mmio_send_recv()
functions, the CI test systems are going out of space and crashing. To
avoid this issue, a new helper function is created and when fault is
injected into this xe_is_injection_active() helper function, ct dead
capture is avoided which suppresses ct dumps in the log.

Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Suggested-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://lore.kernel.org/r/20250612080402.22011-1-satyanarayana.k.v.p@intel.com
2025-06-12 16:53:56 -07:00
Lucas De Marchi
d01bdc0025 drm/xe: Drop remove callback support
Now that devres supports component driver cleanup during driver removal
cleanup, the xe custom support for removal callbacks is not needed
anymore. Drop it.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250222001051.3012936-7-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-25 14:29:06 -08:00
Lucas De Marchi
776e3b502b drm/xe: Add callback support for driver remove
xe device probe uses devm cleanup in most places. However there are a
few cases where this is not possible: when the driver interacts with
component add/del. In that case, the resource group would be cleanup
while the entire device resources are in the process of cleanup.  One
example is the xe_gsc_proxy and display using that to interact with mei
and audio.

Add a callback-based remove so the exception doesn't make the probe
use multiple error handling styles.

v2: Change internal API to mimic the devm API. This will make it easier
    to migrate in future when devm can be used.

Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250213192909.996148-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-14 11:39:29 -08:00
Piotr Piórkowski
cbc0a0ee34 drm/xe/pf: Use an explicit check to see if the device has LMTT
So far, the main condition for using LMTT has been to check that
the device is a discrete gfx.
Let's add a dedicated function to check if the device supports LMTT
as not all future discrete GPU platforms will require LMTT.

v2:
 - use xe_has_device_lmtt only when necessary - leave IS_DGFX for other
   things related to LMEM provisioning
v3:
 - remove IS_SRIOV_PF condition from xe_device_has_lmtt (Michal
   Wajdeczko)
 - keep IS_SRIOV_PF asserts in LMTT-related code (Michal Wajdeczko)
v4:
 - update commit description

Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250207113111.853821-2-piotr.piorkowski@intel.com
2025-02-10 11:10:14 +01:00
Ilia Levi
da889070be
drm/xe/irq: Separate MSI and MSI-X flows
A new flow is added for devices that support MSI-X:
- MSI-X vector 0 is used for GuC-to-host interrupt
- MSI-X vector 1 (aka default MSI-X) is used for HW engines

The default MSI-X will be passed to the HW engines in a subsequent
patch.

Signed-off-by: Ilia Levi <ilia.levi@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241213072538.6823-2-ilia.levi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-12-13 13:38:13 -05:00
Nirmoy Das
cbe006a649 drm/xe: Move LNL scheduling WA to xe_device.h
Move LNL scheduling WA to xe_device.h so this can be used in other
places without needing keep the same comment about removal of this WA
in the future. The WA, which flushes work or workqueues, is now wrapped
in macros and can be reused wherever needed.

Cc: Badal Nilawar <badal.nilawar@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
cc: stable@vger.kernel.org # v6.11+
Suggested-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241029120117.449694-1-nirmoy.das@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-11-01 06:23:26 -07:00
Ilia Levi
ef6103d20f drm/xe: memirq infra changes for MSI-X
When using MSI-X, hw engines report interrupt status and source to engine
instance 0. For this scenario, in order to differentiate between the
engines, we need to pass different status/source pointers in the LRC.

The requirements on those pointers are:
- Interrupt status should be 4KiB aligned
- Interrupt source should be 64 bytes aligned

To accommodate this, we duplicate the current memirq page layout -
allocating a page for each engine instance and pass this page in the LRC.
Note that the same page can be reused for different engine types.
For example, an LRC executing on CCS #x will have pointers to page #x,
and an LRC executing on BCS #x will have the same pointers. Thus, to
locate the proper page, the pointer accessors were modified to receive
the hw engine.

Signed-off-by: Ilia Levi <ilia.levi@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240918053942.1331811-5-illevi@habana.ai
2024-09-19 10:14:20 +02:00
Ilia Levi
6fa86e7ad4 drm/xe: Introduce xe_device_uses_memirq()
Simplify some memirq usage scenarios and asserts in memirq infrastructure.

Signed-off-by: Ilia Levi <ilia.levi@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240918053942.1331811-3-illevi@habana.ai
2024-09-19 10:10:26 +02:00
Matt Roper
9d383916a5 drm/xe: Move GSI offset adjustment fields into 'struct xe_mmio'
By moving the GSI adjustment fields into 'struct xe_mmio' we can replace
the GT's MMIO substructure with another instance of xe_mmio.  At the
moment this means MMIO operations wind up pulling information from two
different places (the tile's xe_mmio for the iomap and the GT's xe_mmio
for the adjustment), but we'll address that in future patches.

The type headers change a bit with this change, meaning that various
files should be including xe_device_types.h instead of (or in addition
to) xe_gt_types.h.

v2:
 - Fix pre-existing kerneldoc typo while moving the fields (Lucas)
v3:
 - Add missing '@' in kerneldoc.  (Rodrigo)

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240910234719.3335472-49-matthew.d.roper@intel.com
2024-09-11 15:17:31 -07:00
Matt Roper
998fde0647 drm/xe: Move forcewake to 'gt.pm' substructure
Forcewake is a general GT power management concept that isn't specific
to MMIO register access.  Move the forcewake information for a GT out of
the 'mmio' substruct and into a 'pm' substruct.  Also use the gt_to_fw()
helper in a few more places where it was being open-coded.

v2:
 - Kerneldoc tweaks.  (Lucas)

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240910234719.3335472-46-matthew.d.roper@intel.com
2024-09-11 15:17:29 -07:00
Jani Nikula
83e5af5997 drm/i915 & drm/xe: save struct drm_device to drvdata
In the future, the display code shall not have any idea about struct
xe_device or struct drm_i915_private, but will need to get at the struct
drm_device via drvdata. Store the struct drm_device pointer to drvdata
instead of the driver specific pointer.

Avoid passing NULL to container_of() via to_i915()/to_xe_device(). (It
does return NULL for NULL pointers when the offset happens to be 0, but
otherwise returns garbage pointers for NULL.)

Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/946805b32e38d4785880cc7857e01e6a309126a9.1724942754.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-09-02 15:01:59 +03:00
Francois Dugast
4099cfda9d drm/xe/device: Remove unused xe_device::usm::num_vm_in_*
Those counters were used to keep track of the numbers VMs in fault mode
and in non-fault mode, to determine if the whole device was in fault mode
or not. This is no longer needed so remove those variables and their
usages.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-12-francois.dugast@intel.com
2024-08-17 18:31:59 -07:00
Jani Nikula
d408d6f8cb drm/xe: add kdev_to_xe_device() helper and use it
There are enough users for kernel device to xe device conversion, add a
helper for it.

Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/38c80846e70c7e410850530426384e17cff9d031.1723458544.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-08-13 12:11:49 +03:00
Michal Wajdeczko
bd40536ae3 drm/xe: Introduce const cast helper
Typically we want to preserve pointer constness when converting
from one xe pointer to another, but in some rare cases, like kunit
parameter conversions, we might want to discard this constness.
Add a helper that we will use to clearly indicate our intention.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> #v1
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240720142528.530-2-michal.wajdeczko@intel.com
2024-07-22 12:20:07 +02:00
Umesh Nerlige Ramappa
ce8c161cba drm/xe: Add ref counting for xe_file
Add ref counting for xe_file.

v2:
- Add kernel doc for exported functions (Matt)
- Instead of xe_file_destroy, export the get/put helpers (Lucas)

v3: Fixup the kernel-doc format and description (Matt, Lucas)

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240718210548.3580382-3-umesh.nerlige.ramappa@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-07-19 00:31:08 -07:00
Matthew Auld
01570b4469 drm/xe/bmg: implement Wa_16023588340
This involves enabling l2 caching of host side memory access to VRAM
through the CPU BAR. The main fallout here is with display since VRAM
writes from CPU can now be cached in GPU l2, and display is never
coherent with caches, so needs various manual flushing.  In the case of
fbc we disable it due to complications in getting this to work
correctly (in a later patch).

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Vinod Govindapillai <vinod.govindapillai@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240703124338.208220-3-matthew.auld@intel.com
2024-07-05 09:53:12 +01:00
Jani Nikula
d754ed2821 Merge drm/drm-next into drm-intel-next
Sync to v6.10-rc3.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-06-19 11:38:31 +03:00
Michal Wajdeczko
b7f6318a9c drm/xe: Fix xe_device.h
Some explicit includes are needed only from the xe_device.c.
And there is no need for redundant forward declarations.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240507110959.2747-4-michal.wajdeczko@intel.com
2024-05-07 23:21:21 +02:00
Nirmoy Das
c01c6066e6 drm/xe/device: implement transient flush
Display surfaces can be tagged as transient by mapping it using one of
the various L3:XD PAT index modes on Xe2. The expectation is that KMD
needs to request transient data flush at the start of flip sequence to
ensure all transient data in L3 cache is flushed to memory. Add a
routine for this which we can then call from the display code.

v2: rebase(RK)

Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Co-developed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240430172850.1881525-18-radhakrishna.sripada@intel.com
2024-05-03 13:15:54 -07:00
Rodrigo Vivi
6b8ef44cc0
drm/xe: Introduce the wedged_mode debugfs
So, the wedged mode can be selected per device at runtime,
before the tests or before reproducing the issue.

v2: - s/busted/wedged
    - some locking consistency

v3: - remove mutex
    - toggle guc reset policy on any mode change

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240423221817.1285081-4-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-04-24 12:12:58 -04:00
Rodrigo Vivi
8ed9aaae39
drm/xe: Force wedged state and block GT reset upon any GPU hang
In many validation situations when debugging GPU Hangs,
it is useful to preserve the GT situation from the moment
that the timeout occurred.

This patch introduces a module parameter that could be used
on situations like this.

If xe.wedged module parameter is set to 2, Xe will be declared
wedged on every single execution timeout (a.k.a. GPU hang) right
after devcoredump snapshot capture and without attempting any
kind of GT reset and blocking entirely any kind of execution.

v2: Really block gt_reset from guc side. (Lucas)
    s/wedged/busted (Lucas)

v3: - s/busted/wedged
    - Really use global_flags (Dafna)
    - More robust timeout handling when wedging it.

v4: A really robust clean exit done by Matt Brost.
    No more kernel warns on unbind.

v5: Simplify error message (Lucas)

Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Dafna Hirschfeld <dhirschfeld@habana.ai>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Himanshu Somaiya <himanshu.somaiya@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240423221817.1285081-3-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-04-24 12:12:58 -04:00
Rodrigo Vivi
fb74b205cd
drm/xe: Introduce a simple wedged state
Introduce a very simple 'wedged' state where any attempt
to access the GPU is entirely blocked.

On some critical cases, like on gt_reset failure, we need to
block any other attempt to use the GPU. Otherwise we are at
a risk of reaching cases that would force us to reboot the machine.

So, when this cases are identified we corner and block any GPU
access. No IOCTL and not even another GT reset should be attempted.

The 'wedged' state in Xe is an end state with no way back.
Only a device "re-probe" (unbind + bind) can restore the GPU access.

v2: - s/wedged/busted (Lucas)
    - use unbind+bind instead of module reload (Lucas)
    - added more info on unbind operations and instruction on bug report
    - only print the message once.

v3: - s/busted/wedged (Ashutosh, Tvrtko, Thomas)
    - don't assume user has sudo and tee available (Lucas)

v4: - remove unnecessary cases around ct communication or migration.

Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Cc: Tvrtko Ursulin <tursulin@ursulin.net>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> #v2
Link: https://patchwork.freedesktop.org/patch/msgid/20240423221817.1285081-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-04-24 12:12:58 -04:00
Rodrigo Vivi
783d6cdc82
drm/xe: Kill xe_device_mem_access_{get*,put}
Let's simply convert all the current callers towards direct
xe_pm_runtime access and remove this extra layer of indirection.

No functional change is expected with this patch since
xe_mem_access_get was already using the xe_pm_runtime_get_noresume
at this point.

v2: Convert all the current callers instead of a big refactor
at once.

v3: - Rebased
    - Squashed the GSC/HDCP
    - Added a new case: sriov_pf_policy
    - Improved commit message to highlight that
      there's no functional change in this patch.

Reviewed-by: Matthew Auld <matthew.auld@intel.com> #v2
Cc: Suraj Kandpal <suraj.kandpal@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240418143049.43231-1-rodrigo.vivi@intel.com
2024-04-22 09:03:09 -04:00
Rodrigo Vivi
16b57c90bb
drm/xe: Convert mem_access_if_ongoing to direct xe_pm_runtime_get_if_active
Now that assert_mem_access is relying directly on the pm_runtime state
instead of the counters, there's no reason why we cannot use
the pm_runtime functions directly.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240417203952.25503-8-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-04-18 08:31:40 -04:00
Rodrigo Vivi
a382291017
drm/xe: Removing extra mem_access protection from runtime pm
This is not needed any longer, now that we have all the protection
in place with the runtime pm itself.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240417203952.25503-7-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-04-18 08:31:40 -04:00
Rodrigo Vivi
8ae84a2744
drm/xe: Move lockdep protection from mem_access to xe_pm_runtime
The mem_access itself is not holding any lock, but attempting
to train lockdep with possible scarring locks happening during
runtime pm. We are going soon to kill the mem_access get and put
helpers in favor of direct xe_pm_runtime calls, so let's just
move this lock around to where it now belongs.

v2: s/lockdep_training/lockdep_prime (Matt Auld)

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240417203952.25503-4-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-04-18 08:31:39 -04:00
Matthew Auld
a96cd71ec7 drm/xe/device: fix XE_MAX_TILES_PER_DEVICE check
Here XE_MAX_TILES_PER_DEVICE is the gt array size, therefore the gt
index should always be less than.

v2 (Lucas):
  - Add fixes tag.

Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318180532.57522-6-matthew.auld@intel.com
2024-03-19 08:49:00 +00:00
Matthew Auld
a5ef563b1d drm/xe/device: fix XE_MAX_GT_PER_TILE check
Here XE_MAX_GT_PER_TILE is the total, therefore the gt index should
always be less than.

Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318180532.57522-5-matthew.auld@intel.com
2024-03-19 08:47:49 +00:00
Thomas Hellström
f1a9abc0cf drm/xe/uapi: Remove support for persistent exec_queues
Persistent exec_queues delays explicit destruction of exec_queues
until they are done executing, but destruction on process exit
is still immediate. It turns out no UMD is relying on this
functionality, so remove it. If there turns out to be a use-case
in the future, let's re-add.

Persistent exec_queues were never used for LR VMs

v2:
- Don't add an "UNUSED" define for the missing property
  (Lucas, Rodrigo)
v3:
- Remove the remaining struct xe_exec_queue::persistent state
  (Niranjana, Lucas)

Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240209113444.8396-1-thomas.hellstrom@linux.intel.com
2024-02-19 12:54:48 +01:00
José Roberto de Souza
5746eaaa80 drm/xe: Add functions to convert regular address to canonical address and back
Some instructions requires canonical address like
MI_BATCH_BUFFER_START(UMDs must call xe_exec with a canonical address
for Xe2+).

So here adding functions to convert regular address to canonical
address and back, the first user of this functions will be added
in the next patch.

v3:
- inline removed
- rename highest_address_bit_get() to ppgtt_msb_get()

v4:
- use xe->info.va_bits instead of xe->info.dma_mask_size

BSpec: 47626
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Maarten Lankhorst <dev@lankhorst.se>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240130135648.30211-1-jose.souza@intel.com
2024-01-30 11:53:47 -08:00
José Roberto de Souza
4376cee620 drm/xe: Print more device information in devcoredump
To properly decode batch buffer Mesa tools needs to know what
platform is this one, for now we can do that with PCI id but
already making it future proof by also printing GTs GMD version.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Maarten Lankhorst <dev@lankhorst.se>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240123204454.246788-5-jose.souza@intel.com
2024-01-24 11:08:25 -08:00
Michal Wajdeczko
a6581ebe76 drm/xe/vf: Introduce Memory Based Interrupts Handler
The register based interrupts infrastructure does not scale
efficiently to allow delivering interrupts to a large number
of virtual machines. Memory based interrupt reporting provides
an efficient and scalable infrastructure.

Define handler to read and dispatch memory based interrupts.
We will use this handler in upcoming patch.

Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20231214185955.1791-8-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
2023-12-21 16:31:29 -05:00
Michał Winiarski
4f122766f9 drm/xe/device: Introduce xe_device_probe_early
SR-IOV VF doesn't have access to MMIO registers used to determine
graphics/media ID. It can however communicate with GuC.
Introduce xe_device_probe_early, which initializes enough HW to use
MMIO GuC communication.
This will allow both VF and PF/native driver to have unified probe
ordering.

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:45:11 -05:00
Michal Wajdeczko
d6d14854dd drm/xe: Add device flag to indicate SR-IOV support
The Single Root I/O Virtualization (SR-IOV) extension to
the PCI Express (PCIe) specification suite is supported
starting from 12th generation of Intel Graphics processors.

Add a device flag that we will use to enable SR-IOV specific
code paths and to indicate our readiness to support SR-IOV.

We will enable this flag for the specific platforms once all
required changes and additions will be ready and merged.

Bspec: 52391
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20231115073804.1861-1-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:44:57 -05:00
Daniele Ceraolo Spurio
c4991ee01d drm/xe/uc: Rename guc_submission_enabled() to uc_enabled()
The guc_submission_enabled() function is being used as a boolean toggle
for all firmwares and all related features, not just GuC submission. We
could add additional flags/functions to distinguish and allow different
use-cases (e.g. loading HuC but not using GuC submission), but given
that not using GuC is a debug-only scenario having a global switch for
all FWs is enough. However, we want to make it clear that this switch
turns off everything, so rename it to uc_enabled().

v2: rebase on s/XE_WARN_ON/xe_assert

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:41:13 -05:00
Francois Dugast
9b9529ce37 drm/xe: Rename engine to exec_queue
Engine was inappropriately used to refer to execution queues and it
also created some confusion with hardware engines. Where it applies
the exec_queue variable name is changed to q and comments are also
updated.

Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/162
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:20 -05:00
Rodrigo Vivi
c8dc154648 drm/xe: Invert guc vs execlists parameters and info.
The module parameter should reflect the name of the optional,
experimental and unsafe option, rather than the default one.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
2023-12-21 11:37:54 -05:00
Matthew Auld
9700a1df0a drm/xe: add lockdep annotation for xe_device_mem_access_put()
The main motivation is with d3cold which will make the suspend and
resume callbacks even more scary, but is useful regardless. We already
have the needed annotation on the acquire side with
xe_device_mem_access_get(), and by adding the annotation on the release
side we should have a lot more confidence that our locking hierarchy is
correct.

v2:
  - Move the annotation into both callbacks for better symmetry. Also
    don't hold over the entire mem_access_get(); we only need to lockep
    to understand what is being held upon entering mem_access_get(), and
    how that matches up with locks in the callbacks.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:53 -05:00
Matthew Auld
a00b8f1aae drm/xe: fix xe_device_mem_access_get() races
It looks like there is at least one race here, given that the
pm_runtime_suspended() check looks to return false if we are in the
process of suspending the device (RPM_SUSPENDING vs RPM_SUSPENDED).  We
later also do xe_pm_runtime_get_if_active(), but since the device is
suspending or has now suspended, this doesn't do anything either.
Following from this we can potentially return from
xe_device_mem_access_get() with the device suspended or about to be,
leading to broken behaviour.

Attempt to fix this by always grabbing the runtime ref when our internal
ref transitions from 0 -> 1. The hard part is then dealing with the
runtime_pm callbacks also calling xe_device_mem_access_get() and
deadlocking, which the pm_runtime_suspended() check prevented.

v2:
 - ct->lock looks to be primed with fs_reclaim, so holding that and then
   allocating memory will cause lockdep to complain. Now that we
   unconditionally grab the mem_access.lock around mem_access_{get,put}, we
   need to change the ordering wrt to grabbing the ct->lock, since some of
   the runtime_pm routines can allocate memory (or at least that's what
   lockdep seems to suggest). Hopefully not a big deal.  It might be that
   there were already issues with this, just that the atomics where
   "hiding" the potential issues.
v3:
 - Use Thomas Hellström' idea with tracking the active task that is
   executing in the resume or suspend callback, in order to avoid
   recursive resume/suspend calls deadlocking on itself.
 - Split the ct->lock change.
v4:
 - Add smb_mb() around accessing the pm_callback_task for extra safety.
   (Thomas Hellström)
v5:
 - Clarify the kernel-doc for the mem_access.lock, given that it is quite
   strange in what it protects (data vs code). The real motivation is to
   aid lockdep. (Rodrigo Vivi)
v6:
 - Split out the lock change. We still want this as a lockdep aid but
   only for the xe_device_mem_access_get() path. Sticking a lock on the
   put() looks be a no-go, also the runtime_put() there is always async.
 - Now that the lock is gone move to atomics and rely on the pm code
   serialising multiple callers on the 0 -> 1 transition.
 - g2h_worker_func() looks to be the next issue, given that
   suspend-resume callbacks are using CT, so try to handle that.
v7:
 - Add xe_device_mem_access_get_if_ongoing(), and use it in
   g2h_worker_func().
v8 (Anshuman):
 - Just always grab the rpm, instead of just on the 0 -> 1 transition,
   which is a lot clearer and simplifies the code quite a bit.
v9:
 - Make sure we also adjust the CT fast-path with if-active.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/258
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Acked-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:35 -05:00
Francois Dugast
4ab5901cc0 drm/xe: Cleanup POINTER_LOCATION style issues
Remove all existing style issues of type POINTER_LOCATION reported
by checkpatch.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:30 -05:00
Francois Dugast
4cd6d49259 drm/xe: Cleanup SPACING style issues
Remove almost all existing style issues of type SPACING reported
by checkpatch.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:37:30 -05:00
Matthew Auld
c4bbc32e09 drm/xe: hold mem_access.ref for CT fast-path
Just checking xe_device_mem_access_ongoing() is not enough, we also need
to hold the reference otherwise the ref can transition from 1 -> 0 as we
enter g2h_read(), leading to warnings. While we can't do a full rpm sync
in the IRQ, we can keep the device awake if the ref is non-zero.
Introduce a new helper for this and set it to work in for the CT
fast-path.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:35:22 -05:00
Matt Roper
37efea9ca2 drm/xe: Allow GT looping and lookup on standalone media
Allow xe_device_get_gt() and for_each_gt() to operate as expected on
platforms with standalone media.

FIXME: We need to figure out a consistent ID scheme for GTs.  This patch
keeps the pre-existing behavior of 0/1 being the GT IDs for both PVC
(multi-tile) and MTL (multi-GT), but depending on the direction we
decide to go with uapi, we may change this in the future (e.g., to
return 0/1 on PVC and 0/2 on MTL).  Or if we decide we only need to
expose tiles to userspace and not GTs, we may not even need ID numbers
for the GTs anymore.

v2:
 - Restructure a bit to make the assertions more clear.
 - Clarify in commit message that the goal here is to preserve existing
   behavior; UAPI-visible changes may be introduced in the future once
   we settle on what we really want.
v3:
 - Store total GT count in xe_device for ease of lookup.  (Brian)
 - s/(id__++)/(id__)++/  (Gustavo)

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Cc: Brian Welty <brian.welty@intel.com>
Acked-by: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-29-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:34:20 -05:00
Matt Roper
f6929e80cd drm/xe: Allocate GT dynamically
In preparation for re-adding media GT support, switch the primary GT
within the tile to a dynamic allocation.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-19-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:34:15 -05:00
Matt Roper
ed006ba5e6 drm/xe: Clarify 'gt' retrieval for primary tile
There are a bunch of places in the driver where we need to perform
non-GT MMIO against the platform's primary tile (display code, top-level
interrupt enable/disable, driver initialization, etc.).  Rename
'to_gt()' to 'xe_primary_mmio_gt()' to clarify that we're trying to get
a primary MMIO handle for these top-level operations.

In the future we need to move away from xe_gt as the target for MMIO
operations (most of which are completely unrelated to GT).

v2:
 - s/xe_primary_mmio_gt/xe_root_mmio_gt/ for more consistency with how
   we refer to tile 0.  (Lucas)
v3:
 - Tweak comment on xe_root_mmio_gt().  (Lucas)

Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-16-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:34:15 -05:00
Matt Roper
3643e63715 drm/xe: Add for_each_tile iterator
As we start splitting tile handling out from GT handling, we'll need to
be able to iterate over tiles separately from GTs.  This iterator will
be used in upcoming patches.

v2:
 - s/(id__++)/(id__)++/  (Gustavo)

Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-6-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:34:11 -05:00
Matt Roper
a5edc7cdb3 drm/xe: Introduce xe_tile
Create a new xe_tile structure to begin separating the concept of "tile"
from "GT."  A tile is effectively a complete GPU, and a GT is just one
part of that.  On platforms like MTL, there's only a single full GPU
(tile) which has its IP blocks provided by two GTs.  In contrast, a
"multi-tile" platform like PVC is basically multiple complete GPUs
packed behind a single PCI device.

For now, just create xe_tile as a simple wrapper around xe_gt.  The
items in xe_gt that are truly tied to the tile rather than the GT will
be moved in future patches.  Support for multiple GTs per tile (i.e.,
the MTL standalone media case) will also be re-introduced in a future
patch.

v2:
 - Fix kunit test build
 - Move hunk from next patch to use local tile variable rather than
   direct xe->tiles[id] accesses.  (Lucas)
 - Mention compute in kerneldoc.  (Rodrigo)

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-3-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:34:11 -05:00