Commit Graph

13 Commits

Author SHA1 Message Date
Daniele Ceraolo Spurio
72d479601d drm/xe/pxp/uapi: Add userspace and LRC support for PXP-using queues
Userspace is required to mark a queue as using PXP to guarantee that the
PXP instructions will work. In addition to managing the PXP sessions,
when a PXP queue is created the driver will set the relevant bits in
its context control register.

On submission of a valid PXP queue, the driver will validate all
encrypted objects mapped to the VM to ensured they were encrypted with
the current key.

v2: Remove pxp_types include outside of PXP code (Jani), better comments
and code cleanup (John)

v3: split the internal PXP management to a separate patch for ease of
review. re-order ioctl checks to always return -EINVAL if parameters are
invalid, rebase on msix changes.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250129174140.948829-9-daniele.ceraolospurio@intel.com
2025-02-03 11:51:18 -08:00
Francois Dugast
0d92cd8935 drm/xe/exec_queue: Prepare last fence for hw engine group resume context
Ensure we can safely take a ref of the exec queue's last fence from the
context of resuming jobs from the hw engine group. The locking requirements
differ from the general case, hence the introduction of this new function.

v2: Add kernel doc, rework the code to prevent code duplication

v3: Fix kernel doc, remove now unnecessary lockdep variants (Matt Brost)

v4: Remove new put function (Matt Brost)

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-7-francois.dugast@intel.com
2024-08-17 18:31:54 -07:00
Matthew Brost
852856e3b6 drm/xe: Use reserved copy engine for user binds on faulting devices
User binds map to engines with can fault, faults depend on user binds
completion, thus we can deadlock. Avoid this by using reserved copy
engine for user binds on faulting devices.

While we are here, normalize bind queue creation with a helper.

v2:
 - Pass in extensions to bind queue creation (CI)
v3:
 - s/resevered/reserved (Lucas)
 - Fix NULL hwe check (Jonathan)

Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240816034033.53837-1-matthew.brost@intel.com
2024-08-17 14:12:19 -07:00
Matthew Brost
96e7ebb220 drm/xe: Add xe_exec_queue_last_fence_test_dep
Helpful to determine if a bind can immediately use CPU or needs to be
deferred a drm scheduler job.

v7:
 - Better wording in kernel doc (Matthew Auld)

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-4-matthew.brost@intel.com
2024-07-03 22:27:02 -07:00
Umesh Nerlige Ramappa
45bb564de0 drm/xe: Use run_ticks instead of runtime for client stats
Note that runtime is also used in the pm context, so it is confusing to
use the same name to denote run time of the drm client. Use a more
appropriate name for the client utilization.

While at it, drop the incorrect multi-lrc comment in the helper
description

v2: s/show_runtime/show_run_ticks/ (Rodrigo)

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240524234744.1352543-1-umesh.nerlige.ramappa@intel.com
2024-05-27 14:07:44 -07:00
Umesh Nerlige Ramappa
6109f24f87 drm/xe: Add helper to accumulate exec queue runtime
Add a helper to accumulate per-client runtime of all its
exec queues. This is called every time a sched job is finished.

v2:
  - Use guc_exec_queue_free_job() and execlist_job_free() to accumulate
    runtime when job is finished since xe_sched_job_completed() is not a
    notification that job finished.
  - Stop trying to update runtime from xe_exec_queue_fini() - that is
    redundant and may happen after xef is closed, leading to a
    use-after-free
  - Do not special case the first timestamp read: the default LRC sets
    CTX_TIMESTAMP to zero, so even the first sample should be a valid
    one.
  - Handle the parallel submission case by multiplying the runtime by
    width.
v3: Update comments

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240517204310.88854-6-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-05-21 06:33:40 -07:00
Rodrigo Vivi
8ed9aaae39
drm/xe: Force wedged state and block GT reset upon any GPU hang
In many validation situations when debugging GPU Hangs,
it is useful to preserve the GT situation from the moment
that the timeout occurred.

This patch introduces a module parameter that could be used
on situations like this.

If xe.wedged module parameter is set to 2, Xe will be declared
wedged on every single execution timeout (a.k.a. GPU hang) right
after devcoredump snapshot capture and without attempting any
kind of GT reset and blocking entirely any kind of execution.

v2: Really block gt_reset from guc side. (Lucas)
    s/wedged/busted (Lucas)

v3: - s/busted/wedged
    - Really use global_flags (Dafna)
    - More robust timeout handling when wedging it.

v4: A really robust clean exit done by Matt Brost.
    No more kernel warns on unbind.

v5: Simplify error message (Lucas)

Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Dafna Hirschfeld <dhirschfeld@habana.ai>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Himanshu Somaiya <himanshu.somaiya@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240423221817.1285081-3-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-04-24 12:12:58 -04:00
Brian Welty
25ce7c5063 drm/xe: Finish refactoring of exec_queue_create
Setting of exec_queue user extensions is moved from the end of the ioctl
function earlier, into __xe_exec_queue_alloc().
This fixes bug in that the USM attributes for access counters were being
applied too late, and effectively were ignored.

However, in order to apply user extensions this early, we can no longer
call q->ops functions.  Instead, make it more efficient. The user extension
functions can simply update the q->sched_props values and they will be
applied by the backend during q->ops->init().

v2: minor changes for readability (Matt)

Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
2024-01-10 15:01:53 -08:00
Rodrigo Vivi
0f1d88f278 drm/xe/uapi: Kill exec_queue_set_property
All the properties should be immutable and set upon exec_queue creation
using the existent extension. So, let's kill this useless and dangerous
uapi.

Cc: Francois Dugast <francois.dugast@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
2023-12-21 11:45:23 -05:00
Matthew Brost
e669f10cd3 drm/xe: Fix VM bind out-sync signaling ordering
A case existed where an out-sync of a later VM bind operation could
signal before a previous one if the later operation results in a NOP
(e.g. a unbind or prefetch to a VA range without any mappings). This
breaks the ordering rules, fix this. This patch also lays the groundwork
for users to pass in num_binds == 0 and out-syncs.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:43:18 -05:00
Daniele Ceraolo Spurio
0b1d1473b3 drm/xe: common function to assign queue name
The queue name assignment is identical in both GuC and execlists
backends, so we can move it to a common function. This will make adding
a new entry in the next patch slightly cleaner.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230817201831.1583172-2-daniele.ceraolospurio@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:40:20 -05:00
Francois Dugast
9b9529ce37 drm/xe: Rename engine to exec_queue
Engine was inappropriately used to refer to execution queues and it
also created some confusion with hardware engines. Where it applies
the exec_queue variable name is changed to q and comments are also
updated.

Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/162
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:20 -05:00
Francois Dugast
c22a4ed0c3 drm/xe: Rename xe_engine.[ch] to xe_exec_queue.[ch]
This is a preparation commit for a larger renaming of engine to exec queue.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21 11:39:17 -05:00