Commit Graph

174 Commits

Author SHA1 Message Date
Michal Wajdeczko
9f50b729dd drm/xe/pf: Prepare to stop SR-IOV support prior GT reset
As part of the resume or GT reset, the PF driver schedules work
which is then used to complete restarting of the SR-IOV support,
including resending to the GuC configurations of provisioned VFs.

However, in case of short delay between those two actions, which
could be seen by triggering a GT reset on the suspened device:

 $ echo 1 > /sys/kernel/debug/dri/0000:00:02.0/gt0/force_reset

this PF worker might be still busy, which lead to errors due to
just stopped or disabled GuC CTB communication:

 [ ] xe 0000:00:02.0: [drm:xe_gt_resume [xe]] GT0: resumed
 [ ] xe 0000:00:02.0: [drm] GT0: trying reset from force_reset_show [xe]
 [ ] xe 0000:00:02.0: [drm] GT0: reset queued
 [ ] xe 0000:00:02.0: [drm] GT0: reset started
 [ ] xe 0000:00:02.0: [drm:guc_ct_change_state [xe]] GT0: GuC CT communication channel stopped
 [ ] xe 0000:00:02.0: [drm:guc_ct_send_recv [xe]] GT0: H2G request 0x5503 canceled!
 [ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF1 12 config KLVs (-ECANCELED)
 [ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF1 configuration (-ECANCELED)
 [ ] xe 0000:00:02.0: [drm:guc_ct_change_state [xe]] GT0: GuC CT communication channel disabled
 [ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF2 12 config KLVs (-ENODEV)
 [ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF2 configuration (-ENODEV)
 [ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push 2 of 2 VFs configurations
 [ ] xe 0000:00:02.0: [drm:pf_worker_restart_func [xe]] GT0: PF: restart completed

While this VFs reprovisioning will be successful during next spin
of the worker, to avoid those errors, make sure to cancel restart
worker if we are about to trigger next reset.

Fixes: 411220808c ("drm/xe/pf: Restart VFs provisioning after GT reset")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://lore.kernel.org/r/20250711193316.1920-2-michal.wajdeczko@intel.com
2025-07-15 13:05:15 +02:00
Tvrtko Ursulin
aded26ccaa drm/xe: Waste fewer instructions in emit_wa_job()
I was debugging some unrelated issue and noticed the current code was
very verbose. We can improve it easily by using the more common batch
buffer building pattern.

Before:
                bb->cs[bb->len++] = MI_LOAD_REGISTER_REG | MI_LRR_DST_CS_MMIO;
     c4d:       41 8b 56 10             mov    0x10(%r14),%edx
     c51:       49 8b 4e 08             mov    0x8(%r14),%rcx
     c55:       8d 72 01                lea    0x1(%rdx),%esi
     c58:       41 89 76 10             mov    %esi,0x10(%r14)
     c5c:       c7 04 91 01 00 08 15    movl   $0x15080001,(%rcx,%rdx,4)
                        bb->cs[bb->len++] = entry->reg.addr;
     c63:       8b 08                   mov    (%rax),%ecx
     c65:       41 8b 56 10             mov    0x10(%r14),%edx
     c69:       49 8b 76 08             mov    0x8(%r14),%rsi
     c6d:       81 e1 ff ff 3f 00       and    $0x3fffff,%ecx
     c73:       8d 7a 01                lea    0x1(%rdx),%edi
     c76:       41 89 7e 10             mov    %edi,0x10(%r14)
     c7a:       89 0c 96                mov    %ecx,(%rsi,%rdx,4)
 ..etc..

After:
                *cs++ = MI_LOAD_REGISTER_REG | MI_LRR_DST_CS_MMIO;
     c52:       41 c7 04 24 01 00 08    movl   $0x15080001,(%r12)
     c59:       15
                        *cs++ = entry->reg.addr;
     c5a:       8b 10                   mov    (%rax),%edx
 ..etc..

Resulting in the following binary change:

	add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-348 (-348)
	Function                                     old     new   delta
	xe_gt_record_default_lrcs.cold               304     296      -8
	xe_gt_record_default_lrcs                   2200    1860    -340
	Total: Before=13554, After=13206, chg -2.57%

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250710-lrc-refactors-v2-7-a5e2ca03f6bd@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-14 13:40:17 -07:00
Lucas De Marchi
f4b538245f drm/xe/gt: Drop third submission for default context
There's no need to submit the nop job again on the first queue. Any
state needed is already saved when the first LRC is switched out. The
comment is a little misleading regarding indirect W/A: first of all
there's still no indirect W/A enabled and secondly, even after they are,
there's no need to submit this job again for having their state
propagated: the indirect W/A will actually run on every LRC switch.

Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250710-lrc-refactors-v2-6-a5e2ca03f6bd@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-14 13:40:17 -07:00
Lucas De Marchi
fab2cc0c09 drm/xe/gt: Extract emit_job_sync()
Both the nop and wa jobs are going through the same boiler plate calls
to emit the job with a timeout and handling error for both bb and job.
Extract emit_job_sync() so those functions create the bb, handling
possible errors and delegate the part about really emitting the job
and waiting for its completion.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Link: https://lore.kernel.org/r/20250710-lrc-refactors-v2-3-a5e2ca03f6bd@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-14 13:20:03 -07:00
Lucas De Marchi
e4cb5823ba drm/xe: Count dwords before allocating
The bb allocation in emit_wa_job() is wrong in 2 ways: first it's
allocating enough space for the 3DSTATE or hardcoding 4k depending on
the engine. In the first case it doesn't account for the WAs and in the
former it may not be sufficient. Secondly it's using the size instead of
number of dwords, causing the buffer to be 4x bigger than needed:
xe_bb_new() receives number of dwords as parameter and its declaration
was also not following its implementation.

Lastly, reword the debug message since it's not only about the LRC WAs
anymore as it also include the 3DSTATE for render.

While it's unlikely this is causing any real issue, let's calculate the
needed space and allocate just enough.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Link: https://lore.kernel.org/r/20250710-lrc-refactors-v2-2-a5e2ca03f6bd@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-14 13:20:02 -07:00
Michal Wajdeczko
ffab82b062 drm/xe: Introduce xe_gt_is_main_type helper
Instead of checking for not being a media type GT provide a small
helper to explicitly express our intentions.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://lore.kernel.org/r/20250713103625.1964-5-michal.wajdeczko@intel.com
2025-07-14 18:19:29 +02:00
Matthew Brost
beb72acb5b drm/xe: Move page fault init after topology init
We need the topology to determine GT page fault queue size, move page
fault init after topology init.

Cc: stable@vger.kernel.org
Fixes: 3338e4f90c ("drm/xe: Use topology to determine page fault queue size")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://lore.kernel.org/r/20250710191208.1040215-1-matthew.brost@intel.com
2025-07-11 10:43:11 -07:00
Maarten Lankhorst
18635b6328 drm/xe: Rename xe_uc_init_hw to xe_uc_load_hw
It feels to me like load is closer to the intention than init_hw.

It makes the init calls slightly less confusing to me. :)

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250619104858.418440-24-dev@lankhorst.se
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-06-26 22:11:35 +02:00
Maarten Lankhorst
80fa03eb8a drm/xe: Remove xe_uc_init_hwconfig()
This function is called immediately after xe_uc_init(), so just put it
there instead.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250619104858.418440-22-dev@lankhorst.se
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-06-26 22:11:35 +02:00
Maarten Lankhorst
11bf0f0b3a drm/xe: Split init of xe_gt_init_hwconfig to xe_gt_init and *_early
Now that we added the separate step of initialising GUC in
xe_gt_init_early, it should be ok to initialise the minimum during early
init, and the rest after allocations are allowed.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250619104858.418440-20-dev@lankhorst.se
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-06-26 22:11:35 +02:00
Maarten Lankhorst
6386a49951 drm/xe: Rename gt_init sub-functions
s/gt_fw_domain_init/gt_init_with_gt_forcewake()/
s/all_fw_domain_init/gt_init_with_all_forcewake()/

Clarify that the functions are the part of gt_init() that are called
with the respective power domains held. all_domain() of course only
works after discovering and initialisation of force_wake on all engines,
that's why the split is needed in the first place.

Suggested-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250619104858.418440-19-dev@lankhorst.se
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-06-26 22:11:35 +02:00
Maarten Lankhorst
4c5517e9ec drm/xe: Only dump PAT when xe_hw_engines_init_early fails
After discussion with Lucas De Marchi, it turns out that is the
specific caller requiring a dump. This allows us to cleanup
gt_init in a bit.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250619104858.418440-18-dev@lankhorst.se
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-06-26 22:11:35 +02:00
Maarten Lankhorst
396044c9d8 drm/xe: Simplify GuC early initialization
Add a 2-stage GuC init. An early one for everything that is needed
for VF, and a full one that loads GuC and is allowed to do allocations.

Link: https://lore.kernel.org/r/20250619104858.418440-16-dev@lankhorst.se
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-06-26 22:10:30 +02:00
Maarten Lankhorst
b3412d7233 drm/xe/sriov: Move VF bootstrap and query_config to vf_guc_init
We want to split up GUC init to an alloc and noalloc part to keep the
init path the same for VF and !VF as much as possible.

Everything in vf_guc_init should be done as early as possible, otherwise
VRAM probing becomes impossible.

Also move xe_gt_mmio_init to the end of xe_gt_init_early(), cleaning up
the init in xe_device slightly.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250619104858.418440-15-dev@lankhorst.se
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-06-26 22:07:44 +02:00
Vinay Belgaumkar
6ab42fa03d drm/xe/bmg: Update Wa_16023588340
This allows for additional L2 caching modes.

Fixes: 01570b4469 ("drm/xe/bmg: implement Wa_16023588340")
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://lore.kernel.org/r/20250612-wa-14022085890-v4-2-94ba5dcc1e30@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-06-12 23:23:39 -07:00
Balasubramani Vivekanandan
241cc827c0 drm/xe/mocs: Initialize MOCS index early
MOCS uc_index is used even before it is initialized in the following
callstack
    guc_prepare_xfer()
    __xe_guc_upload()
    xe_guc_min_load_for_hwconfig()
    xe_uc_init_hwconfig()
    xe_gt_init_hwconfig()

Do MOCS index initialization earlier in the device probe.

Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Ravi Kumar Vodapalli <ravi.kumar.vodapalli@intel.com>
Link: https://lore.kernel.org/r/20250520142445.2792824-1-balasubramani.vivekanandan@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-05-29 14:29:18 -07:00
Daniele Ceraolo Spurio
12370bfcc4 drm/xe/gsc: do not flush the GSC worker from the reset path
The workqueue used for the reset worker is marked as WQ_MEM_RECLAIM,
while the GSC one isn't (and can't be as we need to do memory
allocations in the gsc worker). Therefore, we can't flush the latter
from the former.

The reason why we had such a flush was to avoid interrupting either
the GSC FW load or in progress GSC proxy operations. GSC proxy
operations fall into 2 categories:

1) GSC proxy init: this only happens once immediately after GSC FW load
   and does not support being interrupted. The only way to recover from
   an interruption of the proxy init is to do an FLR and re-load the GSC.

2) GSC proxy request: this can happen in response to a request that
   the driver sends to the GSC. If this is interrupted, the GSC FW will
   timeout and the driver request will be failed, but overall the GSC
   will keep working fine.

Flushing the work allowed us to avoid interruption in both cases (unless
the hang came from the GSC engine itself, in which case we're toast
anyway). However, a failure on a proxy request is tolerable if we're in
a scenario where we're triggering a GT reset (i.e., something is already
gone pretty wrong), so what we really need to avoid is interrupting
the init flow, which we can do by polling on the register that reports
when the proxy init is complete (as that ensure us that all the load and
init operations have been completed).

Note that during suspend we still want to do a flush of the worker to
make sure it completes any operations involving the HW before the power
is cut.

v2: fix spelling in commit msg, rename waiter function (Julia)

Fixes: dd0e89e5ed ("drm/xe/gsc: GSC FW load")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4830
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
Link: https://lore.kernel.org/r/20250502155104.2201469-1-daniele.ceraolospurio@intel.com
2025-05-05 13:36:52 -07:00
Michal Wajdeczko
f2f90989cc drm/xe: Avoid reading RMW registers in emit_wa_job
To allow VFs properly handle LRC WAs, we should postpone doing
all RMW register operations and let them be run by the engine
itself, since attempt to perform read registers from within the
driver will fail on the VF. Use MI_MATH and ALU for that.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250303173522.1822-4-michal.wajdeczko@intel.com
2025-03-12 11:37:51 +01:00
Tvrtko Ursulin
067a974fd8 drm/xe: Add performance tunings to debugfs
Add a list of active tunings to debugfs, analogous to the existing
list of workarounds.

Rationale being that it seems to make sense to either have both or none.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250227101304.46660-6-tvrtko.ursulin@igalia.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-28 21:47:33 -08:00
Tvrtko Ursulin
25d434cef7 drm/xe: Fix GT "for each engine" workarounds
Any rules using engine matching are currently broken due RTP processing
happening too in early init, before the list of hardware engines has been
initialised.

Fix this by moving workaround processing to later in the driver probe
sequence, to just before the processed list is used for the first time.

Looking at the debugfs gt0/workarounds on ADL-P we notice 14011060649
should be present while we see, before:

 GT Workarounds
     14011059788
     14015795083

And with the patch:

 GT Workarounds
     14011060649
     14011059788
     14015795083

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: stable@vger.kernel.org # v6.11+
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250227101304.46660-2-tvrtko.ursulin@igalia.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-28 21:42:49 -08:00
Harish Chegondi
9a0b11d4cf drm/xe/eustall: Add support to init, enable and disable EU stall sampling
Implement EU stall sampling APIs introduced in the previous patch for
Xe_HPC (PVC). Add register definitions and the code that accesses these
registers to the APIs.

Add initialization and clean up functions and their implementations,
EU stall enable and disable functions.

v11: Move stream->xecore_buf alloc to xe_eu_stall_data_buf_alloc().
     Register xe_eu_stall_fini() with devm_add_action_or_reset()
     instead of calling it from xe_gt_fini().
     Changed a couple of variables in struct xe_eu_stall_data_stream
     from unsigned int to int.
v10: Fixed error rewinding code
     Moved code around as per review feedback
v9: Moved structure definitions from xe_eu_stall.h to xe_eu_stall.c
    Moved read and poll implementations to the next patch
    Used xe_bo_create_pin_map_at_aligned instead of xe_bo_create_pin_map
    Changed lock names as per review feedback
    Moved drop data handling into a subsequent patch
    Moved code around as per review feedback
v8: Updated copyright year in xe_eu_stall_regs.h to 2025.
    Renamed struct drm_xe_eu_stall_data_pvc to struct xe_eu_stall_data_pvc
    since it is a local structure.
v6: Fix buffer wrap around over write bug (Matt Olson)

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/b6aeca593d521828a0b4fbf6cfd2844716c4fc66.1740533885.git.harish.chegondi@intel.com
2025-02-26 11:30:59 -08:00
Ilia Levi
eb79d71e50 drm/xe: Add xe_mmio_init() initialization function
Add a convenience function for minimal initialization of struct xe_mmio.
This function also validates that the entirety of the provided mmio region
is usable with struct xe_reg.

v2: Modify commit message, add kernel doc, refactor assert (Michal)
v3: Fix off-by-one bug, add clarifying macro (Michal)
v4: Derive bitfield width from size (Michal)

Signed-off-by: Ilia Levi <ilia.levi@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250213093559.204652-1-ilia.levi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-18 08:27:11 -08:00
Lucas De Marchi
f5ebe80e32 drm/xe: Cleanup extra calls to xe_hw_fence_irq_finish()
Now that xe_gt_remove is handled entirely by xe_gt, it's clear there are
some extra calls to xe_hw_fence_irq_finish() that aren't necessary.
Neither all_fw_domain_init() or gt_fw_domain_init() need to do that
since it's handled by the caller on any error.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250213192909.996148-8-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-14 11:42:55 -08:00
Lucas De Marchi
ff6cd29b69 drm/xe: Cleanup unwind of gt initialization
The only thing in xe_gt_remove() that really needs to happen on the
device remove callback is the xe_uc_remove(). That's because of the
following call chain:

	xe_gt_remove()
	  xe_uc_remove()
	    xe_gsc_remove()
	      xe_gsc_proxy_remove()

Move xe_gsc_proxy_remove() to be handled as a xe_device_remove_action,
so it's recorded when it should run during device removal. The rest can
be handled normally by devm infra.

Besides removing the deep call chain above, xe_device_probe() doesn't
have to unwind the gt loop and it's also more in line with the
xe_device_probe() style.

Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250213192909.996148-7-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-14 11:42:55 -08:00
Michal Wajdeczko
459777724d drm/xe/vf: Don't try to trigger a full GT reset if VF
VFs don't have access to the GDRST(0x941c) register that driver
uses to reset a GT. Attempt to trigger a reset using debugfs:

 $ cat /sys/kernel/debug/dri/0000:00:02.1/gt0/force_reset

or due to a hang condition detected by the driver leads to:

 [ ] xe 0000:00:02.1: [drm] GT0: trying reset from force_reset [xe]
 [ ] xe 0000:00:02.1: [drm] GT0: reset queued
 [ ] xe 0000:00:02.1: [drm] GT0: reset started
 [ ] ------------[ cut here ]------------
 [ ] xe 0000:00:02.1: [drm] GT0: VF is trying to write 0x1 to an inaccessible register 0x941c+0x0
 [ ] WARNING: CPU: 3 PID: 3069 at drivers/gpu/drm/xe/xe_gt_sriov_vf.c:996 xe_gt_sriov_vf_write32+0xc6/0x580 [xe]
 [ ] RIP: 0010:xe_gt_sriov_vf_write32+0xc6/0x580 [xe]
 [ ] Call Trace:
 [ ]  <TASK>
 [ ]  ? show_regs+0x6c/0x80
 [ ]  ? __warn+0x93/0x1c0
 [ ]  ? xe_gt_sriov_vf_write32+0xc6/0x580 [xe]
 [ ]  ? report_bug+0x182/0x1b0
 [ ]  ? handle_bug+0x6e/0xb0
 [ ]  ? exc_invalid_op+0x18/0x80
 [ ]  ? asm_exc_invalid_op+0x1b/0x20
 [ ]  ? xe_gt_sriov_vf_write32+0xc6/0x580 [xe]
 [ ]  ? xe_gt_sriov_vf_write32+0xc6/0x580 [xe]
 [ ]  ? xe_gt_tlb_invalidation_reset+0xef/0x110 [xe]
 [ ]  ? __mutex_unlock_slowpath+0x41/0x2e0
 [ ]  xe_mmio_write32+0x64/0x150 [xe]
 [ ]  do_gt_reset+0x2f/0xa0 [xe]
 [ ]  gt_reset_worker+0x14e/0x1e0 [xe]
 [ ]  process_one_work+0x21c/0x740
 [ ]  worker_thread+0x1db/0x3c0

Fix that by sending H2G VF_RESET(0x5507) action instead.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4078
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250131182502.852-1-michal.wajdeczko@intel.com
2025-02-04 15:31:45 +01:00
Michal Wajdeczko
9ebb5846e1 drm/xe/pf: Fix migration initialization
The migration support only needs to be initialized once, but it
was incorrectly called from the xe_gt_sriov_pf_init_hw(), which
is part of the reset flow and may be called multiple times.

Fixes: d86e3737c7 ("drm/xe/pf: Add functions to save and restore VF GuC state")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250120232443.544-1-michal.wajdeczko@intel.com
2025-01-21 21:52:47 +01:00
Michal Wajdeczko
bbd8429264 drm/xe: Always setup GT MMIO adjustment data
While we believed that xe_gt_mmio_init() will be called just once
per GT, this might not be a case due to some tweaks that need to
performed by the VF driver during early probe.  To avoid leaving
any stale data in case of the re-run, reset the GT MMIO adjustment
data for the non-media GT case.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241114175955.2299-2-michal.wajdeczko@intel.com
2025-01-18 22:03:42 +01:00
Lucas De Marchi
5001ef3af8 drm/xe: Fix tlb invalidation when wedging
If GuC fails to load, the driver wedges, but in the process it tries to
do stuff that may not be initialized yet. This moves the
xe_gt_tlb_invalidation_init() to be done earlier: as its own doc says,
it's a software-only initialization and should had been named with the
_early() suffix.

Move it to be called by xe_gt_init_early(), so the locks and seqno are
initialized, avoiding a NULL ptr deref when wedging:

	xe 0000:03:00.0: [drm] *ERROR* GT0: load failed: status: Reset = 0, BootROM = 0x50, UKernel = 0x00, MIA = 0x00, Auth = 0x01
	xe 0000:03:00.0: [drm] *ERROR* GT0: firmware signature verification failed
	xe 0000:03:00.0: [drm] *ERROR* CRITICAL: Xe has declared device 0000:03:00.0 as wedged.
	...
	BUG: kernel NULL pointer dereference, address: 0000000000000000
	#PF: supervisor read access in kernel mode
	#PF: error_code(0x0000) - not-present page
	PGD 0 P4D 0
	Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
	CPU: 9 UID: 0 PID: 3908 Comm: modprobe Tainted: G     U  W          6.13.0-rc4-xe+ #3
	Tainted: [U]=USER, [W]=WARN
	Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-S ADP-S DDR5 UDIMM CRB, BIOS ADLSFWI1.R00.3275.A00.2207010640 07/01/2022
	RIP: 0010:xe_gt_tlb_invalidation_reset+0x75/0x110 [xe]

This can be easily triggered by poking the GuC binary to force a
signature failure. There will still be an extra message,

	xe 0000:03:00.0: [drm] *ERROR* GT0: GuC mmio request 0x4100: no reply 0x4100

but that's better than a NULL ptr deref.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3956
Fixes: 7dbe8af13c ("drm/xe: Wedge the entire device")
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250103001111.331684-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-01-03 12:43:01 -08:00
Lucas De Marchi
3fcf68d739 drm/xe: Apply whitelist to engine save-restore
Instead of handling the whitelist directly in the GuC ADS
initialization, make it follow the same logic as other engine registers
that are save-restored. Main benefit is that then the SW tracking then
shows it in debugfs and there's no risk of an engine workaround to write
to the same nopriv register that is being passed directly to GuC.

This means that xe_reg_whitelist_process_engine() only has to process
the RTP and convert them to entries for the hwe.  With that all the
registers should be covered by xe_reg_sr_apply_mmio() to write to the HW
and there's no special handling in GuC ADS to also add these registers
to the list of registers that is passed to GuC.

Example for DG2:

	# cat  /sys/kernel/debug/dri/0000\:03\:00.0/gt0/register-save-restore
	...
	Engine
	rcs0
		...
		REG[0x24d0] clr=0xffffffff set=0x1000dafc masked=no mcr=no
		REG[0x24d4] clr=0xffffffff set=0x1000db01 masked=no mcr=no
		REG[0x24d8] clr=0xffffffff set=0x0000db1c masked=no mcr=no
	...
	Whitelist
	rcs0
		REG[0xdafc-0xdaff]: allow read access
		REG[0xdb00-0xdb1f]: allow read access
		REG[0xdb1c-0xdb1f]: allow rw access

v2:
  - Use ~0u for clr bits so it's just a write (Matt Roper)
  - Simplify helpers now that unused slots are not written

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241209232739.147417-6-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-12-11 07:28:58 -08:00
Matthew Brost
e2d84e5b22 drm/xe: Mark GT work queue with WQ_MEM_RECLAIM
GT ordered work queue can be used to free memory via resets and fence
signaling thus we should allow this work queue to run during reclaim.
Mark with GT ordered work queue with WQ_MEM_RECLAIM appropriately.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241021175705.1584521-5-matthew.brost@intel.com
2024-10-23 11:08:08 -07:00
Himal Prasad Ghimiray
30d105577a
drm/xe/gt: Update handling of xe_force_wake_get return
xe_force_wake_get() now returns the reference count-incremented domain
mask. If it fails for individual domains, the return value will always
be 0. However, for XE_FORCEWAKE_ALL, it may return a non-zero value even
in the event of failure. Use helper xe_force_wake_ref_has_domain to verify
all domains are initialized or not. Update the return handling of
xe_force_wake_get() to reflect this behavior, and ensure that the return
value is passed as input to xe_force_wake_put().

v3
- return xe_wakeref_t instead of int in xe_force_wake_get()
- xe_force_wake_put() error doesn't need to be checked. It internally
WARNS on domain ack failure.

v4
- Rebase fix

v5
- return unsigned int for xe_force_wake_get()
- remove redundant XE_WARN_ON()

v6
- use helper for checking all initialized domains are awake or not.

v7
- Fix commit message

v9
- Remove redundant WARN_ON (Badal)

Cc: Badal Nilawar <badal.nilawar@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241014075601.2324382-10-himal.prasad.ghimiray@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-10-17 10:17:08 -04:00
Nirmoy Das
649f533b7a drm/xe: Add caller info to xe_gt_reset_async
Add caller info to the xe_gt_reset_async() to help debug issues.

v2: s/%pS/%ps(Matt)

Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2874
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241016141717.881143-1-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
2024-10-17 14:56:34 +02:00
Matthew Auld
67ec9f87bd drm/xe/bmg: improve cache flushing behaviour
The BSpec says that EN_L3_RW_CCS_CACHE_FLUSH must be toggled
on for manual global invalidation to take effect and actually flush
device cache, however this also turns on flushing for things like
pipecontrol, which occurs between submissions for compute/render. This
sounds like massive overkill for our needs, where we already have the
manual flushing on the display side with the global invalidation. Some
observations on BMG:

1. Disabling l2 caching for host writes and stubbing out the driver
   global invalidation but keeping EN_L3_RW_CCS_CACHE_FLUSH enabled, has
   no impact on wb-transient-vs-display IGT, which makes sense since the
   pipecontrol is now flushing the device cache after the render copy.
   Without EN_L3_RW_CCS_CACHE_FLUSH the test then fails, which is also
   expected since device cache is now dirty and display engine can't see
   the writes.

2. Disabling EN_L3_RW_CCS_CACHE_FLUSH, but keeping the driver global
   invalidation also has no impact on wb-transient-vs-display. This
   suggests that the global invalidation still works as expected and is
   flushing the device cache without EN_L3_RW_CCS_CACHE_FLUSH turned on.

With that drop EN_L3_RW_CCS_CACHE_FLUSH. This helps some workloads since
we no longer flush the device cache between submissions as part of
pipecontrol.

Edit: We now also have clarification from HW side that BSpec was indeed
wrong here.

v2:
  - Rebase and update commit message.

BSpec: 71718
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Vitasta Wattal <vitasta.wattal@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241007074541.33937-2-matthew.auld@intel.com
2024-10-09 09:01:42 +01:00
Thomas Hellström
b88132ceb3 Merge drm/drm-next into drm-xe-next
Backmerging to resolve a conflict with core locally.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-10-04 11:46:30 +02:00
Vinay Belgaumkar
491418a258
drm/xe: Restore GT freq on GSC load error
As part of a Wa_22019338487, ensure that GT freq is restored
even when GSC reload is not successful.

Fixes: 3b1592fb78 ("drm/xe/lnl: Apply Wa_22019338487")

Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240925204918.1989574-1-vinay.belgaumkar@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-10-03 14:12:04 -04:00
Linus Torvalds
de848da12f drm next for 6.12-rc1
string:
 - add mem_is_zero()
 
 core:
 - support more device numbers
 - use XArray for minor ids
 - add backlight constants
 - Split dma fence array creation into alloc and arm
 
 fbdev:
 - remove usage of old fbdev hooks
 
 kms:
 - Add might_fault() to drm_modeset_lock priming
 - Add dynamic per-crtc vblank configuration support
 
 dma-buf:
 - docs cleanup
 
 buddy:
 - Add start address support for trim function
 
 printk:
 - pass description to kmsg_dump
 
 scheduler;
 - Remove full_recover from drm_sched_start
 
 ttm:
 - Make LRU walk restartable after dropping locks
 - Allow direct reclaim to allocate local memory
 
 panic:
 - add display QR code (in rust)
 
 displayport:
 - mst: GUID improvements
 
 bridge:
 - Silence error message on -EPROBE_DEFER
 - analogix: Clean aup
 - bridge-connector: Fix double free
 - lt6505: Disable interrupt when powered off
 - tc358767: Make default DP port preemphasis configurable
 - lt9611uxc: require DRM_BRIDGE_ATTACH_NO_CONNECTOR
 - anx7625: simplify OF array handling
 - dw-hdmi: simplify clock handling
 - lontium-lt8912b: fix mode validation
 - nwl-dsi: fix mode vsync/hsync polarity
 
 xe:
 - Enable LunarLake and Battlemage support
 - Introducing Xe2 ccs modifiers for integrated and discrete graphics
 - rename xe perf to xe observation
 - use wb caching on DGFX for system memory
 - add fence timeouts
 - Lunar Lake graphics/media/display workarounds
 - Battlemage workarounds
 - Battlemage GSC support
 - GSC and HuC fw updates for LL/BM
 - use dma_fence_chain_free
 - refactor hw engine lookup and mmio access
 - enable priority mem read for Xe2
 - Add first GuC BMG fw
 - fix dma-resv lock
 - Fix DGFX display suspend/resume
 - Use xe_managed for kernel BOs
 - Use reserved copy engine for user binds on faulting devices
 - Allow mixing dma-fence jobs and long-running faulting jobs
 - fix media TLB invalidation
 - fix rpm in TTM swapout path
 - track resources and VF state by PF
 
 i915:
 - Type-C programming fix for MTL+
 - FBC cleanup
 - Calc vblank delay more accurately
 - On DP MST, Enable LT fallback for UHBR<->non-UHBR rates
 - Fix DP LTTPR detection
 - limit relocations to INT_MAX
 - fix long hangs in buddy allocator on DG2/A380
 
 amdgpu:
 - Per-queue reset support
 - SDMA devcoredump support
 - DCN 4.0.1 updates
 - GFX12/VCN4/JPEG4 updates
 - Convert vbios embedded EDID to drm_edid
 - GFX9.3/9.4 devcoredump support
 - process isolation framework for GFX 9.4.3/4
 - take IOMMU mappings into account for P2P DMA
 
 amdkfd:
 - CRIU fixes
 - HMM fix
 - Enable process isolation support for GFX 9.4.3/4
 - Allow users to target recommended SDMA engines
 - KFD support for targetting queues on recommended SDMA engines
 
 radeon:
 - remove .load and drm_dev_alloc
 - Fix vbios embedded EDID size handling
 - Convert vbios embedded EDID to drm_edid
 - Use GEM references instead of TTM
 - r100 cp init cleanup
 - Fix potential overflows in evergreen CS offset tracking
 
 msm:
 - DPU:
 - implement DP/PHY mapping on SC8180X
 - Enable writeback on SM8150, SC8180X, SM6125, SM6350
 - DP:
 - Enable widebus on all relevant chipsets
 - MSM8998 HDMI support
 - GPU:
 - A642L speedbin support
 - A615/A306/A621 support
 - A7xx devcoredump support
 
 ast:
 - astdp: Support AST2600 with VGA
 - Clean up HPD
 - Fix timeout loop for DP link training
 - reorganize output code by type (VGA, DP, etc)
 - convert to struct drm_edid
 - fix BMC handling for all outputs
 
 exynos:
 - drop stale MAINTAINERS pattern
 - constify struct
 
 loongson:
 - use GEM refcount over TTM
 
 mgag200:
 - Improve BMC handling
 - Support VBLANK intterupts
 - transparently support BMC outputs
 
 nouveau:
 - Refactor and clean up internals
 - Use GEM refcount over TTM's
 
 gm12u320:
 - convert to struct drm_edid
 
 gma500:
 - update i2c terms
 
 lcdif:
 - pixel clock fix
 
 host1x:
 - fix syncpoint IRQ during resume
 - use iommu_paging_domain_alloc()
 
 imx:
 - ipuv3: convert to struct drm_edid
 
 omapdrm:
 - improve error handling
 - use common helper for_each_endpoint_of_node()
 
 panel:
 - add support for BOE TV101WUM-LL2 plus DT bindings
 - novatek-nt35950: improve error handling
 - nv3051d: improve error handling
 - panel-edp: add support for BOE NE140WUM-N6G; revert support for
   SDC ATNA45AF01
 - visionox-vtdr6130: improve error handling; use
   devm_regulator_bulk_get_const()
 - boe-th101mb31ig002: Support for starry-er88577 MIPI-DSI panel plus
   DT; Fix porch parameter
 - edp: Support AOU B116XTN02.3, AUO B116XAN06.1, AOU B116XAT04.1,
   BOE NV140WUM-N41, BOE NV133WUM-N63, BOE NV116WHM-A4D, CMN N116BCA-EA2,
   CMN N116BCP-EA2, CSW MNB601LS1-4
 - himax-hx8394: Support Microchip AC40T08A MIPI Display panel plus DT
 - ilitek-ili9806e: Support Densitron DMT028VGHMCMI-1D TFT plus DT
 - jd9365da: Support Melfas lmfbx101117480 MIPI-DSI panel plus DT; Refactor
   for code sharing
 - panel-edp: fix name for HKC MB116AN01
 - jd9365da: fix "exit sleep" commands
 - jdi-fhd-r63452: simplify error handling with DSI multi-style
   helpers
 - mantix-mlaf057we51: simplify error handling with DSI multi-style
   helpers
 - simple:
   support Innolux G070ACE-LH3 plus DT bindings
   support On Tat Industrial Company KD50G21-40NT-A1 plus DT bindings
 - st7701:
   decouple DSI and DRM code
   add SPI support
   support Anbernic RG28XX plus DT bindings
 
 mediatek:
 - support alpha blending
 - remove cl in struct cmdq_pkt
 - ovl adaptor fix
 - add power domain binding for mediatek DPI controller
 
 renesas:
 - rz-du: add support for RZ/G2UL plus DT bindings
 
 rockchip:
 - Improve DP sink-capability reporting
 - dw_hdmi: Support 4k@60Hz
 - vop: Support RGB display on Rockchip RK3066; Support 4096px width
 
 sti:
 - convert to struct drm_edid
 
 stm:
 - Avoid UAF wih managed plane and CRTC helpers
 - Fix module owner
 - Fix error handling in probe
 - Depend on COMMON_CLK
 - ltdc: Fix transparency after disabling plane; Remove unused interrupt
 
 tegra:
 - gr3d: improve PM domain handling
 - convert to struct drm_edid
 - Call drm_atomic_helper_shutdown()
 
 vc4:
 - fix PM during detect
 - replace DRM_ERROR() with drm_error()
 - v3d: simplify clock retrieval
 
 v3d:
 - Clean up perfmon
 
 virtio:
 - add DRM capset
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmbq43gACgkQDHTzWXnE
 hr4+lg/+O/r41E7ioitcM0DWeWem0dTlvQr41pJ8jujHvw+bXNdg0BMGWtsTyTLA
 eOft2AwofsFjg+O7l8IFXOT37mQLdIdfjb3+w5brI198InL3OWC3QV8ZSwY9VGET
 n8crO9jFoxNmHZnFniBZbtI6egTyl6H+2ey3E0MTnKiPUKZQvsK/4+x532yVLPob
 UUOze5wcjyGZc7LJEIZPohPVneCb9ki7sabDQqh4cxIQ0Eg+nqPpWjYM4XVd+lTS
 8QmssbR49LrJ7z9m90qVE+8TjYUCn+ChDPMs61KZAAnc8k++nK41btjGZ23mDKPb
 YEguahCYthWJ4U8K18iXBPnLPxZv5+harQ8OIWAUYqdIOWSXHozvuJ2Z84eHV13a
 9mQ5vIymXang8G1nEXwX/vml9uhVhBCeWu3qfdse2jfaTWYUb1YzhqUoFvqI0R0K
 8wT03MyNdx965CSqAhpH5Jd559ueZmpd+jsHOfhAS+1gxfD6NgoPXv7lpnMUmGWX
 SnaeC9RLD4cgy7j2Swo7TEqQHrvK5XhZSwX94kU6RPmFE5RRKqWgFVQmwuikDMId
 UpNqDnPT5NL2UX4TNG4V4coyTXvKgVcSB9TA7j8NSLfwdGHhiz73pkYosaZXKyxe
 u6qKMwMONfZiT20nhD7RhH0AFnnKosAcO14dhn0TKFZPY6Ce9O8=
 =7jR+
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2024-09-19' of https://gitlab.freedesktop.org/drm/kernel

Pull drm updates from Dave Airlie:
 "This adds a couple of patches outside the drm core, all should be
  acked appropriately, the string and pstore ones are the main ones that
  come to mind.

  Otherwise it's the usual drivers, xe is getting enabled by default on
  some new hardware, we've changed the device number handling to allow
  more devices, and we added some optional rust code to create QR codes
  in the panic handler, an idea first suggested I think 10 years ago :-)

  string:
   - add mem_is_zero()

  core:
   - support more device numbers
   - use XArray for minor ids
   - add backlight constants
   - Split dma fence array creation into alloc and arm

  fbdev:
   - remove usage of old fbdev hooks

  kms:
   - Add might_fault() to drm_modeset_lock priming
   - Add dynamic per-crtc vblank configuration support

  dma-buf:
   - docs cleanup

  buddy:
   - Add start address support for trim function

  printk:
   - pass description to kmsg_dump

  scheduler:
   - Remove full_recover from drm_sched_start

  ttm:
   - Make LRU walk restartable after dropping locks
   - Allow direct reclaim to allocate local memory

  panic:
   - add display QR code (in rust)

  displayport:
   - mst: GUID improvements

  bridge:
   - Silence error message on -EPROBE_DEFER
   - analogix: Clean aup
   - bridge-connector: Fix double free
   - lt6505: Disable interrupt when powered off
   - tc358767: Make default DP port preemphasis configurable
   - lt9611uxc: require DRM_BRIDGE_ATTACH_NO_CONNECTOR
   - anx7625: simplify OF array handling
   - dw-hdmi: simplify clock handling
   - lontium-lt8912b: fix mode validation
   - nwl-dsi: fix mode vsync/hsync polarity

  xe:
   - Enable LunarLake and Battlemage support
   - Introducing Xe2 ccs modifiers for integrated and discrete graphics
   - rename xe perf to xe observation
   - use wb caching on DGFX for system memory
   - add fence timeouts
   - Lunar Lake graphics/media/display workarounds
   - Battlemage workarounds
   - Battlemage GSC support
   - GSC and HuC fw updates for LL/BM
   - use dma_fence_chain_free
   - refactor hw engine lookup and mmio access
   - enable priority mem read for Xe2
   - Add first GuC BMG fw
   - fix dma-resv lock
   - Fix DGFX display suspend/resume
   - Use xe_managed for kernel BOs
   - Use reserved copy engine for user binds on faulting devices
   - Allow mixing dma-fence jobs and long-running faulting jobs
   - fix media TLB invalidation
   - fix rpm in TTM swapout path
   - track resources and VF state by PF

  i915:
   - Type-C programming fix for MTL+
   - FBC cleanup
   - Calc vblank delay more accurately
   - On DP MST, Enable LT fallback for UHBR<->non-UHBR rates
   - Fix DP LTTPR detection
   - limit relocations to INT_MAX
   - fix long hangs in buddy allocator on DG2/A380

  amdgpu:
   - Per-queue reset support
   - SDMA devcoredump support
   - DCN 4.0.1 updates
   - GFX12/VCN4/JPEG4 updates
   - Convert vbios embedded EDID to drm_edid
   - GFX9.3/9.4 devcoredump support
   - process isolation framework for GFX 9.4.3/4
   - take IOMMU mappings into account for P2P DMA

  amdkfd:
   - CRIU fixes
   - HMM fix
   - Enable process isolation support for GFX 9.4.3/4
   - Allow users to target recommended SDMA engines
   - KFD support for targetting queues on recommended SDMA engines

  radeon:
   - remove .load and drm_dev_alloc
   - Fix vbios embedded EDID size handling
   - Convert vbios embedded EDID to drm_edid
   - Use GEM references instead of TTM
   - r100 cp init cleanup
   - Fix potential overflows in evergreen CS offset tracking

  msm:
   - DPU:
      - implement DP/PHY mapping on SC8180X
      - Enable writeback on SM8150, SC8180X, SM6125, SM6350
   - DP:
      - Enable widebus on all relevant chipsets
      - MSM8998 HDMI support
   - GPU:
      - A642L speedbin support
      - A615/A306/A621 support
      - A7xx devcoredump support

  ast:
   - astdp: Support AST2600 with VGA
   - Clean up HPD
   - Fix timeout loop for DP link training
   - reorganize output code by type (VGA, DP, etc)
   - convert to struct drm_edid
   - fix BMC handling for all outputs

  exynos:
   - drop stale MAINTAINERS pattern
   - constify struct

  loongson:
   - use GEM refcount over TTM

  mgag200:
   - Improve BMC handling
   - Support VBLANK intterupts
   - transparently support BMC outputs

  nouveau:
   - Refactor and clean up internals
   - Use GEM refcount over TTM's

  gm12u320:
   - convert to struct drm_edid

  gma500:
   - update i2c terms

  lcdif:
   - pixel clock fix

  host1x:
   - fix syncpoint IRQ during resume
   - use iommu_paging_domain_alloc()

  imx:
   - ipuv3: convert to struct drm_edid

  omapdrm:
   - improve error handling
   - use common helper for_each_endpoint_of_node()

  panel:
   - add support for BOE TV101WUM-LL2 plus DT bindings
   - novatek-nt35950: improve error handling
   - nv3051d: improve error handling
   - panel-edp:
      - add support for BOE NE140WUM-N6G
      - revert support for SDC ATNA45AF01
   - visionox-vtdr6130:
      - improve error handling
      - use devm_regulator_bulk_get_const()
   - boe-th101mb31ig002:
      - Support for starry-er88577 MIPI-DSI panel plus DT
      - Fix porch parameter
   - edp: Support AOU B116XTN02.3, AUO B116XAN06.1, AOU B116XAT04.1, BOE
     NV140WUM-N41, BOE NV133WUM-N63, BOE NV116WHM-A4D, CMN N116BCA-EA2,
     CMN N116BCP-EA2, CSW MNB601LS1-4
   - himax-hx8394: Support Microchip AC40T08A MIPI Display panel plus DT
   - ilitek-ili9806e: Support Densitron DMT028VGHMCMI-1D TFT plus DT
   - jd9365da:
      - Support Melfas lmfbx101117480 MIPI-DSI panel plus DT
      - Refactor for code sharing
   - panel-edp: fix name for HKC MB116AN01
   - jd9365da: fix "exit sleep" commands
   - jdi-fhd-r63452: simplify error handling with DSI multi-style
     helpers
   - mantix-mlaf057we51: simplify error handling with DSI multi-style
     helpers
   - simple:
      - support Innolux G070ACE-LH3 plus DT bindings
      - support On Tat Industrial Company KD50G21-40NT-A1 plus DT
        bindings
   - st7701:
      - decouple DSI and DRM code
      - add SPI support
      - support Anbernic RG28XX plus DT bindings

  mediatek:
   - support alpha blending
   - remove cl in struct cmdq_pkt
   - ovl adaptor fix
   - add power domain binding for mediatek DPI controller

  renesas:
   - rz-du: add support for RZ/G2UL plus DT bindings

  rockchip:
   - Improve DP sink-capability reporting
   - dw_hdmi: Support 4k@60Hz
   - vop:
      - Support RGB display on Rockchip RK3066
      - Support 4096px width

  sti:
   - convert to struct drm_edid

  stm:
   - Avoid UAF wih managed plane and CRTC helpers
   - Fix module owner
   - Fix error handling in probe
   - Depend on COMMON_CLK
   - ltdc:
      - Fix transparency after disabling plane
      - Remove unused interrupt

  tegra:
   - gr3d: improve PM domain handling
   - convert to struct drm_edid
   - Call drm_atomic_helper_shutdown()

  vc4:
   - fix PM during detect
   - replace DRM_ERROR() with drm_error()
   - v3d: simplify clock retrieval

  v3d:
   - Clean up perfmon

  virtio:
   - add DRM capset"

* tag 'drm-next-2024-09-19' of https://gitlab.freedesktop.org/drm/kernel: (1326 commits)
  drm/xe: Fix missing conversion to xe_display_pm_runtime_resume
  drm/xe/xe2hpg: Add Wa_15016589081
  drm/xe: Don't keep stale pointer to bo->ggtt_node
  drm/xe: fix missing 'xe_vm_put'
  drm/xe: fix build warning with CONFIG_PM=n
  drm/xe: Suppress missing outer rpm protection warning
  drm/xe: prevent potential UAF in pf_provision_vf_ggtt()
  drm/amd/display: Add all planes on CRTC to state for overlay cursor
  drm/i915/bios: fix printk format width
  drm/i915/display: Fix BMG CCS modifiers
  drm/amdgpu: get rid of bogus includes of fdtable.h
  drm/amdkfd: CRIU fixes
  drm/amdgpu: fix a race in kfd_mem_export_dmabuf()
  drm: new helper: drm_gem_prime_handle_to_dmabuf()
  drm/amdgpu/atomfirmware: Silence UBSAN warning
  drm/amdgpu: Fix kdoc entry in 'amdgpu_vm_cpu_prepare'
  drm/amd/amdgpu: apply command submission parser for JPEG v1
  drm/amd/amdgpu: apply command submission parser for JPEG v2+
  drm/amd/pm: fix the pp_dpm_pcie issue on smu v14.0.2/3
  drm/amd/pm: update the features set on smu v14.0.2/3
  ...
2024-09-19 10:18:15 +02:00
Matt Roper
58548b9110 drm/xe: Defer gt->mmio initialization until after multi-tile setup
With the recent xe_mmio redesign, tiles and GTs each have their own MMIO
accessor, with the GT inheriting some of the information (such as the
iomap pointer) from their containing tile.  Given that non-root tiles
get initialized later than the root tile (and currently after the point
at which GT MMIO is initialized for _all_ GTs), we wind up incorrectly
inheriting uninitialized pointers for the initialization of GT MMIO for
GTs that reside on non-root tiles.  This causes a driver crash on
multi-tile PVC platforms.

With the general xe_mmio redesign, it's now only necessary to do the
GT-level MMIO setup before the point we start reading/writing GT
registers.  Move initialization of gt->mmio out of xe_info_init (which
runs before non-root tiles are initialized) and to the beginning of
where we start actually accessing the GTs themselves.

The high-level initialization flow now boils down to:
 - General device init, software-only setup
 - (no register access possible yet)
 - Root tile initialization
 - (access to device/tile0 registers possible via xe_root_tile_mmio())
 - Initialization of non-root tiles
 - (access to any tile's registers possible via tile->mmio)
 - GT MMIO initialization, inheriting iomap from each GT's tile
 - (access to any GT's registers possible via gt->mmio)

Fixes: fa599b8c95 ("drm/xe: Populate GT's mmio iomap from tile during init")
Reported-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Sai Teja Pottumuttu <sai.teja.pottumuttu@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240917221615.875962-2-matthew.d.roper@intel.com
2024-09-18 12:52:53 -07:00
Lucas De Marchi
a2655358cb
drm/xe/gt: Remove double include
The header generated/xe_wa_oob.h is included twice. Remove one.

Fixes: 27cb2b7fec ("drm/xe/bmg: implement Wa_16023588340")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/r/202407052122.AzuWSPuo-lkp@intel.com/
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708173301.1543871-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 3d122660dc)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-09-12 10:06:02 -04:00
Matt Roper
498ecc54ad drm/xe/gt: Convert register access to use xe_mmio
Stop using GT pointers for register access.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240910234719.3335472-81-matthew.d.roper@intel.com
2024-09-11 15:32:51 -07:00
Lucas De Marchi
c7c3c7b740 Merge drm/drm-next into drm-xe-next
Sync with drm-misc and drm-intel-next for common APIs and refactors.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-09-11 14:10:02 -07:00
Maarten Lankhorst
501d799a47 drm/xe: Wire up device shutdown handler
The system is turning off, and we should probably put the device
in a safe power state. We don't need to evict VRAM or suspend running
jobs to a safe state, as the device is rebooted anyway.

This does not imply the system is necessarily reset, as we can
kexec into a new kernel. Without shutting down, things like
USB Type-C may mysteriously start failing.

References: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/3500
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
[mlankhorst: Add !xe_driver_flr_disabled assert]
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240905150052.174895-4-maarten.lankhorst@linux.intel.com
2024-09-11 19:07:57 +02:00
Dave Airlie
2ef8d63da8 Cross-subsystem Changes:
- Split dma fence array creation into alloc and arm (Matthew Brost)
 
 Driver Changes:
 - Move kernel_lrc to execlist backend (Ilia)
 - Fix type width for pcode coommand (Karthik)
 - Make xe_drm.h include unambiguous (Jani)
 - Fixes and debug improvements for GSC load (Daniele)
 - Track resources and VF state by PF (Michal Wajdeczko)
 - Fix memory leak on error path (Nirmoy)
 - Cleanup header includes (Matt Roper)
 - Move pcode logic to tile scope (Matt Roper)
 - Move hwmon logic to device scope (Matt Roper)
 - Fix media TLB invalidation (Matthew Brost)
 - Threshold config fixes for PF (Michal Wajdeczko)
 - Remove extra "[drm]" from logs (Michal Wajdeczko)
 - Add missing runtime ref (Rodrigo Vivi)
 - Fix circular locking on runtime suspend (Rodrigo Vivi)
 - Fix rpm in TTM swapout path (Thomas)
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEE6rM8lpABPHM5FqyDm6KlpjDL6lMFAmbaZs8ZHGx1Y2FzLmRl
 bWFyY2hpQGludGVsLmNvbQAKCRCboqWmMMvqU8cqEACG6sxanezneROhqqTW/kaj
 xEoOkPbVOwP7NBz08VDKwDvGtPjUNHclLlx2AaQkz/uJhThEmW8h8tgjEbvRy6kG
 4VpS5eglu4VZp1ZoEIJKUWeEroz/crJ49NgWfx9baUFW2AUp7iW1RDLW0TRjdb+B
 3++T/J15a+Wa7l5kJLYd/z8O3K9tQ6M4O1MJjZSoijrXv1C0MzgKcJemfU2HQYZD
 3Yz3/7FO4MFJvTCJX+zFFzqaRP+hSPisxknRAa0ImOnbEMdczdNLD74ebGLFwIri
 uP/1d/wVZ84nqQXndZF3D5+zmK0JOAL55yabZVYju0/KoPXb+GmnuiQXztPwen27
 EAnQvHueD1rwXIgKUh8neMh492RejHQhbMIZf1eF1abKEOif0NWjpCdMHxkEfHqW
 76v03Q4fdBIW7uK9PNNcI5hQ//7SbUfdXatwfUmJ61Rkryu3eYXtMA0PtZUIfqRL
 57+TIurQyf5ivRjgnoag0KhAhN7eP7YqN+Q70E8QGKy6Fql79AssTgLWCWRVAW8L
 +kDT3waNaZeDQgB9usZcyXKSrwb4rYGIqmC3ZDKya6XPmmOyTtfywe34UhVPK/FC
 qKg1TpEinEmCftm3WZz1rXzFwe4K4lRs2PFmc/3oHmDp83Qx5Rsu0iosqH6nhi9l
 az2p4dnhRUGMZGtN5EflAQ==
 =rTLT
 -----END PGP SIGNATURE-----

Merge tag 'drm-xe-next-2024-09-05' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next

Cross-subsystem Changes:
- Split dma fence array creation into alloc and arm (Matthew Brost)

Driver Changes:
- Move kernel_lrc to execlist backend (Ilia)
- Fix type width for pcode coommand (Karthik)
- Make xe_drm.h include unambiguous (Jani)
- Fixes and debug improvements for GSC load (Daniele)
- Track resources and VF state by PF (Michal Wajdeczko)
- Fix memory leak on error path (Nirmoy)
- Cleanup header includes (Matt Roper)
- Move pcode logic to tile scope (Matt Roper)
- Move hwmon logic to device scope (Matt Roper)
- Fix media TLB invalidation (Matthew Brost)
- Threshold config fixes for PF (Michal Wajdeczko)
- Remove extra "[drm]" from logs (Michal Wajdeczko)
- Add missing runtime ref (Rodrigo Vivi)
- Fix circular locking on runtime suspend (Rodrigo Vivi)
- Fix rpm in TTM swapout path (Thomas)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/eirx5vdvoflbbqlrzi5cip6bpu3zjojm2pxseufu3rlq4pp6xv@eytjvhizfyu6
2024-09-10 13:18:00 +10:00
Rodrigo Vivi
82122d1f54
drm/xe: Add missing runtime reference to wedged upon gt_reset
Fixes this missed case:

xe 0000:00:02.0: [drm] Missing outer runtime PM protection
WARNING: CPU: 99 PID: 1455 at drivers/gpu/drm/xe/xe_pm.c:564 xe_pm_runtime_get_noresume+0x48/0x60 [xe]
Call Trace:
<TASK>
? show_regs+0x67/0x70
? __warn+0x94/0x1b0
? xe_pm_runtime_get_noresume+0x48/0x60 [xe]
? report_bug+0x1b7/0x1d0
? handle_bug+0x46/0x80
? exc_invalid_op+0x19/0x70
? asm_exc_invalid_op+0x1b/0x20
? xe_pm_runtime_get_noresume+0x48/0x60 [xe]
xe_device_declare_wedged+0x91/0x280 [xe]
gt_reset_worker+0xa2/0x250 [xe]

v2: Also move get and get the right Fixes tag (Himal, Brost)

Fixes: fb74b205cd ("drm/xe: Introduce a simple wedged state")
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240830183507.298351-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit bc947d9a8c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-09-04 12:16:40 -04:00
Rodrigo Vivi
bc947d9a8c
drm/xe: Add missing runtime reference to wedged upon gt_reset
Fixes this missed case:

xe 0000:00:02.0: [drm] Missing outer runtime PM protection
WARNING: CPU: 99 PID: 1455 at drivers/gpu/drm/xe/xe_pm.c:564 xe_pm_runtime_get_noresume+0x48/0x60 [xe]
Call Trace:
<TASK>
? show_regs+0x67/0x70
? __warn+0x94/0x1b0
? xe_pm_runtime_get_noresume+0x48/0x60 [xe]
? report_bug+0x1b7/0x1d0
? handle_bug+0x46/0x80
? exc_invalid_op+0x19/0x70
? asm_exc_invalid_op+0x1b/0x20
? xe_pm_runtime_get_noresume+0x48/0x60 [xe]
xe_device_declare_wedged+0x91/0x280 [xe]
gt_reset_worker+0xa2/0x250 [xe]

v2: Also move get and get the right Fixes tag (Himal, Brost)

Fixes: fb74b205cd ("drm/xe: Introduce a simple wedged state")
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240830183507.298351-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-09-03 12:46:39 -04:00
Matt Roper
fe13fd6833
drm/xe/pcode: Treat pcode as per-tile rather than per-GT
There's only one instance of the pcode per tile, and for GT-related
accesses both the primary and media GT share the same register
interface.  Since Xe was using per-GT locking, the pcode mutex wasn't
actually protecting everything that it should since concurrent accesses
related to a tile's primary GT and media GT were possible.

Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240829220619.789159-5-matthew.d.roper@intel.com
(cherry picked from commit 3034cc8107)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-09-03 10:36:46 -04:00
Matt Roper
3034cc8107 drm/xe/pcode: Treat pcode as per-tile rather than per-GT
There's only one instance of the pcode per tile, and for GT-related
accesses both the primary and media GT share the same register
interface.  Since Xe was using per-GT locking, the pcode mutex wasn't
actually protecting everything that it should since concurrent accesses
related to a tile's primary GT and media GT were possible.

Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240829220619.789159-5-matthew.d.roper@intel.com
2024-08-30 08:56:20 -07:00
Dave Airlie
8bdb468dd7 UAPI Changes:
- Fix OA format masks which were breaking build with gcc-5
 
 Cross-subsystem Changes:
 
 Driver Changes:
 - Use dma_fence_chain_free in chain fence unused as a sync (Matthew Brost)
 - Refactor hw engine lookup and mmio access to be used in more places
   (Dominik, Matt Auld, Mika Kuoppala)
 - Enable priority mem read for Xe2 and later (Pallavi Mishra)
 - Fix PL1 disable flow in xe_hwmon_power_max_write (Karthik)
 - Fix refcount and speedup devcoredump (Matthew Brost)
 - Add performance tuning changes to Xe2 (Akshata, Shekhar)
 - Fix OA sysfs entry (Ashutosh)
 - Add first GuC firmware support for BMG (Julia)
 - Bump minimum GuC firmware for platforms under force_probe to match LNL
   and BMG (Julia)
 - Fix access check on user fence creation (Nirmoy)
 - Add/document workarounds for Xe2 (Julia, Daniele, John, Tejas)
 - Document workaround and use proper WA infra (Matt Roper)
 - Fix VF configuration on media GT (Michal Wajdeczko)
 - Fix VM dma-resv lock (Matthew Brost)
 - Allow suspend/resume exec queue backend op to be called multiple times
   (Matthew Brost)
 - Add GT stats to debugfs (Nirmoy)
 - Add hwconfig to debugfs (Matt Roper)
 - Compile out all debugfs code with ONFIG_DEUBG_FS=n (Lucas)
 - Remove dead kunit code (Jani Nikula)
 - Refactor drvdata storing to help display (Jani Nikula)
 - Cleanup unsused xe parameter in pte handling (Himal)
 - Rename s/enable_display/probe_display/ for clarity (Lucas)
 - Fix missing MCR annotation in couple of registers (Tejas)
 - Fix DGFX display suspend/resume (Maarten)
 - Prepare exec_queue_kill for PXP handling (Daniele)
 - Fix devm/drmm issues (Daniele, Matthew Brost)
 - Fix tile and ggtt fini sequences (Matthew Brost)
 - Fix crashes when probing without firmware in place (Daniele, Matthew Brost)
 - Use xe_managed for kernel BOs (Daniele, Matthew Brost)
 - Future-proof dss_per_group calculation by using hwconfig (Matt Roper)
 - Use reserved copy engine for user binds on faulting devices
   (Matthew Brost)
 - Allow mixing dma-fence jobs and long-running faulting jobs (Francois)
 - Cleanup redundant arg when creating use BO (Nirmoy)
 - Prevent UAF around preempt fence (Auld)
 - Fix display suspend/resume (Maarten)
 - Use vma_pages() helper (Thorsten)
 - Calculate pagefault queue size (Stuart, Matthew Auld)
 - Fix missing pagefault wq destroy (Stuart)
 - Fix lifetime handling of HW fence ctx (Matthew Brost)
 - Fix order destroy order for jobs (Matthew Brost)
 - Fix TLB invalidation for media GT (Matthew Brost)
 - Document GGTT (Rodrigo Vivi)
 - Refactor GGTT layering and fix runtime outer protection (Rodrigo Vivi)
 - Handle HPD polling on display pm runtime suspend/resume (Imre, Vinod)
 - Drop unrequired NULL checks (Apoorva, Himal)
 - Use separate rpm lockdep map for non-d3cold-capable devices (Thomas Hellström)
 - Support "nomodeset" kernel command-line option (Thomas Zimmermann)
 - Drop force_probe requirement for LNL and BMG (Lucas, Balasubramani)
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEE6rM8lpABPHM5FqyDm6KlpjDL6lMFAmbPdMMZHGx1Y2FzLmRl
 bWFyY2hpQGludGVsLmNvbQAKCRCboqWmMMvqU+OYD/wLnSi/L/N+D2WvNZXVh0YT
 optFm8UClOKDuNNSX55vTAvVZ4EwAS/rx+uyPsU9hicU16aawPd0bsU7uiobXm1v
 LlsTaV0lDYal7w0IA+9q5KR6tt4/HAH6hVNBUaliC8jtGTPH1vKzih2EfUUjjqOY
 TpBGp753EtjXQxA0gZqeCMlXl0waAjoJWCuINGrLQO954XjgDEJzUMQn6TXLdWwZ
 2+gRzgWa+sNgVefoMksf7sB9O6GuFBo83q1Tma4yRhzn6u6MDF0CFgFegRk1QZjg
 RMNAU7GDU8cy/5UpitGl3aoqn3u0bgaxAsGY1LptwGmli9Lq/lImOLptH20I/UA1
 U7rynBIYOTayEIx7jQzaFb1O5ZNZjVmOGmpAx6WtQMcD8bkEejp3Z1Gtl5Y+ZWT8
 l9cAYeE5SvwNF2gSAV5d1TKWGw1K9MPlKA2iAY8UzWlJETqy3GLcUBHQNLg8Xewk
 AzkH9xfaLeQBi8yW2V1W/el+LtvcKWw77iuFI23ojQju3xV9fxLh9mZUeAE7f1Uv
 cpgvSKu+CXtKxuFinOK+CC60F8KK2tX6k0HOg8SamDx83/lCUOeHczlJI6Be0Y9f
 EcEVFsv3mk2gpaEXMe+844T2cQWqMYcHR8bnIGRUXJ9xunF9+OB8bzOVTAF4I80L
 vd4W6/yrrxh9wTtW21BI0A==
 =aT4y
 -----END PGP SIGNATURE-----

Merge tag 'drm-xe-next-2024-08-28' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next

UAPI Changes:
- Fix OA format masks which were breaking build with gcc-5

Cross-subsystem Changes:

Driver Changes:
- Use dma_fence_chain_free in chain fence unused as a sync (Matthew Brost)
- Refactor hw engine lookup and mmio access to be used in more places
  (Dominik, Matt Auld, Mika Kuoppala)
- Enable priority mem read for Xe2 and later (Pallavi Mishra)
- Fix PL1 disable flow in xe_hwmon_power_max_write (Karthik)
- Fix refcount and speedup devcoredump (Matthew Brost)
- Add performance tuning changes to Xe2 (Akshata, Shekhar)
- Fix OA sysfs entry (Ashutosh)
- Add first GuC firmware support for BMG (Julia)
- Bump minimum GuC firmware for platforms under force_probe to match LNL
  and BMG (Julia)
- Fix access check on user fence creation (Nirmoy)
- Add/document workarounds for Xe2 (Julia, Daniele, John, Tejas)
- Document workaround and use proper WA infra (Matt Roper)
- Fix VF configuration on media GT (Michal Wajdeczko)
- Fix VM dma-resv lock (Matthew Brost)
- Allow suspend/resume exec queue backend op to be called multiple times
  (Matthew Brost)
- Add GT stats to debugfs (Nirmoy)
- Add hwconfig to debugfs (Matt Roper)
- Compile out all debugfs code with ONFIG_DEUBG_FS=n (Lucas)
- Remove dead kunit code (Jani Nikula)
- Refactor drvdata storing to help display (Jani Nikula)
- Cleanup unsused xe parameter in pte handling (Himal)
- Rename s/enable_display/probe_display/ for clarity (Lucas)
- Fix missing MCR annotation in couple of registers (Tejas)
- Fix DGFX display suspend/resume (Maarten)
- Prepare exec_queue_kill for PXP handling (Daniele)
- Fix devm/drmm issues (Daniele, Matthew Brost)
- Fix tile and ggtt fini sequences (Matthew Brost)
- Fix crashes when probing without firmware in place (Daniele, Matthew Brost)
- Use xe_managed for kernel BOs (Daniele, Matthew Brost)
- Future-proof dss_per_group calculation by using hwconfig (Matt Roper)
- Use reserved copy engine for user binds on faulting devices
  (Matthew Brost)
- Allow mixing dma-fence jobs and long-running faulting jobs (Francois)
- Cleanup redundant arg when creating use BO (Nirmoy)
- Prevent UAF around preempt fence (Auld)
- Fix display suspend/resume (Maarten)
- Use vma_pages() helper (Thorsten)
- Calculate pagefault queue size (Stuart, Matthew Auld)
- Fix missing pagefault wq destroy (Stuart)
- Fix lifetime handling of HW fence ctx (Matthew Brost)
- Fix order destroy order for jobs (Matthew Brost)
- Fix TLB invalidation for media GT (Matthew Brost)
- Document GGTT (Rodrigo Vivi)
- Refactor GGTT layering and fix runtime outer protection (Rodrigo Vivi)
- Handle HPD polling on display pm runtime suspend/resume (Imre, Vinod)
- Drop unrequired NULL checks (Apoorva, Himal)
- Use separate rpm lockdep map for non-d3cold-capable devices (Thomas Hellström)
- Support "nomodeset" kernel command-line option (Thomas Zimmermann)
- Drop force_probe requirement for LNL and BMG (Lucas, Balasubramani)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/wd42jsh4i3q5zlrmi2cljejohdsrqc6hvtxf76lbxsp3ibrgmz@y54fa7wwxgsd
2024-08-30 13:41:06 +10:00
Jani Nikula
87d8ecf015
drm/xe: replace #include <drm/xe_drm.h> with <uapi/drm/xe_drm.h>
include/drm/xe_drm.h does not exist. Prefer the explicit uapi include.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240827091539.4136838-1-jani.nikula@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-28 15:17:54 -04:00
Daniel Vetter
4461e9e5c3 Linux 6.11-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmbK2B8eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGFwkH/10QpUgzIfbFKbF+
 5hwcvaqS5myxWwJ4PjN0eR1qGE6RzVO0Tb24+TVql+7pxu+iWm1kYgC3+/T5xJsP
 ECAszdmPWSco1xaHrh2y3PyCJjaBiqFbIxdjPp7odjDpG9qarbcty8YpWs44u/gd
 RDXzHUuScEShBhEt0ZhvE1pIDL8jJ8JL3yqOMZ+XaDxtJbjaHw4GHp8efxlBWc8N
 jZKIVJi22q5NWG5T0tGtPWwzCm0ewA/JNMTEvE9leoSoAgO85NZ0ivxMC76q/tbj
 BrYk5KnzfhJs4b/n/KtIwWaLTgLyXKGqHMaMq8sbXtp410aUdgnRJO2cl3fI+1vc
 vxQfAfk=
 =RemI
 -----END PGP SIGNATURE-----

Merge v6.11-rc5 into drm-next

amdgpu pr conconflicts due to patches cherry-picked to -fixes, I might
as well catch up with a backmerge and handle them all. Plus both misc
and intel maintainers asked for a backmerge anyway.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2024-08-27 14:09:45 +02:00
Tejas Upadhyay
7e81285380
drm/xe/xe2: Make subsequent L2 flush sequential
Issuing the flush on top of an ongoing flush is not desirable.
Lets use lock to make it sequential.

Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240710052750.3031586-1-tejas.upadhyay@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
(cherry picked from commit 71733b8d7f)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-19 13:30:47 -04:00