Enable HDCP for Xe by defining functions which take care of
interaction of HDCP as a client with the GSC CS interface.
Add intel_hdcp_gsc_message to Makefile and add corresponding
changes to xe_hdcp_gsc.c to make it build.
--v2
-add kfree at appropriate place [Daniele]
-remove useless define [Daniele]
-move host session logic to xe_gsc_submit.c [Daniele]
-call xe_gsc_check_and_update_pending directly in an if condition
[Daniele]
-use xe_device instead of drm_i915_private [Daniele]
--v3
-use xe prefix for newly exposed function [Daniele]
-remove client specific defines from intel_gsc_mtl_header [Daniele]
-add missing kfree() [Daniele]
-have NULL check for hdcp_message in finish function [Daniele]
-dont have too many variable declarations in the same line [Daniele]
--v4
-don't point the hdcp_message structure in xe_device to anything
until it properly gets initialized [Daniele]
--v5
-Squash commits for buildability
--v6
-Order includes alphabetically [Lucas]
Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com>
Reviewed-by: Arun R Murthy <arun.r.murthy@intel.com>
Signed-off-by: Mika Kahola <mika.kahola@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240306024247.1857881-6-suraj.kandpal@intel.com
Since 'grouped target' is used only in 'make' 4.3, it should
be avoided. Replace it with 'multi-target pattern rule' which
has the same behavior.
Fixes: 9616e74b79 ("drm/xe: Add support for OOB workarounds")
Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240302153927.2602241-1-dhirschfeld@habana.ai
[ reword commit message ]
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 5224ed586b)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
When the driver is built-in but the tests are in loadable modules,
the helpers don't actually get put into the driver:
ERROR: modpost: "xe_kunit_helper_alloc_xe_device" [drivers/gpu/drm/xe/tests/xe_test.ko] undefined!
Change the Makefile to ensure they are always part of the driver
even when the rest of the kunit tests are in loadable modules.
Fixes: 5095d13d75 ("drm/xe/kunit: Define helper functions to allocate fake xe device")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240226124736.1272949-1-arnd@kernel.org
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 0e6fec6da2)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Since 'grouped target' is used only in 'make' 4.3, it should
be avoided. Replace it with 'multi-target pattern rule' which
has the same behavior.
Fixes: 9616e74b79 ("drm/xe: Add support for OOB workarounds")
Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240302153927.2602241-1-dhirschfeld@habana.ai
[ reword commit message ]
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
When the driver is built-in but the tests are in loadable modules,
the helpers don't actually get put into the driver:
ERROR: modpost: "xe_kunit_helper_alloc_xe_device" [drivers/gpu/drm/xe/tests/xe_test.ko] undefined!
Change the Makefile to ensure they are always part of the driver
even when the rest of the kunit tests are in loadable modules.
Fixes: 5095d13d75 ("drm/xe/kunit: Define helper functions to allocate fake xe device")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240226124736.1272949-1-arnd@kernel.org
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
There are very few places that need to include anything from under
display/. Require the display/ prefix in #include directives, and drop
the subdirectory from the header search path.
Sort the include lists while at it.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240122101428.2683468-2-jani.nikula@intel.com
All the other display related files are under display/ subdirectory,
also move xe_display.[ch] there.
Sort the build list while at it.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240122101428.2683468-1-jani.nikula@intel.com
The GSC uC needs to communicate with the CSME to perform certain
operations. Since the GSC can't perform this communication directly on
platforms where it is integrated in GT, the graphics driver needs to
transfer the messages from GSC to CSME and back. The proxy flow must be
manually started after the GSC is loaded to signal to GSC that we're
ready to handle its messages and allow it to query its init data from
CSME.
Note that the component must be removed before the pci_remove call
completes, so we can't use a drmm helper for it and we need to instead
perform the cleanup as part of the removal flow.
v2: add function documentation, more targeted memory clear, clearer logs
and variable names (Alan)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Suraj Kandpal <suraj.kandpal@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240117182621.2653049-2-daniele.ceraolospurio@intel.com
Building drivers/gpu/drm/xe/xe_gt_pagefault.c with GCC 11 results
in the following build errors:
./include/linux/fortify-string.h:57:33: error: writing 16 bytes into a region of size 0 [-Werror=stringop-overflow=]
57 | #define __underlying_memcpy __builtin_memcpy
| ^
./include/linux/fortify-string.h:644:9: note: in expansion of macro ‘__underlying_memcpy’
644 | __underlying_##op(p, q, __fortify_size); \
| ^~~~~~~~~~~~~
./include/linux/fortify-string.h:689:26: note: in expansion of macro ‘__fortify_memcpy_chk’
689 | #define memcpy(p, q, s) __fortify_memcpy_chk(p, q, s, \
| ^~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/xe/xe_gt_pagefault.c:340:17: note: in expansion of macro ‘memcpy’
340 | memcpy(pf_queue->data + pf_queue->tail, msg, len * sizeof(u32));
| ^~~~~~
In file included from drivers/gpu/drm/xe/xe_device_types.h:17,
from drivers/gpu/drm/xe/xe_vm_types.h:16,
from drivers/gpu/drm/xe/xe_bo.h:13,
from drivers/gpu/drm/xe/xe_gt_pagefault.c:16:
drivers/gpu/drm/xe/xe_gt_types.h:102:25: note: at offset [1144, 265324] into destination object ‘tile’ of size 8
102 | struct xe_tile *tile;
| ^~~~
Fix these by removing -Wstringop-overflow from drm/xe builds.
Closes: https://lore.kernel.org/all/45ad1d0f-a10f-483e-848a-76a30252edbe@paulmck-laptop/
Fixes: 7a8bc11782 ("drm/xe: Enable W=1 warnings by default")
Suggested-by: Stephen Rothwell <sfr@rothwell.id.au>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
[ This particular warning is broken on GCC11. In future changes it will
be moved to the normal C flags in the top level Makefile (out of
Makefile.extrawarn), but accounting for the compiler support. Just
remove it out of xe's forced extra warnings for now ]
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Add vram frequency sysfs attributes under the below hierarchy;
/device/tile#/memory/freq0
|-max_freq
|-min_freq
v2: Drop "vram" from attribute names (Rodrigo)
v3: Add documentation for new sysfs (Riana)
Drop prefix from XEHP_PCODE_FREQUENCY_CONFIG (Riana)
v4: Create sysfs under tile#/freq0 after removal of
physical_memsize attrbute
v5: Revert back to creating sysfs under tile#/memory/freq0
Remove definition of GT_FREQUENCY_MULTIPLIER (Rodrigo)
v6: Rename attributes to max/min_freq (Anshuman)
Fix review comments (Rodrigo)
v7: Make documentation more verbose
Move sysfs to separate file (Anshuman)
v8: Fix platform specific conditions and add kernel doc (Anshuman)
Fix typos and remove redundant headers (Riana)
v9: Fix typo (Riana)
Change function name to include "sysfs" (Lucas)
Signed-off-by: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://lore.kernel.org/r/20240109110418.2065101-1-sujaritha.sundaresan@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
There are scenarios where SR-IOV Virtual Function (VF) driver will
need to get additional data that is not available over VF MMIO BAR
nor could be queried from the GuC firmware and must be obtained
from the Physical Function (PF) driver.
To allow such communication between VF and PF drivers, GuC supports
set of H2G and G2H actions which allows relaying embedded messages,
that are otherwise opaque for the GuC.
To allow use of this communication mechanism, provide functions for
sending requests and handling replies and placeholder where we will
put handlers for incoming requests.
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://lore.kernel.org/r/20240104222031.277-8-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
There will be more KUnit tests added that will require fake device.
Define generic helper functions to avoid code duplications.
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20231218190629.502-6-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
The GFX doorbell solution provides a mechanism for submission of
workload to the graphics hardware by a ring3 application without
the penalty of ring transition for each workload submission.
This feature is not currently used by the Linux drivers, but in
SR-IOV mode the doorbells are treated as shared resource and the
PF driver must be able to provision exclusive range of doorbells
IDs across all enabled VFs.
Introduce simple GuC doorbell ID manager that will be used by the
PF driver for VFs provisioning and can later be used by submission
code once we are ready to switch from H2G based notifications to
doorbells mechanism.
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://lore.kernel.org/r/20231218190629.502-4-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
The register based interrupts infrastructure does not scale
efficiently to allow delivering interrupts to a large number
of virtual machines. Memory based interrupt reporting provides
an efficient and scalable infrastructure.
Define handler to read and dispatch memory based interrupts.
We will use this handler in upcoming patch.
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20231214185955.1791-8-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
PMU uapi is likely to change in the future. Till the uapi is finalized,
remove PMU from Xe. PMU can be re-added after uapi is finalized.
v2: Include xe_drm.h in xe/tests/xe_dma_buf.c (Francois)
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Acked-by: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Mateusz Naklicki <mateusz.naklicki@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Like done in commit 2250c7ead8 ("drm/i915: enable W=1 warnings by default")
for i915, enable W=1 warnings by default in xe.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Goals of this new xe_gt_freq component:
1. Detach sysfs controls and raw freq management from GuC SLPC.
2. Create a directory that could later be aligned with devfreq.
3. Encapsulate all the freq control in a single directory. Although
we only have one freq domain per GT, already start with a numbered
freq0 directory so it could be expanded in the future if multiple
domains or PLL are needed.
Note: Although in the goal #1, the raw freq management control is
mentioned, this patch only starts by the sysfs control. The RP freq
configuration and init freq selection are still under the guc_pc, but
should be moved to this component in a follow-up patch.
v2: - Add /tile# to the doc and remove unnecessary kobject_put (Riana)
- s/ssize_t/int on some ret variables (Vinay)
Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Riana Tauro <riana.tauro@intel.com>
Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The Local Memory Translation Table (LMTT) provides additional
abstraction for Virtual Functions (VF) accessing device VRAM.
This code is based on prior work of Michal Winiarski.
In this patch we focus only on LMTT initialization. Remaining LMTT
functions will be used once we add a VF provisioning to the PF.
Bspec: 44117, 52404, 59314
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://lore.kernel.org/r/20231128151507.1015-4-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Disable dynamic HW load balancing of compute resource assignment
to engines and instead enabled fixed mode of mapping compute
resources to engines on all platforms with more than one compute
engine.
By default enable only one CCS engine with all compute slices
assigned to it. This is the desired configuration for common
workloads.
PVC platform supports only the fixed CCS mode (workaround 16016805146).
v2: Rebase, make it platform agnostic
v3: Minor code refactoring
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Prep this file to contain C6 toggling as well instead
of just sysfs related stuff.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The version is obtained via a dedicated MKHI GSC HECI command.
The compatibility version is what we want to match against for the GSC,
so we need to call the FW version checker after obtaining the version.
Since this is the first time we send a GSC HECI command via the GSCCS,
this patch also introduces common infrastructure to send such commands
to the GSC. Communication with the GSC FW is done via input/output
buffers, whose addresses are provided via a GSCCS command. The buffers
contain a generic header and a client-specific packet (e.g. PXP, HDCP);
the clients don't care about the header format and/or the GSCCS command
in the batch, they only care about their client-specific header. This
patch therefore introduces helpers that allow the callers to
automatically fill in the input header, submit the GSCCS job and decode
the output header, to make it so that the caller only needs to worry about
their client-specific input and output messages.
v3: squash of 2 separate patches ahead of merge, so that the common
functions and their first user are added at the same time
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Suraj Kandpal <suraj.kandpal@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.Com> #v1
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
When the GSC FW is loaded, we need to inform it when a GSCCS reset is
coming and then wait 200ms for it to get ready to process the reset.
v2: move WA code to GSC file, use variable in Makefile (John)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Add the basic definitions and init function. Same as HuC, GSC is only
supported on the media GT on MTL and newer platforms.
Note that the GSC requires submission resources which can't be allocated
during init (because we don't have the hwconfig yet), so it can't be
marked as loadable at the end of the init function. The allocation of
those resources will come in the patch that makes use of them to load
the FW.
v2: better comment, move num FWs define inside the enum (John)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
We will be adding support for the SR-IOV and driver might be then
running, in addition to existing non-virtualized bare-metal mode,
also in Physical Function (PF) or Virtual Function (VF) mode.
Since these additional modes require some changes to the driver,
define enum flag to represent different SR-IOV modes and add a
function where we will detect the actual mode in the runtime.
We start with a forced bare-metal mode as it is sufficient to
enable basic functionality and ensures no impact to existing code.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20231115073804.1861-2-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Remove unused IOCTL.
Without any userspace using it we need to remove before we
can be accepted upstream.
At this point we are breaking the compatibility for good,
so we don't need to break when we are in-tree. So, let's
also use this breakage to sort out the IOCTL entries and
fix all the small indentation and line issues.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
This introduces an exclusive version of vga decode for xe.
Rest of the display changes will be re-used from i915.
Currently it adds just a dummy implementation. VGA decode
needs to be handled correctly in i915, proper implementation
will be adopted once the i915 changes are finalized and merged
in upstream.
v2: Addressed Arun's review comments
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Arun R Murthy <arun.r.mruthy@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
As for display, the intent is to share the display code with the i915
driver so that there is maximum reuse there.
We do this by recompiling i915/display code twice.
Now that i915 has been adapted to support the Xe build, we can add
the xe/display support.
This initial work is a collaboration of many people and unfortunately
this squashed patch won't fully honor the proper credits.
But let's try to add a few from the squashed patches:
Co-developed-by: Matthew Brost <matthew.brost@intel.com>
Co-developed-by: Jani Nikula <jani.nikula@intel.com>
Co-developed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Co-developed-by: Matt Roper <matthew.d.roper@intel.com>
Co-developed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Co-developed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Co-developed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
This workaround is primarily implemented by the BIOS. However if the
BIOS applies the workaround it will reserve a small piece of our DSM
(which should be at the top, right below the WOPCM); we just need to
keep that region reserved so that nothing else attempts to re-use it.
v2 (Gustavo):
- Check for NULL media_gt
- Mask bits [5:0] to avoid potential issues in future platforms
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20231102124855.1940491-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
There are a set of engine group busyness counters provided by HW which are
perfect fit to be exposed via PMU perf events.
BSPEC: 46559, 46560, 46722, 46729, 52071, 71028
events can be listed using:
perf list
xe_0000_03_00.0/any-engine-group-busy-gt0/ [Kernel PMU event]
xe_0000_03_00.0/copy-group-busy-gt0/ [Kernel PMU event]
xe_0000_03_00.0/interrupts/ [Kernel PMU event]
xe_0000_03_00.0/media-group-busy-gt0/ [Kernel PMU event]
xe_0000_03_00.0/render-group-busy-gt0/ [Kernel PMU event]
and can be read using:
perf stat -e "xe_0000_8c_00.0/render-group-busy-gt0/" -I 1000
time counts unit events
1.001139062 0 ns xe_0000_8c_00.0/render-group-busy-gt0/
2.003294678 0 ns xe_0000_8c_00.0/render-group-busy-gt0/
3.005199582 0 ns xe_0000_8c_00.0/render-group-busy-gt0/
4.007076497 0 ns xe_0000_8c_00.0/render-group-busy-gt0/
5.008553068 0 ns xe_0000_8c_00.0/render-group-busy-gt0/
6.010531563 43520 ns xe_0000_8c_00.0/render-group-busy-gt0/
7.012468029 44800 ns xe_0000_8c_00.0/render-group-busy-gt0/
8.013463515 0 ns xe_0000_8c_00.0/render-group-busy-gt0/
9.015300183 0 ns xe_0000_8c_00.0/render-group-busy-gt0/
10.017233010 0 ns xe_0000_8c_00.0/render-group-busy-gt0/
10.971934120 0 ns xe_0000_8c_00.0/render-group-busy-gt0/
The pmu base implementation is taken from i915.
v2:
Store last known value when device is awake return that while the GT is
suspended and then update the driver copy when read during awake.
v3:
1. drop init_samples, as storing counters before going to suspend should
be sufficient.
2. ported the "drm/i915/pmu: Make PMU sample array two-dimensional" and
dropped helpers to store and read samples.
3. use xe_device_mem_access_get_if_ongoing to check if device is active
before reading the OA registers.
4. dropped format attr as no longer needed
5. introduce xe_pmu_suspend to call engine_group_busyness_store
6. few other nits.
v4: minor nits.
v5: take forcewake when accessing the OAG registers
v6:
1. drop engine_busyness_sample_type
2. update UAPI documentation
v7:
1. update UAPI documentation
2. drop MEDIA_GT specific change for media busyness counter.
Co-developed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Co-developed-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Some copy hardware engine instances are faster than others on PVC.
Use a virtual engine of these plus the reserved instance for the migrate
engine on PVC. The idea being if a fast instance is available it will be
used and the throughput of kernel copies, clears, and pagefault
servicing will be higher.
v2: Use OOB WA, use all copy engines if no WA is required
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
To workaround a HW bug on DG2, driver is required to map the whole
ppgtt virtual address space before GPU workload submission. Thus
set the XE_VM_FLAG_SCRATCH_PAGE flag during vm create so the whole
address space is mapped to point to scratch page.
v1:
- Move the workaround implementation from xe_vm_create to
xe_vm_create_ioctl - Brian
- Reorder error checking in xe_vm_create_ioctl - Jose
- Implement WA only for DG2-G10 and DG2-G12
Signed-off-by: Oak Zeng <oak.zeng@intel.com>
Reviewed-by: Brian Welty <brian.welty@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Add engines sysfs directory under its GT and
create sub directory for all engine class
(note its not per instance) present on GT.
For example,
DUT# cat /sys/class/drm/cardX/device/tileN/gtN/engines/
bcs/ ccs/
V9 :
- Add missing drmm_add_action_or_reset
V8 :
- Rebase
V7 :
- Remove xe_gt.h from .h and include in .c - Matt
V6 :
- Add kernel doc and arrange file in make file by alphabet - Matt
V5 :
- replace xe_engine with xe_hw_engine - Matt
V4 :
- Rebase to resolve conflicts - CI
V3 :
- Move code in its own file
- Rename API name
V2 :
- Correct class mask logic - Himal
- Remove extra parenthesis
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
This is a preparation commit for a larger renaming of engine to exec queue.
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Add generic utility to track range conflicts signaled by a dma-fence.
Tracking implemented via an interval tree. An example use case being
tracking conflicts for pending (un)binds from multiple bind engines. By
being generic ths idea would this could moved to the DRM level and used
in multiple drivers for similar problems.
v2: Make interval tree functions static (CI)
v3: Remove non-static cleanup function (CI)
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
We have recently introduced tile for each gpu,
so lets add sysfs entry per tile for userspace
to provide required information specific to tile.
V5:
- define ktype as const
V4:
- Reorder headers - Aravind
V3:
- Make API to return void and add drm_warn - Aravind
V2:
- Add logs in failure path
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
1) Add a new sysfs directory under devices/gt#/ called gtidle
to contain idle properties of GT such as name, idle_status,
idle_residency_ms
2) Remove forcewake calls for residency counter
v2:
- abstract using function pointers (Anshuman)
- remove forcewake calls for residency counter
- use device_attr (Badal)
- move rc functions to guc_pc
- change name to gt_idle (Rodrigo)
v3:
- return error for drmm_add_action_or_reset
- replace file and functions with gt_idle prefix
to gt_idle_sysfs (Himal)
- use enum for gt idle state
- move multiplier to gt idle and initialize (Anshuman)
- correct doc annotation (Rodrigo)
- remove return variable
- use kobj_gt instead of new gtidle kobj
- move residency_ms to gtidle file
- retain xe_guc_pc prefix for functions in guc_rc file (Michal)
v4:
- fix doc errors in xe_guc_pc file
- change u64 to u32 for reading residency counter
- keep gtidle states generic GT_IDLE_C[0/6] (Anshuman)
v5:
- update commit message to include removal of
forcewake calls (Anshuman)
- return void from sysfs initialization function and add warnings
(Andi)
v6:
- remove extra lines (Anshuman)
Signed-off-by: Riana Tauro <riana.tauro@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
We need to flush render caches before fence signalling, where we might
release the memory for reuse. We can't rely on userspace doing this,
so flush render caches after the batch, but before user fence- and
dma_fence signalling.
Copy the cache flush from i915, but omit PIPE_CONTROL_FLUSH_L3, since it
should be implied by the other flushes. Also omit
PIPE_CONTROL_TLB_INVALIDATE since there should be no apparent need to
invalidate TLB after batch completion.
v2:
- Update Makefile for OOB WA.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Tested-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com> #1
Reported-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/291
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/291
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The GGTT exists at the tile level. When a tile contains multiple GTs,
they share the same GGTT.
v2:
- Include some changes that were mis-squashed into the VRAM patch.
(Gustavo)
Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20230601215244.678611-9-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Let xe_guc.c start using XE_WA() for workarounds, starting from a simple
one: Wa_22012773006. It's also changed to start with graphics version
12, since that is the first supported by xe.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-14-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
There are WAs that, due to their nature, cannot be applied from a
central place like xe_wa.c. Those are peppered around the rest of the
code, as needed. Now they have a new name: "out-of-band workarounds".
These workarounds have their names and rules still grouped in xe_wa.c,
inside the xe_wa_oob array, which is generated at compile time by
xe_wa_oob.rules and the hostprog xe_gen_wa_oob. The code generation
guarantees that the header xe_wa_oob.h contains the IDs for the
workarounds that match the index in the table. This way the runtime
checks that are spread throughout the code are simple tests against the
bitmap saved during initialization.
v2: Fix prev_name tracking not working when it's empty, i.e. when there
is more than 1 continuation rule.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-13-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
When doing out-of-tree builds with O= or KBUILD_OUTPUT=, it's important
to also add the directory where the target is saved. Otherwise any file
generated by the build system may not be available for other targets
depending on it.
The $(obj) is added automatically when building the entire kernel,
but it's not added when M=drivers/gpu/drm/xe is added.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230526164358.86393-12-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The goal is to use devcoredump infrastructure to report error states
captured at the crash time.
The error state will contain useful information for GPU hang debug, such
as INSTDONE registers and the current buffers getting executed, as well
as any other information that helps user space and allow later replays of
the error.
The proposal here is to avoid a Xe only error_state like i915 and use
a standard dev_coredump infrastructure to expose the error state.
For our own case, the data is only useful if it is a snapshot of the
time when the GPU crash has happened, since we reset the GPU immediately
after and the registers might have changed. So the proposal here is to
have an internal snapshot to be printed out later.
Also, usually a subsequent GPU hang can be only a cause of the initial
one. So we only save the 'first' hang. The dev_coredump has a delayed
work queue where it remove the coredump and free all the data within a
few moments of the error. When that happens we also reset our capture
state and allow further snapshots.
Right now this infra only print out the time of the hang. More information
will be migrated here on subsequent work. Also, in order to organize the
dump better, the goal is to propose dev_coredump changes itself to allow
multiple files and different controls. But for now we start Xe usage of
it without any dependency on dev_coredump core changes.
v2: Add dma_fence annotation for capture that might happen during long
running. (Thomas and Matt)
Use xe->drm.primary->index on drm_info msg. (Jani)
v3: checkpatch fixes
v4: Fix building and locking issues found by Francois.
Actually let's kill all of the locking in here. gt_reset serialization
already guarantee that there will be only one capture at the same time.
Also, the devcoredump has its own locking to protect the free and reads
and drivers don't need to duplicate it.
Besides this, the dma_fence locking was pushed to a following patch
since it is not needed in this one.
Fix a use after free identified by KASAN: Do not stash the faulty_engine
since that will be freed somewhere else.
v5: Fix Uptime - ktime_get_boottime actually returns the Uptime. (Francois)
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
The macros to handle the RTP tables are very scary, but shouldn't be
used outside of the header adding the infra. Move it to a separate
header and make sure it's only included when it can be.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230427223256.1432787-11-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Currently, unload pvc driver will generate a null dereference
and the call stack is as below.
[ 4850.618000] Call Trace:
[ 4850.620740] <TASK>
[ 4850.623134] ttm_bo_cleanup_memtype_use+0x3f/0x50 [ttm]
[ 4850.628661] ttm_bo_release+0x154/0x2c0 [ttm]
[ 4850.633317] ? drm_buddy_fini+0x62/0x80 [drm_buddy]
[ 4850.638487] ? __kmem_cache_free+0x27d/0x2c0
[ 4850.643054] ttm_bo_put+0x38/0x60 [ttm]
[ 4850.647190] xe_gem_object_free+0x1f/0x30 [xe]
[ 4850.651945] drm_gem_object_free+0x1e/0x30 [drm]
[ 4850.656904] ggtt_fini_noalloc+0x9d/0xe0 [xe]
[ 4850.661574] drm_managed_release+0xb5/0x150 [drm]
[ 4850.666617] drm_dev_release+0x30/0x50 [drm]
[ 4850.671209] devm_drm_dev_init_release+0x3c/0x60 [drm]
There are a couple issues, but the main one is due to TTM has only
one TTM_PL_TT region, but since pvc has 2 tiles and tries to setup
1 TTM_PL_TT each tile. The second will overwrite the first one.
During unload time, the first tile will reset the TTM_PL_TT manger
and when the second tile is trying to free Bo and it will generate
the null reference since the TTM manage is already got reset to 0.
The fix is to use one global TTM_PL_TT manager.
v2: make gtt mgr global and change the name to sys_mgr
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Vivi, Rodrigo <rodrigo.vivi@intel.com>
Signed-off-by: Bruce Chang <yu.bruce.chang@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
PAT handling is growing in complexity and will continue to do so in
upcoming platforms. Separate it out to a dedicated file to keep things
tidy.
The code is moved as-is here (aside from a few unused #define's that are
just dropped); further changes will come in future patches.
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://lore.kernel.org/r/20230324210415.2434992-2-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reduce the use of i915_reg_defs.h so it can be encapsulated in a single
place.
1) If it was being included by mistake, remove
2) If it was included for FIELD_GET()/FIELD_PREP()/GENMASK() and the
like, just include <linux/bitfield.h>
3) If it was included to be able to define additional registers, move
the registers to the relavant headers (regs/xe_regs.h or
regs/xe_gt_regs.h)
v2:
- Squash commit fixing i915_reg_defs.h include and with the one
introducing regs/xe_reg_defs.h
- Remove more cases of i915_reg_defs.h being used when all it was
needed was linux/bitfield.h (Matt Roper)
- Move some registers to the corresponding regs/*.h file (Matt Roper)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Rodrigo squashed here the removal of the i915 include]
Use the more common "call cc-disable-warning" way to disable warnings.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
TLB invalidation is used by more than USM (page faults) so break this
code out into its own file.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
This adds support for stolen memory, with the same allocator as
vram_mgr. This allows us to skip a whole lot of copy-paste,
by re-using parts of xe_ttm_vram_mgr.
The stolen memory may be bound using VM_BIND, so it performs like any
other memory region.
We should be able to map a stolen BO directly using the physical memory
location instead of through GGTT even on old platforms, but I don't know
what the effects are on coherency.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Xe, is a new driver for Intel GPUs that supports both integrated and
discrete platforms starting with Tiger Lake (first Intel Xe Architecture).
The code is at a stage where it is already functional and has experimental
support for multiple platforms starting from Tiger Lake, with initial
support implemented in Mesa (for Iris and Anv, our OpenGL and Vulkan
drivers), as well as in NEO (for OpenCL and Level0).
The new Xe driver leverages a lot from i915.
As for display, the intent is to share the display code with the i915
driver so that there is maximum reuse there. But it is not added
in this patch.
This initial work is a collaboration of many people and unfortunately
the big squashed patch won't fully honor the proper credits. But let's
get some git quick stats so we can at least try to preserve some of the
credits:
Co-developed-by: Matthew Brost <matthew.brost@intel.com>
Co-developed-by: Matthew Auld <matthew.auld@intel.com>
Co-developed-by: Matt Roper <matthew.d.roper@intel.com>
Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Co-developed-by: Francois Dugast <francois.dugast@intel.com>
Co-developed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Co-developed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Co-developed-by: Philippe Lecluse <philippe.lecluse@intel.com>
Co-developed-by: Nirmoy Das <nirmoy.das@intel.com>
Co-developed-by: Jani Nikula <jani.nikula@intel.com>
Co-developed-by: José Roberto de Souza <jose.souza@intel.com>
Co-developed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Co-developed-by: Dave Airlie <airlied@redhat.com>
Co-developed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Co-developed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Co-developed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>