The GuC compatibility version that we read from the CSS header in
native/PF and the GuC VF version that we get when a VF handshakes with
the GuC are the same version number, so we should store it into the same
structure. This makes all the checks based on the compatibility version
automatically work for VFs without having to copy the value over.
For completion, also copy the wanted version and set the path to a known
string to indicate that the FW is PF-loaded. This way all the info will
be coherent when dumped from debugfs.
v2: several code cleanups and style changes (Michal), rebase on
bootstrap changes.
v3: s/min/wanted/, clarify that handshake must happen before we can get
the VF versions (Michal)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Lukasz Laguna <lukasz.laguna@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://lore.kernel.org/r/20250603235432.720833-10-daniele.ceraolospurio@intel.com
Some of our VF functions, even if they take a GT pointer, work
only on primary GT and really are tile-related and would be better
to keep them separate from the rest of true GT-oriented functions.
Move them to a file and update to take a tile pointer instead.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Tomasz Lis <tomasz.lis@intel.com>
Link: https://lore.kernel.org/r/20250602103325.549-3-michal.wajdeczko@intel.com
In upcoming patch we want to separate tile-oriented VF functions
from GT-oriented functions and to allow the former access a GGTT
configuration stored at GT level we need to provide some helpers.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Tomasz Lis<tomasz.lis@intel.com>
Link: https://lore.kernel.org/r/20250602103325.549-2-michal.wajdeczko@intel.com
We have only one GGTT for all IOV functions, with each VF having assigned
a range of addresses for its use. After migration, a VF can receive a
different range of addresses than it had initially.
This implements shifting GGTT addresses within drm_mm nodes, so that
VMAs stay valid after migration. This will make the driver use new
addresses when accessing GGTT from the moment the shifting ends.
By taking the ggtt->lock for the period of VMA fixups, this change
also adds constraint on that mutex. Any locks used during the recovery
cannot ever wait for hardware response - because after migration,
the hardware will not do anything until fixups are finished.
v2: Moved some functs to xe_ggtt.c; moved shift computation to just
after querying; improved documentation; switched some warns to asserts;
skipping fixups when GGTT shift eq 0; iterating through tiles (Michal)
v3: Updated kerneldocs, removed unused funct, properly allocate
balloning nodes if non existent
v4: Re-used ballooning functions from VF init, used bool in place of
standard error codes
v5: Renamed one function
v6: Subject tag change, several kerneldocs updated, some functions
renamed, some moved, added several asserts, shuffled declarations
of variables, revealed more detail in high level functions
v7: Fixed typos, added `_locked` suffix to some functs, improved
readability of asserts, removed unneeded conditional
v8: Moved one function, removed implementation detail from kerneldoc,
added asserts
v9: Code shuffling without much change, and one param rename
v10: Minor error path change, added printing the shift via debugfs
Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://lore.kernel.org/r/20250512114018.361843-3-tomasz.lis@intel.com
The balloon nodes, which are used to fill areas of GGTT inaccessible
for a specific VF, were allocated and inserted into GGTT within one
function. To be able to re-use that insertion code during VF
migration recovery, we need to split it.
This patch separates allocation (init/fini functs) from the insertion
of balloons (balloon/deballoon functs). Locks are also moved to ensure
calls from post-migration recovery worker will not cause a deadlock.
v2: Moved declarations to proper header
v3: Rephrased description, introduced "_locked" versions of some
functs, more lockdep checks, some functions renamed, altered error
handling, added missing kerneldocs.
v4: Suffixed more functs with `_locked`, moved lockdep asserts,
fixed finalization in error path, added asserts
v5: Renamed another few functs, used xe_ggtt_node_allocated(),
moved lockdep back again to avoid null dereference, added
asserts, improved comments
v6: Changed params of cleanup_ggtt()
Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://lore.kernel.org/r/20250512114018.361843-2-tomasz.lis@intel.com
After restore, GuC will not answer to any messages from VF KMD until
fixups are applied. When that is done, VF KMD sends RESFIX_DONE
message to GuC, at which point GuC resumes normal operation.
This patch implements sending the RESFIX_DONE message at end of
post-migration recovery.
v2: keep pm ref during whole recovery, style fixes (Michal)
v3: assert removal to separate patch, debug message per GuC instead
of one, comments changes (Michal)
v4: improve one debug message (Michal)
Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241104213449.1455694-4-tomasz.lis@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
To properly support VF Save/Restore procedure, fixups need to be
applied after PF driver finishes its part of VF Restore. The fixups
are required to adjust the ongoing execution for a hardware switch
that happened, because some GFX resources are not fully virtualized,
and assigned to a VF as range from a global pool. The VF on which
a VM is restored will often have different ranges provisioned than
the VF on which save process happened. Those resource fixups are
applied by the VF driver within a restored VM.
A VF driver gets informed that it was migrated by receiving an
interrupt from each GuC. The interrupt assigned for that purpose
is "GUC SW interrupt 0". Seeing that fields set from within the
irq handler should be the trigger for fixups.
The VF can safely do post-migration fixups on resources associated
to each GuC only after that GuC issued the MIGRATED interrupt.
This change introduces a worker to be used for post-migration fixups,
and a mechanism to schedule said worker when all GuCs sent the irq.
v2: renamed and moved functions, updated logged messages, removed
unused includes, used anon struct (Michal)
v3: ordering, kerneldoc, asserts, debug messages,
on_all_tiles -> on_all_gts (Michal)
v4: fixed missing header include
v5: Explained what fixups are, explained which IRQ is used, style
fixes (Michal)
Bspec: 50868
Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241104213449.1455694-2-tomasz.lis@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Only limited set of registers is accessible for the VF driver and
the hardware will silently drop writes to inaccessible registers.
To improve our VF driver lets intercept all such writes to warn
about such unexpected writes on debug builds or optionally allow
to provide some substitution (as a potential future extension).
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240713142643.1242-2-michal.wajdeczko@intel.com
Each VF is assigned a limited range of the GGTT address space.
To ensure that the VF driver does not use GGTT allocations outside
of the assigned region, explicitly reserve GGTT space below and
above this region when initializing GGTT.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240527112015.1020-1-michal.wajdeczko@intel.com
As part of the its initialization, the VF driver has already
obtained a list of the runtime (fuse) register values from the
PF driver. When VF driver is attempting to read register that is
inaccessible to the VF, use the values from this list instead.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240523192240.844-2-michal.wajdeczko@intel.com
The GuC firmware is loaded and initialized by the PF driver. Make
sure VF drivers only perform permitted operations. For submission
initialization, use number of GuC context IDs from self config.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240521092518.624-3-michal.wajdeczko@intel.com
The VF driver doesn't know which GuC firmware was loaded by the PF
driver and must perform GuC ABI version handshake prior to sending
any other H2G actions to the GuC to submit workloads.
The VF driver also doesn't have access to the fuse registers and
must rely on the runtime info, which includes values of the fuse
registers, that the PF driver is exposing to the VFs.
Add functions to cover that functionality. We will use these
functions in upcoming patches.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240516110546.2216-5-michal.wajdeczko@intel.com