VFs use a relay transaction during the resume/reset flow and use
of the GFP_KERNEL flag may conflict with the reclaim:
-> #0 (fs_reclaim){+.+.}-{0:0}:
[ ] __lock_acquire+0x1874/0x2bc0
[ ] lock_acquire+0xd2/0x310
[ ] fs_reclaim_acquire+0xc5/0x100
[ ] mempool_alloc_noprof+0x5c/0x1b0
[ ] __relay_get_transaction+0xdc/0xa10 [xe]
[ ] relay_send_to+0x251/0xe50 [xe]
[ ] xe_guc_relay_send_to_pf+0x79/0x3a0 [xe]
[ ] xe_gt_sriov_vf_connect+0x90/0x4d0 [xe]
[ ] xe_uc_init_hw+0x157/0x3b0 [xe]
[ ] do_gt_restart+0x1ae/0x650 [xe]
[ ] xe_gt_resume+0xb6/0x120 [xe]
[ ] xe_pm_runtime_resume+0x15b/0x370 [xe]
[ ] xe_pci_runtime_resume+0x73/0x90 [xe]
[ ] pci_pm_runtime_resume+0xa0/0x100
[ ] __rpm_callback+0x4d/0x170
[ ] rpm_callback+0x64/0x70
[ ] rpm_resume+0x594/0x790
[ ] __pm_runtime_resume+0x4e/0x90
[ ] xe_pm_runtime_get_ioctl+0x9c/0x160 [xe]
Since we have a preallocated pool of relay transactions, which
should cover all our normal relay use cases, we may use the
GFP_NOWAIT flag when allocating new outgoing transactions.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Tested-by: Marcin Bernatowicz <marcin.bernatowicz@linux.intel.com>
Reviewed-by: Marcin Bernatowicz <marcin.bernatowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250131153713.808-1-michal.wajdeczko@intel.com
The kernel fault injection infrastructure is used to test proper error
handling during probe. The return code of the functions using
ALLOW_ERROR_INJECTION() can be conditionnally modified at runtime by
tuning some debugfs entries. This requires CONFIG_FUNCTION_ERROR_INJECTION
(among others).
One way to use fault injection at probe time by making each of those
functions fail one at a time is:
FAILTYPE=fail_function
DEVICE="0000:00:08.0" # depends on the system
ERRNO=-12 # -ENOMEM, can depend on the function
echo N > /sys/kernel/debug/$FAILTYPE/task-filter
echo 100 > /sys/kernel/debug/$FAILTYPE/probability
echo 0 > /sys/kernel/debug/$FAILTYPE/interval
echo -1 > /sys/kernel/debug/$FAILTYPE/times
echo 0 > /sys/kernel/debug/$FAILTYPE/space
echo 1 > /sys/kernel/debug/$FAILTYPE/verbose
modprobe xe
echo $DEVICE > /sys/bus/pci/drivers/xe/unbind
grep -oP "^.* \[xe\]" /sys/kernel/debug/$FAILTYPE/injectable | \
cut -d ' ' -f 1 | while read -r FUNCTION ; do
echo "Injecting fault in $FUNCTION"
echo "" > /sys/kernel/debug/$FAILTYPE/inject
echo $FUNCTION > /sys/kernel/debug/$FAILTYPE/inject
printf %#x $ERRNO > /sys/kernel/debug/$FAILTYPE/$FUNCTION/retval
echo $DEVICE > /sys/bus/pci/drivers/xe/bind
done
rmmod xe
It will also be integrated into IGT for systematic execution by CI.
v2: Wrappers are not needed in the cases covered by this patch, so
remove them and use ALLOW_ERROR_INJECTION() directly.
v3: Document the use of fault injection at probe time in xe_pci_probe
and refer to it where ALLOW_ERROR_INJECTION() is used.
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240927151207.399354-1-francois.dugast@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
We already have mechanism that allows a VF driver to communicate
with the PF driver, now add PF side handlers for VF2PF requests
defined in version 1.0 of VF/PF GuC Relay ABI specification.
The VF2PF_HANDSHAKE request must be used by the VF driver to
negotiate the ABI version prior to sending any other request.
We will reset any negotiated version later during FLR.
The outcome of the VF2PF_QUERY_RUNTIME requests depends on actual
platform, for legacy platforms used as SDV is provided as-is, for
latest platforms it is preliminary, and might be changed.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240423180436.2089-5-michal.wajdeczko@intel.com
It looks that declaration of kunit_get_current_test() was
available in xe_guc_relay.c only with CONFIG_KUNIT enabled.
If CONFIG_KUNIT is disabled we fail with:
In file included from ../include/linux/build_bug.h:5,
from ../include/linux/bitfield.h:10,
from ../drivers/gpu/drm/xe/xe_guc_relay.c:6:
../drivers/gpu/drm/xe/xe_guc_relay.c: In function ‘xe_guc_relay_process_guc2vf’:
../drivers/gpu/drm/xe/xe_guc_relay.c:863:52: error: implicit declaration of function ‘kunit_get_current_test’ [-Werror=implicit-function-declaration]
863 | if (unlikely(!IS_SRIOV_VF(relay_to_xe(relay)) && !kunit_get_current_test()))
| ^~~~~~~~~~~~~~~~~~~~~~
../include/linux/compiler.h:77:42: note: in definition of macro ‘unlikely’
77 | # define unlikely(x) __builtin_expect(!!(x), 0)
| ^
Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Fixes: 811fe9f556 ("drm/xe/guc: Introduce Relay Communication for SR-IOV")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Tested-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://lore.kernel.org/r/20240105171947.321-1-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Add few tests to make sure that some negative and normal use
scenarios of the GuC Relay are implemented correctly.
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://lore.kernel.org/r/20240104222031.277-10-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
There are scenarios where SR-IOV Virtual Function (VF) driver will
need to get additional data that is not available over VF MMIO BAR
nor could be queried from the GuC firmware and must be obtained
from the Physical Function (PF) driver.
To allow such communication between VF and PF drivers, GuC supports
set of H2G and G2H actions which allows relaying embedded messages,
that are otherwise opaque for the GuC.
To allow use of this communication mechanism, provide functions for
sending requests and handling replies and placeholder where we will
put handlers for incoming requests.
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://lore.kernel.org/r/20240104222031.277-8-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>