qemu-server/test
Dominik Csapak 48ada6982f pci: mdev: adapt to NVIDIA's modern interface with kernel >= 6.8
Since kernel 6.8, NVIDIA's vGPU driver no longer uses the generic mdev
interface, because it relied on a feature there that is no longer
available. IIUC the kernel [0] recommends that drivers implement their
own device-specific features, since putting everything in the generic
one does not make sense.

They now have an 'nvidia' folder in the device sysfs path, which
contains the files `creatable_vgpu_types`/`current_vgpu_type` to
control the virtual function's model, and then the whole virtual function
has to be passed through (although without resetting it and without
changing it to the vfio-pci driver).
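
As a rough illustration of the sysfs layout described above, the following
Python sketch reads and writes `current_vgpu_type` below a device path. The
helper names and the exact file contents (a single type ID per write) are
assumptions made for illustration; the actual implementation lives in the
Perl code of qemu-server and may differ.

```python
from pathlib import Path


def nvidia_sysfs_dir(dev_path):
    """Return the 'nvidia' folder below a PCI device's sysfs path."""
    return Path(dev_path) / "nvidia"


def read_current_vgpu_type(dev_path):
    """Read the currently selected vGPU type (assumed to be one token)."""
    return (nvidia_sysfs_dir(dev_path) / "current_vgpu_type").read_text().strip()


def set_vgpu_type(dev_path, type_id):
    """Select a vGPU type by writing its ID (hypothetical helper)."""
    (nvidia_sysfs_dir(dev_path) / "current_vgpu_type").write_text(f"{type_id}\n")
```

On a real host, `dev_path` would be something like a virtual function's
directory below `/sys/bus/pci/devices/`; passing the path in as a parameter
keeps the sketch testable against a temporary directory.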

This patch implements changes so that, from a config perspective, it
still is a mediated device, and we map the functionality iff the device
has no mediated devices but has NVIDIA's new sysfs API and the model name
is 'nvidia-<..>'.
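
The condition above could be sketched as follows. This is only an
illustration under assumptions: the function name is hypothetical, the
absence of mediated devices is approximated by a missing
`mdev_supported_types` directory, and the presence of the new API by the
`creatable_vgpu_types` file.

```python
import re
from pathlib import Path


def uses_nvidia_vgpu_api(dev_path, model):
    """Decide whether to use the NVIDIA-specific sysfs API (sketch)."""
    dev = Path(dev_path)
    # Assumption: a device exposing generic mdevs has this directory.
    has_mdev = (dev / "mdev_supported_types").is_dir()
    # Assumption: the new API is detected via this file in the 'nvidia' folder.
    has_nvidia_api = (dev / "nvidia" / "creatable_vgpu_types").exists()
    # The configured model name must look like 'nvidia-<..>'.
    is_nvidia_model = re.fullmatch(r"nvidia-.+", model) is not None
    return (not has_mdev) and has_nvidia_api and is_nvidia_model
```

All three conditions have to hold, so a device that still offers generic
mdevs keeps using the old code path even if the model name matches.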

It behaves a bit differently from mdevs and normal PCI passthrough, as we
have to choose the correct device immediately since it's bound to the
pciid, but we must not bind the device to vfio-pci as the NVIDIA driver
implements this functionality itself.

When cleaning up, we iterate over all reserved devices (since for a
mapping we can't know at this point which was chosen besides looking at
the reservations) and reset the vgpu model to '0', so it frees up the
reservation from NVIDIA's side. (We also do that in a loop, since it's
not always immediately ready after QEMU closes.)
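
The cleanup loop described above might look roughly like this sketch. The
function names, the retry count, and the delay are assumptions chosen for
illustration, as is treating any `OSError` on write as "not ready yet".

```python
import time
from pathlib import Path


def reset_vgpu_type(dev_path, attempts=10, delay=0.1):
    """Write '0' to current_vgpu_type, retrying while the write fails."""
    target = Path(dev_path) / "nvidia" / "current_vgpu_type"
    for _ in range(attempts):
        try:
            target.write_text("0\n")
            return True
        except OSError:
            # Assumption: the driver may still be busy right after QEMU exits.
            time.sleep(delay)
    return False


def cleanup_reserved(dev_paths, **kwargs):
    """Reset every reserved device and report which resets succeeded."""
    return {path: reset_vgpu_type(path, **kwargs) for path in dev_paths}
```

Returning a per-device result instead of raising lets the caller log the
devices that could not be reset while still cleaning up the rest.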

A general problem (though that was also the case previously) is that a
showcmd (for a guest that is not running) reserves the pciids, which might
block the start of a different, real VM. This is now a bit more
problematic, as we (temporarily) set the vgpu type at that point.

0: https://docs.kernel.org/driver-api/vfio-pci-device-specific-driver-acceptance.html

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-by: Christoph Heiss <c.heiss@proxmox.com>
Reviewed-by: Christoph Heiss <c.heiss@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2024-10-24 18:43:52 +02:00
..
cfg2cmd fix #3352: templates: minimize config when starting templates 2024-07-01 10:48:27 +02:00
MigrationTest move helper to check running QEMU version out of the 'Machine' module 2024-07-30 21:19:51 +02:00
ovf_manifests test: add test for OVF with missing default rasd namespace 2020-04-27 13:09:51 +02:00
restore-config-expected test: add tests for restoring config 2021-04-18 18:10:28 +02:00
restore-config-input test: add tests for restoring config 2021-04-18 18:10:28 +02:00
snapshot-expected tests: use valid machine types for snapshot tests 2023-08-17 13:37:57 +02:00
snapshot-input tests: use valid machine types for snapshot tests 2023-08-17 13:37:57 +02:00
Makefile tests: fix invoking migration tests with make 2023-05-22 15:51:58 +02:00
run_config2command_tests.pl pci: mdev: adapt to NVIDIA's modern interface with kernel >= 6.8 2024-10-24 18:43:52 +02:00
run_ovf_tests.pl test: add test for OVF with missing default rasd namespace 2020-04-27 13:09:51 +02:00
run_pci_addr_checks.pl move qemu-configs to own directory 2019-09-24 18:59:35 +02:00
run_qemu_img_convert_tests.pl fix #4249: make image clone or conversion respect bandwidth limit 2023-02-23 17:09:51 +01:00
run_qemu_migrate_tests.pl tests: add migration alias check 2023-06-21 12:48:11 +02:00
run_qemu_restore_config_tests.pl test: unbreak restore_config_test 2021-06-23 12:27:54 +02:00
run_snapshot_tests.pl tests: exit with -1 in case of failures 2017-05-17 13:58:18 +02:00
snapshot-test.pm tests: use valid machine types for snapshot tests 2023-08-17 13:37:57 +02:00
test_get_replicatable_volumes.pl grammar fix: s/does not exists/does not exist/g 2019-12-13 12:20:56 +01:00
test.vmdk fix #2395: refactor qemu_img_convert to accept files as source 2019-10-17 13:57:21 +02:00