qemu-server/PVE
Dominik Csapak 49c51a60db pci: workaround nvidia driver issue on mdev cleanup
in some nvidia grid drivers (e.g. 14.4 and 15.x), their kernel module
tries to clean up the mdev device when the vm is shut down, and if it
cannot do that (e.g. because we already cleaned it up), their removal
process aborts with an error such that the vgpu still exists inside
their book-keeping, but can't be used/recreated/freed until a reboot.

since there seems to be no obvious way to detect if that's the case besides
either parsing dmesg (which is racy), or checking the nvidia kernel module
version (which i'd rather not do), we simply test the pci device vendor
for nvidia and add a 10s sleep. that should give the driver enough time
to clean up, so that we will not find the path anymore and skip the cleanup.
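
a rough illustration of the idea described above, written as a Python sketch rather than the actual Perl change in QemuServer.pm. the function names, the sysfs layout under `sysfs_root`, and the injectable `sleep` parameter are assumptions made for this example; only the nvidia vendor check, the 10s delay, and the "skip if the path is gone" behaviour come from the description.

```python
import os
import time

# PCI vendor ID assigned to NVIDIA, as it appears in the sysfs "vendor" file.
NVIDIA_VENDOR_ID = "0x10de"


def read_pci_vendor(pci_id, sysfs_root="/sys/bus/pci/devices"):
    """Return the vendor ID string of a PCI device, or None if unreadable."""
    try:
        with open(os.path.join(sysfs_root, pci_id, "vendor")) as fh:
            return fh.read().strip().lower()
    except OSError:
        return None


def cleanup_mdev(pci_id, uuid, sysfs_root="/sys/bus/pci/devices",
                 delay=10, sleep=time.sleep):
    """Remove an mdev device unless the driver already removed it.

    Returns True if we triggered the removal, False if it was skipped.
    """
    # For NVIDIA devices, wait so the driver can finish its own cleanup.
    if read_pci_vendor(pci_id, sysfs_root) == NVIDIA_VENDOR_ID:
        sleep(delay)

    # Assumed layout: <sysfs_root>/<pci_id>/<uuid>/remove
    remove_path = os.path.join(sysfs_root, pci_id, uuid, "remove")
    if not os.path.exists(remove_path):
        # The driver cleaned up on its own; nothing left for us to do.
        return False

    with open(remove_path, "w") as fh:
        fh.write("1")
    return True
```

for non-nvidia devices the sketch removes the mdev immediately; for nvidia devices it only writes to `remove` if the path still exists after the delay, which is what makes the same code path work for both older and newer drivers.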

This way, it works with both the newer and older versions of the driver
(some of the older drivers are LTS releases, so they're still
supported).

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2023-03-16 09:08:34 +01:00
API2 fix spelling error in comment 2023-01-23 11:20:11 +01:00
CLI tree-wide: switch to official spelling of QEMU in descriptions/messages 2022-12-20 10:26:41 +01:00
QemuServer ovmf efi disk: ignore efitype parameter for ARM VMs 2023-02-23 16:29:57 +01:00
VZDump vzdump: Add VM QGA option to skip fs-freeze/-thaw on backup 2023-02-23 16:34:10 +01:00
Makefile buildsys: use $(MAKE) instead of make 2019-09-24 18:06:16 +02:00
QemuConfig.pm fix #4201: delete cloud-init disk on rollback 2022-11-11 19:26:16 +01:00
QemuMigrate.pm close #2792: allow online migration with replicated snapshots 2023-01-27 09:53:28 +01:00
QemuServer.pm pci: workaround nvidia driver issue on mdev cleanup 2023-03-16 09:08:34 +01:00
QMPClient.pm tree-wide: switch to official spelling of QEMU in descriptions/messages 2022-12-20 10:26:41 +01:00