qemu-server

mirror of https://git.proxmox.com/git/qemu-server synced 2025-07-05 17:18:42 +00:00

Author	SHA1	Message	Date
Dominik Csapak	6fa358a334	pci: make mediated device sysfs path independent of PCI id mdevs have a host-unique UUID they are indexed with in the PCI-id independent `/sys/bus/mdev/devices/<uuid>` path, so there is no need to go through the PCI id for them. Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>	2022-11-09 09:06:19 +01:00
Thomas Lamprecht	2fa64dbddd	pci: add/improve HW reservation comments Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2022-11-09 08:55:55 +01:00
Dominik Csapak	1b189121fc	vm start/stop: cleanup passed-through pci devices in more situations if the preparing of PCI devices or the start of the VM fails, we need to clean up the PCI devices (reservations and mdevs), or else it might happen that there are leftovers which must be manually removed. to include also mdevs now, refactor the cleanup code from 'vm_stop_cleanup' into its own function, and call that instead of only 'remove_pci_reservation' also simplifies the code, such that it now removes all PCI ids reserved for that VMID, since we cannot have multiple VMs with the same VMID anyway Signed-off-by: Dominik Csapak <d.csapak@proxmox.com> Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2022-11-09 08:49:45 +01:00
Dominik Csapak	bbf96e0f1e	automatically add 'uuid' parameter when passing through NVIDIA vGPU When passing through an NVIDIA vGPU via mediated devices, their software needs the qemu process to have the 'uuid' parameter set to the one of the vGPU. Since it's currently not possible to pass through multiple vGPUs to one VM (seems to be an NVIDIA driver limitation at the moment), we don't have to take care about that. Sadly, the place we do this, it does not show up in 'qm showcmd' as we don't (want to) query the pci devices in that case, and then we don't have a way of knowing if it's an NVIDIA card or not. But since this is informational with QEMU anyway, i'd say we can ignore that. Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>	2022-08-12 13:42:33 +02:00
Dominik Csapak	d8a7e9e881	PCI: allow longer pci domains some systems[0] have pci domains longer than the default ('0000') of 4 characters, so change the regex to allow at least 4. 0: https://forum.proxmox.com/threads/problem-with-gpu-passthrough-in-a-virtual-machine.105720/ Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>	2022-03-16 18:03:35 +01:00
Nicholas Sherlock	d806b017ac	pci: allow override of PCI vendor/device ids This allows mobile- and vGPUs to be presented to the guest as if they were the original desktop variants of the card. It also allows device-ID variants that guests don't know about to be renamed to match compatible sibling devices the guest does have drivers for (e.g. to remove manufacturer-specific vendor ID variants that prevent the use of a device which would otherwise have a supported chipset) e.g. hostpci0: 03:00,vendor-id=0x8086,device-id=0x10f6 Signed-off-by: Nicholas Sherlock <n.sherlock@gmail.com> Reviewed-by: Dominik Csapak <d.csapak@proxmox.com> Tested-by: Dominik Csapak <d.csapak@proxmox.com>	2022-01-25 10:59:23 +01:00
Thomas Lamprecht	d01de38cb6	pci: prepare: improve no-IOMMU error message give some context Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2021-10-15 19:58:16 +02:00
Thomas Lamprecht	a01593676c	pci reservation: rework helpers style and readability wise both style and readability are naturally subjective to a certain degree... Also, this patch mixes a bit much into one thing, but splitting that up would mean lots of work I just wanted to avoid, sorry about that. Among other things: - avoid a level of indentation in the reserve loop - rename pciids to reservation_list where it was a better fit - make reserve set either pid or time to avoid suggesting that we save both - rename parameters to requested/dropped IDs for easier understanding what's going on in the code - avoid old_pid/pid, use running_pid and reserver_pid instead to clarify what they actually mean - drop useless returns to avoid suggesting the return value has any use and save some lines - use a hash slice to delete all dropped IDs at once, shorter and faster - use 5 second timeout for reservation, this does nothing intensive nor does it wait for anything, so the critical section should be really short, 5s is really long enough for a wait.. Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2021-10-15 19:58:16 +02:00
Thomas Lamprecht	bda0ebff2d	pci reservation: move lock/reservation file into /run/qemu-server lck needs to die, the days of any 8.3 file naming schemes are long gone (in the server space that is ;) /var/run is /run so use the shorter, and while /var/lock is an OK place for the locks we try to keep lock and lock-object together nowadays. The qemu-server sub-directory avoids overly cluttering the already crowded top-level /run dir Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2021-10-15 18:17:34 +02:00
Thomas Lamprecht	cda95d5223	pci reservation: encode locklessness of parsers in name to avoid that they're misused Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2021-10-15 14:44:50 +02:00
Dominik Csapak	3bfee796f4	pci: add helpers to (un)reserve pciids for a vm saves a list of pciid <-> vmid mappings in /var/run that we can check when we start a vm if we're not given a pid but a timeout, we save the time when the reservation will run out (current time + timeout + 5s) since each vm start (until we can save the pid) varies from config to config reserve_pci_usage and remove_pci_reservation always expect a list of ids so that we can update the reservation for a vm all at once Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>	2021-10-11 09:07:52 +02:00
Thomas Lamprecht	71cb8e0f87	pci related code cleanups Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2021-10-11 08:39:28 +02:00
Thomas Lamprecht	e2b42bee6d	pci: use local generate_mdev_uuid helper avoid (API) leaking qemu-server specific stuff into pve-common Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2021-10-11 08:38:28 +02:00
Thomas Lamprecht	82712fcd3c	pci: prepare_pci_device: fixup parameter name Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2021-10-11 08:37:35 +02:00
Dominik Csapak	acd4b77745	pci: refactor pci device preparation makes the vm start a bit less crowded Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>	2021-10-08 06:27:19 +02:00
Dominik Csapak	a4d5b84c9c	pci: do not capture first group in PCIRE we do not need this group, but want to use the regex where we have multiple groups, so make it a non-capture group Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>	2021-10-05 16:14:42 +02:00
Thomas Lamprecht	41af2dfc25	PCI: use warnings/strict and fix setting $vga from config2command fixes commit `74c17b7a23` which moved this code here, but forgot to pass $vga ref, as the module was not using warning nor strict mode this was not caught.. Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2020-10-16 18:03:32 +02:00
Thomas Lamprecht	f7d1505b0c	tree wide cleanups Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2020-10-16 18:03:32 +02:00
Thomas Lamprecht	d1c1af4b02	tree wide cleanup of s/return undef/return/ Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2020-10-16 16:20:05 +02:00
Stefan Reiter	2141a802b8	fix #3010: add 'bootorder' parameter for better control of boot devices (also fixes #3011) Deprecates the old-style 'boot' and 'bootdisk' options by adding a new 'order=' subproperty to 'boot'. This allows a user to specify more than one disk in the boot order, helping with newer versions of SeaBIOS/OVMF where disks without a bootindex won't be initialized at all (breaks soft-raid and some LVM setups). This also allows specifying a bootindex for USB and hostpci devices, which was not possible before. Floppy boot is not supported in the new model, but I doubt that will be a problem (AFAICT we can't even attach floppy disks to a VM?). Default behaviour is intended to stay the same, i.e. while new VMs will receive the new 'order' property, it will be set so the VM starts the same as before (using get_default_bootorder). Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>	2020-10-14 12:30:50 +02:00
Dominik Csapak	7de7f675c2	fix mdev cmdline generation during refactoring, the vmid got lost, but is necessary to get the correct mdev id Fixes commit `74c17b7a23` Signed-off-by: Dominik Csapak <d.csapak@proxmox.com> [ reference fixed commit ] Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2020-07-13 10:29:25 +02:00
Thomas Lamprecht	1fac3a0b31	pci: whitespace, indentation and formatting fixes Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2020-06-25 13:33:26 +02:00
Stefan Reiter	13d689792e	fix #2794: allow legacy IGD passthrough Legacy IGD passthrough requires address 00:1f.0 to not be assigned to anything on QEMU startup (currently it's assigned to bridge pci.2). Changing this in general would break live-migration, so introduce a new hostpci parameter "legacy-igd", which if set to 1 will move that bridge to be nested under bridge 1. This is safe because: * Bridge 1 is unconditionally created on i440fx, so nesting is ok * Defaults are not changed, i.e. PCI layout only changes when the new parameter is specified manually * hostpci forbids migration anyway Additionally, the PT device has to be assigned address 00:02.0 in the guest as well, which is usually used for VGA assignment. Luckily, IGD PT requires vga=none, so that is not an issue either. See https://git.qemu.org/?p=qemu.git;a=blob;f=docs/igd-assign.txt Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>	2020-06-25 13:25:35 +02:00
Stefan Reiter	74c17b7a23	cfg2cmd: hostpci: move code to PCI.pm To avoid further cluttering config_to_command with subsequent changes. Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>	2020-06-25 13:25:35 +02:00
Stefan Reiter	2cf61f33d9	fix #2264: add virtio-rng device Allow a user to add a virtio-rng-pci (an emulated hardware random number generator) to a VM with the rng0 setting. The setting is version_guard()-ed. Limit the selection of entropy source to one of three: /dev/urandom (preferred): Non-blocking kernel entropy source; /dev/random: Blocking kernel source; /dev/hwrng: Hardware RNG on the host for passthrough. QEMU itself defaults to /dev/urandom (or the equivalent getrandom() call) if no source file is given, but I don't fully trust that behaviour to stay constant, considering the documentation [0] already disagrees with the code [1], so let's always specify the file ourselves. /dev/urandom is preferred, since it prevents host entropy starvation. The quality of randomness is still good enough to emulate a hwrng, since a) it's still seeded from the kernel's true entropy pool periodically and b) it's mixed with true entropy in the guest as well. Additionally, all sources about entropy prediction attacks I could find mention that to predict /dev/urandom results, /dev/random has to be accessed or manipulated in one way or the other - this is not possible from a VM however, as the entropy we're talking about comes from the host's blocking pool. More about the entropy and security implications of the non-blocking interface in [2] and [3]. Note further that only one /dev/hwrng exists at any given time, if multiple RNGs are available, only the one selected in '/sys/devices/virtual/misc/hw_random/rng_current' will feed the file. Selecting this is left as an exercise to the user, if at all required. We limit the available entropy to 1 KiB/s by default, but allow the user to override this. Interesting to note is that the limiter does not work linearly, i.e. max_bytes=1024/period=1000 means that up to 1 KiB of data becomes available on a 1000 millisecond timer, not that 1 KiB is streamed to the guest over the course of one second - hence the configurable period. The default used here is the same as given in the QEMU documentation [0] and has been verified to affect entropy availability in a guest by measuring /dev/random throughput. 1 KiB/s is enough to avoid any early-boot entropy shortages, and already has a significant impact on /dev/random availability in the guest. [0] https://wiki.qemu.org/Features/VirtIORNG [1] https://git.qemu.org/?p=qemu.git;a=blob;f=crypto/random-platform.c;h=f92f96987d7d262047c7604b169a7fdf11236107;hb=HEAD [2] https://lwn.net/Articles/261804/ [3] https://lwn.net/Articles/808575/ Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>	2020-03-06 18:09:04 +01:00
Dominik Csapak	2513b862e6	fix #2566: increase scsi limit to 31 to achieve this, we have to add 3 new scsihw addresses, since lsi controllers can only hold 7 scsi drives. we go up to 31, since this is the limit for virtio-scsi-single devices we have reserved (we can increase this in the future). to make it more future proof, we add a new pci bridge under pci bridge 1, so we have to adapt the bridge adding code (we did not need this for q35 previously). impact on live migration: since on older versions of qemu-server we do not have those config settings, there is no problem from old -> new. new -> old is not supported anyway, and this breaks so that the vm crashes and loses the configs for scsi15-30 (same behaviour as e.g. with audio0 and migration from new->old). tested with 31 scsi disks on: i440fx + virtio-scsi, i440fx + lsi, q35 + virtio-scsi, q35 + lsi, with ovmf + seabios Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>	2020-01-31 20:26:26 +01:00
Thomas Lamprecht	e2b0d85dda	PCIe passthrough: fixup: avoid addr conflict and cleanup a bit Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2019-09-06 19:27:30 +02:00
Thomas Lamprecht	d7d698f60c	pci: add conflict tests best viewed with: git show -w Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>	2019-09-06 19:27:30 +02:00
Aaron Lauterer	c4e1638148	Add support for up to 16 PCI(e) devices For non pci express passthrough additional addresses are reserved. For pcie passthrough pcie root ports are needed (unless guest is like windows 7). The first 4 pcie root ports are defined by default in the pve-q35.cfg files. If more than 4 pcie devices are passed through the needed root ports are created on demand. This helps to keep live migration possible without adding a new pve-q35.cfg file. For the windows 7 like guests additional addresses are reserved as well. Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>	2019-09-06 19:27:30 +02:00
Aaron Lauterer	d438e06028	Add PCI address for audio device Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>	2019-07-18 08:24:39 +02:00
Dominik Csapak	6dbcb07367	add ivshmem device to config with such a shared memory device, a vm can share data with other vms or with the host via memory one of the use cases is looking-glass[1] with pci-passthrough, which copies the guest fb to the host and you get a high-speed, low-latency display client for the vm on vm stop we delete the file again 1: https://looking-glass.hostfission.com/ Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>	2019-02-26 08:01:12 +01:00
Dominik Csapak	739ba34024	add win7 pcie quirk Win7 is very picky about pcie assignments and fails with 'error 12' the way we add hostpci devices. To combat that, we simply give the hostpci device a normal port instead. Start with address 0x10, so that we have space before those devices, and between them and the ones configured in pve-q35.cfg should we need it in the future. Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>	2018-12-17 14:00:23 +01:00
Dominik Csapak	b71351a7ed	QemuServer: remove PCI sysfs helpers and use them from PVE::SysFSTools, where they got moved to Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>	2018-11-19 14:06:11 +01:00
Wolfgang Bumiller	d559309fcf	arm: pci addressing, keyboard and ehci controller On arm we start off with a pcie bridge pcie.0. We need a keyboard in addition to the tablet device, and we need to connect both to an 'ehci' controller. To do all this, we also pass the $arch variable through a whole lot of function calls to ultimately also adapt the hotplug code to take care of the new keyboard device. Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>	2018-11-13 14:44:28 +01:00
Dominik Csapak	55655ebc32	fix #1952: make vga memory configurable we change 'vga' to a property string and add a 'memory' property; with this, the user can better control the memory given to the virtual gpu. this is especially useful for spice/qxl, since high resolutions need more memory Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>	2018-11-09 13:45:07 +01:00
Dominik Csapak	de9768f002	refactor PCI into own file to reduce QemuServer.pm size also move the $device hash out of any function Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>	2016-06-22 09:13:16 +02:00

36 Commits
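
The reservation commits above (3bfee796f4, a01593676c, 1b189121fc) describe the bookkeeping only in prose. The following is a minimal Python sketch of that idea, not the actual Perl code in the repository. The names `reserve_pci_usage` and `remove_pci_reservation` come from the commit messages; their signatures here, the in-memory dict, and the helpers `pid_is_running` and `reservation_is_active` are assumptions made for illustration. The real helpers persist the list to a file under a lock, which is left out.

```python
import os
import time

GRACE_SECONDS = 5  # slack added on top of the start timeout (per 3bfee796f4)


def pid_is_running(pid):
    """Hypothetical helper: best-effort POSIX liveness check for a PID."""
    try:
        os.kill(pid, 0)  # signal 0 performs the check without sending a signal
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def reservation_is_active(entry, now):
    """An entry holds either a pid or an expiry time, never both."""
    if "pid" in entry:
        return pid_is_running(entry["pid"])
    return entry["time"] > now


def reserve_pci_usage(reservations, requested_ids, vmid, timeout=None, pid=None, now=None):
    """Reserve all requested PCI ids for vmid, or raise without changing anything."""
    if pid is None and timeout is None:
        raise ValueError("either pid or timeout is required")
    now = time.time() if now is None else now

    # Check every id first, so a conflict leaves the list untouched.
    for pci_id in requested_ids:
        entry = reservations.get(pci_id)
        if entry and entry["vmid"] != vmid and reservation_is_active(entry, now):
            raise RuntimeError(f"PCI device '{pci_id}' already in use by VMID {entry['vmid']}")

    for pci_id in requested_ids:
        if pid is not None:
            reservations[pci_id] = {"vmid": vmid, "pid": pid}
        else:
            reservations[pci_id] = {"vmid": vmid, "time": now + timeout + GRACE_SECONDS}


def remove_pci_reservation(reservations, vmid):
    """Drop every id reserved for vmid (the simplification described in 1b189121fc)."""
    for pci_id in [k for k, v in reservations.items() if v["vmid"] == vmid]:
        del reservations[pci_id]
```

In this sketch a VM start would first reserve with a timeout, then call `reserve_pci_usage` again with the QEMU process PID once it is known, and call `remove_pci_reservation` on stop or on a failed start.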