For aarch64, the virt machine has an initial pcie.0 bus. The other pci
bridges that get added on top are called pci.N, see the relevant
section in config_to_command() which adds them.
In particular, this fixes adding an RNG device, which is required for
OVMF PXE boot.
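For reference, the resulting bus naming boils down to something like
this (a minimal sketch, not the actual config_to_command() code):
```
sub pci_bus_name {
    my ($arch, $n) = @_;
    # sketch: on the aarch64 'virt' machine the initial bus is pcie.0,
    # while the bridges added on top of it are named pci.N
    return 'pcie.0' if $arch eq 'aarch64' && $n == 0;
    return "pci.$n";
}
```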
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250618152009.120524-4-f.ebner@proxmox.com
The only supported machine for aarch64 is 'virt', so there is no need
to check if that is the machine. Also, many (all?) other machines for
aarch64 in QEMU don't have a 'pci' bus by default.
The parameter is also transitively removed from the functions:
1. print_hostpci_devices()
2. print_rng_device_commandline()
3. get_usb_controllers()
4. print_netdevice_full()
5. print_vga_device()
Should this ever be required again in the future, or even for the
$arch parameter itself right now, it would be nicer to properly
abstract this away instead of passing the parameters along everywhere,
e.g. by using
a class.
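Such an abstraction could look roughly like the following sketch (the
class name and methods are hypothetical, nothing like this exists yet):
```
package PVE::QemuServer::CommandBuilder; # hypothetical

use strict;
use warnings;

sub new {
    my ($class, $arch) = @_;
    # keep per-VM state like $arch in one place ...
    return bless { arch => $arch }, $class;
}

sub print_rng_device_commandline {
    my ($self, $conf) = @_;
    # ... instead of threading it through every print_*() helper
    my $bus = $self->{arch} eq 'aarch64' ? 'pcie.0' : 'pci.0';
    return "virtio-rng-pci,bus=$bus"; # illustrative only
}

1;
```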
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250618152009.120524-3-f.ebner@proxmox.com
The debhelpers for systemd in Bookworm do not find those files
then, so we would either need to handle the enabling/(re)starting
ourselves, or switch back to shipping those files in the aliased /lib
directory for now – the latter is less work and restores the status
quo.
Reported-by: Daniel Herzig <d.herzig@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
When creating or cleaning up NVIDIA vGPUs, we mistakenly assumed a
PCI domain of 0000, but this might be different.
Use 'normalize_pci_id' from PVE::SysFSTools, which already handles
this correctly.
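For illustration, a call might look like this (the exact semantics are
assumed from the commit message, not verified against PVE::SysFSTools):
```
use PVE::SysFSTools;

my $pci_id = '0001:01:00.0'; # non-default domain
# assumed behavior: keep an existing domain as-is and only fall back
# to the default 0000 domain when none is given
my $normalized = PVE::SysFSTools::normalize_pci_id($pci_id);
```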
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Link: https://lore.proxmox.com/20250704061852.251189-3-d.csapak@proxmox.com
(cherry picked from commit 6c0cce85ac)
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
pve-firewall introduced a new helper for deciding whether to create a
firewall bridge for a given tap interface. In addition to checking for
nftables, it also checks the type of the bridge. This fixes an
issue with OVS and the nftables firewall: there, firewall bridges are
still required for the guest firewall to work, and the new helper in
pve-firewall now checks for that condition.
Previously, only the vm network script checked the condition for
creating a firewall bridge properly, but not the function for
hotplugging VM network devices. This caused a firewall bridge to
always get created when hotplugging a network device. The additional
firewall bridge had no influence on the functionality of nftables, but
was unnecessary.
To that end, a helper is introduced in qemu-server that should be
used by all call sites.
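A rough sketch of the intended shape (all names here are hypothetical;
the point is that every call site delegates to one helper):
```
# hypothetical helper in qemu-server; name and signature are made up
sub create_fw_bridge_needed {
    my ($net) = @_;
    return 0 if !$net->{firewall};
    # the actual decision (nftables in use? OVS bridge?) is made by
    # the new helper in pve-firewall
    return PVE::Firewall::Helpers::needs_fwbr($net->{bridge});
}
```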
The have_sdn guards can be removed because pve-manager >= 8.2.10 has a
hard dependency on libpve-network-perl which includes the required
modules.
Signed-off-by: Stefan Hanreich <s.hanreich@proxmox.com>
Moves this repo closer to our others by having a top-level src/
folder that includes all relevant sources and assets, and a debian/
folder with the packaging metadata.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
using the new top-level `make tidy` target, which calls perltidy via
our wrapper to enforce the desired style as closely as possible.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
See pve-common's commit 5ae1f2e ("buildsys: add tidy make target")
for details about the chosen xargs parameters.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Using the LLVM style with some minor adaptations to avoid cramping
too much into single lines.
For now no make target, but with the .clang-format file one can simply
run:
clang-format -i qmeventd.[ch]
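The file could look along these lines (only BasedOnStyle is given
above; the concrete adaptations are assumptions):
```
BasedOnStyle: LLVM
IndentWidth: 4
ColumnLimit: 100
```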
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
As reported in the community forum [0] and the virtio-win project [1],
virtiofsd will run into its open file limit when used with a Windows
guest that reads too many files. It's also reported that the issue
does not occur with Linux guests and that a workaround is using
'--inode-file-handles=mandatory' on the virtiofsd command line.
The option is described as follows in the virtiofsd help:
> When to use file handles to reference inodes instead of O_PATH file
> descriptors (never, prefer, mandatory)
and the default is 'never'.
Fix the above issue by using 'prefer' rather than 'mandatory', because
that should not break other edge cases:
> prefer: Attempt to generate file handles, but fall back to O_PATH
> file descriptors where the underlying filesystem does not support
> file handles. Useful when there are various different filesystems
> under the shared directory and some of them do not support file
> handles.
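In terms of the command line, the change corresponds to something like
the following (binary and paths are placeholders):
```
/usr/libexec/virtiofsd --socket-path=/run/virtiofsd.sock \
    --shared-dir /path/to/share --inode-file-handles=prefer
```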
[0]: https://forum.proxmox.com/threads/165565/
[1]: https://github.com/virtio-win/kvm-guest-drivers-windows/issues/1136
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Tested-by: Markus Frank <m.frank@proxmox.com>
Link: https://lore.proxmox.com/20250502142133.59401-1-f.ebner@proxmox.com
If no vfio device is present during migration, and the transferred
(main) memory did not change between loop cycles, we get a warning:
Use of uninitialized value $last_vfio_transferred in string ne
To silence that, check whether the transferred vfio value is defined
beforehand, and always write a defined value to $last_vfio_transferred.
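A minimal sketch of that pattern (the layout of the migration stats
hash is an assumption):
```
my $vfio = $stat->{vfio}->{transferred} // 0; # absent without vfio devices
if ($vfio ne ($last_vfio_transferred // 0)) {
    # vfio transfer still progressing
}
$last_vfio_transferred = $vfio; # always a defined value from now on
```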
This was noticed by a forum user:
https://forum.proxmox.com/threads/166161/
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Link: https://lore.proxmox.com/20250519144357.3515197-1-d.csapak@proxmox.com
According to git history, the $vmid argument was never used.
Reported-by: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Otherwise, a rescan operation would add fleecing images as unused
disks, even if they are already recorded in the special 'fleecing'
section.
Usually, fleecing images are cleaned up directly after backup, so this
is less likely to be an issue after commit 8009da73 ("fix #6317:
backup: use correct cleanup_fleecing_images helper"), but still makes
sense for future-proofing and for other edge cases where cleanup might
have failed.
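Conceptually, rescan needs a check along these lines (a sketch; the
helper name and config layout are assumptions):
```
# hypothetical helper: is the volume recorded as a fleecing image?
sub is_fleecing_image {
    my ($conf, $volid) = @_;
    my $fleecing = $conf->{'special-sections'}->{fleecing} // {};
    return grep { $_ eq $volid } values %$fleecing;
}
```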
Reported-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250422080951.10072-1-f.ebner@proxmox.com
The local one is specific to `allocate_fleecing_images` and has a
comment stating to use the one from `PVE::QemuConfig` in all other
cases.
The `cleanup` sub already called this, but only if the VM was running.
We do allocate fleecing images for previously-stopped VMs as well,
though, so we also need to do the cleanup.
As for the `detach_fleecing_images()` call: while it could have stayed
in the `vm_running_locally()` branch, it performs this check itself, and
this way the entire fleecing cleanup stays together in one place.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
VirtIO-fs using writeback cache seems very broken at the moment. If a
guest accesses a file (even just using 'touch') that the host is
currently writing, the guest can permanently end up with a truncated
version of that file. Even subsequent operations like moving the file
will not make the correct file visible, but just rename the
truncated one.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Version 5.2.0 of libpve-guest-common-perl is required for the
PVE/Mapping/Dir.pm module, but a transitive dependency on a
libpve-cluster-perl version that tracks the corresponding file on the
cluster file system was missing, so the build would still fail with:
> unknown file 'mapping/directory.cfg' at /usr/share/perl5/PVE/Cluster.pm
Version 5.2.2 of libpve-guest-common-perl depends on recent enough
libpve-cluster-perl to fix this.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Markus Frank <m.frank@proxmox.com>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Laurențiu Leahu-Vlăducu <l.leahu-vladucu@proxmox.com>
Reviewed-by: Daniel Kral <d.kral@proxmox.com>
Tested-by: Laurențiu Leahu-Vlăducu <l.leahu-vladucu@proxmox.com>
Tested-by: Daniel Kral <d.kral@proxmox.com>
Tested-by: Lukas Wagner <l.wagner@proxmox.com>
Link: https://lore.proxmox.com/20250407134950.265270-6-m.frank@proxmox.com
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
add dir mapping checks to check_local_resources
Since the VM needs to be powered off for migration, migration should
work with a directory on shared storage with all caching settings.
Signed-off-by: Markus Frank <m.frank@proxmox.com>
Link: https://lore.proxmox.com/20250407134950.265270-5-m.frank@proxmox.com
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Add support for sharing directories with a guest VM.
virtio-fs needs virtiofsd to be started. In order to start virtiofsd
as a process (despite being a daemon, it does not run in the
background), a double-fork is used.
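A condensed sketch of the double-fork pattern used here (error
handling trimmed):
```
use POSIX qw(setsid _exit);

sub start_daemonized {
    my (@cmd) = @_; # e.g. the assembled virtiofsd command line
    my $pid = fork() // die "fork failed: $!\n";
    if ($pid == 0) {
        setsid(); # first child: detach from the session
        my $pid2 = fork() // _exit(1);
        exec(@cmd) if $pid2 == 0; # grandchild becomes virtiofsd
        _exit(0); # first child exits right away ...
    }
    waitpid($pid, 0); # ... so the parent can reap it immediately
}
```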
virtiofsd should close itself together with QEMU.
There is the dirid parameter as well as the optional parameters
direct-io, cache and writeback. Additionally, the expose-xattr &
expose-acl parameters can be set to expose xattr & ACL settings from
the shared filesystem to the guest system.
The dirid gets mapped to the path on the current node and is also used
as a mount tag (name used to mount the device on the guest).
example config:
```
virtiofs0: foo,direct-io=1,cache=always,expose-acl=1
virtiofs1: dirid=bar,cache=never,expose-xattr=1,writeback=1
```
For information on the optional parameters see the corresponding doc
patch and the official GitLab README:
https://gitlab.com/virtio-fs/virtiofsd/-/blob/main/README.md
Also add a permission check for virtiofs directory access.
Add virtiofsd to the Recommends list for the qemu-server Debian
package; this allows users to opt out of installing this package, e.g.
for certification reasons.
Signed-off-by: Markus Frank <m.frank@proxmox.com>
Link: https://lore.proxmox.com/20250407134950.265270-3-m.frank@proxmox.com
Tested-by: Lukas Wagner <l.wagner@proxmox.com>
[TL: squash d/control change and re-add Lukas' T-b, as nothing
essentially changed from the v16 where his tag applied]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
In spirit, this is a revert of 502870a0 ("qmeventd: extract vmid from
cgroup file instead of cmdline"), but instead of relying on the custom
'id' commandline option that's added by a Proxmox VE patch to QEMU,
rely on the standard 'pidfile' option to extract the VM ID.
As reported in the community forum [0], at least during stop mode
backup, it seems to be possible to end up with the VM process having
> 0::/system.slice/pvescheduler.service
as its single cgroup entry. It's not clear what exactly happens, and
attempts to reproduce the issue were unsuccessful. It might be a rare
bug in systemd or in pve-common's enter_systemd_scope() code.
This was not the first time relying on the cgroup entry caused issues,
see d0b58753 ("qmeventd: improve getting VMID from PID in presence of
legacy cgroup entries").
To avoid such edge cases and issues in the future, go back to
extracting the VM ID from the process's commandline.
It's enough to care about the first occurrence of the 'pidfile'
option, because that's the one added by Proxmox VE, so the 'continue's
in the loop turn into 'break's. Even though a later option would
override the first for QEMU itself to use, that's not supported
anyway, and the important part is the VM ID, which is present in the
first.
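qmeventd itself is implemented in C, but the extraction amounts to
this (sketched in Perl; the pidfile added by Proxmox VE follows the
/var/run/qemu-server/<vmid>.pid scheme):
```
# sketch of the idea only; the real code in qmeventd.c is C
sub vmid_from_pidfile {
    my ($pidfile) = @_; # e.g. /var/run/qemu-server/100.pid
    return $1 if $pidfile =~ m!/qemu-server/(\d+)\.pid$!;
    return undef;
}
```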
[0]: https://forum.proxmox.com/threads/147409/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20240614092134.18729-1-f.ebner@proxmox.com
This makes it a bit more obvious what happens and produces an actual
error for bogus $PVE_MACHINE_VERSION entries.
Note that there was no auto-vivification before, as we never directly
accessed $PVE_MACHINE_VERSION->{$verstr}->{highest} but used
get_machine_pve_revisions to query a specific QEMU machine version's
PVE revisions and then operated on the return value, and that method
returns undef if there is no entry at all.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
This should have been in the patch doing the change :(
Fixes: 65b2041 ("vm-network-scripts: move scripts to /usr/libexec")
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Moves the network scripts from /var/lib/qemu-server into
/usr/libexec/qemu-server.
/usr/libexec is described in [FHS 4.7] as holding binaries run by
other programs and not intended to be directly executed by the user.
/var/lib, on the other hand, corresponds to variable state information,
which does not fit the use case here, see [FHS 5.8].
For the sake of preventing race conditions during upgrade we ship both
versions until version 9. This is required as package files are first
unpacked, including the removal of files not shipped by the new
version anymore, and only then configured, which triggers the restart
of the services.
[FHS 4.7]: https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch04s07.html
[FHS 5.8]: https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch05s08.html
Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
Link: https://lore.proxmox.com/20250218133206.318155-1-m.sandoval@proxmox.com
Fiona Ebner <f.ebner@proxmox.com> says:
Record the created fleecing images in the VM configuration to be able
to remove left-overs after hard failures.
Adds a new special configuration section 'fleecing', making special
section handling more generic as preparation, as well as fixing some
corner cases in configuration parsing and adding tests.
Fiona Ebner (16):
migration: remove unused variable
test: avoid duplicate mock module in restore config test
test: add parse config tests
parse config: be precise about section type checks
test: add test case exposing issue with unknown sections
parse config: skip unknown sections and warn about their presence
vzdump: anchor matches for pending and special sections
vzdump: skip all special sections
config: make special section handling generic
test: parse config: test config with duplicate sections
parse config: warn about duplicate sections
check type: require schema as an argument
config: add fleecing section
fix #5440: vzdump: better cleanup fleecing images after hard errors
migration: attempt to clean up potential left-over fleecing images
destroy vm: clean up potential left-over fleecing images
Link: https://lore.proxmox.com/20250127112923.31703-1-f.ebner@proxmox.com
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Clean up left-over fleecing images before the guest is migrated to a
different node, after which they would really become orphaned.
Suggested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250127112923.31703-16-f.ebner@proxmox.com
By recording the allocated fleecing images in the VM config, they
are not immediately orphaned, should a hard error occur during
backup that prevents cleanup.
Cleanup is then attempted during the next backup run.
In the cleanup helper, check if fleecing images are still attached in
QEMU and detach them. This allows recovering from more failure
scenarios. However, to avoid a deadlock, a left-over backup job needs
to be canceled first. While canceling a left-over backup already
happens when cleanup is done for a subsequent backup, it is required
for other cases like cleanup before migration (to be added in a
following commit).
Suggested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250127112923.31703-15-f.ebner@proxmox.com
Currently, a duplicate section will quietly override the previous
instance of the section with the same identifier. Keep the current
behavior of preferring later entries, but issue a warning or die when
parsing strictly.
The entry for 'pending' in the result needs to start out as undefined
for the check to also work in the presence of empty sections. Avoid
changing the returned value itself by making sure to initialize the
entry before returning.
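The duplicate check itself is conceptually simple (a sketch;
@section_headers and $strict stand in for the real parser state):
```
my %seen;
for my $section (@section_headers) {
    if ($seen{$section}++) {
        die "duplicate section '$section'\n" if $strict;
        warn "duplicate section '$section', keeping the later entry\n";
    }
}
```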
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250127112923.31703-12-f.ebner@proxmox.com
Collect special sections below a common 'special-sections' key in
preparation for introducing a new special section.
The special 'cloudinit' section was added at the top level of the
configuration structure, but it's cleaner to group special sections,
similar to how snapshots are grouped.
The 'cloudinit' key was already initialized, so having the new
'special-sections' key always initialized should not cause issues,
after checking and adapting all usages of 'cloudinit', which this
patch attempts to do.
Add compat handling for remote migration which might receive the
configuration hash from a node that does not yet have the changes.
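The resulting layout of the parsed configuration hash, roughly (a
sketch):
```
# before: the special section lived at the top level
#   $conf->{cloudinit}
# after: grouped below one key, similar to $conf->{snapshots}
#   $conf->{'special-sections'}->{cloudinit}
#   $conf->{'special-sections'}->{fleecing}
```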
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Link: https://lore.proxmox.com/20250127112923.31703-10-f.ebner@proxmox.com