196 Commits

Author SHA1 Message Date
Aaron Lauterer
60404e3c1a tests: add migration test for pending disk
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
2023-06-21 12:48:11 +02:00
Aaron Lauterer
a0dbed5a6d migration: only migrate disks used by the guest
When scanning all configured storages for disk images belonging to the
VM, the migration could easily fail if a storage is not available, but
enabled. That storage might not even be used by the VM at all.

By not scanning all storages and only looking at the disk images
referenced in the VM config, we can avoid unnecessary failures.
Some information that used to be provided by the storage scanning needs
to be fetched explicitly (size, format).

Behaviorally the biggest change is that unreferenced disk images will
not be migrated anymore. Only images referenced in the config will be
migrated.

The tests have been adapted accordingly.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
2023-06-21 12:48:11 +02:00
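The idea behind a0dbed5a6d (look only at disk images referenced in the VM config instead of scanning every storage) can be sketched roughly as follows. This is a simplified illustration, not the qemu-server implementation: the `referenced_volumes` helper, the flat dictionary config, and the list of drive key prefixes are all assumptions made for the example.

```python
import re

# Assumption: drive-like config keys look like "scsi0", "virtio1", "unused0", ...
DRIVE_KEY = re.compile(r"^(ide|sata|scsi|virtio|efidisk|tpmstate|unused)\d+$")


def referenced_volumes(config):
    """Return the sorted volume IDs referenced by drive entries of a VM config.

    `config` is a hypothetical, flattened VM config (key -> value string).
    No storage is scanned, so an unavailable but unused storage cannot
    cause a failure here.
    """
    volumes = []
    for key, value in config.items():
        if not DRIVE_KEY.match(key):
            continue  # e.g. "memory" or "scsihw" are not drives
        volid = value.split(",", 1)[0]  # the volume ID precedes the options
        if volid in ("none", "cdrom"):
            continue  # empty or physical CD-ROM drive, nothing to migrate
        volumes.append(volid)
    return sorted(volumes)
```

An image that exists on a storage but is not mentioned in the config never shows up in the result, which matches the behavioral change described in the commit message.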
Dominik Csapak
42ac818005 add test for mapped pci devices
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-By:  Markus Frank <m.frank@proxmox.com>
2023-06-19 07:21:49 +02:00
Dominik Csapak
a52eb3c4e9 check local resources: extend for mapped resources
by adding them to their own list, saving the nodes where they are not
allowed, and return those on 'wantarray' so we don't break existing
callers that don't expect it.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-By:  Markus Frank <m.frank@proxmox.com>
2023-06-19 07:21:07 +02:00
Dominik Csapak
9b71c34d61 enable cluster mapped PCI devices for guests
this patch allows configuring pci devices that are mapped via cluster
resource mapping when the user has 'Resource.Use' on the ACL path
'/mapping/pci/{ID}' (in addition to the usual required vm config
privileges)

When given multiple mappings in the config, we use them as alternatives
for the passthrough, and will select the first free one on startup.
It is using our regular pci reservation mechanism for regular devices and
we introduce a selection mechanism for mediated devices.

A few changes to the inner workings were required to make this work well:
* parse_hostpci now returns a different structure where we have a list
  of lists (first level is for the different alternatives and second
  level is for the different devices that should be passed through
  together)
* factor out the 'parse_hostpci_devices' which parses each device from
  the config and does some precondition checks
* reserve_pci_usage now behaves slightly differently when trying to
  reserve a device that is already reserved for the same VMID,
  since for checking which alternative we can use, we already must
  reserve one (this means that qm showcmd can actually reserve devices,
  albeit only for up to 10 seconds)
* configuring a mediated device on a multifunction device is not
  supported anymore, and results in failure to start (previously, it
  just chose the first device to do it). This is a breaking change
* configuring a single pci device twice on different hostpci slots now
  fails during commandline generation instead of on QEMU start, so we had
  to adapt one test where this occurred (it could never have worked
  anyway)
* we now check permissions during clone/restore, meaning raw/real
  devices can only be cloned/restored by root@pam from now on.
  this is a breaking change.

Fixes #3574: Improve SR-IOV usability
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-By:  Markus Frank <m.frank@proxmox.com>
2023-06-16 16:24:02 +02:00
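The "list of lists" structure and the "select the first free one" rule from 9b71c34d61 can be illustrated with a small sketch. The function name `select_alternative` and the plain set of reserved addresses are hypothetical; the real code uses the PCI reservation mechanism mentioned in the commit message.

```python
def select_alternative(alternatives, reserved):
    """Pick the first alternative whose devices are all free.

    alternatives: list of lists; the outer level holds the alternatives,
                  the inner level the devices passed through together.
    reserved:     set of PCI addresses already in use (assumed input).
    """
    for index, devices in enumerate(alternatives):
        if all(dev not in reserved for dev in devices):
            return index, devices
    raise RuntimeError("no free PCI device alternative available")
```

Treating each inner list as a unit means a multi-device alternative is skipped as soon as one of its devices is taken.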
Fiona Ebner
5674d19810 remove left-over mentions of to-be-dropped, outdated QMP commands
The commands snapshot-drive and delete-drive-snapshot have been unused
by qemu-server since commit eba2b721 ("use qemu's blockdev-snapshot
functions") and are now going to be dropped in our QEMU builds too, so
get rid of these left-overs.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-06-07 19:36:45 +02:00
Fiona Ebner
17bacc2182 cfg2cmd: replace deprecated no-hpet option with hpet=off machine flag
like the deprecation message printed by QEMU suggests.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-06-07 17:35:41 +02:00
Fiona Ebner
a7547a7c9f tests: fix invoking migration tests with make
Even if between single quotes, the dollar sign needs to be escaped
here. Otherwise, there will be an error
> Search pattern not terminated at -e line 1.
and no migration tests would be run. The error did not lead to
aborting though, making it harder to notice.

Fixes: aac89f6c ("tests: avoid calling test script to get target names")
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-05-22 15:51:58 +02:00
Thomas Lamprecht
aac89f6cfa tests: avoid calling test script to get target names
As otherwise we couple *all* Makefile targets to the dependencies of
the test script, even for a simple make call (e.g., done on building
the source), so use a much simpler heuristic that just depends on
perl, which is essential in Debian.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-05-19 15:06:46 +02:00
Thomas Lamprecht
1edeff742d tests: simplify outputting available migration test names
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-05-19 15:06:46 +02:00
Fiona Ebner
da8fc2f2ad test: mock calls that can fail in a chroot environment
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-05-19 15:06:46 +02:00
Leo Nunner
56d16f169c fix #4249: make image clone or conversion respect bandwidth limit
Previously, cloning a stopped VM didn't respect bwlimit. Passing the -r
(ratelimit) parameter to qemu-img convert fixes this issue.

Signed-off-by: Leo Nunner <l.nunner@proxmox.com>
 [ T: reword subject line slightly ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-02-23 17:09:51 +01:00
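As a rough sketch of the fix in 56d16f169c, the rate limit is only appended to the `qemu-img convert` command line when a limit is configured. The helper `build_convert_cmd` is hypothetical, and the value is passed through unchanged because the unit conversion used by qemu-server is not stated in the commit message.

```python
def build_convert_cmd(src, dst, dst_format, rate_limit=None):
    """Assemble a qemu-img convert command as an argument list.

    rate_limit is passed verbatim to -r; choosing the right unit is left
    to the caller in this sketch.
    """
    cmd = ["qemu-img", "convert", "-O", dst_format]
    if rate_limit is not None:
        cmd += ["-r", str(rate_limit)]
    cmd += [src, dst]
    return cmd
```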
Alexandre Derumier
6eabfbd15f tests: add memory tests
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
2023-02-15 14:34:25 +01:00
Fiona Ebner
5cbf4d727d close #2792: allow online migration with replicated snapshots
Since commit 9b6efe43 ("migrate: add live-migration of replicated
disks") live-migration with replicated volumes is possible. When
handling the replication, it is checked that all local volumes
previously detected as replicatable are actually replicated. So the
check if migration with snapshots is possible can just allow volumes
that are detected as replicatable.

Note that VM state files are also replicated.

If there is an invalid configuration with a non-replicatable volume or
state file and replication is enabled, then replication will fail, and
thus migration will fail early.

Trying to live-migrate to a non-replication target (needs --force)
will still fail if there are snapshots, because they are (correctly)
detected as non-replicated.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-01-27 09:53:28 +01:00
Fiona Ebner
83f04be3d5 migration: nbd export: switch away from deprecated QMP command
The 'nbd-server-add' QMP command has been deprecated since QEMU 5.2 in
favor of a more general 'block-export-add'.

When using 'nbd-server-add', QEMU internally converts the parameters
and calls blk_exp_add() which is also used by 'block-export-add'. It
does one more thing, namely calling nbd_export_set_on_eject_blk() to
auto-remove the export from the server when the backing drive goes
away. But that behavior is not needed in our case, stopping the NBD
server removes the exports anyways.

It was checked with a debugger that the parameters to blk_exp_add()
are still the same after this change. Well, the block node names are
autogenerated and not consistent across invocations.

The alternative to using 'query-block' would be specifying a
predictable 'node-name' for our '-drive' commandline. It's not that
difficult for this use case, but in general one needs to be careful
(e.g. it can't be specified for an empty CD drive, but would need to
be set when inserting a CD later). Querying the actual 'node-name'
seemed a bit more future-proof.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-01-13 14:04:39 +01:00
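The switch described in 83f04be3d5 amounts to looking up the autogenerated node name via 'query-block' and then issuing 'block-export-add'. The following sketch builds the QMP message from a mocked 'query-block' result; the helper names, the device name, and the node name in the usage note are made up for illustration.

```python
def find_node_name(query_block_result, device):
    """Return the node-name of the medium inserted in `device`."""
    for entry in query_block_result:
        inserted = entry.get("inserted")
        if entry.get("device") == device and inserted:
            return inserted["node-name"]
    raise KeyError(f"no inserted block node found for {device}")


def nbd_export_command(query_block_result, device, writable=True):
    """Build a 'block-export-add' QMP message for an NBD export."""
    return {
        "execute": "block-export-add",
        "arguments": {
            "id": device,
            "type": "nbd",
            "node-name": find_node_name(query_block_result, device),
            "name": device,
            "writable": writable,
        },
    }
```

For example, with a mocked result `[{"device": "drive-scsi0", "inserted": {"node-name": "#block142"}}]`, the generated arguments reference `#block142`, whatever name QEMU happened to generate for that invocation.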
Fiona Ebner
7bd9abd243 tree-wide: switch to official spelling of QEMU in descriptions/messages
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-12-20 10:26:41 +01:00
Thomas Lamprecht
2ceb59d4b1 ovmf cmd assembly: reorder arguments
in preparation of reworking the new separate method for OVMF cmd
assembly, do this in a separate very targeted commit to make it more
clear that the next reworking-commit doesn't mess with our tests at
all.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-12-12 11:41:50 +01:00
Alexandre Derumier
f314976230 test: add qemu 7.1 multiqueue netdev test
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
 [ T: fixup missing trailing backslash in test ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-11-13 16:43:12 +01:00
Alexandre Derumier
53ca628507 test: add qemu 7.1 default netdev rx|tx_queue_size=1024
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
2022-11-13 16:42:24 +01:00
Alexandre Derumier
620d6b328f virtio-net: increase defaults rx|tx-queue-size to 1024
This reduces packet drops at high pps and is also needed for DPDK.

Red Hat has already used it by default in RHEV and its OpenStack platform
since 2019.

I have been using it in production for 6 months and have not seen a performance regression.

Fixes the following reports (which ask for a custom option, but setting it by default seems fine to me):

https://bugzilla.proxmox.com/show_bug.cgi?id=1546
https://bugzilla.proxmox.com/show_bug.cgi?id=2349
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
2022-11-13 16:42:23 +01:00
Thomas Lamprecht
15b9ce0e9a tests: cfg2cmd: add multi-q base test for 7.0 machine version
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-11-13 16:41:41 +01:00
Thomas Lamprecht
cd1db1b3e0 migrate test: fix some more grave indentation/whitespace errors
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-11-13 14:56:57 +01:00
Alexandre Derumier
73ed64967e migration : add del_nets_bridge_fdb
at the end of a live migration, we need to remove the old MAC entries
on the source host (the VM is not yet stopped) before resuming the VM on the target host

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
 [T: resolve conflicts and rework on apply ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-11-13 14:56:57 +01:00
Thomas Lamprecht
d74f424e39 test: usb: cover more ports on checking xhci 7.1+
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-11-11 09:10:18 +01:00
Dominik Csapak
b1099442b6 tests: add tests for various combinations of configs for usb
q35 + usb passthrough
q35 + usb3 passthrough
q35 + usb3 passthrough with new xhci controller
old machine type + new usb config error
old machine type + q35 + new usb config error
old ostype (w2k) + new usb config error

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2022-11-11 08:52:01 +01:00
Dominik Csapak
4862922a2b fix #4324: USB: use qemu-xhci for machine versions >= 7.1
going by reports in the forum (e.g. [0]) and semi-official qemu
information[1], we should prefer qemu-xhci over nec-usb-xhci

for compatibility purposes, we guard that behind the machine version,
so that guests with a fixed version don't suddenly have a different usb
controller after a reboot (which could potentially break some hardcoded
guest configs)

0: https://forum.proxmox.com/threads/proxmox-usb-connect-disconnect-loop.117063/
1: https://www.kraxel.org/blog/2018/08/qemu-usb-tips/

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2022-11-10 17:02:34 +01:00
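The version guard from 4862922a2b can be sketched as a simple numeric comparison. The function name `xhci_controller` and the plain "major.minor" input are assumptions for this example; the real code works on qemu-server's machine version handling.

```python
def xhci_controller(machine_version):
    """Choose the xHCI controller model for a 'major.minor' machine version."""
    major, minor = (int(part) for part in machine_version.split(".")[:2])
    # Compare numerically: as strings, "10.0" would sort before "7.1".
    if (major, minor) >= (7, 1):
        return "qemu-xhci"
    return "nec-usb-xhci"
```

A guest pinned to an older machine version keeps `nec-usb-xhci` and therefore does not see a different controller after a reboot.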
Thomas Lamprecht
0d6962f935 cpu config: map deprecated IceLake-Client CPU type to IceLake-Server
the former CPU type never existed on the market and will be dropped
by QEMU 7.1, so map it to the server variant as they're pretty much
identical anyway FWICT.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-08-30 09:09:13 +02:00
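A mapping like the one in 0d6962f935 can be pictured as a small lookup table applied before the CPU type reaches the command line. The table and function names here are hypothetical, and the model names are spelled as in the commit subject.

```python
# Assumption: a single deprecated entry, spelled as in the commit subject.
DEPRECATED_CPU_TYPES = {
    "IceLake-Client": "IceLake-Server",
}


def resolve_cpu_type(cputype):
    """Map a deprecated CPU type to its replacement; pass others through."""
    return DEPRECATED_CPU_TYPES.get(cputype, cputype)
```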
Thomas Lamprecht
6884a7d7fa fix #4115: enable option to name QEMU threads after their main purpose
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-06-17 14:25:49 +02:00
Thomas Lamprecht
188eb9c374 tests: preset RBD fsid to avoid unavailable rados command
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-28 18:20:52 +02:00
Alexandre Derumier
6b4320545d add test for virtio-balloon free-page-reporting=on. (qemu 6.2)
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
2022-04-27 11:09:04 +02:00
Alexandre Derumier
c70e4ec397 memory: enable balloon free-page-reporting for auto-memory reclaim
Allow balloon device  driver to report hints of guest free pages to
the host, for auto memory reclaim

https://lwn.net/Articles/759413/
https://events19.linuxfoundation.org/wp-content/uploads/2017/12/KVMForum2018.pdf

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
[ T: fixup tests ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-27 11:08:50 +02:00
Fabian Grünbichler
e594231bf1 migrate: move tunnel-helpers to pve-guest-common
besides the log calls these don't need any parts of the migration state,
so let's make them generic and re-use them for container migration and
replication in the future.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2022-02-09 18:49:55 +01:00
Fabian Ebner
fe2c506926 snapshot: implement __snapshot_activate_storages
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2022-02-08 10:43:03 +01:00
Nicholas Sherlock
d806b017ac pci: allow override of PCI vendor/device ids
This allows mobile- and vGPUs to be presented to the guest as if they
were the original desktop variants of the card. It also allows
device-ID variants that guests don't know about to be renamed to
match compatible sibling devices the guest does have drivers for
(e.g. to remove manufacturer-specific vendor ID variants that prevent
the use of a device which would otherwise have a supported chipset)

e.g. hostpci0: 03:00,vendor-id=0x8086,device-id=0x10f6

Signed-off-by: Nicholas Sherlock <n.sherlock@gmail.com>
Reviewed-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-by: Dominik Csapak <d.csapak@proxmox.com>
2022-01-25 10:59:23 +01:00
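To show how the example value from d806b017ac breaks down, here is a minimal parser sketch. It is not the property-string parser used by qemu-server; `parse_hostpci_value` is a hypothetical helper that only understands the comma-separated `key=value` form shown in the commit message.

```python
def parse_hostpci_value(value):
    """Split a hostpci value into the host address and its options."""
    host, *options = value.split(",")
    parsed = {"host": host}
    for option in options:
        key, _, val = option.partition("=")
        parsed[key] = val
    # The ID overrides are hexadecimal; int() with base 16 accepts the 0x prefix.
    for key in ("vendor-id", "device-id"):
        if key in parsed:
            parsed[key] = int(parsed[key], 16)
    return parsed
```

Applied to `03:00,vendor-id=0x8086,device-id=0x10f6`, this yields the host address `03:00` plus the two numeric override IDs.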
Fabian Ebner
e5a6919c38 cfg2cmd: turn smm off when SeaBIOS and serial display are used
Since commit 277d33454f77ec1d1e0bc04e37621e4dd2424b67 in pve-qemu,
smm=off is no longer the default, but with SeaBIOS and serial display,
this can lead to a boot loop.

Reported in the community forum [0] and reproduced with a Debian 10
VM.

[0]: https://forum.proxmox.com/threads/pve-7-0-all-vms-with-cloud-init-seabios-fail-during-boot-process-bootloop-disk-not-found.97310/post-427129

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-11-11 10:32:17 +01:00
Thomas Lamprecht
cc18103635 cfg2cmd: switch off ACPI hotplug on bridges for q35 VMs
See commit 17858a1695 (hw/acpi/ich9: Set ACPI PCI hot-plug as default
on Q35)[0] in upstream QEMU repository for details about why the change
was made.

As that change affects systemd's predictable interface naming[1],
e.g., by going from a previous `ens18` name to `enp6s18`, it may
have rather bad effects for users that did not set up some .link files
to enforce a specific naming by more stable information like the
NIC's MAC address.

The alternative would be making the preferred mode of hotplug an
option like `hotplug-mode=<acpi|pcie>`, but it does not seem like
one would like to change that much in the first place...

Note the changes to the tests and especially the tests with q35
machines that did not change.

[0]: https://gitlab.com/qemu-project/qemu/-/commit/17858a1695
[1]: https://www.freedesktop.org/software/systemd/man/systemd.net-naming-scheme.html#Naming

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Acked-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Reviewed-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-11-04 15:30:30 +01:00
Thomas Lamprecht
02cfca4b71 tests: cfg2cmd: add a few q35 related tests
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Acked-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Reviewed-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-11-04 15:30:30 +01:00
Thomas Lamprecht
d08e787cae test: cfg2cmd: fix command output
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-11-03 16:46:28 +01:00
Dominik Csapak
90b20b152c use non SMM ovmf code file for i440fx machines
ovmf with SMM enabled will not boot on i440fx (hangs on graphics
initialization), so load the non SMM variant.

should be no issue regarding live-migration since it never worked with
this anyway.

adapts the test and adds one with q35

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-by: Stefan Reiter <s.reiter@proxmox.com>
Reviewed-by: Stefan Reiter <s.reiter@proxmox.com>
2021-10-21 12:38:58 +02:00
Thomas Lamprecht
3d0ee5d41c tests: fixup simple1-template.conf.cmd
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-10-20 18:18:31 +02:00
Thomas Lamprecht
39c55c8f6e tests: cfg2cmd: add 4MB-EFI-secboot and TPM test
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-10-05 20:20:51 +02:00
Thomas Lamprecht
738dc81cba further improve on #3329, ensure write-back is used over write-around
Suggested-by: Rick Altherr <kc8apf@kc8apf.net>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-07-05 20:47:50 +02:00
Thomas Lamprecht
5620282fbd cfg2cmd: add btrfs-store and test for cache mode
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-06-24 18:42:55 +02:00
Fabian Grünbichler
85fcf79e21 template: add -snapshot to KVM command
this allows effectively setting ALL volumes as read-only, even if the
disk controller does not support it. without it, IDE and SATA disks
with (base) volumes which are marked read-only/immutable on the storage
level prevent the template VM from starting for backup purposes.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2021-06-23 12:37:40 +02:00
Fabian Grünbichler
2c53ff94fa test: add template drive read-only tests
ensuring the current behaviour:

templates will pass readonly=on to Qemu, except for SATA and IDE drives
which don't support that flag.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2021-06-23 12:37:40 +02:00
Fabian Grünbichler
75c430cee8 test: unbreak restore_config_test
for unprivileged users (and possibly some root setups). reading from
pmxcfs now results in a hard error for unprivileged users, so there
might be some more of these lurking somewhere..

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2021-06-23 12:27:54 +02:00
Stefan Reiter
6d5673c3b6 cfg2cmd: make io_uring default
The 'aio' setting is not visible to the guest, and so can be changed
during migrations or snapshots without issue. It is thus only
dependent on the actual QEMU version being >= 6.0, not machine
version.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-06-23 12:02:44 +02:00
Fabian Ebner
cc1cdadbf4 test: fix restore config test as unprivileged user
after upgrading to bullseye, the cfs_read_file call within
restore_update_config_line() results in an error:
    Is a directory!
when done as an unprivileged user.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-06-18 17:26:24 +02:00
Thomas Lamprecht
9da0feb5e5 cfg2cmd: add test for efidisk rbd cache handling
I don't think this is something which will get broken by accident but
still nice to "document" this behavior in a regression test

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-06-16 15:24:57 +02:00
Stefan Reiter
378ad769dd cfg2cmd: use long form QEMU parameters to avoid warning in 6.0
QEMU warns us about this:

kvm: -chardev socket,id=qmp,path=/var/run/qemu-server/100.qmp,server,nowait: warning: short-form boolean option 'server' deprecated
Please use server=on instead
kvm: -chardev socket,id=qmp,path=/var/run/qemu-server/100.qmp,server,nowait: warning: short-form boolean option 'nowait' deprecated
Please use wait=off instead
kvm: -vnc unix:/var/run/qemu-server/100.vnc,password: warning: short-form boolean option 'password' deprecated
Please use password=on instead

The new syntax is backwards compatible to at least QEMU 4.0.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-05-28 11:31:15 +02:00
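The rewrite in 378ad769dd follows directly from the three warnings quoted above. A sketch of that translation, using a hypothetical `to_long_form` helper and covering only the three short-form options named in the warnings:

```python
# Replacements taken from the QEMU warnings quoted in the commit message.
SHORT_TO_LONG = {
    "server": "server=on",
    "nowait": "wait=off",
    "password": "password=on",
}


def to_long_form(option_string):
    """Replace known short-form boolean options in a comma-separated string."""
    parts = option_string.split(",")
    return ",".join(SHORT_TO_LONG.get(part, part) for part in parts)
```

Options that already carry a value, such as `id=qmp`, never match a table key and are left untouched.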
Fabian Ebner
0783c3c271 migration: move finishing block jobs to phase2 for better/uniform error handling
avoids the possibility to die during phase3_cleanup and instead of needing to
duplicate the cleanup ourselves, benefit from phase2_cleanup doing so.

The duplicate cleanup was also very incomplete: it didn't stop the remote kvm
process (leading to 'VM already running' when trying to migrate again
afterwards), but it removed its disks, and it didn't unlock the config, didn't
close the tunnel and didn't cancel the block-dirty bitmaps.

Since migrate_cancel should do nothing after the (non-storage) migrate process
has completed, even that cleanup step is fine here.

Since phase3 is empty at the moment, the order of operations is still the same.

Also add a test, that would complain about finish_tunnel not being called before
this patch. That test also checks that local disks are not already removed
before finishing the block jobs.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
eb5751ba02 migration: cleanup_remotedisks: simplify and include more disks
Namely, those migrated with storage_migrate by using the information from
volume_map. Call cleanup_remotedisks in phase1_cleanup as well, because that's
where we end if sync_offline_local_volumes fails, and some disks might already
have been transferred successfully. Note that the local disks are still here, so
this is fine.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
93a1c63f4c test: migration: add parse_volume_id calls
so it fails when something bad comes in.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
eabac302ba restore: update config: remove unused parameter
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:10:28 +02:00
Fabian Ebner
c62d7cf547 test: add tests for restoring config
Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:10:28 +02:00
Stefan Reiter
d4be7f31b5 cfg2cmd: fix +pveN machine types with pxe
Pinned machine versions like "pc-i440fx-4.2+pve2.pxe" would otherwise
get a second "+pve0" suffix, which is incorrect.

Also deal with non-pve pinned versions correctly, i.e.
"pc-i440fx-5.2.pxe" becomes "pc-i440fx-5.2+pve0.pxe".

Handle .pxe suffixes in Machine.pm as well, and add two test cases.

Co-developed-by: Luca Berneking <luca@berneking.net>
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-04-18 17:58:56 +02:00
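The suffix handling fixed in d4be7f31b5 can be sketched as: strip a trailing ".pxe", add "+pve0" only if no "+pveN" is present, then put ".pxe" back. The function below is a hypothetical illustration; treating a pinned type without ".pxe" the same way is an assumption of this sketch.

```python
import re


def normalize_machine_type(machine):
    """Ensure exactly one +pveN suffix, placed before an optional .pxe."""
    pxe_suffix = ".pxe"
    has_pxe = machine.endswith(pxe_suffix)
    base = machine[: -len(pxe_suffix)] if has_pxe else machine
    if not re.search(r"\+pve\d+$", base):
        base += "+pve0"  # non-pve pinned version: add the default suffix
    return base + (pxe_suffix if has_pxe else "")
```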
Stefan Reiter
27a5be5376 snapshot: set migration caps before savevm-start
A "savevm" call (both our async variant and the upstream sync one) uses
migration code internally. As such, both variants expect migration
capabilities to be set.

This is usually not a problem, as the default set of capabilities is ok,
however, it leads to differing snapshot settings if one does a snapshot
after a machine has been live-migrated (as the capabilities will persist
from that), which could potentially lead to discrepancies in snapshots
(currently it seems to be fine, but it still makes sense to set them to
safeguard against future changes).

Note that we do set the "dirty-bitmaps" capability now (if
query-proxmox-support reports true), which has three effects:

1) PBS dirty-bitmaps are preserved in snapshots, enabling
   fast-incremental backups to work after rollback (as long as no newer
   backups exist), including for hibernate/resume
2) snapshots taken from now on, with a QEMU version supporting bitmap
   migration, *might* lead to incompatibility of these snapshots with
   QEMU versions that don't know about bitmaps at all (i.e. < 5.0 IIRC?)
   - forward compatibility is still given, and all other capabilities we
   set go back to very old versions
3) since we now explicitly disable bitmap saving if the version doesn't
   report support, we avoid crashes even with not-updated QEMU versions

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-03-16 20:44:51 +01:00
Thomas Lamprecht
4dd1e83c75 always pin windows VMs to a machine version by default
A fix for violating an important standard for booting[0] in the recently
packaged QEMU 5.2 surfaced some issues with Windows based VMs in our
forum[1], which seem to be quite sensitive to such changes (it seems
they derive lots of their device assignment from ACPI).
User visible effects are the loss of any network configuration, due to
Windows thinking the NIC was swapped with a new one and starting with a
fresh config - this is mostly problematic for setups with static
address assignment.

There may be lots of other, more subtle, effects and the PVE admin is
also not always the VM admin, so we really need to avoid such
negative effects. Do this by pinning the version of any windows based
VMs to either the minimum of (5.1, kvm-version) for existing VMs or
the kvm-version at time of VM creation for new ones.

There are patches in pve-manager for users to be able to change the
pinned version themselves in the web interface, so this can now also get
adapted more easily if any other issues surface (with a new or
old version) in the future.

0: https://lists.gnu.org/archive/html/qemu-devel/2021-02/msg08484.html
1: https://forum.proxmox.com/threads/warning-latest-patch-just-broke-all-my-windows-vms-6-3-4-patch-inside.84915/page-2#post-373331

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-03-05 20:46:46 +01:00
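The pinning rule from 4dd1e83c75, "minimum of (5.1, kvm-version) for existing VMs, kvm-version for new ones", can be written down as a short sketch. The function names and the plain "major.minor" strings are assumptions; how qemu-server detects Windows guests or stores the pinned version is not shown.

```python
def parse_version(version):
    """Turn 'major.minor' into a tuple so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def pinned_windows_version(kvm_version, existing_vm):
    """Return the machine version a Windows VM would be pinned to."""
    if existing_vm:
        # Existing VMs: never pin above 5.1, and never above the installed QEMU.
        return min("5.1", kvm_version, key=parse_version)
    # New VMs: pin to the QEMU version at creation time.
    return kvm_version
```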
Stefan Reiter
483c9676f8 snapshot-test: mock query-savevm better
Otherwise the new printing functions produce warnings about undefined
numbers. These stats are guaranteed to be returned by real QEMU, so mock
them with some sensible values.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-02-25 21:20:06 +01:00
Fabian Ebner
8b70288e7d test: migrate: correctly mock storage module
by fixing a typo. Since cfs_read_file within the storage module was not mocked,
the tests could fail on some setups. Now that get_bandwidth_limit is mocked,
cfs_read_file is not called anymore, but still mock it too for good measure and
to make it more future-proof.

Reported-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-02-08 16:25:49 +01:00
Thomas Lamprecht
0435f8798c buildsys: clean: remove migration test runtime files
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-01-12 12:01:41 +01:00
Fabian Ebner
c97a9c6ed8 tests: mock storage locking for migration tests
by doing it in a local directory instead of /var/lock/pve-manager, which is
used by the installed/non-test PVE code. This also covers the shared case,
which will become relevant after fixing #3229 (currently migration doesn't
touch disks on shared storages).

Reported-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-01-12 11:56:55 +01:00
Fabian Ebner
0494299f57 tests: allow running migration tests in parallel
It's not easily possible to use separate JSON files for the test configuration,
because part of it is generated with perl code. While this could be encoded too,
it seems cleaner to use the "run a single test by specifying the name"
functionality while adding a way for make to obtain a list of the test names.

Each test has (and needs) its own directory now, meaning the log files do not
need to be renamed anymore.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-12-18 17:47:27 +01:00
Fabian Ebner
48831384b8 create test environment for migration
and the associated parts for 'qm start'.

Each test will first populate the MigrationTest/run directory
with the relevant configuration files and files keeping track of the
state of everything necessary. Second, the mock-script for migration
is executed, which in turn will execute the 'qm start' mock-script
(if it's an online test that gets far enough). The scripts will simulate
a migration and update the relevant files in the MigrationTest/run directory.
Finally, the main test script will evaluate the state.

The main checks are the volume IDs on the source and target and the VM
configuration itself. Additional checks are the vm_status and expected_calls,
keeping track if certain calls have been made.

The rationale behind creating two mock-scripts is two-fold:
1. It removes the need to hard code responses for the tunnel
   and to recycle logic for determining and allocating migration volumes.
   Some of that logic already happens in the API part, so it was necessary
   to mock the whole CLI-Handler.
2. It allows testing the code relevant for migration in 'qm start' as well,
   and it should even be possible to test different versions of the
   mock-scripts against each other. With a bit of extra work and things
   like 'git worktree', it might even be possible to automate this.

The helper get_patched config is useful to change pre-defined configuration
files on the fly, avoiding the need to explicitly define whole configurations to
test for something in many cases.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-12-15 15:21:37 +01:00
Stefan Reiter
27b25d037e config_to_command: use -no-shutdown option
Ignore shutdowns triggered from within the guest in favor of detecting
them via qmeventd and stopping the QEMU process that way.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-11-05 11:22:47 +01:00
Thomas Lamprecht
2bf945fcb9 tests: make module truthy
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-10-19 15:33:16 +02:00
Thomas Lamprecht
ce11958aab tests: do not use for-loop for globs
they are rather inefficient for this

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-10-19 15:32:31 +02:00
Thomas Lamprecht
808a65b522 fix some FH close
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-10-19 15:30:53 +02:00
Thomas Lamprecht
f7d1505b0c tree wide cleanups
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-10-16 18:03:32 +02:00
Thomas Lamprecht
d1c1af4b02 tree wide cleanup of s/return undef/return/
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-10-16 16:20:05 +02:00
Stefan Reiter
6c4f3e6d15 cfg2cmd: add tests for new boot order property
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-10-14 12:30:50 +02:00
Stefan Reiter
3441a023dd cfg2cmd: add test for legacy-style bootorder
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-10-14 12:30:50 +02:00
Thomas Lamprecht
6e5bda530e tests: add cfg2cmd test for virtio-blk disk with iothread on
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-09-02 13:27:27 +02:00
Thomas Lamprecht
3eb2f3eb56 followup cleanup
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-08-21 10:44:00 +02:00
Thomas Lamprecht
1e27bda1aa tests: cfg2cmd: check also warnings
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-08-21 10:38:27 +02:00
Thomas Lamprecht
90d96715f8 tests: cfg2cmd: get testname earlier
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-08-21 10:38:08 +02:00
Aaron Lauterer
789fe8e818 cfg2cmd: vga: fix #2749: disable edid for Win+BIOS+VGA machines
Edid support was added with Qemu 5. Windows guests seem to not be able
to get all possible resolutions if the default std VGA device is used as
GPU and the VM boots in BIOS mode. The result is that only one of the
following three resolutions can be configured:

800x600
1024x768
1920x1080

It is important to note that just booting a Windows VM with the edid=off
parameter will not make the large list of resolutions available. It
seems that Windows is caching the list of possible resolutions
somewhere [0].

Uninstalling the 'Microsoft Basic Display Adapter' in the device manager
and rebooting the VM is one way I found to force Windows to recreate the
list of possible resolutions.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>

[0] https://lists.nongnu.org/archive/html/qemu-devel/2020-07/msg07128.html
2020-08-19 18:22:43 +02:00
Fabian Grünbichler
121e340094 cfg2cmd test: hardcode/mock bridge MTU
otherwise the netdev test reads the MTU value from the test host's vmbr0
bridge, or fails if no such bridge exists.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-06-17 10:39:47 +02:00
Thomas Lamprecht
6f40c2d101 cfg2cmd: add simple MTU test
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-06-04 11:19:13 +02:00
Dominik Csapak
39322a9341 test: add test for OVF with missing default rasd namespace
sometimes vendors do not put the 'rasd' namespace in the top-level
Envelope, but in every 'rasd' element. This adds a test for that case.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2020-04-27 13:09:51 +02:00
Dominik Csapak
31bf5a0f2b test: print more info when OVF parsing fails
When one of the ovf tests fails to parse at all, we just get the
'die' message of the failing component, but not which file actually
failed to parse.

To get better output, also convert the parsing to a test, calling ok() or
fail() respectively and then printing the error.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2020-04-27 13:08:58 +02:00
Stefan Reiter
5bc084707b cfg2cmd: add test cases for custom CPU models
Requires a mock CPU-model config, which is given as a raw string to also
test parsing capabilities. Also tests defaulting behaviour.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Reviewed-By: Fabian Ebner <f.ebner@proxmox.com>
Tested-By: Fabian Ebner <f.ebner@proxmox.com>
2020-04-07 17:27:58 +02:00
Stefan Reiter
c4581b9cc5 Rework get_cpu_options and allow custom CPU models
If a cputype is custom (check via prefix), try to load options from the
custom CPU model config, and set values accordingly.

While at it, extract currently hardcoded values into a separate sub and add
reasoning.

Since the new flag resolving outputs flags in sorted order for
consistency, adapt the test cases to not break. Only the order is
changed, not which flags are present.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Reviewed-By: Fabian Ebner <f.ebner@proxmox.com>
Tested-By: Fabian Ebner <f.ebner@proxmox.com>
2020-04-07 17:27:58 +02:00
Fabian Grünbichler
0c498cca36 vm_start: condense signature
as preparation for refactoring it further. remote migration will add
another 1-2 parameters, and it is already unwieldy enough as it is.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-04-01 17:42:15 +02:00
Dominik Csapak
818ce80ec1 fix efidisks on storages with minimum sizes bigger than OVMF_VARS.fd
on storages where the minimum size of images is bigger than the real
OVMF_VARS.fd file, they get padded to their minimum size

when using such an image, qemu maps it fully to the vm, but the efi
does not find the vars region and creates a file on the first efi
partition it finds

this breaks some settings in the ovmf, such as resolution

to fix this, we have to specify the size for the pflash, so that
qemu only maps the first n bytes in the vm (this only works for
raw files, not for qcow2)

we also have to use the correct size when converting between storages
in 'clone_disk' (used for move disk and cloning vms) and when
live migrating to different storages

when we now expect that the source image is always correctly used/created
(e.g. raw with size=x in pflash argument) then we always create the
target correctly

when encountering users who have a non-valid image (e.g. an efidisk
moved from zfs to qcow2 before this patch), we have to tell them to
recreate the efidisk and the settings on it

we have to version_guard it to 4.1+pve2 (since we haven't bumped yet
since the change to pve2)

also add 2 tests, one for the old version and one for the new

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-by: Stefan Reiter <s.reiter@proxmox.com>
Reviewed-by: Stefan Reiter <s.reiter@proxmox.com>
[ Thomas: rebased to master ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-03-30 09:41:55 +02:00
Stefan Reiter
a04dd5c455 Simplify QEMU version check and require 3.0+
Some of the recent QMP changes require at least 2.8.0, but since the
oldest version we officially package for 6.x is 4.0.0 anyway, checking
for at least 3.0 should not break anyone's setup.

Note that this does not affect machine version checks, only the
installed QEMU binary version.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-02-12 11:19:29 +01:00
Stefan Reiter
ac0077cc33 Use 'QEMU version' -> '+pve-version' mapping for machine types
The previously introduced approach can fail for pinned versions when a
new QEMU release is introduced. The saner approach is to use a mapping
that gives one pve-version for each QEMU release.

Fortunately, the old system has not been bumped yet, so we can still
change it without too much effort.

QEMU versions without a mapping are assumed to be pve0, 4.1 is mapped to
pve1 since that's what we had as our default previously.

Pinned machine versions (i.e. pc-i440fx-4.1) are always assumed to be
pve0, for specific pve-versions they'd have to be pinned as well (i.e.
pc-i440fx-4.1+pve1).
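
As a rough illustration of the pinning rule above, the following Python sketch splits a pinned machine type into its components and treats a missing "+pveN" suffix as pve0. The pattern and the `parse_machine` helper are hypothetical and written for this example only; they are not the regex or helpers used in qemu-server.

```python
import re

# Hypothetical pattern for illustration only; not the qemu-server regex.
MACHINE_RE = re.compile(
    r"(?P<type>.+)-(?P<major>\d+)\.(?P<minor>\d+)(?:\+pve(?P<pve>\d+))?"
)


def parse_machine(machine):
    """Split a pinned machine type into (type, major, minor, pve)."""
    m = MACHINE_RE.fullmatch(machine)
    if m is None:
        raise ValueError(f"not a pinned machine type: {machine!r}")
    # A missing "+pveN" suffix is treated as pve0, as described above.
    pve = int(m.group("pve") or 0)
    return m.group("type"), int(m.group("major")), int(m.group("minor")), pve
```

With this sketch, `parse_machine("pc-i440fx-4.1")` yields a pve component of 0, while `parse_machine("pc-i440fx-4.1+pve1")` yields 1.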

The new logic also makes the pve-version dynamic, and starts VMs with
the lowest possible 'feature-level', i.e. if a feature is only available
with 4.1+pve2, but the VM isn't using it, we still start it with
4.1+pve0.

We die if we don't support a version that is requested from us. This
allows us to use the pve-version as live-migration blocks (i.e. bumping
the version and then live-migrating a VM which uses the new feature (so
is running with the bumped version) to an outdated node will present the
user with a helpful error message and fail instead of silently modifying
the config and only failing *after* the migration).

$version_guard is introduced in config_to_command to use for features
that need to check pve-version, it automatically handles selecting the
newest necessary pve-version for the VM.
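
The idea of such a guard can be sketched in Python as follows. This is a simplified model under stated assumptions, not the Perl closure from this patch: the `VersionGuard` class, its attributes, and the way the supported pve-version is passed in are all hypothetical.

```python
class VersionGuard:
    """Tracks the lowest pve-version that covers all features in use."""

    def __init__(self, major, minor, max_pve):
        # QEMU machine version the VM is started with (assumed inputs).
        self.major, self.minor = major, minor
        # Highest pve-version this node supports for that QEMU release.
        self.max_pve = max_pve
        # Start at the lowest feature level and raise it only on demand.
        self.required_pve = 0

    def __call__(self, major, minor, pve):
        """Return True if a feature needing major.minor+pve<pve> may be used."""
        if (self.major, self.minor) < (major, minor):
            return False  # machine version is too old for the feature
        if (self.major, self.minor) == (major, minor):
            if pve > self.max_pve:
                return False  # node does not know this pve-version
            self.required_pve = max(self.required_pve, pve)
        return True
```

A caller would invoke the guard only for features the VM actually uses, for example `guard(4, 1, 2)`, and afterwards start the VM with `guard.required_pve`. A VM that uses no guarded feature therefore stays at pve0.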

Tests have to be adjusted, since all of them now resolve to pve0 instead
of pve1. EXPECT_ERROR matching is changed to use 'eq' instead of regex
to allow special characters in error messages.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-02-12 10:32:57 +01:00
Dominik Csapak
844d8fa628 move the vmgenid device after readconfig on q35
and adapt the tests

this does not impact live migration, since the order here does not
change the device layout

we want this to consistently have the readconfig first

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2020-01-31 20:26:26 +01:00
Thomas Lamprecht
ae200950d4 grammar fix: s/does not exists/does not exist/g
bump versioned build-dependency, as qemu-server has tests checking
for errors, and we fixed a grammar error in pve-storage, so we need
the newer version to ensure our tests go through

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-12-13 12:20:56 +01:00
Thomas Lamprecht
a546da0319 cfg2cmd: allow to test for expected error messages
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-12-10 11:08:33 +01:00
Thomas Lamprecht
38277afcd4 qemu-server: make nodename mock-able for tests
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-12-10 11:08:33 +01:00
Stefan Reiter
6db4c69e1d cfg2cmd: test runs_at_least_qemu_version and version_cmp explicitly
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2019-12-09 11:42:49 +01:00
Stefan Reiter
8b26544e50 cfg2cmd: minor cleanup
We never shipped a 4.1.0 QEMU, so it makes more sense to test as 4.1.1

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2019-12-09 11:42:49 +01:00
Dominik Csapak
844b55fb89 fix #2510: hostpci: always check if device exists
if the user set a device as hostpci with the 'shorthand' syntax:

hostpciX: 00:12

we ignored it on starting and showcmd and continued.
Since the user explicitly wanted to passthrough a device, we now check
if there is actually a device with that id

for explicitly configured devices (00:12.1), we did not check if it exists,
but the kvm call failed with a non-obvious error message

now we always call 'lspci' from SysFSTools to check if it actually exists,
and fail if not. With this, we can drop the workaround for adding
'0000' if no domain was given, since lspci does it already for us

this fixes #2510, an issue with using mediated devices where the users did not have
the domain in the config, since we forgot to add the default domain there

the only issue with this patch is that it changes the behaviour of
'showcmd' slightly: we now die if the device was explicitly given
but does not exist (we showed the commandline, now we fail)

this also slightly changes the commandline for qemu (adding always
the domain), which is not a problem since we cannot live migrate
or snapshot such vms, but we have to adapt the tests
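
A minimal Python sketch of the lookup behaviour described above, assuming a plain list of known device addresses in place of the real 'lspci' helper from SysFSTools. The function names and the device list are hypothetical and only illustrate the default domain and the "fail if missing" rule.

```python
def with_default_domain(pci_id):
    """Prefix the default domain '0000' if the id has none."""
    # "bus:slot[.func]" has one colon, "domain:bus:slot[.func]" has two.
    return pci_id if pci_id.count(":") == 2 else f"0000:{pci_id}"


def find_devices(pci_id, known):
    """Return all known devices matching pci_id, or fail if there are none."""
    full = with_default_domain(pci_id)
    if "." in full:
        # Explicit function given, e.g. 00:12.1: exact match only.
        matches = [dev for dev in known if dev == full]
    else:
        # Shorthand, e.g. 00:12: match every function of that slot.
        matches = [dev for dev in known if dev.split(".")[0] == full]
    if not matches:
        raise LookupError(f"no PCI device found for '{pci_id}'")
    return matches
```

Given a device list containing `0000:00:12.0` and `0000:00:12.1`, the shorthand `00:12` resolves to both functions, `00:12.1` resolves to one, and an unknown id raises an error instead of being silently ignored.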

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2019-12-09 11:30:14 +01:00
Dominik Csapak
0360faadc7 cfg2cmd test: add tests for multifunction devices
by mocking the lspci call

the mocked lspci code is basically the same as the real one,
only difference is the source of the devices and
there is no verbose flag

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2019-12-09 11:30:14 +01:00
Thomas Lamprecht
bdd1feef5b fix #2469: fix qemu-img convert src_format detection
This reverts commit c5151cb8bb which is
a revert of the wrongly done revert of
commit e2414e73ce.
2019-12-09 10:31:33 +01:00
Thomas Lamprecht
c5151cb8bb Revert "fix #2469: fix qemu-img convert src_format detection"
This reverts commit e2414e73ce.
2019-11-26 13:06:57 +01:00
Thomas Lamprecht
9471e48bf9 implement PVE Version addition for QEMU machine
With our QEMU 4.1.1 package we can pass an additional internal version
to QEMU's machine, it will be split out there and ignored, but
returned on a QMP 'query-machines' call.

This allows us to use it for increasing the granularity with which we
can roll-out HW layout changes/additions for VMs. Until now we
required a machine version bump, which normally happens every major
release of QEMU, with rare exceptions that are irrelevant for us.
This often delays rolling out a feature, which would break
live-migration, by several months. That can now be avoided, the new
"pve-version" component of the machine can be bumped at will, and
thus we are much more flexible.

That version orders after the ($major, $minor) version components
of a stable release - it can thus also be reset on the next
release.

The implementation extends the qemu-machine REGEX, remembers
"pve-version" when doing a "query-machines" and integrates support
into the min_version and extract_version helpers.

We start out with a version of 1.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Reviewed-by: Stefan Reiter <s.reiter@proxmox.com>
2019-11-25 16:43:38 +01:00
Fabian Grünbichler
e2414e73ce fix #2469: fix qemu-img convert src_format detection
if we don't know which format the source volume/file has, let qemu-img
decide.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-11-25 13:45:46 +01:00
Thomas Lamprecht
050fcfdd98 cfg2cmd test: fix spice enhancement test
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2019-11-25 07:45:44 +01:00