Commit Graph

226 Commits

Thomas Lamprecht
e693c49190 migration: factor out variable + code cleanup
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-04-19 21:51:21 +02:00
Thomas Lamprecht
7de328c629 migration: log: s/migration_caps/migration capabilities/
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-04-19 21:48:31 +02:00
Thomas Lamprecht
a89bd10084 migration: do not set default speed limit
the claim that QEMU limits this to 32M otherwise is bogus, at least
with any current QEMU version.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-04-19 21:46:52 +02:00
Thomas Lamprecht
6539865a9d migration: refactor and tidy-up code
Use an early die so that the rest can lose an indentation level for
the actual migration status reporting code.

Extract commonly used members of the stat hash for shorter code.

Use `git show -w --word-diff=color --word-diff-regex='\w+'` to get
a better view of the actual changes.
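
As a rough illustration of the pattern (the surrounding code is invented for this sketch; the 'ram'/'status' fields follow QMP's query-migrate output):

    use strict;
    use warnings;

    # sketch only: die early on bad input so the status reporting below
    # does not need an extra if/else indentation level
    sub report_migration_status {
        my ($stat) = @_;

        die "unable to parse migration status\n"
            if !$stat || !$stat->{status};

        # pull the commonly used members out of the stat hash once
        my $ram = $stat->{ram} // {};
        my ($total, $transferred) = ($ram->{total} // 0, $ram->{transferred} // 0);

        printf "migration status: %s (%d of %d bytes transferred)\n",
            $stat->{status}, $transferred, $total;
    }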

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-04-19 14:59:54 +02:00
Fabian Ebner
0783c3c271 migration: move finishing block jobs to phase2 for better/uniform error handling
This avoids the possibility of dying during phase3_cleanup and, instead of
needing to duplicate the cleanup ourselves, lets us benefit from phase2_cleanup
doing so.

The duplicate cleanup was also very incomplete: it didn't stop the remote kvm
process (leading to 'VM already running' when trying to migrate again
afterwards) even though it removed its disks, and it didn't unlock the config,
close the tunnel or cancel the block-dirty bitmaps.

Since migrate_cancel should do nothing after the (non-storage) migrate process
has completed, even that cleanup step is fine here.

Since phase3 is empty at the moment, the order of operations is still the same.

Also add a test that would have complained about finish_tunnel not being called
before this patch. That test also checks that local disks are not already removed
before finishing the block jobs.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
a6be63ac9b migration: split out replication from scan_local_volumes
and avoid one loop over the config by extending foreach_volid to include the
drivename.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
4b26ffbfa5 migration: keep track of replicated volumes via local_volumes
by extending filter_local_volumes.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
efe0d457c6 migration: use storage_migration for checks instead of online_local_volumes
This way we don't need to worry about auto-vivification.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
eb5751ba02 migration: cleanup_remotedisks: simplify and include more disks
Namely, those migrated with storage_migrate, by using the information from
volume_map. Call cleanup_remotedisks in phase1_cleanup as well, because that's
where we end up if sync_offline_local_volumes fails, and some disks might already
have been transferred successfully. Note that the local disks are still here, so
this is fine.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
ad8b9d5e2d migration: simplify removal of local volumes and get rid of self->{volumes}
This also changes the behavior to remove the local copies of offline migrated
volumes only after the migration has finished successfully (this is relevant
for mixed settings, e.g. online migration with unused/vmstate disks).

local_volumes contains both the volumes previously in $self->{volumes}
and the volumes in $self->{online_local_volumes}, and hence is the place
to look for the volumes we need to remove. Of course, replicated
volumes still need to be skipped.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
efbbe59da4 migration: add nbd migrated volumes to volume_map earlier
and avoid a little bit of duplication by creating a helper

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
c3417e3b6e migration: save targetstorage and bwlimit in local_volumes hash and re-use information
It is enough to call get_bandwidth_limit once for each source_storage.
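
A minimal sketch of the idea, with the hypothetical get_limit_for_storage() standing in for the actual bandwidth-limit lookup and the volume hash invented for the demo:

    use strict;
    use warnings;

    # hypothetical stand-in for the real bandwidth-limit lookup
    sub get_limit_for_storage { my ($sid) = @_; print "lookup for $sid\n"; return 10240 }

    my $local_volumes = {
        'local-lvm:vm-100-disk-0' => {},
        'local-lvm:vm-100-disk-1' => {},
    };

    # sketch: look the limit up once per source storage and cache it,
    # instead of once per volume
    my %limit_cache;
    foreach my $volid (sort keys %$local_volumes) {
        my ($sid) = split(/:/, $volid, 2);                     # storage ID part of "storage:volname"
        $limit_cache{$sid} //= get_limit_for_storage($sid);    # only one lookup per storage
        $local_volumes->{$volid}->{bwlimit} = $limit_cache{$sid};
    }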

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
2c4ba4c3ee migration: fix calculation of bandwidth limit for non-disk migration
The case with:
1. no generic 'migration' limit from the storage plugin
2. a migrate_speed limit in the VM config
was broken. It would assign 0 to migrate_speed when picking the minimum value
and then default to the default value. Fix it by checking if bwlimit is 0
before picking the minimum.

Also, make it a bit more readable by avoiding the trick of //-assigning bwlimit
before the units match up and relying on getting back the original bwlimit value
as the minimum. Instead, only ||-assign after the units match up and drop the
reliance on that round trip.
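
A minimal sketch of the fixed precedence; variable names, units and the conversion factor are illustrative, not the actual code:

    use strict;
    use warnings;

    # $storage_bwlimit: limit from the storage layer, 0/undef meaning "no limit"
    # $migrate_speed:   limit from the VM config, 0/undef meaning "not set"
    sub effective_bwlimit {
        my ($storage_bwlimit, $migrate_speed) = @_;

        $storage_bwlimit //= 0;
        $migrate_speed = ($migrate_speed // 0) * 1024;   # assumed MiB/s -> KiB/s conversion

        # only take the minimum if both limits are actually set (non-zero);
        # otherwise 0 would win and later be replaced by the default
        if ($storage_bwlimit && $migrate_speed) {
            return $migrate_speed < $storage_bwlimit ? $migrate_speed : $storage_bwlimit;
        }
        return $storage_bwlimit || $migrate_speed;   # whichever one is set, possibly 0
    }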

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
3276a43470 migration: split out config_update_local_disksizes from scan_local_volumes
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
62a4c963b8 migration: avoid re-scanning all volumes
by using the information obtained in the first scan. This
also makes sure we only scan local storages.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
d10b78f4d2 migration: split sync_disks into two functions
by making local_volumes class-accessible. One function is for scanning all local
volumes and one is for actually syncing offline volumes via storage_migrate. The
exception is replicated volumes; their sync still happens during the scan for now.

Also introduce a filter_local_volumes helper to make life easier.
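
A rough sketch of what such a helper might look like; the attribute used for filtering ('migration_mode') is assumed for illustration and may not match the real code:

    use strict;
    use warnings;

    # sketch only: return the volume IDs whose recorded migration mode matches
    sub filter_local_volumes {
        my ($local_volumes, $mode) = @_;

        return grep {
            !defined($mode) || ($local_volumes->{$_}->{migration_mode} // '') eq $mode
        } sort keys %$local_volumes;
    }

    # e.g. filter_local_volumes($self->{local_volumes}, 'offline')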

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
eb3acec88a migration: sort volumes migrated with storage_migrate
Having a deterministic order here is useful for testing.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-12-15 15:21:37 +01:00
Fabian Ebner
7d730f953c migration: factor out starting remote tunnel
so it can be mocked when testing.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-12-15 15:21:37 +01:00
Fabian Ebner
27fa645e66 use new move_config_to_node method
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-12-15 15:21:37 +01:00
Fabian Ebner
e219712561 deactivate volumes after storage_migrate
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-11-24 16:19:35 +01:00
Fabian Ebner
78bd57d9c3 adapt to new storage_migrate activation behavior
Offline migrated volumes are now activated within storage_migrate.
Online migrated volumes can be assumed to be already active.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-11-24 16:19:29 +01:00
Fabian Ebner
19ff368213 don't migrate replicated VM whose replication job is marked for removal
While it didn't actually fail, we probably want to avoid the following behavior:

With remove_job=full:
    * run_replication called during migration causes the replicated volumes to
      be removed
    * migration continues by fully copying all volumes

With remove_job=local:
    * run_replication called during migration causes the job (and local
      replication snapshots) to be removed
    * migration continues by fully copying all volumes and renaming them to
      avoid collision with the still existing remote volumes

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-11-09 10:08:22 +01:00
Fabian Ebner
c2c96d7378 fix checks for transferring replication state/switching job target
In some cases $self->{replicated_volumes} will be auto-vivified
to {} by checks like

    next if $self->{replicated_volumes}->{$volid};

and then {} would evaluate to true in a boolean context.

Now the replication information is retrieved once in prepare,
and used to decide whether to make the calls or not.
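
The pitfall in isolation; this self-contained demo prints the message even though nothing is replicated:

    use strict;
    use warnings;

    my $self = {};    # no replication info stored yet

    for my $volid (qw(local:vm-100-disk-0)) {
        next if $self->{replicated_volumes}->{$volid};   # lookup is false, but...
    }

    # ...the lookup auto-vivified the intermediate hashref to {}, which is
    # true in boolean context, so this wrongly looks "replicated":
    print "replicated volumes present\n" if $self->{replicated_volumes};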

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-11-09 10:08:22 +01:00
Fabian Ebner
68980d6626 Repeat check for replication target in locked section
No need to warn twice, so the warning from the outside check
was removed.

Suggested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-11-09 10:08:22 +01:00
Thomas Lamprecht
e5d611c382 fix various conditionally declared vars
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-10-16 16:52:11 +02:00
Fabian Ebner
1264d6c511 Use correct option for storage_migrate
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-08-04 13:57:09 +02:00
Stefan Reiter
b53ba8d0f1 fixup: use parse_property_string instead of parse_cpu_conf_basic
The latter was removed and replaced with a validator.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-07-09 14:45:21 +02:00
Fabian Ebner
9b29cbd0ed update_disksize: make interface leaner
Pass the new size directly, so the function doesn't need to know about
how some hash is organized, and return a message directly instead
of both size strings. Also drop the wantarray, because both
existing callers use the message anyway.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-07-01 09:18:13 +02:00
Fabian Ebner
1c2174833b sync_disks: fix check
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-07-01 09:13:06 +02:00
Fabian Grünbichler
ae194a5c5e migrate: cleanup forwarding code
fixing the following two issues:
- the legacy code path was never converted to the new fork_tunnel
signature (which probably means that nothing triggers it in practice
anymore?)
- the NBD Unix socket got forwarded multiple times if more than one disk
was migrated via NBD (this is harmless, but wrong)

for the second issue I opted to keep the code compatible with the
possibility that Qemu starts supporting multiple NBD servers in the
future (and the target node could thus return multiple UNIX socket
paths). currently we can only start one NBD server on one socket, and
each drive-mirror simply starts a new connection over that single
socket.

I took the liberty of renaming the variables/keys since I found
'tunnel_addr' and 'sock_addr' rather confusing.
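
A small self-contained sketch of the forwarding dedup; the socket paths and the print are placeholders for the real tunnel helper:

    use strict;
    use warnings;

    # several NBD-migrated disks may report the same UNIX socket path
    my %nbd_socket_of_drive = (
        'drive-scsi0' => '/run/qemu-server/100_nbd.migrate',
        'drive-scsi1' => '/run/qemu-server/100_nbd.migrate',
    );

    my %forwarded;
    for my $drive (sort keys %nbd_socket_of_drive) {
        my $sock = $nbd_socket_of_drive{$drive};
        next if $forwarded{$sock}++;              # forward each socket only once
        print "forwarding UNIX socket $sock\n";   # the real code would set up the tunnel here
    }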

Reviewed-By: Mira Limbeck <m.limbeck@proxmox.com>
Tested-By: Mira Limbeck <m.limbeck@proxmox.com>

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-05-06 16:16:50 +02:00
Dominik Csapak
cd37203880 migrate: skip rescan for efidisk and shared volumes
we really only want to rescan the disk size of the disks we actually
need, and those are only the local disks (for which we have to allocate
the correct size on the target)

also we want to always skip the efidisk, since we get the wanted
size after the loop, and this produced a confusing log line
(for details on why we do not want the 'real' size,
see commit 818ce80ec1)

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2020-05-04 17:35:12 +02:00
Fabian Grünbichler
6f4b11e9db migrate: don't accidentally take NBD code paths
by avoiding auto-vivification of $self->{online_local_volumes} via
iteration. most code paths don't care whether it's undef or a reference
to an empty list, but this caused the (already) fixed bug of calling
nbd_stop without having started an NBD server in the first place.
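
The same class of pitfall, here triggered by iteration rather than by a hash lookup (self-contained demo):

    use strict;
    use warnings;

    my $self = {};   # no NBD/online volumes recorded

    # dereferencing for aliasing in foreach auto-vivifies the entry to []
    foreach my $volid (@{$self->{online_local_volumes}}) { }

    # now the key exists as a (true) empty arrayref, so NBD-only code paths
    # such as the nbd_stop call would wrongly be taken
    print "would take NBD code path\n" if $self->{online_local_volumes};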

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-05-04 17:34:58 +02:00
Thomas Lamprecht
3e802221e1 migrate: only stop NBD if we got a NBD url from the target
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-04-29 16:22:33 +02:00
Fabian Ebner
ae180b8f08 Include vmstate and unused volumes in foreach_volid
and refactor the test_volid closure. This way, get_replicatable_volumes doesn't
need a separate loop for unused volumes anymore. get_vm_volumes, which is used
for activation/deactivation of volumes at migration and for deactivation in
vm_stop_cleanup, now includes those volumes as well. For migration it's an
improvement, because those volumes might need to be migrated, and for
vm_stop_cleanup it shouldn't hurt. The last user of foreach_volid is
check_vm_disks_local, used by migrate_vm_precondition, where information about
the additional volumes doesn't hurt either.

Note that replicate is (still) set by default, so the behavior for
get_replicatable_volumes for unused volumes should not change.

Hibernation vmstate files are now also included and recognized as 'is_vmstate'.
The 'size' attribute will not be overwritten by subsequent iterations for the
same volid anymore (a volid may appear both in the config and in snapshots),
so the size from the current config is now preferred.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-04-29 12:14:40 +02:00
Fabian Ebner
b24f07d406 Fix test_volid call for vmstate and fix check for snapshots on migration
by excluding vmstate. It is referenced by snapshots, but
is not a volume containing a snapshot.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-04-29 12:14:40 +02:00
Fabian Grünbichler
90ff65b63a migrate: simplify replicated_volume loop
(no change compared to previous iteration except for readability)

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-04-20 11:24:23 +02:00
Fabian Ebner
cee620e671 Fix live migration with replicated unused volumes
by counting only local volumes that will be live-migrated via qemu_drive_mirror,
i.e. those listed in $self->{online_local_volumes}.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-04-20 11:12:56 +02:00
Thomas Lamprecht
38311a1d17 migrate: workaround issues with format switch on storage live migration
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-04-17 15:27:38 +02:00
Fabian Ebner
ea5b400812 sync_disks: log output of storage_migrate
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-04-08 22:11:54 +02:00
Fabian Ebner
49a5a0d84b sync_disks: be more verbose if storage_migrate fails
If storage_migrate dies, the error message might not include the
volume ID or the target storage ID, but those might be good to know.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-04-08 22:11:54 +02:00
Fabian Ebner
cc1a3820db sync_disks: use allow_rename to avoid collisions on the target storage
This makes it possible to migrate a VM with volumes store1:vm-123-disk-0 and
store2:vm-123-disk-0 to some targetstorage. It also prevents migration failure
when there is an orphaned disk with the same volid on the target.

To avoid confusion, the name should not change for 'vmstate' volumes.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-04-08 22:11:54 +02:00
Fabian Ebner
97ece9ddce Update volume IDs in one go
Use 'update_volume_ids' for the live-migrated disks as well.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-04-08 22:11:54 +02:00
Fabian Ebner
37666e4caa Take note of changes to the volume IDs when migrating and update the config
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-04-08 22:11:54 +02:00
Fabian Ebner
1f726e0a85 Use new storage_migrate interface
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-04-08 22:11:54 +02:00
Fabian Ebner
912792e245 Switch to using foreach_volume instead of foreach_drive
It was necessary to move foreach_volid back to QemuServer.pm.

In VZDump/QemuServer.pm and QemuMigrate.pm the dependency on
QemuConfig.pm was already there; just the explicit "use" was missing.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-04-08 22:11:54 +02:00
Stefan Reiter
58c64ad5d9 Include "-cpu" parameter with live-migration
This is required to support custom CPU models, since the
"cpu-models.conf" file is not versioned, and can be changed while a VM
using a custom model is running. Changing the file in such a state can
lead to a different "-cpu" argument on the receiving side.

This patch fixes this by passing the entire "-cpu" option (extracted
from /proc/.../cmdline) as a "qm start" parameter. Note that this is
only done if the VM to migrate is using a custom model (which we can
check just fine, since the <vmid>.conf *is* versioned with pending
changes), thus not breaking any live-migration directionality.
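
A hedged sketch of reading the running '-cpu' argument back out of the NUL-separated /proc/<pid>/cmdline; the helper name is invented and error handling is trimmed:

    use strict;
    use warnings;

    # sketch: return the value following '-cpu' in the running process's
    # command line, which the kernel exposes as NUL-separated arguments
    sub get_running_cpu_arg {
        my ($pid) = @_;

        open(my $fh, '<', "/proc/$pid/cmdline")
            or die "unable to read cmdline of $pid: $!\n";
        my $cmdline = do { local $/; <$fh> };
        close($fh);

        my @args = split(/\0/, $cmdline);
        for my $i (0 .. $#args - 1) {
            return $args[$i + 1] if $args[$i] eq '-cpu';
        }
        return undef;
    }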

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-04-07 17:27:58 +02:00
Fabian Grünbichler
bf8fc5a307 migrate: allow arbitrary source->target storage maps
the syntax is backwards compatible: providing a single storage ID or '1'
works like before. the new helper ensures consistent behaviour at all
call sites.
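
A hypothetical parser, only to illustrate the accepted forms; the real helper in the repository may look different:

    use strict;
    use warnings;

    # '1' or undef  -> keep the source storage IDs
    # 'target'      -> map everything to one target storage
    # 'a:b,c:d,...' -> explicit source->target pairs
    sub parse_storage_map {
        my ($map) = @_;

        return { identity => 1 } if !defined($map) || $map eq '1';
        return { default => $map } if $map !~ /[:,]/;

        my $entries = {};
        for my $pair (split(/,/, $map)) {
            my ($source, $target) = split(/:/, $pair, 2);
            die "invalid storage map entry '$pair'\n" if !defined($target) || !length($target);
            $entries->{$source} = $target;
        }
        return { entries => $entries };
    }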

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-04-02 17:47:14 +02:00
Stefan Reiter
c05f1b33ea migration: fix downtime limit auto-increase
485449e37 ("qmp: use migrate-set-parameters in favor of deprecated options")
changed the initial "migrate_set_downtime" QMP call to the more recent
"migrate-set-parameters", but forgot to do so for the auto-increase code
further below.

Since the units of the two calls don't match, this would have caused the
auto-increase to raise the limit to absurd levels as soon as it kicked
in (ms treated as s).

Update the second call to the new version as well, and while at it remove
the unnecessary "defined()" check for $migrate_downtime, which is always
initialized from the defaults anyway.
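
The unit mismatch in one line: 'migrate_set_downtime' took a value in seconds, while 'downtime-limit' in migrate-set-parameters is in milliseconds, so the configured value has to be converted (sketch):

    use strict;
    use warnings;

    # convert a downtime limit configured in seconds (legacy migrate_set_downtime
    # semantics) to the milliseconds expected by migrate-set-parameters
    sub downtime_limit_ms {
        my ($downtime_s) = @_;
        return int($downtime_s * 1000);
    }

    print downtime_limit_ms(0.3), "\n";   # 300 ms; passed unconverted it would mean 0.3 ms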

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-04-02 16:48:51 +02:00
Fabian Grünbichler
6a039d06e9 migrate: improve cleanup_remotedisks
to also handle cases where disk allocation failed in the remote
vm_start, and we only have a bitmap but no target drive information.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-04-01 17:41:07 +02:00
Dominik Csapak
818ce80ec1 fix efidisks on storages with minimum sizes bigger than OVMF_VARS.fd
on storages where the minimum size of images is bigger than the real
OVMF_VARS.fd file, images get padded to that minimum size

when using such an image, qemu maps it fully to the vm, but the efi
does not find the vars region and creates a file on the first efi
partition it finds

this breaks some settings in the ovmf, such as resolution

to fix this, we have to specify the size for the pflash, so that
qemu only maps the first n bytes in the vm (this only works for
raw files, not for qcow2)
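
Illustrative only (the template path and volume name are invented): the size of the real vars template is measured and handed to QEMU via the raw driver's 'size' property, so only the first n bytes of the padded image are mapped:

    use strict;
    use warnings;

    my $ovmf_vars_template = '/usr/share/kvm/OVMF_VARS-pure-efi.fd';   # path assumed
    my $size = -s $ovmf_vars_template;
    die "unable to stat OVMF vars template\n" if !$size;

    # the efidisk volume may be padded to the storage's minimum image size,
    # but with format=raw QEMU can be told to map only the first $size bytes
    my $drive_arg = "if=pflash,unit=1,format=raw,id=drive-efidisk0"
        . ",size=$size,file=/dev/zvol/rpool/data/vm-100-disk-1";

    print "$drive_arg\n";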

we also have to use the correct size when converting between storages
in 'clone_disk' (used for move disk and cloning vms) and when
live migrating to different storages

if we now expect that the source image is always correctly used/created
(e.g. raw with size=x in the pflash argument), then we always create the
target correctly

when encountering users who have an invalid image (e.g. an efidisk
moved from zfs to qcow2 before this patch), we have to tell them to
recreate the efidisk and the settings on it

we have to version_guard it to 4.1+pve2 (since we haven't bumped yet
since the change to pve2)

also add 2 tests, one for the old version and one for the new

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-by: Stefan Reiter <s.reiter@proxmox.com>
Reviewed-by: Stefan Reiter <s.reiter@proxmox.com>
[ Thomas: rebased to master ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-03-30 09:41:55 +02:00