236 Commits

Author SHA1 Message Date
Wolfgang Bumiller
205dbf39b1 allow migrating raw btrfs volumes
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2021-06-23 12:26:40 +02:00
Thomas Lamprecht
db861a4617 migrate prepare: make content type check generic
to avoid false positives, e.g., from an ISO on an ISO-only storage.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-06-23 12:15:43 +02:00
Thomas Lamprecht
8a5bd88907 migrate prepare: use also explicit variable for storecfg
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-06-23 12:15:16 +02:00
Fabian Ebner
24b84b4766 migrate: enforce that image content type is available
and use it for the vdisk_list call too. This avoids scanning (and picking up
volumes from!) storages that are not even configured to hold images.

Previously, the content type was only enforced when a storage map was present.

Also serves a bit as a preparation to enforce content type on guest startup,
because now migration failure happens early and not only when trying to start
the guest on the remote node.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-06-21 11:17:48 +02:00
Fabian Ebner
0d2db08414 prefer storage_check_enabled over storage_check_node
storage_check_enabled simply checks for the 'disable' option and then calls
storage_check_node.

While not strictly necessary for a second call where only the storage differs,
e.g. in case of clone, it is more future-proof: if support for a target storage
is added at some point, it might be easy to miss adapting the call.

For the migration checks, the situation is improved by now always catching
disabled (target) storages.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-06-21 11:17:48 +02:00
Fabian Ebner
692f604bb0 Revert "revert spice_ticket prefix change in 7827de4"
This reverts commit ff09c795ed. We wanted to wait
until PVE 7.0 for the change to not break migration new -> old until then.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Reviewed-by: Stefan Reiter <s.reiter@proxmox.com>
2021-06-08 14:56:10 +02:00
Thomas Lamprecht
8f43ac4893 Revert "migration: do not set default speed limit"
The default was changed for 5.2, so while it is not 32 MiB/s anymore,
it is still 128 MiB/s which I did not notice on my 1 Gbps (or < 125
MiB/s) setup. For users with links faster than one gigabit it now did
some limiting - so set up a very high limit so that even 100G should
not max this out.

This reverts commit a89bd10084.
2021-04-29 15:48:21 +02:00
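The arithmetic behind the revert above can be checked with a short sketch. It assumes the link speed is quoted in decimal gigabits per second and the limit in binary MiB/s; the variable names are illustrative and not taken from the Proxmox code.

```python
# Convert a nominal 1 Gbps link speed to MiB/s and compare it with the
# 128 MiB/s default mentioned in the commit message.
MIB = 2**20

link_bits_per_second = 10**9  # 1 Gbps, decimal prefix (assumption)
link_mib_per_second = link_bits_per_second / 8 / MIB

default_limit_mib = 128

# True when the limit sits above what the link can carry, so the limit is
# never the bottleneck and goes unnoticed on such a setup.
limit_is_invisible = link_mib_per_second < default_limit_mib

print(f"1 Gbps is about {link_mib_per_second:.1f} MiB/s")
```

On a faster link the same comparison flips, which is the case the revert addresses by raising the limit.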
Fabian Ebner
9938d24df2 migrate: fix memory migration start time
The variable is only ever used for calculating the average speed of memory
migration, but it was set before disk mirroring already. But the disk
sizes are not included in the calculation, resulting in (very) wrong values.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-23 15:00:44 +02:00
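A minimal sketch of why the start time matters for the average speed reported by the fix above. The timeline, the sizes, and the `average_speed` helper are hypothetical and only illustrate the effect; this is not the qemu-server implementation.

```python
def average_speed(bytes_transferred: int, start: float, end: float) -> float:
    """Return the average rate in bytes per second over the interval [start, end]."""
    elapsed = end - start
    if elapsed <= 0:
        raise ValueError("end must be later than start")
    return bytes_transferred / elapsed


# Hypothetical timeline in seconds.
disk_mirror_start = 0.0
memory_start = 600.0  # RAM transfer begins after 10 minutes of disk mirroring
memory_end = 630.0
ram_bytes = 3 * 1024**3  # 3 GiB of guest memory

# Start time taken before disk mirroring: the mirroring time dilutes the rate.
wrong = average_speed(ram_bytes, disk_mirror_start, memory_end)

# Start time taken when the memory migration actually begins.
right = average_speed(ram_bytes, memory_start, memory_end)
```

Because only the RAM size enters the numerator, every second spent mirroring disks lowers the reported rate without reflecting any real slowdown.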
Thomas Lamprecht
b68a957b2e migration: keep log rate steady if polling gets more frequent
Either we're done in a few seconds anyway, or if the VM dirties lots
of pages we need quite a bit of time, and then it does not help to
output roughly the same status 10 times a second...

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-04-19 22:08:19 +02:00
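One way to keep a log rate steady while the polling interval shrinks is to gate output on elapsed time rather than on poll count. The sketch below assumes a one-second minimum interval; the `SteadyLogger` class is a hypothetical illustration and not the code from the commit above.

```python
class SteadyLogger:
    """Emit at most one message per `min_interval` seconds."""

    def __init__(self, min_interval: float = 1.0):
        self.min_interval = min_interval
        self._last = None  # timestamp of the last emitted message
        self.lines = []

    def maybe_log(self, now: float, message: str) -> bool:
        # Drop the message if the previous one was emitted too recently.
        if self._last is not None and now - self._last < self.min_interval:
            return False
        self._last = now
        self.lines.append(message)
        return True


logger = SteadyLogger(min_interval=1.0)

# Poll ten times per second for three seconds.
for tick in range(30):
    now = tick / 10
    logger.maybe_log(now, f"status at {now:.1f}s")
```

With this gating, polling more often changes how quickly completion is detected but not how many status lines are written.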
Thomas Lamprecht
0fca250af0 migration: rework logging to be more human-friendly, less spammy
* use render_bytes where possible, to get quick to read and grasp
  units printed
* xbzrle is only interesting if pages/bytes are actually sent using
  it, so only log in that case
* log if VM dirties more than we send
* log current speed we get from QEMU

In general, fewer lines are logged and huge integers are avoided.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-04-19 21:54:37 +02:00
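To illustrate the kind of output the logging rework above aims for, here is a sketch of a byte formatter using binary units. `human_bytes` is a hypothetical stand-in written for this example; it is not the `render_bytes` helper named in the commit and may format values differently.

```python
def human_bytes(value: float, precision: int = 1) -> str:
    """Format a byte count with binary unit prefixes (KiB, MiB, ...)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    size = float(value)
    for unit in units[:-1]:
        if abs(size) < 1024:
            return f"{size:.{precision}f} {unit}"
        size /= 1024
    # Anything at or above 1024 GiB is reported in the largest unit.
    return f"{size:.{precision}f} {units[-1]}"


# A raw counter as it might appear in a status reply (hypothetical value).
transferred = 134217728
line = f"transferred {human_bytes(transferred)}"
```

A short unit-suffixed value is quicker to scan in a task log than a nine-digit integer.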
Thomas Lamprecht
e693c49190 migration: factor out variable + code cleanup
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-04-19 21:51:21 +02:00
Thomas Lamprecht
7de328c629 migration: log: s/migration_caps/migration capabilities/
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-04-19 21:48:31 +02:00
Thomas Lamprecht
a89bd10084 migration: do not set default speed limit
the claim that QEMU limits this to 32M otherwise is bogus, at least
with any current QEMU version.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-04-19 21:46:52 +02:00
Thomas Lamprecht
6539865a9d migration: refactor and tidy-up code
Use an early die so that the rest can lose an indentation level for
the actual migration status reporting code.

Extract commonly used members of the stat hash for shorter code.

Use `git show -w --word-diff=color --word-diff-regex='\w+'` for
getting a better view of actual changes

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-04-19 14:59:54 +02:00
Fabian Ebner
0783c3c271 migration: move finishing block jobs to phase2 for better/uniform error handling
avoids the possibility to die during phase3_cleanup and instead of needing to
duplicate the cleanup ourselves, benefit from phase2_cleanup doing so.

The duplicate cleanup was also very incomplete: it didn't stop the remote kvm
process (leading to 'VM already running' when trying to migrate again
afterwards), but it removed its disks, and it didn't unlock the config, didn't
close the tunnel and didn't cancel the block-dirty bitmaps.

Since migrate_cancel should do nothing after the (non-storage) migrate process
has completed, even that cleanup step is fine here.

Since phase3 is empty at the moment, the order of operations is still the same.

Also add a test that would complain about finish_tunnel not being called before
this patch. That test also checks that local disks are not already removed
before finishing the block jobs.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
a6be63ac9b migration: split out replication from scan_local_volumes
and avoid one loop over the config, by extending foreach_volid to include the
drivename.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
4b26ffbfa5 migration: keep track of replicated volumes via local_volumes
by extending filter_local_volumes.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
efe0d457c6 migration: use storage_migration for checks instead of online_local_volumes
This way, we don't need to worry about auto-vivification.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
eb5751ba02 migration: cleanup_remotedisks: simplify and include more disks
Namely, those migrated with storage_migrate by using the information from
volume_map. Call cleanup_remotedisks in phase1_cleanup as well, because that's
where we end if sync_offline_local_volumes fails, and some disks might already
have been transferred successfully. Note that the local disks are still here, so
this is fine.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
ad8b9d5e2d migration: simplify removal of local volumes and get rid of self->{volumes}
This also changes the behavior to remove the local copies of offline migrated
volumes only after the migration has finished successfully (this is relevant
for mixed settings, e.g. online migration with unused/vmstate disks).

local_volumes contains both the volumes previously in $self->{volumes}
and the volumes in $self->{online_local_volumes}, and hence is the place
to look for which volumes we need to remove. Of course, replicated
volumes still need to be skipped.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
efbbe59da4 migration: add nbd migrated volumes to volume_map earlier
and avoid a little bit of duplication by creating a helper

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
c3417e3b6e migration: save targetstorage and bwlimit in local_volumes hash and re-use information
It is enough to call get_bandwidth_limit once for each source_storage.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
2c4ba4c3ee migration: fix calculation of bandwidth limit for non-disk migration
The case with:
1. no generic 'migration' limit from the storage plugin
2. a migrate_speed limit in the VM config
was broken. It would assign 0 to migrate_speed when picking the minimum value
and then default to the default value. Fix it by checking if bwlimit is 0
before picking the minimum.

Also, make it a bit more readable by avoiding the trick of //-assigning bwlimit
before the units match up and relying on getting back the original bwlimit value
as the minimum. Instead, only ||-assign after the units match up and don't rely
on other things.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
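The bug described in the bandwidth limit fix above can be reproduced in a few lines. This sketch assumes both values are already in the same unit and that 0 or a missing value means "no limit configured"; the function names are hypothetical and the logic is a simplified analog, not the Perl code from the commit.

```python
from typing import Optional


def buggy_limit(bwlimit: Optional[int], migrate_speed: Optional[int], default: int) -> int:
    """Pick the minimum without checking for an unset storage limit."""
    bwlimit = bwlimit or 0
    migrate_speed = migrate_speed or 0
    # With no storage limit, min() returns 0 and the VM setting is lost.
    migrate_speed = min(migrate_speed, bwlimit)
    return migrate_speed or default


def fixed_limit(bwlimit: Optional[int], migrate_speed: Optional[int], default: int) -> int:
    """Only take the minimum when both limits are actually set."""
    if bwlimit and migrate_speed:
        limit = min(bwlimit, migrate_speed)
    else:
        # At most one of the two is set; use whichever it is.
        limit = bwlimit or migrate_speed
    return limit or default


# No storage limit, VM config asks for 50, built-in default is 100.
lost = buggy_limit(None, 50, 100)
kept = fixed_limit(None, 50, 100)
```

In the buggy variant the VM's own setting silently falls back to the default, which matches the broken case listed in the commit message.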
Fabian Ebner
3276a43470 migration: split out config_update_local_disksizes from scan_local_volumes
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
62a4c963b8 migration: avoid re-scanning all volumes
by using the information obtained in the first scan. This
also makes sure we only scan local storages.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
d10b78f4d2 migration: split sync_disks into two functions
by making local_volumes class-accessible. One function is for scanning all local
volumes and one is for actually syncing offline volumes via storage_migrate. The
exception is replicated volumes; this still happens during the scan for now.

Also introduce a filter_local_volumes helper, to make life easier.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
eb3acec88a migration: sort volumes migrated with storage_migrate
Having a deterministic order here is useful for testing.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-12-15 15:21:37 +01:00
Fabian Ebner
7d730f953c migration: factor out starting remote tunnel
so it can be mocked when testing.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-12-15 15:21:37 +01:00
Fabian Ebner
27fa645e66 use new move_config_to_node method
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-12-15 15:21:37 +01:00
Fabian Ebner
e219712561 deactivate volumes after storage_migrate
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-11-24 16:19:35 +01:00
Fabian Ebner
78bd57d9c3 adapt to new storage_migrate activation behavior
Offline migrated volumes are now activated within storage_migrate.
Online migrated volumes can be assumed to be already active.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-11-24 16:19:29 +01:00
Fabian Ebner
19ff368213 don't migrate replicated VM whose replication job is marked for removal
while it didn't actually fail, we probably want to avoid the behavior:

With remove_job=full:
    * run_replication called during migration causes the replicated volumes to
      be removed
    * migration continues by fully copying all volumes

With remove_job=local:
    * run_replication called during migration causes the job (and local
      replication snapshots) to be removed
    * migration continues by fully copying all volumes and renaming them to
      avoid collision with the still existing remote volumes

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-11-09 10:08:22 +01:00
Fabian Ebner
c2c96d7378 fix checks for transferring replication state/switching job target
In some cases $self->{replicated_volumes} will be auto-vivified
to {} by checks like
next if $self->{replicated_volumes}->{$volid}
and then {} would evaluate to true in a boolean context.

Now the replication information is retrieved once in prepare,
and used to decide whether to make the calls or not.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-11-09 10:08:22 +01:00
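Auto-vivification as described in the commit above is a Perl behavior, but a rough analog can be shown in Python with `collections.defaultdict`, where merely indexing a missing key creates it. The sketch is an illustration under that assumption: all names are hypothetical, and unlike a Perl hash reference an empty Python dict is falsy, so the faulty check here tests for key presence instead of truthiness.

```python
from collections import defaultdict


def skip_if_replicated(state, volid: str) -> bool:
    # Side effect: state["replicated_volumes"] springs into existence as {}.
    return volid in state["replicated_volumes"]


def has_replication_buggy(state) -> bool:
    # Looks harmless, but becomes True after any lookup like the one above.
    return "replicated_volumes" in state


def has_replication_fixed(replicated_volumes: dict) -> bool:
    # Decide from data retrieved once up front, not from incidental state.
    return len(replicated_volumes) > 0


state = defaultdict(dict)  # hypothetical stand-in for the migration object

before = has_replication_buggy(state)
skip_if_replicated(state, "local:vm-100-disk-0")
after = has_replication_buggy(state)  # flipped although nothing is replicated

# The fix: fetch the replication information once and base decisions on it.
replicated_volumes = {}
decision = has_replication_fixed(replicated_volumes)
```

The read-only looking membership test changes later control flow, which is why the commit moves the decision to a value computed once in prepare.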
Fabian Ebner
68980d6626 Repeat check for replication target in locked section
No need to warn twice, so the warning from the outside check
was removed.

Suggested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-11-09 10:08:22 +01:00
Thomas Lamprecht
e5d611c382 fix various conditionally declared vars
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-10-16 16:52:11 +02:00
Fabian Ebner
1264d6c511 Use correct option for storage_migrate
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-08-04 13:57:09 +02:00
Stefan Reiter
b53ba8d0f1 fixup: use parse_property_string instead of parse_cpu_conf_basic
The latter was removed and replaced with a validator.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2020-07-09 14:45:21 +02:00
Fabian Ebner
9b29cbd0ed update_disksize: make interface leaner
Pass new size directly, so the function doesn't need to know about
how some hash is organized. And return a message directly, instead
of both size-strings. Also dropped the wantarray, because both
existing callers use the message anyways.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-07-01 09:18:13 +02:00
Fabian Ebner
1c2174833b sync_disks: fix check
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-07-01 09:13:06 +02:00
Fabian Grünbichler
ae194a5c5e migrate: cleanup forwarding code
fixing the following two issues:
- the legacy code path was never converted to the new fork_tunnel
signature (which probably means that nothing triggers it in practice
anymore?)
- the NBD Unix socket got forwarded multiple times if more than one disk
was migrated via NBD (this is harmless, but wrong)

for the second issue I opted to keep the code compatible with the
possibility that Qemu starts supporting multiple NBD servers in the
future (and the target node could thus return multiple UNIX socket
paths). currently we can only start one NBD server on one socket, and
each drive-mirror simply starts a new connection over that single
socket.

I took the liberty of renaming the variables/keys since I found
'tunnel_addr' and 'sock_addr' rather confusing.

Reviewed-By: Mira Limbeck <m.limbeck@proxmox.com>
Tested-By: Mira Limbeck <m.limbeck@proxmox.com>

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-05-06 16:16:50 +02:00
Dominik Csapak
cd37203880 migrate: skip rescan for efidisk and shared volumes
we really only want to rescan the disk size of the disks we actually
need, and those are only the local disks (for which we have to allocate
the correct size on the target)

also we want to always skip the efidisk, since we get the wanted
size after the loop, and this produced a confusing log line
(for details why we do not want the 'real' size,
see commit 818ce80ec1)

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2020-05-04 17:35:12 +02:00
Fabian Grünbichler
6f4b11e9db migrate: don't accidentally take NBD code paths
by avoiding auto-vivification of $self->{online_local_volumes} via
iteration. most code paths don't care whether it's undef or a reference
to an empty list, but this caused the (already) fixed bug of calling
nbd_stop without having started an NBD server in the first place.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-05-04 17:34:58 +02:00
Thomas Lamprecht
3e802221e1 migrate: only stop NBD if we got a NBD url from the target
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-04-29 16:22:33 +02:00
Fabian Ebner
ae180b8f08 Include vmstate and unused volumes in foreach_volid
and refactor the test_volid closure. This way, get_replicatable_volumes doesn't
need a separate loop for unused volumes anymore. get_vm_volumes, which is used
for activation/deactivation of volumes at migration and deactivation in vm_stop_cleanup,
includes those volumes now. For migration it's an improvement, because those volumes
might need to be migrated and for vm_stop_cleanup it shouldn't hurt. The last user
of foreach_volid is check_vm_disks_local used by migrate_vm_precondition,
where information about the additional volumes doesn't hurt either.

Note that replicate is (still) set by default, so the behavior for
get_replicatable_volumes for unused volumes should not change.

Hibernation vmstate files are now also included and recognized as 'is_vmstate'.
The 'size' attribute will not be overwritten by subsequent iterations for the
same volid anymore (a volid may appear both in the config and in snapshots),
so the size from the current config is now preferred.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-04-29 12:14:40 +02:00
Fabian Ebner
b24f07d406 Fix test_volid call for vmstate and fix check for snapshots on migration
by excluding vmstate. It is referenced by snapshots, but
is not a volume containing a snapshot.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-04-29 12:14:40 +02:00
Fabian Grünbichler
90ff65b63a migrate: simplify replicated_volume loop
(no change compared to previous iteration except for readability)

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-04-20 11:24:23 +02:00
Fabian Ebner
cee620e671 Fix live migration with replicated unused volumes
by counting only local volumes that will be live-migrated via qemu_drive_mirror,
i.e. those listed in $self->{online_local_volumes}.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-04-20 11:12:56 +02:00
Thomas Lamprecht
38311a1d17 migrate: workaround issues with format switch on storage live migration
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-04-17 15:27:38 +02:00
Fabian Ebner
ea5b400812 sync_disks: log output of storage_migrate
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-04-08 22:11:54 +02:00
Fabian Ebner
49a5a0d84b sync_disks: be more verbose if storage_migrate fails
If storage_migrate dies, the error message might not include the
volume ID or the target storage ID, but those might be good to know.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-04-08 22:11:54 +02:00