Commit Graph

254 Commits

Fabian Grünbichler
a20dc58a1b explain 'nocheck' in more places
it was only explained in git history and in vm_stop; add comments in
other relevant places to avoid future breakage.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2022-11-21 13:42:52 +01:00
Fabian Grünbichler
eef93bc590 migrate: add remote migration handling
remote migration uses a websocket connection to a task worker running on
the target node instead of commands via SSH to control the migration.
this websocket tunnel is started earlier than the SSH tunnel, and allows
adding UNIX-socket forwarding over additional websocket connections
on-demand.

the main differences to regular intra-cluster migration are:
- source VM config and disks are only removed upon request via --delete
- shared storages are treated like local storages, since we can't
  assume they are shared across clusters (with potential to extend this
  by marking storages as shared)
- NBD migrated disks are explicitly pre-allocated on the target node via
  tunnel command before starting the target VM instance
- in addition to storages, network bridges and the VMID itself are
  transformed via a user-defined mapping
- all commands and migration data streams are sent via a WS tunnel proxy
- pending changes and snapshots are discarded on the target side (for
  the time being)

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2022-11-17 15:21:39 +01:00
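
The storage/bridge/VMID mapping described in eef93bc590 can be pictured roughly as below. This is an illustrative sketch only; the mapping layout, field names and helper are assumptions, not the actual PVE::QemuMigrate code.

    use strict;
    use warnings;

    # Remap the VMID and every bridge referenced by the netX lines according
    # to a user-defined mapping, falling back to the source value when no
    # entry exists.
    sub apply_remote_mappings {
        my ($vmid, $conf, $mapping) = @_;

        my $target_vmid = $mapping->{vmid}->{$vmid} // $vmid;

        for my $netkey (grep { /^net\d+$/ } sort keys %$conf) {
            if ($conf->{$netkey} =~ /bridge=([^,]+)/) {
                my $bridge = $1;
                my $target = $mapping->{bridge}->{$bridge} // $bridge;
                $conf->{$netkey} =~ s/bridge=\Q$bridge\E/bridge=$target/;
            }
        }
        return $target_vmid;
    }

    my $conf = { net0 => 'virtio=AA:BB:CC:DD:EE:FF,bridge=vmbr0,firewall=1' };
    my $mapping = { vmid => { 100 => 2100 }, bridge => { vmbr0 => 'vmbr1' } };
    print apply_remote_mappings(100, $conf, $mapping), ": $conf->{net0}\n";
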
Fabian Grünbichler
05b2a4ae9c migrate: refactor remote VM/tunnel start
no semantic changes intended, except for:
- no longer passing the main migration UNIX socket to SSH twice for
  forwarding
- dropping the 'unix:' prefix in start_remote_tunnel's timeout error message

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2022-11-17 15:21:39 +01:00
Thomas Lamprecht
71cc2c4177 migration: cloudinit check: bump manager dependency and guard with cloudinit drive
The former ensures that the manager which depends on the newer
qemu-server is actually installed, and the latter avoids false
positives.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-11-16 12:47:43 +01:00
Alexandre Derumier
73ed64967e migration : add del_nets_bridge_fdb
At the end of a live migration, we need to remove the old MAC entries
on the source host (the VM is not yet stopped) before resuming the VM
on the target host.

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
 [T: resolve conflicts and rework on apply ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-11-13 14:56:57 +01:00
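
For illustration only: one way to drop a learned MAC from a Linux bridge forwarding database is iproute2's 'bridge' tool. The device and MAC below are made up, and qemu-server goes through its own network helpers rather than literally this call.

    use strict;
    use warnings;

    # Hypothetical tap device and guest MAC; on a real host these come from
    # the VM's netX configuration.
    my $mac = '52:54:00:12:34:56';
    my $dev = 'tap100i0';

    # 'master' addresses the entry learned by the bridge the port is enslaved to.
    system('bridge', 'fdb', 'del', $mac, 'dev', $dev, 'master') == 0
        or warn "removing fdb entry for $mac failed\n";
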
Alexandre Derumier
9c88e85446 migration: test targetnode min version for cloudinit section
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
2022-11-08 17:23:30 +01:00
Fabian Ebner
13d121d79b fix #3861: migrate: fix live migration when cloud-init changes storage
Generalizes fd95d780 ("migrate: send updated TPM state volid to target
node") to also handle other offline migrated disks appearing in the
VM config, which currently should only be cloud-init.

Breaks migration new -> old under similar (edge-case-)conditions as
fd95d780 did.

Keep sending the 'tpmstate0' STDIN parameter to avoid breaking new ->
old in the scenario fd95d780 fixed.

Keep parsing the vm_start 'tpmstate0' STDIN parameter to avoid
breaking old -> new, and to be able to keep sending it.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2022-04-28 18:29:12 +02:00
Fabian Ebner
8a0d269b75 migrate: resume initially running VM when failing after convergence
When phase2() is aborted after the migration already converged, then
after migrate_cancel, the VM might be in POSTMIGRATE state.

(There also is a conditional for SHUTDOWN state in QEMU's
migration_iteration_finish(), so it's likely possible to end up there
if the VM is shut down at the right time during migration, but no need
to resume then).

Detect the POSTMIGRATE state and resume the VM if it wasn't paused at
the beginning of the migration. There is no direct way to go to
PAUSED, so just print an error if the VM was paused at the beginning
of the migration.

Reported-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2022-04-26 11:42:25 +02:00
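
A minimal sketch of the recovery path described in 8a0d269b75, with a generic $mon_cmd callback standing in for qemu-server's QMP helper. 'query-status' and 'cont' are standard QMP commands; the real error handling is more involved.

    use strict;
    use warnings;

    sub resume_if_postmigrate {
        my ($vmid, $was_paused_before, $mon_cmd) = @_;

        my $status = $mon_cmd->($vmid, 'query-status')->{status} // '';
        return if $status ne 'postmigrate';

        if ($was_paused_before) {
            # no direct way back to PAUSED, so just report it
            warn "VM $vmid was paused before migration, not resuming\n";
            return;
        }
        $mon_cmd->($vmid, 'cont');  # resume the initially running VM
    }
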
Fabian Ebner
0028391f95 migrate: add log for guest fstrim
and make a failure noticeable.

Suggested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2022-04-26 11:42:25 +02:00
Fabian Ebner
a183576e30 migrate: keep VM paused after migration if it was before
We also cannot issue a guest agent command in that case.

Reported in the community forum:
https://forum.proxmox.com/threads/106618

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2022-04-21 08:57:11 +02:00
Fabian Grünbichler
e594231bf1 migrate: move tunnel-helpers to pve-guest-common
besides the log calls these don't need any parts of the migration state,
so let's make them generic and re-use them for container migration and
replication in the future.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2022-02-09 18:49:55 +01:00
Fabian Grünbichler
82a0367149 move map_storage to PVE::JSONSchema::map_id
since we are going to reuse the same mechanism/code for network bridge
mapping and pve-container.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2022-02-09 18:46:20 +01:00
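
A minimal sketch of map_id-style semantics, assuming an optional hash map with identity fallback; the real PVE::JSONSchema::map_id may differ in signature and validation.

    use strict;
    use warnings;

    # Return the mapped ID if the map has an entry for it, otherwise pass the
    # source ID through unchanged.
    sub map_id {
        my ($map, $source) = @_;
        return $source if !defined($map);
        return $map->{$source} // $source;
    }

    my $storagemap = { 'local-lvm' => 'remote-lvm' };
    print map_id($storagemap, 'local-lvm'), "\n";   # remote-lvm
    print map_id($storagemap, 'ceph-pool'), "\n";   # ceph-pool (identity)
    print map_id(undef, 'vmbr0'), "\n";             # vmbr0 (no map at all)
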
Fabian Grünbichler
fd95d780a2 migrate: send updated TPM state volid to target node
The volid may change if local-storage migration is involved; we need
to tell the target node the new one and update the in-memory config
for starting the target VM accordingly.

Reported here: https://forum.proxmox.com/threads/99906/#post-431345

this possibly breaks migration new -> old iff
- spice is not used (else the explicit ticket wins because it comes
  later)
- a local TPM state volume is used
- that local TPM state volume has a different volume id on the target
  node (switched storage, volname already taken, ..)

because the target node will then mis-interpret the tpmstate0 line as
spice ticket and set it accordingly. if the old tpm state volume ID does
not exist on the target node, migration will fail. if it exists by
chance, it might work albeit with a wrong spice ticket (new because of
this patch) and tpm state volume (pre-existing breakage).

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2021-11-22 16:55:17 +01:00
Stefan Reiter
f9dde219f2 fix #3075: add TPM v1.2 and v2.0 support via swtpm
Starts an instance of swtpm per VM in its systemd scope; it will
terminate by itself when the VM exits, or be terminated manually if
startup fails.

Before first use, a TPM state is created via swtpm_setup. State is
stored in a 'tpmstate0' volume, treated much the same way as an efidisk.

It is migrated 'offline'; the important part here is the creation of the
target volume, while the actual data transfer happens via the QEMU device
state migration process.

Move-disk can only work offline, as the disk is not registered with
QEMU, so 'drive-mirror' wouldn't work. swtpm itself has no method of
moving a backing storage at runtime.

For backups, a bit of a workaround is necessary (this may later be
replaced by NBD support in swtpm): During the backup, we attach the
backing file of the TPM as a read-only drive to QEMU, so our backup
code can detect it as a block device and back it up as such, while
ensuring consistency with the rest of disk state ("snapshot" semantic).

The name for the ephemeral drive is specifically chosen as
'drive-tpmstate0-backup', diverging from our usual naming scheme with
the '-backup' suffix, to avoid it ever being treated as a regular drive
from the rest of the stack in case it gets left over after a backup for
some reason (shouldn't happen).

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
2021-10-05 06:51:02 +02:00
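
For illustration, starting a per-VM swtpm instance that QEMU can attach to looks roughly like this. The paths are made up, the flags come from the upstream swtpm tooling as documented, and the actual qemu-server code derives everything from the 'tpmstate0' volume and the VM's systemd scope.

    use strict;
    use warnings;

    my $vmid      = 100;                                   # hypothetical VM
    my $state_dir = "/var/lib/vz/images/$vmid/tpmstate";   # hypothetical path
    my $ctrl_sock = "/var/run/qemu-server/$vmid.swtpm";

    my @cmd = (
        'swtpm', 'socket',
        '--tpmstate', "dir=$state_dir",
        '--ctrl', "type=unixio,path=$ctrl_sock",
        '--tpm2',    # drop this flag for a TPM v1.2 instance
    );
    # QEMU then attaches via something like:
    #   -chardev socket,id=chrtpm,path=<ctrl socket>
    #   -tpmdev emulator,id=tpm0,chardev=chrtpm -device tpm-tis,tpmdev=tpm0
    system(@cmd) == 0 or die "starting swtpm failed\n";
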
Thomas Lamprecht
f8830c4d6e migrate: code style, use up to 100cc if it helps to reduce line-bloat
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-09-22 09:26:18 +02:00
Thomas Lamprecht
95b3583b5e migrate: simplify code and add comment
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-09-22 09:25:53 +02:00
Fabian Ebner
d213ba299d migrate: use correct target storage id for checks
The '--targetstorage' parameter does not apply to shared storages.

Example for a problem solved with the enabled check: given a VM with
images only on a shared storage 'storeA' that is not available on the
target node (i.e. restricted by the nodes property), using
'--targetstorage storeB' would make offline migration suddenly
"work", but of course the disks would not be accessible, and trying
to migrate back would then fail...

Example for a problem solved with the content type check: if a
VM had a shared ISO image, and there was a '--targetstorage storeA'
option, availability of the 'iso' content type was checked for
'storeA', which is wrong, as the ISO would not be moved to that
storage.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-09-22 08:57:35 +02:00
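
Sketched below is the corrected choice of target storage id, assuming a simplified $scfg hash and storage map; the point is only that '--targetstorage' never applies to shared storages.

    use strict;
    use warnings;

    sub pick_targetsid {
        my ($sid, $scfg, $storagemap) = @_;
        return $sid if $scfg->{shared};          # shared: keep the original id
        return $storagemap->{$sid} // $sid;      # local: apply the mapping
    }

    my $map = { 'local-lvm' => 'storeB' };
    print pick_targetsid('storeA', { shared => 1 }, $map), "\n";  # storeA
    print pick_targetsid('local-lvm', {}, $map), "\n";            # storeB
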
Mira Limbeck
104f47a9f8 fix #2563: allow live migration with local cloud-init disk
The content of the ISO should be the same on both nodes, so offline
migrate the ISO, but don't regenerate it on VM start on the target node.

This way even with snippets the content will not change during live
migration.

Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
2021-07-23 11:04:22 +02:00
Wolfgang Bumiller
205dbf39b1 allow migrating raw btrfs volumes
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2021-06-23 12:26:40 +02:00
Thomas Lamprecht
db861a4617 migrate prepare: make content type check generic
to avoid false positives, e.g., from an ISO on an ISO-only storage.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-06-23 12:15:43 +02:00
Thomas Lamprecht
8a5bd88907 migrate prepare: use also explicit variable for storecfg
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-06-23 12:15:16 +02:00
Fabian Ebner
24b84b4766 migrate: enforce that image content type is available
and use it for the vdisk_list call too. This avoids scanning (and picking up
volumes from!) storages that are not even configured to hold images.

Previously, the content type was only enforced when a storage map was present.

Also serves a bit as a preparation to enforce content type on guest startup,
because now migration failure happens early and not only when trying to start
the guest on the remote node.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-06-21 11:17:48 +02:00
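
A minimal sketch of enforcing the 'images' content type, assuming a flattened view of the storage configuration; the real check goes through PVE::Storage and is not literally this code.

    use strict;
    use warnings;

    sub assert_images_content_type {
        my ($storeid, $scfg) = @_;
        my %content = map { $_ => 1 } split(/,/, $scfg->{content} // '');
        die "storage '$storeid' does not support content type 'images'\n"
            if !$content{images};
    }

    assert_images_content_type('local-lvm', { content => 'images,rootdir' }); # ok
    assert_images_content_type('local-iso', { content => 'iso,vztmpl' });     # dies
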
Fabian Ebner
0d2db08414 prefer storage_check_enabled over storage_check_node
storage_check_enabled simply checks for the 'disable' option and then calls
storage_check_node.

While not strictly necessary for a second call where only the storage differs,
e.g. in case of clone, it is more future-proof: if support for a target storage
is added at some point, it might be easy to miss adapting the call.

For the migration checks, the situation is improved by now always catching
disabled (target) storages.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-06-21 11:17:48 +02:00
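
The layering described above can be pictured as below; the signatures are simplified stand-ins, the real helpers live in PVE::Storage.

    use strict;
    use warnings;

    sub storage_check_node {
        my ($scfg, $storeid, $node) = @_;
        my $nodes = $scfg->{nodes};
        die "storage '$storeid' is not available on node '$node'\n"
            if $nodes && !$nodes->{$node};
        return $scfg;
    }

    sub storage_check_enabled {
        my ($scfg, $storeid, $node) = @_;
        die "storage '$storeid' is disabled\n" if $scfg->{disable};
        return storage_check_node($scfg, $storeid, $node);
    }

    my $scfg = { nodes => { node1 => 1 } };
    storage_check_enabled($scfg, 'storeA', 'node1');   # passes
    storage_check_enabled($scfg, 'storeA', 'node2');   # dies: not available
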
Fabian Ebner
692f604bb0 Revert "revert spice_ticket prefix change in 7827de4"
This reverts commit ff09c795ed. We wanted to wait
until PVE 7.0 for the change to not break migration new -> old until then.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Reviewed-by: Stefan Reiter <s.reiter@proxmox.com>
2021-06-08 14:56:10 +02:00
Thomas Lamprecht
8f43ac4893 Revert "migration: do not set default speed limit"
The default was changed for 5.2, so while it is not 32 MiB/s anymore,
it is still 128 MiB/s, which I did not notice on my 1 Gbps (or < 125
MiB/s) setup. For users with links faster than one gigabit it now did
some limiting - so set up a very high limit so that even 100G should
not max this out.

This reverts commit a89bd10084.
2021-04-29 15:48:21 +02:00
Fabian Ebner
9938d24df2 migrate: fix memory migration start time
The variable is only ever used for calculating the average speed of the
memory migration, but it was already set before disk mirroring. Since the
disk sizes are not included in the calculation, this resulted in (very)
wrong values.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-23 15:00:44 +02:00
Thomas Lamprecht
b68a957b2e migration: keep log rate steady if polling gets more frequent
Either we're done in a few seconds anyway, or if the VM dirties lots
of pages we need quite a bit of time, and then it does not help to
output roughly the same status 10 times a second...

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-04-19 22:08:19 +02:00
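
A minimal sketch of the steady log rate, assuming a one-line-per-second target regardless of how often the migration status is polled.

    use strict;
    use warnings;

    my $last_log = 0;

    # Print a status line at most once per second, no matter how frequently
    # the caller polls QEMU for migration statistics.
    sub maybe_log_status {
        my ($line) = @_;
        my $now = time();
        return if $now - $last_log < 1;
        $last_log = $now;
        print "$line\n";
    }

    maybe_log_status("transferred 1.0 GiB of 8.0 GiB");   # printed
    maybe_log_status("transferred 1.1 GiB of 8.0 GiB");   # suppressed (< 1s later)
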
Thomas Lamprecht
0fca250af0 migration: rework logging to be more human friendly, less spammy
* use render_bytes where possible, so the printed units are quick to
  read and grasp
* xbzrle is only interesting if pages/bytes are actually sent using
  it, so only log in that case
* log if the VM dirties more than we send
* log the current speed we get from QEMU

In general, fewer lines are logged and huge integers are avoided.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-04-19 21:54:37 +02:00
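
A rough, assumed equivalent of a render_bytes-style helper (as I understand it, the real one lives in pve-common): pick the largest binary unit that keeps the number readable.

    use strict;
    use warnings;

    sub render_bytes {
        my ($bytes) = @_;
        my @units = ('B', 'KiB', 'MiB', 'GiB', 'TiB');
        my $i = 0;
        while ($bytes >= 1024 && $i < $#units) {
            $bytes /= 1024;
            $i++;
        }
        return sprintf("%.1f %s", $bytes, $units[$i]);
    }

    print render_bytes(4096), "\n";          # 4.0 KiB
    print render_bytes(3 * 1024**3), "\n";   # 3.0 GiB
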
Thomas Lamprecht
e693c49190 migration: factor out variable + code cleanup
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-04-19 21:51:21 +02:00
Thomas Lamprecht
7de328c629 migration: log: s/migration_caps/migration capabilities/
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-04-19 21:48:31 +02:00
Thomas Lamprecht
a89bd10084 migration: do not set default speed limit
the claim that QEMU limits this to 32M otherwise is bogus, at least
with any current QEMU version...

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-04-19 21:46:52 +02:00
Thomas Lamprecht
6539865a9d migration: refactor and tidy-up code
Use an early die so that the rest can lose an indentation level for
the actual migration status reporting code.

Extract commonly used members of the stat hash for shorter code.

Use `git show -w --word-diff=color --word-diff-regex='\w+'` to get
a better view of the actual changes.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-04-19 14:59:54 +02:00
Fabian Ebner
0783c3c271 migration: move finishing block jobs to phase2 for better/uniform error handling
This avoids the possibility of dying during phase3_cleanup; instead of
needing to duplicate the cleanup ourselves, we benefit from phase2_cleanup
doing so.

The duplicate cleanup was also very incomplete: it didn't stop the remote kvm
process (leading to 'VM already running' when trying to migrate again
afterwards), but it removed its disks, and it didn't unlock the config, didn't
close the tunnel and didn't cancel the block-dirty bitmaps.

Since migrate_cancel should do nothing after the (non-storage) migrate process
has completed, even that cleanup step is fine here.

Since phase3 is empty at the moment, the order of operations is still the same.

Also add a test that would have complained about finish_tunnel not being
called before this patch. That test also checks that local disks are not
already removed before finishing the block jobs.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
a6be63ac9b migration: split out replication from scan_local_volumes
and avoid one loop over the config, by extending foreach_volid to include the
drivename.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
4b26ffbfa5 migration: keep track of replicated volumes via local_volumes
by extending filter_local_volumes.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
efe0d457c6 migration: use storage_migration for checks instead of online_local_volumes
Like this we don't need to worry about auto-vivification.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
eb5751ba02 migration: cleanup_remotedisks: simplify and include more disks
Namely, those migrated with storage_migrate, by using the information from
volume_map. Call cleanup_remotedisks in phase1_cleanup as well, because that's
where we end up if sync_offline_local_volumes fails, and some disks might
already have been transferred successfully. Note that the local disks are
still here, so this is fine.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
ad8b9d5e2d migration: simplify removal of local volumes and get rid of self->{volumes}
This also changes the behavior to remove the local copies of offline migrated
volumes only after the migration has finished successfully (this is relevant
for mixed settings, e.g. online migration with unused/vmstate disks).

local_volumes contains both the volumes previously in $self->{volumes}
and the volumes in $self->{online_local_volumes}, and hence is the place
to look for which volumes we need to remove. Of course, replicated
volumes still need to be skipped.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
efbbe59da4 migration: add nbd migrated volumes to volume_map earlier
and avoid a little bit of duplication by creating a helper.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
c3417e3b6e migration: save targetstorage and bwlimit in local_volumes hash and re-use information
It is enough to call get_bandwidth_limit once for each source_storage.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
2c4ba4c3ee migration: fix calculation of bandwidth limit for non-disk migration
The case with:
1. no generic 'migration' limit from the storage plugin
2. a migrate_speed limit in the VM config
was broken. It would assign 0 to migrate_speed when picking the minimum value
and then fall back to the default value. Fix it by checking if bwlimit is 0
before picking the minimum.

Also, make it a bit more readable by avoiding the trick of //-assigning bwlimit
before the units match up and relying on getting back the original bwlimit value
as the minimum. Instead, only ||-assign after the units match up and don't rely
on other things.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
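
A minimal sketch of the corrected minimum-picking, assuming the units already match; 0 (or undef) stands for "unlimited" and must not win the minimum as it did before the fix. The function name is made up for illustration.

    use strict;
    use warnings;

    sub effective_bwlimit {
        my ($storage_limit, $vm_migrate_speed) = @_;

        my $limit = $storage_limit;   # may be undef or 0, i.e. unlimited
        if ($vm_migrate_speed) {
            $limit = $limit
                ? ($limit < $vm_migrate_speed ? $limit : $vm_migrate_speed)
                : $vm_migrate_speed;
        }
        return $limit || 0;           # 0 = no limit at all
    }

    print effective_bwlimit(0, 100), "\n";    # 100 (the broken code ended up with 0)
    print effective_bwlimit(50, 100), "\n";   # 50
    print effective_bwlimit(0, 0), "\n";      # 0, truly unlimited
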
Fabian Ebner
3276a43470 migration: split out config_update_local_disksizes from scan_local_volumes
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
62a4c963b8 migration: avoid re-scanning all volumes
by using the information obtained in the first scan. This
also makes sure we only scan local storages.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
Fabian Ebner
d10b78f4d2 migration: split sync_disks into two functions
by making local_volumes class-accessible. One function is for scanning all
local volumes and one is for actually syncing offline volumes via
storage_migrate. The exception is replicated volumes; their sync still
happens during the scan for now.

Also introduce a filter_local_volumes helper to make life easier.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-04-18 18:30:41 +02:00
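
A sketch of a filter_local_volumes-style helper, assuming local_volumes maps volids to a hash carrying a migration mode; the real structure in QemuMigrate.pm is richer.

    use strict;
    use warnings;

    sub filter_local_volumes {
        my ($local_volumes, $mode) = @_;
        return sort grep {
            !defined($mode) || $local_volumes->{$_}->{migration_mode} eq $mode
        } keys %$local_volumes;
    }

    my $local_volumes = {
        'local-lvm:vm-100-disk-0'    => { migration_mode => 'online' },
        'local:100/vm-100-state.raw' => { migration_mode => 'offline' },
    };
    print join(', ', filter_local_volumes($local_volumes, 'offline')), "\n";
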
Fabian Ebner
eb3acec88a migration: sort volumes migrated with storage_migrate
Having a deterministic order here is useful for testing.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-12-15 15:21:37 +01:00
Fabian Ebner
7d730f953c migration: factor out starting remote tunnel
so it can be mocked when testing.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-12-15 15:21:37 +01:00
Fabian Ebner
27fa645e66 use new move_config_to_node method
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-12-15 15:21:37 +01:00
Fabian Ebner
e219712561 deactivate volumes after storage_migrate
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-11-24 16:19:35 +01:00
Fabian Ebner
78bd57d9c3 adapt to new storage_migrate activation behavior
Offline migrated volumes are now activated within storage_migrate.
Online migrated volumes can be assumed to be already active.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-11-24 16:19:29 +01:00
Fabian Ebner
19ff368213 don't migrate replicated VM whose replication job is marked for removal
While it didn't actually fail, we probably want to avoid the following behavior:

With remove_job=full:
    * run_replication called during migration causes the replicated volumes to
      be removed
    * migration continues by fully copying all volumes

With remove_job=local:
    * run_replication called during migration causes the job (and local
      replication snapshots) to be removed
    * migration continues by fully copying all volumes and renaming them to
      avoid collision with the still existing remote volumes

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-11-09 10:08:22 +01:00