when a backup includes a cloudinit disk on a non-existent storage,
the restore fails with 'storage' does not exist
this happens because we want to get the format of the disk, by
checking the source storage
we fix this by using the target storage first and only the source as
fallback
this will still fail if neither storage exists
(which is ok, since we cannot restore then anyway)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
With Qemu 4.2 a new `audiodev` property was introduced [0] to explicitly
specify the backend to be used for the audio device. This is accompanied
with a warning that the fallback to the default audio backend is
deprecated.
[0] https://wiki.qemu.org/ChangeLog/4.2#Audio
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
This makes it possible to migrate a VM with volumes store1:vm-123-disk-0
store2:vm-123-disk-0 to some targetstorage. Also prevents migration failure
when there is an orphaned disk with the same volid on the target.
To avoid confusion, the name should not change for 'vmstate'-volumes.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
It was necessary to move foreach_volid back to QemuServer.pm
In VZDump/QemuServer.pm and QemuMigrate.pm the dependency on
QemuConfig.pm was already there, just the explicit "use" was missing.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Upstream marks these as having a micro-version of >=90, unfortunately the
machine versions are bumped earlier so testing them is made unnecessarily
difficult, since the version checking code would abort on migrations etc...
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
[ Thomas: do so refactor ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
so that pve-container and qemu-server use the same one, in preparation
for moving it to JSONSchema and having a bridgepair format.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Just like with live-migration, custom CPU models might change after a
snapshot has been taken (or a VM suspended), which would lead to a
different QEMU invocation on rollback/resume.
Save the "-cpu" argument as a new "runningcpu" option into the VM conf
akin to "runningmachine" and use as override during rollback/resume.
No functional change with non-custom CPU types intended.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
This is required to support custom CPU models, since the
"cpu-models.conf" file is not versioned, and can be changed while a VM
using a custom model is running. Changing the file in such a state can
lead to a different "-cpu" argument on the receiving side.
This patch fixes this by passing the entire "-cpu" option (extracted
from /proc/.../cmdline) as a "qm start" parameter. Note that this is
only done if the VM to migrate is using a custom model (which we can
check just fine, since the <vmid>.conf *is* versioned with pending
changes), thus not breaking any live-migration directionality.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
in addition to printing it. preparation for remote cluster migration,
where we want to return this in a structured fashion over the migration
tunnel instead of parsing stdout via SSH.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
into one sub that retrieves the local disks, and the actual NBD
allocation. that way, remote incoming migration can just call the NBD
allocation with a custom list of volume names/storages/..
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
the syntax is backwards compatible, providing a single storage ID or '1'
works like before. the new helper ensures consistent behaviour at all
call sites.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
as preparation of targetstorage mapping and remote migration. this also
removes re-using of the $local_volumes hash in the original code.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
to start breaking up vm_start before extending parts for new migration
features like storage and network mapping.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
as preparation for refactoring it further. remote migration will add
another 1-2 parameters, and it is already unwieldly enough as it is.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
on storages where the minimum size of images is bigger than the real
OVMF_VARS.fd file, they get padded to their minimum size
when using such an image, qemu maps it fully to the vm, but the efi
does not find the vars region and creates a file on the first efi
partition it finds
this breaks some settings in the ovmf, such as resolution
to fix this, we have to specify the size for the pflash, so that
qemu only maps the first n bytes in the vm (this only works for
raw files, not for qcow2)
we also have to use the correct size when converting between storages
in 'clone_disk' (used for move disk and cloning vms) and when
live migrating to different storages
when we now expect that the source image is always correctly used/created
(e.g. raw with size=x in pflash argument) then we always create the
target correctly
when encountering users which have a non-valid image (e.g. a efidisk
moved from zfs to qcow2 before this patch), we have to tell them to
recreate the efidisk and the settings on it
we have to version_guard it to 4.1+pve2 (since we haven't bumped yet
since the change to pve2)
also add 2 tests, one for the old version and one for the new
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-by: Stefan Reiter <s.reiter@proxmox.com>
Reviewed-by: Stefan Reiter <s.reiter@proxmox.com>
[ Thomas: rebased to master ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
by only checking for replicatable volumes when a replication job is
defined, and passing only actually replicated volumes to the target node
via STDIN, and back via STDOUT.
otherwise this can pick up theoretically replicatable, but not actually
replicated volumes and treat them wrong.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
$cpu_fmt is being reused for custom CPUs as well as VM-specific CPU
settings. The "pve-vm-cpu-conf" format is introduced to verify a config
specifically for use as VM-specific settings.
"pve-cpu-conf" is registered for use in custom CPU API calls (where no
additional checks are required).
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
There is a need to set $noerr, because otherwise migration for a
VM with a non-replicatable volume fails with:
missing replicate feature on volume 'myfs:107/vm-107-disk-2.raw'
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
with incremental drive-mirror and dirty-bitmap tracking.
1.) get replicated disks that are currently referenced by running VM
2.) add a block-dirty-bitmap to each of them
3.) replicate ALL replicated disks
4.) pass bitmaps from 2) to drive-mirror for disks from 1)
5.) skip replicated disks when cleaning up volumes on either source or
target
added error handling is just removing the bitmaps if an error occurs at
any point after 2, except when the handover to the target node has
already happened, since the bitmaps are cleaned up together with the
source VM in that case.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Tested-by: Stefan Reiter <s.reiter@proxmox.com>
by re-using a dirty bitmap that represents changes since the divergence
of source and target volume. requires a qemu that supports incremental
drive-mirroring, and will die otherwise.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Tested-by: Stefan Reiter <s.reiter@proxmox.com>
Moved code so that initialization of drivedesc_hash stays a single block.
Avoid auto-vivication in parse_drive.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
E.g.: If a feature requires 4.1+pveN and we're using machine version 4.2
we don't need to increase the pve version to N (4.2+pve0 is enough).
We check this by doing a min_version call against a non-existant higher
pve-version for the major/minor tuple we want to test for, which can
only work if the major/minor alone is high enough.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
As the NBD server spawned by qemu can only listen on a single socket,
we're dependent on a version being passed to vm_start that indicates
which protocol can be used, TCP or Unix, by the source node.
The change in socket type (TCP to Unix) comes with a different URI. For
unix sockets it has the form: 'nbd:unix:<path/to/socket>:exportname=<device>'.
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
With Qemu 4.2 we encountered a problem with unix sockets and SSH socket
forwarding for drive-mirror. It seems the socket gets reopened again and
again after it closes for some reason. This can be worked around by
specifying 'block-job-cancel' instead of 'block-job-complete' when we're
not interested in swapping the disks again from NBD to their original
protocol. This is always the case when we use drive-mirror for live
migrating a VM.
qemu_drive_mirror is used for migration and for clone_disk. All in all
we have 3 cases to handle. Either the 'skip' case which skips the
completion of the job. The 'wait' case which was the default before and
still is when $completion is undefined. And the new 'wait_noswap' case
which is used for the live migration.
If 'wait_noswap' is specified, we issue a 'block-job-cancel' once the block
job is in 'ready' state. This completes the block job without swapping the
disks.
clone_disk always uses 'block-job-cancel' via the qemu_blockjobs_cancel
sub.
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
removes safe_string_ne and safe_num_ne code which is now shared in
GuestHelpers. also change all the calls to use the shared definitions.
Signed-off-by: Oguz Bektas <o.bektas@proxmox.com>
The initialization for the drive keys in $confdesc is changed
to be a single for-loop iterating over the keys of $drivedesc_hash and
the initialization of the unusedN keys is move to directly below it.
To avoid the need to change all the call sites, functions with more than
a few callers are exported from the submodule and imported into QemuServer.pm.
For callers of the now imported functions within QemuServer.pm, the prefix
PVE::QemuServer is dropped, because it is unnecessary and now even confusing.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
which contains the full descriptions of the drives, and
make parse_drive not depend on $confdesc anymore.
In preparation to moving drive-related code to its own module.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Allow a user to add a virtio-rng-pci (an emulated hardware random
number generator) to a VM with the rng0 setting. The setting is
version_guard()-ed.
Limit the selection of entropy source to one of three:
/dev/urandom (preferred): Non-blocking kernel entropy source
/dev/random: Blocking kernel source
/dev/hwrng: Hardware RNG on the host for passthrough
QEMU itself defaults to /dev/urandom (or the equivalent getrandom()
call) if no source file is given, but I don't fully trust that
behaviour to stay constant, considering the documentation [0] already
disagrees with the code [1], so let's always specify the file ourselves.
/dev/urandom is preferred, since it prevents host entropy starvation.
The quality of randomness is still good enough to emulate a hwrng, since
a) it's still seeded from the kernel's true entropy pool periodically
and b) it's mixed with true entropy in the guest as well.
Additionally, all sources about entropy predicition attacks I could find
mention that to predict /dev/urandom results, /dev/random has to be
accessed or manipulated in one way or the other - this is not possible
from a VM however, as the entropy we're talking about comes from the
*hosts* blocking pool.
More about the entropy and security implications of the non-blocking
interface in [2] and [3].
Note further that only one /dev/hwrng exists at any given time, if
multiple RNGs are available, only the one selected in
'/sys/devices/virtual/misc/hw_random/rng_current' will feed the file.
Selecting this is left as an exercise to the user, if at all required.
We limit the available entropy to 1 KiB/s by default, but allow the user
to override this. Interesting to note is that the limiter does not work
linearly, i.e. max_bytes=1024/period=1000 means that up to 1 KiB of data
becomes available on a 1000 millisecond timer, not that 1 KiB is
streamed to the guest over the course of one second - hence the
configurable period.
The default used here is the same as given in the QEMU documentation [0]
and has been verified to affect entropy availability in a guest by
measuring /dev/random throughput. 1 KiB/s is enough to avoid any
early-boot entropy shortages, and already has a significant impact on
/dev/random availability in the guest.
[0] https://wiki.qemu.org/Features/VirtIORNG
[1] https://git.qemu.org/?p=qemu.git;a=blob;f=crypto/random-platform.c;h=f92f96987d7d262047c7604b169a7fdf11236107;hb=HEAD
[2] https://lwn.net/Articles/261804/
[3] https://lwn.net/Articles/808575/
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
1. Avoids the error
"VM 111 qmp command 'block_resize' failed - The new size must be a multiple of 512"
for qcow2 disks.
2. Because volume_import expects disk sizes to be a multiple of 1 KiB.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Some of the recent QMP changes require at least 2.8.0, but since the
oldest version we officially package for 6.x is 4.0.0 anyway, checking
for at least 3.0 should not break anyone's setup.
Note that this does not affect machine version checks, only the
installed QEMU binary version.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Live-migrating a VM with more than 14 SCSI disks to a node that doesn't
support it yet is broken. Use a bumped pve-version to represent that and
give the user a nice error message instead.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
The previously introduced approach can fail for pinned versions when a
new QEMU release is introduced. The saner approach is to use a mapping
that gives one pve-version for each QEMU release.
Fortunately, the old system has not been bumped yet, so we can still
change it without too much effort.
QEMU versions without a mapping are assumed to be pve0, 4.1 is mapped to
pve1 since thats what we had as our default previously.
Pinned machine versions (i.e. pc-i440fx-4.1) are always assumed to be
pve0, for specific pve-versions they'd have to be pinned as well (i.e.
pc-i440fx-4.1+pve1).
The new logic also makes the pve-version dynamic, and starts VMs with
the lowest possible 'feature-level', i.e. if a feature is only available
with 4.1+pve2, but the VM isn't using it, we still start it with
4.1+pve0.
We die if we don't support a version that is requested from us. This
allows us to use the pve-version as live-migration blocks (i.e. bumping
the version and then live-migrating a VM which uses the new feature (so
is running with the bumped version) to an outdated node will present the
user with a helpful error message and fail instead of silently modifying
the config and only failing *after* the migration).
$version_guard is introduced in config_to_command to use for features
that need to check pve-version, it automatically handles selecting the
newest necessary pve-version for the VM.
Tests have to be adjusted, since all of them now resolve to pve0 instead
of pve1. EXPECT_ERROR matching is changed to use 'eq' instead of regex
to allow special characters in error messages.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>