Enables live-restore functionality using the 'alloc-track' QEMU driver.
This allows starting a VM immediately when restoring from a PBS
snapshot. The snapshot is mounted into the VM, so it can boot from that,
while guest reads and a 'block-stream' job handle the restore in the
background.
If an error occurs, the VM is deleted and all data written during the
restore is lost.
The VM remains locked during the restore, which automatically prohibits
any modifications to the config while restoring. Some modifications
might potentially be safe, however, this is experimental enough that I
believe this would cause more bad stuff(tm) than actually satisfy any
use cases.
Pool handling is slightly adjusted so the VM can be added to the pool
before the restore starts.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Uses the custom 'alloc-track' filter node to redirect writes to the
original drives target, while unwritten blocks will be read from the
specified PBS snapshot.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
...so it works with other block jobs as well. Intended use case is
block-stream, which also requires a new "auto" (wait only) completion
mode, since it finishes automatically anyway.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
cloud-init's SLAAC option was disabled in 2018 because there was no
support for it. Now that cloud-init 19.4 or newer versions are more
widespread, we can finally reenable it.
Also include minimum required cloud-init version for SLAAC support in
format description.
Tested on Ubuntu 20.04 (ci 20.4), CentOS 8 (ci 19.4), Debian 10 (ci
20.2).
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
A fix was also provided in bugzilla by user wsapplegate:
https://bugzilla.proxmox.com/show_bug.cgi?id=3314
Tested on Ubuntu 20.04, CentOS 8 and Debian 10.
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
In testing this usually completes almost immediately, but in theory this
is a storage/IO operation and as such can take a bit to finish. It's
certainly not unthinkable that it might take longer than the default *3
seconds* we've given it so far. Make it a minute.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Only show "not supported by QEMU version" message if we determine that
to be the actual cause, just print the error otherwise.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Commit abff03211f switched to iterating over the
values instead of the keys, but didn't update the variable name. Use target_sid,
because target is already in use for the target node.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
A "savevm" call (both our async variant and the upstream sync one) use
migration code internally. As such, they both expect migration
capabilities to be set.
This is usually not a problem, as the default set of capabilities is ok,
however, it leads to differing snapshot settings if one does a snapshot
after a machine has been live-migrated (as the capabilities will persist
from that), which could potentially lead to discrepencies in snapshots
(currently it seems to be fine, but it still makes sense to set them to
safeguard against future changes).
Note that we do set the "dirty-bitmaps" capability now (if
query-proxmox-support reports true), which has three effects:
1) PBS dirty-bitmaps are preserved in snapshots, enabling
fast-incremental backups to work after rollback (as long as no newer
backups exist), including for hibernate/resume
2) snapshots taken from now on, with a QEMU version supporting bitmap
migration, *might* lead to incompatibility of these snapshots with
QEMU versions that don't know about bitmaps at all (i.e. < 5.0 IIRC?)
- forward compatibility is still given, and all other capabilities we
set go back to very old versions
3) since we now explicitly disable bitmap saving if the version doesn't
report support, we avoid crashes even with not-updated QEMU versions
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
At this stage, there are no keys in %storage_limits to iterate over. The
refactoring in commit 9f3d73bc35 broke the logic
by accident.
Also explicitly set zero if there is no limit to avoid repeating the
get_bandwith_limit call for the same storage. When accessing the value later,
zero is already correctly handled as 'no limit'.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
and use file_set_contents to really commit it afterwards. Mostly done as a
preparation for the later patch for sanitizing the config on restore, but
shouldn't hurt by itself either.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Commit "a941bbd0 client: raise HTTP_TIMEOUT to 120s" in proxmox-backup
did the same, however, we would now still fail after 60 seconds since
the QMP call would time out.
Increase the timeout here to the same +5 seconds to give some time to
receive a response, so if the HTTP call in proxmox-backup times out, we
can still get a useful error message instead of timing out the QMP call
too.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
The existing check_vm_modify_config_perm doesn't do so anymore, but
the check only got re-added to the modify/delete paths. See commits
165be267eb and
e30f75c571 for context.
In the future, it might make sense to generalise the
check_vm_modify_config_perm and have it not only take keys, but both
new and old values, and use that generalised function everywhere.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
A fix for violating a important standard for booting[0] in recently
packaged QEMU 5.2 surfaced some issues with Windows based VMs in our
forum[1], which seem to be quite sensitive for such changes (it seems
they derive lots of their device assignment from ACPI).
User visible effects are loss of any network configuration due to
windows thinking it was swapped with a new one, and starts with a
fresh config - this is mostly problematic for setups with static
address assignment.
There may be lots of other, more subtle, effects and the PVE admin is
also not always the VM admin, so we really need to avoid such
negative effects. Do this by pinning the version of any windows based
VMs to either the minimum of (5.1, kvm-version) for existing VMs or
the kvm-version at time of VM creation for new ones.
There are patches in pve-manager for user to be able to change the
pinned version themself in the webinterface, so this can now also get
adapted more easily if there surface any other issues (with new or
old version) in the future.
0: https://lists.gnu.org/archive/html/qemu-devel/2021-02/msg08484.html
1: https://forum.proxmox.com/threads/warning-latest-patch-just-broke-all-my-windows-vms-6-3-4-patch-inside.84915/page-2#post-373331
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Moving to Ceph is very slow when bs=1. Instead, use a larger block size in
combination with the (currently) PVE-specific osize option to specify the
desired output size.
Suggested-by: Dietmar Maurer <dietmar@proxmox.com>
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Since CDRoms and disks share the same config keys, we need to check if
it actually is a CDRom and then check the permissions accordingly.
Otherwise it is possible for someone without VM.Config.CDROM
permissions, but with VM.Config.Disk permissions to remove a CD drive
while being unable to create a CDRom drive.
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
...taking card not to lose the custom precision for byte conversion.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
currently only pending changes are applied when we regenerate
image on a running vm, but not the pending delete.
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
Previously one could specify a CPU flag like 'pcidfoobar' and it would
be accepted, even though we attempt to filter VM-only flags for
security. AFAICT none of the flags we allow can be turned into any
others just by appending text, but better safe than sorry.
Reported-by: Oguz Bektas <o.bektas@proxmox.com>
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
by checking if the vm is paused at the beginning and skipping the
resume now we also skip the qga freeze/thaw (which cannot work if the
vm is paused)
moved the 'vm_is_paused' sub from the api to PVE/QemuServer.pm so it
is available everywhere we need it.
since a suspend backup would pause the vm anyway, we can skip that
step also
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-by: Fabian Ebner <f.ebner@proxmox.com>
this was previously covered by the "lets destroy ever disk which
matches the VMID" feature we disarmed a bit.
As unused disks are referenced in the config, it is not subtle to
destroy them (and we always did in the past) so fix that regression
again for explicitly referenced but unused disks.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Since an old change released with a version bump on 2009-09-07, we
search all enabled storages for VMID maching volumes on VM removal
and purge those too.
This has multiple pitfalls and may be quite unexpected for some
users.
It can make problems when:
* on recovery a VM is created, before disks are reattached the admin
notices some settings issues and chooses to just recreate the VM;
but during destroying the dummy VM all related disks get destroyed
unconditionally which may result in data loss. This actually
happened and is the original reason for the decision to change
this.
* a storage is shared between PVE instance (between a set of clusters
and/or single nodes), while this is against our rules it may still
come as a surprise if destroying a VM on node A may destroy
unrelated and unreferenced disks on the unrelated node B without
asking or allowing to avoid that.
As this the removal of matching but unreferenced disks can result in
permanent data loss (up to the last backup) and may be to subtle and
unforgiving, allow to opt-out of it.
In the long run we want to make this opt-in, but that is an API
change and so needs to wait for next major release. But, we can adapt
the GUI already to make it opt-in there, catching most of the cases.
side-note: CT do not have this behavior at all
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
While the do_import method cleans up the current disk it was
importing on any error the following cases are not handled:
* multiple disks, first few succeed then one fails, only the last
failed one was taken care of before this patch
* error after the import disk loop was not handled
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
On clone_vm when cloning the disks while the VM is running, we use
drive-mirror. We skip completion until the last disk, but with a
cloudinit disk there's no drive-mirror and so no completion done. If it
is the last disk in the hash, we never complete the drive-mirror jobs
and no further cloning is possible as there are already active jobs
using the disks.
To fix it we have to call qemu_drive_mirror_monitor directly in the case
of cloudinit when completion is requested and there are jobs defined.
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
The phrasing left some room for speculation when this would be triggered.
E.g. after cloning a full VM?
Currently the only instances where it is used is when a disk is moved or
a VM migrated.
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
We only added the format extension when it was not 'raw'. But on file level
storages we always require it. To fix this, always add the format
extension if the storage provides the 'path' property.
This is the same logic we use in create_disks for cloudinit disks.
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
by partially reverting 4df98f2f14 and fixing the
line-length issue differently. The commit didn't update two later usages of
$size, breaking copying the efidisk. The other usage as a parameter to
qemu_img_convert() is luckily only cosmetic, for progress output.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
this fixes the issue that we did not generate the correct repository
url for pbs storages that contained an ipv6 address or a port
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
showing off it's monstrosity of a method signature, needs to be
cleaned up in a followup commit
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Extends print_recursive_hash for the CLI to handle JSON booleans so the
result will actually show up in 'qm status --verbose'.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Offline migrated volumes are now activated within storage_migrate.
Online migrated volumes can be assumed to be already active.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
while it didn't actually fail, we probably want to avoid the behavior:
With remove_job=full:
* run_replication called during migration causes the replicated volumes to
be removed
* migration continues by fully copying all volumes
With remove_job=local:
* run_replication called during migration causes the job (and local
replication snapshots) to be removed
* migration continues by fully copying all volumes and renaming them to
avoid collision with the still existing remote volumes
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
In some cases $self->{replicated_volumes} will be auto-vivified
to {} by checks like
next if $self->{replicated_volumes}->{$volid}
and then {} would evaluate to true in a boolean context.
Now the replication information is retrieved once in prepare,
and used to decide whether to make the calls or not.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
No need to warn twice, so the warning from the outside check
was removed.
Suggested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
When the VM is in status 'shutdown', i.e. after the guest issues a
powerdown while a backup is running, QEMU requires a 'system_reset' to
be issued before 'cont' can boot the guest again.
Additionally, when the VM has been powered down during a backup, the
logically correct call would be a 'vm_start', so automatically vm_resume
from vm_start in case this situation occurs. This also means the GUI can
cope with this almost unchanged.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Ignore shutdowns triggered from within the guest in favor of detecting
them via qmeventd and stopping the QEMU process that way.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Now that VMs can be started during a backup, it makes sense to create a
dirty bitmap in these cases too, since the VM might be resumed and thus
continue running normally even after the backup is done.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Connect and send the vmid of the VM being backed up. This prevents
qmeventd from SIGTERMing the underlying QEMU instance, even if the guest
shuts itself down, until we close the socket connection (in cleanup,
which happens on success and abort, or if we crash the file handle will
be closed as well).
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
by adding the missing argument (otherwise all the other ones are shifted
one slot to the left, which is of course bogus).
this has been broken since 2018 (d559309), but was only made
visible/caused a failure with the recent changes adding
use strict;
use warnings;
to PVE::QemuServer::PCI
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
if the 'backup' qmp call itself times out or fails, we still want to
try to cancel the backup, else it can happen that there is still
a backup running even when vzdump thinks it was canceled
qapi docs says that backup cancel always returns success, even
if no backup is running
since we hold a global and a per vm lock for the backup, this should be
ok, since we should not reach this code without that lock
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
We query QEMU if it's safe before enabling it, as on versions without
the necessary patches it not only would be useless, but can actually
lead to hangs.
PBS state is always migrated, as it's a small amount of data anyway, so
we don't need to set a specific flag for it.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Specifying 'boot: order=' was intended to be used for an empty bootorder
(i.e. no boot devices), but as it turns out our format parser doesn't
like empty '-list' properties if they are nested in a subformat.
Fixing this in JSONSchema sounds like a risky move, so instead just
write 'boot: ' (without 'order=') to indicate an empty bootorder. The
rest of the code handles it just fine, as this was valid before too.
Incidentally also fixes a bug where you couldn't create a new VM without
any disks if no explicit 'boot' property was specified (i.e. a simple
'qm create 100' without any parameters would fail).
Reported-by: Dominic Jäger <d.jaeger@proxmox.com>
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
fixes commit 74c17b7a23 which moved
this code here, but forgot to pass $vga ref, as the module was not
using warning nor strict mode this was not caught..
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
After migration or a rollback the cloudinit disk might not be allocated, so
volume_size_info() fails. As we override the value anyway for cloudinit
and efi disks simply move the volume_size_info() call into the 'else'
branch.
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
All volumes contained in $vollist are activated. In this case a snapshot
of the volume. For cloudinit disks no snapshots are created so don't add
it to the list of volumes to activate as it otherwise fails with no
logical volume found.
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
The API is updated to handle the deprecation correctly, i.e. when
updating the 'order' attribute, the old 'legacy' (default_key) values
are removed (would now be ignored anyway).
When removing a device that is in the bootorder list, it will be removed
from the aforementioned. Note that non-existing devices in the list will
not cause an error - they will simply be ignored - but it's still nice
to not have them in there.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
(also fixes#3011)
Deprecates the old-style 'boot' and 'bootdisk' options by adding a new
'order=' subproperty to 'boot'.
This allows a user to specify more than one disk in the boot order,
helping with newer versions of SeaBIOS/OVMF where disks without a
bootindex won't be initialized at all (breaks soft-raid and some LVM
setups).
This also allows specifying a bootindex for USB and hostpci devices,
which was not possible before. Floppy boot support is not supported in
the new model, but I doubt that will be a problem (AFAICT we can't even
attach floppy disks to a VM?).
Default behaviour is intended to stay the same, i.e. while new VMs will
receive the new 'order' property, it will be set so the VM starts the
same as before (using get_default_bootorder).
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
The format is unused in this commit, but will replace the current
string-based format of the 'boot' property. It is included since the
parameter of bootorder_from_legacy follows it.
Two helper methods are introduced:
* bootorder_from_legacy: Parses the legacy format into a hash closer to
what the new format represents
* get_default_bootdevices: Encapsulates the legacy default behaviour if
nothing is specified in the boot order
resolve_first_disk is simplified and gets a new $cdrom parameter to
control the behaviour of excluding CD-ROMs or instead searching for only
them.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
We use verification for something more in-depth on the PBS server, so
avoid that term to avoid misunderstandings.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
We already keep hugepages if they are created with the kernel
commandline (hugepagesz=x hugepages=y), but some setups (specifically
hugepages across multiple NUMA nodes) cannot be configured that way.
Since we always clear these hugepages at VM shutdown, rebooting a VM
that uses them might not work, since the requested count might not be
available anymore by the time we want to use them (also, we would then
no longer allocate them correctly on the NUMA nodes).
Add a 'keephugepages' parameter to skip cleanup and simply leave them
untouched.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
until we maybe have a 'pbs-backup' that links Qemu and PBS like
'pbs-restore', we need to do a regular backup for the template case to
support all storage types and image formats.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Otherwise a warning is printed if the bios is not set in the config.
reported via community forum:
https://forum.proxmox.com/threads/warning-in-qemuserver.74683/
reproduced and tested that the patch fixes the issue.
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
This still works even if all drives were clean. It then shows the very
magical line:
INFO: backup was done incrementally, reused 34.00 GiB (100%)
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
QEMU handles it just as well as with VMA, so this was probably just
forgotten to implement for PBS.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Edid support was added with Qemu 5. Windows guests seem to not be able
to get all possible resolutions if the default std VGA device is used as
GPU and the VM boots in BIOS mode. The result is that only one of the
following three resolutions can be configured:
800x600
1024x768
1920x1080
It is important to note that just booting a Windows VM with the edid=off
parameter will not make the large list of resolutions available. It
seems that Windows is caching the list of possible resolutions
somewhere [0].
Uninstalling the 'Microsoft Basic Display Adapter' in the device manager
and rebooting the VM is one way I found to force Windows to recreate the
list of possible resolutions.
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
[0] https://lists.nongnu.org/archive/html/qemu-devel/2020-07/msg07128.html
There can't be a dirty bitmap when the VM was off, and if it was off we
will also shut it down after the backup, so no point in creating one.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
When $target is 0, that means we don't have to upload any data, in which
case we're immediately done.
Otherwise incremental backups with no changes display a really weird
status: 0% (0.0 B of 0.0 B), duration 0, read: 0 B/s, write: 0 B/s
when they're actually done already.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Previously 'read' and 'write' would always show the same value, which is
of little use. Change it so 'write' excludes reused bytes, thus
displaying the actual upload speed.
$last_reused needs to be initialized to contain reused data from 'clean'
dirty bitmaps to ensure the first output line is correct.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Uses the new 'query-pbs-bitmap-info' QMP call to retrieve additional
information about each drive's dirty bitmap. Returned info is also used
to calculate $target by simply adding all the dirty values (dirty is
equal to size in case the entire drive will be backed up).
"Backup is sparse" message is suppressed for PBS, since it makes little
sense (if zero chunks appear in the clean area of a bitmap, they won't
be counted, and a user is probably more interested in the 'reused' data
anyway).
Also removes the need for the hacky $first_round query-backup handling.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
In config_to_command, '-loadstate' will be added whenever there is a
vmstate in the config. But in vm_start_nolock, the resume parameter
is used to calculate the appropriate timeout and to remove the vmstate
after the start. The resume parameter was only set if there is a
'suspended' lock, but apparently [0] we cannot rely on the lock to be
set if and only if there is a vmstate.
[0]: https://forum.proxmox.com/threads/task-error-start-failed.72450
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
during refactoring, the vmid got lost, but is necessary to get
the correct mdev id
Fixes commit 74c17b7a23
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
[ reference fixed commit ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
pbs-restore might not stay there like that forever and if
this code path changes we should remember to also remove the
environment variables
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
If 'query-proxmox-support' is not known to QEMU, assume that no other
features are supported either.
If 'pbs' is not supported at all, error out with a nice message.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Use the new register_format(3) call to use a validator (instead of a
parser) for 'pve-(vm-)?cpu-conf'. This way the $cpu_fmt hash can be used for
generating the documentation, while still applying the same verification
rules as before.
Since the function no longer parses but only verifies, the parsing in
print_cpu_device/get_cpu_options has to go via JSONSchema directly.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
reuse can also come from the current backup - so drop the "from last
backup" as this can be very confusing if one reads it after making
the first backup ever, with no last backup existing.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
happened due to moving the code from another scope which had no $res,
and not noticing as it was still working after all.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
normally this is done centrally in the managers code, but we do not
have the info for PBS there.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Pass new size directly, so the function doesn't need to know about
how some hash is organized. And return a message directly, instead
of both size-strings. Also dropped the wantarray, because both
existing callers use the message anyways.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
The $total != $transferred check is changed to a log, as QEMU reports
only actually transferred bytes, and it is indeed correct for
incremental backups to have differing values from $total.
The 'incremental' parameter is always set, QEMU will figure out if it should
re-use an existing bitmap or create a new one on its own.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
This allows setting ciuser, cipassword and all other cloudinit settings that
are not part of the network without VM.Config.Network permissions.
Keep VM.Config.Network still as fallback so custom roles that add
VM.Config.Network but not VM.Config.Cloudinit don't break.
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
Legacy IGD passthrough requires address 00:1f.0 to not be assigned to
anything on QEMU startup (currently it's assigned to bridge pci.2).
Changing this in general would break live-migration, so introduce a new
hostpci parameter "legacy-igd", which if set to 1 will move that bridge
to be nested under bridge 1.
This is safe because:
* Bridge 1 is unconditionally created on i440fx, so nesting is ok
* Defaults are not changed, i.e. PCI layout only changes when the new
parameter is specified manually
* hostpci forbids migration anyway
Additionally, the PT device has to be assigned address 00:02.0 in the
guest as well, which is usually used for VGA assignment. Luckily, IGD PT
requires vga=none, so that is not an issue either.
See https://git.qemu.org/?p=qemu.git;a=blob;f=docs/igd-assign.txt
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Move the logic which volumes are included in the backup job to its own
method and adapt the VZDump code accordingly. This makes it possible to
develop other features around backup jobs.
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
should not really happen on modern systems, but random_bytes just
returns false if it fails to generate random bytes, in which case we
want to die instead of returning an empty 'random' string.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
We used the VNC API $ticket as password for VNC, but QEMU limits the
password to the first 8 chars and ignores the rest[0].
As our tickets start with a static string (e.g., "PVE") the entropy
was a bit limited.
For Proxmox VE this does not matters much as the noVNC viewer
provided by has to go always over the API call, and so a valid
ticket and correct permissions for the requested VM are enforced
anyway.
This patch helps external users, which often use NoVNC-Websockify,
circumventing the API and relying solely on the VNC password to avoid
snooping on VNC sessions.
A 'generate-password' parameter is added, if set a password from good
entropy (using libopenssl) is generated.
For simplicity of mapping random bits to ranges we extract 6 bit of
entropy per character and add the integer value of '!' (first
printable ASCII char) to that. This way we get 64^8 possibilities,
which even with millions of guesses per second one would need years
of guessing and mostly just DDOS the server with websocket upgrade
requests.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Tested-By: Dominik Csapak <d.csapak@proxmox.com>
Reviewed-By: Dominik Csapak <d.csapak@proxmox.com>
'vga' is a property string, we can't just assume it starts with the default key's value here either.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
'vga' is a property string, we can't just assume it starts with the
default key's value.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
when checking whether a to-be-added drive's and the VM's replication
status are matching. otherwise, we end up in a failing generic
'parse_volume_id' with no mention of the actual reason.
adding 'replicate=0' to the new drive string fixes the underlying issue
with and without this patch, so this is just a cosmetic/usability
improvement.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
As perl hashes have random order, sort them before iterating through.
This makes the output of 'qm cloudinit dump <vmid> network' consistent
between calls if the config has not changed.
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
netdev_add is now a proper qmp command, which means that it verifies
the parameter types properly
instead of sending strings, we now have to choose the correct
types for the parameters
bool for vhost
and uint64 for queues
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
the special case was dropped when moving this to pve-storage.
fixes commit c6d517835a
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
More API calls will follow for this path, for now add the 'index' call to
list all custom and default CPU models.
Any user can list the default CPU models, as these are public anyway, but
custom models are restricted to users with Sys.Audit on /nodes.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Explicitly allows changing other properties than the cputype, even if
the currently set cputype is not accessible by the user. This way, an
administrator can assign a custom CPU type to a VM for a less privileged
user without breaking edit functionality for them.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
fixing the following two issues:
- the legacy code path was never converted to the new fork_tunnel
signature (which probably means that nothing triggers it in practice
anymore?)
- the NBD Unix socket got forwarded multiple times if more than one disk
was migrated via NBD (this is harmless, but wrong)
for the second issue I opted to keep the code compatible with the
possibility that Qemu starts supporting multiple NBD servers in the
future (and the target node could thus return multiple UNIX socket
paths). currently we can only start one NBD server on one socket, and
each drive-mirror simply starts a new connection over that single
socket.
I took the liberty of renaming the variables/keys since I found
'tunnel_addr' and 'sock_addr' rather confusing.
Reviewed-By: Mira Limbeck <m.limbeck@proxmox.com>
Tested-By: Mira Limbeck <m.limbeck@proxmox.com>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
fixes commit 940e2a3a06
QEMU 4.1 will fail to start a guest with an audio device set with:
> Property '.audiodev' not found
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
If /dev/hwrng exists, but no actual generator is connected (or it is
disabled on the host), QEMU will happily start the VM but crash as soon
as the guest accesses the VirtIO RNG device.
To prevent this unfortunate behaviour, check if a useable hwrng is
connected to the host before allowing the VM to be started.
While at it, clean up config_to_command by moving new and existing rng
source checks to a seperate sub.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
redirecting to the saved STDOUT in case of a template backup or a VM
without any disks failed because of the erroneous '=':
Backup of VM 123123 failed - command '/usr/bin/vma create -v -c [...]' failed:
Bad filehandle: =5 at /usr/share/perl/5.28/IPC/Open3.pm line 58.
https://forum.proxmox.com/threads/vzdump-to-stdout.69364
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
It's possible to have a VM with OVMF but without an efidisk, so don't
call parse_drive on a potential undef value.
Partial revert of 818c3b8d91 ("cfg2cmd: ovmf: code cleanup")
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
and move the lock call and decision logic closer together
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Tested-by: Fabian Ebner <f.ebner@proxmox.com>
This reverts commit b5490d8a98.
When resizing a volume of a running VM, a qmp block_resize command
is issued. This is non-blocking, so the size on the storage immediately
after issuing the command might still be the old one.
This is part of the issue reported in bug #2621.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
we really only want to rescan the disk size of the disks we actually
need, and that are only the local disks (for which we have to allocate
the correct size on the target)
also we want to always skip the efidisk, since we get the wanted
size after the loop, and this produced a confusing log line
(for details why we do not want the 'real' size,
see commit 818ce80ec1)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
by avoiding auto-vivification of $self->{online_local_volumes} via
iteration. most code paths don't care whether it's undef or a reference
to an empty list, but this caused the (already) fixed bug of calling
nbd_stop without having started an NBD server in the first place.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
lock_file is used by PVE::QemuServer::Memory, but it does properly 'use
PVE::Tools ...' itself so we can drop them in the main module.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
regex to reduce the code duplication, as archive_info and
decompressor_info provides the same information as well.
Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
VM was can be true for stop mode backup, we cannot check the "is VM
currently running" as that doesn't tells us anything (could be the
backup process), so check the mode also..
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
as the nbd server could have been stopped by something else.
Further, it makes no sense to die and mark the migration thus as
failed, just because of a NBD server stop issue.
At this point the migration hand off to the target was done already,
so normally we're good, if it fails we have other (followup) problems
anyway.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
and refactor the test_volid closure. Like this get_replicatable_volumes doesn't
need a separate loop for unused volumes anymore. For get_vm_volumes, which is used
for activation/deactivation of volumes at migration and deactivation in vm_stop_cleanup,
includes those volumes now. For migration it's an improvement, because those volumes
might need to be migrated and for vm_stop_cleanup it shouldn't hurt. The last user
of foreach_volid is check_vm_disks_local used by migrate_vm_precondition,
where information about the additional volumes doesn't hurt either.
Note that replicate is (still) set by default, so the behavior for
get_replicatable_volumes for unused volumes should not change.
Hibernation vmstate files are now also included and recognized as 'is_vmstate'.
The 'size' attribute will not be overwritten by subsequent iterations for the
same volid anymore (a volid may appear both in the config and in snapshots),
so the size from the current config is now preferred.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
when a backup includes a cloudinit disk on a non-existent storage,
the restore fails with 'storage' does not exist
this happens because we want to get the format of the disk, by
checking the source storage
we fix this by using the target storage first and only the source as
fallback
this will still fail if neither storage exists
(which is ok, since we cannot restore then anyway)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Some OVF files to not declare 'rasd' as a default namespace (in the
top level Envelope element), but inline in each element (e.g.
<rasd:HostResource xmlns:rasd="foo">...</rasd:HostResource>)
This trips up our relative findvalue with
> XPath error : Undefined namespace prefix
To avoid this, search in the global XPathContext (where we register
those namespaces ourselves) and pass the item_node as context
parameter.
This works then for both cases
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
this is only used for migration via 'qm mtunnel', regular users should
never need to resume a VM that does not logically belong to the node it
is running on
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
by counting only local volumes that will be live-migrated via qemu_drive_mirror,
i.e. those listed in $self->{online_local_volumes}.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
With Qemu 4.2 a new `audiodev` property was introduced [0] to explicitly
specify the backend to be used for the audio device. This is accompanied
with a warning that the fallback to the default audio backend is
deprecated.
[0] https://wiki.qemu.org/ChangeLog/4.2#Audio
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
If storage_migrate dies, the error message might not include the
volume ID or the target storage ID, but those might be good to know.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
This makes it possible to migrate a VM with volumes store1:vm-123-disk-0
store2:vm-123-disk-0 to some targetstorage. Also prevents migration failure
when there is an orphaned disk with the same volid on the target.
To avoid confusion, the name should not change for 'vmstate'-volumes.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
It was necessary to move foreach_volid back to QemuServer.pm
In VZDump/QemuServer.pm and QemuMigrate.pm the dependency on
QemuConfig.pm was already there, just the explicit "use" was missing.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Upstream marks these as having a micro-version of >=90, unfortunately the
machine versions are bumped earlier so testing them is made unnecessarily
difficult, since the version checking code would abort on migrations etc...
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
[ Thomas: do so refactor ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
so that pve-container and qemu-server use the same one, in preparation
for moving it to JSONSchema and having a bridgepair format.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Can be specified for a particular VM or via a custom CPU model (VM takes
precedence).
QEMU's default limit only allows up to 1TB of RAM per VM. Increasing the
physical address bits available to a VM can fix this.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
If a cputype is custom (check via prefix), try to load options from the
custom CPU model config, and set values accordingly.
While at it, extract currently hardcoded values into seperate sub and add
reasonings.
Since the new flag resolving outputs flags in sorted order for
consistency, adapt the test cases to not break. Only the order is
changed, not which flags are present.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Reviewed-By: Fabian Ebner <f.ebner@proxmox.com>
Tested-By: Fabian Ebner <f.ebner@proxmox.com>
To avoid hardcoding even more CPU-flag related things for custom CPU
models, introduce a dynamic approach to resolving flags.
resolve_cpu_flags takes a list of hashes (as documented in the
comment) and resolves them to a valid "-cpu" argument without
duplicates. This also helps by providing a reason why specific CPU flags
have been added, and thus allows for useful warning messages should a
flag be overwritten by another.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Reviewed-By: Fabian Ebner <f.ebner@proxmox.com>
Tested-By: Fabian Ebner <f.ebner@proxmox.com>
Just like with live-migration, custom CPU models might change after a
snapshot has been taken (or a VM suspended), which would lead to a
different QEMU invocation on rollback/resume.
Save the "-cpu" argument as a new "runningcpu" option into the VM conf
akin to "runningmachine" and use as override during rollback/resume.
No functional change with non-custom CPU types intended.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
This is required to support custom CPU models, since the
"cpu-models.conf" file is not versioned, and can be changed while a VM
using a custom model is running. Changing the file in such a state can
lead to a different "-cpu" argument on the receiving side.
This patch fixes this by passing the entire "-cpu" option (extracted
from /proc/.../cmdline) as a "qm start" parameter. Note that this is
only done if the VM to migrate is using a custom model (which we can
check just fine, since the <vmid>.conf *is* versioned with pending
changes), thus not breaking any live-migration directionality.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
in addition to printing it. preparation for remote cluster migration,
where we want to return this in a structured fashion over the migration
tunnel instead of parsing stdout via SSH.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
into one sub that retrieves the local disks, and the actual NBD
allocation. that way, remote incoming migration can just call the NBD
allocation with a custom list of volume names/storages/..
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
both where previously missing. the existing 'check_storage_access'
helper is not applicable here since it operates on a full set of VM
config options, not just storage IDs.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
the syntax is backwards compatible, providing a single storage ID or '1'
works like before. the new helper ensures consistent behaviour at all
call sites.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
485449e37 ("qmp: use migrate-set-parameters in favor of deprecated options")
changed the initial "migrate_set_downtime" QMP call to the more recent
"migrate-set-parameters", but forgot to do so for the auto-increase code
further below.
Since the units of the two calls don't match, this would have caused the
auto-increase to increase the limit to absurd levels as soon as it kicked
in (ms treated as s).
Update the second call to the new version as well, and while at it remove
the unnecessary "defined()" check for $migrate_downtime, which is always
initialized from the defaults anyway.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
as preparation of targetstorage mapping and remote migration. this also
removes re-using of the $local_volumes hash in the original code.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
to start breaking up vm_start before extending parts for new migration
features like storage and network mapping.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
as preparation for refactoring it further. remote migration will add
another 1-2 parameters, and it is already unwieldly enough as it is.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
to also handle cases where disk allocation failed in the remote
vm_start, and we only have a bitmap but no target drive information.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
on storages where the minimum size of images is bigger than the real
OVMF_VARS.fd file, they get padded to their minimum size
when using such an image, qemu maps it fully to the vm, but the efi
does not find the vars region and creates a file on the first efi
partition it finds
this breaks some settings in the ovmf, such as resolution
to fix this, we have to specify the size for the pflash, so that
qemu only maps the first n bytes in the vm (this only works for
raw files, not for qcow2)
we also have to use the correct size when converting between storages
in 'clone_disk' (used for move disk and cloning vms) and when
live migrating to different storages
when we now expect that the source image is always correctly used/created
(e.g. raw with size=x in pflash argument) then we always create the
target correctly
when encountering users which have a non-valid image (e.g. a efidisk
moved from zfs to qcow2 before this patch), we have to tell them to
recreate the efidisk and the settings on it
we have to version_guard it to 4.1+pve2 (since we haven't bumped yet
since the change to pve2)
also add 2 tests, one for the old version and one for the new
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-by: Stefan Reiter <s.reiter@proxmox.com>
Reviewed-by: Stefan Reiter <s.reiter@proxmox.com>
[ Thomas: rebased to master ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
since bitmaps are set early on, and 'qm start' potentially has allocated
the disks but still failed. we can only clean up what we know about
anyway, so the disk part is still only best effort.
also use replicated_volumes instead of bitmap existence to check for
replicated volumes, since 'qm start' on an old node that does not
understand replicated volumes might have allocated a new volume that we
DO want to clean up, and not skip.
also cleanup disks after stopping target VM, otherwise we might end up
in a situation where the target VM is still running and using the disks,
thus blocking the disk cleanup.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
by only checking for replicatable volumes when a replication job is
defined, and passing only actually replicated volumes to the target node
via STDIN, and back via STDOUT.
otherwise this can pick up theoretically replicatable, but not actually
replicated volumes and treat them wrong.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
fixes commit 0b2f574b4c
enforce_vm_running_for_backup is now witout return value, for the PBS
I forgot to remove an now outdated call to handle_vm_powerstate, drop
that.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
$cpu_fmt is being reused for custom CPUs as well as VM-specific CPU
settings. The "pve-vm-cpu-conf" format is introduced to verify a config
specifically for use as VM-specific settings.
"pve-cpu-conf" is registered for use in custom CPU API calls (where no
additional checks are required).
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Turn CPUConfig into a SectionConfig with parsing/writing support for
custom CPU models. IO is handled using cfs.
Namespacing will be provided using "custom-" prefix for custom model
names (in VM config only, cpu-models.conf will contain unprefixed
names).
Includes two overrides to avoid writing redundant information to the
config file, additionally get_custom_model is used to retrieve a custom
model configuration by name.
Resolve custom names in print_cpu_device when a custom cpu is passed.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
There is a need to set $noerr, because otherwise migration for a
VM with a non-replicatable volume fails with:
missing replicate feature on volume 'myfs:107/vm-107-disk-2.raw'
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
as we need at least pve-qemu in 4.2 for this to work, the target side
is implicitly checked with "to old version" check for migrate or the
mirror will fail anyway.
Just use the simple "qemu binary version check", as we could stil
live migrate an older snapshot with older machine versions if both
sides have a recent enough qemu.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
with incremental drive-mirror and dirty-bitmap tracking.
1.) get replicated disks that are currently referenced by running VM
2.) add a block-dirty-bitmap to each of them
3.) replicate ALL replicated disks
4.) pass bitmaps from 2) to drive-mirror for disks from 1)
5.) skip replicated disks when cleaning up volumes on either source or
target
added error handling is just removing the bitmaps if an error occurs at
any point after 2, except when the handover to the target node has
already happened, since the bitmaps are cleaned up together with the
source VM in that case.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Tested-by: Stefan Reiter <s.reiter@proxmox.com>
to make migration logs a bit easier to grasp with a quick glance.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Tested-by: Stefan Reiter <s.reiter@proxmox.com>
by re-using a dirty bitmap that represents changes since the divergence
of source and target volume. requires a qemu that supports incremental
drive-mirroring, and will die otherwise.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Tested-by: Stefan Reiter <s.reiter@proxmox.com>
Moved code so that initialization of drivedesc_hash stays a single block.
Avoid auto-vivication in parse_drive.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
E.g.: If a feature requires 4.1+pveN and we're using machine version 4.2
we don't need to increase the pve version to N (4.2+pve0 is enough).
We check this by doing a min_version call against a non-existant higher
pve-version for the major/minor tuple we want to test for, which can
only work if the major/minor alone is high enough.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Clarify why a cancel is actually not really canceling here, because
we're already finished with storage migration and the block jobs are
all in ready state and we (source) are going to stop soon to hand
over to target.
> Note that if you issue 'block-job-cancel' after 'drive-mirror' has
> indicated (via the event BLOCK_JOB_READY) that the source and
> destination are synchronized, then the event triggered by this
> command changes to BLOCK_JOB_COMPLETED, to indicate that the
> mirroring has ended and the destination now has a point-in-time
> copy tied to the time of the cancellation
-- qapi/block-core.json (QEMU 4.2)
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
The change to the prefixed version broke migration from new to old
qemu-server version. This reverts the change and adds a TODO comment for
7.0 to change it to the prefixed version then.
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
...instead of booting with an invalid config once and then silently
changing the memory size for consequent VM starts.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Tested-by: Alwin Antreich <a.antreich@proxmox.com>
This cannot work, since we adjust the 'memory' property of the VM config
on hotplugging, but then the user-defined NUMA topology won't match for
the next start attempt.
Check needs to happen here, since it otherwise fails early with "total
memory for NUMA nodes must be equal to vm static memory".
With this change the error message reflects what is actually happening
and doesn't allow VMs with exactly 1GB of RAM either.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Tested-by: Alwin Antreich <a.antreich@proxmox.com>