Starts an instance of swtpm per VM in it's systemd scope, it will
terminate by itself if the VM exits, or be terminated manually if
startup fails.
Before first use, a TPM state is created via swtpm_setup. State is
stored in a 'tpmstate0' volume, treated much the same way as an efidisk.
It is migrated 'offline', the important part here is the creation of the
target volume, the actual data transfer happens via the QEMU device
state migration process.
Move-disk can only work offline, as the disk is not registered with
QEMU, so 'drive-mirror' wouldn't work. swtpm itself has no method of
moving a backing storage at runtime.
For backups, a bit of a workaround is necessary (this may later be
replaced by NBD support in swtpm): During the backup, we attach the
backing file of the TPM as a read-only drive to QEMU, so our backup
code can detect it as a block device and back it up as such, while
ensuring consistency with the rest of disk state ("snapshot" semantic).
The name for the ephemeral drive is specifically chosen as
'drive-tpmstate0-backup', diverging from our usual naming scheme with
the '-backup' suffix, to avoid it ever being treated as a regular drive
from the rest of the stack in case it gets left over after a backup for
some reason (shouldn't happen).
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
like for other API calls, repeat the cheap checks done for early abort
before forking and without locks after forking and obtaining the lock,
and only hold the flock in the forked worker instead of across the fork.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Fabian Ebner <f.ebner@proxmox.com>
@bootorder only contains entries for non-legacy bootorder entries,
but the default one contains all cdroms anyway, and if the user
explicitely disabled cdroms, it is ok to not add them back
for the new cdrom drive.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
We unconditionally added an entry into the bootorder whenever we
edited the drive, even if it was already in there. Instead we only want to do
that if the bootorder list does not contain it already.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Attaching an ISO image to a VM is usually/often done for two reasons:
* booting an installer image
* supplying additional drivers to an installer (e.g. virtio)
Both of these cases (the latter at least with SeaBIOS and the Windows
installer) require the disk to be marked as bootable.
For this reason, enable the bootable flag for all new CDROM drives
attached to a VM by adding it to the bootorder list. It is appended to
the end, as otherwise it would cause new drives to boot before already
existing boot targets, which would be a more grave (and IMO bad)
behaviour change.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
otherwise a user with only VM.Config.CDROM can detach a disk from a VM
by updating it to a cdrom drive
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
storage_check_enabled simply checks for the 'disable' option and then calls
storage_check_node.
While not strictly necessary for a second call where only the storage differs,
e.g. in case of clone, it is more future-proof: if support for a target storage
is added at some point, it might be easy to miss adapting the call.
For the migration checks, the situation is improved by now always catching
disabled (target) storages.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Enables live-restore functionality using the 'alloc-track' QEMU driver.
This allows starting a VM immediately when restoring from a PBS
snapshot. The snapshot is mounted into the VM, so it can boot from that,
while guest reads and a 'block-stream' job handle the restore in the
background.
If an error occurs, the VM is deleted and all data written during the
restore is lost.
The VM remains locked during the restore, which automatically prohibits
any modifications to the config while restoring. Some modifications
might potentially be safe, however, this is experimental enough that I
believe this would cause more bad stuff(tm) than actually satisfy any
use cases.
Pool handling is slightly adjusted so the VM can be added to the pool
before the restore starts.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Commit abff03211f switched to iterating over the
values instead of the keys, but didn't update the variable name. Use target_sid,
because target is already in use for the target node.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
The existing check_vm_modify_config_perm doesn't do so anymore, but
the check only got re-added to the modify/delete paths. See commits
165be267eb and
e30f75c571 for context.
In the future, it might make sense to generalise the
check_vm_modify_config_perm and have it not only take keys, but both
new and old values, and use that generalised function everywhere.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
A fix for violating a important standard for booting[0] in recently
packaged QEMU 5.2 surfaced some issues with Windows based VMs in our
forum[1], which seem to be quite sensitive for such changes (it seems
they derive lots of their device assignment from ACPI).
User visible effects are loss of any network configuration due to
windows thinking it was swapped with a new one, and starts with a
fresh config - this is mostly problematic for setups with static
address assignment.
There may be lots of other, more subtle, effects and the PVE admin is
also not always the VM admin, so we really need to avoid such
negative effects. Do this by pinning the version of any windows based
VMs to either the minimum of (5.1, kvm-version) for existing VMs or
the kvm-version at time of VM creation for new ones.
There are patches in pve-manager for user to be able to change the
pinned version themself in the webinterface, so this can now also get
adapted more easily if there surface any other issues (with new or
old version) in the future.
0: https://lists.gnu.org/archive/html/qemu-devel/2021-02/msg08484.html
1: https://forum.proxmox.com/threads/warning-latest-patch-just-broke-all-my-windows-vms-6-3-4-patch-inside.84915/page-2#post-373331
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Since CDRoms and disks share the same config keys, we need to check if
it actually is a CDRom and then check the permissions accordingly.
Otherwise it is possible for someone without VM.Config.CDROM
permissions, but with VM.Config.Disk permissions to remove a CD drive
while being unable to create a CDRom drive.
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
by checking if the vm is paused at the beginning and skipping the
resume now we also skip the qga freeze/thaw (which cannot work if the
vm is paused)
moved the 'vm_is_paused' sub from the api to PVE/QemuServer.pm so it
is available everywhere we need it.
since a suspend backup would pause the vm anyway, we can skip that
step also
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-by: Fabian Ebner <f.ebner@proxmox.com>
Since an old change released with a version bump on 2009-09-07, we
search all enabled storages for VMID maching volumes on VM removal
and purge those too.
This has multiple pitfalls and may be quite unexpected for some
users.
It can make problems when:
* on recovery a VM is created, before disks are reattached the admin
notices some settings issues and chooses to just recreate the VM;
but during destroying the dummy VM all related disks get destroyed
unconditionally which may result in data loss. This actually
happened and is the original reason for the decision to change
this.
* a storage is shared between PVE instance (between a set of clusters
and/or single nodes), while this is against our rules it may still
come as a surprise if destroying a VM on node A may destroy
unrelated and unreferenced disks on the unrelated node B without
asking or allowing to avoid that.
As this the removal of matching but unreferenced disks can result in
permanent data loss (up to the last backup) and may be to subtle and
unforgiving, allow to opt-out of it.
In the long run we want to make this opt-in, but that is an API
change and so needs to wait for next major release. But, we can adapt
the GUI already to make it opt-in there, catching most of the cases.
side-note: CT do not have this behavior at all
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
showing off it's monstrosity of a method signature, needs to be
cleaned up in a followup commit
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
No need to warn twice, so the warning from the outside check
was removed.
Suggested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
All volumes contained in $vollist are activated. In this case a snapshot
of the volume. For cloudinit disks no snapshots are created so don't add
it to the list of volumes to activate as it otherwise fails with no
logical volume found.
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
The API is updated to handle the deprecation correctly, i.e. when
updating the 'order' attribute, the old 'legacy' (default_key) values
are removed (would now be ignored anyway).
When removing a device that is in the bootorder list, it will be removed
from the aforementioned. Note that non-existing devices in the list will
not cause an error - they will simply be ignored - but it's still nice
to not have them in there.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
(also fixes#3011)
Deprecates the old-style 'boot' and 'bootdisk' options by adding a new
'order=' subproperty to 'boot'.
This allows a user to specify more than one disk in the boot order,
helping with newer versions of SeaBIOS/OVMF where disks without a
bootindex won't be initialized at all (breaks soft-raid and some LVM
setups).
This also allows specifying a bootindex for USB and hostpci devices,
which was not possible before. Floppy boot support is not supported in
the new model, but I doubt that will be a problem (AFAICT we can't even
attach floppy disks to a VM?).
Default behaviour is intended to stay the same, i.e. while new VMs will
receive the new 'order' property, it will be set so the VM starts the
same as before (using get_default_bootorder).
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
This allows setting ciuser, cipassword and all other cloudinit settings that
are not part of the network without VM.Config.Network permissions.
Keep VM.Config.Network still as fallback so custom roles that add
VM.Config.Network but not VM.Config.Cloudinit don't break.
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
should not really happen on modern systems, but random_bytes just
returns false if it fails to generate random bytes, in which case we
want to die instead of returning an empty 'random' string.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
We used the VNC API $ticket as password for VNC, but QEMU limits the
password to the first 8 chars and ignores the rest[0].
As our tickets start with a static string (e.g., "PVE") the entropy
was a bit limited.
For Proxmox VE this does not matters much as the noVNC viewer
provided by has to go always over the API call, and so a valid
ticket and correct permissions for the requested VM are enforced
anyway.
This patch helps external users, which often use NoVNC-Websockify,
circumventing the API and relying solely on the VNC password to avoid
snooping on VNC sessions.
A 'generate-password' parameter is added, if set a password from good
entropy (using libopenssl) is generated.
For simplicity of mapping random bits to ranges we extract 6 bit of
entropy per character and add the integer value of '!' (first
printable ASCII char) to that. This way we get 64^8 possibilities,
which even with millions of guesses per second one would need years
of guessing and mostly just DDOS the server with websocket upgrade
requests.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Tested-By: Dominik Csapak <d.csapak@proxmox.com>
Reviewed-By: Dominik Csapak <d.csapak@proxmox.com>
'vga' is a property string, we can't just assume it starts with the default key's value here either.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
'vga' is a property string, we can't just assume it starts with the
default key's value.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
when checking whether a to-be-added drive's and the VM's replication
status are matching. otherwise, we end up in a failing generic
'parse_volume_id' with no mention of the actual reason.
adding 'replicate=0' to the new drive string fixes the underlying issue
with and without this patch, so this is just a cosmetic/usability
improvement.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
More API calls will follow for this path, for now add the 'index' call to
list all custom and default CPU models.
Any user can list the default CPU models, as these are public anyway, but
custom models are restricted to users with Sys.Audit on /nodes.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Explicitly allows changing other properties than the cputype, even if
the currently set cputype is not accessible by the user. This way, an
administrator can assign a custom CPU type to a VM for a less privileged
user without breaking edit functionality for them.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
and move the lock call and decision logic closer together
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Tested-by: Fabian Ebner <f.ebner@proxmox.com>
This reverts commit b5490d8a98.
When resizing a volume of a running VM, a qmp block_resize command
is issued. This is non-blocking, so the size on the storage immediately
after issuing the command might still be the old one.
This is part of the issue reported in bug #2621.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
this is only used for migration via 'qm mtunnel', regular users should
never need to resume a VM that does not logically belong to the node it
is running on
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
It was necessary to move foreach_volid back to QemuServer.pm
In VZDump/QemuServer.pm and QemuMigrate.pm the dependency on
QemuConfig.pm was already there, just the explicit "use" was missing.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Just like with live-migration, custom CPU models might change after a
snapshot has been taken (or a VM suspended), which would lead to a
different QEMU invocation on rollback/resume.
Save the "-cpu" argument as a new "runningcpu" option into the VM conf
akin to "runningmachine" and use as override during rollback/resume.
No functional change with non-custom CPU types intended.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
This is required to support custom CPU models, since the
"cpu-models.conf" file is not versioned, and can be changed while a VM
using a custom model is running. Changing the file in such a state can
lead to a different "-cpu" argument on the receiving side.
This patch fixes this by passing the entire "-cpu" option (extracted
from /proc/.../cmdline) as a "qm start" parameter. Note that this is
only done if the VM to migrate is using a custom model (which we can
check just fine, since the <vmid>.conf *is* versioned with pending
changes), thus not breaking any live-migration directionality.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
both where previously missing. the existing 'check_storage_access'
helper is not applicable here since it operates on a full set of VM
config options, not just storage IDs.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
the syntax is backwards compatible, providing a single storage ID or '1'
works like before. the new helper ensures consistent behaviour at all
call sites.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
as preparation for refactoring it further. remote migration will add
another 1-2 parameters, and it is already unwieldly enough as it is.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
on storages where the minimum size of images is bigger than the real
OVMF_VARS.fd file, they get padded to their minimum size
when using such an image, qemu maps it fully to the vm, but the efi
does not find the vars region and creates a file on the first efi
partition it finds
this breaks some settings in the ovmf, such as resolution
to fix this, we have to specify the size for the pflash, so that
qemu only maps the first n bytes in the vm (this only works for
raw files, not for qcow2)
we also have to use the correct size when converting between storages
in 'clone_disk' (used for move disk and cloning vms) and when
live migrating to different storages
when we now expect that the source image is always correctly used/created
(e.g. raw with size=x in pflash argument) then we always create the
target correctly
when encountering users which have a non-valid image (e.g. a efidisk
moved from zfs to qcow2 before this patch), we have to tell them to
recreate the efidisk and the settings on it
we have to version_guard it to 4.1+pve2 (since we haven't bumped yet
since the change to pve2)
also add 2 tests, one for the old version and one for the new
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-by: Stefan Reiter <s.reiter@proxmox.com>
Reviewed-by: Stefan Reiter <s.reiter@proxmox.com>
[ Thomas: rebased to master ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
by only checking for replicatable volumes when a replication job is
defined, and passing only actually replicated volumes to the target node
via STDIN, and back via STDOUT.
otherwise this can pick up theoretically replicatable, but not actually
replicated volumes and treat them wrong.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
For secure live migration with local disks via NBD over a unix socket,
we have to somehow communicate from the source node to the target node
if it supports it. This is because there can only be one NBD server with
exactly one socket bound.
The source node passes that information via STDIN. Support for
'spice_ticket: (...)' is added in addition to 'nbd_protocol_version:
<version>'. As old source nodes send the spice ticket without a prefix,
we still have to have a fallback for this case. New information should
always be passed via a prefix that is matched, otherwise it will be
recognized as spice ticket.
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
With Qemu 4.2 we encountered a problem with unix sockets and SSH socket
forwarding for drive-mirror. It seems the socket gets reopened again and
again after it closes for some reason. This can be worked around by
specifying 'block-job-cancel' instead of 'block-job-complete' when we're
not interested in swapping the disks again from NBD to their original
protocol. This is always the case when we use drive-mirror for live
migrating a VM.
qemu_drive_mirror is used for migration and for clone_disk. All in all
we have 3 cases to handle. Either the 'skip' case which skips the
completion of the job. The 'wait' case which was the default before and
still is when $completion is undefined. And the new 'wait_noswap' case
which is used for the live migration.
If 'wait_noswap' is specified, we issue a 'block-job-cancel' once the block
job is in 'ready' state. This completes the block job without swapping the
disks.
clone_disk always uses 'block-job-cancel' via the qemu_blockjobs_cancel
sub.
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
The initialization for the drive keys in $confdesc is changed
to be a single for-loop iterating over the keys of $drivedesc_hash and
the initialization of the unusedN keys is move to directly below it.
To avoid the need to change all the call sites, functions with more than
a few callers are exported from the submodule and imported into QemuServer.pm.
For callers of the now imported functions within QemuServer.pm, the prefix
PVE::QemuServer is dropped, because it is unnecessary and now even confusing.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
The http-server has a 64KB payload limit for post requests, so note
that explicit even if it's a theoretical maximum as the reamainig
params also need some space in the request
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
'input-data' can be used to pass arbitrary data to a guest when running
an agent command with 'guest-exec'. Most guest-agent implementations
treat this as STDIN to the command given by "path"/"arg", but some go as
far as relying solely on this parameter, and even fail if "path" or
"arg" are set (e.g. Mikrotik Cloud Hosted Router) - thus "command" needs
to be made optional.
Via the API, an arbitrary string can be passed, on the command line ('qm
guest exec'), an additional '--pass-stdin' flag allows to forward STDIN
of the qm process to 'input-data', with a size limitation of 1 MiB to
not overwhelm QMP.
Without 'input-data' (API) or '--pass-stdin' (CLI) behaviour is unchanged.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Because of alignment and rounding in the storage backend, the effective
size might not match the 'newsize' parameter we passed along.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
The description for vm_config was out of date and from the description
for vm_pending it was hard to tell what the difference to vm_config was.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
regression introduced with commit a85ff91b
previously we set $target to undef if it's localnode or localhost, then
we check if node exists.
with regression commit, behaviour changes as we do the node check in
else, but $target may be undef. this causes an error:
no such cluster node ''
Signed-off-by: Oguz Bektas <o.bektas@proxmox.com>
improved readability
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
VM.Audit can see the current config and the list of snapshots
already, so there is no real reason to disallow
the config of snapshots
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
This is the guarantee that this call operates on it's created config.
A VMID cannot be reused afterall. So only remove the guarantee at the
last step, just before throwing up the error message about the clone
failure.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
We clone the source VM firewall config before forking the "realcmd"
worker, but did not mind cleaning it up again if the clone failed
somewhere in the worker.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
wrap around code which can possibly fail in evals to handle them
gracefully, and log errors.
note: this results in a change of behavior in the API. since errors
are handled gracefully instead of "die"ing, when there is a pending
change which cannot be applied for some reason, it will get logged in
the tasklog but the vm will continue booting regardless. the
non-applied change will stay in the pending section of the
configuration.
Signed-off-by: Oguz Bektas <o.bektas@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Do the same as for the "create" case, only trigger the "start after
create/restore" task after the locked "realcmd" was done. Else, the
start can never succeed, it also acquires a lock, but restore only
release it once outside of realcmd.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
bump versioned build-dependency, as qemu-server has tests checking
for errors, and we fixed an grammar error in pve-storage, so we need
the newer version to ensure our test go through
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
only VM.PowerMgmt is not enough, since we allocate space on a storage,
so we need VM.Config.Disk on the vm and Datastore.AllocateSpace on the storage
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Sometimes, a user wants to remove the 'suspended' state without
resuming the vm from that state. Since the vm is locked with
'suspended', this was not possible without help from root@pam
This patch allows to delete the vmstate and the suspended lock and
related config entries with it. The user still has to have the right
priviliges and the vm cannot be 'protected' for this to work
Inspired-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
we did not actually delete the state if we deleted the 'vmstate' config,
leaving stray vmstates on the disks
actually implement the removal, requiring 'VM.Config.Disk' and
'VM.PowerMgmt' privs
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
'pve-qm-machine' is auto-registered, but for re-use for a new
runningmachine we added the newer pve-qemu-machine standard option.
Use that one to avoid confusion.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
With the noerr flag set in parse_volume_id we have to check if
$volname is defined before comparing it to 'cloudinit'.
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
this is useful as meta information for e.g., provisioning or config
management systems
adding the info also to the 'status' api call to make it easier to show
it in the gui
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
QMP and monitor helpers are moved from QemuServer.pm.
By using only vm_running_locally instead of check_running, a cyclic
dependency to QemuConfig is avoided. This also means that the $nocheck
parameter serves no more purpose, and has thus been removed along with
vm_mon_cmd_nocheck.
Care has been taken to avoid errors resulting from this, and
occasionally a manual check for a VM's existance inserted on the
callsite.
Methods have been renamed to avoid redundant naming:
* vm_qmp_command -> qmp_cmd
* vm_mon_cmd -> mon_cmd
* vm_human_monitor_command -> hmp_cmd
mon_cmd is exported since it has many users. This patch also changes all
non-package users of vm_qmp_command to use the mon_cmd helper. Includes
mocking for tests.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>