The new ciupgrade option was missing in $cloudinitoptions in
PVE::API2::Qemu, so $check_vm_modify_config_perm defaulted to
requiring root@pam for modifying the option. To fix this, add
ciupgrade to $cloudinitoptions. This also fixes an issue where
ciupgrade was missing in the output of `qm cloudinit pending`,
as it also relies on $cloudinitoptions.
This issue was originally reported in the forum [0].
Also add a comment to avoid similar issues when adding new options in
the future.
[0]: https://forum.proxmox.com/threads/131043/
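For illustration, a hedged sketch of the kind of change described above;
the other entries and the exact shape of $cloudinitoptions are abbreviated
assumptions and may not match the real code exactly:

# keep $cloudinitoptions in sync with the cloud-init schema, otherwise
# $check_vm_modify_config_perm falls back to requiring root@pam and
# `qm cloudinit pending` silently drops the option
my $cloudinitoptions = {
    cicustom     => 1,
    cipassword   => 1,
    citype       => 1,
    ciupgrade    => 1, # newly added
    ciuser       => 1,
    nameserver   => 1,
    searchdomain => 1,
    sshkeys      => 1,
    # add new cloud-init options here as well when extending the schema
};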
Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
All calling sites except for QemuConfig.pm::get_replicatable_volumes()
already enabled it. Making it the non-configurable default results in a
change in VM replication behavior: a disk image that is only referenced
in the pending section will now also be replicated.
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
Make it possible to optionally iterate over disks in the pending section
of VMs, similar to how snapshots are already handled (see the sketch
after the list below).
This is useful, for example, during migration if we don't want to rely
on scanning all storages.
All calling sites are adapted and enable it, except for
QemuConfig::get_replicatable_volumes, as including pending disks there
would change replication behavior.
The following lists the calling sites and whether they should be fine
with the change (source [0]):
1. QemuMigrate: scan_local_volumes(): needed to include pending disk
images
2. API2/Qemu.pm: check_vm_disks_local() for migration precondition:
related to migration, so more consistent with pending
3. QemuConfig.pm: get_replicatable_volumes(): would change the behavior
of replication, so it will not use it for now.
4. QemuServer.pm: get_vm_volumes(): is used multiple times by:
4a. vm_stop_cleanup() to deactivate/unmap: should also be fine with
including pending
4b. QemuMigrate.pm: in prepare(): part of migration, so more consistent
with pending
4c. QemuMigrate.pm: in phase3_cleanup() for deactivation: part of
migration, so more consistent with pending
[0] https://lists.proxmox.com/pipermail/pve-devel/2023-May/056868.html
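A rough sketch of the idea only; the helper and option names below
(foreach_configured_volume, include_pending) are placeholders for
illustration, not the actual qemu-server API:

use strict;
use warnings;

# iterate over configured disk keys, optionally including those that only
# exist in the pending section (similar to how snapshots are handled)
sub foreach_configured_volume {
    my ($conf, $opts, $func) = @_;

    my $is_disk_key = sub { return $_[0] =~ /^(?:ide|sata|scsi|virtio)\d+$/ };

    for my $key (grep { $is_disk_key->($_) } keys %$conf) {
        $func->($key, $conf->{$key}, 0);
    }

    return if !$opts->{include_pending} || !$conf->{pending};

    for my $key (grep { $is_disk_key->($_) } keys %{$conf->{pending}}) {
        next if exists $conf->{$key}; # already visited above
        $func->($key, $conf->{pending}->{$key}, 1);
    }
}

my $conf = {
    scsi0 => 'local-lvm:vm-100-disk-0,size=32G',
    pending => { scsi1 => 'local-lvm:vm-100-disk-1,size=8G' },
};

foreach_configured_volume($conf, { include_pending => 1 }, sub {
    my ($key, $value, $pending) = @_;
    my $where = $pending ? 'pending' : 'current';
    print "$key ($where): $value\n";
});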
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
to allow early checking of the merged config, if the backup archive
passed in is a proper volume where extraction is possible.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
to avoid duplicate work, always set 'volid' to the backup volume's volid, if it
was successfully parsed as such.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
for offline migration, limit the allowed nodes to the ones where the
mapped resources are available
this adds new info to the API call, namely the 'mapped-resources' list,
as well as the 'unavailable-resources' info in the 'not_allowed_nodes'
object.
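Purely as an illustration, a possible shape of that data; beyond the
field names mentioned above, the layout here is an assumption:

{
    # local resources that are configured via cluster resource mappings
    'mapped-resources' => [ 'hostpci0', 'usb1' ],
    'not_allowed_nodes' => {
        node2 => {
            # mappings with no entry for this particular node
            'unavailable-resources' => [ 'hostpci0' ],
        },
    },
    # ... other, pre-existing precondition fields
}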
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-By: Markus Frank <m.frank@proxmox.com>
this patch allows configuring pci devices that are mapped via cluster
resource mapping when the user has 'Resource.Use' on the ACL path
'/mapping/pci/{ID}' (in addition to the usual required vm config
privileges)
When given multiple mappings in the config, we use them as alternatives
for the passthrough, and will select the first free one on startup.
It uses our regular PCI reservation mechanism for regular devices, and
we introduce a selection mechanism for mediated devices.
A few changes to the inner workings were required to make this work well:
* parse_hostpci now returns a different structure where we have a list
of lists (the first level is for the different alternatives and the
second level is for the different devices that should be passed through
together); see the illustration after this list
* factor out the 'parse_hostpci_devices' which parses each device from
the config and does some precondition checks
* reserve_pci_usage now behaves slightly differently when trying to
reserve a device that is already reserved for the same VMID, since to
check which alternative we can use, we must already reserve one (this
means that qm showcmd can actually reserve devices, albeit only for up
to 10 seconds)
* configuring a mediated device on a multifunction device is not
supported anymore, and results in failure to start (previously, it
just chose the first device to do it). This is a breaking change
* configuring a single pci device twice on different hostpci slots now
fails during commandline generation instead of on qemu start, so we had
to adapt one test where this occurred (it could never have worked
anyway)
* we now check permissions during clone/restore, meaning raw/real
devices can only be cloned/restored by root@pam from now on.
this is a breaking change.
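For clarity, a hedged illustration of the list-of-lists layout from the
first point above; the key names are assumptions, only the nesting is
the point:

# one entry per alternative mapping, each alternative being the list of
# devices that are passed through together (e.g. all functions of a
# multifunction device)
my $hostpci = {
    ids => [
        [ { id => '0000:01:00.0' }, { id => '0000:01:00.1' } ], # alternative 1
        [ { id => '0000:02:00.0' } ],                           # alternative 2
    ],
    # ... other hostpci properties (pcie, rombar, mdev, ...)
};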
Fixes #3574: Improve SR-IOV usability
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-By: Markus Frank <m.frank@proxmox.com>
this patch allows configuring usb devices that are mapped via
cluster resource mapping when the user has 'Mapping.Use' on the ACL
path '/mapping/usb/{ID}' (in addition to the usual required vm config
privileges)
for now, this is only valid if there is exactly one mapping for the
host, since we don't track passed through usb devices yet
This now also checks permissions on clone/restore, meaning a
'non-mapped' device can only be cloned/restored as root@pam user.
That is a breaking change.
Refactor the checks for restoring into a sub, so we have a central place
where we can add such checks (see the sketch below)
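A minimal sketch of what such a central check could look like; the sub
name and the match on the config value are placeholders, only the
privilege and ACL path are the ones mentioned above:

# placeholder helper: decide which privileges restoring a usb device needs
sub check_usb_restore_perm {
    my ($rpcenv, $authuser, $value) = @_;

    if ($value =~ /mapping=([^,\s]+)/) {
        # mapped device: Mapping.Use on the mapping's ACL path is enough
        $rpcenv->check($authuser, "/mapping/usb/$1", ['Mapping.Use']);
    } else {
        # raw host device or port: restoring stays restricted to root@pam
        die "only root\@pam may restore non-mapped usb devices\n"
            if $authuser ne 'root@pam';
    }
}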
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-By: Markus Frank <m.frank@proxmox.com>
similar to how we handle the PCI module and format. This makes the
'verify_usb_device' method and format unnecessary since
we simply check the format with a regex.
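As an illustration, a hedged sketch of such a regex-based check; the
exact patterns in the real format may differ:

use strict;
use warnings;

# accept 'spice' (case-insensitive), a vendor:product id pair, or a
# bus-port path -- illustrative only, not the exact qemu-server format
my $usb_host_re = qr/^(?:
    spice                                     # usb redirection via SPICE
    | (?:0x)?[0-9a-f]{4}:(?:0x)?[0-9a-f]{4}   # vendorid:productid
    | \d+-\d+(?:\.\d+)*                       # bus-port(.port...)
)$/xi;

for my $host ('SPICE', '0x046d:0xc52b', '1-2.3', 'bogus') {
    printf "%s: %s\n", $host, $host =~ $usb_host_re ? 'ok' : 'invalid';
}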
while doing this, I noticed that we don't correctly check for the
case-insensitive variant of 'spice' during hotplug, so fix that too
With this we can also remove some parameters from the get_usb_devices
and get_usb_controllers functions
while we're at it, refactor the permission checks for the usb config too
and use the new 'my sub' style for the functions
also make print_usbdevice_full parse the device itself, so we don't have
to do it in multiple places (especially in places where we don't see
that this is needed)
No functional change intended
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-By: Markus Frank <m.frank@proxmox.com>
This was not only rather inefficient (getting the config from the
archive twice) but also wrong, as we can override options on restore,
so we can only do the check once the backed-up config and the override
config have been merged.
If this is too late from the POV of volume deletion or the like, then
the issue is that those things happen too early, as we can only know what
to do with the actual target config, so destructive actions that
happen before that are wrong by design.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Similar to the corresponding endpoint for containers. Because disks
are involved, this can be a longer running operation, as is also
indicated by the 60-second timeout used in qemu_block_resize(), which
is called by this endpoint.
This is a breaking API change.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Support for this was added in QEMU 5.1 by commit 7fa140abf6 ("qcow2:
Allow resize of images with internal snapshots").
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
This patch partially reverts commit 1b5706cd16,
by reintroducing the old format for return values (key, value, pending,
delete), but drops the "force-delete" return value. Right now, this
endpoint does not conform to its own format, because the return values
are as follows:
{
    key => {
        old => 'foo',
        new => 'bar',
    },
    […]
}
While the format specified is
[
    {
        key => 'baz',
        old => 'foo',
        new => 'bar',
    },
    […]
]
This leads to the endpoint being broken when used through 'qm' and
'pvesh'. Using the API works fine, because the format doesn't get
verified there. Reverting this change brings the advantage that we can
also use PVE::GuestHelpers::format_pending when calling the endpoint
through qm again.
Signed-off-by: Leo Nunner <l.nunner@proxmox.com>
these config keys only affect the cloudinit drive contents (and the
state of the guest inside the VM); they are not used anywhere on the
hypervisor side, so they should not require VM.Config.Network (which
allows a lot more, such as changing vNIC VLAN tags or the bridges they
are connected to).
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
When rolling back to the snapshot of a VM that includes RAM, the VM
gets started by the rollback task anyway, so no additional start task
is needed. Previously, when rolling back with the start parameter and
the VM snapshot included RAM, a start task was created. That task
failed because the VM had already been started by the rollback task.
Additionally, document this behaviour in the description of the start
parameter.
Signed-off-by: Stefan Hanreich <s.hanreich@proxmox.com>
was only explained in git history and vm_stop; add comments in other
relevant places to avoid future breakage.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
When applying the series introducing those calls, the helper was moved
to pve-common's CGroup.pm (see 07c10d5 ("cgroup: move get_cpuunits
helper from qemu-server as clamp_cpu_shares") in pve-common) instead
of pve-guest-common's GuestHelpers.pm. But these calls were not
updated.
Reported in the community forum:
https://forum.proxmox.com/threads/118267
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
which wraps the remote_migrate_vm API endpoint, but itself does the
precondition checks that can be done up front.
this now just leaves the fingerprint retrieval and target node name
lookup to the sync part of the API endpoint, which should be doable in
<30s.
an example invocation:
$ qm remote-migrate 1234 4321 'host=123.123.123.123,apitoken=PVEAPIToken=user@pve!incoming=aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee,fingerprint=aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb' --target-bridge vmbr0 --target-storage zfs-a:rbd-b,nfs-c:dir-d,zfs-e --online
will migrate the local VM 1234 to the host 123.123.123.123 using the
given API token, mapping the VMID to 4321 on the target cluster, all its
virtual NICs to the target vm bridge 'vmbr0', any volumes on storage
zfs-a to storage rbd-b, any on storage nfs-c to storage dir-d, and any
other volumes to storage zfs-e. the source VM will be stopped but remain
on the source node/cluster after the migration has finished.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
entry point for the remote migration on the source side, mainly
preparing the API client that gets passed to the actual migration code
and doing some parameter parsing.
querying of the remote side's resources (like available storages, free
VMIDs, lookup of endpoint details for specific nodes, ...) should be
done by the client - see next commit with CLI example.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
remote migration uses a websocket connection to a task worker running on
the target node instead of commands via SSH to control the migration.
this websocket tunnel is started earlier than the SSH tunnel, and allows
adding UNIX-socket forwarding over additional websocket connections
on-demand.
the main differences to regular intra-cluster migration are:
- source VM config and disks are only removed upon request via --delete
- shared storages are treated like local storages, since we can't
assume they are shared across clusters (with potential to extend this
by marking storages as shared)
- NBD migrated disks are explicitly pre-allocated on the target node via
tunnel command before starting the target VM instance
- in addition to storages, network bridges and the VMID itself are
transformed via a user-defined mapping
- all commands and migration data streams are sent via a WS tunnel proxy
- pending changes and snapshots are discarded on the target side (for
the time being)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
the following two endpoints are used for migration on the remote side:
POST /nodes/NODE/qemu/VMID/mtunnel
which creates and locks an empty VM config, and spawns the main qmtunnel
worker which binds to a VM-specific UNIX socket.
this worker handles JSON-encoded migration commands coming in via this
UNIX socket:
- config (set target VM config)
-- checks permissions for updating config
-- strips pending changes and snapshots
-- sets (optional) firewall config
- disk (allocate disk for NBD migration)
-- checks permission for target storage
-- returns drive string for allocated volume
- disk-import, query-disk-import, bwlimit
-- handled by PVE::StorageTunnel
- start (returning migration info)
- fstrim (via agent)
- ticket (creates a ticket for a WS connection to a specific socket)
- resume
- stop
- nbdstop
- unlock
- quit (+ cleanup)
this worker serves as a replacement for both 'qm mtunnel' and various
manual calls via SSH. the API call will return a ticket valid for
connecting to the worker's UNIX socket via a websocket connection.
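Purely for illustration, a generic sketch of JSON-per-line command
dispatch over such a UNIX socket; the socket path, command names and
reply fields below are assumptions and not the actual mtunnel wire
format:

use strict;
use warnings;
use IO::Socket::UNIX;
use JSON::PP qw(encode_json decode_json);

# send one JSON-encoded command per line, read one JSON-encoded reply per line
sub tunnel_cmd {
    my ($sock, $cmd, %args) = @_;
    print {$sock} encode_json({ cmd => $cmd, %args }), "\n";
    my $reply = <$sock> // die "tunnel closed unexpectedly\n";
    return decode_json($reply);
}

# hypothetical socket path for VM 1234's mtunnel worker
my $sock = IO::Socket::UNIX->new(Peer => '/run/qemu-server/1234.mtunnel')
    or die "connect failed: $!\n";

# e.g. request a ticket bound to a specific forwarded socket path
my $res = tunnel_cmd($sock, 'ticket', path => '/run/qemu-server/1234_nbd');
print "got ticket\n" if $res->{ticket};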
GET+WebSocket upgrade /nodes/NODE/qemu/VMID/mtunnelwebsocket
gets called for connecting to a UNIX socket via websocket forwarding,
i.e. once for the main command mtunnel, and once each for the memory
migration and each NBD drive-mirror/storage migration.
access is guarded by a short-lived ticket binding the authenticated user
to the socket path. such tickets can be requested over the main mtunnel,
which keeps track of socket paths currently used by that
mtunnel/migration instance.
each command handler should check privileges for the requested action if
necessary.
both mtunnel and mtunnelwebsocket endpoints are not proxied; the
client/caller is responsible for ensuring that the passed 'node'
parameter and the endpoint handling the call match.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
from GuestHelpers. This function checks all necessary permissions and
raises an exception if the user does not have the correct ones.
This is necessary for the new 'privileged' tags and 'user-tag-access'
permissions to work.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
The permission checks for editing Cloud-init drives were inconsistent:
adding a drive required the VM.Config.Disk permission, while removing
one required VM.Config.CDROM. The regex in drive_is_cloudinit needed to
be adapted since the drive names have different formats before/after
they are actually generated (see the sketch below).
Because the regex previously let these names fall through, Cloud-init
drives were being checked as disks, even though they are actually
treated as CDROM drives. It therefore makes more sense to check for
VM.Config.CDROM instead, while also requiring VM.Config.Cloudinit,
since generating a Cloud-init drive already generates default values
that are passed to the VM.
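A hedged sketch of matching both name forms (e.g. 'local:cloudinit'
before generation vs. 'local:vm-100-cloudinit' afterwards); the real
drive_is_cloudinit pattern may differ:

use strict;
use warnings;

# match both the not-yet-generated and the generated volume name forms;
# the exact pattern in qemu-server's drive_is_cloudinit may differ
sub looks_like_cloudinit {
    my ($volid) = @_;
    return $volid =~ m{(?:^|[:/])(?:vm-\d+-)?cloudinit(?:\.\w+)?$};
}

for my $volid ('local:cloudinit', 'local-lvm:vm-100-cloudinit', 'local-lvm:vm-100-disk-0') {
    printf "%s -> %s\n", $volid, looks_like_cloudinit($volid) ? 'cloud-init' : 'other';
}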
Signed-off-by: Leo Nunner <l.nunner@proxmox.com>
- The forced-remove flag wasn't really used AFAICT and makes
no sense IMO.
- Whether or not we care about non-MAC changes does not
belong here, but should instead be taken into account in the
actual hotplug path recording the cloud-init state (iow.
into $cloudinit_record_changed()).
(This is not done here atm.)
- It seems much simpler to just have (see the sketch below):
* 'old' = the old value if it's not a new value
* 'new' = the new value unless it's being deleted
* If only one of them is set, it's an addition or removal.
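A minimal sketch of what consuming these two keys then looks like; the
surrounding data structure is assumed here just for illustration:

use strict;
use warnings;

# classify a pending cloud-init change based only on which keys are set
sub classify_change {
    my ($change) = @_;
    return 'addition' if !defined($change->{old}) && defined($change->{new});
    return 'removal'  if defined($change->{old}) && !defined($change->{new});
    return 'modification';
}

print classify_change({ new => 'ide2: local:cloudinit' }), "\n"; # addition
print classify_change({ old => 'nameserver=1.1.1.1' }),    "\n"; # removal
print classify_change({ old => 'foo', new => 'bar' }),     "\n"; # modification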
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
This allows regenerating the config drive with a single API call.
It also avoids having to delete the drive first and recreate it again.
As it is a read-only drive, we can simply live-update it
and eject/replace it via the QEMU monitor.
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
While the clamping already happens before setting the actual systemd
CPU{Shares, Weight}, it can be done here too, to avoid writing new
out-of-range values into the config.
Can't use a validator enforcing this because existing out-of-range
values should not become errors upon parsing the config.
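A sketch of the clamping meant here, written out with the usual cgroup
ranges stated as an assumption (cgroup v2 weight 1..10000, cgroup v1
shares 2..262144); the real helper lives in pve-common as
clamp_cpu_shares:

use strict;
use warnings;

# clamp a configured cpuunits value into the valid range for the cgroup
# version in use; the ranges below are assumptions for illustration
sub clamp_cpuunits {
    my ($value, $cgroupv2) = @_;
    my ($min, $max) = $cgroupv2 ? (1, 10_000) : (2, 262_144);
    $value = $min if $value < $min;
    $value = $max if $value > $max;
    return $value;
}

print clamp_cpuunits(500_000, 1), "\n"; # 10000 on a cgroup v2 host
print clamp_cpuunits(0, 0), "\n";       # 2 on a cgroup v1 host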
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
This will break possibly existing workflows like
1. add second cloud-init
2. remove first cloud-init
to change the cloud-init storage.
On the other hand, it avoids unintended misconfiguration of having
multiple cloud-init drives with potentially different settings.
Also in preparation for adding cloud-init-related API calls, where
not being able to assume that there's only one cloud-init drive/state
would complicate things quite a bit.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Prevent the user from suspending the VM at all: while suspension
itself may finish, the saved state is incomplete, as we can neither
save nor restore PCIe device state in any generic fashion, so
resuming will almost certainly break.
The single case where it could work is when the guest OS doesn't use
the passed-through device at all, so there is no state, but that's
really odd (why bother passing it through then), and the user should
rather remove the hostpci entry in that case.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
[ T: reword commit message slightly ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
if the configured display hardware has the (optional) default type, but
some other attribute is set, this would match against `undef` and spew
lots of warnings in the logs.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Generalizes fd95d780 ("migrate: send updated TPM state volid to target
node") to also handle other offline migrated disks appearing in the
VM config, which currently should only be cloud-init.
Breaks migration new -> old under similar (edge-case-)conditions as
fd95d780 did.
Keep sending the 'tpmstate0' STDIN parameter to avoid breaking new ->
old in the scenario fd95d780 fixed.
Keep parsing the vm_start 'tpmstate0' STDIN parameter to avoid
breaking old -> new, and to be able to keep sending it.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
In preparation to allow passing along certain parameters together with
'archive'. Move the parameter checks to after the conflicts-with-'archive'
check to ensure that the more telling error triggers first.
All check helpers should handle empty params fine, but check first
just to make sure and to avoid all the superfluous function calls.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>