Commit Graph

319 Commits

Author SHA1 Message Date
Dominik Csapak
48ada6982f pci: mdev: adapt to NVIDIA's modern interface with kernel >= 6.8
Since kernel 6.8, NVIDIAs vGPU driver does not use the generic mdev
interface anymore, since they relied on a feature there which is not
available anymore. IIUC the kernel [0] recommends drivers to implement
their own device specific features since putting all in the generic one
does not make sense.

They now have an 'nvidia' folder in the device sysfs path, which
contains the files `creatable_vgpu_types`/`current_vgpu_type` to
control the virtual functions model, and then the whole virtual function
has to be passed through (although without resetting and changing to the
vfio-pci driver).

This patch implements changes so that from a config perspective, it
still is an mediated device, and we map the functionality iff the device
has no mediated devices but the new NVIDIAs sysfsapi and the model name
is 'nvidia-<..>'

It behaves a bit different than mdevs and normal pci passthrough, as we
have to choose the correct device immediately since it's bound to the
pciid, but we must not bind the device to vfio-pci as the NVIDIA driver
implements this functionality itself.

When cleaning up, we iterate over all reserved devices (since for a
mapping we can't know at this point which was chosen besides looking at
the reservations) and reset the vgpu model to '0', so it frees up the
reservation from NVIDIAs side. (We also do that in a loop, since it's
not always immediately ready after QEMU closes)

A general problem (but that was previously also the case) is that a
showcmd (for a not running guest) reserves the pciids, which might block
an execution of a different real vm. This is now a bit more problematic
as we (temporarily) set the vgpu type then.

0: https://docs.kernel.org/driver-api/vfio-pci-device-specific-driver-acceptance.html

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-by: Christoph Heiss <c.heiss@proxmox.com>
Reviewed-by: Christoph Heiss <c.heiss@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2024-10-24 18:43:52 +02:00
Dominik Csapak
d7fe48e9aa pci: device reservation: allow one to only free a subset of IDs
Add an optional parameter to the helper that removes PCI reservations
so that we can partially release IDs again. This will be necessary for
NVIDIAs new sysfs api

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-by: Christoph Heiss <c.heiss@proxmox.com>
Reviewed-by: Christoph Heiss <c.heiss@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2024-10-24 18:40:55 +02:00
Dominik Csapak
fc23c72a42 pci: device selection: don't reserve PCI IDs when VM is already running
Since the only way this could happen is when we're being called
from 'qm showcmd' and there we don't want to reserve or create anything.

In case the VM was not running, we actually reserve the devices, so we
want to call 'cleanup_pci_devices' after to remove those again. This
minimizes the timespan where those devices are not available for real vm
starts.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-by: Christoph Heiss <c.heiss@proxmox.com>
Reviewed-by: Christoph Heiss <c.heiss@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2024-10-24 18:39:37 +02:00
Maximiliano Sandoval
be8c868f0c fix typos in user-visible strings
This includes docs, and strings printed to stderr or stdout.

These were caught with:

    typos --exclude test --exclude changelog

Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
2024-10-24 13:15:06 +02:00
Fiona Ebner
84b4bc9ab1 move helper to check running QEMU version out of the 'Machine' module
The version of the running QEMU binary is not related to the machine
version and so it's a bit confusing to have the helper in the
'Machine' module. It cannot live in the 'Helpers' module, because that
would lead to a cyclic inclusion Helpers <-> Monitor. Thus,
'QMPHelpers' is chosen as the new home.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-07-30 21:19:51 +02:00
Mira Limbeck
f63cc6dbeb fix 4493: cloud-init: fix generated Windows config
Cloudbase-Init, a cloud-init reimplementation for Windows, supports only
a subset of the configuration options of cloud-init. Some features
depend on support by the Metadata Service (ConfigDrive2 here) and have
further limitations [0].

To support a basic setup the following changes were made:
 - password is saved as plaintext for any Windows guests (ostype)
 - DNS servers are added to each of the interfaces
 - SSH public keys are passed via metadata

Network and metadata generation for Cloudbase-Init is separate from the
default ConfigDrive2 one so as to not interfere with any other OSes that
depend on the current ConfigDrive2 implementation.

DNS search domains were removed because Cloudbase-Init's ENI parser
doesn't handle it at all.
The password set via `cipassword` is used for the Admin user configured
in the cloudbase-init.conf in the guest while the `ciuser` parameter is
ignored. The Admin user has to be set in the cloudbase-init.conf file
instead.
Specifying a different user does not work.

For the password to work the `ostype` needs to be any Windows variant
before `cipassword` is set. Otherwise the password will be encrypted and
the encrypted password used as plaintext password in the guest.

The `citype` needs to be `configdrive2`, which is the default for
Windows guests, for the generated configs to be compatible with
Cloudbase-Init.

[0] https://cloudbase-init.readthedocs.io/en/latest/index.html

Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
2024-07-30 19:49:28 +02:00
Wolfgang Bumiller
a38204c14b fix #5528: override cgroup methods to call systemd via dbus
Systemd reapplies its known values on reload, so we cannot simply call
into PVE::CGroup. Call systemd's SetUnitProperties method via dbus
instead.

The hotplug and startup code also calculated different values, as one
operated within systemd's value framework (documented in
systemd.resource-control(5)) and one worked with cgroup values
(distinguishing between cgroup v1 and v2 manually).

This is now unified by overriding `change_cpu_quota()` and
`change_cpu_shares()` via `PVE::QemuServer::CGroup` which now takes
systemd-based values and sends those directly via dbus.

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2024-07-23 08:05:53 +02:00
Fiona Ebner
87084b18dd drive: tpm: fix default version in schema
Since the check in start_swtpm() only checks for an explicitly
configured v2.0 to opt-in to version 2, the actual default is v1.2
and not v2.0 like the schema stated.

Of course, it would be nicer to have the default be v2.0, but changing
the check to use that default would break any TPM state without an
explicitly configured version.

There doesn't seem to be any code beside start_swtpm() accessing the
version.

Fixes: f9dde219 ("fix #3075: add TPM v1.2 and v2.0 support via swtpm")
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-07-01 10:37:48 +02:00
Fiona Ebner
4ee69c54f6 monitor: allow passing timeout for a HMP command
Passing the timeout key with an explicit value of undef is fine,
because both the absence of the timeout key and an explicit value of
undef will lead to $timeout being undef in the qmp_cmd() function.

In preparation to increase the timeout for certain (e.g. disk-related)
HMP commands.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-06-11 13:56:44 +02:00
Filip Schauer
67dca4238b cpu config: fix get_cpu_bitness always reverting to default cpu type
This fixes the broken prevention of starting a VM with a 32-bit CPU
using a 64-bit OVMF (UEFI) BIOS.

Fixes: 89d5b1c9 ("prevent starting a 32-bit VM using a 64-bit OVMF BIOS")
Signed-off-by: Filip Schauer <f.schauer@proxmox.com>
[FE: add Fixes trailer, add prefix to title]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-04-24 11:37:28 +02:00
Thomas Lamprecht
43569a32ae api: create vm: fix missing import for serializing machine type
The machine handling was transformed into a full fledged property
string with a (sub) format, but the single call-site for print_machine
was seemingly not tested, as this could have never worked due to a
missing import of the print_property_string helper.

Fixes: 8082eb8 ("config: define machine schema as property-string")
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2024-04-20 12:27:30 +02:00
Markus Frank
2db4c27283 fix #3784: config: Parameter for guest vIOMMU + test-cases
vIOMMU enables the option to passthrough pci devices to L2 VMs in L1
VMs via Nested Virtualisation and adds an extra isolation.

Uses the new property-string from the "config: define machine schema
as property-string"-commit to add the viommu option to the machine
parameter.

Currently there are two vIOMMU implementation in QEMU to choose:
intel or virtio

Virtio-iommu is more recent but less used in production than intel-iommu.

The assert_valid_machine_property function prevents using intel-iommu with
i440fx.

Signed-off-by: Markus Frank <m.frank@proxmox.com>
 [ TL: tiny coding style fix to extract variable inside if expr ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2024-04-11 16:40:17 +02:00
Markus Frank
8082eb8ca1 config: define machine schema as property-string
Convert the machine parameter to a property-string and use the machine
type as the default key for backward compatibility.

Signed-off-by: Markus Frank <m.frank@proxmox.com>
2024-04-11 10:18:27 +02:00
Hannes Duerr
6906c2ab33 drive: style fix the name of the get_scsi_device_type method
Signed-off-by: Hannes Duerr <h.duerr@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2024-04-10 13:56:50 +02:00
Hannes Duerr
7e2065956f fix #5363: cloudinit: make creation of scsi cloudinit discs possible again
Upon obtaining the device type, a check is performed to determine if it
is a CD drive. It is important to note that Cloudinit drives are always
assigned as CD drives. If the drive has not yet been allocated, the test
will fail due to the unset cd attribute.
To avoid this, an explicit check is now performed to determine if it is
a Cloudinit drive that has not yet been assigned.

Fixes: d1feab4 ("fix #4957: add vendor and product information passthrough for SCSI-Disks")
Signed-off-by: Hannes Duerr <h.duerr@proxmox.com>
2024-04-10 13:56:41 +02:00
Filip Schauer
caa88bc80a cpu config: die on hotplug of non x86_64 CPUs
When attempting a CPU hotplug on an architecture other than x86_64, die
with a clean error instead of attempting a hotplug with a known
non-working device command line. Also move the corresponding FIXME up to
the error.

Signed-off-by: Filip Schauer <f.schauer@proxmox.com>
2024-03-14 14:01:21 +01:00
Fiona Ebner
f6039cedf1 disk import: warn when fallback is used instead of requested format
Might avoid some confusion. Reported in the community forum:
https://forum.proxmox.com/threads/142988/

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-03-14 14:01:12 +01:00
Thomas Lamprecht
5ebb6018ed cpu config: implement is_native_arch locally for now
could be a better fit in PVE::Tools, like proposed by Filip, but OTOH.
Tools is already crowded as is, so wait if we need it on more places
outside of qemu-server.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2024-03-08 15:16:40 +01:00
Filip Schauer
d1a7abd07c cpu config: Unify the default value for 'kvm'
Make the default value for 'kvm' consistent, taking into account
whether the VM will run on the same CPU architecture as the host.

This would be a breaking change to CPU hotplug for VMs with a
different CPU architecture running on an x86_64 host, as in this case
the default CPU type for CPU hotplug changes from 'kvm64' to 'qemu64'.
However, CPU hotplug of non x86_64 architectures is not supported
anyway, so this is not a breaking change after all.

It should be noted that this change does alter the CPU hotplug
behaviour when emulating an x86_64 CPU on a non-x86_64 host. This is
however not officially supported in Proxmox VE.

Signed-off-by: Filip Schauer <f.schauer@proxmox.com>
2024-03-08 14:24:37 +01:00
Filip Schauer
89d5b1c90b prevent starting a 32-bit VM using a 64-bit OVMF BIOS
Instead of starting a VM with a 32-bit CPU type and a 64-bit OVMF image,
throw an error before starting the VM telling the user that OVMF is not
supported on 32-bit CPU types.

To obtain a list of 32-bit CPU types, refer to the builtin_x86_defs in
target/i386/cpu.c of QEMU. Exclude any entries that have the long mode
feature (CPUID_EXT2_LM).

Signed-off-by: Filip Schauer <f.schauer@proxmox.com>
2024-03-08 14:24:32 +01:00
Filip Schauer
5416ff700f cpu config: add helper to get the default CPU type
Signed-off-by: Filip Schauer <f.schauer@proxmox.com>
2024-03-08 14:24:32 +01:00
Fabian Grünbichler
9946d6fa57 fix #4085: properly activate cicustom storage(s)
PVE::Storage::path() neither activates the storage of the passed-in volume, nor
does it ensure that the returned value is actually a file or block device, so
this actually fixes two issues. PVE::Storage::abs_filesystem_path() actually
takes care of both, while still calling path() under the hood (since $volid
here is always a proper volid, unless we change the cicustom schema at some
point in the future).

Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2024-01-31 12:28:46 +01:00
Fiona Ebner
613fed51fa drive: product/vendor: add comment with rationale for limits
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-01-26 11:28:32 +01:00
Hannes Duerr
d1feab4aa2 fix #4957: add vendor and product information passthrough for SCSI-Disks
adds vendor and product information for SCSI devices to the json schema
and checks in the VM create/update API call if it is possible to add
these to QEMU as a device option

Signed-off-by: Hannes Duerr <h.duerr@proxmox.com>
[FE: add missing space to exception message
     use config option for exception e.g. scsi0 rather than 'product'
     style fixes]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2024-01-26 11:12:58 +01:00
Hannes Duerr
a17c753528 drive: Create get_scsi_devicetype
Encapsulation of the functionality for determining the scsi device type
in a new function for reusability in QemuServer/Drive.pm

Signed-off-by: Hannes Duerr <h.duerr@proxmox.com>
2024-01-05 16:26:25 +01:00
Hannes Duerr
028aee4554 Move NEW_DISK_RE to QemuServer/Drive.pm
Prepare for introduction of new helper

Signed-off-by: Hannes Duerr <h.duerr@proxmox.com>
2024-01-05 16:26:25 +01:00
Hannes Duerr
ad464e0b87 Move path_is_scsi to QemuServer/Drive.pm
Prepare for introduction of new helper

Signed-off-by: Hannes Duerr <h.duerr@proxmox.com>
2024-01-05 16:26:25 +01:00
Alexandre Derumier
2f2da05217 cpu config: add QEMU 8.1 cpu models
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
[FE: add prefix to commit title, capitalize QEMU]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-12-12 11:27:42 +01:00
Fiona Ebner
da84155405 machine: get current: add flag if current machine is deprecated in list context
Will be used for a warning.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-11-12 18:48:01 +01:00
Fiona Ebner
be690b7a91 machine: get current: return early from loop if possible
No point iterating through the rest if we already got the current
machine.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-11-12 18:48:01 +01:00
Fiona Ebner
7d6a629269 machine: get current: make it clear that pve-version only exists for the current machine
by adding a comment and grouping the code better. See the PVE QEMU
patch "PVE: Allow version code in machine type" for reference. The way
the code was written previously made it look like a bug where
$pve_version might be overwritten multiple times.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-11-12 18:48:01 +01:00
Fiona Ebner
081eed3b79 machine: get current: improve naming and style
No functional change intended.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-11-12 18:48:01 +01:00
Friedrich Weber
95f1de689e vm start: set higher timeout if using PCI passthrough
The default VM startup timeout is `max(30, VM memory in GiB)` seconds.
Multiple reports in the forum [0] [1] and the bug tracker [2] suggest
this is too short when using PCI passthrough with a large amount of VM
memory, since QEMU needs to map the whole memory during startup (see
comment #2 in [2]). As a result, VM startup fails with "got timeout".

To work around this, set a larger default timeout if at least one PCI
device is passed through. The question remains how to choose an
appropriate timeout. Users reported the following startup times:

ref | RAM | time  | ratio (s/GiB)
---------------------------------
[1] | 60G |  135s |  2.25
[1] | 70G |  157s |  2.24
[1] | 80G |  277s |  3.46
[2] | 65G |  213s |  3.28
[2] | 96G | >290s | >3.02

The data does not really indicate any simple (e.g. linear)
relationship between RAM and startup time (even data from the same
source). However, to keep the heuristic simple, assume linear growth
and multiply the default timeout by 4 if at least one `hostpci[n]`
option is present, obtaining `4 * max(30, VM memory in GiB)`. This
covers all cases above, and should still leave some headroom.

[0]: https://forum.proxmox.com/threads/83765/post-552071
[1]: https://forum.proxmox.com/threads/126398/post-592826
[2]: https://bugzilla.proxmox.com/show_bug.cgi?id=3502

Suggested-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
2023-10-06 18:12:07 +02:00
Alexandre Derumier
7f8c808772 add memory parser
In preparation to add more properties to the memory configuration like
maximum hotpluggable memory and whether virtio-mem devices should be
used.

This also allows to get rid of the cyclic include of PVE::QemuServer
in the memory module.

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
[FE: also convert new usage in get_derived_property
     remove cyclic include of PVE::QemuServer
     add commit message]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-09-18 17:08:48 +02:00
Fiona Ebner
87b0f305e7 introduce QMPHelpers module
moving qemu_{device,object}{add,del} helpers there for now.

In preparation to remove the cyclic include of PVE::QemuServer in the
memory module and generally for better modularity in the future.

No functional change intended.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-09-18 17:08:48 +02:00
Fiona Ebner
2951c45ec0 memory: replace deprecated check_running() call
PVE::QemuServer::check_running() does both
PVE::QemuConfig::assert_config_exists_on_node()
PVE::QemuServer::Helpers::vm_running_locally()

The former one isn't needed here when doing hotplug, because the API
already assert that the VM config exists. It also would introduce a
new cyclic dependency between PVE::QemuServer::Memory <->
PVE::QemuConfig with the proposed virtio-mem patch set.

In preparation to remove the cyclic include of PVE::QemuServer in the
memory module.

No functional change intended.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-09-18 17:08:48 +02:00
Fiona Ebner
0e32bf5bc2 move NUMA-related code into memory module
which is the only user of the parse_numa() helper. While at it, avoid
the duplication of MAX_NUMA.

In preparation to remove the cyclic include of PVE::QemuServer in the
memory module.

No functional change intended.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-09-18 17:08:48 +02:00
Fiona Ebner
261b67e4aa move parse_number_sets() helper to helpers module
In preparation to move parse_numa() to the memory module.

No functional change intended.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-09-18 17:08:48 +02:00
Filip Schauer
6d1ac42b95 drive: Fix typo in description of efitype
Signed-off-by: Filip Schauer <f.schauer@proxmox.com>
2023-09-06 11:55:11 +02:00
Alexandre Derumier
a132d50ea3 memory: use static_memory in foreach_dimm
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
2023-09-01 12:46:02 +02:00
Fiona Ebner
ec11b92abb cloudinit: restore previous default for package upgrades
Commit efa3355d ("fix #3428: cloudinit: add parameter for upgrade on
boot") changed the default, but this is a breaking change. The bug
report was only about making the option configurable.

The commit doesn't give an explicit reason for why, and arguably,
doing the upgrade is not an issue for most users. It also leads to a
different cloud-init instance ID, because of the different setting,
which in turn leads to ssh host key regeneration within the VM.

Reported-by: Friedrich Weber <f.weber@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-06-21 12:40:58 +02:00
Alexandre Derumier
c961ab6d80 cpuconfig: add missing qemu 8.0 cpu models
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
2023-06-19 17:02:00 +02:00
Dominik Csapak
9b71c34d61 enable cluster mapped PCI devices for guests
this patch allows configuring pci devices that are mapped via cluster
resource mapping when the user has 'Resource.Use' on the ACL path
'/mapping/pci/{ID}' (in  addition to the usual required vm config
privileges)

When given multiple mappings in the config, we use them as alternatives
for the passthrough, and will select the first free one on startup.
It is using our regular pci reservation mechanism for regular devices and
we introduce a selection mechanism for mediated devices.

A few changes to the inner workings were required to make this work well:
* parse_hostpci now returns a different structure where we have a list
  of lists (first level is for the different alternatives and second
  level is for the different devices that should be passed through
  together)
* factor out the 'parse_hostpci_devices' which parses each device from
  the config and does some precondition checks
* reserve_pci_usage now behaves slightly different when trying to
  reserve an device with the same VMID that's already reserved for,
  since for checking which alternative we can use, we already must
  reserve one (this means that qm showcmd can actually reserve devices,
  albeit only for up to 10 seconds)
* configuring a mediated device on a multifunction device is not
  supported anymore, and results in failure to start (previously, it
  just chose the first device to do it). This is a breaking change
* configuring a single pci device twice on different hostpci slots now
  fails during commandline generation instead on qemu start, so we had
  to adapt one test where this occurred (it could never have worked
  anyway)
* we now check permissions during clone/restore, meaning raw/real
  devices can only be cloned/restored by root@pam from now on.
  this is a breaking change.

Fixes #3574: Improve SR-IOV usability
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-By:  Markus Frank <m.frank@proxmox.com>
2023-06-16 16:24:02 +02:00
Dominik Csapak
e3971865b4 enable cluster mapped USB devices for guests
this patch allows configuring usb devices that are mapped via
cluster resource mapping when the user has 'Mapping.Use' on the ACL
path '/mapping/usb/{ID}' (in addition to the usual required vm config
privileges)

for now, this is only valid if there is exactly one mapping for the
host, since we don't track passed through usb devices yet

This now also checks permissions on clone/restore, meaning a
'non-mapped' device can only be cloned/restored as root@pam user.
That is a breaking change.

Refactor the checks for restoring into a sub, so we have central place
where we can add such checks

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-By:  Markus Frank <m.frank@proxmox.com>
2023-06-16 16:24:02 +02:00
Dominik Csapak
0cf8d56c6d usb: refactor usb code and move some into USB module
similar to how we handle the PCI module and format. This makes the
'verify_usb_device' method and format unnecessary since
we simply check the format with a regex.

while doing tihs, i noticed that we don't correctly check for the
case-insensitive variant for 'spice' during hotplug, so fix that too

With this we can also remove some parameters from the get_usb_devices
and get_usb_controllers functions

while were at it, refactor the permission checks for the usb config too
and use the new 'my sub' style for the functions

also make print_usbdevice_full parse the device itself, so we don't have
to do it in multiple places (especially in places where we don't see
that this is needed)

No functional change intended

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-By:  Markus Frank <m.frank@proxmox.com>
2023-06-16 16:24:02 +02:00
Thomas Lamprecht
56d28037b5 helpers: actualy future proof and allow also checking releases
ensuring the editor state is saved helps -.-

Reported-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-06-16 13:53:52 +02:00
Thomas Lamprecht
e17312c99f helpers: future proof and allow also checking releases
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-06-16 13:26:43 +02:00
Thomas Lamprecht
9c6eabf028 fix #4784: helpers: cope with native versions in manager version check
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-06-16 13:22:40 +02:00
Alexandre Derumier
1359e23fb4 cpuconfig: add new x86-64-vX models
https://gitlab.com/x86-psABIs/x86-64-ABI/
https://lists.gnu.org/archive/html/qemu-devel/2021-06/msg01592.html
"
In 2020, AMD, Intel, Red Hat, and SUSE worked together to define
three microarchitecture levels on top of the historical x86-64
baseline:

  * x86-64:    original x86_64 baseline instruction set
  * x86-64-v2: vector instructions up to Streaming SIMD
               Extensions 4.2 (SSE4.2)  and Supplemental
               Streaming SIMD Extensions 3 (SSSE3), the
               POPCNT instruction, and CMPXCHG16B
  * x86-64-v3: vector instructions up to AVX2, MOVBE,
               and additional bit-manipulation instructions.
  * x86-64-v4: vector instructions from some of the
               AVX-512 variants.
"

This patch add new builtin model derivated from qemu64 model,
to be compatible between intel/amd.

mandatory flags from qemu-doc generator:
https://gitlab.com/qemu/qemu/-/blob/master/scripts/cpu-x86-uarch-abi.py

levels = [
    [ # x86-64 baseline
        "cmov",
        "cx8",
        "fpu",
        "fxsr",
        "mmx",
        "syscall",
        "sse",
        "sse2",
    ],
    [ # x86-64-v2
        "cx16",
        "lahf-lm",
        "popcnt",
        "pni",
        "sse4.1",
        "sse4.2",
        "ssse3",
    ],
    [ # x86-64-v3
        "avx",
        "avx2",
        "bmi1",
        "bmi2",
        "f16c",
        "fma",
        "abm",
        "movbe",
	"xsave"  #missing from qemu doc currently
    ],
    [ # x86-64-v4
        "avx512f",
        "avx512bw",
        "avx512cd",
        "avx512dq",
        "avx512vl",
    ],
]

x86-64-v1 : I'm skipping it, as it's basicaly qemu64|kvm64 -vme,-cx16 for compat Opteron_G1 from 2004
            so will use it as qemu64|kvm64 is higher are not working on opteron_g1 anyway

x86-64-v2 : Derived from qemu, +popcnt;+pni;+sse4.1;+sse4.2;+ssse3

min intel: Nehalem
min amd : Opteron_G3

x86-64-v2-AES : Derived from qemu, +aes;+popcnt;+pni;+sse4.1;+sse4.2;+ssse3

min intel: Westmere
min amd : Opteron_G3

x86-64-v3 : Derived from qemu64 +aes;+popcnt;+pni;+sse4.1;+sse4.2;+ssse3;+avx;+avx2;+bmi1;+bmi2;+f16c;+fma;+abm;+movbe+xsave

min intel: Haswell
min amd : EPYC_v1

x86-64-v4 : Derived from qemu64 +aes;+popcnt;+pni;+sse4.1;+sse4.2;+ssse3;+avx;+avx2;+bmi1;+bmi2;+f16c;+fma;+abm;+movbe;+xsave;+avx512f;+avx512bw;+avx512cd;+avx512dq;+avx512vl

min intel: Skylake
min amd : EPYC_v4

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
2023-06-12 17:30:11 +02:00
Leo Nunner
3e546c5ada cloudinit: pass through hostname via fqdn field
If no FQDN is provided, we simply set it to the current hostname. This
ensures that the hostname *really* gets set, since we encountered an
issue on Fedora and CentOS based systems where no hostname got set at
all.

When there's no FQDN set in the cloudinit config, this leads to the
following entry:

    127.0.1.1 <hostname> <hostname>

Which doesn't seem to cause any issues.

Tested on:
 - Ubuntu 23.04
 - CentOS 8
 - Fedora 38
 - Debian 11
 - SUSE 15.4

Signed-off-by: Leo Nunner <l.nunner@proxmox.com>
2023-06-07 19:33:28 +02:00
Leo Nunner
efa3355d3b fix #3428: cloudinit: add parameter for upgrade on boot
up until now, we did an automatic upgrade after the first boot in our
standard cloud-init config. This has been requested to be toggleable
several times [1][2]. With this patch, "package_upgrade" is disabled by
default, and needs to be enabled manually, diverging from the previous
behaviour.

[1] https://forum.proxmox.com/threads/how-to-prevent-automatic-apt-upgrade-during-the-first-boot-with-cloud-init.68472/
[2] https://forum.proxmox.com/threads/cloud-init-ohne-package-upgrade.123841/

Signed-off-by: Leo Nunner <l.nunner@proxmox.com>
2023-06-07 18:25:46 +02:00
Fiona Ebner
2cf203194b memory: hotplug: sort by numerical ID rather than slot when unplugging
While, usually, the slot should match the ID, it's not explicitly
guaranteed and relies on QEMU internals. Using the numerical ID is
more future-proof and more consistent with plugging, where no slot
information (except the maximum limit) is relied upon.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-03-17 14:05:02 +01:00
Alexandre Derumier
d1b25a2267 memory: don't use foreach_reversedimm for unplug
simple use dimm_list() returned by qemu

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
2023-03-16 09:53:53 +01:00
Alexandre Derumier
1e28e8ba92 memory: rename qemu_dimm_list to qemu_memdevices_list
current qemu_dimm_list can return any kind of memory devices.

make it more generic, with an optionnal type device

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
2023-03-16 09:53:53 +01:00
Matthias Heiserer
1183c8f1a0 ovmf efi disk: ignore efitype parameter for ARM VMs
Required because there's one single efi for ARM, and the 2m/4m
difference doesn't seem to apply.

Signed-off-by: Matthias Heiserer <m.heiserer@proxmox.com>
 [ T: move description to format and reword subject ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-02-23 16:29:57 +01:00
Alexandre Derumier
dafb728cb8 memory: remove calls to parse_hotplug_features
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
[FE: style: avoid overly long lines]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-02-15 14:34:25 +01:00
Alexandre Derumier
d82ae201ac memory: refactor sockets
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
2023-02-15 14:34:25 +01:00
Alexandre Derumier
2166f6a905 memory: extract some code to their own sub for mocking
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
[FE: style: avoid overly long lines]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2023-02-15 14:34:05 +01:00
Alexandre Derumier
39c074fe23 qemu_memory_hotplug: remove unused $opt arg
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
2023-02-03 11:23:47 +01:00
Alexandre Derumier
62b26624c6 memory hot-plug: fix check for maximal memory value
Fixes: 4d3f29e ("memory hotplug patch v10")
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2023-01-02 13:56:43 +01:00
Thomas Lamprecht
305e9cec5d memory hot plug: round down to nearest even phys-bits for heuristics
Mira found out that 41 phys-bits the limit is pretty much the same as
with 40, as such odd sizes are a bit unexpected anyway lets mask the
LSB and use that as base, that way we're good again.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-11-17 17:45:59 +01:00
Thomas Lamprecht
33b0d3b7be memory hotplug: rework max memory handling, make phys-bits dependent
QEMU 7.1 introduced some actual checks for the max memory value in
1caab5cf86bd ("i386/pc: bounds check phys-bits against max used GPA")
and while correct it breaks our by-luck working hard coded max mem of
4 TB for cases with smaller phys bit address sizes, like some older
CPUs or most CPU types have per default if not 'host' or 'max'.

QEMU uses 40 bits per default if the CPU isn't set to host or
phys-bits is not set explicitly.

For 40 bit it seems that depending on machine type one has a max
possible mem of: i440 -> 752, q35 -> 722 GiB, but instead of reducing
it to 704 GiB (512+1128+64) in a hard coded way we acutally check for
the bit size that will probably be used and use that to determine the
max memory size useable.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-11-17 16:54:49 +01:00
Thomas Lamprecht
d1901fe2ce cpu config: indentation fixup
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-11-17 16:46:40 +01:00
Leo Nunner
f5a88e9870 fix #4321: properly check cloud-init drive permissions
The process for editing Cloud-init drives checked for inconsistent
permissions: for adding, the VM.Config.Disk permission was needed, while
the VM.Config.CDROM permission was needed to remove a drive. The regex
in drive_is_cloudinit needed to be adapted since the drive names have
different formats before/after they are actually generated.

Due to the regex letting names fall through before, Cloud-init drives
were being checked as disks, even though they are actually treated as
CDROM drives. Due to this, it makes more sense to check for
VM.Config.CDROM instead, while also requiring VM.Config.Cloudinit, since
generating a Cloud-init drive already generates default values that are
passed to the VM.

Signed-off-by: Leo Nunner <l.nunner@proxmox.com>
2022-11-17 08:10:28 +01:00
Wolfgang Bumiller
1b5706cd16 drop get_pending_changes and simplify cloudinit_pending api call
- The forced-remove flag wasn't really used AFAICT and makes
  no sense IMO.
- Whether or not we care about non-MAC changes does not
  belong here, but should instead taken into account in the
  actual hotplug path recording the cloud-init state (iow.
  into $cloudinit_record_changed().)
  (This is not done here atm.)
- It seems much simpler to just have:
  * 'old' = the old value if it's not a new value
  * 'new' = the new value unless it's being deleted
  * If only one of them is set it's an addition or removal.

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-11-16 18:17:07 +01:00
Wolfgang Bumiller
4b785da1a9 delay cloudinit generation in hotplug
Hotpluggieg generated a cloudinit image based on old values
in order to attach the device and later update it again, but
the update was only done if cloudinit hotplug was enabled.
This is weird, let's not.

Also introduce 'apply_cloudinit_config' which also write the
config, which, as it turns out, is the only thing we
actually need anyway, currently.

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-11-16 18:17:07 +01:00
Wolfgang Bumiller
0337d531a0 Partially-revert "cloudinit: add cloudinit section for current generated config"
This partially reverts commit 95a5135dad.
Particularly the unprotected write to the config when
generating the cloudinit file. We leave the rest as is for
now and update the callers to deal with the config later.

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2022-11-16 18:17:07 +01:00
Wolfgang Bumiller
3de134ef4a Revert "cloudinit: avoid unsafe write of VM config"
This reverts commit b137c30c3a.

In preparation of fixing the special:cloudinig section.
2022-11-16 18:16:56 +01:00
Thomas Lamprecht
b137c30c3a cloudinit: avoid unsafe write of VM config
there's no guarantee that we're locked here and it also produces
unnecessary extra IO in most cases.

While at it also avoid that a special:cloudinit section is added on
start to *every* VM, which caused another bug to trigger (see prev.
commit) and is just odd for users that ain't using cloudinit

Note in two call sites that we may need to write the config indeed
out there on the caller side.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-11-16 12:03:53 +01:00
Thomas Lamprecht
05eae0f21f cleanup validate_cpu_conf
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-11-15 08:49:04 +01:00
Thomas Lamprecht
f6b24f427d usb: fixup: include USB config only for non-q35 again
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-11-11 07:43:03 +01:00
Thomas Lamprecht
342f049352 usb: small style/code cleanups
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-11-10 17:02:34 +01:00
Thomas Lamprecht
e68881e4a3 usb: get controllers: avoid separate loop for usb 2 devs and improve variable names
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-11-10 17:02:34 +01:00
Thomas Lamprecht
871ebe1775 usb: rename check_usb_index into assert_usb_index_is_useable
to better convey that this might die

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-11-10 17:02:34 +01:00
Dominik Csapak
0c3d18ef13 USB: increase max usb devices to 14 for newer machine version and ostype
for machine versions >= 7.1 and ostype linux or windows > 7, we use the
qemu-xhci controller where we have up to 14 usable ports, so make them
available to the user

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2022-11-10 17:02:34 +01:00
Dominik Csapak
4862922a2b fix #4324: USB: use qemu-xhci for machine versions >= 7.1
going by reports in the forum (e.g. [0]) and semi-official qemu
information[1], we should prefer qemu-xhci over nec-usb-xhci

for compatibility purposes, we guard that behind the machine version,
so that guests with a fixed version don't suddenly have a different usb
controller after a reboot (which could potentially break some hardcoded
guest configs)

0: https://forum.proxmox.com/threads/proxmox-usb-connect-disconnect-loop.117063/
1: https://www.kraxel.org/blog/2018/08/qemu-usb-tips/

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2022-11-10 17:02:34 +01:00
Dominik Csapak
3deccbd7d0 USB: use machine_type_is_q35 instead of regex
we refactored that into PVE::QemuServer::Machine a while ago, so we can
use it here

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2022-11-10 17:02:34 +01:00
Dominik Csapak
b06a24927c USB: print_usbdevice_full: error out on invalid configuration
should not happen normally, but an inattentive user of that function
may forget to check the validity of the parsed device, so err
on the safe side here

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2022-11-10 17:02:34 +01:00
Dominik Csapak
238af88edc move 'windows_version' to Helpers
to avoid a cyclic dependency when we want to use that in PVE::QemuServer::USB

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2022-11-10 17:02:34 +01:00
Dominik Csapak
6fa358a334 pci: make mediated device sysfs path independent of PCI id
mdevs have a host-unique UUID they are indexed with in the PCI-id
independent `/sys/bus/mdev/devices/<uuid>` path, so there is no need
to go through the PCI id for them.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2022-11-09 09:06:19 +01:00
Thomas Lamprecht
2fa64dbddd pci: add/improve HW reservation comments
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-11-09 08:55:55 +01:00
Dominik Csapak
1b189121fc vm start/stop: cleanup passed-through pci devices in more situations
if the preparing of PCI devices or the start of the VM fails, we need
to cleanup the PCI devices (reservations *and* mdevs), or else it
might happen that there are leftovers which must be manually removed.

to include also mdevs now, refactor the cleanup code from
'vm_stop_cleanup' into it's own function, and call that instead of
only 'remove_pci_reservation'

also simplifies the code, such that it now removes all PCI ids
reserved for that VMID, since we cannot have multiple VMs with the
same VMID anyway

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-11-09 08:49:45 +01:00
Alexandre Derumier
2be1fb0af4 api2: add cloudinit config api
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
2022-11-08 17:24:59 +01:00
Alexandre Derumier
95a5135dad cloudinit: add cloudinit section for current generated config.
Instead using vm pending options for pending cloudinit generated config,

write current generated cloudinit config in a new [special:cloudinit] SECTION.

Currently, some options like vm name, nic mac address can be hotplugged,
so they are not way to know if the cloud-init disk is already updated.

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
2022-11-08 17:23:30 +01:00
Alexandre Derumier
9c88e85446 migration: test targetnode min version for cloudinit section
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
2022-11-08 17:23:30 +01:00
Thomas Lamprecht
adc67fe917 cpu config: fix depreacation mapping on CPU hotplug of custom types
we need to do the independent of is_custom_model to ensure the
reported model is understood by QEMU

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Reported-by: Fiona Ebner <f.ebner@proxmox.com>
2022-08-30 09:25:06 +02:00
Thomas Lamprecht
0d6962f935 cpu config: map depreacated IceLake-Client CPU type to IceLake-Server
the former CPU type never existed on the market and will be dropped
by QEMU 7.1, so map it to the server variant as they're pretty much
identical anyway FIWCT.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-08-30 09:09:13 +02:00
Thomas Lamprecht
b0ab346381 cpu config: minor code style nits/comment
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-08-30 09:09:13 +02:00
Dominik Csapak
bbf96e0f1e automatically add 'uuid' parameter when passing through NVIDIA vGPU
When passing through an NVIDIA vGPU via mediated devices, their
software needs the qemu process to have the 'uuid' parameter set to the
one of the vGPU. Since it's currently not possible to pass through multiple
vGPUs to one VM (seems to be an NVIDIA driver limitation at the moment),
we don't have to take care about that.

Sadly, the place we do this, it does not show up in 'qm showcmd' as we
don't (want to) query the pci devices in that case, and then we don't
have a way of knowing if it's an NVIDIA card or not. But since this
is informational with QEMU anyway, i'd say we can ignore that.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2022-08-12 13:42:33 +02:00
Alexandre Derumier
9b1971c5c9 cpuconfig: add amd epyc milan model
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
2022-05-20 09:45:18 +02:00
Dominic Jäger
e6ac9fed7b api: support VM disk import
Extend qm importdisk functionality to the API.

Co-authored-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Co-authored-by: Dominic Jäger <d.jaeger@proxmox.com>
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2022-04-04 16:41:13 +02:00
Fabian Ebner
c1accf9db9 schema: drive: use separate schema when disk allocation is possible
via the special syntax <storeid>:<size>.

Not worth it by itself, but this is anticipating a new 'import-from'
parameter which is only used upon import/allocation, but shouldn't be
part of the schema for the config or other API enpoints.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2022-04-04 16:41:13 +02:00
Dominik Csapak
d8a7e9e881 PCI: allow longer pci domains
some systems[0] have pci domains longer than the default ('0000') of 4
characters, so change the regex to allow at least 4.

0: https://forum.proxmox.com/threads/problem-with-gpu-passthrough-in-a-virtual-machine.105720/

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2022-03-16 18:03:35 +01:00
Thomas Lamprecht
11f9264fed cpu config: code format/whitespace fixes
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-03-16 15:53:00 +01:00
Fabian Ebner
84c253e947 parse ovf: untaint path when calling file_size_info
Prepare for calling parse_ovf via API, where the -T switch is used.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2022-03-11 10:45:59 +01:00
Nicholas Sherlock
d806b017ac pci: allow override of PCI vendor/device ids
This allows mobile- and vGPUs to be presented to the guest as if they
were the original desktop variants of the card. It also allows
device-ID variants that guests don't know about to be renamed to
match compatible sibling devices the guest does have drivers for
(e.g. to remove manufacturer-specific vendor ID variants that prevent
the use of a device which would otherwise have a supported chipset)

e.g. hostpci0: 03:00,vendor-id=0x8086,device-id=0x10f6

Signed-off-by: Nicholas Sherlock <n.sherlock@gmail.com>
Reviewed-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-by: Dominik Csapak <d.csapak@proxmox.com>
2022-01-25 10:59:23 +01:00
Mira Limbeck
ea18b60455 fix #3792: cloudinit: use of uninitialized value
With the patch adding vendor-data support to cloud-init, a use of
uninitialized value was introduced. This can be fixed by setting it to
an empty string if no vendor-data is defined.

vendor-data can only be set via --cicustom and is optional.

Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
2021-12-21 15:45:18 +01:00
Dominik Csapak
1319908f5d exclude efidisk and tpmstate for boot disk selection
else we cannot create a vm without a disk but with a tpmstate/efidisk,
since the api tries to generate the default bootorder with them included

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2021-11-15 16:57:52 +01:00
Aaron Lauterer
1071373027 Drive: add valid_drive_names_with_unused
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
2021-11-09 16:16:00 +01:00
Thomas Lamprecht
115cb432bc cloud init: add comment regarding 3 MiB size limit
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Originally-by: Mira Limbeck <m.limbeck@proxmox.com>
2021-11-04 13:14:17 +01:00