PVE::Storage::path() neither activates the storage of the passed-in volume, nor
does it ensure that the returned value is actually a file or block device, so
this fixes two issues at once. PVE::Storage::abs_filesystem_path() takes care
of both, while still calling path() under the hood (since $volid here is
always a proper volid, unless we change the cicustom schema at some point in
the future).
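As a sketch, the change boils down to swapping the helper at the call
site (call-site shape illustrative; both helpers are real PVE::Storage
functions taking the storage config and a volid):

    # before: no storage activation, no file/blockdev check
    my $path = PVE::Storage::path($storecfg, $volid);

    # after: activates the storage and checks that the result is a
    # file or block device, still resolving via path() internally
    my $path = PVE::Storage::abs_filesystem_path($storecfg, $volid);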
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
adds vendor and product information for SCSI devices to the JSON schema
and checks in the VM create/update API call whether it is possible to
pass these to QEMU as a device option
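An illustrative config line (storage, volume and values hypothetical;
the vendor/product sub-properties are the ones added here):

    scsi0: local-lvm:vm-100-disk-0,vendor=ACME,product=FastDisk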
Signed-off-by: Hannes Duerr <h.duerr@proxmox.com>
[FE: add missing space to exception message
use config option for exception e.g. scsi0 rather than 'product'
style fixes]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Encapsulate the functionality for determining the SCSI device type in a
new function in QemuServer/Drive.pm for reusability.
Signed-off-by: Hannes Duerr <h.duerr@proxmox.com>
by adding a comment and grouping the code better. See the PVE QEMU
patch "PVE: Allow version code in machine type" for reference. The way
the code was written previously made it look like a bug where
$pve_version might be overwritten multiple times.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
The default VM startup timeout is `max(30, VM memory in GiB)` seconds.
Multiple reports in the forum [0] [1] and the bug tracker [2] suggest
this is too short when using PCI passthrough with a large amount of VM
memory, since QEMU needs to map the whole memory during startup (see
comment #2 in [2]). As a result, VM startup fails with "got timeout".
To work around this, set a larger default timeout if at least one PCI
device is passed through. The question remains how to choose an
appropriate timeout. Users reported the following startup times:
ref | RAM | time | ratio (s/GiB)
---------------------------------
[1] | 60G | 135s | 2.25
[1] | 70G | 157s | 2.24
[1] | 80G | 277s | 3.46
[2] | 65G | 213s | 3.28
[2] | 96G | >290s | >3.02
The data does not really indicate any simple (e.g. linear)
relationship between RAM and startup time (even data from the same
source). However, to keep the heuristic simple, assume linear growth
and multiply the default timeout by 4 if at least one `hostpci[n]`
option is present, obtaining `4 * max(30, VM memory in GiB)`. This
covers all cases above, and should still leave some headroom.
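As a sketch, the resulting default (helper name and config access
illustrative, not the exact qemu-server code):

    sub default_start_timeout {
        my ($conf) = @_;
        # $conf->{memory} is in MiB; default matches max(30, GiB)
        my $mem_gib = ($conf->{memory} // 512) / 1024;
        my $timeout = $mem_gib > 30 ? $mem_gib : 30;
        # quadruple it if at least one hostpci[n] option is present
        $timeout *= 4 if grep { /^hostpci\d+$/ } keys %$conf;
        return int($timeout);
    }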
[0]: https://forum.proxmox.com/threads/83765/post-552071
[1]: https://forum.proxmox.com/threads/126398/post-592826
[2]: https://bugzilla.proxmox.com/show_bug.cgi?id=3502
Suggested-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
In preparation to add more properties to the memory configuration like
maximum hotpluggable memory and whether virtio-mem devices should be
used.
This also allows getting rid of the cyclic include of PVE::QemuServer
in the memory module.
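For context, this turns the memory option into a property string so
that new sub-options can be added later; an illustrative (hypothetical)
future shape:

    memory: current=4096,max=65536,virtio=1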
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
[FE: also convert new usage in get_derived_property
remove cyclic include of PVE::QemuServer
add commit message]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
moving qemu_{device,object}{add,del} helpers there for now.
In preparation to remove the cyclic include of PVE::QemuServer in the
memory module and generally for better modularity in the future.
No functional change intended.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
PVE::QemuServer::check_running() does both
PVE::QemuConfig::assert_config_exists_on_node()
PVE::QemuServer::Helpers::vm_running_locally()
The former one isn't needed here when doing hotplug, because the API
already asserts that the VM config exists. It would also introduce a
new cyclic dependency between PVE::QemuServer::Memory <->
PVE::QemuConfig with the proposed virtio-mem patch set.
In preparation to remove the cyclic include of PVE::QemuServer in the
memory module.
No functional change intended.
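A sketch of the substitution in the hotplug path (surrounding code
illustrative; both helpers exist under the names shown):

    # before: also implies assert_config_exists_on_node(), which is
    # redundant here and would pull in PVE::QemuConfig
    die "VM $vmid not running\n" if !PVE::QemuServer::check_running($vmid);

    # after: only the local running check is needed
    die "VM $vmid not running\n"
        if !PVE::QemuServer::Helpers::vm_running_locally($vmid);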
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
which is the only user of the parse_numa() helper. While at it, avoid
the duplication of MAX_NUMA.
In preparation to remove the cyclic include of PVE::QemuServer in the
memory module.
No functional change intended.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Commit efa3355d ("fix #3428: cloudinit: add parameter for upgrade on
boot") changed the default, but this is a breaking change. The bug
report was only about making the option configurable.
The commit doesn't give an explicit reason why, and arguably, doing the
upgrade is not an issue for most users. It also leads to a different
cloud-init instance ID, because of the different setting, which in turn
leads to ssh host key regeneration within the VM.
Reported-by: Friedrich Weber <f.weber@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
this patch allows configuring pci devices that are mapped via cluster
resource mapping when the user has 'Resource.Use' on the ACL path
'/mapping/pci/{ID}' (in addition to the usual required vm config
privileges)
When given multiple mappings in the config, we use them as alternatives
for the passthrough, and will select the first free one on startup.
It uses our regular PCI reservation mechanism for regular devices, and
we introduce a selection mechanism for mediated devices.
A few changes to the inner workings were required to make this work well:
* parse_hostpci now returns a different structure where we have a list
of lists (the first level is for the different alternatives and the
second level for the different devices that should be passed through
together; see the sketch after this list)
* factor out the 'parse_hostpci_devices' which parses each device from
the config and does some precondition checks
* reserve_pci_usage now behaves slightly differently when trying to
reserve a device that is already reserved for the same VMID, since to
check which alternative we can use, we already must reserve one (this
means that qm showcmd can actually reserve devices, albeit only for up
to 10 seconds)
* configuring a mediated device on a multifunction device is not
supported anymore, and results in failure to start (previously, it
just chose the first device to do it). This is a breaking change
* configuring a single pci device twice on different hostpci slots now
fails during commandline generation instead of on QEMU start, so we had
to adapt one test where this occurred (it could never have worked
anyway)
* we now check permissions during clone/restore, meaning raw/real
devices can only be cloned/restored by root@pam from now on.
This is a breaking change.
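Illustrative shape of the new parse_hostpci() result (key names
assumed, PCI addresses hypothetical):

    # one entry per alternative; each alternative lists the devices
    # that must be passed through together
    my $parsed = {
        ids => [
            [ { id => '0000:01:00.0' }, { id => '0000:01:00.1' } ],
            [ { id => '0000:02:00.0' } ],
        ],
    };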
Fixes #3574: Improve SR-IOV usability
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-By: Markus Frank <m.frank@proxmox.com>
this patch allows configuring usb devices that are mapped via
cluster resource mapping when the user has 'Mapping.Use' on the ACL
path '/mapping/usb/{ID}' (in addition to the usual required vm config
privileges)
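Illustrative configuration (mapping ID hypothetical; the exact
property name is assumed):

    # /etc/pve/qemu-server/<vmid>.conf
    usb0: mapping=office-dongle

    # non-root users additionally need 'Mapping.Use' on the ACL path
    # /mapping/usb/office-dongle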
for now, this is only valid if there is exactly one mapping for the
host, since we don't track passed-through usb devices yet
This now also checks permissions on clone/restore, meaning a
'non-mapped' device can only be cloned/restored as root@pam user.
That is a breaking change.
Refactor the checks for restoring into a sub, so we have a central
place where we can add such checks
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-By: Markus Frank <m.frank@proxmox.com>
similar to how we handle the PCI module and format. This makes the
'verify_usb_device' method and format unnecessary since
we simply check the format with a regex.
while doing this, I noticed that we don't correctly check for the
case-insensitive variant of 'spice' during hotplug, so fix that too
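A sketch of the fixed check (surrounding hotplug code illustrative):

    # the host value 'spice' must be matched case-insensitively
    # during hotplug as well
    my $host = $device->{host} // '';
    if ($host =~ m/^spice$/i) {
        # handle SPICE USB redirection instead of host passthrough
    }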
With this we can also remove some parameters from the get_usb_devices
and get_usb_controllers functions
while we're at it, refactor the permission checks for the usb config too
and use the new 'my sub' style for the functions
also make print_usbdevice_full parse the device itself, so we don't have
to do it in multiple places (especially in places where it's not obvious
that this is needed)
No functional change intended
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-By: Markus Frank <m.frank@proxmox.com>
https://gitlab.com/x86-psABIs/x86-64-ABI/
https://lists.gnu.org/archive/html/qemu-devel/2021-06/msg01592.html
"
In 2020, AMD, Intel, Red Hat, and SUSE worked together to define
three microarchitecture levels on top of the historical x86-64
baseline:
* x86-64: original x86_64 baseline instruction set
* x86-64-v2: vector instructions up to Streaming SIMD
Extensions 4.2 (SSE4.2) and Supplemental
Streaming SIMD Extensions 3 (SSSE3), the
POPCNT instruction, and CMPXCHG16B
* x86-64-v3: vector instructions up to AVX2, MOVBE,
and additional bit-manipulation instructions.
* x86-64-v4: vector instructions from some of the
AVX-512 variants.
"
This patch adds new builtin models derived from the qemu64 model,
to be compatible between Intel/AMD.
mandatory flags from qemu-doc generator:
https://gitlab.com/qemu/qemu/-/blob/master/scripts/cpu-x86-uarch-abi.py
levels = [
[ # x86-64 baseline
"cmov",
"cx8",
"fpu",
"fxsr",
"mmx",
"syscall",
"sse",
"sse2",
],
[ # x86-64-v2
"cx16",
"lahf-lm",
"popcnt",
"pni",
"sse4.1",
"sse4.2",
"ssse3",
],
[ # x86-64-v3
"avx",
"avx2",
"bmi1",
"bmi2",
"f16c",
"fma",
"abm",
"movbe",
"xsave" #missing from qemu doc currently
],
[ # x86-64-v4
"avx512f",
"avx512bw",
"avx512cd",
"avx512dq",
"avx512vl",
],
]
x86-64-v1 : I'm skipping it, as it's basically qemu64|kvm64 -vme,-cx16, for
compat with the Opteron_G1 from 2004; qemu64|kvm64 can be used instead, since
higher models don't work on Opteron_G1 anyway
x86-64-v2 : Derived from qemu, +popcnt;+pni;+sse4.1;+sse4.2;+ssse3
min intel: Nehalem
min amd : Opteron_G3
x86-64-v2-AES : Derived from qemu, +aes;+popcnt;+pni;+sse4.1;+sse4.2;+ssse3
min intel: Westmere
min amd : Opteron_G3
x86-64-v3 : Derived from qemu64 +aes;+popcnt;+pni;+sse4.1;+sse4.2;+ssse3;+avx;+avx2;+bmi1;+bmi2;+f16c;+fma;+abm;+movbe;+xsave
min intel: Haswell
min amd : EPYC_v1
x86-64-v4 : Derived from qemu64 +aes;+popcnt;+pni;+sse4.1;+sse4.2;+ssse3;+avx;+avx2;+bmi1;+bmi2;+f16c;+fma;+abm;+movbe;+xsave;+avx512f;+avx512bw;+avx512cd;+avx512dq;+avx512vl
min intel: Skylake
min amd : EPYC_v4
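As a sketch, the new models can be expressed as built-in entries on
top of qemu64 (hash layout illustrative; flags taken from the list
above):

    my $builtin_models = {
        'x86-64-v2' => {
            'reported-model' => 'qemu64',
            flags => "+popcnt;+pni;+sse4.1;+sse4.2;+ssse3",
        },
        'x86-64-v2-AES' => {
            'reported-model' => 'qemu64',
            flags => "+aes;+popcnt;+pni;+sse4.1;+sse4.2;+ssse3",
        },
    };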
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
If no FQDN is provided, we simply set it to the current hostname. This
ensures that the hostname *really* gets set, since we encountered an
issue on Fedora and CentOS based systems where no hostname got set at
all.
When there's no FQDN set in the cloudinit config, this leads to the
following /etc/hosts entry:
127.0.1.1 <hostname> <hostname>
Which doesn't seem to cause any issues.
Tested on:
- Ubuntu 23.04
- CentOS 8
- Fedora 38
- Debian 11
- SUSE 15.4
Signed-off-by: Leo Nunner <l.nunner@proxmox.com>
While, usually, the slot should match the ID, it's not explicitly
guaranteed and relies on QEMU internals. Using the numerical ID is
more future-proof and more consistent with plugging, where no slot
information (except the maximum limit) is relied upon.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Currently, qemu_dimm_list() can return any kind of memory device.
Make it more generic, with an optional device type parameter.
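A sketch of the more generic helper (filter handling illustrative;
query-memory-devices is the underlying QMP command):

    sub qemu_dimm_list {
        my ($vmid, $filter) = @_;
        my $dimmarray = mon_cmd($vmid, "query-memory-devices");
        my $dimms = {};
        for my $dimm (@$dimmarray) {
            # skip devices not matching the optional type filter
            next if $filter && $dimm->{type} ne $filter;
            $dimms->{ $dimm->{data}->{id} } = {
                id => $dimm->{data}->{id},
                node => $dimm->{data}->{node},
                addr => $dimm->{data}->{addr},
                size => $dimm->{data}->{size},
            };
        }
        return $dimms;
    }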
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
Required because there's one single efi for ARM, and the 2m/4m
difference doesn't seem to apply.
Signed-off-by: Matthias Heiserer <m.heiserer@proxmox.com>
[ T: move description to format and reword subject ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Mira found out that with 41 phys-bits the limit is pretty much the same
as with 40. As such odd sizes are a bit unexpected anyway, let's mask the
LSB and use that as the base; that way we're good again.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
QEMU 7.1 introduced some actual checks for the max memory value in
1caab5cf86bd ("i386/pc: bounds check phys-bits against max used GPA"),
and while correct, it breaks our hard-coded max mem of 4 TiB that only
worked by luck for cases with smaller phys-bit address sizes, like some
older CPUs, or like most CPU types have by default if not 'host' or 'max'.
QEMU uses 40 bits by default if the CPU isn't set to 'host' or phys-bits
is not set explicitly.
For 40 bits, it seems that depending on machine type one has a max
possible memory of: i440 -> 752 GiB, q35 -> 722 GiB. But instead of
reducing it to 704 GiB (512+128+64) in a hard-coded way, we actually
check for the bit size that will probably be used and use that to
determine the usable max memory size.
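Back-of-the-envelope view of the numbers above (values from this
message, not the exact qemu-server code):

    my $phys_bits = 40;                    # QEMU default without 'host'/'max'
    my $gpa_gib = 2 ** ($phys_bits - 30);  # 1024 GiB guest-physical space
    # reserved regions (PCI hole, etc.) come off the top, leaving
    # roughly 752 GiB (i440) or 722 GiB (q35) of usable guest memory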
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
The process for editing Cloud-init drives checked for inconsistent
permissions: for adding, the VM.Config.Disk permission was needed, while
the VM.Config.CDROM permission was needed to remove a drive. The regex
in drive_is_cloudinit needed to be adapted since the drive names have
different formats before/after they are actually generated.
Due to the regex letting names fall through before, Cloud-init drives
were being checked as disks, even though they are actually treated as
CDROM drives. Because of this, it makes more sense to check for
VM.Config.CDROM instead, while also requiring VM.Config.Cloudinit, since
generating a Cloud-init drive already generates default values that are
passed to the VM.
Signed-off-by: Leo Nunner <l.nunner@proxmox.com>
- The forced-remove flag wasn't really used AFAICT and makes
no sense IMO.
- Whether or not we care about non-MAC changes does not
belong here, but should instead be taken into account in the
actual hotplug path recording the cloud-init state (iow.
into $cloudinit_record_changed().)
(This is not done here atm.)
- It seems much simpler to just have (see the sketch below):
* 'old' = the old value if it's not a new value
* 'new' = the new value unless it's being deleted
* If only one of them is set it's an addition or removal.
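Illustrative shape of the resulting per-key change record (keys 'old'
and 'new' from the list above; container and example options
hypothetical):

    my $changes = {
        net0      => { old => $old_net0, new => $new_net0 }, # changed
        ipconfig0 => { new => $added_value },                # added
        sshkeys   => { old => $removed_value },              # removed
    };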
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Hotplugging generated a cloudinit image based on old values
in order to attach the device and later update it again, but
the update was only done if cloudinit hotplug was enabled.
This is weird, let's not.
Also introduce 'apply_cloudinit_config', which also writes the
config and which, as it turns out, is the only thing we
actually need anyway, currently.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
This partially reverts commit 95a5135dad, particularly the unprotected
write to the config when generating the cloudinit file.
generating the cloudinit file. We leave the rest as is for
now and update the callers to deal with the config later.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
there's no guarantee that we're locked here, and it also produces
unnecessary extra IO in most cases.
While at it, also avoid that a special:cloudinit section is added on
start to *every* VM, which caused another bug to trigger (see previous
commit) and is just odd for users that aren't using cloudinit.
Note in two call sites that we may indeed need to write the config
out there on the caller side.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>