fixes commit 74c17b7a23 which moved
this code here, but forgot to pass $vga ref, as the module was not
using warning nor strict mode this was not caught..
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
(also fixes#3011)
Deprecates the old-style 'boot' and 'bootdisk' options by adding a new
'order=' subproperty to 'boot'.
This allows a user to specify more than one disk in the boot order,
helping with newer versions of SeaBIOS/OVMF where disks without a
bootindex won't be initialized at all (breaks soft-raid and some LVM
setups).
This also allows specifying a bootindex for USB and hostpci devices,
which was not possible before. Floppy boot support is not supported in
the new model, but I doubt that will be a problem (AFAICT we can't even
attach floppy disks to a VM?).
Default behaviour is intended to stay the same, i.e. while new VMs will
receive the new 'order' property, it will be set so the VM starts the
same as before (using get_default_bootorder).
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
The format is unused in this commit, but will replace the current
string-based format of the 'boot' property. It is included since the
parameter of bootorder_from_legacy follows it.
Two helper methods are introduced:
* bootorder_from_legacy: Parses the legacy format into a hash closer to
what the new format represents
* get_default_bootdevices: Encapsulates the legacy default behaviour if
nothing is specified in the boot order
resolve_first_disk is simplified and gets a new $cdrom parameter to
control the behaviour of excluding CD-ROMs or instead searching for only
them.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
during refactoring, the vmid got lost, but is necessary to get
the correct mdev id
Fixes commit 74c17b7a23
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
[ reference fixed commit ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Use the new register_format(3) call to use a validator (instead of a
parser) for 'pve-(vm-)?cpu-conf'. This way the $cpu_fmt hash can be used for
generating the documentation, while still applying the same verification
rules as before.
Since the function no longer parses but only verifies, the parsing in
print_cpu_device/get_cpu_options has to go via JSONSchema directly.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Pass new size directly, so the function doesn't need to know about
how some hash is organized. And return a message directly, instead
of both size-strings. Also dropped the wantarray, because both
existing callers use the message anyways.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Legacy IGD passthrough requires address 00:1f.0 to not be assigned to
anything on QEMU startup (currently it's assigned to bridge pci.2).
Changing this in general would break live-migration, so introduce a new
hostpci parameter "legacy-igd", which if set to 1 will move that bridge
to be nested under bridge 1.
This is safe because:
* Bridge 1 is unconditionally created on i440fx, so nesting is ok
* Defaults are not changed, i.e. PCI layout only changes when the new
parameter is specified manually
* hostpci forbids migration anyway
Additionally, the PT device has to be assigned address 00:02.0 in the
guest as well, which is usually used for VGA assignment. Luckily, IGD PT
requires vga=none, so that is not an issue either.
See https://git.qemu.org/?p=qemu.git;a=blob;f=docs/igd-assign.txt
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
As perl hashes have random order, sort them before iterating through.
This makes the output of 'qm cloudinit dump <vmid> network' consistent
between calls if the config has not changed.
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
More API calls will follow for this path, for now add the 'index' call to
list all custom and default CPU models.
Any user can list the default CPU models, as these are public anyway, but
custom models are restricted to users with Sys.Audit on /nodes.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Some OVF files to not declare 'rasd' as a default namespace (in the
top level Envelope element), but inline in each element (e.g.
<rasd:HostResource xmlns:rasd="foo">...</rasd:HostResource>)
This trips up our relative findvalue with
> XPath error : Undefined namespace prefix
To avoid this, search in the global XPathContext (where we register
those namespaces ourselves) and pass the item_node as context
parameter.
This works then for both cases
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
It was necessary to move foreach_volid back to QemuServer.pm
In VZDump/QemuServer.pm and QemuMigrate.pm the dependency on
QemuConfig.pm was already there, just the explicit "use" was missing.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Can be specified for a particular VM or via a custom CPU model (VM takes
precedence).
QEMU's default limit only allows up to 1TB of RAM per VM. Increasing the
physical address bits available to a VM can fix this.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
If a cputype is custom (check via prefix), try to load options from the
custom CPU model config, and set values accordingly.
While at it, extract currently hardcoded values into seperate sub and add
reasonings.
Since the new flag resolving outputs flags in sorted order for
consistency, adapt the test cases to not break. Only the order is
changed, not which flags are present.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Reviewed-By: Fabian Ebner <f.ebner@proxmox.com>
Tested-By: Fabian Ebner <f.ebner@proxmox.com>
To avoid hardcoding even more CPU-flag related things for custom CPU
models, introduce a dynamic approach to resolving flags.
resolve_cpu_flags takes a list of hashes (as documented in the
comment) and resolves them to a valid "-cpu" argument without
duplicates. This also helps by providing a reason why specific CPU flags
have been added, and thus allows for useful warning messages should a
flag be overwritten by another.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Reviewed-By: Fabian Ebner <f.ebner@proxmox.com>
Tested-By: Fabian Ebner <f.ebner@proxmox.com>
This is required to support custom CPU models, since the
"cpu-models.conf" file is not versioned, and can be changed while a VM
using a custom model is running. Changing the file in such a state can
lead to a different "-cpu" argument on the receiving side.
This patch fixes this by passing the entire "-cpu" option (extracted
from /proc/.../cmdline) as a "qm start" parameter. Note that this is
only done if the VM to migrate is using a custom model (which we can
check just fine, since the <vmid>.conf *is* versioned with pending
changes), thus not breaking any live-migration directionality.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
$cpu_fmt is being reused for custom CPUs as well as VM-specific CPU
settings. The "pve-vm-cpu-conf" format is introduced to verify a config
specifically for use as VM-specific settings.
"pve-cpu-conf" is registered for use in custom CPU API calls (where no
additional checks are required).
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Turn CPUConfig into a SectionConfig with parsing/writing support for
custom CPU models. IO is handled using cfs.
Namespacing will be provided using "custom-" prefix for custom model
names (in VM config only, cpu-models.conf will contain unprefixed
names).
Includes two overrides to avoid writing redundant information to the
config file, additionally get_custom_model is used to retrieve a custom
model configuration by name.
Resolve custom names in print_cpu_device when a custom cpu is passed.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Moved code so that initialization of drivedesc_hash stays a single block.
Avoid auto-vivication in parse_drive.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
...instead of booting with an invalid config once and then silently
changing the memory size for consequent VM starts.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Tested-by: Alwin Antreich <a.antreich@proxmox.com>
This cannot work, since we adjust the 'memory' property of the VM config
on hotplugging, but then the user-defined NUMA topology won't match for
the next start attempt.
Check needs to happen here, since it otherwise fails early with "total
memory for NUMA nodes must be equal to vm static memory".
With this change the error message reflects what is actually happening
and doesn't allow VMs with exactly 1GB of RAM either.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Tested-by: Alwin Antreich <a.antreich@proxmox.com>
and make it match with what parse_drive does. Even though the 'real' format
was pve-volume-id, callers already expected that parse_drive returns a hash
with a valid 'file' key (e.g. PVE/API2/Qemu.pm:1147ff).
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Reviewed-By: Fabian Grünbichler <f.gruenbichler@proxmox.com>
avoids a genisoimage output like:
> Total translation table size: 0
> Total rockridge attributes bytes: 417
> Total directory bytes: 0
> Path table size(bytes): 10
> Max brk space used 0
> 178 extents written (0 MB)
on every VM start.
Rather than that useless output, tell genisoimage to be quiet, which
still prints errors but nothing else. Additionally print a short
single line about that we're to create the cloud-init iso.
Reformat while at it.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
If for whatever reason there is no size in the property string
of a drive, 'qm rescan' would do nothing for that drive and
live migration would also fail.
Also adds a check to avoid potential auto-vivification of volid_hash->{$volid}
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
The initialization for the drive keys in $confdesc is changed
to be a single for-loop iterating over the keys of $drivedesc_hash and
the initialization of the unusedN keys is move to directly below it.
To avoid the need to change all the call sites, functions with more than
a few callers are exported from the submodule and imported into QemuServer.pm.
For callers of the now imported functions within QemuServer.pm, the prefix
PVE::QemuServer is dropped, because it is unnecessary and now even confusing.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Allow a user to add a virtio-rng-pci (an emulated hardware random
number generator) to a VM with the rng0 setting. The setting is
version_guard()-ed.
Limit the selection of entropy source to one of three:
/dev/urandom (preferred): Non-blocking kernel entropy source
/dev/random: Blocking kernel source
/dev/hwrng: Hardware RNG on the host for passthrough
QEMU itself defaults to /dev/urandom (or the equivalent getrandom()
call) if no source file is given, but I don't fully trust that
behaviour to stay constant, considering the documentation [0] already
disagrees with the code [1], so let's always specify the file ourselves.
/dev/urandom is preferred, since it prevents host entropy starvation.
The quality of randomness is still good enough to emulate a hwrng, since
a) it's still seeded from the kernel's true entropy pool periodically
and b) it's mixed with true entropy in the guest as well.
Additionally, all sources about entropy predicition attacks I could find
mention that to predict /dev/urandom results, /dev/random has to be
accessed or manipulated in one way or the other - this is not possible
from a VM however, as the entropy we're talking about comes from the
*hosts* blocking pool.
More about the entropy and security implications of the non-blocking
interface in [2] and [3].
Note further that only one /dev/hwrng exists at any given time, if
multiple RNGs are available, only the one selected in
'/sys/devices/virtual/misc/hw_random/rng_current' will feed the file.
Selecting this is left as an exercise to the user, if at all required.
We limit the available entropy to 1 KiB/s by default, but allow the user
to override this. Interesting to note is that the limiter does not work
linearly, i.e. max_bytes=1024/period=1000 means that up to 1 KiB of data
becomes available on a 1000 millisecond timer, not that 1 KiB is
streamed to the guest over the course of one second - hence the
configurable period.
The default used here is the same as given in the QEMU documentation [0]
and has been verified to affect entropy availability in a guest by
measuring /dev/random throughput. 1 KiB/s is enough to avoid any
early-boot entropy shortages, and already has a significant impact on
/dev/random availability in the guest.
[0] https://wiki.qemu.org/Features/VirtIORNG
[1] https://git.qemu.org/?p=qemu.git;a=blob;f=crypto/random-platform.c;h=f92f96987d7d262047c7604b169a7fdf11236107;hb=HEAD
[2] https://lwn.net/Articles/261804/
[3] https://lwn.net/Articles/808575/
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
'input-data' can be used to pass arbitrary data to a guest when running
an agent command with 'guest-exec'. Most guest-agent implementations
treat this as STDIN to the command given by "path"/"arg", but some go as
far as relying solely on this parameter, and even fail if "path" or
"arg" are set (e.g. Mikrotik Cloud Hosted Router) - thus "command" needs
to be made optional.
Via the API, an arbitrary string can be passed, on the command line ('qm
guest exec'), an additional '--pass-stdin' flag allows to forward STDIN
of the qm process to 'input-data', with a size limitation of 1 MiB to
not overwhelm QMP.
Without 'input-data' (API) or '--pass-stdin' (CLI) behaviour is unchanged.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Live-migrating a VM with more than 14 SCSI disks to a node that doesn't
support it yet is broken. Use a bumped pve-version to represent that and
give the user a nice error message instead.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
The previously introduced approach can fail for pinned versions when a
new QEMU release is introduced. The saner approach is to use a mapping
that gives one pve-version for each QEMU release.
Fortunately, the old system has not been bumped yet, so we can still
change it without too much effort.
QEMU versions without a mapping are assumed to be pve0, 4.1 is mapped to
pve1 since thats what we had as our default previously.
Pinned machine versions (i.e. pc-i440fx-4.1) are always assumed to be
pve0, for specific pve-versions they'd have to be pinned as well (i.e.
pc-i440fx-4.1+pve1).
The new logic also makes the pve-version dynamic, and starts VMs with
the lowest possible 'feature-level', i.e. if a feature is only available
with 4.1+pve2, but the VM isn't using it, we still start it with
4.1+pve0.
We die if we don't support a version that is requested from us. This
allows us to use the pve-version as live-migration blocks (i.e. bumping
the version and then live-migrating a VM which uses the new feature (so
is running with the bumped version) to an outdated node will present the
user with a helpful error message and fail instead of silently modifying
the config and only failing *after* the migration).
$version_guard is introduced in config_to_command to use for features
that need to check pve-version, it automatically handles selecting the
newest necessary pve-version for the VM.
Tests have to be adjusted, since all of them now resolve to pve0 instead
of pve1. EXPECT_ERROR matching is changed to use 'eq' instead of regex
to allow special characters in error messages.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
to achieve this we have to add 3 new scsihw addresses since lsi
controllers can only hold 7 scsi drives
we go up to 31, since this is the limit for virtio-scsi-single devices
we have reserved (we can increase this in the future)
to make it more future proof, we add a new pci bridge under pci
bridge 1, so we have to adapt the bridge adding code (we did not
need this for q35 previously)
impact on live migration:
since on older versions of qemu-server we do not have those config
settings, there is no problem from old -> new
new->old is not supported anyway and this breaks so that
the vm crashes and loses the configs for scsi15-30
(same behaviour as e.g. with audio0 and migration from new->old)
tested with 31 scsi disk on
i440fx + virtio-scsi
i440fx + lsi
q35 + virtio-scsi
q35 + lsi
with ovmf + seabios
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
The package will be used for custom CPU models as a SectionConfig, hence
the name. For now we simply move some CPU related helper functions and
declarations over from QemuServer to reduce clutter there.
Exports are to avoid changing all call sites, functions have useful
names on their own.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
With our QEMU 4.1.1 package we can pass a additional internal version
to QEMU's machine, it will be split out there and ignored, but
returned on a QMP 'query-machines' call.
This allows us to use it for increasing the granularity with which we
can roll-out HW layout changes/additions for VMs. Until now we
required a machine version bump, happening normally every major
release of QEMU, with seldom, for us irrelevant, exceptions.
This often delays rolling out a feature, which would break
live-migration, by several months. That can now be avoided, the new
"pve-version" component of the machine can be bumped at will, and
thus we are much more flexible.
That versions orders after the ($major, $minor) version components
from an stable release - it can thus also be reset on the next
release.
The implementation extends the qemu-machine REGEX, remembers
"pve-version" when doing a "query-machines" and integrates support
into the min_version and extract_version helpers.
We start out with a version of 1.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Reviewed-by: Stefan Reiter <s.reiter@proxmox.com>