Replace the DEVICE_NATIVE_ENDIAN MemoryRegionOps by a pair of
DEVICE_LITTLE_ENDIAN / DEVICE_BIG_ENDIAN.
Add the "endianness" property to select the device endianness.
This property is unspecified by default, and machines need to
set it explicitly.
Set the proper endianness for each machine using the device.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20250213122217.62654-3-philmd@linaro.org>
Introduce the EndianMode type and the DEFINE_PROP_ENDIAN() macros.
Endianness can be BIG, LITTLE or unspecified (default).
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20250213122217.62654-2-philmd@linaro.org>
When the %data argument is not modified, we can declare it const.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20250210133134.90879-8-philmd@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20250210133134.90879-7-philmd@linaro.org>
Commit f0ec14c78c ("tests/avocado: Fix console data loss") fixed
QEMUMachine's problem with console, we don't need to use the sleep()
kludges.
Suggested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20250206131052.30207-15-philmd@linaro.org>
Make microblaze tests a bit more generic.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20250206131052.30207-14-philmd@linaro.org>
The archive used in test_microblaze_s3adsp1800.py (testing a
big-endian target) contains a big-endian kernel. Rename using
the _BE suffix.
Similarly, the archive in test_microblazeel_s3adsp1800 (testing
a little-endian target) contains a little-endian kernel. Rename
using _LE suffix.
These changes will help when adding cross-endian kernel tests.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20250206131052.30207-13-philmd@linaro.org>
The SMC91C111 includes an MMU Command register which permits
the guest to remove entries from the RX FIFO. The datasheet
does not specify what happens if the guest tries to do this
when the FIFO is already empty; there are no status registers
containing error bits which might be applicable.
Currently we don't guard at all against pop of an empty
RX FIFO, with the result that we allow the guest to drive
the rx_fifo_len index to negative values, which will cause
smc91c111_receive() to write to the rx_fifo[] array out of
bounds when we receive the next packet.
Instead ignore attempts to pop an empty RX FIFO.
Cc: qemu-stable@nongnu.org
Fixes: 80337b66a8 ("NIC emulation for qemu arm-softmmu")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2780
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20250207151157.3151776-1-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
opentitan_machine_init() calls get_system_memory(),
which is declared in "exec/address-spaces.h". Include
it in order to avoid when refactoring unrelated headers:
hw/riscv/opentitan.c:83:29: error: call to undeclared function 'get_system_memory'
83 | MemoryRegion *sys_mem = get_system_memory();
| ^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20250206181827.41557-4-philmd@linaro.org>
Using the auto_create_sdcard feature without SD Bus is irrelevant.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20250204200934.65279-8-philmd@linaro.org>
MachineClass::auto_create_sdcard is only useful to automatically
create a SD card, attach a IF_SD block drive to it and plug the
card onto a SD bus. None of the RISCV machines modified by this
commit try to use the IF_SD interface.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20250204200934.65279-7-philmd@linaro.org>
MachineClass::auto_create_sdcard is only useful to automatically
create a SD card, attach a IF_SD block drive to it and plug the
card onto a SD bus. None of the ARM machines modified by this
commit try to use the IF_SD interface.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20250204200934.65279-6-philmd@linaro.org>
A number of machines create an if=sd drive by default even though
they lack an SD bus, and therefore cannot use the drive.
This drive is created when the machine sets flag
@auto_create_sdcard.
See for example running HMP "info block" on the HPPA C3700 machine:
$ qemu-system-hppa -M C3700 -monitor stdio -S
(qemu) info block
floppy0: [not inserted]
Removable device: not locked, tray closed
sd0: [not inserted]
Removable device: not locked, tray closed
$ qemu-system-hppa -M C3700 -sd /bin/sh
qemu-system-hppa: -sd /bin/sh: machine type does not support if=sd,bus=0,unit=0
Delete that from machines that lack an SD bus.
Note, only the ARM and RISCV targets use such feature:
$ git grep -wl IF_SD hw | cut -d/ -f-2 | sort -u
hw/arm
hw/riscv
$
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20250204200934.65279-5-philmd@linaro.org>
Invert the 'no_sdcard' logic, renaming it as the more explicit
"auto_create_sdcard". Machines are supposed to create a SD Card
drive when this flag is set. In many cases it doesn't make much
sense (as boards don't expose SD Card host controller), but this
is patch only aims to expose that nonsense; so no logical change
intended (mechanical patch using gsed).
Most of the changes are:
- mc->no_sdcard = ON_OFF_AUTO_OFF;
+ mc->auto_create_sdcard = true;
Except in
. hw/core/null-machine.c
. hw/arm/xilinx_zynq.c
. hw/s390x/s390-virtio-ccw.c
where the disabled option is manually removed (since default):
- mc->no_sdcard = ON_OFF_AUTO_ON;
+ mc->auto_create_sdcard = false;
- mc->auto_create_sdcard = false;
and in system/vl.c we change the 'default_sdcard' type to boolean.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20250204200934.65279-4-philmd@linaro.org>
Update MachineClass::no_sdcard default implicit AUTO
initialization to explicit OFF. This flag is consumed
in system/vl.c::qemu_disable_default_devices(). Use
this place to assert we don't have anymore AUTO state.
In hw/ppc/e500.c we add the ppce500_machine_class_init()
method to initialize once all the inherited classes.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20250204200934.65279-3-philmd@linaro.org>
MachineClass::no_sdcard is initialized as false by default.
To catch all uses, convert it to a tri-state, having the
current default (false) becoming AUTO.
No logical change intended.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20250204200934.65279-2-philmd@linaro.org>
Because the legacy Xen backend devices can optionally be plugged on the
TYPE_PLATFORM_BUS_DEVICE, have it inherit TYPE_DYNAMIC_SYS_BUS_DEVICE.
Remove the implicit TYPE_XENSYSDEV instance_size.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alexander Graf <graf@amazon.com>
Tested-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20250125181343.59151-10-philmd@linaro.org>
Makes the code less sensitive regarding changes in the class hierarchy which
will be performed in the next patch.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20250127094129.15941-1-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Because the TPM TIS sysbus device can be optionally plugged on the
TYPE_PLATFORM_BUS_DEVICE, have it inherit TYPE_DYNAMIC_SYS_BUS_DEVICE.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alexander Graf <graf@amazon.com>
Reviewed-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Message-Id: <20250125181343.59151-9-philmd@linaro.org>
Because the network eTSEC device can be optionally plugged on the
TYPE_PLATFORM_BUS_DEVICE, have it inherit TYPE_DYNAMIC_SYS_BUS_DEVICE.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alexander Graf <graf@amazon.com>
Reviewed-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Tested-by: Bernhard Beschow <shentey@gmail.com>
Acked-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20250125181343.59151-8-philmd@linaro.org>
Do not explain why _X86_IOMMU devices are user_creatable,
have them inherit TYPE_DYNAMIC_SYS_BUS_DEVICE, to explicit
they can optionally be plugged on TYPE_PLATFORM_BUS_DEVICE.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alexander Graf <graf@amazon.com>
Reviewed-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250125181343.59151-7-philmd@linaro.org>
Because the RAM FB device can be optionally plugged on the
TYPE_PLATFORM_BUS_DEVICE, have it inherit TYPE_DYNAMIC_SYS_BUS_DEVICE.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alexander Graf <graf@amazon.com>
Reviewed-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250125181343.59151-6-philmd@linaro.org>
Do not explain why VFIO_PLATFORM devices are user_creatable,
have them inherit TYPE_DYNAMIC_SYS_BUS_DEVICE, to make explicit
that they can optionally be plugged on TYPE_PLATFORM_BUS_DEVICE.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alexander Graf <graf@amazon.com>
Message-Id: <20250125181343.59151-5-philmd@linaro.org>
Some TYPE_SYS_BUS_DEVICEs can be optionally dynamically
plugged on the TYPE_PLATFORM_BUS_DEVICE.
Rather than sometimes noting that with comment around
the 'user_creatable = true' line in each DeviceRealize
handler, introduce an abstract TYPE_DYNAMIC_SYS_BUS_DEVICE
class.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alexander Graf <graf@amazon.com>
Reviewed-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250125181343.59151-4-philmd@linaro.org>
When multiple QOM types are registered in the same file,
it is simpler to use the the DEFINE_TYPES() macro. In
particular because type array declared with such macro
are easier to review.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alexander Graf <graf@amazon.com>
Reviewed-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Reviewed-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20250125181343.59151-3-philmd@linaro.org>
Rather than using the obscure system_bus_info.instance_size,
directly use sizeof(BusState).
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alexander Graf <graf@amazon.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Message-Id: <20250125181343.59151-2-philmd@linaro.org>
Currently, neither i386 nor ARM have real hardware support for per-
thread cache, and there is no clear demand for this specific cache
topology.
Additionally, since ARM even can't support this special cache topology
in device tree, it is unnecessary to support it at this moment, even
though per-thread cache might have potential scheduling benefits for
VMs without CPU affinity.
Therefore, disable thread-level cache topology in the general machine
part. At present, i386 has not enabled SMP cache, so disabling the
thread parameter does not pose compatibility issues.
In the future, if there is a clear demand for this feature, the correct
approach would be to add a new control field in MachineClass.smp_props
and enable it only for the machines that require it.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20250110145115.1574345-2-zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
This changes replaces the use of an explicit literal constant for
the APIC base address mask with the existing symbolic constant
intended for this purpose.
Additionally, we remove the comment about not being able to
re-enable the APIC after disabling it. This is no longer
the case after the APIC implementation's state machine was
modified in 9.0.
Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20241209203629.74436-11-phil@philjordan.eu>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
When a property value is static (not provided by QMP or CLI),
error shouldn't happen, otherwise it is a programming error.
Therefore simplify and use &error_abort as this can't fail.
Reported-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Message-Id: <20241108154317.12129-11-philmd@linaro.org>
sprintf() is deprecated on Darwin since macOS 13.0 / XCode 14.1.
Using qemu_hexdump_line() both fixes the deprecation warning and
simplifies the code base.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
[rth: Keep the linebreaks every 16 bytes]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240412073346.458116-12-richard.henderson@linaro.org>
[PMD: Rebased]
The migration result data is not included in the guestperf
report information; include the result as a report entry
so the developer can check whether the migration was successful
after running guestperf.
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Message-ID: <6303400c2983ffe5647f07caa6406f00ceae4581.1739530098.git.yong.huang@smartx.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Guestperf tool does not cover the multifd compression option
currently, it is worth supporting so that developers can
analysis the migration performance with different
compression algorithms.
Multifd support 4 compression algorithms currently:
zlib, zstd, qpl, uadk
To request that multifd with the specified compression
algorithm such as zlib:
$ ./tests/migration-stress/guestperf.py \
--multifd --multifd-channels 4 --multifd-compression zlib \
--output output.json
To run the entire standardized set of multifd compression
comparisons, with unix migration:
$ ./tests/migration-stress/guestperf-batch.py \
--dst-host localhost --transport unix \
--filter compr-multifd-compression* --output outputdir
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <c0e3313d81e8130f8119ef4f242e4625886278cf.1739530098.git.yong.huang@smartx.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The way to enable multifd migration has been changed by commit,
82137e6c8c (migration: enforce multifd and postcopy preempt to
be set before incoming), and guestperf has not made the
necessary changes. If multifd migration had been enabled in the
previous manner, the following error would have occurred:
Multifd must be set before incoming starts
Supporting deferred migration will fix it.
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <8874e170f890ce0bc6f25cb0d9b9ae307ce2e070.1739530098.git.yong.huang@smartx.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
qmp_migrate guarantees that cpr_channel is not null for
MIG_MODE_CPR_TRANSFER when cpr_state_save is called:
qmp_migrate()
if (s->parameters.mode == MIG_MODE_CPR_TRANSFER && !cpr_channel) {
return;
}
cpr_state_save(cpr_channel)
but cpr_state_save checks for mode differently before using channel,
and Coverity cannot infer that they are equivalent in outgoing QEMU,
and warns that channel may be NULL:
cpr_state_save(channel)
MigMode mode = migrate_mode();
if (mode == MIG_MODE_CPR_TRANSFER) {
f = cpr_transfer_output(channel, errp);
To make Coverity happy, assert that channel != NULL in cpr_state_save.
Resolves: Coverity CID 1590980
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Message-ID: <1738788841-211843-1-git-send-email-steven.sistare@oracle.com>
[assert instead of using parameters.mode in cpr_state_save]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Update the migrate_cancel command documentation with a few words about
postcopy and the expected state of the machine after migration.
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20250213175927.19642-10-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The qmp_migrate_cancel() command is poorly tested and code inspection
reveals that there might be concurrency issues with its usage. Add a
test that runs a migration and calls qmp_migrate_cancel() at specific
moments.
In order to make the test more deterministic, instead of calling
qmp_migrate_cancel() at random moments during migration, do it after
the migration status change events are seen.
The expected result is that qmp_migrate_cancel() on the source ends
migration on the source with the "cancelled" state and ends migration
on the destination with the "failed" state. The only exception is that
a failed migration should continue in the failed state.
Cancelling is not allowed during postcopy (no test is added for this
because it's a trivial check in the code).
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20250213175927.19642-9-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Introduce a new migration_test_add_suffix to allow programmatic
creation of tests based on a suffix. Pass the test name into the test
so it can know which variant to run.
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20250213175927.19642-8-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The expected outcome from qmp_migrate_cancel() is that the source
migration goes to the terminal state
MIGRATION_STATUS_CANCELLED. Anything different from this is a bug when
cancelling.
Make sure there is never a state transition from an unspecified state
into FAILED. Code that sets FAILED, should always either make sure
that the old state is not CANCELLING or specify the old state.
Note that the destination is allowed to go into FAILED, so there's no
issue there.
(I don't think this is relevant as a backport because cancelling does
work, it just doesn't show the right state at the end)
Fixes: 3dde8fdbad ("migration: Merge precopy/postcopy on switchover start")
Fixes: d0edb8a173 ("migration: Create the postcopy preempt channel asynchronously")
Fixes: 8518278a6a ("migration: implementation of background snapshot thread")
Fixes: bf78a046b9 ("migration: refactor migrate_fd_connect failures")
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20250213175927.19642-7-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
After postcopy has started, it's not possible to recover the source
machine in case a migration error occurs because the destination has
already been changing the state of the machine. For that same reason,
it doesn't make sense to try to cancel the migration after postcopy
has started. Reject the cancel command during postcopy.
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20250213175927.19642-6-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
If the destination side fails at migration_ioc_process_incoming()
before starting the coroutine, it will report the error but QEMU will
not exit.
Set the migration state to FAILED and exit the process if
exit-on-error allows.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2633
Reported-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20250213175927.19642-5-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Remove all instances of _fd_ from the migration generic code. These
functions have grown over time and the _fd_ part is now just
confusing.
migration_fd_error() -> migration_error() makes it a little
vague. Since it's only used for migration_connect() failures, change
it to migration_connect_set_error().
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20250213175927.19642-4-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
There's no need for two separate functions and this _fd_ is a historic
artifact that makes little sense nowadays.
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20250213175927.19642-3-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
There's no point passing the error into migration cancel only for it
to call migrate_set_error().
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-ID: <20250213175927.19642-2-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
We're currently only checking the QEMUFile error after
qemu_loadvm_state(). This was causing a TLS termination error from
multifd recv threads to be ignored.
Start checking the migration error as well to avoid missing further
errors.
Regarding compatibility concerning the TLS termination error that was
being ignored, for QEMUs <= 9.2 - if the old QEMU is being used as
migration source - the recently added migration property
multifd-tls-clean-termination needs to be set to OFF in the
*destination* machine.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
We're currently changing the way the source multifd migration handles
the shutdown of the multifd channels when TLS is in use to perform a
clean termination by calling gnutls_bye().
Older src QEMUs will always close the channel without terminating the
TLS session. New dst QEMUs treat an unclean termination as an error.
Add multifd_clean_tls_termination (default true) that can be switched
on the destination whenever a src QEMU <= 9.2 is in use.
(Note that the compat property is only strictly necessary for src
QEMUs older than 9.1. Due to synchronization coincidences, src QEMUs
9.1 and 9.2 can put the destination in a condition where it doesn't
see the unclean termination. Still, make the property more inclusive
to facilitate potential backports.)
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The multifd recv side has been getting a TLS error of
GNUTLS_E_PREMATURE_TERMINATION at the end of migration when the send
side closes the sockets without ending the TLS session. This has been
masked by the code not checking the migration error after loadvm.
Start ending the TLS session at multifd_send_shutdown() so the recv
side always sees a clean termination (EOF) and we can start to
differentiate that from an actual premature termination that might
possibly happen in the middle of the migration.
There's nothing to be done if a previous migration error has already
broken the connection, so add a comment explaining it and ignore any
errors coming from gnutls_bye().
This doesn't break compat with older recv-side QEMUs because EOF has
always caused the recv thread to exit cleanly.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Add a read flag that can inform a channel that it's ok to receive an
EOF at any moment. Channels that have some form of strict EOF
tracking, such as TLS session termination, may choose to ignore EOF
errors with the use of this flag.
This is being added for compatibility with older migration streams
that do not include a TLS termination step.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
We want to pass flags into qio_channel_tls_readv() but
qio_channel_readv_full_all_eof() doesn't take a flags argument.
No functional change.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The correct way of calling qcrypto_tls_session_handshake() requires
calling qcrypto_tls_session_get_handshake_status() right after it so
there's no reason to have a separate method.
Refactor qcrypto_tls_session_handshake() to inform the status in its
own return value and alter the callers accordingly.
No functional change.
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>