Add basic types and flags used by VFIO multifd device state transfer
support.
Since we'll be introducing a lot of multifd transfer specific code,
add a new file migration-multifd.c to home it, wired into main VFIO
migration code (migration.c) via migration-multifd.h header file.
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/4eedd529e6617f80f3d6a66d7268a0db2bc173fa.1741124640.git.maciej.szmigiero@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
So it can be safety accessed from multiple threads.
This variable type needs to be changed to unsigned long since
32-bit host platforms lack the necessary addition atomics on 64-bit
variables.
Using 32-bit counters on 32-bit host platforms should not be a problem
in practice since they can't realistically address more memory anyway.
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Link: https://lore.kernel.org/qemu-devel/dc391771d2d9ad0f311994f0cb9e666da564aeaf.1741124640.git.maciej.szmigiero@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
And rename existing load_device_config_state trace event to
load_device_config_state_end for consistency since it is triggered at the
end of loading of the VFIO device config state.
This way both the start and end points of particular device config
loading operation (a long, BQL-serialized operation) are known.
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Link: https://lore.kernel.org/qemu-devel/1b6c5a2097e64c272eb7e53f9e4cca4b79581b38.1741124640.git.maciej.szmigiero@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This SaveVMHandler helps device provide its own asynchronous transmission
of the remaining data at the end of a precopy phase via multifd channels,
in parallel with the transfer done by save_live_complete_precopy handlers.
These threads are launched only when multifd device state transfer is
supported.
Management of these threads in done in the multifd migration code,
wrapping them in the generic thread pool.
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/qemu-devel/eac74a4ca7edd8968bbf72aa07b9041c76364a16.1741124640.git.maciej.szmigiero@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Since device state transfer via multifd channels requires multifd
channels with packets and is currently not compatible with multifd
compression add an appropriate query function so device can learn
whether it can actually make use of it.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Link: https://lore.kernel.org/qemu-devel/1ff0d98b85f470e5a33687406e877583b8fab74e.1741124640.git.maciej.szmigiero@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
The newly introduced device state buffer can be used for either storing
VFIO's read() raw data, but already also possible to store generic device
states. After noticing that device states may not easily provide a max
buffer size (also the fact that RAM MultiFDPages_t after all also want to
have flexibility on managing offset[] array), it may not be a good idea to
stick with union on MultiFDSendData.. as it won't play well with such
flexibility.
Switch MultiFDSendData to a struct.
It won't consume a lot more space in reality, after all the real buffers
were already dynamically allocated, so it's so far only about the two
structs (pages, device_state) that will be duplicated, but they're small.
With this, we can remove the pretty hard to understand alloc size logic.
Because now we can allocate offset[] together with the SendData, and
properly free it when the SendData is freed.
[MSS: Make sure to clear possible device state payload before freeing
MultiFDSendData, remove placeholders for other patches not included]
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Acked-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/qemu-devel/7b02baba8e6ddb23ef7c349d312b9b631db09d7e.1741124640.git.maciej.szmigiero@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This way if there are fields there that needs explicit disposal (like, for
example, some attached buffers) they will be handled appropriately.
Add a related assert to multifd_set_payload_type() in order to make sure
that this function is only used to fill a previously empty MultiFDSendData
with some payload, not the other way around.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Link: https://lore.kernel.org/qemu-devel/6755205f2b95abbed251f87061feee1c0e410836.1741124640.git.maciej.szmigiero@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
multifd_send() function is currently not thread safe, make it thread safe
by holding a lock during its execution.
This way it will be possible to safely call it concurrently from multiple
threads.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Link: https://lore.kernel.org/qemu-devel/dd0f3bcc02ca96a7d523ca58ea69e495a33b453b.1741124640.git.maciej.szmigiero@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Add a basic support for receiving device state via multifd channels -
channels that are shared with RAM transfers.
Depending whether MULTIFD_FLAG_DEVICE_STATE flag is present or not in the
packet header either device state (MultiFDPacketDeviceState_t) or RAM
data (existing MultiFDPacket_t) is read.
The received device state data is provided to
qemu_loadvm_load_state_buffer() function for processing in the
device's load_state_buffer handler.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Link: https://lore.kernel.org/qemu-devel/9b86f806c134e7815ecce0eee84f0e0e34aa0146.1741124640.git.maciej.szmigiero@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Read packet header first so in the future we will be able to
differentiate between a RAM multifd packet and a device state multifd
packet.
Since these two are of different size we can't read the packet body until
we know which packet type it is.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Link: https://lore.kernel.org/qemu-devel/832ad055fe447561ac1ad565d61658660cb3f63f.1741124640.git.maciej.szmigiero@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Some drivers might want to make use of auxiliary helper threads during VM
state loading, for example to make sure that their blocking (sync) I/O
operations don't block the rest of the migration process.
Add a migration core managed thread pool to facilitate this use case.
The migration core will wait for these threads to finish before
(re)starting the VM at destination.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Link: https://lore.kernel.org/qemu-devel/b09fd70369b6159c75847e69f235cb908b02570c.1741124640.git.maciej.szmigiero@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
All callers to migration_incoming_state_destroy() other than
postcopy_ram_listen_thread() do this call with BQL held.
Since migration_incoming_state_destroy() ultimately calls "load_cleanup"
SaveVMHandlers and it will soon call BQL-sensitive code it makes sense
to always call that function under BQL rather than to have it deal with
both cases (with BQL and without BQL).
Add the necessary bql_lock() and bql_unlock() to
postcopy_ram_listen_thread().
qemu_loadvm_state_main() in postcopy_ram_listen_thread() could call
"load_state" SaveVMHandlers that are expecting BQL to be held.
In principle, the only devices that should be arriving on migration
channel serviced by postcopy_ram_listen_thread() are those that are
postcopiable and whose load handlers are safe to be called without BQL
being held.
But nothing currently prevents the source from sending data for "unsafe"
devices which would cause trouble there.
Add a TODO comment there so it's clear that it would be good to improve
handling of such (erroneous) case in the future.
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Link: https://lore.kernel.org/qemu-devel/21bb5ca337b1d5a802e697f553f37faf296b5ff4.1741193259.git.maciej.szmigiero@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
qemu_loadvm_load_state_buffer() and its load_state_buffer
SaveVMHandler allow providing device state buffer to explicitly
specified device via its idstr and instance id.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Link: https://lore.kernel.org/qemu-devel/71ca753286b87831ced4afd422e2e2bed071af25.1741124640.git.maciej.szmigiero@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This QEMU_VM_COMMAND sub-command and its switchover_start SaveVMHandler is
used to mark the switchover point in main migration stream.
It can be used to inform the destination that all pre-switchover main
migration stream data has been sent/received so it can start to process
post-switchover data that it might have received via other migration
channels like the multifd ones.
Add also the relevant MigrationState bit stream compatibility property and
its hw_compat entry.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Zhang Chen <zhangckid@gmail.com> # for the COLO part
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Link: https://lore.kernel.org/qemu-devel/311be6da85fc7e49a7598684d80aa631778dcbce.1741124640.git.maciej.szmigiero@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Migration code wants to manage device data sending threads in one place.
QEMU has an existing thread pool implementation, however it is limited
to queuing AIO operations only and essentially has a 1:1 mapping between
the current AioContext and the AIO ThreadPool in use.
Implement generic (non-AIO) ThreadPool by essentially wrapping Glib's
GThreadPool.
This brings a few new operations on a pool:
* thread_pool_wait() operation waits until all the submitted work requests
have finished.
* thread_pool_set_max_threads() explicitly sets the maximum thread count
in the pool.
* thread_pool_adjust_max_threads_to_work() adjusts the maximum thread count
in the pool to equal the number of still waiting in queue or unfinished work.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Link: https://lore.kernel.org/qemu-devel/b1efaebdbea7cb7068b8fb74148777012383e12b.1741124640.git.maciej.szmigiero@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
These names conflict with ones used by future generic thread pool
equivalents.
Generic names should belong to the generic pool type, not specific (AIO)
type.
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Link: https://lore.kernel.org/qemu-devel/70f9e0fb4b01042258a1a57996c64d19779dc7f0.1741124640.git.maciej.szmigiero@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This function name conflicts with one used by a future generic thread pool
function and it was only used by one test anyway.
Update the trace event name in thread_pool_submit_aio() accordingly.
Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Link: https://lore.kernel.org/qemu-devel/6830f07777f939edaf0a2d301c39adcaaf3817f0.1741124640.git.maciej.szmigiero@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
It's possible for {load,save}_cleanup SaveVMHandlers to get called without
the corresponding {load,save}_setup handler being called first.
One such example is if {load,save}_setup handler of a proceeding device
returns error.
In this case the migration core cleanup code will call all corresponding
cleanup handlers, even for these devices which haven't had its setup
handler called.
Since this behavior can generate some surprises let's clearly document it
in these SaveVMHandlers description.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Link: https://lore.kernel.org/qemu-devel/991636623fb780350f493b5f045cb17e13ce4c0f.1741124640.git.maciej.szmigiero@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
As an outcome of KVM forum 2024 "vfio-platform: live and let die?"
talk, let's deprecate vfio-platform devices.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250305124225.952791-1-eric.auger@redhat.com
[ clg: Fixed spelling in vfio-amd-xgbe section ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
As suggested by Cédric, I'm glad to be a maintainer of vfio-igd.
Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250227162741.9860-1-tomitamoeko@gmail.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
We want the device in the D0 power state going into reset, but the
config write can enable the BARs in the address space, which are
then removed from the address space once we clear the memory enable
bit in the command register. Re-order to clear the command bit
first, so the power state change doesn't enable the BARs.
Cc: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250225215237.3314011-6-alex.williamson@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
The pm_cap on the PCIExpressDevice object can be distilled down
to the new instance on the PCIDevice object.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250225215237.3314011-5-alex.williamson@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This is now redundant to PCIDevice.pm_cap.
Cc: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250225215237.3314011-4-alex.williamson@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Switch callers directly initializing the PCI PM capability with
pci_add_capability() to use pci_pm_init().
Cc: Dmitry Fleytman <dmitry.fleytman@gmail.com>
Cc: Akihiko Odaki <akihiko.odaki@daynix.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Stefan Weil <sw@weilnetz.de>
Cc: Sriram Yagnaraman <sriram.yagnaraman@ericsson.com>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Klaus Jensen <its@irrelevant.dk>
Cc: Jesper Devantier <foss@defmacro.it>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250225215237.3314011-3-alex.williamson@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
The memory and IO BARs for devices are only accessible in the D0 power
state. In other power states the PCI spec defines that the device
responds to TLPs and messages with an Unsupported Request response.
To approximate this behavior, consider the BARs as unmapped when the
device is not in the D0 power state. This makes the BARs inaccessible
and has the additional bonus for vfio-pci that we don't attempt to DMA
map BARs for devices in a non-D0 power state.
To support this, an interface is added for devices to register the PM
capability, which allows central tracking to enforce valid transitions
and unmap BARs in non-D0 states.
NB. We currently have device models (eepro100 and pcie_pci_bridge)
that register a PM capability but do not set wmask to enable writes to
the power state field. In order to maintain migration compatibility,
this new helper does not manage the wmask to enable guest writes to
initiate a power state change. The contents and write access of the
PM capability are still managed by the caller.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250225215237.3314011-2-alex.williamson@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Investigate the git history to uncover when and why the VFIO
properties were introduced and update the models. This is mostly
targeting vfio-pci device, since vfio-platform, vfio-ap and vfio-ccw
devices are simpler.
Sort the properties based on the QEMU version in which they were
introduced.
Cc: Tony Krowiak <akrowiak@linux.ibm.com>
Cc: Eric Farman <farman@linux.ibm.com>
Cc: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Anthony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com> # vfio-ccw
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250217173455.449983-1-clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This error was discovered by fuzzing qemu-img.
In the QED block driver, the need_check_timer timer is freed in
bdrv_qed_detach_aio_context, but the pointer to the timer is not
set to NULL. This can lead to a use-after-free scenario
in bdrv_qed_drain_begin().
The need_check_timer pointer is set to NULL after freeing the timer.
Which helps catch this condition when checking in bdrv_qed_drain_begin().
Closes: https://gitlab.com/qemu-project/qemu/-/issues/2852
Signed-off-by: Denis Rastyogin <gerben@altlinux.org>
Message-ID: <20250304083927.37681-1-gerben@altlinux.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
At least the simple trace backend works by spawning a helper thread,
and setting up an atexit() handler that coordinates completion with
the helper thread. But since atexit registrations survive fork() but
helper threads do not, this means that qemu-nbd configured to use the
simple trace will deadlock waiting for a thread that no longer exists
when it has daemonized.
Better is to follow the example of vl.c: don't call any setup
functions that might spawn helper threads until we are in the final
process that will be doing the work worth tracing.
Tested by configuring with --enable-trace-backends=simple, then running
qemu-nbd --fork --trace=nbd_\*,file=qemu-nbd.trace -f raw -r README.rst
followed by `nbdinfo nbd://localhost`, and observing that the trace
file is now created without hanging.
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-ID: <20250227220625.870246-2-eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQQNhkKjomWfgLCz0aQfewwSUazn0QUCZ8ezJgAKCRAfewwSUazn
0T9pAQCKb+C69kbf9cEVg7PU/z5I0ALFtCNWCKxSkWZPDPik4gEA3IYwdrJ+csuX
8nWL0fzyk+8+LDzEwEgCYoNcMnttRQw=
=P8mi
-----END PGP SIGNATURE-----
Merge tag 'pull-loongarch-20250305' of https://gitlab.com/bibo-mao/qemu into staging
loongarch queue
# -----BEGIN PGP SIGNATURE-----
#
# iHUEABYKAB0WIQQNhkKjomWfgLCz0aQfewwSUazn0QUCZ8ezJgAKCRAfewwSUazn
# 0T9pAQCKb+C69kbf9cEVg7PU/z5I0ALFtCNWCKxSkWZPDPik4gEA3IYwdrJ+csuX
# 8nWL0fzyk+8+LDzEwEgCYoNcMnttRQw=
# =P8mi
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 05 Mar 2025 10:12:54 HKT
# gpg: using EDDSA key 0D8642A3A2659F80B0B3D1A41F7B0C1251ACE7D1
# gpg: Good signature from "bibo mao <maobibo@loongson.cn>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 7044 3A00 19C0 E97A 31C7 13C4 8E86 8FB7 A176 9D4C
# Subkey fingerprint: 0D86 42A3 A265 9F80 B0B3 D1A4 1F7B 0C12 51AC E7D1
* tag 'pull-loongarch-20250305' of https://gitlab.com/bibo-mao/qemu:
target/loongarch: Adjust the cpu reset action to a proper position
hw/loongarch/virt: Enable cpu hotplug feature on virt machine
hw/loongarch/virt: Update the ACPI table for hotplug cpu
hw/loongarch/virt: Implement cpu plug interface
hw/loongarch/virt: Implement cpu unplug interface
hw/loongarch/virt: Add basic cpu plug interface framework
hw/loongarch/virt: Add topo properties on CPU object
hw/loongarch/virt: Add CPU topology support
hw/intc/loongarch_extioi: Use cpu plug notification
hw/intc/loongarch_extioi: Implment cpu hotplug interface
hw/intc/loongarch_extioi: Add basic hotplug framework
hw/intc/loongarch_extioi: Move gpio irq initial to common code
hw/intc/loongarch_ipi: Notify ipi object when cpu is plugged
hw/intc/loongarch_ipi: Implment cpu hotplug interface
hw/intc/loongarch_ipi: Add basic hotplug framework
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* CSR coverity fixes
* Fix unexpected behavior of vector reduction instructions when vl is 0
* Fix incorrect vlen comparison in prop_vlen_set
* Throw debug exception before page fault
* Remove redundant "hart_idx" masking from APLIC
* Add support for Control Transfer Records Ext
* Remove redundant struct members from the IOMMU
* Remove duplicate definitions from the IOMMU
* Fix tick_offset migration for Goldfish RTC
* Add serial alias in virt machine DTB
* Remove Bin Meng from RISC-V maintainers
* Add support for Control Transfer Records Ext
* Log guest errors when reserved bits are set in PTEs
* Add missing Sdtrig disas CSRs
* Correct the hpmevent sscofpmf mask
* Mask upper sscofpmf bits during validation
* Remove warnings about Smdbltrp/Smrnmi being disabled
* Respect mseccfg.RLB bit for TOR mode PMP entry
* Update KVM support to Linux 6.14-rc3
* IOMMU HPM support
* Support Sscofpmf/Svade/Svadu/Smnpm/Ssnpm extensions in KVM
* Add --ignore-family option to binfmt
* Refinement for AIA with KVM acceleration
* Reset time changes for KVM
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmfHrkEACgkQr3yVEwxT
gBNGTA/+N9nBPZt5cv0E/0EDZMQS8RQrQvz1yHRgAXOq8RnOdcL72v8wovGAfnVu
l0BXDoVBvw4f2Xm9Q4ptlfH8HAefCeQ4E/K9j5Lwxr8OqZHFg6e+JQIyZOt6wBWI
hJbz1/laJIbXq3cGgwcE/l0aGfb2UAAsA4dsZVt/MnjAV8GS7BF9RCkgCPxD4FZA
0PLiq9dF+4o4q7PxnxAbUVz/uhLzqmcnQemQFHbf9Wms3tZEDKmPSoKP/v+01Rkw
tm+cgy7OocpgygbMc0nykYG50P+raUBSesk/jFGeKj8cU4IeMuzDsVPWcd4rG+0X
Z+nENfOY7vOqMCXgaQCW2r4vEQx2Gj0yQG6xmVAemRWzFHJdz5W01/uUSHzJSB+L
+VbAH55HYKr6sbgecqInQ/rsHKyw6D5QFcj/guz+kvhsH9rJ5q60uywrWL5OEuaK
vKv7cSZghlf9bwy6soassXxk8z+j4psJ7WnnVpynNKMew9yFFDhayuIFbo9952gH
3+NCm2cQrkTYJOXAJwkxBD+I4AXxNSuxNjaVANk9q80uqbT9JiHM7pcvbJI00Fji
OutJSPYtVXEin9Ev3sJ05YQHsIcZ/Noi3O5IdaRI0AMk/8gyGyhFCVgSpV52dH59
HguPK05e5cW/xgElGUPHrU+UtzE05p18HnSoVPclF/B5rc8QXN0=
=dobk
-----END PGP SIGNATURE-----
Merge tag 'pull-riscv-to-apply-20250305-1' of https://github.com/alistair23/qemu into staging
Third RISC-V PR for 10.0
* CSR coverity fixes
* Fix unexpected behavior of vector reduction instructions when vl is 0
* Fix incorrect vlen comparison in prop_vlen_set
* Throw debug exception before page fault
* Remove redundant "hart_idx" masking from APLIC
* Add support for Control Transfer Records Ext
* Remove redundant struct members from the IOMMU
* Remove duplicate definitions from the IOMMU
* Fix tick_offset migration for Goldfish RTC
* Add serial alias in virt machine DTB
* Remove Bin Meng from RISC-V maintainers
* Add support for Control Transfer Records Ext
* Log guest errors when reserved bits are set in PTEs
* Add missing Sdtrig disas CSRs
* Correct the hpmevent sscofpmf mask
* Mask upper sscofpmf bits during validation
* Remove warnings about Smdbltrp/Smrnmi being disabled
* Respect mseccfg.RLB bit for TOR mode PMP entry
* Update KVM support to Linux 6.14-rc3
* IOMMU HPM support
* Support Sscofpmf/Svade/Svadu/Smnpm/Ssnpm extensions in KVM
* Add --ignore-family option to binfmt
* Refinement for AIA with KVM acceleration
* Reset time changes for KVM
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmfHrkEACgkQr3yVEwxT
# gBNGTA/+N9nBPZt5cv0E/0EDZMQS8RQrQvz1yHRgAXOq8RnOdcL72v8wovGAfnVu
# l0BXDoVBvw4f2Xm9Q4ptlfH8HAefCeQ4E/K9j5Lwxr8OqZHFg6e+JQIyZOt6wBWI
# hJbz1/laJIbXq3cGgwcE/l0aGfb2UAAsA4dsZVt/MnjAV8GS7BF9RCkgCPxD4FZA
# 0PLiq9dF+4o4q7PxnxAbUVz/uhLzqmcnQemQFHbf9Wms3tZEDKmPSoKP/v+01Rkw
# tm+cgy7OocpgygbMc0nykYG50P+raUBSesk/jFGeKj8cU4IeMuzDsVPWcd4rG+0X
# Z+nENfOY7vOqMCXgaQCW2r4vEQx2Gj0yQG6xmVAemRWzFHJdz5W01/uUSHzJSB+L
# +VbAH55HYKr6sbgecqInQ/rsHKyw6D5QFcj/guz+kvhsH9rJ5q60uywrWL5OEuaK
# vKv7cSZghlf9bwy6soassXxk8z+j4psJ7WnnVpynNKMew9yFFDhayuIFbo9952gH
# 3+NCm2cQrkTYJOXAJwkxBD+I4AXxNSuxNjaVANk9q80uqbT9JiHM7pcvbJI00Fji
# OutJSPYtVXEin9Ev3sJ05YQHsIcZ/Noi3O5IdaRI0AMk/8gyGyhFCVgSpV52dH59
# HguPK05e5cW/xgElGUPHrU+UtzE05p18HnSoVPclF/B5rc8QXN0=
# =dobk
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 05 Mar 2025 09:52:01 HKT
# gpg: using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65 9296 AF7C 9513 0C53 8013
* tag 'pull-riscv-to-apply-20250305-1' of https://github.com/alistair23/qemu: (59 commits)
target/riscv/kvm: add missing KVM CSRs
target/riscv/kvm: add kvm_riscv_reset_regs_csr()
target/riscv/cpu: remove unneeded !kvm_enabled() check
hw/intc/aplic: refine kvm_msicfgaddr
hw/intc/aplic: refine the APLIC realize
hw/intc/imsic: refine the IMSIC realize
binfmt: Add --ignore-family option
binfmt: Normalize host CPU architecture
binfmt: Shuffle things around
target/riscv/kvm: Add some exts support
docs/specs/riscv-iommu.rst: add HPM support info
hw/riscv: add IOMMU HPM trace events
hw/riscv/riscv-iommu.c: add RISCV_IOMMU_CAP_HPM cap
hw/riscv/riscv-iommu: add hpm events mmio write
hw/riscv/riscv-iommu: add IOHPMCYCLES mmio write
hw/riscv/riscv-iommu: add IOCOUNTINH mmio writes
hw/riscv/riscv-iommu: instantiate hpm_timer
hw/riscv/riscv-iommu: add riscv_iommu_hpm_incr_ctr()
hw/riscv/riscv-iommu: add riscv-iommu-hpm file
hw/riscv/riscv-iommu-bits.h: HPM bits
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If the chardev is client, the socket file path in localAddr may be NULL.
This is because the socket path comes from getsockname(), according
to man page, getsockname() returns the current address bound by the
socket sockfd. If the chardev is client, it's socket is unbound sockfd.
Therefore, when computing the client chardev socket file path, using
remoteAddr is more appropriate.
Signed-off-by: Haoqian He <haoqian.he@smartx.com>
Acked-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20250225104526.2924175-1-haoqian.he@smartx.com>
This patch implements DCH (delete character) and ICH (insert
character) commands.
DCH - Delete Character:
"As characters are deleted, the remaining characters between the
cursor and right margin move to the left. Character attributes move
with the characters. The terminal adds blank spaces with no visual
character attributes at the right margin. DCH has no effect outside
the scrolling margins" [1].
ICH - Insert Character:
"The ICH sequence inserts Pn blank characters with the normal
character attribute. The cursor remains at the beginning of the
blank characters. Text between the cursor and right margin moves to
the right. Characters scrolled past the right margin are lost. ICH
has no effect outside the scrolling margins" [2].
Without these commands console is barely usable.
[1] https://vt100.net/docs/vt510-rm/DCH.html
[1] https://vt100.net/docs/vt510-rm/ICH.html
Signed-off-by: Roman Penyaev <r.peniaev@gmail.com>
Cc: "Marc-André Lureau" <marcandre.lureau@redhat.com>
Cc: qemu-devel@nongnu.org
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20250226075913.353676-6-r.peniaev@gmail.com>
There are aliases for save and restore cursor commands:
* save cursor
`ESC 7` (DEC Save Cursor [1], older VT100)
`ESC [ s` (CSI Save Cursor, standard ANSI)
* load cursor
`ESC 8` (DEC Restore Cursor [2], older VT100)
`ESC [ u` (CSI Restore Cursor, standard ANSI)
This change introduces older DEC sequencies for compatibility with
some scripts (for example [3]) and tools.
This change also adds saving and restoring of character attributes,
which is according to the VT spec [1][2]
[1] https://vt100.net/docs/vt510-rm/DECSC.html
[2] https://vt100.net/docs/vt510-rm/DECRC.html
[3] https://wiki.archlinux.org/title/Working_with_the_serial_console#Resizing_a_terminal
Signed-off-by: Roman Penyaev <r.peniaev@gmail.com>
Cc: "Marc-André Lureau" <marcandre.lureau@redhat.com>
Cc: qemu-devel@nongnu.org
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20250226075913.353676-5-r.peniaev@gmail.com>
The format of the CSI cursor position report is `ESC[row;columnR`,
where `row` is a row of a cursor in the screen, not in the scrollback
buffer. What's the difference? Let's say the terminal screen has 24
lines, no matter how long the scrollback buffer may be, the last line
is the 24th.
For example the following command can be executed in xterm on the last
screen line:
$ echo -en '\e[6n'; IFS='[;' read -sdR _ row col; echo $row:$col
24:1
It shows the cursor position on the current screen and not relative
to the backscroll buffer.
Before this change the row number was always increasing for the QEMU
VC and represents the cursor position relative to the backscroll
buffer.
Signed-off-by: Roman Penyaev <r.peniaev@gmail.com>
Cc: "Marc-André Lureau" <marcandre.lureau@redhat.com>
Cc: qemu-devel@nongnu.org
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20250226075913.353676-4-r.peniaev@gmail.com>
Terminal Device Status Report (DSR) [1] should be sent to an
application, not rendered to the screen. This patch fixes rendering of
terminal report, which appear only on the graphical screen of the
terminal (console "vc") and can be reproduced by the following
command:
echo -en '\e[6n'; IFS='[;' read -sdR _ row col; echo $row:$col
Command requests cursor position and waits for terminal response, but
instead, the response is rendered to the graphical screen and never
sent to an application.
Why bother? Busybox shell (ash) in Alpine distribution requests cursor
position on each shell prompt (once <ENTER> is pressed), which makes a
prompt on a graphical screen corrupted with repeating Cursor Position
Report (CPR) [2]:
[root@alpine ~]# \033[57;1R]
Which is very annoying and incorrect.
[1] https://vt100.net/docs/vt100-ug/chapter3.html#DSR
[2] https://vt100.net/docs/vt100-ug/chapter3.html#CPR
Signed-off-by: Roman Penyaev <r.peniaev@gmail.com>
Cc: "Marc-André Lureau" <marcandre.lureau@redhat.com>
Cc: qemu-devel@nongnu.org
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20250226075913.353676-3-r.peniaev@gmail.com>
This change introduces parsing of the 'ESC ( <ch>' sequence, which is
supposed to change character set [1]. In the QEMU case, the
introduced parsing logic does not actually change the character set, but
simply parses the sequence and does not let output of a tool to be
corrupted with leftovers: `top` sends 'ESC ( B', so if character
sequence is not parsed correctly, chracter 'B' appears in the output:
Btop - 11:08:42 up 5 min, 1 user, load average: 0BB
Tasks:B 158 Btotal,B 1 Brunning,B 157 Bsleeping,B 0 BsBB
%Cpu(s):B 0.0 Bus,B 0.0 Bsy,B 0.0 Bni,B 99.8 Bid,B 0.2 BB
MiB Mem :B 7955.6 Btotal,B 7778.6 Bfree,B 79.6 BB
MiB Swap:B 0.0 Btotal,B 0.0 Bfree,B 0.0 BB
PID USER PR NI VIRT RES SHR S B
B 735 root 20 0 9328 3540 3152 R B
B 1 root 20 0 20084 10904 8404 S B
B 2 root 20 0 0 0 0 S B
[1] https://vt100.net/docs/vt100-ug/chapter3.html#SCS
Signed-off-by: Roman Penyaev <r.peniaev@gmail.com>
Cc: "Marc-André Lureau" <marcandre.lureau@redhat.com>
Cc: qemu-devel@nongnu.org
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20250226075913.353676-2-r.peniaev@gmail.com>
The commit 5a99a10da6cf ("target/loongarch: fix vcpu reset command word issue")
fixes the error in the cpu reset ioctl command word delivery process,
so that the command word can be delivered correctly, and adds the judgment
and processing of the error return value, which exposes another problem that
under loongarch, the cpu reset action is earlier than the creation of vcpu.
An error occurs when the cpu reset command is sent.
Now adjust the order of cpu reset and vcpu create actions to fix this problem
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
On virt machine, enable CPU hotplug feature has_hotpluggable_cpus. For
hot-added CPUs, there is socket-id/core-id/thread-id property set,
arch_id can be caculated from these properties. So that cpu slot can be
searched from its arch_id.
Co-developed-by: Xianglai Li <lixianglai@loongson.cn>
Signed-off-by: Bibo Mao <maobibo@loongson.cn>