Proxmox-Port/qemu - qemu - Gitea: Git with a cup of tea

mirror of https://github.com/qemu/qemu.git synced 2025-08-15 05:04:03 +00:00

Author	SHA1	Message	Date
Steve Sistare	322ee16824	vfio/pci: preserve pending interrupts cpr-transfer may lose a VFIO interrupt because the KVM instance is destroyed and recreated. If an interrupt arrives in the middle, it is dropped. To fix, stop pending new interrupts during cpr save, and pick up the pieces. In more detail: Stop the VCPUs. Call kvm_irqchip_remove_irqfd_notifier_gsi --> KVM_IRQFD to deassign the irqfd gsi that routes interrupts directly to the VCPU and KVM. After this call, interrupts fall back to the kernel vfio_msihandler, which writes to QEMU's kvm_interrupt eventfd. CPR already preserves that eventfd. When the route is re-established in new QEMU, the kernel tests the eventfd and injects an interrupt to KVM if necessary. Deassign INTx in a similar manner. For both MSI and INTx, remove the eventfd handler so old QEMU does not consume an event. If an interrupt was already pended to KVM prior to the completion of kvm_irqchip_remove_irqfd_notifier_gsi, it will be recovered by the subsequent call to cpu_synchronize_all_states, which pulls KVM interrupt state to userland prior to saving it in vmstate. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/1752689169-233452-3-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>	2025-08-09 00:06:48 +02:00
Steve Sistare	76cfb87f5f	vfio/pci: augment set_handler Extend vfio_pci_msi_set_handler() so it can set or clear the handler. Add a similar accessor for INTx. No functional change. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/1752689169-233452-2-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>	2025-08-09 00:06:48 +02:00
Tomita Moeko	0db7e4cb62	vfio/igd: Fix VGA regions are not exposed in legacy mode In commit `a59d06305f` ("vfio/pci: Introduce x-pci-class-code option"), pci_register_vga() has been moved ouside of vfio_populate_vga(). As a result, IGD VGA ranges are no longer properly exposed to guest. To fix this, call pci_register_vga() after vfio_populate_vga() legacy mode. A wrapper function vfio_pci_config_register_vga() is introduced to handle it. Fixes: `a59d06305f` ("vfio/pci: Introduce x-pci-class-code option") Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250723160906.44941-3-tomitamoeko@gmail.com Signed-off-by: Cédric Le Goater <clg@redhat.com>	2025-07-28 17:52:34 +02:00
Steve Sistare	9751377c3a	vfio: fix sub-page bar after cpr Regions for sub-page BARs are normally mapped here, in response to the guest writing to PCI config space: vfio_pci_write_config() pci_default_write_config() pci_update_mappings() memory_region_add_subregion() vfio_sub_page_bar_update_mapping() ... vfio_dma_map() However, after CPR, the guest does not reconfigure the device and the code path above is not taken. To fix, in vfio_cpr_pci_post_load, call vfio_sub_page_bar_update_mapping for each sub-page BAR with a valid address. Fixes: `7e9f214113` ("vfio/container: restore DMA vaddr") Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/1752520890-223356-1-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>	2025-07-28 17:52:34 +02:00
Stefan Hajnoczi	ebcc602aae	Load ramfb vgabios on x86 only. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEoDKM/7k6F6eZAf59TLbY7tPocTgFAmh6o80ACgkQTLbY7tPo cTjxPBAAktTXxFK6loSMSWC1ul8RCl/4F7G84J4eT+Ui8/KIG8do5KcebTnXb9zo keOG7n9HPk4fROWiAFgGnuBfw41DWmLDS34iuENrG3X26TQgSSgBveuwas67Pzqu HpaFSxjh7BRLlkUWaNoll57cDM3kKLmx+Onw6m/7kbcVXAsy1N4wxfCT1faUU7ID R1ggULG1WhB8q+YtQjac6EfOpdHe1BTBGLuxSwE3mNkce9ZP7C8uxZTCR5PXggZi IXzJzGpFRDCHqrilWksiE62yF20Kem4ZcpO/GgLWmF+X+DYBDEWcajihvF20TGUL n6dyT7MBxuvqFy0OtBPHNcnq2PZzOIKyxyMvBg9402xeD6goNbFKloAYeae4C9u0 QuqQUpb8D3lVagVu55N5XfpdMHR0P8yefPAjaFL4o3rf2JSjyI6MRX/+2eA7aXcX xiwHSx3iavEeNQNsPZsS3JhH5bKy/zkWRiBd+msGVAYMZGzhdEtLg/w8yUd6dQ5p /3Y3F4fL6T6QSwhsiihcbdPtjhfVCP09MYK/P4cIFbWOzjfbndt1/UIXHQ54s8Jo PShcE7QH7ttT2gK5nFPG5yeTqF70kKpSyhwF2pukf2fAgcU+0SNoj2zZNtHAvKeh 8EHqAy8m1J4AlQeO5nT9tJj/v1CM0q6cljzIfV8hWWgM/hL/vLc= =76m5 -----END PGP SIGNATURE----- Merge tag 'display-20250718-pull-request' of https://gitlab.com/kraxel/qemu into staging Load ramfb vgabios on x86 only. # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCgAdFiEEoDKM/7k6F6eZAf59TLbY7tPocTgFAmh6o80ACgkQTLbY7tPo # cTjxPBAAktTXxFK6loSMSWC1ul8RCl/4F7G84J4eT+Ui8/KIG8do5KcebTnXb9zo # keOG7n9HPk4fROWiAFgGnuBfw41DWmLDS34iuENrG3X26TQgSSgBveuwas67Pzqu # HpaFSxjh7BRLlkUWaNoll57cDM3kKLmx+Onw6m/7kbcVXAsy1N4wxfCT1faUU7ID # R1ggULG1WhB8q+YtQjac6EfOpdHe1BTBGLuxSwE3mNkce9ZP7C8uxZTCR5PXggZi # IXzJzGpFRDCHqrilWksiE62yF20Kem4ZcpO/GgLWmF+X+DYBDEWcajihvF20TGUL # n6dyT7MBxuvqFy0OtBPHNcnq2PZzOIKyxyMvBg9402xeD6goNbFKloAYeae4C9u0 # QuqQUpb8D3lVagVu55N5XfpdMHR0P8yefPAjaFL4o3rf2JSjyI6MRX/+2eA7aXcX # xiwHSx3iavEeNQNsPZsS3JhH5bKy/zkWRiBd+msGVAYMZGzhdEtLg/w8yUd6dQ5p # /3Y3F4fL6T6QSwhsiihcbdPtjhfVCP09MYK/P4cIFbWOzjfbndt1/UIXHQ54s8Jo # PShcE7QH7ttT2gK5nFPG5yeTqF70kKpSyhwF2pukf2fAgcU+0SNoj2zZNtHAvKeh # 8EHqAy8m1J4AlQeO5nT9tJj/v1CM0q6cljzIfV8hWWgM/hL/vLc= # =76m5 # -----END PGP SIGNATURE----- # gpg: Signature made Fri 18 Jul 2025 15:43:09 EDT # gpg: using RSA key A0328CFFB93A17A79901FE7D4CB6D8EED3E87138 # gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" [full] # gpg: aka "Gerd Hoffmann <gerd@kraxel.org>" [full] # gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>" [full] # Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138 * tag 'display-20250718-pull-request' of https://gitlab.com/kraxel/qemu: hw/i386: Add the ramfb romfile compatibility vfio: Move the TYPE_* to hw/vfio/types.h ramfb: Add property to control if load the romfile Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Conflicts: hw/core/machine.c Context conflict because the vfio-pci "x-migration-load-config-after-iter" was added recently.	2025-07-21 12:24:36 -04:00
Shaoqin Huang	b53a3bba5e	vfio: Move the TYPE_* to hw/vfio/types.h Move the TYPE_* to a new file hw/vfio/types.h because the TYPE_VFIO_PCI will be used in later patch, but directly include the hw/vfio/pci.h can cause some compilation error when cross build the windows version. The hw/vfio/types.h can be included to mitigate that problem. Signed-off-by: Shaoqin Huang <shahuang@redhat.com> Message-ID: <20250717100941.2230408-3-shahuang@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>	2025-07-18 21:41:45 +02:00
Shaoqin Huang	350785d41d	ramfb: Add property to control if load the romfile Currently the ramfb device loads the vgabios-ramfb.bin unconditionally, but only the x86 need the vgabios-ramfb.bin, this can cause that when use the release package on arm64 it can't find the vgabios-ramfb.bin. Because only seabios will use the vgabios-ramfb.bin, load the rom logic is x86-specific. For other !x86 platforms, the edk2 ships an EFI driver for ramfb, so they don't need to load the romfile. So add a new property use-legacy-x86-rom in both ramfb and vfio_pci device, because the vfio display also use the ramfb_setup() to load the vgabios-ramfb.bin file. After have this property, the machine type can set the compatibility to not load the vgabios-ramfb.bin if the arch doesn't need it. For now the default value is true but it will be turned off by default in subsequent patch when compats get properly handled. Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Shaoqin Huang <shahuang@redhat.com> Message-ID: <20250717100941.2230408-2-shahuang@redhat.com> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>	2025-07-17 12:44:57 +02:00
Tomita Moeko	a59d06305f	vfio/pci: Introduce x-pci-class-code option Introduce x-pci-class-code option to allow users to override PCI class code of a device, similar to the existing x-pci-vendor-id option. Only the lower 24 bits of this option are used, though a uint32 is used here for determining whether the value is valid and set by user. Additionally, to ensure VGA ranges are only exposed on VGA devices, pci_register_vga() is now called in vfio_pci_config_setup(), after the class code override is completed. This is mainly intended for IGD devices that expose themselves either as VGA controller (primary display) or Display controller (non-primary display). The UEFI GOP driver depends on the device reporting a VGA controller class code (0x030000). Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250708145211.6179-1-tomitamoeko@gmail.com Signed-off-by: Cédric Le Goater <clg@redhat.com>	2025-07-15 17:11:12 +02:00
Steve Sistare	30edcb4d4e	vfio-pci: preserve MSI Save the MSI message area as part of vfio-pci vmstate, and preserve the interrupt and notifier eventfd's. migrate_incoming loads the MSI data, then the vfio-pci post_load handler finds the eventfds in CPR state, rebuilds vector data structures, and attaches the interrupts to the new KVM instance. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/1751493538-202042-2-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>	2025-07-03 13:42:28 +02:00
John Levon	777e45c7b9	vfio-user: forward MSI-X PBA BAR accesses to server For vfio-user, the server holds the pending IRQ state; set up an I/O region for the MSI-X PBA so we can ask the server for this state on a PBA read. Originally-by: John Johnson <john.g.johnson@oracle.com> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250625193012.2316242-11-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>	2025-06-26 08:55:38 +02:00
Steve Sistare	6f06e3729a	vfio/pci: export MSI functions Export various MSI functions, renamed with a vfio_pci prefix, for use by CPR in subsequent patches. No functional change. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/1749569991-25171-18-git-send-email-steven.sistare@oracle.com Signed-off-by: Cédric Le Goater <clg@redhat.com>	2025-06-11 14:01:58 +02:00
John Levon	7163c0bca7	vfio: export PCI helpers needed for vfio-user The vfio-user code will need to re-use various parts of the vfio PCI code. Export them in hw/vfio/pci.h, and rename them to the vfio_pci_* namespace. Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Mark Cave-Ayland <mark.caveayland@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250607001056.335310-2-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>	2025-06-11 14:01:58 +02:00
John Levon	d4e392d0a9	vfio: add vfio-pci-base class Split out parts of TYPE_VFIO_PCI into a base TYPE_VFIO_PCI_BASE, although we have not yet introduced another subclass, so all the properties have remained in TYPE_VFIO_PCI. Note that currently there is no need for additional data for TYPE_VFIO_PCI, so it shares the same C struct type as TYPE_VFIO_PCI_BASE, VFIOPCIDevice. Originally-by: John Johnson <john.g.johnson@oracle.com> Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Signed-off-by: John Levon <john.levon@nutanix.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-14-john.levon@nutanix.com Signed-off-by: Cédric Le Goater <clg@redhat.com>	2025-05-09 12:42:28 +02:00
Cédric Le Goater	11b8b9d53d	vfio: Rename vfio-common.h to vfio-device.h "hw/vfio/vfio-common.h" has been emptied of most of its declarations by the previous changes and the only declarations left are related to VFIODevice. Rename it to "hw/vfio/vfio-device.h" and make the necessary adjustments. Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-36-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>	2025-04-25 09:01:37 +02:00
Cédric Le Goater	499e53cce9	vfio: Introduce new files for VFIORegion definitions and declarations Gather all VFIORegion related declarations and definitions into their own files to reduce exposure of VFIO internals in "hw/vfio/vfio-common.h". They were introduced for 'vfio-platform' support in commits `db0da029a1` ("vfio: Generalize region support") and `a664477db8` ("hw/vfio/pci: Introduce VFIORegion"). To be noted that the 'vfio-platform' devices have been deprecated and will be removed in QEMU 10.2. Until then, make the declarations available externally for 'sysbus-fdt.c'. Cc: Eric Auger <eric.auger@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-12-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>	2025-04-25 09:01:37 +02:00
Cédric Le Goater	aa173cb279	vfio: Introduce a new header file for VFIOdisplay declarations Gather all VFIOdisplay related declarations into "vfio-display.h" to reduce exposure of VFIO internals in "hw/vfio/vfio-common.h". Reviewed-by: John Levon <john.levon@nutanix.com> Link: https://lore.kernel.org/qemu-devel/20250318095415.670319-8-clg@redhat.com Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-9-clg@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>	2025-04-25 09:01:37 +02:00
Richard Henderson	8be545ba5a	include/system: Move exec/memory.h to system/memory.h Convert the existing includes with sed -i ,exec/memory.h,system/memory.h,g Move the include within cpu-all.h into a !CONFIG_USER_ONLY block. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2025-04-23 14:08:21 -07:00
Tomita Moeko	674f61117d	vfio/igd: Introduce x-igd-lpc option for LPC bridge ID quirk The LPC bridge/Host bridge IDs quirk is also not dependent on legacy mode. Recent Windows driver no longer depends on these IDs, as well as Linux i915 driver, while UEFI GOP seems still needs them. Make it an option to allow users enabling and disabling it as needed. Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com> Link: https://lore.kernel.org/qemu-devel/20250306180131.32970-10-tomitamoeko@gmail.com [ clg: - Fixed spelling in vfio_probe_igd_config_quirk() ] Signed-off-by: Cédric Le Goater <clg@redhat.com>	2025-03-11 17:01:14 +01:00
Tomita Moeko	434ac62ef2	vfio/igd: Handle x-igd-opregion option in config quirk Both enable OpRegion option (x-igd-opregion) and legacy mode require setting up OpRegion copy for IGD devices. As the config quirk no longer depends on legacy mode, we can now handle x-igd-opregion option there instead of in vfio_realize. Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com> Link: https://lore.kernel.org/qemu-devel/20250306180131.32970-9-tomitamoeko@gmail.com Signed-off-by: Cédric Le Goater <clg@redhat.com>	2025-03-11 17:01:14 +01:00
Tomita Moeko	2fedccf03c	vfio/igd: Decouple common quirks from legacy mode So far, IGD-specific quirks all require enabling legacy mode, which is toggled by assigning IGD to 00:02.0. However, some quirks, like the BDSM and GGC register quirks, should be applied to all supported IGD devices. A new config option, x-igd-legacy-mode=[on\|off\|auto], is introduced to control the legacy mode only quirks. The default value is "auto", which keeps current behavior that enables legacy mode implicitly and continues on error when all following conditions are met. * Machine type is i440fx * IGD device is at guest BDF 00:02.0 If any one of the conditions above is not met, the default behavior is equivalent to "off", QEMU will fail immediately if any error occurs. Users can also use "on" to force enabling legacy mode. It checks if all the conditions above are met and set up legacy mode. QEMU will also fail immediately on error in this case. Additionally, the hotplug check in legacy mode is removed as hotplugging IGD device is never supported, and it will be checked when enabling the OpRegion quirk. Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com> Link: https://lore.kernel.org/qemu-devel/20250306180131.32970-8-tomitamoeko@gmail.com [ clg: - Changed warn_report() by info_report() in vfio_probe_igd_config_quirk() as suggested by Alex W. - Fixed spelling in vfio_probe_igd_config_quirk () ] Signed-off-by: Cédric Le Goater <clg@redhat.com>	2025-03-11 17:01:14 +01:00
Tomita Moeko	9267f96ad6	vfio/igd: Refactor vfio_probe_igd_bar4_quirk into pci config quirk The actual IO BAR4 write quirk in vfio_probe_igd_bar4_quirk was removed in previous change, leaving the function not matching its name, so move it into the newly introduced vfio_config_quirk_setup. There is no functional change in this commit. For now, to align with current legacy mode behavior, it returns and proceeds on error. Later it will fail on error after decoupling the quirks from legacy mode. Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com> Link: https://lore.kernel.org/qemu-devel/20250306180131.32970-7-tomitamoeko@gmail.com Signed-off-by: Cédric Le Goater <clg@redhat.com>	2025-03-11 17:01:14 +01:00
Tomita Moeko	b22ab580d2	vfio/pci: Add placeholder for device-specific config space quirks IGD devices require device-specific quirk to be applied to their PCI config space. Currently, it is put in the BAR4 quirk that does nothing to BAR4 itself. Add a placeholder for PCI config space quirks to hold that quirk later. Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com> Link: https://lore.kernel.org/qemu-devel/20250306180131.32970-6-tomitamoeko@gmail.com Signed-off-by: Cédric Le Goater <clg@redhat.com>	2025-03-11 17:01:14 +01:00
Tomita Moeko	ae9d9ec4a6	vfio/igd: Consolidate OpRegion initialization into a single function Both x-igd-opregion option and legacy mode require identical steps to set up OpRegion for IGD devices. Consolidate these steps into a single vfio_pci_igd_setup_opregion function. The function call in pci.c is wrapped with ifdef temporarily to prevent build error for non-x86 archs, it will be removed after we decouple it from legacy mode. Additionally, move vfio_pci_igd_opregion_init to igd.c to prevent it from being compiled in non-x86 builds. Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com> Link: https://lore.kernel.org/qemu-devel/20250306180131.32970-4-tomitamoeko@gmail.com [ clg: Fixed spelling in vfio_pci_igd_setup_opregion() ] Signed-off-by: Cédric Le Goater <clg@redhat.com>	2025-03-11 17:01:14 +01:00
Alex Williamson	05c6a8eff6	vfio/pci: Delete local pm_cap This is now redundant to PCIDevice.pm_cap. Cc: Cédric Le Goater <clg@redhat.com> Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/qemu-devel/20250225215237.3314011-4-alex.williamson@redhat.com Signed-off-by: Cédric Le Goater <clg@redhat.com>	2025-03-06 06:47:33 +01:00
Philippe Mathieu-Daudé	32cad1ffb8	include: Rename sysemu/ -> system/ Headers in include/sysemu/ are not only related to system emulation, they are also used by virtualization. Rename as system/ which is clearer. Files renamed manually then mechanical change using sed tool. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Lei Yang <leiyang@redhat.com> Message-Id: <20241203172445.28576-1-philmd@linaro.org>	2024-12-20 17:44:56 +01:00
Corvin Köhne	11b5ce95be	vfio/igd: add new bar0 quirk to emulate BDSM mirror The BDSM register is mirrored into MMIO space at least for gen 11 and later devices. Unfortunately, the Windows driver reads the register value from MMIO space instead of PCI config space for those devices [1]. Therefore, we either have to keep a 1:1 mapping for the host and guest address or we have to emulate the MMIO register too. Using the igd in legacy mode is already hard due to it's many constraints. Keeping a 1:1 mapping may not work in all cases and makes it even harder to use. An MMIO emulation has to trap the whole MMIO page. This makes accesses to this page slower compared to using second level address translation. Nevertheless, it doesn't have any constraints and I haven't noticed any performance degradation yet making it a better solution. [1] `5c351bee0f/devicemodel/hw/pci/passthrough.c (L650-L653)` Signed-off-by: Corvin Köhne <c.koehne@beckhoff.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>	2024-09-17 10:37:55 +02:00
Zhenzhong Duan	0a0bda0acd	vfio/pci-quirks: Make vfio_add_*_cap() return bool This is to follow the coding standand in qapi/error.h to return bool for bool-valued functions. Include below functions: vfio_add_virt_caps() vfio_add_nv_gpudirect_cap() vfio_add_vmd_shadow_cap() Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-05-22 10:04:22 +02:00
Zhenzhong Duan	d3c6a18bc7	vfio/pci-quirks: Make vfio_pci_igd_opregion_init() return bool This is to follow the coding standand in qapi/error.h to return bool for bool-valued functions. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-05-22 10:04:21 +02:00
Zhenzhong Duan	64410a741d	vfio/pci: Make vfio_populate_vga() return bool This is to follow the coding standand in qapi/error.h to return bool for bool-valued functions. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-05-22 10:04:21 +02:00
Zhenzhong Duan	455c009dc4	vfio/display: Make vfio_display_*() return bool This is to follow the coding standand in qapi/error.h to return bool for bool-valued functions. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-05-22 10:04:21 +02:00
Vinayak Kale	187716feeb	vfio/pci: migration: Skip config space check for Vendor Specific Information in VSC during restore/load In case of migration, during restore operation, qemu checks config space of the pci device with the config space in the migration stream captured during save operation. In case of config space data mismatch, restore operation is failed. config space check is done in function get_pci_config_device(). By default VSC (vendor-specific-capability) in config space is checked. Due to qemu's config space check for VSC, live migration is broken across NVIDIA vGPU devices in situation where source and destination host driver is different. In this situation, Vendor Specific Information in VSC varies on the destination to ensure vGPU feature capabilities exposed to the guest driver are compatible with destination host. If a vfio-pci device is migration capable and vfio-pci vendor driver is OK with volatile Vendor Specific Info in VSC then qemu should exempt config space check for Vendor Specific Info. It is vendor driver's responsibility to ensure that VSC is consistent across migration. Here consistency could mean that VSC format should be same on source and destination, however actual Vendor Specific Info may not be byte-to-byte identical. This patch skips the check for Vendor Specific Information in VSC for VFIO-PCI device by clearing pdev->cmask[] offsets. Config space check is still enforced for 3 byte VSC header. If cmask[] is not set for an offset, then qemu skips config space check for that offset. VSC check is skipped for machine types >= 9.1. The check would be enforced on older machine types (<= 9.0). Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Cédric Le Goater <clg@redhat.com> Signed-off-by: Vinayak Kale <vkale@nvidia.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2024-05-16 16:59:19 +02:00
Zhenzhong Duan	c328e7e8ad	vfio/pci: Introduce a vfio pci hot reset interface Legacy vfio pci and iommufd cdev have different process to hot reset vfio device, expand current code to abstract out pci_hot_reset callback for legacy vfio, this same interface will also be used by iommufd cdev vfio device. Rename vfio_pci_hot_reset to vfio_legacy_pci_hot_reset and move it into container.c. vfio_pci_[pre/post]_reset and vfio_pci_host_match are exported so they could be called in legacy and iommufd pci_hot_reset callback. Suggested-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-12-19 19:03:38 +01:00
Zhenzhong Duan	4d36ec23a7	vfio/pci: Extract out a helper vfio_pci_get_pci_hot_reset_info This helper will be used by both legacy and iommufd backends. No functional changes intended. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-12-19 19:03:38 +01:00
Marc-André Lureau	8741781157	hw/vfio: add ramfb migration support Add a "VFIODisplay" subsection whenever "x-ramfb-migrate" is turned on. Turn it off by default on machines <= 8.1 for compatibility reasons. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Acked-by: Gerd Hoffmann <kraxel@redhat.com> [ clg: - checkpatch fixes - improved warn_report() in vfio_realize() ] Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-10-18 10:10:49 +02:00
Jing Liu	45d85f6228	vfio/pci: detect the support of dynamic MSI-X allocation Kernel provides the guidance of dynamic MSI-X allocation support of passthrough device, by clearing the VFIO_IRQ_INFO_NORESIZE flag to guide user space. Fetch the flags from host to determine if dynamic MSI-X allocation is supported. Originally-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Jing Liu <jing2.liu@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-10-05 22:04:51 +02:00
Cédric Le Goater	44fa20c928	spapr: Remove support for NVIDIA V100 GPU with NVLink2 NVLink2 support was removed from the PPC PowerNV platform and VFIO in Linux 5.13 with commits : 562d1e207d32 ("powerpc/powernv: remove the nvlink support") b392a1989170 ("vfio/pci: remove vfio_pci_nvlink2") This was 2.5 years ago. Do the same in QEMU with a revert of commit `ec132efaa8` ("spapr: Support NVIDIA V100 GPU with NVLink2"). Some adjustements are required on the NUMA part. Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> Message-ID: <20230918091717.149950-1-clg@kaod.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2023-09-18 07:25:28 -03:00
Alex Williamson	c00aac6f14	vfio/pci: Enable AtomicOps completers on root ports Dynamically enable Atomic Ops completer support around realize/exit of vfio-pci devices reporting host support for these accesses and adhering to a minimal configuration standard. While the Atomic Ops completer bits in the root port device capabilities2 register are read-only, the PCIe spec does allow RO bits to change to reflect hardware state. We take advantage of that here around the realize and exit functions of the vfio-pci device. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Robin Voetter <robin@streamhpc.com> Tested-by: Robin Voetter <robin@streamhpc.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>	2023-07-10 09:52:52 +02:00
Minwoo Im	2dca1b37a7	vfio/pci: add support for VF token VF token was introduced [1] to kernel vfio-pci along with SR-IOV support [2]. This patch adds support VF token among PF and VF(s). To passthu PCIe VF to a VM, kernel >= v5.7 needs this. It can be configured with UUID like: -device vfio-pci,host=DDDD:BB:DD:F,vf-token=<uuid>,... [1] https://lore.kernel.org/linux-pci/158396393244.5601.10297430724964025753.stgit@gimli.home/ [2] https://lore.kernel.org/linux-pci/158396044753.5601.14804870681174789709.stgit@gimli.home/ Cc: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Minwoo Im <minwoo.im@samsung.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Link: https://lore.kernel.org/r/20230320073522epcms2p48f682ecdb73e0ae1a4850ad0712fd780@epcms2p4 Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2023-05-09 09:30:13 -06:00
Markus Armbruster	edf5ca5dbe	include/hw/pci: Split pci_device.h off pci.h PCIDeviceClass and PCIDevice are defined in pci.h. Many users of the header don't actually need them. Similar structs live in their own headers: PCIBusClass and PCIBus in pci_bus.h, PCIBridge in pci_bridge.h, PCIHostBridgeClass and PCIHostState in pci_host.h, PCIExpressHost in pcie_host.h, and PCIERootPortClass, PCIEPort, and PCIESlot in pcie_port.h. Move PCIDeviceClass and PCIDeviceClass to new pci_device.h, along with the code that needs them. Adjust include directives. This also enables the next commit. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20221222100330.380143-6-armbru@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-01-08 01:54:22 -05:00
Longpeng(Mike)	dc580d51f7	vfio: defer to commit kvm irq routing when enable msi/msix In migration resume phase, all unmasked msix vectors need to be setup when loading the VF state. However, the setup operation would take longer if the VM has more VFs and each VF has more unmasked vectors. The hot spot is kvm_irqchip_commit_routes, it'll scan and update all irqfds that are already assigned each invocation, so more vectors means need more time to process them. vfio_pci_load_config vfio_msix_enable msix_set_vector_notifiers for (vector = 0; vector < dev->msix_entries_nr; vector++) { vfio_msix_vector_do_use vfio_add_kvm_msi_virq kvm_irqchip_commit_routes <-- expensive } We can reduce the cost by only committing once outside the loop. The routes are cached in kvm_state, we commit them first and then bind irqfd for each vector. The test VM has 128 vcpus and 8 VF (each one has 65 vectors), we measure the cost of the vfio_msix_enable for each VF, and we can see 90+% costs can be reduce. VF Count of irqfds[] Original With this patch 1st 65 8 2 2nd 130 15 2 3rd 195 22 2 4th 260 24 3 5th 325 36 2 6th 390 44 3 7th 455 51 3 8th 520 58 4 Total 258ms 21ms [] Count of irqfds How many irqfds that already assigned and need to process in this round. The optimization can be applied to msi type too. Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Link: https://lore.kernel.org/r/20220326060226.1892-6-longpeng2@huawei.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-05-06 09:06:50 -06:00
Philippe Mathieu-Daudé	4eda914cac	hw/vfio/pci-quirks: Replace the word 'blacklist' Follow the inclusive terminology from the "Conscious Language in your Open Source Projects" guidelines [] and replace the word "blacklist" appropriately. [] https://github.com/conscious-lang/conscious-lang-docs/blob/main/faq.md Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210205171817.2108907-9-philmd@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-03-16 10:06:44 -06:00
Kirti Wankhede	a22651053b	vfio: Make vfio-pci device migration capable If the device is not a failover primary device, call vfio_migration_probe() and vfio_migration_finalize() to enable migration support for those devices that support it respectively to tear it down again. Removed migration blocker from VFIO PCI device specific structure and use migration blocker from generic structure of VFIO device. Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com> Reviewed-by: Neo Jia <cjia@nvidia.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-11-01 12:30:51 -07:00
Eduardo Habkost	8063396bf3	Use OBJECT_DECLARE_SIMPLE_TYPE when possible This converts existing DECLARE_INSTANCE_CHECKER usage to OBJECT_DECLARE_SIMPLE_TYPE when possible. $ ./scripts/codeconverter/converter.py -i \ --pattern=AddObjectDeclareSimpleType $(git grep -l '' -- '*.[ch]') Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Acked-by: Paul Durrant <paul@xen.org> Message-Id: <20200916182519.415636-6-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-09-18 14:12:32 -04:00
Eduardo Habkost	01b4606440	vfio: Rename PCI_VFIO to VFIO_PCI Make the type checking macro name consistent with the TYPE_* constant. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20200902224311.1321159-56-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-09-09 13:20:22 -04:00
Eduardo Habkost	8110fa1d94	Use DECLARE_CHECKER macros Generated using: $ ./scripts/codeconverter/converter.py -i \ --pattern=TypeCheckMacro $(git grep -l '' -- '*.[ch]') Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20200831210740.126168-12-ehabkost@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20200831210740.126168-13-ehabkost@redhat.com> Message-Id: <20200831210740.126168-14-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-09-09 09:27:09 -04:00
Eduardo Habkost	db1015e92e	Move QOM typedefs and add missing includes Some typedefs and macros are defined after the type check macros. This makes it difficult to automatically replace their definitions with OBJECT_DECLARE_TYPE. Patch generated using: $ ./scripts/codeconverter/converter.py -i \ --pattern=QOMStructTypedefSplit $(git grep -l '' -- '.[ch]') which will split "typdef struct { ... } TypedefName" declarations. Followed by: $ ./scripts/codeconverter/converter.py -i --pattern=MoveSymbols \ $(git grep -l '' -- '.[ch]') which will: - move the typedefs and #defines above the type check macros - add missing #include "qom/object.h" lines if necessary Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20200831210740.126168-9-ehabkost@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20200831210740.126168-10-ehabkost@redhat.com> Message-Id: <20200831210740.126168-11-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-09-09 09:26:43 -04:00
Eduardo Habkost	42db0fb5e0	vfio/pci: Move QOM macros to header This will make future conversion to OBJECT_DECLARE* easier. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Tested-By: Roman Bolshakov <r.bolshakov@yadro.com> Message-Id: <20200825192110.3528606-43-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-08-27 14:04:55 -04:00
Thomas Huth	29d62771c8	hw/vfio: Move the IGD quirk code to a separate file The IGD quirk code defines a separate device, the so-called "vfio-pci-igd-lpc-bridge" which shows up as a user-creatable device in all QEMU binaries that include the vfio code. This is a little bit unfortunate for two reasons: First, this device is completely useless in binaries like qemu-system-s390x. Second we also would like to disable it in downstream RHEL which currently requires some extra patches there since the device does not have a proper Kconfig-style switch yet. So it would be good if the device could be disabled more easily, thus let's move the code to a separate file instead and introduce a proper Kconfig switch for it which gets only enabled by default if we also have CONFIG_PC_PCI enabled. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2020-02-06 11:55:42 -07:00
David Gibson	c5478fea27	vfio/pci: Respond to KVM irqchip change notifier VFIO PCI devices already respond to the pci intx routing notifier, in order to update kernel irqchip mappings when routing is updated. However this won't handle the case where the irqchip itself is replaced by a different model while retaining the same routing. This case can happen on the pseries machine type due to PAPR feature negotiation. To handle that case, add a handler for the irqchip change notifier, which does much the same thing as the routing notifier, but is unconditional, rather than being a no-op when the routing hasn't changed. Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Tested-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Acked-by: Alex Williamson <alex.williamson@redhat.com>	2019-11-26 10:11:30 +11:00
Jens Freimann	f045a0104c	vfio: unplug failover primary device before migration As usual block all vfio-pci devices from being migrated, but make an exception for failover primary devices. This is achieved by setting unmigratable to 0 but also add a migration blocker for all vfio-pci devices except failover primary devices. These will be unplugged before migration happens by the migration handler of the corresponding virtio-net standby device. Signed-off-by: Jens Freimann <jfreimann@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Message-Id: <20191029114905.6856-12-jfreimann@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2019-10-29 18:55:26 -04:00

1 2

83 Commits