mirror of
https://github.com/qemu/qemu.git
synced 2025-08-16 06:43:21 +00:00
docs/devel: Add VFIO iommufd backend documentation
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
parent 6106a32914
commit 98dad2b019
@@ -2176,6 +2176,7 @@ F: backends/iommufd.c
 F: include/sysemu/iommufd.h
 F: include/qemu/chardev_open.h
 F: util/chardev_open.c
+F: docs/devel/vfio-iommufd.rst

 vhost
 M: Michael S. Tsirkin <mst@redhat.com>
@@ -18,5 +18,6 @@ Details about QEMU's various subsystems including how to add features to them.
    s390-dasd-ipl
    tracing
    vfio-migration
+   vfio-iommufd
    writing-monitor-commands
    virtio-backends
docs/devel/vfio-iommufd.rst (normal file, 166 lines)
@@ -0,0 +1,166 @@
===============================
IOMMUFD BACKEND usage with VFIO
===============================

(The terms backend, container and BE are used interchangeably.)

With the introduction of iommufd, the Linux kernel provides a generic
interface for user space drivers to propagate their DMA mappings to the
kernel for assigned devices. While the legacy kernel interface is
group-centric, the new iommufd interface is device-centric, relying on a
device fd and an iommufd.

To support both interfaces in the QEMU VFIO device, a base container is
introduced to abstract the common part of the VFIO legacy and iommufd
containers, so that the generic VFIO code can use either container.

The base container implements generic functions such as memory_listener and
address space management, whereas the derived container implements callbacks
specific to either legacy or iommufd. Each container has its own way to set up
the secure context and the DMA management interface. The diagram below shows
how the two containers fit together.

::

                      VFIO                           AddressSpace/Memory
      +-------+  +----------+  +-----+  +-----+
      |  pci  |  | platform |  |  ap |  | ccw |
      +---+---+  +----+-----+  +--+--+  +--+--+     +----------------------+
          |           |           |        |        |   AddressSpace       |
          |           |           |        |        +------------+---------+
      +---V-----------V-----------V--------V----+               /
      |           VFIOAddressSpace              | <------------+
      |                  |                      |  MemoryListener
      |        VFIOContainerBase list           |
      +-------+----------------------------+----+
              |                            |
              |                            |
      +-------V------+            +--------V----------+
      |   iommufd    |            |    vfio legacy    |
      |  container   |            |     container     |
      +-------+------+            +--------+----------+
              |                            |
              | /dev/iommu                 | /dev/vfio/vfio
              | /dev/vfio/devices/vfioX    | /dev/vfio/$group_id
  Userspace   |                            |
  ============+============================+===========================
  Kernel      |  device fd                 |
              +---------------+            | group/container fd
              | (BIND_IOMMUFD |            | (SET_CONTAINER/SET_IOMMU)
              |  ATTACH_IOAS) |            | device fd
              |               |            |
              |       +-------V------------V-----------------+
      iommufd |       |                vfio                  |
  (map/unmap  |       +---------+--------------------+-------+
  ioas_copy)  |                 |                    | map/unmap
              |                 |                    |
      +-------V------+    +-----V------+      +------V--------+
      | iommufd core |    |   device   |      |  vfio iommu   |
      +--------------+    +------------+      +---------------+

* Secure Context setup

  - iommufd BE: uses the device fd and iommufd to set up the secure context
    (bind_iommufd, attach_ioas)
  - vfio legacy BE: uses the group fd and container fd to set up the secure
    context (set_container, set_iommu)

* Device access

  - iommufd BE: the device fd is opened through ``/dev/vfio/devices/vfioX``
  - vfio legacy BE: the device fd is retrieved from a group fd ioctl

* DMA Mapping flow

  1. VFIOAddressSpace receives MemoryRegion add/del via MemoryListener
  2. VFIO populates DMA map/unmap via the container BEs

     * iommufd BE: uses iommufd
     * vfio legacy BE: uses container fd
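
For the device access step with the iommufd BE, the ``vfioX`` node belonging
to a host PCI device can be looked up from sysfs. The following is a minimal
sketch, not part of QEMU: it assumes the kernel exposes the cdev name under
``<sysfs>/bus/pci/devices/<BDF>/vfio-dev/`` (which requires a kernel with VFIO
cdev support and the device bound to a VFIO driver). The function name
``vfio_cdev_path`` and its optional sysfs-root argument are hypothetical and
exist only for illustration.

.. code-block:: bash

    # Print the VFIO cdev node (/dev/vfio/devices/vfioX) for a PCI device.
    # $1: PCI address, e.g. 0000:02:00.0
    # $2: sysfs root (optional, defaults to /sys; overridable for testing)
    vfio_cdev_path() {
        local bdf="$1" sysfs="${2:-/sys}"
        local dir="$sysfs/bus/pci/devices/$bdf/vfio-dev"
        local entry
        for entry in "$dir"/vfio*; do
            # The glob stays literal when nothing matches, so test existence.
            [ -e "$entry" ] || continue
            echo "/dev/vfio/devices/${entry##*/}"
            return 0
        done
        echo "no VFIO cdev found for $bdf" >&2
        return 1
    }

A non-zero return status means no cdev was found, in which case only the
legacy group-based device access is available for that device.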

Example configuration
=====================

Step 1: configure the host device
---------------------------------

This step is exactly the same as for a VFIO device using the legacy VFIO
container.

Step 2: configure QEMU
----------------------

Interactions with ``/dev/iommu`` are abstracted by a new iommufd
object (compiled in with the ``CONFIG_IOMMUFD`` option).

Any QEMU device (e.g. a VFIO device) wishing to use ``/dev/iommu`` must
be linked with an iommufd object. Such a device gets a new optional property
named ``iommufd``, which takes the id of an iommufd object. Take the
``vfio-pci`` device for example:

.. code-block:: bash

    -object iommufd,id=iommufd0
    -device vfio-pci,host=0000:02:00.0,iommufd=iommufd0

Note that ``/dev/iommu`` and the VFIO cdev can be opened externally by a
management layer. In that case the fd is passed to QEMU through the ``fd``
property, which accepts either a string naming the fd or a number, for example:

.. code-block:: bash

    -object iommufd,id=iommufd0,fd=22
    -device vfio-pci,iommufd=iommufd0,fd=23

If the ``fd`` property is not passed, the fd is opened by QEMU.
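
As an illustration of the externally opened case, a wrapper script can open
both nodes itself and let QEMU inherit the descriptors. This is only a sketch
of what a management layer might do: the helper name ``vfio_fd_args`` is
hypothetical, and the fd numbers and the ``vfio0`` node in the usage comment
are example values.

.. code-block:: bash

    # Print the QEMU options for an iommufd object and a vfio-pci device whose
    # file descriptors were opened by the caller and are inherited by QEMU.
    # $1: fd number open on /dev/iommu
    # $2: fd number open on the device's /dev/vfio/devices/vfioX node
    # $3: iommufd object id (optional, defaults to iommufd0)
    vfio_fd_args() {
        local iommu_fd="$1" dev_fd="$2" id="${3:-iommufd0}"
        printf '%s\n' \
            "-object iommufd,id=$id,fd=$iommu_fd" \
            "-device vfio-pci,iommufd=$id,fd=$dev_fd"
    }

    # Possible usage from bash (both nodes opened read-write):
    #   exec 22<>/dev/iommu
    #   exec 23<>/dev/vfio/devices/vfio0
    #   qemu-system-x86_64 <other options> $(vfio_fd_args 22 23)

Note that no ``host`` property is given in this mode, since the device is
identified by the fd that was passed.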

If no ``iommufd`` object is passed to the ``vfio-pci`` device, iommufd
is not used and the user gets the behavior based on the legacy VFIO
container:

.. code-block:: bash

    -device vfio-pci,host=0000:02:00.0
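
A launcher script can choose between the two forms depending on whether the
iommufd node is usable on the host. The sketch below is an assumption about
how such a script could look, not something QEMU provides: the function name
``vfio_pci_args`` is hypothetical, and the check only looks at the node's
permissions. It does not verify that QEMU was built with ``CONFIG_IOMMUFD``
or that the kernel supports the VFIO cdev.

.. code-block:: bash

    # Print vfio-pci options for one host device, using the iommufd backend
    # when the iommufd node is usable and the legacy container otherwise.
    # $1: host PCI address, e.g. 0000:02:00.0
    # $2: iommufd node (optional, defaults to /dev/iommu; overridable for testing)
    vfio_pci_args() {
        local host="$1" node="${2:-/dev/iommu}"
        if [ -r "$node" ] && [ -w "$node" ]; then
            printf '%s\n' \
                "-object iommufd,id=iommufd0" \
                "-device vfio-pci,host=$host,iommufd=iommufd0"
        else
            printf '%s\n' "-device vfio-pci,host=$host"
        fi
    }

For example, ``vfio_pci_args 0000:02:00.0`` prints the two iommufd options
when ``/dev/iommu`` is readable and writable, and the single legacy option
otherwise.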

Supported platform
==================

x86, ARM and s390x are currently supported.

Caveats
=======

Dirty page sync
---------------

Dirty page sync is not yet supported with the iommufd backend, so live
migration is disabled by default. It can be force-enabled as shown below,
although it is inefficient.

.. code-block:: bash

    -object iommufd,id=iommufd0
    -device vfio-pci,host=0000:02:00.0,iommufd=iommufd0,enable-migration=on

P2P DMA
-------

PCI p2p DMA is unsupported because IOMMUFD doesn't support mapping hardware
PCI BAR regions yet. The warning below is shown for an assigned PCI device;
it is not a bug.

.. code-block:: none

    qemu-system-x86_64: warning: IOMMU_IOAS_MAP failed: Bad address, PCI BAR?
    qemu-system-x86_64: vfio_container_dma_map(0x560cb6cb1620, 0xe000000021000, 0x3000, 0x7f32ed55c000) = -14 (Bad address)

FD passing with mdev
--------------------

The ``vfio-pci`` device checks the ``sysfsdev`` property to decide whether the
backend is an mdev. If FD passing is used, there is no way to know that, and
the mdev is treated like a real PCI device. The error below is reported if
the user tries to enable RAM discarding for the mdev.

.. code-block:: none

    qemu-system-x86_64: -device vfio-pci,iommufd=iommufd0,x-balloon-allowed=on,fd=9: vfio VFIO_FD9: x-balloon-allowed only potentially compatible with mdev devices

The ``vfio-ap`` and ``vfio-ccw`` devices don't have the same issue, as their
backend devices are always mdevs and RAM discarding is force-enabled.