add documentation for pci passthrough and sr-iov

explain what it is and how to use it, especially the steps necessary
on the host and the various options under one chapter

most of this is also found on the wiki in the Pci_passthrough
article

we may want to condense the information there and link it as
'notes and examples'

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>

commit 6e4c46c4cb
parent 941ff8d3d0
Author:    Dominik Csapak <d.csapak@proxmox.com>
Date:      2018-11-12 16:00:46 +01:00
Committer: Thomas Lamprecht

2 changed files with 240 additions and 0 deletions

qm-pci-passthrough.adoc (new file, 237 lines)

@@ -0,0 +1,237 @@
[[qm_pci_passthrough]]
PCI(e) Passthrough
------------------

PCI(e) passthrough is a mechanism to give a virtual machine control over
a PCI device normally only available to the host. This can have some
advantages over using virtualized hardware, for example lower latency,
higher performance, or more features (e.g., offloading).

If you pass through a device to a virtual machine, you cannot use that
device anymore on the host or in any other VM.

General Requirements
~~~~~~~~~~~~~~~~~~~~

Since passthrough is a feature which also needs hardware support, there are
some requirements to check and preparations to be made before it can work.

Hardware
^^^^^^^^

Your hardware needs to support 'IOMMU' interrupt remapping; this includes
the CPU and the mainboard.

Generally, Intel systems with VT-d and AMD systems with AMD-Vi support this,
but it is not guaranteed that everything will work out of the box, due
to bad hardware implementations or missing/low quality drivers.

In most cases, server grade hardware has better support than consumer grade
hardware, but even then, many modern systems can support this.

Please refer to your hardware vendor to check if this feature is supported
under Linux.

Configuration
^^^^^^^^^^^^^

To enable PCI(e) passthrough, some configuration is needed.

First, the IOMMU has to be activated on the kernel command line.
The easiest way is to enable it in */etc/default/grub*. Just add

 intel_iommu=on

or, if you have AMD hardware:

 amd_iommu=on

to the 'GRUB_CMDLINE_LINUX_DEFAULT' variable.

After that, make sure you run 'update-grub' to apply the change to the
bootloader configuration.
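
For illustration, the resulting line in */etc/default/grub* could then look
like this (the 'quiet' option is just a common default and may differ on
your system):

----
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
----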

Second, you have to make sure the following modules are loaded.
This can be achieved by adding them to */etc/modules*:

 vfio
 vfio_iommu_type1
 vfio_pci
 vfio_virqfd

After changing anything related to modules, you need to refresh your
initramfs with:

----
update-initramfs -u -k all
----

Finally, reboot and check that IOMMU is indeed enabled with:

----
dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
----

The output should show that IOMMU, Directed I/O, or Interrupt Remapping is
enabled. (The exact message can vary, depending on hardware and kernel
version.)
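
You can also verify that the VFIO modules from above were actually loaded
after the reboot, for example with:

----
lsmod | grep vfio
----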

It is also important that the device(s) you want to pass through are in a
'separate' IOMMU group. This can be checked with:

----
find /sys/kernel/iommu_groups/ -type l
----

It is okay if the device is in an IOMMU group together with its functions
(e.g. a GPU with the HDMI audio device) or with its root port or PCI(e)
bridge.
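
The command prints one symlink per device. A hypothetical excerpt, in which a
GPU at '01:00.0' and its audio function at '01:00.1' share their own group 1,
could look like this:

----
/sys/kernel/iommu_groups/0/devices/0000:00:00.0
/sys/kernel/iommu_groups/1/devices/0000:01:00.0
/sys/kernel/iommu_groups/1/devices/0000:01:00.1
----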

.PCI(e) slots
[NOTE]
====
Some platforms handle their physical PCI(e) slots differently. So, if you
do not get the desired IOMMU group separation, it may be helpful to
try to put the card into another PCI(e) slot.
====

.Unsafe interrupts
[NOTE]
====
For some platforms, it may be necessary to allow unsafe interrupts.
This can most easily be enabled by adding the following line
to a .conf file in */etc/modprobe.d/*:

 options vfio_iommu_type1 allow_unsafe_interrupts=1

Please be aware that this option can make your system unstable.
====
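
For example, a minimal sketch for creating such a file (the file name is an
arbitrary choice; it only has to end in '.conf'):

----
echo "options vfio_iommu_type1 allow_unsafe_interrupts=1" > /etc/modprobe.d/iommu_unsafe_interrupts.conf
----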

Host Device Passthrough
~~~~~~~~~~~~~~~~~~~~~~~

The most used variant of PCI(e) passthrough is to pass through a whole
PCI(e) card, for example a GPU or a network card.

Host Configuration
^^^^^^^^^^^^^^^^^^

In this case, the host must not use the card. This can be achieved by two
methods:

Either add the device IDs to the options of the 'vfio-pci' module, by adding

 options vfio-pci ids=1234:5678,4321:8765

to a .conf file in */etc/modprobe.d/*, where 1234:5678 and 4321:8765 are
the vendor and device IDs obtained by:

----
lspci -nn
----
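
The IDs are the bracketed 'vendor:device' pair near the end of each output
line; a hypothetical excerpt for a network card could look like this:

----
01:00.0 Ethernet controller [0200]: Intel Corporation I350 Gigabit Network Connection [8086:1521] (rev 01)
----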

Or you can simply blacklist the driver completely on the host with

 blacklist DRIVERNAME

in a .conf file in */etc/modprobe.d/*. Again, update the initramfs and
reboot after that.
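
For example, to keep the host from loading 'nouveau' (an assumed driver name
for an NVIDIA GPU; use the name of the driver your device actually binds to):

----
echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf
----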

VM Configuration
^^^^^^^^^^^^^^^^

To pass through the device, you set the *hostpciX* option in the VM
configuration, for example by executing:

----
qm set VMID -hostpci0 00:02.0
----

If your device has multiple functions (e.g., '00:02.0' and '00:02.1'), you
can pass them through all together with the shortened syntax:

 00:02

There are some options which may be necessary, depending on the device
and guest OS:

* *x-vga=on|off* marks the PCI(e) device as the primary GPU of the VM.
With this enabled, the *vga* configuration option will be ignored.

* *pcie=on|off* tells {pve} to use a PCIe or PCI port. Some guest/device
combinations require PCIe rather than PCI. PCIe is only available for 'q35'
machine types.

* *rombar=on|off* makes the firmware ROM visible for the guest. Default is
'on'. Some PCI(e) devices need this disabled.

* *romfile=<path>* is an optional path to a ROM file for the device to use.
This is a relative path under */usr/share/kvm/*.

An example of PCIe passthrough with a GPU set to primary:

----
qm set VMID -hostpci0 02:00,pcie=on,x-vga=on
----

Other considerations
^^^^^^^^^^^^^^^^^^^^

When passing through a GPU, the best compatibility is reached when using
'q35' as the machine type, 'OVMF' instead of SeaBIOS, and PCIe instead of
PCI. Note that if you want to use 'OVMF' for GPU passthrough, the GPU needs
to have an EFI-capable ROM; otherwise use SeaBIOS instead.
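
Put together, the relevant lines of such a VM configuration (found under
*/etc/pve/qemu-server/*) could look like this minimal sketch:

----
bios: ovmf
hostpci0: 02:00,pcie=on,x-vga=on
machine: q35
----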

SR-IOV
~~~~~~

Another variant of passing through PCI(e) devices is to use the hardware
virtualization features of your devices, if available.

SR-IOV (Single-Root Input/Output Virtualization) enables a single device
to provide multiple VFs (virtual functions) to the system, so that each
VF can be used in a different VM, with full hardware features and better
performance and lower latency than software virtualized devices.

The most used devices for this are NICs with SR-IOV support, which can
provide multiple VFs per physical port, allowing features such as checksum
offloading to be used inside a VM and reducing the CPU overhead.

Host Configuration
^^^^^^^^^^^^^^^^^^

Generally, there are two methods for enabling virtual functions on a device.

In some cases, there is an option for the driver module, e.g. for some Intel
drivers:

 max_vfs=4

This could be put, together with the module name, in a .conf file in
*/etc/modprobe.d/*. (Do not forget to update your initramfs after that.)

Please refer to your driver module documentation for the exact
parameters and options.
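
As a sketch, for a hypothetical Intel NIC driven by the 'igb' module (both
the driver name and the parameter are assumptions; check your driver's
documentation):

----
echo "options igb max_vfs=4" > /etc/modprobe.d/igb.conf
----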

The second, more generic, approach is via the sysfs.
If a device and driver support this, you can change the number of VFs on
the fly. For example, to set up 4 VFs on device 0000:01:00.0, execute:

----
echo 4 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
----
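
To check how many VFs the device supports at most, you can first read the
'sriov_totalvfs' attribute (same device address as above):

----
cat /sys/bus/pci/devices/0000:01:00.0/sriov_totalvfs
----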

To make this change persistent, you can use the 'sysfsutils' Debian package.
Just install it via

----
apt install sysfsutils
----

and then configure it via */etc/sysfs.conf* or a 'FILE.conf' in
*/etc/sysfs.d/*.
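
A matching entry in */etc/sysfs.conf* could, for instance, look like this
(the attribute path is given relative to */sys*, using the same hypothetical
device as above):

----
bus/pci/devices/0000:01:00.0/sriov_numvfs = 4
----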

VM Configuration
^^^^^^^^^^^^^^^^

After creating VFs, you should see them as separate PCI(e) devices, which
can be passed through like normal PCI(e) devices.
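
For example, if one of the new VFs showed up at the hypothetical address
'01:10.0', it could be assigned like any other device:

----
qm set VMID -hostpci0 01:10.0
----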

Other considerations
^^^^^^^^^^^^^^^^^^^^

For this feature, platform support is especially important. It may be
necessary to enable this feature in the BIOS/EFI first, or to use a
specific PCI(e) port for it to work. If in doubt, consult the manual of
the platform or contact its vendor.

qm.adoc (3 additions)

@@ -1021,6 +1021,9 @@ ifndef::wiki[]
include::qm-cloud-init.adoc[]
endif::wiki[]
ifndef::wiki[]
include::qm-pci-passthrough.adoc[]
endif::wiki[]
Managing Virtual Machines with `qm`