mirror of
https://git.proxmox.com/git/pve-docs
followup: typos, formatting and wording improvements
Fix some typos like s/seperate/separate/, passhtrough, ... Reword some sentences and try to reduce the commas per sentence (while I like them, others don't, and they entangle your mind a bit when reading). Try to improve formatting, adding some emphasis on abbreviations or other important things. Ensure all abbreviations are written in uppercase.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
parent 6e4c46c4cb
commit 49f20f1b0f
@@ -3,235 +3,278 @@ PCI(e) Passthrough
|
|||||||
------------------
|
------------------
|
||||||
|
|
||||||
PCI(e) passthrough is a mechanism to give a virtual machine control over
|
PCI(e) passthrough is a mechanism to give a virtual machine control over
|
||||||
a pci device usually only available for the host. This can have some
|
a PCI device from the host. This can have some advantages over using
|
||||||
advantages over using virtualized hardware, for example lower latency,
|
virtualized hardware, for example lower latency, higher performance, or more
|
||||||
higher performance, or more features (e.g., offloading).
|
features (e.g., offloading).
|
||||||
|
|
||||||
If you pass through a device to a virtual machine, you cannot use that
|
But, if you pass through a device to a virtual machine, you cannot use that
|
||||||
device anymore on the host or in any other VM.
|
device anymore on the host or in any other VM.
|
||||||
|
|
||||||
General Requirements
|
General Requirements
|
||||||
~~~~~~~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
Since passthrough is a feature which also needs hardware support, there are
|
Since passthrough is a feature which also needs hardware support, there are
|
||||||
some requirements and steps before it can work.
|
some requirements to check and preparations to be done to make it work.
|
||||||
|
|
||||||
|
|
||||||
Hardware
|
Hardware
|
||||||
^^^^^^^^
|
^^^^^^^^
|
||||||
|
Your hardware needs to support `IOMMU` (*I*/*O* **M**emory **M**anagement
|
||||||
|
**U**nit) interrupt remapping. This includes the CPU and the mainboard.
|
||||||
|
|
||||||
Your hardware has to support IOMMU interrupt remapping, this includes CPU and
|
Generally, Intel systems with VT-d, and AMD systems with AMD-Vi support this.
|
||||||
Mainboard.
|
But it is not guaranteed that everything will work out of the box, due
|
||||||
|
to bad hardware implementation and missing or low quality drivers.
|
||||||
|
|
||||||
Generally Intel systems with VT-d, and AMD systems with AMD-Vi support this,
|
Further, server grade hardware often has better support than consumer grade
|
||||||
but it is not guaranteed that everything will work out of the box, due
|
|
||||||
to bad hardware implementation or missing/low quality drivers.
|
|
||||||
|
|
||||||
In most cases, server grade hardware has better support than consumer grade
|
|
||||||
hardware, but even then, many modern system can support this.
|
hardware, but even then, many modern systems can support this.
|
||||||
|
|
||||||
Please refer to your hardware vendor if this is a feature that is supported
|
Please refer to your hardware vendor to check if they support this feature
|
||||||
under Linux.
|
under Linux for your specific setup.
|
||||||
|
|
||||||
|
|
||||||
Configuration
|
Configuration
|
||||||
^^^^^^^^^^^^^
|
^^^^^^^^^^^^^
|
||||||
|
|
||||||
To enable PCI(e) passthrough, there are some configurations needed.
|
Once you have ensured that your hardware supports passthrough, you will need to do
|
||||||
|
some configuration to enable PCI(e) passthrough.
|
||||||
|
|
||||||
First, the iommu has to be activated on the kernel commandline.
|
|
||||||
The easiest way is to enable it in */etc/default/grub*. Just add
|
|
||||||
|
|
||||||
|
IOMMU
|
||||||
|
+++++
|
||||||
|
|
||||||
|
The IOMMU has to be activated on the kernel commandline. The easiest way is to
|
||||||
|
enable it through GRUB. Edit `'/etc/default/grub'' and add the following to the
|
||||||
|
'GRUB_CMDLINE_LINUX_DEFAULT' variable:
|
||||||
|
|
||||||
|
* for Intel CPUs:
|
||||||
|
+
|
||||||
|
----
|
||||||
intel_iommu=on
|
intel_iommu=on
|
||||||
|
----
|
||||||
or if you have AMD hardware:
|
* for AMD CPUs:
|
||||||
|
+
|
||||||
|
----
|
||||||
amd_iommu=on
|
amd_iommu=on
|
||||||
|
----
|
||||||
|
|
||||||
to GRUB_CMDLINE_LINUX_DEFAULT
|
To bring this change into effect, make sure you run:
|
||||||
|
|
||||||
After that, make sure you run 'update grub' to update grub.
|
----
|
||||||
|
# update-grub
|
||||||
|
----
|
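As an illustration, on an Intel system the edited line in `/etc/default/grub` could look like the sketch below. The `quiet` option is an assumption about what the variable held before. Keep whatever options your file already contains and only append the IOMMU parameter.

```shell
# /etc/default/grub (excerpt); "quiet" is assumed to be the pre-existing value
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
```

On AMD hardware the same line would carry `amd_iommu=on` instead.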
||||||
|
|
||||||
Second, you have to make sure the following modules are loaded.
|
Kernel Modules
|
||||||
This can be achieved by adding them to */etc/modules*
|
++++++++++++++
|
||||||
|
|
||||||
|
You have to make sure the following modules are loaded. This can be achieved by
|
||||||
|
adding them to `'/etc/modules'':
|
||||||
|
|
||||||
|
----
|
||||||
vfio
|
vfio
|
||||||
vfio_iommu_type1
|
vfio_iommu_type1
|
||||||
vfio_pci
|
vfio_pci
|
||||||
vfio_virqfd
|
vfio_virqfd
|
||||||
|
----
|
||||||
|
|
||||||
|
[[qm_pci_passthrough_update_initramfs]]
|
||||||
After changing anything modules related, you need to refresh your
|
After changing anything modules related, you need to refresh your
|
||||||
initramfs with
|
`initramfs`. On {pve} this can be done by executing:
|
||||||
|
|
||||||
----
|
----
|
||||||
update-initramfs -u -k all
|
# update-initramfs -u -k all
|
||||||
----
|
----
|
||||||
|
|
||||||
Finally reboot and check that it is indeed enabled.
|
Finish Configuration
|
||||||
|
++++++++++++++++++++
|
||||||
|
|
||||||
|
Finally reboot to bring the changes into effect and check that it is indeed
|
||||||
|
enabled.
|
||||||
|
|
||||||
----
|
----
|
||||||
dmesg -e DMAR -e IOMMU -e AMD-Vi
|
# dmesg -e DMAR -e IOMMU -e AMD-Vi
|
||||||
----
|
----
|
||||||
|
|
||||||
should display that IOMMU, Directed I/O or Interrupt Remapping is enabled.
|
should display that `IOMMU`, `Directed I/O` or `Interrupt Remapping` is
|
||||||
(The exact message can vary, depending on hardware and kernel version)
|
enabled. The exact message can vary, depending on hardware and kernel version.
|
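In addition to the `dmesg` output, it may help to confirm that the parameter actually reached the running kernel. The following is a minimal sketch, not part of {pve}: the function name `iommu_param_set` and its optional file argument are hypothetical and exist only so the check can be pointed at a file other than `/proc/cmdline`.

```shell
#!/bin/sh
# Report whether an IOMMU switch is present on a kernel command line.
# The optional argument (a file path) is a hypothetical convenience;
# by default the command line of the running kernel is read.
iommu_param_set() {
    cmdline_file="${1:-/proc/cmdline}"
    # Match intel_iommu=on or amd_iommu=on as a separate parameter,
    # optionally followed by further comma-separated values.
    grep -Eq '(^|[[:space:]])(intel|amd)_iommu=on([[:space:],]|$)' "$cmdline_file"
}

if iommu_param_set; then
    echo "IOMMU parameter found on the kernel command line"
else
    echo "IOMMU parameter missing, check /etc/default/grub and run update-grub"
fi
```

If the parameter is missing after a reboot, the most likely cause is that the bootloader configuration was not regenerated.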
||||||
|
|
||||||
It is also important that the device(s) you want to pass through
|
It is also important that the device(s) you want to pass through
|
||||||
are in a seperate IOMMU group. This can be checked with:
|
are in a *separate* `IOMMU` group. This can be checked with:
|
||||||
|
|
||||||
----
|
----
|
||||||
find /sys/kernel/iommu_groups/ -type l
|
# find /sys/kernel/iommu_groups/ -type l
|
||||||
----
|
----
|
||||||
|
|
||||||
It is okay if the device is in an IOMMU group together with its functions
|
It is okay if the device is in an `IOMMU` group together with its functions
|
||||||
(e.g. a GPU with the HDMI Audio device) or with its root port or PCI(e) bridge.
|
(e.g. a GPU with the HDMI Audio device) or with its root port or PCI(e) bridge.
|
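The raw `find` output can be hard to read on systems with many devices. The sketch below groups the same information per `IOMMU` group. It assumes the usual sysfs layout of `GROUP/devices/ADDRESS`; the function name `list_iommu_groups` and its optional directory argument are hypothetical.

```shell
#!/bin/sh
# Print each IOMMU group together with the PCI addresses it contains.
# The optional first argument (hypothetical) overrides the sysfs
# directory; the default is the path used by the find command above.
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for group in "$root"/*; do
        # Skip the unexpanded glob if no groups exist.
        [ -d "$group/devices" ] || continue
        for dev in "$group"/devices/*; do
            [ -e "$dev" ] || continue
            printf 'group %s: %s\n' "${group##*/}" "${dev##*/}"
        done
    done
}

# Sort numerically by group number.
list_iommu_groups | sort -k2,2n
```

A device that shares its group only with its own functions, its root port or its PCI(e) bridge is fine for passthrough.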
||||||
|
|
||||||
.PCI(e) slots
|
.PCI(e) slots
|
||||||
[NOTE]
|
[NOTE]
|
||||||
====
|
====
|
||||||
Some platforms handle their PCI(e) slots differently, so if you
|
Some platforms handle their physical PCI(e) slots differently. So, sometimes
|
||||||
do not get the desired IOMMU group separation, it may be helpful to
|
it can help to put the card in another PCI(e) slot, if you do not get the
|
||||||
try to put the card in a another PCI(e) slot.
|
desired `IOMMU` group separation.
|
||||||
====
|
====
|
||||||
|
|
||||||
.Unsafe interrupts
|
.Unsafe interrupts
|
||||||
[NOTE]
|
[NOTE]
|
||||||
====
|
====
|
||||||
For some platforms, it may be necessary to allow unsafe interrupts.
|
For some platforms, it may be necessary to allow unsafe interrupts.
|
||||||
This can most easily enabled with adding the following line
|
For this, add the following line to a file ending with `.conf' in
|
||||||
in a .conf file in */etc/modprobe.d/*.
|
*/etc/modprobe.d/*:
|
||||||
|
|
||||||
|
----
|
||||||
options vfio_iommu_type1 allow_unsafe_interrupts=1
|
options vfio_iommu_type1 allow_unsafe_interrupts=1
|
||||||
|
----
|
||||||
|
|
||||||
Please be aware that this option can make your system unstable.
|
Please be aware that this option can make your system unstable.
|
||||||
====
|
====
|
||||||
|
|
||||||
Host Device Passhtrough
|
Host Device Passthrough
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
The most used variant of PCI(e) passthrough is to pass through a whole
|
The most used variant of PCI(e) passthrough is to pass through a whole
|
||||||
PCI(e) card, for example a GPU or network card.
|
PCI(e) card, for example a GPU or a network card.
|
||||||
|
|
||||||
|
|
||||||
Host Configuration
|
Host Configuration
|
||||||
^^^^^^^^^^^^^^^^^^
|
^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
In this case, the host can not use the card. This can be achieved by two
|
In this case, the host cannot use the card. There are two methods to achieve
|
||||||
methods:
|
this:
|
||||||
|
|
||||||
Either add the ids to the options of the vfio-pci modules. This works
|
|
||||||
with adding
|
|
||||||
|
|
||||||
|
* pass the device IDs to the options of the 'vfio-pci' modules by adding
|
||||||
|
+
|
||||||
|
----
|
||||||
options vfio-pci ids=1234:5678,4321:8765
|
options vfio-pci ids=1234:5678,4321:8765
|
||||||
|
|
||||||
to a .conf file in */etc/modprobe.d/* where 1234:5678 and 4321:8765 are
|
|
||||||
the vendor and device ids obtained by:
|
|
||||||
|
|
||||||
----
|
----
|
||||||
lcpci -nn
|
+
|
||||||
|
to a .conf file in */etc/modprobe.d/* where `1234:5678` and `4321:8765` are
|
||||||
|
the vendor and device IDs obtained by:
|
||||||
|
+
|
||||||
|
----
|
||||||
|
# lspci -nn
|
||||||
----
|
----
|
||||||
|
|
||||||
Or simply blacklist the driver completely on the host with
|
* blacklist the driver completely on the host, ensuring that it is free to bind
|
||||||
|
for passthrough, with
|
||||||
|
+
|
||||||
|
----
|
||||||
blacklist DRIVERNAME
|
blacklist DRIVERNAME
|
||||||
|
----
|
||||||
|
+
|
||||||
|
in a .conf file in */etc/modprobe.d/*.
|
||||||
|
|
||||||
also in a .conf file in */etc/modprobe.d/*. Again update the initramfs
|
For both methods you need to
|
||||||
and reboot after that.
|
xref:qm_pci_passthrough_update_initramfs[update the `initramfs`] again and
|
||||||
|
reboot after that.
|
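For the first method, the vendor and device IDs have to be copied out of the `lspci -nn` output. The following sketch shows one possible way to do that. The helper name `vfio_ids` is hypothetical, and it relies on the IDs being printed as `[vvvv:dddd]` in square brackets.

```shell
#!/bin/sh
# Extract the [vendor:device] IDs from "lspci -nn" style lines on stdin
# and join them with commas, ready for an "options vfio-pci ids=" line.
# Class codes such as [0300] contain no colon and are therefore skipped.
vfio_ids() {
    grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' | tr -d '[]' | paste -sd, -
}

# Usage (the slot address 01:00 is only an example):
#   lspci -nn -s 01:00 | vfio_ids
```

The result can then be placed after `ids=` in the `options vfio-pci` line.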
||||||
|
|
||||||
|
[[qm_pci_passthrough_vm_config]]
|
||||||
VM Configuration
|
VM Configuration
|
||||||
^^^^^^^^^^^^^^^^
|
^^^^^^^^^^^^^^^^
|
||||||
|
To pass through the device you need to set the *hostpciX* option in the VM
|
||||||
To pass through the device you set *hostpciX* on the VM with
|
configuration, for example by executing:
|
||||||
|
|
||||||
----
|
----
|
||||||
qm set VMID -hostpci0 00:02.0
|
# qm set VMID -hostpci0 00:02.0
|
||||||
----
|
----
|
||||||
|
|
||||||
If your device has multiple functions, you can pass them through all together
|
If your device has multiple functions, you can pass them through all together
|
||||||
with the shortened syntax
|
with the shortened syntax `00:02`.
|
||||||
|
|
||||||
00:02
|
|
||||||
|
|
||||||
There are some options to which may be necessary, depending on the device
|
There are some options which may be necessary, depending on the device
|
||||||
and guest OS.
|
and guest OS:
|
||||||
|
|
||||||
|
* *x-vga=on|off* marks the PCI(e) device as the primary GPU of the VM.
|
||||||
|
With this enabled the *vga* configuration option will be ignored.
|
||||||
|
|
||||||
* *x-vga=on|off* marks the PCI(e) device the primary GPU of the VM.
|
|
||||||
With this enabled the *vga* parameter of the config will be ignored.
|
|
||||||
* *pcie=on|off* tells {pve} to use a PCIe or PCI port. Some guests/device
|
* *pcie=on|off* tells {pve} to use a PCIe or PCI port. Some guest/device
|
||||||
combination require PCIe rather than PCI (only available for q35 machine types).
|
combinations require PCIe rather than PCI. PCIe is only available for 'q35'
|
||||||
|
machine types.
|
||||||
|
|
||||||
* *rombar=on|off* makes the firmware ROM visible for the guest. Default is on.
|
* *rombar=on|off* makes the firmware ROM visible for the guest. Default is on.
|
||||||
Some PCI(e) devices need this disabled.
|
Some PCI(e) devices need this disabled.
|
||||||
|
|
||||||
* *romfile=<path>*, is an optional path to a ROM file for the device to use.
|
* *romfile=<path>* is an optional path to a ROM file for the device to use.
|
||||||
this is a relative path under */usr/share/kvm/*.
|
This is a relative path under */usr/share/kvm/*.
|
||||||
|
|
||||||
|
Example
|
||||||
|
+++++++
|
||||||
|
|
||||||
An example of PCIe passthrough with a GPU set to primary:
|
An example of PCIe passthrough with a GPU set to primary:
|
||||||
|
|
||||||
----
|
----
|
||||||
qm set VMID -hostpci0 02:00,pcie=on,x-vga=on
|
# qm set VMID -hostpci0 02:00,pcie=on,x-vga=on
|
||||||
----
|
----
|
||||||
|
|
||||||
|
|
||||||
Other considerations
|
Other considerations
|
||||||
^^^^^^^^^^^^^^^^^^^^
|
^^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
When passing through a GPU, the best compatibility is reached when using
|
When passing through a GPU, the best compatibility is reached when using
|
||||||
q35 as machine type, OVMF instead of SeaBIOS and PCIe instead of PCI.
|
'q35' as machine type, 'OVMF' ('EFI' for VMs) instead of SeaBIOS and PCIe
|
||||||
Note that if you want to use OVMF for GPU passthrough, the GPU needs
|
instead of PCI. Note that if you want to use 'OVMF' for GPU passthrough, the
|
||||||
to have an EFI capable ROM, otherwise use SeaBIOS instead.
|
GPU needs to have an EFI capable ROM, otherwise use SeaBIOS instead.
|
||||||
|
|
||||||
SR-IOV
|
SR-IOV
|
||||||
~~~~~~
|
~~~~~~
|
||||||
|
|
||||||
Another variant of passing through PCI(e) devices, is to use the hardware
|
Another variant for passing through PCI(e) devices is to use the hardware
|
||||||
virtualization features of your devices.
|
virtualization features of your devices, if available.
|
||||||
|
|
||||||
SR-IOV (Single-root input/output virtualization) enables a single device
|
'SR-IOV' (**S**ingle-**R**oot **I**nput/**O**utput **V**irtualization) enables
|
||||||
to provide multiple vf (virtual functions) to the system, so that each
|
a single device to provide multiple 'VF' (**V**irtual **F**unctions) to the
|
||||||
vf can be used in a different VM, with full hardware features, better
|
system. Each of those 'VF' can be used in a different VM, with full hardware
|
||||||
performance and lower latency than software virtualized devices.
|
features and also better performance and lower latency than software
|
||||||
|
virtualized devices.
|
||||||
|
|
||||||
|
Currently, the most common use case for this are NICs (**N**etwork
|
||||||
|
**I**nterface **C**ard) with SR-IOV support, which can provide multiple VFs per
|
||||||
|
physical port. This allows features such as checksum offloading, etc. to
|
||||||
|
be used inside a VM, reducing the (host) CPU overhead.
|
||||||
|
|
||||||
The most used devices for this are NICs with SR-IOV which can provide
|
|
||||||
multiple vf per physical port, allowing features such as
|
|
||||||
checksum offloading, etc. to be used inside a VM, reducing CPU overhead.
|
|
||||||
|
|
||||||
Host Configuration
|
Host Configuration
|
||||||
^^^^^^^^^^^^^^^^^^
|
^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
Generally there are 2 methods for enabling virtual functions on a device.
|
Generally, there are two methods for enabling virtual functions on a device.
|
||||||
|
|
||||||
In some cases there is an option for the driver module e.g. for some
|
* sometimes there is an option for the driver module e.g. for some
|
||||||
Intel drivers
|
Intel drivers
|
||||||
|
+
|
||||||
|
----
|
||||||
max_vfs=4
|
max_vfs=4
|
||||||
|
----
|
||||||
which could be put in a file in a .conf file in */etc/modprobe.d/*.
|
+
|
||||||
|
which could be put in a file with '.conf' ending under */etc/modprobe.d/*.
|
||||||
(Do not forget to update your initramfs after that)
|
(Do not forget to update your initramfs after that)
|
||||||
|
+
|
||||||
Please refer to your driver module documentation for the exact
|
Please refer to your driver module documentation for the exact
|
||||||
parameters and options.
|
parameters and options.
|
||||||
|
|
||||||
The second (more generic) approach is via the sysfs.
|
* The second, more generic, approach is using the `sysfs`.
|
||||||
If a device and driver supports this you can change the number of vfs on
|
If a device and driver support this you can change the number of VFs on
|
||||||
the fly. For example 4 vfs on device 0000:01:00.0 with:
|
the fly. For example, to set up 4 VFs on device 0000:01:00.0 execute:
|
||||||
|
+
|
||||||
----
|
----
|
||||||
echo 4 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
|
# echo 4 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
|
||||||
----
|
----
|
||||||
|
+
|
||||||
To make this change persistent you can use sysfsutils.
|
To make this change persistent you can use the `sysfsutils` Debian package.
|
||||||
Just install them via
|
After installation configure it via */etc/sysfs.conf* or a `FILE.conf' in
|
||||||
|
*/etc/sysfs.d/*.
|
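The `sysfs` approach can be wrapped in a small helper that first checks how many VFs the device supports. This is a sketch under assumptions: the function name `set_numvfs` and its optional third argument are hypothetical. It reads the `sriov_totalvfs` attribute, which sits next to `sriov_numvfs` for SR-IOV capable devices.

```shell
#!/bin/sh
# Set the number of VFs for a PCI device after checking the device limit.
# Arguments: PCI address, wanted VF count, and optionally (hypothetical)
# an alternative base directory instead of /sys/bus/pci/devices.
set_numvfs() {
    dev_dir="${3:-/sys/bus/pci/devices}/$1"
    wanted="$2"
    total=$(cat "$dev_dir/sriov_totalvfs") || return 1
    if [ "$wanted" -gt "$total" ]; then
        echo "device $1 supports at most $total VFs" >&2
        return 1
    fi
    # The kernel refuses to change a non-zero VF count directly,
    # so reset to 0 before writing the new value.
    echo 0 > "$dev_dir/sriov_numvfs"
    echo "$wanted" > "$dev_dir/sriov_numvfs"
}

# Usage (as root): set_numvfs 0000:01:00.0 4
```

For persistence with `sysfsutils`, the matching entry would presumably be a line such as `bus/pci/devices/0000:01:00.0/sriov_numvfs = 4`, since its configuration files take attribute paths relative to */sys*. Please check the package documentation for the exact syntax.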
||||||
----
|
|
||||||
apt install sysfsutils
|
|
||||||
----
|
|
||||||
|
|
||||||
and configure it via */etc/sysfs.conf* or */etc/sysfs.d/*.
|
|
||||||
|
|
||||||
VM Configuration
|
VM Configuration
|
||||||
^^^^^^^^^^^^^^^^
|
^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
After creating vfs, you should see them as seperate PCI(e) devices, which
|
After creating VFs, you should see them as separate PCI(e) devices when
|
||||||
can be passed through like a normal PCI(e) device.
|
outputting them with `lspci`. Get their ID and pass them through like a
|
||||||
|
xref:qm_pci_passthrough_vm_config[normal PCI(e) device].
|
||||||
|
|
||||||
Other considerations
|
Other considerations
|
||||||
^^^^^^^^^^^^^^^^^^^^
|
^^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
For this feature, platform support is especially important. It may be necessary
|
For this feature, platform support is especially important. It may be necessary
|
||||||
to enable this feature in the BIOS or to use a specific PCI(e) port for it
|
to enable this feature in the BIOS/EFI first, or to use a specific PCI(e) port
|
||||||
to work. In doubt, consult the manual of the platform or contact the vendor.
|
for it to work. If in doubt, consult the manual of the platform or contact its
|
||||||
|
vendor.
|
||||||
|