mirror of
				https://github.com/qemu/qemu.git
				synced 2025-10-31 12:07:31 +00:00 
			
		
		
		
	 3e8f715815
			
		
	
	
		3e8f715815
		
	
	
	
	
		
			
			Put in a more accessible place the reasoning behind our decision to officially drop KVM support in the powernv machine. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20211130133153.444601-3-danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
		
			
				
	
	
		
			207 lines
		
	
	
		
			7.3 KiB
		
	
	
	
		
			ReStructuredText
		
	
	
	
	
	
			
		
		
	
	
			207 lines
		
	
	
		
			7.3 KiB
		
	
	
	
		
			ReStructuredText
		
	
	
	
	
	
| PowerNV family boards (``powernv8``, ``powernv9``, ``powernv10``)
 | |
| ==================================================================
 | |
| 
 | |
| PowerNV (as Non-Virtualized) is the "bare metal" platform using the
 | |
| OPAL firmware. It runs Linux on IBM and OpenPOWER systems and it can
 | |
| be used as an hypervisor OS, running KVM guests, or simply as a host
 | |
| OS.
 | |
| 
 | |
| The PowerNV QEMU machine tries to emulate a PowerNV system at the
 | |
| level of the skiboot firmware, which loads the OS and provides some
 | |
| runtime services. Power Systems have a lower firmware (HostBoot) that
 | |
| does low level system initialization, like DRAM training. This is
 | |
| beyond the scope of what QEMU addresses today.
 | |
| 
 | |
| Supported devices
 | |
| -----------------
 | |
| 
 | |
|  * Multi processor support for POWER8, POWER8NVL and POWER9.
 | |
|  * XSCOM, serial communication sideband bus to configure chiplets.
 | |
|  * Simple LPC Controller.
 | |
|  * Processor Service Interface (PSI) Controller.
 | |
|  * Interrupt Controller, XICS (POWER8) and XIVE (POWER9) and XIVE2 (Power10).
 | |
|  * POWER8 PHB3 PCIe Host bridge and POWER9 PHB4 PCIe Host bridge.
 | |
|  * Simple OCC is an on-chip micro-controller used for power management tasks.
 | |
|  * iBT device to handle BMC communication, with the internal BMC simulator
 | |
|    provided by QEMU or an external BMC such as an Aspeed QEMU machine.
 | |
|  * PNOR containing the different firmware partitions.
 | |
| 
 | |
| Missing devices
 | |
| ---------------
 | |
| 
 | |
| A lot is missing, among which :
 | |
| 
 | |
|  * I2C controllers (yet to be merged).
 | |
|  * NPU/NPU2/NPU3 controllers.
 | |
|  * EEH support for PCIe Host bridge controllers.
 | |
|  * NX controller.
 | |
|  * VAS controller.
 | |
|  * chipTOD (Time Of Day).
 | |
|  * Self Boot Engine (SBE).
 | |
|  * FSI bus.
 | |
| 
 | |
| Firmware
 | |
| --------
 | |
| 
 | |
| The OPAL firmware (OpenPower Abstraction Layer) for OpenPower systems
 | |
| includes the runtime services ``skiboot`` and the bootloader kernel and
 | |
| initramfs ``skiroot``. Source code can be found on the `OpenPOWER account at
 | |
| GitHub <https://github.com/open-power>`_.
 | |
| 
 | |
| Prebuilt images of ``skiboot`` and ``skiroot`` are made available on the
 | |
| `OpenPOWER <https://github.com/open-power/op-build/releases/>`__ site.
 | |
| 
 | |
| QEMU includes a prebuilt image of ``skiboot`` which is updated when a
 | |
| more recent version is required by the models.
 | |
| 
 | |
| Current acceleration status
 | |
| ---------------------------
 | |
| 
 | |
| KVM acceleration in Linux Power hosts is provided by the kvm-hv and
 | |
| kvm-pr modules. kvm-hv is adherent to PAPR and it's not compliant with
 | |
| powernv. kvm-pr in theory could be used as a valid accel option but
 | |
| this isn't supported by kvm-pr at this moment.
 | |
| 
 | |
| To spare users from dealing with not so informative errors when attempting
 | |
| to use accel=kvm, the powernv machine will throw an error informing that
 | |
| KVM is not supported. This can be revisited in the future if kvm-pr (or
 | |
| any other KVM alternative) is usable as KVM accel for this machine.
 | |
| 
 | |
| Boot options
 | |
| ------------
 | |
| 
 | |
| Here is a simple setup with one e1000e NIC :
 | |
| 
 | |
| .. code-block:: bash
 | |
| 
 | |
|   $ qemu-system-ppc64 -m 2G -machine powernv9 -smp 2,cores=2,threads=1 \
 | |
|   -accel tcg,thread=single \
 | |
|   -device e1000e,netdev=net0,mac=C0:FF:EE:00:00:02,bus=pcie.0,addr=0x0 \
 | |
|   -netdev user,id=net0,hostfwd=::20022-:22,hostname=pnv \
 | |
|   -kernel ./zImage.epapr  \
 | |
|   -initrd ./rootfs.cpio.xz \
 | |
|   -nographic
 | |
| 
 | |
| and a SATA disk :
 | |
| 
 | |
| .. code-block:: bash
 | |
| 
 | |
|   -device ich9-ahci,id=sata0,bus=pcie.1,addr=0x0 \
 | |
|   -drive file=./ubuntu-ppc64le.qcow2,if=none,id=drive0,format=qcow2,cache=none \
 | |
|   -device ide-hd,bus=sata0.0,unit=0,drive=drive0,id=ide,bootindex=1 \
 | |
| 
 | |
| Complex PCIe configuration
 | |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~
 | |
| 
 | |
| Six PHBs are defined per chip (POWER9) but no default PCI layout is
 | |
| provided (to be compatible with libvirt). One PCI device can be added
 | |
| on any of the available PCIe slots using command line options such as:
 | |
| 
 | |
| .. code-block:: bash
 | |
| 
 | |
|   -device e1000e,netdev=net0,mac=C0:FF:EE:00:00:02,bus=pcie.0,addr=0x0
 | |
|   -netdev bridge,id=net0,helper=/usr/libexec/qemu-bridge-helper,br=virbr0,id=hostnet0
 | |
| 
 | |
|   -device megasas,id=scsi0,bus=pcie.0,addr=0x0
 | |
|   -drive file=./ubuntu-ppc64le.qcow2,if=none,id=drive-scsi0-0-0-0,format=qcow2,cache=none
 | |
|   -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=2
 | |
| 
 | |
| Here is a full example with two different storage controllers on
 | |
| different PHBs, each with a disk, the second PHB is empty :
 | |
| 
 | |
| .. code-block:: bash
 | |
| 
 | |
|   $ qemu-system-ppc64 -m 2G -machine powernv9 -smp 2,cores=2,threads=1 -accel tcg,thread=single \
 | |
|   -kernel ./zImage.epapr -initrd ./rootfs.cpio.xz -bios ./skiboot.lid \
 | |
|   \
 | |
|   -device megasas,id=scsi0,bus=pcie.0,addr=0x0 \
 | |
|   -drive file=./rhel7-ppc64le.qcow2,if=none,id=drive-scsi0-0-0-0,format=qcow2,cache=none \
 | |
|   -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=2 \
 | |
|   \
 | |
|   -device pcie-pci-bridge,id=bridge1,bus=pcie.1,addr=0x0 \
 | |
|   \
 | |
|   -device ich9-ahci,id=sata0,bus=bridge1,addr=0x1 \
 | |
|   -drive file=./ubuntu-ppc64le.qcow2,if=none,id=drive0,format=qcow2,cache=none \
 | |
|   -device ide-hd,bus=sata0.0,unit=0,drive=drive0,id=ide,bootindex=1 \
 | |
|   -device e1000e,netdev=net0,mac=C0:FF:EE:00:00:02,bus=bridge1,addr=0x2 \
 | |
|   -netdev bridge,helper=/usr/libexec/qemu-bridge-helper,br=virbr0,id=net0 \
 | |
|   -device nec-usb-xhci,bus=bridge1,addr=0x7 \
 | |
|   \
 | |
|   -serial mon:stdio -nographic
 | |
| 
 | |
| You can also use VIRTIO devices :
 | |
| 
 | |
| .. code-block:: bash
 | |
| 
 | |
|   -drive file=./fedora-ppc64le.qcow2,if=none,snapshot=on,id=drive0 \
 | |
|   -device virtio-blk-pci,drive=drive0,id=blk0,bus=pcie.0 \
 | |
|   \
 | |
|   -netdev tap,helper=/usr/lib/qemu/qemu-bridge-helper,br=virbr0,id=netdev0 \
 | |
|   -device virtio-net-pci,netdev=netdev0,id=net0,bus=pcie.1 \
 | |
|   \
 | |
|   -fsdev local,id=fsdev0,path=$HOME,security_model=passthrough \
 | |
|   -device virtio-9p-pci,fsdev=fsdev0,mount_tag=host,bus=pcie.2
 | |
| 
 | |
| Multi sockets
 | |
| ~~~~~~~~~~~~~
 | |
| 
 | |
| The number of sockets is deduced from the number of CPUs and the
 | |
| number of cores. ``-smp 2,cores=1`` will define a machine with 2
 | |
| sockets of 1 core, whereas ``-smp 2,cores=2`` will define a machine
 | |
| with 1 socket of 2 cores. ``-smp 8,cores=2``, 4 sockets of 2 cores.
 | |
| 
 | |
| BMC configuration
 | |
| ~~~~~~~~~~~~~~~~~
 | |
| 
 | |
| OpenPOWER systems negotiate the shutdown and reboot with their
 | |
| BMC. The QEMU PowerNV machine embeds an IPMI BMC simulator using the
 | |
| iBT interface and should offer the same power features.
 | |
| 
 | |
| If you want to define your own BMC, use ``-nodefaults`` and specify
 | |
| one on the command line :
 | |
| 
 | |
| .. code-block:: bash
 | |
| 
 | |
|   -device ipmi-bmc-sim,id=bmc0 -device isa-ipmi-bt,bmc=bmc0,irq=10
 | |
| 
 | |
| The files `palmetto-SDR.bin <http://www.kaod.org/qemu/powernv/palmetto-SDR.bin>`__
 | |
| and `palmetto-FRU.bin <http://www.kaod.org/qemu/powernv/palmetto-FRU.bin>`__
 | |
| define a Sensor Data Record repository and a Field Replaceable Unit
 | |
| inventory for a Palmetto BMC. They can be used to extend the QEMU BMC
 | |
| simulator.
 | |
| 
 | |
| .. code-block:: bash
 | |
| 
 | |
|   -device ipmi-bmc-sim,sdrfile=./palmetto-SDR.bin,fruareasize=256,frudatafile=./palmetto-FRU.bin,id=bmc0 \
 | |
|   -device isa-ipmi-bt,bmc=bmc0,irq=10
 | |
| 
 | |
| The PowerNV machine can also be run with an external IPMI BMC device
 | |
| connected to a remote QEMU machine acting as BMC, using these options
 | |
| :
 | |
| 
 | |
| .. code-block:: bash
 | |
| 
 | |
|   -chardev socket,id=ipmi0,host=localhost,port=9002,reconnect=10 \
 | |
|   -device ipmi-bmc-extern,id=bmc0,chardev=ipmi0 \
 | |
|   -device isa-ipmi-bt,bmc=bmc0,irq=10 \
 | |
|   -nodefaults
 | |
| 
 | |
| NVRAM
 | |
| ~~~~~
 | |
| 
 | |
| Use a MTD drive to add a PNOR to the machine, and get a NVRAM :
 | |
| 
 | |
| .. code-block:: bash
 | |
| 
 | |
|   -drive file=./witherspoon.pnor,format=raw,if=mtd
 | |
| 
 | |
| CAVEATS
 | |
| -------
 | |
| 
 | |
|  * No support for multiple HW threads (SMT=1). Same as pseries.
 | |
| 
 | |
| Maintainer contact information
 | |
| ------------------------------
 | |
| 
 | |
| Cédric Le Goater <clg@kaod.org>
 |