mirror_ubuntu-kernels

mirror of https://git.proxmox.com/git/mirror_ubuntu-kernels.git synced 2025-12-24 19:57:48 +00:00

Author	SHA1	Message	Date
Oliver Upton	74158a8cad	KVM: arm64: Skip instruction after emulating write to TCR_EL1 Whelp, this is embarrassing. Since commit `082fdfd138` ("KVM: arm64: Prevent guests from enabling HA/HD on Ampere1") KVM traps writes to TCR_EL1 on AmpereOne to work around an erratum in the unadvertised HAFDBS implementation, preventing the guest from enabling the feature. Unfortunately, I failed virtualization 101 when working on that change, and forgot to advance PC after instruction emulation. Do the right thing and skip the MSR instruction after emulating the write. Fixes: `082fdfd138` ("KVM: arm64: Prevent guests from enabling HA/HD on Ampere1") Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230728000824.3848025-1-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-07-28 17:11:23 +00:00
Fuad Tabba	a9626099a5	KVM: arm64: Use the appropriate feature trap register when activating traps Instead of writing directly to cptr_el2, use the helper that selects which feature trap register to write to based on the KVM mode. Fixes: `75c76ab5a6` ("KVM: arm64: Rework CPTR_EL2 programming for HVHE configuration") Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230724123829.2929609-7-tabba@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-07-26 17:08:30 +00:00
Oliver Upton	733c758e50	KVM: arm64: Rephrase percpu enable/disable tracking in terms of hyp kvm_arm_hardware_enabled is rather misleading, since it doesn't track the state of all hardware resources needed for running a VM. What it actually tracks is whether or not the hyp cpu context has been initialized. Since we're now at the point where vgic + timer irq management has been separated from kvm_arm_hardware_enabled, rephrase it (and the associated helpers) to make it clear what state is being tracked. Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230719231855.262973-1-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-07-20 17:17:32 +00:00
Raghavendra Rao Ananta	c718ca0e99	KVM: arm64: Fix hardware enable/disable flows for pKVM When running in protected mode, the hyp stub is disabled after pKVM is initialized, meaning the host cannot enable/disable the hyp at runtime. As such, kvm_arm_hardware_enabled is always 1 after initialization, and kvm_arch_hardware_enable() never enables the vgic maintenance irq or timer irqs. Unconditionally enable/disable the vgic + timer irqs in the respective calls, instead relying on the percpu bookkeeping in the generic code to keep track of which cpus have the interrupts unmasked. Fixes: `466d27e48d` ("KVM: arm64: Simplify the CPUHP logic") Reported-by: Oliver Upton <oliver.upton@linux.dev> Suggested-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Raghavendra Rao Ananta <rananta@google.com> Link: https://lore.kernel.org/r/20230719175400.647154-1-rananta@google.com Acked-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-07-20 17:16:56 +00:00
Oliver Upton	84f6867903	KVM: arm64: Allow pKVM on v1.0 compatible FF-A implementations pKVM initialization fails on systems with v1.1+ FF-A implementations, as the hyp does a strict match on the returned version from FFA_VERSION. This is a stronger assertion than required by the specification, which requires minor revisions be backwards compatible with earlier revisions of the same major version. Relax the check in hyp_ffa_init() to only test the returned major version. Even though v1.1 broke ABI, the expectation is that firmware incapable of using the v1.0 ABI return NOT_SUPPORTED instead of a valid version. Acked-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230718184537.3220867-1-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-07-19 16:59:18 +00:00
Xiang Chen	9d2a55b403	KVM: arm64: Fix the name of sys_reg_desc related to PMU For those PMU system registers defined in sys_reg_descs[], use macro PMU_SYS_REG() / PMU_PMEVCNTR_EL0 / PMU_PMEVTYPER_EL0 to define them, and later two macros call macro PMU_SYS_REG() actually. Currently the input parameter of PMU_SYS_REG() is another macro which is calculation formula of the value of system registers, so for example, if we want to "SYS_PMINTENSET_EL1" as the name of sys register, actually the name we get is as following: (((3) << 19) \| ((0) << 16) \| ((9) << 12) \| ((14) << 8) \| ((1) << 5)) The name of system register is used in some tracepoints such as trace_kvm_sys_access(), if not set correctly, we need to analyze the inaccurate name to get the exact name (which also is inconsistent with other system registers), and also the inaccurate name occupies more space. To fix the issue, use the name as a input parameter of PMU_SYS_REG like MTE_REG or EL2_REG. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/1689305920-170523-1-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-07-14 23:34:05 +00:00
Oliver Upton	6d4f9236cd	KVM: arm64: Correctly handle RES0 bits PMEVTYPER<n>_EL0.evtCount The PMU event ID varies from 10 to 16 bits, depending on the PMU version. If the PMU only supports 10 bits of event ID, bits [15:10] of the evtCount field behave as RES0. While the actual PMU emulation code gets this right (i.e. RES0 bits are masked out when programming the perf event), the sysreg emulation writes an unmasked value to the in-memory cpu context. The net effect is that guest reads and writes of PMEVTYPER<n>_EL0 will see non-RES0 behavior in the reserved bits of the field. As it so happens, kvm_pmu_set_counter_event_type() already writes a masked value to the in-memory context that gets overwritten by access_pmu_evtyper(). Fix the issue by removing the unnecessary (and incorrect) register write in access_pmu_evtyper(). Reviewed-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Reiji Watanabe <reijiw@google.com> Link: https://lore.kernel.org/r/20230713221649.3889210-1-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-07-14 23:28:58 +00:00
Marc Zyngier	b321c31c9b	KVM: arm64: vgic-v4: Make the doorbell request robust w.r.t preemption Xiang reports that VMs occasionally fail to boot on GICv4.1 systems when running a preemptible kernel, as it is possible that a vCPU is blocked without requesting a doorbell interrupt. The issue is that any preemption that occurs between vgic_v4_put() and schedule() on the block path will mark the vPE as nonresident and not request a doorbell irq. This occurs because when the vcpu thread is resumed on its way to block, vcpu_load() will make the vPE resident again. Once the vcpu actually blocks, we don't request a doorbell anymore, and the vcpu won't be woken up on interrupt delivery. Fix it by tracking that we're entering WFI, and key the doorbell request on that flag. This allows us not to make the vPE resident when going through a preempt/schedule cycle, meaning we don't lose any state. Cc: stable@vger.kernel.org Fixes: `8e01d9a396` ("KVM: arm64: vgic-v4: Move the GICv4 residency flow to be driven by vcpu_load/put") Reported-by: Xiang Chen <chenxiang66@hisilicon.com> Suggested-by: Zenghui Yu <yuzenghui@huawei.com> Tested-by: Xiang Chen <chenxiang66@hisilicon.com> Co-developed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20230713070657.3873244-1-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-07-13 22:23:34 +00:00
Mostafa Saleh	dcf89d1111	KVM: arm64: Add missing BTI instructions Some bti instructions were missing from commit `b53d4a2723` ("KVM: arm64: Use BTI for nvhe") 1) kvm_host_psci_cpu_entry kvm_host_psci_cpu_entry is called from __kvm_hyp_init_cpu through "br" instruction as __kvm_hyp_init_cpu resides in idmap section while kvm_host_psci_cpu_entry is in hyp .text so the offset is larger than 128MB range covered by "b". Which means that this function should start with "bti j" instruction. LLVM which is the only compiler supporting BTI for Linux, adds "bti j" for jump tables or by when taking the address of the block [1]. Same behaviour is observed with GCC. As kvm_host_psci_cpu_entry is a C function, this must be done in assembly. Another solution is to use X16/X17 with "br", as according to ARM ARM DDI0487I.a RLJHCL/IGMGRS, PACIASP has an implicit branch target identification instruction that is compatible with PSTATE.BTYPE 0b01 which includes "br X16/X17" And the kvm_host_psci_cpu_entry has PACIASP as it is an external function. Although, using explicit "bti" makes it more clear than relying on which register is used. A third solution is to clear SCTLR_EL2.BT, which would make PACIASP compatible PSTATE.BTYPE 0b11 ("br" to other registers). However this deviates from the kernel behaviour (in bti_enable()). 2) Spectre vector table "br" instructions are generated at runtime for the vector table (__bp_harden_hyp_vecs). These branches would land on vectors in __kvm_hyp_vector at offset 8. As all the macros are defined with valid_vect/invalid_vect, it is sufficient to add "bti j" at the correct offset. [1] https://reviews.llvm.org/D52867 Fixes: `b53d4a2723` ("KVM: arm64: Use BTI for nvhe") Signed-off-by: Mostafa Saleh <smostafa@google.com> Reported-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Tested-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://lore.kernel.org/r/20230706152240.685684-1-smostafa@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-07-12 22:15:36 +00:00
Oliver Upton	df6556adf2	KVM: arm64: Correctly handle page aging notifiers for unaligned memslot Userspace is allowed to select any PAGE_SIZE aligned hva to back guest memory. This is even the case with hugepages, although it is a rather suboptimal configuration as PTE level mappings are used at stage-2. The arm64 page aging handlers have an assumption that the specified range is exactly one page/block of memory, which in the aforementioned case is not necessarily true. All together this leads to the WARN() in kvm_age_gfn() firing. However, the WARN is only part of the issue as the table walkers visit at most a single leaf PTE. For hugepage-backed memory in a memslot that isn't hugepage-aligned, page aging entirely misses accesses to the hugepage beyond the first page in the memslot. Add a new walker dedicated to handling page aging MMU notifiers capable of walking a range of PTEs. Convert kvm(_test)_age_gfn() over to the new walker and drop the WARN that caught the issue in the first place. The implementation of this walker was inspired by the test_clear_young() implementation by Yu Zhao [*], but repurposed to address a bug in the existing aging implementation. Cc: stable@vger.kernel.org # v5.15 Fixes: `056aad67f8` ("kvm: arm/arm64: Rework gpa callback handlers") Link: https://lore.kernel.org/kvmarm/20230526234435.662652-6-yuzhao@google.com/ Co-developed-by: Yu Zhao <yuzhao@google.com> Signed-off-by: Yu Zhao <yuzhao@google.com> Reported-by: Reiji Watanabe <reijiw@google.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Link: https://lore.kernel.org/r/20230627235405.4069823-1-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-07-12 20:10:40 +00:00
Marc Zyngier	970dee09b2	KVM: arm64: Disable preemption in kvm_arch_hardware_enable() Since `0bf50497f0` ("KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock"), hotplugging back a CPU whilst a guest is running results in a number of ugly splats as most of this code expects to run with preemption disabled, which isn't the case anymore. While the context is preemptable, it isn't migratable, which should be enough. But we have plenty of preemptible() checks all over the place, and our per-CPU accessors also disable preemption. Since this affects released versions, let's do the easy fix first, disabling preemption in kvm_arch_hardware_enable(). We can always revisit this with a more invasive fix in the future. Fixes: `0bf50497f0` ("KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock") Reported-by: Kristina Martsenko <kristina.martsenko@arm.com> Tested-by: Kristina Martsenko <kristina.martsenko@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/aeab7562-2d39-e78e-93b1-4711f8cc3fa5@arm.com Cc: stable@vger.kernel.org # v6.3, v6.4 Link: https://lore.kernel.org/r/20230703163548.1498943-1-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-07-11 19:30:14 +00:00
Sudeep Holla	fa729bc7c9	KVM: arm64: Handle kvm_arm_init failure correctly in finalize_pkvm Currently there is no synchronisation between finalize_pkvm() and kvm_arm_init() initcalls. The finalize_pkvm() proceeds happily even if kvm_arm_init() fails resulting in the following warning on all the CPUs and eventually a HYP panic: \| kvm [1]: IPA Size Limit: 48 bits \| kvm [1]: Failed to init hyp memory protection \| kvm [1]: error initializing Hyp mode: -22 \| \| <snip> \| \| WARNING: CPU: 0 PID: 0 at arch/arm64/kvm/pkvm.c:226 _kvm_host_prot_finalize+0x30/0x50 \| Modules linked in: \| CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.4.0 #237 \| Hardware name: FVP Base RevC (DT) \| pstate: 634020c5 (nZCv daIF +PAN -UAO +TCO +DIT -SSBS BTYPE=--) \| pc : _kvm_host_prot_finalize+0x30/0x50 \| lr : __flush_smp_call_function_queue+0xd8/0x230 \| \| Call trace: \| _kvm_host_prot_finalize+0x3c/0x50 \| on_each_cpu_cond_mask+0x3c/0x6c \| pkvm_drop_host_privileges+0x4c/0x78 \| finalize_pkvm+0x3c/0x5c \| do_one_initcall+0xcc/0x240 \| do_initcall_level+0x8c/0xac \| do_initcalls+0x54/0x94 \| do_basic_setup+0x1c/0x28 \| kernel_init_freeable+0x100/0x16c \| kernel_init+0x20/0x1a0 \| ret_from_fork+0x10/0x20 \| Failed to finalize Hyp protection: -22 \| dtb=fvp-base-revc.dtb \| kvm [95]: nVHE hyp BUG at: arch/arm64/kvm/hyp/nvhe/mem_protect.c:540! \| kvm [95]: nVHE call trace: \| kvm [95]: [<ffff800081052984>] __kvm_nvhe_hyp_panic+0xac/0xf8 \| kvm [95]: [<ffff800081059644>] __kvm_nvhe_handle_host_mem_abort+0x1a0/0x2ac \| kvm [95]: [<ffff80008105511c>] __kvm_nvhe_handle_trap+0x4c/0x160 \| kvm [95]: [<ffff8000810540fc>] __kvm_nvhe___skip_pauth_save+0x4/0x4 \| kvm [95]: ---[ end nVHE call trace ]--- \| kvm [95]: Hyp Offset: 0xfffe8db00ffa0000 \| Kernel panic - not syncing: HYP panic: \| PS:a34023c9 PC:0000f250710b973c ESR:00000000f2000800 \| FAR:ffff000800cb00d0 HPFAR:000000000880cb00 PAR:0000000000000000 \| VCPU:0000000000000000 \| CPU: 3 PID: 95 Comm: kworker/u16:2 Tainted: G W 6.4.0 #237 \| Hardware name: FVP Base RevC (DT) \| Workqueue: rpciod rpc_async_schedule \| Call trace: \| dump_backtrace+0xec/0x108 \| show_stack+0x18/0x2c \| dump_stack_lvl+0x50/0x68 \| dump_stack+0x18/0x24 \| panic+0x138/0x33c \| nvhe_hyp_panic_handler+0x100/0x184 \| new_slab+0x23c/0x54c \| ___slab_alloc+0x3e4/0x770 \| kmem_cache_alloc_node+0x1f0/0x278 \| __alloc_skb+0xdc/0x294 \| tcp_stream_alloc_skb+0x2c/0xf0 \| tcp_sendmsg_locked+0x3d0/0xda4 \| tcp_sendmsg+0x38/0x5c \| inet_sendmsg+0x44/0x60 \| sock_sendmsg+0x1c/0x34 \| xprt_sock_sendmsg+0xdc/0x274 \| xs_tcp_send_request+0x1ac/0x28c \| xprt_transmit+0xcc/0x300 \| call_transmit+0x78/0x90 \| __rpc_execute+0x114/0x3d8 \| rpc_async_schedule+0x28/0x48 \| process_one_work+0x1d8/0x314 \| worker_thread+0x248/0x474 \| kthread+0xfc/0x184 \| ret_from_fork+0x10/0x20 \| SMP: stopping secondary CPUs \| Kernel Offset: 0x57c5cb460000 from 0xffff800080000000 \| PHYS_OFFSET: 0x80000000 \| CPU features: 0x00000000,1035b7a3,ccfe773f \| Memory Limit: none \| ---[ end Kernel panic - not syncing: HYP panic: \| PS:a34023c9 PC:0000f250710b973c ESR:00000000f2000800 \| FAR:ffff000800cb00d0 HPFAR:000000000880cb00 PAR:0000000000000000 \| VCPU:0000000000000000 ]--- Fix it by checking for the successfull initialisation of kvm_arm_init() in finalize_pkvm() before proceeding any futher. Fixes: `87727ba2bb` ("KVM: arm64: Ensure CPU PMU probes before pKVM host de-privilege") Cc: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: James Morse <james.morse@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230704193243.3300506-1-sudeep.holla@arm.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-07-11 19:30:14 +00:00
Marc Zyngier	fe769e6c1f	KVM: arm64: timers: Use CNTHCTL_EL2 when setting non-CNTKCTL_EL1 bits It recently appeared that, when running VHE, there is a notable difference between using CNTKCTL_EL1 and CNTHCTL_EL2, despite what the architecture documents: - When accessed from EL2, bits [19:18] and [16:10] of CNTKCTL_EL1 have the same assignment as CNTHCTL_EL2 - When accessed from EL1, bits [19:18] and [16:10] are RES0 It is all OK, until you factor in NV, where the EL2 guest runs at EL1. In this configuration, CNTKCTL_EL11 doesn't trap, nor ends up in the VNCR page. This means that any write from the guest affecting CNTHCTL_EL2 using CNTKCTL_EL1 ends up losing some state. Not good. The fix it obvious: don't use CNTKCTL_EL1 if you want to change bits that are not part of the EL1 definition of CNTKCTL_EL1, and use CNTHCTL_EL2 instead. This doesn't change anything for a bare-metal OS, and fixes it when running under NV. The NV hypervisor will itself have to work harder to merge the two accessors. Note that there is a pending update to the architecture to address this issue by making the affected bits UNKNOWN when CNTKCTL_EL1 is used from EL2 with VHE enabled. Fixes: `c605ee2450` ("KVM: arm64: timers: Allow physical offset without CNTPOFF_EL2") Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org # v6.4 Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20230627140557.544885-1-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-07-11 19:28:30 +00:00
Linus Torvalds	e8069f5a8e	ARM64: * Eager page splitting optimization for dirty logging, optionally allowing for a VM to avoid the cost of hugepage splitting in the stage-2 fault path. * Arm FF-A proxy for pKVM, allowing a pKVM host to safely interact with services that live in the Secure world. pKVM intervenes on FF-A calls to guarantee the host doesn't misuse memory donated to the hyp or a pKVM guest. * Support for running the split hypervisor with VHE enabled, known as 'hVHE' mode. This is extremely useful for testing the split hypervisor on VHE-only systems, and paves the way for new use cases that depend on having two TTBRs available at EL2. * Generalized framework for configurable ID registers from userspace. KVM/arm64 currently prevents arbitrary CPU feature set configuration from userspace, but the intent is to relax this limitation and allow userspace to select a feature set consistent with the CPU. * Enable the use of Branch Target Identification (FEAT_BTI) in the hypervisor. * Use a separate set of pointer authentication keys for the hypervisor when running in protected mode, as the host is untrusted at runtime. * Ensure timer IRQs are consistently released in the init failure paths. * Avoid trapping CTR_EL0 on systems with Enhanced Virtualization Traps (FEAT_EVT), as it is a register commonly read from userspace. * Erratum workaround for the upcoming AmpereOne part, which has broken hardware A/D state management. RISC-V: * Redirect AMO load/store misaligned traps to KVM guest * Trap-n-emulate AIA in-kernel irqchip for KVM guest * Svnapot support for KVM Guest s390: * New uvdevice secret API * CMM selftest and fixes * fix racy access to target CPU for diag 9c x86: * Fix missing/incorrect #GP checks on ENCLS * Use standard mmu_notifier hooks for handling APIC access page * Drop now unnecessary TR/TSS load after VM-Exit on AMD * Print more descriptive information about the status of SEV and SEV-ES during module load * Add a test for splitting and reconstituting hugepages during and after dirty logging * Add support for CPU pinning in demand paging test * Add support for AMD PerfMonV2, with a variety of cleanups and minor fixes included along the way * Add a "nx_huge_pages=never" option to effectively avoid creating NX hugepage recovery threads (because nx_huge_pages=off can be toggled at runtime) * Move handling of PAT out of MTRR code and dedup SVM+VMX code * Fix output of PIC poll command emulation when there's an interrupt * Add a maintainer's handbook to document KVM x86 processes, preferred coding style, testing expectations, etc. * Misc cleanups, fixes and comments Generic: * Miscellaneous bugfixes and cleanups Selftests: * Generate dependency files so that partial rebuilds work as expected -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmSgHrIUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroORcAf+KkBlXwQMf+Q0Hy6Mfe0OtkKmh0Ae 6HJ6dsuMfOHhWv5kgukh+qvuGUGzHq+gpVKmZg2yP3h3cLHOLUAYMCDm+rjXyjsk F4DbnJLfxq43Pe9PHRKFxxSecRcRYCNox0GD5UYL4PLKcH0FyfQrV+HVBK+GI8L3 FDzUcyJkR12Lcj1qf++7fsbzfOshL0AJPmidQCoc6wkLJpUEr/nYUqlI1Kx3YNuQ LKmxFHS4l4/O/px3GKNDrLWDbrVlwciGIa3GZLS52PZdW3mAqT+cqcPcYK6SW71P m1vE80VbNELX5q3YSRoOXtedoZ3Pk97LEmz/xQAsJ/jri0Z5Syk0Ok0m/Q== =AMXp -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "ARM64: - Eager page splitting optimization for dirty logging, optionally allowing for a VM to avoid the cost of hugepage splitting in the stage-2 fault path. - Arm FF-A proxy for pKVM, allowing a pKVM host to safely interact with services that live in the Secure world. pKVM intervenes on FF-A calls to guarantee the host doesn't misuse memory donated to the hyp or a pKVM guest. - Support for running the split hypervisor with VHE enabled, known as 'hVHE' mode. This is extremely useful for testing the split hypervisor on VHE-only systems, and paves the way for new use cases that depend on having two TTBRs available at EL2. - Generalized framework for configurable ID registers from userspace. KVM/arm64 currently prevents arbitrary CPU feature set configuration from userspace, but the intent is to relax this limitation and allow userspace to select a feature set consistent with the CPU. - Enable the use of Branch Target Identification (FEAT_BTI) in the hypervisor. - Use a separate set of pointer authentication keys for the hypervisor when running in protected mode, as the host is untrusted at runtime. - Ensure timer IRQs are consistently released in the init failure paths. - Avoid trapping CTR_EL0 on systems with Enhanced Virtualization Traps (FEAT_EVT), as it is a register commonly read from userspace. - Erratum workaround for the upcoming AmpereOne part, which has broken hardware A/D state management. RISC-V: - Redirect AMO load/store misaligned traps to KVM guest - Trap-n-emulate AIA in-kernel irqchip for KVM guest - Svnapot support for KVM Guest s390: - New uvdevice secret API - CMM selftest and fixes - fix racy access to target CPU for diag 9c x86: - Fix missing/incorrect #GP checks on ENCLS - Use standard mmu_notifier hooks for handling APIC access page - Drop now unnecessary TR/TSS load after VM-Exit on AMD - Print more descriptive information about the status of SEV and SEV-ES during module load - Add a test for splitting and reconstituting hugepages during and after dirty logging - Add support for CPU pinning in demand paging test - Add support for AMD PerfMonV2, with a variety of cleanups and minor fixes included along the way - Add a "nx_huge_pages=never" option to effectively avoid creating NX hugepage recovery threads (because nx_huge_pages=off can be toggled at runtime) - Move handling of PAT out of MTRR code and dedup SVM+VMX code - Fix output of PIC poll command emulation when there's an interrupt - Add a maintainer's handbook to document KVM x86 processes, preferred coding style, testing expectations, etc. - Misc cleanups, fixes and comments Generic: - Miscellaneous bugfixes and cleanups Selftests: - Generate dependency files so that partial rebuilds work as expected" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (153 commits) Documentation/process: Add a maintainer handbook for KVM x86 Documentation/process: Add a label for the tip tree handbook's coding style KVM: arm64: Fix misuse of KVM_ARM_VCPU_POWER_OFF bit index RISC-V: KVM: Remove unneeded semicolon RISC-V: KVM: Allow Svnapot extension for Guest/VM riscv: kvm: define vcpu_sbi_ext_pmu in header RISC-V: KVM: Expose IMSIC registers as attributes of AIA irqchip RISC-V: KVM: Add in-kernel virtualization of AIA IMSIC RISC-V: KVM: Expose APLIC registers as attributes of AIA irqchip RISC-V: KVM: Add in-kernel emulation of AIA APLIC RISC-V: KVM: Implement device interface for AIA irqchip RISC-V: KVM: Skeletal in-kernel AIA irqchip support RISC-V: KVM: Set kvm_riscv_aia_nr_hgei to zero RISC-V: KVM: Add APLIC related defines RISC-V: KVM: Add IMSIC related defines RISC-V: KVM: Implement guest external interrupt line management KVM: x86: Remove PRIx* definitions as they are solely for user space s390/uv: Update query for secret-UVCs s390/uv: replace scnprintf with sysfs_emit s390/uvdevice: Add 'Lock Secret Store' UVC ...	2023-07-03 15:32:22 -07:00
Paolo Bonzini	cc744042d9	KVM/arm64 updates for 6.5 - Eager page splitting optimization for dirty logging, optionally allowing for a VM to avoid the cost of block splitting in the stage-2 fault path. - Arm FF-A proxy for pKVM, allowing a pKVM host to safely interact with services that live in the Secure world. pKVM intervenes on FF-A calls to guarantee the host doesn't misuse memory donated to the hyp or a pKVM guest. - Support for running the split hypervisor with VHE enabled, known as 'hVHE' mode. This is extremely useful for testing the split hypervisor on VHE-only systems, and paves the way for new use cases that depend on having two TTBRs available at EL2. - Generalized framework for configurable ID registers from userspace. KVM/arm64 currently prevents arbitrary CPU feature set configuration from userspace, but the intent is to relax this limitation and allow userspace to select a feature set consistent with the CPU. - Enable the use of Branch Target Identification (FEAT_BTI) in the hypervisor. - Use a separate set of pointer authentication keys for the hypervisor when running in protected mode, as the host is untrusted at runtime. - Ensure timer IRQs are consistently released in the init failure paths. - Avoid trapping CTR_EL0 on systems with Enhanced Virtualization Traps (FEAT_EVT), as it is a register commonly read from userspace. - Erratum workaround for the upcoming AmpereOne part, which has broken hardware A/D state management. As a consequence of the hVHE series reworking the arm64 software features framework, the for-next/module-alloc branch from the arm64 tree comes along for the ride. -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQSNXHjWXuzMZutrKNKivnWIJHzdFgUCZJWrhwAKCRCivnWIJHzd Fs07AP9xliv5yIoQtRgPZXc0QDPyUm7zg8f5KDgqCVJtyHXcvAEAmmerBr39nbPc XoMXALKx6NGt836P0C+bhvRcHdFPGwE= =c/Xh -----END PGP SIGNATURE----- Merge tag 'kvmarm-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 6.5 - Eager page splitting optimization for dirty logging, optionally allowing for a VM to avoid the cost of block splitting in the stage-2 fault path. - Arm FF-A proxy for pKVM, allowing a pKVM host to safely interact with services that live in the Secure world. pKVM intervenes on FF-A calls to guarantee the host doesn't misuse memory donated to the hyp or a pKVM guest. - Support for running the split hypervisor with VHE enabled, known as 'hVHE' mode. This is extremely useful for testing the split hypervisor on VHE-only systems, and paves the way for new use cases that depend on having two TTBRs available at EL2. - Generalized framework for configurable ID registers from userspace. KVM/arm64 currently prevents arbitrary CPU feature set configuration from userspace, but the intent is to relax this limitation and allow userspace to select a feature set consistent with the CPU. - Enable the use of Branch Target Identification (FEAT_BTI) in the hypervisor. - Use a separate set of pointer authentication keys for the hypervisor when running in protected mode, as the host is untrusted at runtime. - Ensure timer IRQs are consistently released in the init failure paths. - Avoid trapping CTR_EL0 on systems with Enhanced Virtualization Traps (FEAT_EVT), as it is a register commonly read from userspace. - Erratum workaround for the upcoming AmpereOne part, which has broken hardware A/D state management. As a consequence of the hVHE series reworking the arm64 software features framework, the for-next/module-alloc branch from the arm64 tree comes along for the ride.	2023-07-01 07:04:29 -04:00
Linus Torvalds	2605e80d34	arm64 updates for 6.5: - Support for the Armv8.9 Permission Indirection Extensions. While this feature doesn't add new functionality, it enables future support for Guarded Control Stacks (GCS) and Permission Overlays. - User-space support for the Armv8.8 memcpy/memset instructions. - arm64 perf: support the HiSilicon SoC uncore PMU, Arm CMN sysfs identifier, support for the NXP i.MX9 SoC DDRC PMU, fixes and cleanups. - Removal of superfluous ISBs on context switch (following retrospective architecture tightening). - Decode the ISS2 register during faults for additional information to help with debugging. - KPTI clean-up/simplification of the trampoline exit code. - Addressing several -Wmissing-prototype warnings. - Kselftest improvements for signal handling and ptrace. - Fix TPIDR2_EL0 restoring on sigreturn - Clean-up, robustness improvements of the module allocation code. - More sysreg conversions to the automatic register/bitfields generation. - CPU capabilities handling cleanup. - Arm documentation updates: ACPI, ptdump. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmSZyXwACgkQa9axLQDI XvEM3BAAkMzHGTDhNVNGLSO07PVmdzTiuoNFlfX7bktdIb+El76VhGXhHeEywTje wAq9JIYBf/Src2HbgZLwuly8Fn2vCrhyp++bRJW82o9SiBnx91+0mH7zLf+XHiQ4 FHKZxvaE6PaDc9o8WXr+IeucPRb5W2HgH37mktxh7ShMLsxorwS94V1oL29A2mV9 t4XQY7/tdmrDKMKMuQnIr1DurNXBhJ1OKvDnSN/Zzm96JOU/QQ32N2wEE7Y0aHOh bBzClksx2mguQqV515mySGFe5yy9NqaAfx2hTAciq+1rwbiCSjqQQmEswoUH8WLX JNLylxADWT2qXThFe8W6uyFzEshSAoI1yKxlCGuOsQpu4sFJtR8oh8dDj5669g4Y j0jR87r9rWm0iyYI5I+XDMxFVyuh2eFInvjtynRbj+mtS3f/SkO8fXG6Uya+I76C UGLlBUKnLr/zHuIGN0LE/V4dYTqsi9EtHoc2Am2xCZsS9jqkxKJG8C93Zsm4GlJC OcUtBSjW0rYJq+tLk0yhR6hbh59QbiRh05KnZsPpOKi8purlKSL9ZNPRi7TndLdm HjHUY+vQwNIpPIb6pyK4aYZuTdGEQIsQykQ8CULiIGlHi7kc4g9029ouLc5bBAeU mU8D62I2ztzPoYljYWNtO7K6g/Dq8c4lpsaMAJ+1Wp2iq2xBJjo= =rNBK -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: "Notable features are user-space support for the memcpy/memset instructions and the permission indirection extension. - Support for the Armv8.9 Permission Indirection Extensions. While this feature doesn't add new functionality, it enables future support for Guarded Control Stacks (GCS) and Permission Overlays - User-space support for the Armv8.8 memcpy/memset instructions - arm64 perf: support the HiSilicon SoC uncore PMU, Arm CMN sysfs identifier, support for the NXP i.MX9 SoC DDRC PMU, fixes and cleanups - Removal of superfluous ISBs on context switch (following retrospective architecture tightening) - Decode the ISS2 register during faults for additional information to help with debugging - KPTI clean-up/simplification of the trampoline exit code - Addressing several -Wmissing-prototype warnings - Kselftest improvements for signal handling and ptrace - Fix TPIDR2_EL0 restoring on sigreturn - Clean-up, robustness improvements of the module allocation code - More sysreg conversions to the automatic register/bitfields generation - CPU capabilities handling cleanup - Arm documentation updates: ACPI, ptdump" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (124 commits) kselftest/arm64: Add a test case for TPIDR2 restore arm64/signal: Restore TPIDR2 register rather than memory state arm64: alternatives: make clean_dcache_range_nopatch() noinstr-safe Documentation/arm64: Add ptdump documentation arm64: hibernate: remove WARN_ON in save_processor_state kselftest/arm64: Log signal code and address for unexpected signals docs: perf: Fix warning from 'make htmldocs' in hisi-pmu.rst arm64/fpsimd: Exit streaming mode when flushing tasks docs: perf: Add new description for HiSilicon UC PMU drivers/perf: hisi: Add support for HiSilicon UC PMU driver drivers/perf: hisi: Add support for HiSilicon H60PA and PAv3 PMU driver perf: arm_cspmu: Add missing MODULE_DEVICE_TABLE perf/arm-cmn: Add sysfs identifier perf/arm-cmn: Revamp model detection perf/arm_dmc620: Add cpumask arm64: mm: fix VA-range sanity check arm64/mm: remove now-superfluous ISBs from TTBR writes Documentation/arm64: Update ACPI tables from BBR Documentation/arm64: Update references in arm-acpi Documentation/arm64: Update ARM and arch reference ...	2023-06-26 17:11:53 -07:00
Catalin Marinas	abc17128c8	Merge branch 'for-next/feat_s1pie' into for-next/core * for-next/feat_s1pie: : Support for the Armv8.9 Permission Indirection Extensions (stage 1 only) KVM: selftests: get-reg-list: add Permission Indirection registers KVM: selftests: get-reg-list: support ID register features arm64: Document boot requirements for PIE arm64: transfer permission indirection settings to EL2 arm64: enable Permission Indirection Extension (PIE) arm64: add encodings of PIRx_ELx registers arm64: disable EL2 traps for PIE arm64: reorganise PAGE_/PROT_ macros arm64: add PTE_WRITE to PROT_SECT_NORMAL arm64: add PTE_UXN/PTE_WRITE to SWAPPER__FLAGS KVM: arm64: expose ID_AA64MMFR3_EL1 to guests KVM: arm64: Save/restore PIE registers KVM: arm64: Save/restore TCR2_EL1 arm64: cpufeature: add Permission Indirection Extension cpucap arm64: cpufeature: add TCR2 cpucap arm64: cpufeature: add system register ID_AA64MMFR3 arm64/sysreg: add PIR_ELx registers arm64/sysreg: update HCRX_EL2 register arm64/sysreg: add system registers TCR2_ELx arm64/sysreg: Add ID register ID_AA64MMFR3	2023-06-23 18:34:16 +01:00
Catalin Marinas	f42039d10b	Merge branches 'for-next/kpti', 'for-next/missing-proto-warn', 'for-next/iss2-decode', 'for-next/kselftest', 'for-next/misc', 'for-next/feat_mops', 'for-next/module-alloc', 'for-next/sysreg', 'for-next/cpucap', 'for-next/acpi', 'for-next/kdump', 'for-next/acpi-doc', 'for-next/doc' and 'for-next/tpidr2-fix', remote-tracking branch 'arm64/for-next/perf' into for-next/core * arm64/for-next/perf: docs: perf: Fix warning from 'make htmldocs' in hisi-pmu.rst docs: perf: Add new description for HiSilicon UC PMU drivers/perf: hisi: Add support for HiSilicon UC PMU driver drivers/perf: hisi: Add support for HiSilicon H60PA and PAv3 PMU driver perf: arm_cspmu: Add missing MODULE_DEVICE_TABLE perf/arm-cmn: Add sysfs identifier perf/arm-cmn: Revamp model detection perf/arm_dmc620: Add cpumask dt-bindings: perf: fsl-imx-ddr: Add i.MX93 compatible drivers/perf: imx_ddr: Add support for NXP i.MX9 SoC DDRC PMU driver perf/arm_cspmu: Decouple APMT dependency perf/arm_cspmu: Clean up ACPI dependency ACPI/APMT: Don't register invalid resource perf/arm_cspmu: Fix event attribute type perf: arm_cspmu: Set irq affinitiy only if overflow interrupt is used drivers/perf: hisi: Don't migrate perf to the CPU going to teardown drivers/perf: apple_m1: Force 63bit counters for M2 CPUs perf/arm-cmn: Fix DTC reset perf: qcom_l2_pmu: Make l2_cache_pmu_probe_cluster() more robust perf/arm-cci: Slightly optimize cci_pmu_sync_counters() * for-next/kpti: : Simplify KPTI trampoline exit code arm64: entry: Simplify tramp_alias macro and tramp_exit routine arm64: entry: Preserve/restore X29 even for compat tasks * for-next/missing-proto-warn: : Address -Wmissing-prototype warnings arm64: add alt_cb_patch_nops prototype arm64: move early_brk64 prototype to header arm64: signal: include asm/exception.h arm64: kaslr: add kaslr_early_init() declaration arm64: flush: include linux/libnvdimm.h arm64: module-plts: inline linux/moduleloader.h arm64: hide unused is_valid_bugaddr() arm64: efi: add efi_handle_corrupted_x18 prototype arm64: cpuidle: fix #ifdef for acpi functions arm64: kvm: add prototypes for functions called in asm arm64: spectre: provide prototypes for internal functions arm64: move cpu_suspend_set_dbg_restorer() prototype to header arm64: avoid prototype warnings for syscalls arm64: add scs_patch_vmlinux prototype arm64: xor-neon: mark xor_arm64_neon_() static for-next/iss2-decode: : Add decode of ISS2 to data abort reports arm64/esr: Add decode of ISS2 to data abort reporting arm64/esr: Use GENMASK() for the ISS mask * for-next/kselftest: : Various arm64 kselftest improvements kselftest/arm64: Log signal code and address for unexpected signals kselftest/arm64: Add a smoke test for ptracing hardware break/watch points * for-next/misc: : Miscellaneous patches arm64: alternatives: make clean_dcache_range_nopatch() noinstr-safe arm64: hibernate: remove WARN_ON in save_processor_state arm64/fpsimd: Exit streaming mode when flushing tasks arm64: mm: fix VA-range sanity check arm64/mm: remove now-superfluous ISBs from TTBR writes arm64: consolidate rox page protection logic arm64: set __exception_irq_entry with __irq_entry as a default arm64: syscall: unmask DAIF for tracing status arm64: lockdep: enable checks for held locks when returning to userspace arm64/cpucaps: increase string width to properly format cpucaps.h arm64/cpufeature: Use helper for ECV CNTPOFF cpufeature * for-next/feat_mops: : Support for ARMv8.8 memcpy instructions in userspace kselftest/arm64: add MOPS to hwcap test arm64: mops: allow disabling MOPS from the kernel command line arm64: mops: detect and enable FEAT_MOPS arm64: mops: handle single stepping after MOPS exception arm64: mops: handle MOPS exceptions KVM: arm64: hide MOPS from guests arm64: mops: don't disable host MOPS instructions from EL2 arm64: mops: document boot requirements for MOPS KVM: arm64: switch HCRX_EL2 between host and guest arm64: cpufeature: detect FEAT_HCX KVM: arm64: initialize HCRX_EL2 * for-next/module-alloc: : Make the arm64 module allocation code more robust (clean-up, VA range expansion) arm64: module: rework module VA range selection arm64: module: mandate MODULE_PLTS arm64: module: move module randomization to module.c arm64: kaslr: split kaslr/module initialization arm64: kasan: remove !KASAN_VMALLOC remnants arm64: module: remove old !KASAN_VMALLOC logic * for-next/sysreg: (21 commits) : More sysreg conversions to automatic generation arm64/sysreg: Convert TRBIDR_EL1 register to automatic generation arm64/sysreg: Convert TRBTRG_EL1 register to automatic generation arm64/sysreg: Convert TRBMAR_EL1 register to automatic generation arm64/sysreg: Convert TRBSR_EL1 register to automatic generation arm64/sysreg: Convert TRBBASER_EL1 register to automatic generation arm64/sysreg: Convert TRBPTR_EL1 register to automatic generation arm64/sysreg: Convert TRBLIMITR_EL1 register to automatic generation arm64/sysreg: Rename TRBIDR_EL1 fields per auto-gen tools format arm64/sysreg: Rename TRBTRG_EL1 fields per auto-gen tools format arm64/sysreg: Rename TRBMAR_EL1 fields per auto-gen tools format arm64/sysreg: Rename TRBSR_EL1 fields per auto-gen tools format arm64/sysreg: Rename TRBBASER_EL1 fields per auto-gen tools format arm64/sysreg: Rename TRBPTR_EL1 fields per auto-gen tools format arm64/sysreg: Rename TRBLIMITR_EL1 fields per auto-gen tools format arm64/sysreg: Convert OSECCR_EL1 to automatic generation arm64/sysreg: Convert OSDTRTX_EL1 to automatic generation arm64/sysreg: Convert OSDTRRX_EL1 to automatic generation arm64/sysreg: Convert OSLAR_EL1 to automatic generation arm64/sysreg: Standardise naming of bitfield constants in OSL[AS]R_EL1 arm64/sysreg: Convert MDSCR_EL1 to automatic register generation ... * for-next/cpucap: : arm64 cpucap clean-up arm64: cpufeature: fold cpus_set_cap() into update_cpu_capabilities() arm64: cpufeature: use cpucap naming arm64: alternatives: use cpucap naming arm64: standardise cpucap bitmap names * for-next/acpi: : Various arm64-related ACPI patches ACPI: bus: Consolidate all arm specific initialisation into acpi_arm_init() * for-next/kdump: : Simplify the crashkernel reservation behaviour of crashkernel=X,high on arm64 arm64: add kdump.rst into index.rst Documentation: add kdump.rst to present crashkernel reservation on arm64 arm64: kdump: simplify the reservation behaviour of crashkernel=,high * for-next/acpi-doc: : Update ACPI documentation for Arm systems Documentation/arm64: Update ACPI tables from BBR Documentation/arm64: Update references in arm-acpi Documentation/arm64: Update ARM and arch reference * for-next/doc: : arm64 documentation updates Documentation/arm64: Add ptdump documentation * for-next/tpidr2-fix: : Fix the TPIDR2_EL0 register restoring on sigreturn kselftest/arm64: Add a test case for TPIDR2 restore arm64/signal: Restore TPIDR2 register rather than memory state	2023-06-23 18:32:20 +01:00
Oliver Upton	192df2aa01	KVM: arm64: Fix misuse of KVM_ARM_VCPU_POWER_OFF bit index KVM_ARM_VCPU_POWER_OFF is as bit index, _not_ a literal bitmask. Nonetheless, commit `e3c1c0cae3` ("KVM: arm64: Relax invariance of KVM_ARM_VCPU_POWER_OFF") started using it that way, meaning that powering off a vCPU with the KVM_ARM_VCPU_INIT ioctl is completely broken. Fix it by using a shifted bit for the bitwise operations instead. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Fixes: `e3c1c0cae3` ("KVM: arm64: Relax invariance of KVM_ARM_VCPU_POWER_OFF") Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230622160922.1925530-1-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-22 17:17:14 +00:00
Oliver Upton	92d05e2492	Merge branch kvm-arm64/ampere1-hafdbs-mitigation into kvmarm/next * kvm-arm64/ampere1-hafdbs-mitigation: : AmpereOne erratum AC03_CPU_38 mitigation : : AmpereOne does not advertise support for FEAT_HAFDBS due to an : underlying erratum in the feature. The associated control bits do not : have RES0 behavior as required by the architecture. : : Introduce mitigations to prevent KVM from enabling the feature at : stage-2 as well as preventing KVM guests from enabling HAFDBS at : stage-1. KVM: arm64: Prevent guests from enabling HA/HD on Ampere1 KVM: arm64: Refactor HFGxTR configuration into separate helpers arm64: errata: Mitigate Ampere1 erratum AC03_CPU_38 at stage-2 Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-16 00:49:36 +00:00
Oliver Upton	082fdfd138	KVM: arm64: Prevent guests from enabling HA/HD on Ampere1 An erratum in the HAFDBS implementation in AmpereOne was addressed by clearing the feature in the ID register, with the expectation that software would not attempt to use the corresponding controls in TCR_EL1. The architecture, on the other hand, takes a much more pedantic stance on the subject, requiring the TCR bits behave as RES0. Take an extremely conservative stance on the issue and leverage the precise write trap afforded by FGT. Handle guest writes by clearing HA and HD before writing the intended value to the EL1 register alias. Link: https://lore.kernel.org/r/20230609220104.1836988-4-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-16 00:31:44 +00:00
Oliver Upton	ce4a362257	KVM: arm64: Refactor HFGxTR configuration into separate helpers A subsequent change will need to flip more trap bits in HFGWTR_EL2. Make room for this by factoring out the programming of the HFGxTR registers into helpers and using locals to build the set/clear masks. Link: https://lore.kernel.org/r/20230609220104.1836988-3-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-16 00:31:44 +00:00
Oliver Upton	6df696cd9b	arm64: errata: Mitigate Ampere1 erratum AC03_CPU_38 at stage-2 AmpereOne has an erratum in its implementation of FEAT_HAFDBS that required disabling the feature on the design. This was done by reporting the feature as not implemented in the ID register, although the corresponding control bits were not actually RES0. This does not align well with the requirements of the architecture, which mandates these bits be RES0 if HAFDBS isn't implemented. The kernel's use of stage-1 is unaffected, as the HA and HD bits are only set if HAFDBS is detected in the ID register. KVM, on the other hand, relies on the RES0 behavior at stage-2 to use the same value for VTCR_EL2 on any cpu in the system. Mitigate the non-RES0 behavior by leaving VTCR_EL2.HA clear on affected systems. Cc: stable@vger.kernel.org Cc: D Scott Phillips <scott@os.amperecomputing.com> Cc: Darren Hart <darren@os.amperecomputing.com> Acked-by: D Scott Phillips <scott@os.amperecomputing.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20230609220104.1836988-2-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-16 00:31:44 +00:00
Oliver Upton	e1e315c4d5	Merge branch kvm-arm64/misc into kvmarm/next * kvm-arm64/misc: : Miscellaneous updates : : - Avoid trapping CTR_EL0 on systems with FEAT_EVT, as the register is : commonly read by userspace : : - Make use of FEAT_BTI at hyp stage-1, setting the Guard Page bit to 1 : for executable mappings : : - Use a separate set of pointer authentication keys for the hypervisor : when running in protected mode (i.e. pKVM) : : - Plug a few holes in timer initialization where KVM fails to free the : timer IRQ(s) KVM: arm64: Use different pointer authentication keys for pKVM KVM: arm64: timers: Fix resource leaks in kvm_timer_hyp_init() KVM: arm64: Use BTI for nvhe KVM: arm64: Relax trapping of CTR_EL0 when FEAT_EVT is available Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-15 13:09:43 +00:00
Oliver Upton	89a734b54c	Merge branch kvm-arm64/configurable-id-regs into kvmarm/next * kvm-arm64/configurable-id-regs: : Configurable ID register infrastructure, courtesy of Jing Zhang : : Create generalized infrastructure for allowing userspace to select the : supported feature set for a VM, so long as the feature set is a subset : of what hardware + KVM allows. This does not add any new features that : are user-configurable, and instead focuses on the necessary refactoring : to enable future work. : : As a consequence of the series, feature asymmetry is now deliberately : disallowed for KVM. It is unlikely that VMMs ever configured VMs with : asymmetry, nor does it align with the kernel's overall stance that : features must be uniform across all cores in the system. : : Furthermore, KVM incorrectly advertised an IMP_DEF PMU to guests for : some time. Migrations from affected kernels was supported by explicitly : allowing such an ID register value from userspace, and forwarding that : along to the guest. KVM now allows an IMP_DEF PMU version to be restored : through the ID register interface, but reinterprets the user value as : not implemented (0). KVM: arm64: Rip out the vestiges of the 'old' ID register scheme KVM: arm64: Handle ID register reads using the VM-wide values KVM: arm64: Use generic sanitisation for ID_AA64PFR0_EL1 KVM: arm64: Use generic sanitisation for ID_(AA64)DFR0_EL1 KVM: arm64: Use arm64_ftr_bits to sanitise ID register writes KVM: arm64: Save ID registers' sanitized value per guest KVM: arm64: Reuse fields of sys_reg_desc for idreg KVM: arm64: Rewrite IMPDEF PMU version as NI KVM: arm64: Make vCPU feature flags consistent VM-wide KVM: arm64: Relax invariance of KVM_ARM_VCPU_POWER_OFF KVM: arm64: Separate out feature sanitisation and initialisation Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-15 13:05:11 +00:00
Oliver Upton	b710fe0d30	Merge branch kvm-arm64/hvhe into kvmarm/next * kvm-arm64/hvhe: : Support for running split-hypervisor w/VHE, courtesy of Marc Zyngier : : From the cover letter: : : KVM (on ARMv8.0) and pKVM (on all revisions of the architecture) use : the split hypervisor model that makes the EL2 code more or less : standalone. In the later case, we totally ignore the VHE mode and : stick with the good old v8.0 EL2 setup. : : We introduce a new "mode" for KVM called hVHE, in reference to the : nVHE mode, and indicating that only the hypervisor is using VHE. KVM: arm64: Fix hVHE init on CPUs where HCR_EL2.E2H is not RES1 arm64: Allow arm64_sw.hvhe on command line KVM: arm64: Force HCR_E2H in guest context when ARM64_KVM_HVHE is set KVM: arm64: Program the timer traps with VHE layout in hVHE mode KVM: arm64: Rework CPTR_EL2 programming for HVHE configuration KVM: arm64: Adjust EL2 stage-1 leaf AP bits when ARM64_KVM_HVHE is set KVM: arm64: Disable TTBR1_EL2 when using ARM64_KVM_HVHE KVM: arm64: Force HCR_EL2.E2H when ARM64_KVM_HVHE is set KVM: arm64: Key use of VHE instructions in nVHE code off ARM64_KVM_HVHE KVM: arm64: Remove alternatives from sysreg accessors in VHE hypervisor context arm64: Use CPACR_EL1 format to set CPTR_EL2 when E2H is set arm64: Allow EL1 physical timer access when running VHE arm64: Don't enable VHE for the kernel if OVERRIDE_HVHE is set arm64: Add KVM_HVHE capability and has_hvhe() predicate arm64: Turn kaslr_feature_override into a generic SW feature override arm64: Prevent the use of is_kernel_in_hyp_mode() in hypervisor code KVM: arm64: Drop is_kernel_in_hyp_mode() from __invalidate_icache_guest_page() Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-15 13:02:49 +00:00
Oliver Upton	1a08f4927a	Merge branch kvm-arm64/ffa-proxy into kvmarm/next * kvm-arm64/ffa-proxy: : pKVM FF-A Proxy, courtesy Will Deacon and Andrew Walbran : : From the cover letter: : : pKVM's primary goal is to protect guest pages from a compromised host by : enforcing access control restrictions using stage-2 page-tables. Sadly, : this cannot prevent TrustZone from accessing non-secure memory, and a : compromised host could, for example, perform a 'confused deputy' attack : by asking TrustZone to use pages that have been donated to protected : guests. This would effectively allow the host to have TrustZone : exfiltrate guest secrets on its behalf, hence breaking the isolation : that pKVM intends to provide. : : This series addresses this problem by providing pKVM with the ability to : monitor SMCs following the Arm FF-A protocol. FF-A provides (among other : things) a set of memory management APIs allowing the Normal World to : share, donate or lend pages with Secure. By monitoring these SMCs, pKVM : can ensure that the pages that are shared, lent or donated to Secure by : the host kernel are only pages that it owns. KVM: arm64: pkvm: Add support for fragmented FF-A descriptors KVM: arm64: Handle FFA_FEATURES call from the host KVM: arm64: Handle FFA_MEM_LEND calls from the host KVM: arm64: Handle FFA_MEM_RECLAIM calls from the host KVM: arm64: Handle FFA_MEM_SHARE calls from the host KVM: arm64: Add FF-A helpers to share/unshare memory with secure world KVM: arm64: Handle FFA_RXTX_MAP and FFA_RXTX_UNMAP calls from the host KVM: arm64: Allocate pages for hypervisor FF-A mailboxes KVM: arm64: Probe FF-A version and host/hyp partition ID during init KVM: arm64: Block unsafe FF-A calls from the host Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-15 13:02:37 +00:00
Oliver Upton	83510396c0	Merge branch kvm-arm64/eager-page-splitting into kvmarm/next * kvm-arm64/eager-page-splitting: : Eager Page Splitting, courtesy of Ricardo Koller. : : Dirty logging performance is dominated by the cost of splitting : hugepages to PTE granularity. On systems that mere mortals can get their : hands on, each fault incurs the cost of a full break-before-make : pattern, wherein the broadcast invalidation and ensuing serialization : significantly increases fault latency. : : The goal of eager page splitting is to move the cost of hugepage : splitting out of the stage-2 fault path and instead into the ioctls : responsible for managing the dirty log: : : - If manual protection is enabled for the VM, hugepage splitting : happens in the KVM_CLEAR_DIRTY_LOG ioctl. This is desirable as it : provides userspace granular control over hugepage splitting. : : - Otherwise, if userspace relies on the legacy dirty log behavior : (clear on collection), hugepage splitting is done at the moment dirty : logging is enabled for a particular memslot. : : Support for eager page splitting requires explicit opt-in from : userspace, which is realized through the : KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE capability. arm64: kvm: avoid overflow in integer division KVM: arm64: Use local TLBI on permission relaxation KVM: arm64: Split huge pages during KVM_CLEAR_DIRTY_LOG KVM: arm64: Open-code kvm_mmu_write_protect_pt_masked() KVM: arm64: Split huge pages when dirty logging is enabled KVM: arm64: Add kvm_uninit_stage2_mmu() KVM: arm64: Refactor kvm_arch_commit_memory_region() KVM: arm64: Add kvm_pgtable_stage2_split() KVM: arm64: Add KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE KVM: arm64: Export kvm_are_all_memslots_empty() KVM: arm64: Add helper for creating unlinked stage2 subtrees KVM: arm64: Add KVM_PGTABLE_WALK flags for skipping CMOs and BBM TLBIs KVM: arm64: Rename free_removed to free_unlinked Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-15 13:02:11 +00:00
Oliver Upton	686672407e	KVM: arm64: Rip out the vestiges of the 'old' ID register scheme There's no longer a need for the baggage of the old scheme for handling configurable ID register fields. Rip it all out in favor of the generalized infrastructure. Link: https://lore.kernel.org/r/20230609190054.1542113-12-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-15 12:55:35 +00:00
Oliver Upton	6db7af0d5b	KVM: arm64: Handle ID register reads using the VM-wide values Everything is in place now to use the generic ID register infrastructure. Use the VM-wide values to service ID register reads. The ID registers are invariant after the VM has started, so there is no need for locking in that case. This is rather desirable for VM live migration, as the needless lock contention could prolong the VM blackout period. Link: https://lore.kernel.org/r/20230609190054.1542113-11-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-15 12:55:35 +00:00
Jing Zhang	c39f5974d3	KVM: arm64: Use generic sanitisation for ID_AA64PFR0_EL1 KVM allows userspace to write to the CSV2 and CSV3 fields of ID_AA64PFR0_EL1 so long as it doesn't over-promise on the Spectre/Meltdown mitigation state. Switch over to the new way of the world for screening user writes. Leave the old plumbing in place until we actually start handling ID register reads from the VM-wide values. Signed-off-by: Jing Zhang <jingzhangos@google.com> Link: https://lore.kernel.org/r/20230609190054.1542113-10-oliver.upton@linux.dev [Oliver: split from monster patch, added commit description] Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-15 12:55:29 +00:00
Jing Zhang	c118cead07	KVM: arm64: Use generic sanitisation for ID_(AA64)DFR0_EL1 KVM allows userspace to specify a PMU version for the guest by writing to the corresponding ID registers. Currently the validation of these writes is done manuallly, but there's no reason we can't switch over to the generic sanitisation infrastructure. Start screening user writes through arm64_check_features() to prevent userspace from over-promising in terms of vPMU support. Leave the old masking in place for now, as we aren't completely ready to serve reads from the VM-wide values. Signed-off-by: Jing Zhang <jingzhangos@google.com> Link: https://lore.kernel.org/r/20230609190054.1542113-9-oliver.upton@linux.dev [Oliver: split off from monster patch, cleaned up handling of NI vPMU values, wrote commit description] Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-15 12:55:20 +00:00
Jing Zhang	2e8bf0cbd0	KVM: arm64: Use arm64_ftr_bits to sanitise ID register writes Rather than reinventing the wheel in KVM to do ID register sanitisation we can rely on the work already done in the core kernel. Implement a generalized sanitisation of ID registers based on the combination of the arm64_ftr_bits definitions from the core kernel and (optionally) a set of KVM-specific overrides. This all amounts to absolutely nothing for now, but will be used in subsequent changes to realize user-configurable ID registers. Signed-off-by: Jing Zhang <jingzhangos@google.com> Link: https://lore.kernel.org/r/20230609190054.1542113-8-oliver.upton@linux.dev [Oliver: split off from monster patch, rewrote commit description, reworked RAZ handling, return EINVAL to userspace] Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-15 12:55:20 +00:00
Jing Zhang	4733414690	KVM: arm64: Save ID registers' sanitized value per guest Initialize the default ID register values upon the first call to KVM_ARM_VCPU_INIT. The vCPU feature flags are finalized at that point, so it is possible to determine the maximum feature set supported by a particular VM configuration. Do nothing with these values for now, as we need to rework the plumbing of what's already writable to be compatible with the generic infrastructure. Co-developed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Jing Zhang <jingzhangos@google.com> Link: https://lore.kernel.org/r/20230609190054.1542113-7-oliver.upton@linux.dev [Oliver: Hoist everything into KVM_ARM_VCPU_INIT time, so the features are final] Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-15 12:55:08 +00:00
Marc Zyngier	1700f89cb9	KVM: arm64: Fix hVHE init on CPUs where HCR_EL2.E2H is not RES1 On CPUs where E2H is RES1, we very quickly set the scene for running EL2 with a VHE configuration, as we do not have any other choice. However, CPUs that conform to the current writing of the architecture start with E2H=0, and only later upgrade with E2H=1. This is all good, but nothing there is actually reconfiguring EL2 to be able to correctly run the kernel at EL1. Huhuh... The "obvious" solution is not to just reinitialise the timer controls like we do, but to really intitialise everything unconditionally. This requires a bit of surgery, and is a good opportunity to remove the macro that messes with SPSR_EL2 in init_el2_state. With that, hVHE now works correctly on my trusted A55 machine! Reported-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230614155129.2697388-1-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-15 09:27:51 +00:00
Mostafa Saleh	8c15c2a028	KVM: arm64: Use different pointer authentication keys for pKVM When the use of pointer authentication is enabled in the kernel it applies to both the kernel itself as well as KVM's nVHE hypervisor. The same keys are used for both the kernel and the nVHE hypervisor, which is less than desirable for pKVM as the host is not trusted at runtime. Naturally, the fix is to use a different set of keys for the hypervisor when running in protected mode. Have the host generate a new set of keys for the hypervisor before deprivileging the kernel. While there might be other sources of random directly available at EL2, this keeps the implementation simple, and the host is trusted anyways until it is deprivileged. Since the host and hypervisor no longer share a set of pointer authentication keys, start context switching them on the host entry/exit path exactly as we do for guest entry/exit. There is no need to handle CPU migration as the nVHE code is not migratable in the first place. Signed-off-by: Mostafa Saleh <smostafa@google.com> Link: https://lore.kernel.org/r/20230614122600.2098901-1-smostafa@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-14 15:17:32 +00:00
Anshuman Khandual	f170aa51e6	arm64/sysreg: Rename TRBIDR_EL1 fields per auto-gen tools format This renames TRBIDR_EL1 register fields per auto-gen tools format without causing any functional change in the TRBE driver. Cc: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Rob Herring <robh@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: James Morse <james.morse@arm.com> Cc: kvmarm@lists.linux.dev Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20230614065949.146187-8-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2023-06-14 14:37:33 +01:00
Anshuman Khandual	92b1efcd9d	arm64/sysreg: Rename TRBLIMITR_EL1 fields per auto-gen tools format This renames TRBLIMITR_EL1 register fields per auto-gen tools format without causing any functional change in the TRBE driver. Cc: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Rob Herring <robh@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: James Morse <james.morse@arm.com> Cc: kvmarm@lists.linux.dev Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20230614065949.146187-2-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2023-06-14 14:37:32 +01:00
Dan Carpenter	21e87daece	KVM: arm64: timers: Fix resource leaks in kvm_timer_hyp_init() Smatch detected this bug: arch/arm64/kvm/arch_timer.c:1425 kvm_timer_hyp_init() warn: missing unwind goto? There are two resources to be freed the vtimer and ptimer. The line that Smatch complains about should free the vtimer first before returning and then after that cleanup code should free the ptimer. I've added a out_free_ptimer_irq to free the ptimer and renamed the existing label to out_free_vtimer_irq. Fixes: `9e01dc76be` ("KVM: arm/arm64: arch_timer: Assign the phys timer on VHE systems") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/72fffc35-7669-40b1-9d14-113c43269cf3@kili.mountain Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-13 12:06:05 +00:00
Marc Zyngier	38cba55008	KVM: arm64: Force HCR_E2H in guest context when ARM64_KVM_HVHE is set Also make sure HCR_EL2.E2H is set when switching HCR_EL2 in guest context. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230609162200.2024064-16-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-12 23:17:24 +00:00
Marc Zyngier	aca18585db	KVM: arm64: Program the timer traps with VHE layout in hVHE mode Just like the rest of the timer code, we need to shift the enable bits around when HCR_EL2.E2H is set, which is the case in hVHE mode. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230609162200.2024064-15-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-12 23:17:24 +00:00
Marc Zyngier	75c76ab5a6	KVM: arm64: Rework CPTR_EL2 programming for HVHE configuration Just like we repainted the early arm64 code, we need to update the CPTR_EL2 accesses that are taking place in the nVHE code when hVHE is used, making them look as if they were CPACR_EL1 accesses. Just like the VHE code. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230609162200.2024064-14-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-12 23:17:24 +00:00
Marc Zyngier	6537565fd9	KVM: arm64: Adjust EL2 stage-1 leaf AP bits when ARM64_KVM_HVHE is set El2 stage-1 page-table format is subtly (and annoyingly) different when HCR_EL2.E2H is set. Take the ARM64_KVM_HVHE configuration into account when setting the AP bits. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230609162200.2024064-13-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-12 23:17:23 +00:00
Marc Zyngier	cff3b5cf96	KVM: arm64: Disable TTBR1_EL2 when using ARM64_KVM_HVHE When using hVHE, we end-up with two TTBRs at EL2. That's great, but we're not quite ready for this just yet. Disable TTBR1_EL2 by setting TCR_EL2.EPD1 so that we only translate via TTBR0_EL2. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230609162200.2024064-12-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-12 23:17:23 +00:00
Marc Zyngier	d0daf5a21e	KVM: arm64: Force HCR_EL2.E2H when ARM64_KVM_HVHE is set Obviously, in order to be able to use VHE whilst at EL2, we need to set HCR_EL2.E2H. Do so when ARM64_KVM_HVHE is set. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230609162200.2024064-11-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-12 23:17:23 +00:00
Marc Zyngier	9e7462bbe0	arm64: Allow EL1 physical timer access when running VHE To initialise the timer access from EL2 when HCR_EL2.E2H is set, we must make use the CNTHCTL_EL2 formap used is appropriate. This amounts to shifting the timer/counter enable bits by 10 to the left. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20230609162200.2024064-7-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-12 23:17:23 +00:00
Jing Zhang	d86cde6e33	KVM: arm64: Reuse fields of sys_reg_desc for idreg sys_reg_desc::{reset, val} are presently unused for ID register descriptors. Repurpose these fields to support user-configurable ID registers. Use the ::reset() function pointer to return the sanitised value of a given ID register, optionally with KVM-specific feature sanitisation. Additionally, keep a mask of writable register fields in ::val. Signed-off-by: Jing Zhang <jingzhangos@google.com> Link: https://lore.kernel.org/r/20230609190054.1542113-6-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-12 23:08:33 +00:00
Oliver Upton	f90f9360c3	KVM: arm64: Rewrite IMPDEF PMU version as NI KVM allows userspace to write an IMPDEF PMU version to the corresponding 32bit and 64bit ID register fields for the sake of backwards compatibility with kernels that lacked commit `3d0dba5764` ("KVM: arm64: PMU: Move the ID_AA64DFR0_EL1.PMUver limit to VM creation"). Plumbing that IMPDEF PMU version through to the gues is getting in the way of progress, and really doesn't any sense in the first place. Bite the bullet and reinterpret the IMPDEF PMU version as NI (0) for userspace writes. Additionally, spill the dirty details into a comment. Link: https://lore.kernel.org/r/20230609190054.1542113-5-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-12 23:08:33 +00:00
Oliver Upton	2251e9ff15	KVM: arm64: Make vCPU feature flags consistent VM-wide To date KVM has allowed userspace to construct asymmetric VMs where particular features may only be supported on a subset of vCPUs. This wasn't really the intened usage pattern, and it is a total pain in the ass to keep working in the kernel. What's more, this is at odds with CPU features in host userspace, where asymmetric features are largely hidden or disabled. It's time to put an end to the whole game. Require all vCPUs in the VM to have the same feature set, rejecting deviants in the KVM_ARM_VCPU_INIT ioctl. Preserve some of the vestiges of per-vCPU feature flags in case we need to reinstate the old behavior for some limited configurations. Yes, this is a sign of cowardice around a user-visibile change. Hoist all of the 32-bit limitations into kvm_vcpu_init_check_features() to avoid nested attempts to acquire the config_lock, which won't end well. Link: https://lore.kernel.org/r/20230609190054.1542113-4-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-12 23:08:33 +00:00
Oliver Upton	e3c1c0cae3	KVM: arm64: Relax invariance of KVM_ARM_VCPU_POWER_OFF Allow the value of KVM_ARM_VCPU_POWER_OFF to differ between calls to KVM_ARM_VCPU_INIT. Userspace can already change the state of the vCPU through the KVM_SET_MP_STATE ioctl, so making the bit invariant seems needlessly restrictive. Link: https://lore.kernel.org/r/20230609190054.1542113-3-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-12 23:08:33 +00:00

1 2 3 4 5 ...

2097 Commits