linux-loongson

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson synced 2025-09-04 18:49:41 +00:00

Author	SHA1	Message	Date
Jing Zhang	4733414690	KVM: arm64: Save ID registers' sanitized value per guest Initialize the default ID register values upon the first call to KVM_ARM_VCPU_INIT. The vCPU feature flags are finalized at that point, so it is possible to determine the maximum feature set supported by a particular VM configuration. Do nothing with these values for now, as we need to rework the plumbing of what's already writable to be compatible with the generic infrastructure. Co-developed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Jing Zhang <jingzhangos@google.com> Link: https://lore.kernel.org/r/20230609190054.1542113-7-oliver.upton@linux.dev [Oliver: Hoist everything into KVM_ARM_VCPU_INIT time, so the features are final] Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-15 12:55:08 +00:00
Oliver Upton	f90f9360c3	KVM: arm64: Rewrite IMPDEF PMU version as NI KVM allows userspace to write an IMPDEF PMU version to the corresponding 32bit and 64bit ID register fields for the sake of backwards compatibility with kernels that lacked commit `3d0dba5764` ("KVM: arm64: PMU: Move the ID_AA64DFR0_EL1.PMUver limit to VM creation"). Plumbing that IMPDEF PMU version through to the gues is getting in the way of progress, and really doesn't any sense in the first place. Bite the bullet and reinterpret the IMPDEF PMU version as NI (0) for userspace writes. Additionally, spill the dirty details into a comment. Link: https://lore.kernel.org/r/20230609190054.1542113-5-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-12 23:08:33 +00:00
Oliver Upton	2251e9ff15	KVM: arm64: Make vCPU feature flags consistent VM-wide To date KVM has allowed userspace to construct asymmetric VMs where particular features may only be supported on a subset of vCPUs. This wasn't really the intened usage pattern, and it is a total pain in the ass to keep working in the kernel. What's more, this is at odds with CPU features in host userspace, where asymmetric features are largely hidden or disabled. It's time to put an end to the whole game. Require all vCPUs in the VM to have the same feature set, rejecting deviants in the KVM_ARM_VCPU_INIT ioctl. Preserve some of the vestiges of per-vCPU feature flags in case we need to reinstate the old behavior for some limited configurations. Yes, this is a sign of cowardice around a user-visibile change. Hoist all of the 32-bit limitations into kvm_vcpu_init_check_features() to avoid nested attempts to acquire the config_lock, which won't end well. Link: https://lore.kernel.org/r/20230609190054.1542113-4-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-12 23:08:33 +00:00
Oliver Upton	a7a2c72ae0	KVM: arm64: Separate out feature sanitisation and initialisation kvm_vcpu_set_target() iteratively sanitises and copies feature flags in one go. This is rather odd, especially considering the fact that bitmap accessors can do the heavy lifting. A subsequent change will make vCPU features VM-wide, and fitting that into the present implementation is just a chore. Rework the whole thing to use bitmap accessors to sanitise and copy flags. Link: https://lore.kernel.org/r/20230609190054.1542113-2-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-12 23:08:33 +00:00
Mark Brown	187de7c2aa	arm64/sysreg: Standardise naming of bitfield constants in OSL[AS]R_EL1 Our standard scheme for naming the constants for bitfields in system registers includes _ELx in the name but not the SYS_, update the constants for OSL[AS]R_EL1 to follow this convention. Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20230419-arm64-syreg-gen-v2-3-4c6add1f6257@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2023-06-06 17:43:08 +01:00
Joey Gouly	86f9de9db1	KVM: arm64: Save/restore PIE registers Define the new system registers that PIE introduces and context switch them. The PIE feature is still hidden from the ID register, and not exposed to a VM. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: James Morse <james.morse@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Zenghui Yu <yuzenghui@huawei.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230606145859.697944-10-joey.gouly@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2023-06-06 16:52:40 +01:00
Joey Gouly	fbff560682	KVM: arm64: Save/restore TCR2_EL1 Define the new system register TCR2_EL1 and context switch it. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: James Morse <james.morse@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Zenghui Yu <yuzenghui@huawei.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230606145859.697944-9-joey.gouly@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2023-06-06 16:52:40 +01:00
Reiji Watanabe	0c2f9acf6a	KVM: arm64: PMU: Don't overwrite PMUSERENR with vcpu loaded Currently, with VHE, KVM sets ER, CR, SW and EN bits of PMUSERENR_EL0 to 1 on vcpu_load(), and saves and restores the register value for the host on vcpu_load() and vcpu_put(). If the value of those bits are cleared on a pCPU with a vCPU loaded (armv8pmu_start() would do that when PMU counters are programmed for the guest), PMU access from the guest EL0 might be trapped to the guest EL1 directly regardless of the current PMUSERENR_EL0 value of the vCPU. Fix this by not letting armv8pmu_start() overwrite PMUSERENR_EL0 on the pCPU where PMUSERENR_EL0 for the guest is loaded, and instead updating the saved shadow register value for the host so that the value can be restored on vcpu_put() later. While vcpu_{put,load}() are manipulating PMUSERENR_EL0, disable IRQs to prevent a race condition between these processes and IPIs that attempt to update PMUSERENR_EL0 for the host EL0. Suggested-by: Mark Rutland <mark.rutland@arm.com> Suggested-by: Marc Zyngier <maz@kernel.org> Fixes: `83a7a4d643` ("arm64: perf: Enable PMU counter userspace access for perf event") Signed-off-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230603025035.3781797-3-reijiw@google.com	2023-06-04 17:19:36 +01:00
Will Deacon	12bdce4f41	KVM: arm64: Probe FF-A version and host/hyp partition ID during init Probe FF-A during pKVM initialisation so that we can detect any inconsistencies in the version or partition ID early on. Signed-off-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230523101828.7328-3-will@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-01 21:34:50 +00:00
Ricardo Koller	2f440b72e8	KVM: arm64: Add KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE Add a capability for userspace to specify the eager split chunk size. The chunk size specifies how many pages to break at a time, using a single allocation. Bigger the chunk size, more pages need to be allocated ahead of time. Suggested-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Link: https://lore.kernel.org/r/20230426172330.1439644-6-ricarkol@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-05-16 17:39:18 +00:00
Linus Torvalds	c8c655c34e	s390: * More phys_to_virt conversions * Improvement of AP management for VSIE (nested virtualization) ARM64: * Numerous fixes for the pathological lock inversion issue that plagued KVM/arm64 since... forever. * New framework allowing SMCCC-compliant hypercalls to be forwarded to userspace, hopefully paving the way for some more features being moved to VMMs rather than be implemented in the kernel. * Large rework of the timer code to allow a VM-wide offset to be applied to both virtual and physical counters as well as a per-timer, per-vcpu offset that complements the global one. This last part allows the NV timer code to be implemented on top. * A small set of fixes to make sure that we don't change anything affecting the EL1&0 translation regime just after having having taken an exception to EL2 until we have executed a DSB. This ensures that speculative walks started in EL1&0 have completed. * The usual selftest fixes and improvements. KVM x86 changes for 6.4: * Optimize CR0.WP toggling by avoiding an MMU reload when TDP is enabled, and by giving the guest control of CR0.WP when EPT is enabled on VMX (VMX-only because SVM doesn't support per-bit controls) * Add CR0/CR4 helpers to query single bits, and clean up related code where KVM was interpreting kvm_read_cr4_bits()'s "unsigned long" return as a bool * Move AMD_PSFD to cpufeatures.h and purge KVM's definition * Avoid unnecessary writes+flushes when the guest is only adding new PTEs * Overhaul .sync_page() and .invlpg() to utilize .sync_page()'s optimizations when emulating invalidations * Clean up the range-based flushing APIs * Revamp the TDP MMU's reaping of Accessed/Dirty bits to clear a single A/D bit using a LOCK AND instead of XCHG, and skip all of the "handle changed SPTE" overhead associated with writing the entire entry * Track the number of "tail" entries in a pte_list_desc to avoid having to walk (potentially) all descriptors during insertion and deletion, which gets quite expensive if the guest is spamming fork() * Disallow virtualizing legacy LBRs if architectural LBRs are available, the two are mutually exclusive in hardware * Disallow writes to immutable feature MSRs (notably PERF_CAPABILITIES) after KVM_RUN, similar to CPUID features * Overhaul the vmx_pmu_caps selftest to better validate PERF_CAPABILITIES * Apply PMU filters to emulated events and add test coverage to the pmu_event_filter selftest x86 AMD: * Add support for virtual NMIs * Fixes for edge cases related to virtual interrupts x86 Intel: * Don't advertise XTILE_CFG in KVM_GET_SUPPORTED_CPUID if XTILE_DATA is not being reported due to userspace not opting in via prctl() * Fix a bug in emulation of ENCLS in compatibility mode * Allow emulation of NOP and PAUSE for L2 * AMX selftests improvements * Misc cleanups MIPS: * Constify MIPS's internal callbacks (a leftover from the hardware enabling rework that landed in 6.3) Generic: * Drop unnecessary casts from "void " throughout kvm_main.c Tweak the layout of "struct kvm_mmu_memory_cache" to shrink the struct size by 8 bytes on 64-bit kernels by utilizing a padding hole Documentation: * Fix goof introduced by the conversion to rST -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmRNExkUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroNyjwf+MkzDael9y9AsOZoqhEZ5OsfQYJ32 Im5ZVYsPRU2K5TuoWql6meIihgclCj1iIU32qYHa2F1WYt2rZ72rJp+HoY8b+TaI WvF0pvNtqQyg3iEKUBKPA4xQ6mj7RpQBw86qqiCHmlfNt0zxluEGEPxH8xrWcfhC huDQ+NUOdU7fmJ3rqGitCvkUbCuZNkw3aNPR8dhU8RAWrwRzP2hBOmdxIeo81WWY XMEpJSijbGpXL9CvM0Jz9nOuMJwZwCCBGxg1vSQq0xTfLySNMxzvWZC2GFaBjucb j0UOQ7yE0drIZDVhd3sdNslubXXU6FcSEzacGQb9aigMUon3Tem9SHi7Kw== =S2Hq -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "s390: - More phys_to_virt conversions - Improvement of AP management for VSIE (nested virtualization) ARM64: - Numerous fixes for the pathological lock inversion issue that plagued KVM/arm64 since... forever. - New framework allowing SMCCC-compliant hypercalls to be forwarded to userspace, hopefully paving the way for some more features being moved to VMMs rather than be implemented in the kernel. - Large rework of the timer code to allow a VM-wide offset to be applied to both virtual and physical counters as well as a per-timer, per-vcpu offset that complements the global one. This last part allows the NV timer code to be implemented on top. - A small set of fixes to make sure that we don't change anything affecting the EL1&0 translation regime just after having having taken an exception to EL2 until we have executed a DSB. This ensures that speculative walks started in EL1&0 have completed. - The usual selftest fixes and improvements. x86: - Optimize CR0.WP toggling by avoiding an MMU reload when TDP is enabled, and by giving the guest control of CR0.WP when EPT is enabled on VMX (VMX-only because SVM doesn't support per-bit controls) - Add CR0/CR4 helpers to query single bits, and clean up related code where KVM was interpreting kvm_read_cr4_bits()'s "unsigned long" return as a bool - Move AMD_PSFD to cpufeatures.h and purge KVM's definition - Avoid unnecessary writes+flushes when the guest is only adding new PTEs - Overhaul .sync_page() and .invlpg() to utilize .sync_page()'s optimizations when emulating invalidations - Clean up the range-based flushing APIs - Revamp the TDP MMU's reaping of Accessed/Dirty bits to clear a single A/D bit using a LOCK AND instead of XCHG, and skip all of the "handle changed SPTE" overhead associated with writing the entire entry - Track the number of "tail" entries in a pte_list_desc to avoid having to walk (potentially) all descriptors during insertion and deletion, which gets quite expensive if the guest is spamming fork() - Disallow virtualizing legacy LBRs if architectural LBRs are available, the two are mutually exclusive in hardware - Disallow writes to immutable feature MSRs (notably PERF_CAPABILITIES) after KVM_RUN, similar to CPUID features - Overhaul the vmx_pmu_caps selftest to better validate PERF_CAPABILITIES - Apply PMU filters to emulated events and add test coverage to the pmu_event_filter selftest - AMD SVM: - Add support for virtual NMIs - Fixes for edge cases related to virtual interrupts - Intel AMX: - Don't advertise XTILE_CFG in KVM_GET_SUPPORTED_CPUID if XTILE_DATA is not being reported due to userspace not opting in via prctl() - Fix a bug in emulation of ENCLS in compatibility mode - Allow emulation of NOP and PAUSE for L2 - AMX selftests improvements - Misc cleanups MIPS: - Constify MIPS's internal callbacks (a leftover from the hardware enabling rework that landed in 6.3) Generic: - Drop unnecessary casts from "void " throughout kvm_main.c - Tweak the layout of "struct kvm_mmu_memory_cache" to shrink the struct size by 8 bytes on 64-bit kernels by utilizing a padding hole Documentation: - Fix goof introduced by the conversion to rST" tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (211 commits) KVM: s390: pci: fix virtual-physical confusion on module unload/load KVM: s390: vsie: clarifications on setting the APCB KVM: s390: interrupt: fix virtual-physical confusion for next alert GISA KVM: arm64: Have kvm_psci_vcpu_on() use WRITE_ONCE() to update mp_state KVM: arm64: Acquire mp_state_lock in kvm_arch_vcpu_ioctl_vcpu_init() KVM: selftests: Test the PMU event "Instructions retired" KVM: selftests: Copy full counter values from guest in PMU event filter test KVM: selftests: Use error codes to signal errors in PMU event filter test KVM: selftests: Print detailed info in PMU event filter asserts KVM: selftests: Add helpers for PMC asserts in PMU event filter test KVM: selftests: Add a common helper for the PMU event filter guest code KVM: selftests: Fix spelling mistake "perrmited" -> "permitted" KVM: arm64: vhe: Drop extra isb() on guest exit KVM: arm64: vhe: Synchronise with page table walker on MMU update KVM: arm64: pkvm: Document the side effects of kvm_flush_dcache_to_poc() KVM: arm64: nvhe: Synchronise with page table walker on TLBI KVM: arm64: Handle 32bit CNTPCTSS traps KVM: arm64: nvhe: Synchronise with page table walker on vcpu run KVM: arm64: vgic: Don't acquire its_lock before config_lock KVM: selftests: Add test to verify KVM's supported XCR0 ...	2023-05-01 12:06:20 -07:00
Paolo Bonzini	4f382a79a6	KVM/arm64 updates for 6.4 - Numerous fixes for the pathological lock inversion issue that plagued KVM/arm64 since... forever. - New framework allowing SMCCC-compliant hypercalls to be forwarded to userspace, hopefully paving the way for some more features being moved to VMMs rather than be implemented in the kernel. - Large rework of the timer code to allow a VM-wide offset to be applied to both virtual and physical counters as well as a per-timer, per-vcpu offset that complements the global one. This last part allows the NV timer code to be implemented on top. - A small set of fixes to make sure that we don't change anything affecting the EL1&0 translation regime just after having having taken an exception to EL2 until we have executed a DSB. This ensures that speculative walks started in EL1&0 have completed. - The usual selftest fixes and improvements. -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmRCZIwPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpDoZ8P/ioXAdDbAE4hTuyD2YdKJ3IGWN3pg52Z7xc2 rBXXFrbK9+n9FEc3AVdHoGsRPDP0Ynl+apj+aB0Klr/Fl0KKqac+W0ARX9rn1mI1 HjeygFPaGnXjMUp0BjeSLS+g3b0gebELJ6R1QEe1/MIPb8Se7M1y3ZpMWdhe0PPL vyzw3LZq2OAlLgWKZhAfhh03qdr2kqJxypYs6nMrcexfn8dXT78dsYKW1nXmqKcE 61Gg23MDPUoexYpUhm+ym5t8hltoI1di8faPmxEpaFzpSDyAg8V5vo6LiW9jn3cf RX0Sikk1laiRAhVbbIFCKC148vFyKxum3scpKyb91Qc+sK1kmIcxvEqlc6SfG9je +5ndZwAfXtW6SMSOyX8y5fXbee7M0sx3n3le9BNgwXfmLWg/GHXJ544dJgVIlf/e 0Z+8QnP1IUDfARR/b2FlW7A7XLzNHQzO379ekcAdUptbGwlS9CrW6SJ83QR7K6fB bh0aSSELKsD7pX8wnNyNACvmz2zL12ITlDKdZWUr8MSxyTjgVy7s0BDsQT3sbrA1 1sH++RvUWfC2k7tVT3vjZFzUDlPw3bnZmo5YMWRTMbXEdr1V5rDw5F5IXit13KeT 8bk0hnJgnLmyoX2A17v5dkFMIKD7p13tqDRdfFcn0ru63HIKxgkS3ITkDmsAQELK DHT7RBE0 =Bhta -----END PGP SIGNATURE----- Merge tag 'kvmarm-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 6.4 - Numerous fixes for the pathological lock inversion issue that plagued KVM/arm64 since... forever. - New framework allowing SMCCC-compliant hypercalls to be forwarded to userspace, hopefully paving the way for some more features being moved to VMMs rather than be implemented in the kernel. - Large rework of the timer code to allow a VM-wide offset to be applied to both virtual and physical counters as well as a per-timer, per-vcpu offset that complements the global one. This last part allows the NV timer code to be implemented on top. - A small set of fixes to make sure that we don't change anything affecting the EL1&0 translation regime just after having having taken an exception to EL2 until we have executed a DSB. This ensures that speculative walks started in EL1&0 have completed. - The usual selftest fixes and improvements.	2023-04-26 15:46:52 -04:00
Marc Zyngier	6dcf7316e0	Merge branch kvm-arm64/smccc-filtering into kvmarm-master/next * kvm-arm64/smccc-filtering: : . : SMCCC call filtering and forwarding to userspace, courtesy of : Oliver Upton. From the cover letter: : : "The Arm SMCCC is rather prescriptive in regards to the allocation of : SMCCC function ID ranges. Many of the hypercall ranges have an : associated specification from Arm (FF-A, PSCI, SDEI, etc.) with some : room for vendor-specific implementations. : : The ever-expanding SMCCC surface leaves a lot of work within KVM for : providing new features. Furthermore, KVM implements its own : vendor-specific ABI, with little room for other implementations (like : Hyper-V, for example). Rather than cramming it all into the kernel we : should provide a way for userspace to handle hypercalls." : . KVM: selftests: Fix spelling mistake "KVM_HYPERCAL_EXIT_SMC" -> "KVM_HYPERCALL_EXIT_SMC" KVM: arm64: Test that SMC64 arch calls are reserved KVM: arm64: Prevent userspace from handling SMC64 arch range KVM: arm64: Expose SMC/HVC width to userspace KVM: selftests: Add test for SMCCC filter KVM: selftests: Add a helper for SMCCC calls with SMC instruction KVM: arm64: Let errors from SMCCC emulation to reach userspace KVM: arm64: Return NOT_SUPPORTED to guest for unknown PSCI version KVM: arm64: Introduce support for userspace SMCCC filtering KVM: arm64: Add support for KVM_EXIT_HYPERCALL KVM: arm64: Use a maple tree to represent the SMCCC filter KVM: arm64: Refactor hvc filtering to support different actions KVM: arm64: Start handling SMCs from EL1 KVM: arm64: Rename SMC/HVC call handler to reflect reality KVM: arm64: Add vm fd device attribute accessors KVM: arm64: Add a helper to check if a VM has ran once KVM: x86: Redefine 'longmode' as a flag for KVM_EXIT_HYPERCALL Signed-off-by: Marc Zyngier <maz@kernel.org>	2023-04-21 09:44:32 +01:00
Marc Zyngier	b22498c484	Merge branch kvm-arm64/timer-vm-offsets into kvmarm-master/next * kvm-arm64/timer-vm-offsets: (21 commits) : . : This series aims at satisfying multiple goals: : : - allow a VMM to atomically restore a timer offset for a whole VM : instead of updating the offset each time a vcpu get its counter : written : : - allow a VMM to save/restore the physical timer context, something : that we cannot do at the moment due to the lack of offsetting : : - provide a framework that is suitable for NV support, where we get : both global and per timer, per vcpu offsetting, and manage : interrupts in a less braindead way. : : Conflict resolution involves using the new per-vcpu config lock instead : of the home-grown timer lock. : . KVM: arm64: Handle 32bit CNTPCTSS traps KVM: arm64: selftests: Augment existing timer test to handle variable offset KVM: arm64: selftests: Deal with spurious timer interrupts KVM: arm64: selftests: Add physical timer registers to the sysreg list KVM: arm64: nv: timers: Support hyp timer emulation KVM: arm64: nv: timers: Add a per-timer, per-vcpu offset KVM: arm64: Document KVM_ARM_SET_CNT_OFFSETS and co KVM: arm64: timers: Abstract the number of valid timers per vcpu KVM: arm64: timers: Fast-track CNTPCT_EL0 trap handling KVM: arm64: Elide kern_hyp_va() in VHE-specific parts of the hypervisor KVM: arm64: timers: Move the timer IRQs into arch_timer_vm_data KVM: arm64: timers: Abstract per-timer IRQ access KVM: arm64: timers: Rationalise per-vcpu timer init KVM: arm64: timers: Allow save/restoring of the physical timer KVM: arm64: timers: Allow userspace to set the global counter offset KVM: arm64: Expose {un,}lock_all_vcpus() to the rest of KVM KVM: arm64: timers: Allow physical offset without CNTPOFF_EL2 KVM: arm64: timers: Use CNTPOFF_EL2 to offset the physical timer arm64: Add HAS_ECV_CNTPOFF capability arm64: Add CNTPOFF_EL2 register definition ... Signed-off-by: Marc Zyngier <maz@kernel.org>	2023-04-21 09:36:40 +01:00
Marc Zyngier	35dcb3ac66	KVM: arm64: Make vcpu flag updates non-preemptible Per-vcpu flags are updated using a non-atomic RMW operation. Which means it is possible to get preempted between the read and write operations. Another interesting thing to note is that preemption also updates flags, as we have some flag manipulation in both the load and put operations. It is thus possible to lose information communicated by either load or put, as the preempted flag update will overwrite the flags when the thread is resumed. This is specially critical if either load or put has stored information which depends on the physical CPU the vcpu runs on. This results in really elusive bugs, and kudos must be given to Mostafa for the long hours of debugging, and finally spotting the problem. Fix it by disabling preemption during the RMW operation, which ensures that the state stays consistent. Also upgrade vcpu_get_flag path to use READ_ONCE() to make sure the field is always atomically accessed. Fixes: `e87abb73e5` ("KVM: arm64: Add helpers to manipulate vcpu flags among a set") Reported-by: Mostafa Saleh <smostafa@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230418125737.2327972-1-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-04-18 17:08:09 +00:00
Oliver Upton	fb88707dd3	KVM: arm64: Use a maple tree to represent the SMCCC filter Maple tree is an efficient B-tree implementation that is intended for storing non-overlapping intervals. Such a data structure is a good fit for the SMCCC filter as it is desirable to sparsely allocate the 32 bit function ID space. To that end, add a maple tree to kvm_arch and correctly init/teardown along with the VM. Wire in a test against the hypercall filter for HVCs which does nothing until the controls are exposed to userspace. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230404154050.2270077-8-oliver.upton@linux.dev	2023-04-05 12:07:41 +01:00
Oliver Upton	de40bb8abb	KVM: arm64: Add a helper to check if a VM has ran once The test_bit(...) pattern is quite a lot of keystrokes. Replace existing callsites with a helper. No functional change intended. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230404154050.2270077-3-oliver.upton@linux.dev	2023-04-05 12:07:41 +01:00
Marc Zyngier	81dc9504a7	KVM: arm64: nv: timers: Support hyp timer emulation Emulating EL2 also means emulating the EL2 timers. To do so, we expand our timer framework to deal with at most 4 timers. At any given time, two timers are using the HW timers, and the two others are purely emulated. The role of deciding which is which at any given time is left to a mapping function which is called every time we need to make such a decision. Reviewed-by: Colton Lewis <coltonlewis@google.com> Co-developed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230330174800.2677007-18-maz@kernel.org	2023-03-30 19:01:10 +01:00
Marc Zyngier	8a5eb2d210	KVM: arm64: timers: Move the timer IRQs into arch_timer_vm_data Having the timer IRQs duplicated into each vcpu isn't great, and becomes absolutely awful with NV. So let's move these into the per-VM arch_timer_vm_data structure. This simplifies a lot of code, but requires us to introduce a mutex so that we can reason about userspace trying to change an interrupt number while another vcpu is running, something that wasn't really well handled so far. Reviewed-by: Colton Lewis <coltonlewis@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230330174800.2677007-12-maz@kernel.org	2023-03-30 19:01:10 +01:00
Marc Zyngier	30ec7997d1	KVM: arm64: timers: Allow userspace to set the global counter offset And this is the moment you have all been waiting for: setting the counter offset from userspace. We expose a brand new capability that reports the ability to set the offset for both the virtual and physical sides. In keeping with the architecture, the offset is expressed as a delta that is substracted from the physical counter value. Once this new API is used, there is no going back, and the counters cannot be written to to set the offsets implicitly (the writes are instead ignored). Reviewed-by: Colton Lewis <coltonlewis@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230330174800.2677007-8-maz@kernel.org	2023-03-30 19:01:10 +01:00
Marc Zyngier	96906a9150	KVM: arm64: Expose {un,}lock_all_vcpus() to the rest of KVM Being able to lock/unlock all vcpus in one go is a feature that only the vgic has enjoyed so far. Let's be brave and expose it to the world. Reviewed-by: Colton Lewis <coltonlewis@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230330174800.2677007-7-maz@kernel.org	2023-03-30 19:01:09 +01:00
Oliver Upton	c43120afb5	KVM: arm64: Avoid lock inversion when setting the VM register width kvm->lock must be taken outside of the vcpu->mutex. Of course, the locking documentation for KVM makes this abundantly clear. Nonetheless, the locking order in KVM/arm64 has been wrong for quite a while; we acquire the kvm->lock while holding the vcpu->mutex all over the shop. All was seemingly fine until commit `42a90008f8` ("KVM: Ensure lockdep knows about kvm->lock vs. vcpu->mutex ordering rule") caught us with our pants down, leading to lockdep barfing: ====================================================== WARNING: possible circular locking dependency detected 6.2.0-rc7+ #19 Not tainted ------------------------------------------------------ qemu-system-aar/859 is trying to acquire lock: ffff5aa69269eba0 (&host_kvm->lock){+.+.}-{3:3}, at: kvm_reset_vcpu+0x34/0x274 but task is already holding lock: ffff5aa68768c0b8 (&vcpu->mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0x8c/0xba0 which lock already depends on the new lock. Add a dedicated lock to serialize writes to VM-scoped configuration from the context of a vCPU. Protect the register width flags with the new lock, thus avoiding the need to grab the kvm->lock while holding vcpu->mutex in kvm_reset_vcpu(). Cc: stable@vger.kernel.org Reported-by: Jeremy Linton <jeremy.linton@arm.com> Link: https://lore.kernel.org/kvmarm/f6452cdd-65ff-34b8-bab0-5c06416da5f6@arm.com/ Tested-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230327164747.2466958-3-oliver.upton@linux.dev	2023-03-29 14:08:31 +01:00
Oliver Upton	0acc7239c2	KVM: arm64: Avoid vcpu->mutex v. kvm->lock inversion in CPU_ON KVM/arm64 had the lock ordering backwards on vcpu->mutex and kvm->lock from the very beginning. One such example is the way vCPU resets are handled: the kvm->lock is acquired while handling a guest CPU_ON PSCI call. Add a dedicated lock to serialize writes to kvm_vcpu_arch::{mp_state, reset_state}. Promote all accessors of mp_state to {READ,WRITE}_ONCE() as readers do not acquire the mp_state_lock. While at it, plug yet another race by taking the mp_state_lock in the KVM_SET_MP_STATE ioctl handler. As changes to MP state are now guarded with a dedicated lock, drop the kvm->lock acquisition from the PSCI CPU_ON path. Similarly, move the reader of reset_state outside of the kvm->lock and instead protect it with the mp_state_lock. Note that writes to reset_state::reset have been demoted to regular stores as both readers and writers acquire the mp_state_lock. While the kvm->lock inversion still exists in kvm_reset_vcpu(), at least now PSCI CPU_ON no longer depends on it for serializing vCPU reset. Cc: stable@vger.kernel.org Tested-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230327164747.2466958-2-oliver.upton@linux.dev	2023-03-29 14:08:31 +01:00
Thomas Huth	2def950c63	KVM: arm64: Limit length in kvm_vm_ioctl_mte_copy_tags() to INT_MAX In case of success, this function returns the amount of handled bytes. However, this does not work for large values: The function is called from kvm_arch_vm_ioctl() (which still returns a long), which in turn is called from kvm_vm_ioctl() in virt/kvm/kvm_main.c. And that function stores the return value in an "int r" variable. So the upper 32-bits of the "long" return value are lost there. KVM ioctl functions should only return "int" values, so let's limit the amount of bytes that can be requested here to INT_MAX to avoid the problem with the truncated return value. We can then also change the return type of the function to "int" to make it clearer that it is not possible to return a "long" here. Fixes: `f0376edb1d` ("KVM: arm64: Add ioctl to fetch/store tags in a guest") Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Steven Price <steven.price@arm.com> Message-Id: <20230208140105.655814-5-thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-03-16 10:18:06 -04:00
Marc Zyngier	47053904e1	KVM: arm64: timers: Convert per-vcpu virtual offset to a global value Having a per-vcpu virtual offset is a pain. It needs to be synchronized on each update, and expands badly to a setup where different timers can have different offsets, or have composite offsets (as with NV). So let's start by replacing the use of the CNTVOFF_EL2 shadow register (which we want to reclaim for NV anyway), and make the virtual timer carry a pointer to a VM-wide offset. This simplifies the code significantly. It also addresses two terrible bugs: - The use of CNTVOFF_EL2 leads to some nice offset corruption when the sysreg gets reset, as reported by Joey. - The kvm mutex is taken from a vcpu ioctl, which goes against the locking rules... Reported-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230224173915.GA17407@e124191.cambridge.arm.com Tested-by: Joey Gouly <joey.gouly@arm.com> Link: https://lore.kernel.org/r/20230224191640.3396734-1-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-03-11 02:00:40 -08:00
Oliver Upton	0d3b2b4d23	Merge branch kvm-arm64/nv-prefix into kvmarm/next * kvm-arm64/nv-prefix: : Preamble to NV support, courtesy of Marc Zyngier. : : This brings in a set of prerequisite patches for supporting nested : virtualization in KVM/arm64. Of course, there is a long way to go until : NV is actually enabled in KVM. : : - Introduce cpucap / vCPU feature flag to pivot the NV code on : : - Add support for EL2 vCPU register state : : - Basic nested exception handling : : - Hide unsupported features from the ID registers for NV-capable VMs KVM: arm64: nv: Use reg_to_encoding() to get sysreg ID KVM: arm64: nv: Only toggle cache for virtual EL2 when SCTLR_EL2 changes KVM: arm64: nv: Filter out unsupported features from ID regs KVM: arm64: nv: Emulate EL12 register accesses from the virtual EL2 KVM: arm64: nv: Allow a sysreg to be hidden from userspace only KVM: arm64: nv: Emulate PSTATE.M for a guest hypervisor KVM: arm64: nv: Add accessors for SPSR_EL1, ELR_EL1 and VBAR_EL1 from virtual EL2 KVM: arm64: nv: Handle SMCs taken from virtual EL2 KVM: arm64: nv: Handle trapped ERET from virtual EL2 KVM: arm64: nv: Inject HVC exceptions to the virtual EL2 KVM: arm64: nv: Support virtual EL2 exceptions KVM: arm64: nv: Handle HCR_EL2.NV system register traps KVM: arm64: nv: Add nested virt VCPU primitives for vEL2 VCPU state KVM: arm64: nv: Add EL2 system registers to vcpu context KVM: arm64: nv: Allow userspace to set PSR_MODE_EL2x KVM: arm64: nv: Reset VCPU to EL2 registers if VCPU nested virt is set KVM: arm64: nv: Introduce nested virtualization VCPU feature KVM: arm64: Use the S2 MMU context to iterate over S2 table arm64: Add ARM64_HAS_NESTED_VIRT cpufeature Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-02-13 23:33:41 +00:00
Oliver Upton	022d3f0800	Merge branch kvm-arm64/misc into kvmarm/next * kvm-arm64/misc: : Miscellaneous updates : : - Convert CPACR_EL1_TTA to the new, generated system register : definitions. : : - Serialize toggling CPACR_EL1.SMEN to avoid unexpected exceptions when : accessing SVCR in the host. : : - Avoid quiescing the guest if a vCPU accesses its own redistributor's : SGIs/PPIs, eliminating the need to IPI. Largely an optimization for : nested virtualization, as the L1 accesses the affected registers : rather often. : : - Conversion to kstrtobool() : : - Common definition of INVALID_GPA across architectures : : - Enable CONFIG_USERFAULTFD for CI runs of KVM selftests KVM: arm64: Fix non-kerneldoc comments KVM: selftests: Enable USERFAULTFD KVM: selftests: Remove redundant setbuf() arm64/sysreg: clean up some inconsistent indenting KVM: MMU: Make the definition of 'INVALID_GPA' common KVM: arm64: vgic-v3: Use kstrtobool() instead of strtobool() KVM: arm64: vgic-v3: Limit IPI-ing when accessing GICR_{C,S}ACTIVER0 KVM: arm64: Synchronize SMEN on vcpu schedule out KVM: arm64: Kill CPACR_EL1_TTA definition Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-02-13 23:33:25 +00:00
Oliver Upton	e8789ab704	Merge branch kvm-arm64/virtual-cache-geometry into kvmarm/next * kvm-arm64/virtual-cache-geometry: : Virtualized cache geometry for KVM guests, courtesy of Akihiko Odaki. : : KVM/arm64 has always exposed the host cache geometry directly to the : guest, even though non-secure software should never perform CMOs by : Set/Way. This was slightly wrong, as the cache geometry was derived from : the PE on which the vCPU thread was running and not a sanitized value. : : All together this leads to issues migrating VMs on heterogeneous : systems, as the cache geometry saved/restored could be inconsistent. : : KVM/arm64 now presents 1 level of cache with 1 set and 1 way. The cache : geometry is entirely controlled by userspace, such that migrations from : older kernels continue to work. KVM: arm64: Mark some VM-scoped allocations as __GFP_ACCOUNT KVM: arm64: Normalize cache configuration KVM: arm64: Mask FEAT_CCIDX KVM: arm64: Always set HCR_TID2 arm64/cache: Move CLIDR macro definitions arm64/sysreg: Add CCSIDR2_EL1 arm64/sysreg: Convert CCSIDR_EL1 to automatic generation arm64: Allow the definition of UNKNOWN system register fields Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-02-13 22:32:40 +00:00
Marc Zyngier	d9552fe133	KVM: arm64: nv: Emulate PSTATE.M for a guest hypervisor We can no longer blindly copy the VCPU's PSTATE into SPSR_EL2 and return to the guest and vice versa when taking an exception to the hypervisor, because we emulate virtual EL2 in EL1 and therefore have to translate the mode field from EL2 to EL1 and vice versa. This requires keeping track of the state we enter the guest, for which we transiently use a dedicated flag. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230209175820.1939006-15-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-02-11 10:13:29 +00:00
Jintack Lim	47f3a2fc76	KVM: arm64: nv: Support virtual EL2 exceptions Support injecting exceptions and performing exception returns to and from virtual EL2. This must be done entirely in software except when taking an exception from vEL0 to vEL2 when the virtual HCR_EL2.{E2H,TGE} == {1,1} (a VHE guest hypervisor). [maz: switch to common exception injection framework, illegal exeption return handling] Reviewed-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Signed-off-by: Jintack Lim <jintack.lim@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230209175820.1939006-10-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-02-11 09:16:11 +00:00
Marc Zyngier	5305cc2c34	KVM: arm64: nv: Add EL2 system registers to vcpu context Add the minimal set of EL2 system registers to the vcpu context. Nothing uses them just yet. Reviewed-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230209175820.1939006-7-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-02-11 09:16:11 +00:00
Jintack Lim	675cabc899	arm64: Add ARM64_HAS_NESTED_VIRT cpufeature Add a new ARM64_HAS_NESTED_VIRT feature to indicate that the CPU has the ARMv8.3 nested virtualization capability, together with the 'kvm-arm.mode=nested' command line option. This will be used to support nested virtualization in KVM. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Jintack Lim <jintack.lim@linaro.org> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> [maz: moved the command-line option to kvm-arm.mode] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230209175820.1939006-2-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-02-11 09:16:11 +00:00
Akihiko Odaki	7af0c2534f	KVM: arm64: Normalize cache configuration Before this change, the cache configuration of the physical CPU was exposed to vcpus. This is problematic because the cache configuration a vcpu sees varies when it migrates between vcpus with different cache configurations. Fabricate cache configuration from the sanitized value, which holds the CTR_EL0 value the userspace sees regardless of which physical CPU it resides on. CLIDR_EL1 and CCSIDR_EL1 are now writable from the userspace so that the VMM can restore the values saved with the old kernel. Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Link: https://lore.kernel.org/r/20230112023852.42012-8-akihiko.odaki@daynix.com [ Oliver: Squash Marc's fix for CCSIDR_EL1.LineSize when set from userspace ] Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-01-21 18:09:23 +00:00
Yu Zhang	cecafc0a83	KVM: MMU: Make the definition of 'INVALID_GPA' common KVM already has a 'GPA_INVALID' defined as (~(gpa_t)0) in kvm_types.h, and it is used by ARM code. We do not need another definition of 'INVALID_GPA' for X86 specifically. Instead of using the common 'GPA_INVALID' for X86, replace it with 'INVALID_GPA', and change the users of 'GPA_INVALID' so that the diff can be smaller. Also because the name 'INVALID_GPA' tells the user we are using an invalid GPA, while the name 'GPA_INVALID' is emphasizing the GPA is an invalid one. No functional change intended. Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com> Reviewed-by: Paul Durrant <paul@xen.org> Reviewed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20230105130127.866171-1-yu.c.zhang@linux.intel.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-01-19 21:48:38 +00:00
Akihiko Odaki	8cc6dedaff	KVM: arm64: Always set HCR_TID2 Always set HCR_TID2 to trap CTR_EL0, CCSIDR2_EL1, CLIDR_EL1, and CSSELR_EL1. This saves a few lines of code and allows to employ their access trap handlers for more purposes anticipated by the old condition for setting HCR_TID2. Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Link: https://lore.kernel.org/r/20230112023852.42012-6-akihiko.odaki@daynix.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-01-12 21:07:43 +00:00
Sean Christopherson	8d20bd6381	KVM: x86: Unify pr_fmt to use module name for all KVM modules Define pr_fmt using KBUILD_MODNAME for all KVM x86 code so that printks use consistent formatting across common x86, Intel, and AMD code. In addition to providing consistent print formatting, using KBUILD_MODNAME, e.g. kvm_amd and kvm_intel, allows referencing SVM and VMX (and SEV and SGX and ...) as technologies without generating weird messages, and without causing naming conflicts with other kernel code, e.g. "SEV: ", "tdx: ", "sgx: " etc.. are all used by the kernel for non-KVM subsystems. Opportunistically move away from printk() for prints that need to be modified anyways, e.g. to drop a manual "kvm: " prefix. Opportunistically convert a few SGX WARNs that are similarly modified to WARN_ONCE; in the very unlikely event that the WARNs fire, odds are good that they would fire repeatedly and spam the kernel log without providing unique information in each print. Note, defining pr_fmt yields undesirable results for code that uses KVM's printk wrappers, e.g. vcpu_unimpl(). But, that's a pre-existing problem as SVM/kvm_amd already defines a pr_fmt, and thankfully use of KVM's wrappers is relatively limited in KVM x86 code. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Paul Durrant <paul@xen.org> Message-Id: <20221130230934.1014142-35-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-12-29 15:47:35 -05:00
Sean Christopherson	63a1bd8ad1	KVM: Drop arch hardware (un)setup hooks Drop kvm_arch_hardware_setup() and kvm_arch_hardware_unsetup() now that all implementations are nops. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> # s390 Acked-by: Anup Patel <anup@brainfault.org> Message-Id: <20221130230934.1014142-10-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-12-29 15:40:54 -05:00
Linus Torvalds	8fa590bf34	ARM64: * Enable the per-vcpu dirty-ring tracking mechanism, together with an option to keep the good old dirty log around for pages that are dirtied by something other than a vcpu. * Switch to the relaxed parallel fault handling, using RCU to delay page table reclaim and giving better performance under load. * Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping option, which multi-process VMMs such as crosvm rely on (see merge commit `382b5b87a9`: "Fix a number of issues with MTE, such as races on the tags being initialised vs the PG_mte_tagged flag as well as the lack of support for VM_SHARED when KVM is involved. Patches from Catalin Marinas and Peter Collingbourne"). * Merge the pKVM shadow vcpu state tracking that allows the hypervisor to have its own view of a vcpu, keeping that state private. * Add support for the PMUv3p5 architecture revision, bringing support for 64bit counters on systems that support it, and fix the no-quite-compliant CHAIN-ed counter support for the machines that actually exist out there. * Fix a handful of minor issues around 52bit VA/PA support (64kB pages only) as a prefix of the oncoming support for 4kB and 16kB pages. * Pick a small set of documentation and spelling fixes, because no good merge window would be complete without those. s390: * Second batch of the lazy destroy patches * First batch of KVM changes for kernel virtual != physical address support * Removal of a unused function x86: * Allow compiling out SMM support * Cleanup and documentation of SMM state save area format * Preserve interrupt shadow in SMM state save area * Respond to generic signals during slow page faults * Fixes and optimizations for the non-executable huge page errata fix. * Reprogram all performance counters on PMU filter change * Cleanups to Hyper-V emulation and tests * Process Hyper-V TLB flushes from a nested guest (i.e. from a L2 guest running on top of a L1 Hyper-V hypervisor) * Advertise several new Intel features * x86 Xen-for-KVM: Allow the Xen runstate information to cross a page boundary Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured ** Add support for 32-bit guests in SCHEDOP_poll * Notable x86 fixes and cleanups: One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0). Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few years back when eliminating unnecessary barriers when switching between vmcs01 and vmcs02. Clean up vmread_error_trampoline() to make it more obvious that params must be passed on the stack, even for x86-64. Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective of the current guest CPUID. Fudge around a race with TSC refinement that results in KVM incorrectly thinking a guest needs TSC scaling when running on a CPU with a constant TSC, but no hardware-enumerated TSC frequency. Advertise (on AMD) that the SMM_CTL MSR is not supported ** Remove unnecessary exports Generic: * Support for responding to signals during page faults; introduces new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks Selftests: * Fix an inverted check in the access tracking perf test, and restore support for asserting that there aren't too many idle pages when running on bare metal. * Fix build errors that occur in certain setups (unsure exactly what is unique about the problematic setup) due to glibc overriding static_assert() to a variant that requires a custom message. * Introduce actual atomics for clear/set_bit() in selftests * Add support for pinning vCPUs in dirty_log_perf_test. * Rename the so called "perf_util" framework to "memstress". * Add a lightweight psuedo RNG for guest use, and use it to randomize the access pattern and write vs. read percentage in the memstress tests. * Add a common ucall implementation; code dedup and pre-work for running SEV (and beyond) guests in selftests. * Provide a common constructor and arch hook, which will eventually be used by x86 to automatically select the right hypercall (AMD vs. Intel). * A bunch of added/enabled/fixed selftests for ARM64, covering memslots, breakpoints, stage-2 faults and access tracking. * x86-specific selftest changes: Clean up x86's page table management. Clean up and enhance the "smaller maxphyaddr" test, and add a related test to cover generic emulation failure. Clean up the nEPT support checks. Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values. ** Fix an ordering issue in the AMX test introduced by recent conversions to use kvm_cpu_has(), and harden the code to guard against similar bugs in the future. Anything that tiggers caching of KVM's supported CPUID, kvm_cpu_has() in this case, effectively hides opt-in XSAVE features if the caching occurs before the test opts in via prctl(). Documentation: * Remove deleted ioctls from documentation * Clean up the docs for the x86 MSR filter. * Various fixes -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmOaFrcUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroPemQgAq49excg2Cc+EsHnZw3vu/QWdA0Rt KhL3OgKxuHNjCbD2O9n2t5di7eJOTQ7F7T0eDm3xPTr4FS8LQ2327/mQePU/H2CF mWOpq9RBWLzFsSTeVA2Mz9TUTkYSnDHYuRsBvHyw/n9cL76BWVzjImldFtjYjjex yAwl8c5itKH6bc7KO+5ydswbvBzODkeYKUSBNdbn6m0JGQST7XppNwIAJvpiHsii Qgpk0e4Xx9q4PXG/r5DedI6BlufBsLhv0aE9SHPzyKH3JbbUFhJYI8ZD5OhBQuYW MwxK2KlM5Jm5ud2NZDDlsMmmvd1lnYCFDyqNozaKEWC1Y5rq1AbMa51fXA== =QAYX -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "ARM64: - Enable the per-vcpu dirty-ring tracking mechanism, together with an option to keep the good old dirty log around for pages that are dirtied by something other than a vcpu. - Switch to the relaxed parallel fault handling, using RCU to delay page table reclaim and giving better performance under load. - Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping option, which multi-process VMMs such as crosvm rely on (see merge commit `382b5b87a9`: "Fix a number of issues with MTE, such as races on the tags being initialised vs the PG_mte_tagged flag as well as the lack of support for VM_SHARED when KVM is involved. Patches from Catalin Marinas and Peter Collingbourne"). - Merge the pKVM shadow vcpu state tracking that allows the hypervisor to have its own view of a vcpu, keeping that state private. - Add support for the PMUv3p5 architecture revision, bringing support for 64bit counters on systems that support it, and fix the no-quite-compliant CHAIN-ed counter support for the machines that actually exist out there. - Fix a handful of minor issues around 52bit VA/PA support (64kB pages only) as a prefix of the oncoming support for 4kB and 16kB pages. - Pick a small set of documentation and spelling fixes, because no good merge window would be complete without those. s390: - Second batch of the lazy destroy patches - First batch of KVM changes for kernel virtual != physical address support - Removal of a unused function x86: - Allow compiling out SMM support - Cleanup and documentation of SMM state save area format - Preserve interrupt shadow in SMM state save area - Respond to generic signals during slow page faults - Fixes and optimizations for the non-executable huge page errata fix. - Reprogram all performance counters on PMU filter change - Cleanups to Hyper-V emulation and tests - Process Hyper-V TLB flushes from a nested guest (i.e. from a L2 guest running on top of a L1 Hyper-V hypervisor) - Advertise several new Intel features - x86 Xen-for-KVM: - Allow the Xen runstate information to cross a page boundary - Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured - Add support for 32-bit guests in SCHEDOP_poll - Notable x86 fixes and cleanups: - One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0). - Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few years back when eliminating unnecessary barriers when switching between vmcs01 and vmcs02. - Clean up vmread_error_trampoline() to make it more obvious that params must be passed on the stack, even for x86-64. - Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective of the current guest CPUID. - Fudge around a race with TSC refinement that results in KVM incorrectly thinking a guest needs TSC scaling when running on a CPU with a constant TSC, but no hardware-enumerated TSC frequency. - Advertise (on AMD) that the SMM_CTL MSR is not supported - Remove unnecessary exports Generic: - Support for responding to signals during page faults; introduces new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks Selftests: - Fix an inverted check in the access tracking perf test, and restore support for asserting that there aren't too many idle pages when running on bare metal. - Fix build errors that occur in certain setups (unsure exactly what is unique about the problematic setup) due to glibc overriding static_assert() to a variant that requires a custom message. - Introduce actual atomics for clear/set_bit() in selftests - Add support for pinning vCPUs in dirty_log_perf_test. - Rename the so called "perf_util" framework to "memstress". - Add a lightweight psuedo RNG for guest use, and use it to randomize the access pattern and write vs. read percentage in the memstress tests. - Add a common ucall implementation; code dedup and pre-work for running SEV (and beyond) guests in selftests. - Provide a common constructor and arch hook, which will eventually be used by x86 to automatically select the right hypercall (AMD vs. Intel). - A bunch of added/enabled/fixed selftests for ARM64, covering memslots, breakpoints, stage-2 faults and access tracking. - x86-specific selftest changes: - Clean up x86's page table management. - Clean up and enhance the "smaller maxphyaddr" test, and add a related test to cover generic emulation failure. - Clean up the nEPT support checks. - Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values. - Fix an ordering issue in the AMX test introduced by recent conversions to use kvm_cpu_has(), and harden the code to guard against similar bugs in the future. Anything that tiggers caching of KVM's supported CPUID, kvm_cpu_has() in this case, effectively hides opt-in XSAVE features if the caching occurs before the test opts in via prctl(). Documentation: - Remove deleted ioctls from documentation - Clean up the docs for the x86 MSR filter. - Various fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (361 commits) KVM: x86: Add proper ReST tables for userspace MSR exits/flags KVM: selftests: Allocate ucall pool from MEM_REGION_DATA KVM: arm64: selftests: Align VA space allocator with TTBR0 KVM: arm64: Fix benign bug with incorrect use of VA_BITS KVM: arm64: PMU: Fix period computation for 64bit counters with 32bit overflow KVM: x86: Advertise that the SMM_CTL MSR is not supported KVM: x86: remove unnecessary exports KVM: selftests: Fix spelling mistake "probabalistic" -> "probabilistic" tools: KVM: selftests: Convert clear/set_bit() to actual atomics tools: Drop "atomic_" prefix from atomic test_and_set_bit() tools: Drop conflicting non-atomic test_and_{clear,set}_bit() helpers KVM: selftests: Use non-atomic clear/set bit helpers in KVM tests perf tools: Use dedicated non-atomic clear/set bit helpers tools: Take @bit as an "unsigned long" in {clear,set}_bit() helpers KVM: arm64: selftests: Enable single-step without a "full" ucall() KVM: x86: fix APICv/x2AVIC disabled when vm reboot by itself KVM: Remove stale comment about KVM_REQ_UNHALT KVM: Add missing arch for KVM_CREATE_DEVICE and KVM_{SET,GET}_DEVICE_ATTR KVM: Reference to kvm_userspace_memory_region in doc and comments KVM: Delete all references to removed KVM_SET_MEMORY_ALIAS ioctl ...	2022-12-15 11:12:21 -08:00
Marc Zyngier	118bc846d4	Merge branch kvm-arm64/pmu-unchained into kvmarm-master/next * kvm-arm64/pmu-unchained: : . : PMUv3 fixes and improvements: : : - Make the CHAIN event handling strictly follow the architecture : : - Add support for PMUv3p5 (64bit counters all the way) : : - Various fixes and cleanups : . KVM: arm64: PMU: Fix period computation for 64bit counters with 32bit overflow KVM: arm64: PMU: Sanitise PMCR_EL0.LP on first vcpu run KVM: arm64: PMU: Simplify PMCR_EL0 reset handling KVM: arm64: PMU: Replace version number '0' with ID_AA64DFR0_EL1_PMUVer_NI KVM: arm64: PMU: Make kvm_pmc the main data structure KVM: arm64: PMU: Simplify vcpu computation on perf overflow notification KVM: arm64: PMU: Allow PMUv3p5 to be exposed to the guest KVM: arm64: PMU: Implement PMUv3p5 long counter support KVM: arm64: PMU: Allow ID_DFR0_EL1.PerfMon to be set from userspace KVM: arm64: PMU: Allow ID_AA64DFR0_EL1.PMUver to be set from userspace KVM: arm64: PMU: Move the ID_AA64DFR0_EL1.PMUver limit to VM creation KVM: arm64: PMU: Do not let AArch32 change the counters' top 32 bits KVM: arm64: PMU: Simplify setting a counter to a specific value KVM: arm64: PMU: Add counter_index_to_*reg() helpers KVM: arm64: PMU: Only narrow counters that are not 64bit wide KVM: arm64: PMU: Narrow the overflow checking when required KVM: arm64: PMU: Distinguish between 64bit counter and 64bit overflow KVM: arm64: PMU: Always advertise the CHAIN event KVM: arm64: PMU: Align chained counter implementation with architecture pseudocode arm64: Add ID_DFR0_EL1.PerfMon values for PMUv3p7 and IMP_DEF Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-12-05 14:38:44 +00:00
Mark Brown	baa8515281	arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE When we save the state for the floating point registers this can be done in the form visible through either the FPSIMD V registers or the SVE Z and P registers. At present we track which format is currently used based on TIF_SVE and the SME streaming mode state but particularly in the SVE case this limits our options for optimising things, especially around syscalls. Introduce a new enum which we place together with saved floating point state in both thread_struct and the KVM guest state which explicitly states which format is active and keep it up to date when we change it. At present we do not use this state except to verify that it has the expected value when loading the state, future patches will introduce functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221115094640.112848-3-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2022-11-29 15:01:56 +00:00
Marc Zyngier	3d0dba5764	KVM: arm64: PMU: Move the ID_AA64DFR0_EL1.PMUver limit to VM creation As further patches will enable the selection of a PMU revision from userspace, sample the supported PMU revision at VM creation time, rather than building each time the ID_AA64DFR0_EL1 register is accessed. This shouldn't result in any change in behaviour. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221113163832.3154370-11-maz@kernel.org	2022-11-19 12:56:39 +00:00
Quentin Perret	f41dff4efb	KVM: arm64: Return guest memory from EL2 via dedicated teardown memcache Rather than relying on the host to free the previously-donated pKVM hypervisor VM pages explicitly on teardown, introduce a dedicated teardown memcache which allows the host to reclaim guest memory resources without having to keep track of all of the allocations made by the pKVM hypervisor at EL2. Tested-by: Vincent Donnefort <vdonnefort@google.com> Co-developed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Fuad Tabba <tabba@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Will Deacon <will@kernel.org> [maz: dropped __maybe_unused from unmap_donated_memory_noclear()] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221110190259.26861-21-will@kernel.org	2022-11-11 17:18:58 +00:00
Quentin Perret	315775ff7c	KVM: arm64: Consolidate stage-2 initialisation into a single function The initialisation of guest stage-2 page-tables is currently split across two functions: kvm_init_stage2_mmu() and kvm_arm_setup_stage2(). That is presumably for historical reasons as kvm_arm_setup_stage2() originates from the (now defunct) KVM port for 32-bit Arm. Simplify this code path by merging both functions into one, taking care to map the 'struct kvm' into the hypervisor stage-1 early on in order to simplify the failure path. Tested-by: Vincent Donnefort <vdonnefort@google.com> Co-developed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Fuad Tabba <tabba@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221110190259.26861-19-will@kernel.org	2022-11-11 17:16:25 +00:00
Quentin Perret	717a7eebac	KVM: arm64: Add generic hyp_memcache helpers The host at EL1 and the pKVM hypervisor at EL2 will soon need to exchange memory pages dynamically for creating and destroying VM state. Indeed, the hypervisor will rely on the host to donate memory pages it can use to create guest stage-2 page-tables and to store VM and vCPU metadata. In order to ease this process, introduce a 'struct hyp_memcache' which is essentially a linked list of available pages, indexed by physical addresses so that it can be passed meaningfully between the different virtual address spaces configured at EL1 and EL2. Tested-by: Vincent Donnefort <vdonnefort@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221110190259.26861-18-will@kernel.org	2022-11-11 17:16:25 +00:00
Fuad Tabba	9d0c063a4d	KVM: arm64: Instantiate pKVM hypervisor VM and vCPU structures from EL1 With the pKVM hypervisor at EL2 now offering hypercalls to the host for creating and destroying VM and vCPU structures, plumb these in to the existing arm64 KVM backend to ensure that the hypervisor data structures are allocated and initialised on first vCPU run for a pKVM guest. In the host, 'struct kvm_protected_vm' is introduced to hold the handle of the pKVM VM instance as well as to track references to the memory donated to the hypervisor so that it can be freed back to the host allocator following VM teardown. The stage-2 page-table, hypervisor VM and vCPU structures are allocated separately so as to avoid the need for a large physically-contiguous allocation in the host at run-time. Tested-by: Vincent Donnefort <vdonnefort@google.com> Signed-off-by: Fuad Tabba <tabba@google.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221110190259.26861-14-will@kernel.org	2022-11-11 17:16:24 +00:00
Fuad Tabba	a1ec5c70d3	KVM: arm64: Add infrastructure to create and track pKVM instances at EL2 Introduce a global table (and lock) to track pKVM instances at EL2, and provide hypercalls that can be used by the untrusted host to create and destroy pKVM VMs and their vCPUs. pKVM VM/vCPU state is directly accessible only by the trusted hypervisor (EL2). Each pKVM VM is directly associated with an untrusted host KVM instance, and is referenced by the host using an opaque handle. Future patches will provide hypercalls to allow the host to initialize/set/get pKVM VM/vCPU state using the opaque handle. Tested-by: Vincent Donnefort <vdonnefort@google.com> Signed-off-by: Fuad Tabba <tabba@google.com> Co-developed-by: Will Deacon <will@kernel.org> Signed-off-by: Will Deacon <will@kernel.org> [maz: silence warning on unmap_donated_memory_noclear()] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221110190259.26861-13-will@kernel.org	2022-11-11 17:16:05 +00:00
Reiji Watanabe	370531d1e9	KVM: arm64: Clear PSTATE.SS when the Software Step state was Active-pending While userspace enables single-step, if the Software Step state at the last guest exit was "Active-pending", clear PSTATE.SS on guest entry to restore the state. Currently, KVM sets PSTATE.SS to 1 on every guest entry while userspace enables single-step for the vCPU (with KVM_GUESTDBG_SINGLESTEP). It means KVM always makes the vCPU's Software Step state "Active-not-pending" on the guest entry, which lets the VCPU perform single-step (then Software Step exception is taken). This could cause extra single-step (without returning to userspace) if the Software Step state at the last guest exit was "Active-pending" (i.e. the last exit was triggered by an asynchronous exception after the single-step is performed, but before the Software Step exception is taken. See "Figure D2-3 Software step state machine" and "D2.12.7 Behavior in the active-pending state" in ARM DDI 0487I.a for more info about this behavior). Fix this by clearing PSTATE.SS on guest entry if the Software Step state at the last exit was "Active-pending" so that KVM restore the state (and the exception is taken before further single-step is performed). Fixes: `337b99bf7e` ("KVM: arm64: guest debug, add support for single-step") Signed-off-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220917010600.532642-3-reijiw@google.com	2022-09-19 10:48:53 +01:00
Reiji Watanabe	34fbdee086	KVM: arm64: Preserve PSTATE.SS for the guest while single-step is enabled Preserve the PSTATE.SS value for the guest while userspace enables single-step (i.e. while KVM manipulates the PSTATE.SS) for the vCPU. Currently, while userspace enables single-step for the vCPU (with KVM_GUESTDBG_SINGLESTEP), KVM sets PSTATE.SS to 1 on every guest entry, not saving its original value. When userspace disables single-step, KVM doesn't restore the original value for the subsequent guest entry (use the current value instead). Exception return instructions copy PSTATE.SS from SPSR_ELx.SS only in certain cases when single-step is enabled (and set it to 0 in other cases). So, the value matters only when the guest enables single-step (and when the guest's Software step state isn't affected by single-step enabled by userspace, practically), though. Fix this by preserving the original PSTATE.SS value while userspace enables single-step, and restoring the value once it is disabled. This fix modifies the behavior of GET_ONE_REG/SET_ONE_REG for the PSTATE.SS while single-step is enabled by userspace. Presently, GET_ONE_REG/SET_ONE_REG gets/sets the current PSTATE.SS value, which KVM will override on the next guest entry (i.e. the value userspace gets/sets is not used for the next guest entry). With this patch, GET_ONE_REG/SET_ONE_REG will get/set the guest's preserved value, which KVM will preserve and try to restore after single-step is disabled. Fixes: `337b99bf7e` ("KVM: arm64: guest debug, add support for single-step") Signed-off-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220917010600.532642-2-reijiw@google.com	2022-09-19 10:48:53 +01:00
Oliver Upton	f3c6efc72f	KVM: arm64: Treat PMCR_EL1.LC as RES1 on asymmetric systems KVM does not support AArch32 on asymmetric systems. To that end, enforce AArch64-only behavior on PMCR_EL1.LC when on an asymmetric system. Fixes: `2122a83331` ("arm64: Allow mismatched 32-bit EL0 support") Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220816192554.1455559-2-oliver.upton@linux.dev	2022-08-17 10:29:07 +01:00
Marc Zyngier	ae98a4a989	Merge branch kvm-arm64/sysreg-cleanup-5.20 into kvmarm-master/next * kvm-arm64/sysreg-cleanup-5.20: : . : Long overdue cleanup of the sysreg userspace access, : with extra scrubbing on the vgic side of things. : From the cover letter: : : "Schspa Shi recently reported[1] that some of the vgic code interacting : with userspace was reading uninitialised stack memory, and although : that read wasn't used any further, it prompted me to revisit this part : of the code. : : Needless to say, this area of the kernel is pretty crufty, and shows a : bunch of issues in other parts of the KVM/arm64 infrastructure. This : series tries to remedy a bunch of them: : : - Sanitise the way we deal with sysregs from userspace: at the moment, : each and every .set_user/.get_user callback has to implement its own : userspace accesses (directly or indirectly). It'd be much better if : that was centralised so that we can reason about it. : : - Enforce that all AArch64 sysregs are 64bit. Always. This was sort of : implied by the code, but it took some effort to convince myself that : this was actually the case. : : - Move the vgic-v3 sysreg userspace accessors to the userspace : callbacks instead of hijacking the vcpu trap callback. This allows : us to reuse the sysreg infrastructure. : : - Consolidate userspace accesses for both GICv2, GICv3 and common code : as much as possible. : : - Cleanup a bunch of not-very-useful helpers, tidy up some of the code : as we touch it. : : [1] https://lore.kernel.org/r/m2h740zz1i.fsf@gmail.com" : . KVM: arm64: Get rid or outdated comments KVM: arm64: Descope kvm_arm_sys_reg_{get,set}_reg() KVM: arm64: Get rid of find_reg_by_id() KVM: arm64: vgic: Tidy-up calls to vgic_{get,set}_common_attr() KVM: arm64: vgic: Consolidate userspace access for base address setting KVM: arm64: vgic-v2: Add helper for legacy dist/cpuif base address setting KVM: arm64: vgic: Use {get,put}_user() instead of copy_{from.to}_user KVM: arm64: vgic-v2: Consolidate userspace access for MMIO registers KVM: arm64: vgic-v3: Consolidate userspace access for MMIO registers KVM: arm64: vgic-v3: Use u32 to manage the line level from userspace KVM: arm64: vgic-v3: Convert userspace accessors over to FIELD_GET/FIELD_PREP KVM: arm64: vgic-v3: Make the userspace accessors use sysreg API KVM: arm64: vgic-v3: Push user access into vgic_v3_cpu_sysregs_uaccess() KVM: arm64: vgic-v3: Simplify vgic_v3_has_cpu_sysregs_attr() KVM: arm64: Get rid of reg_from/to_user() KVM: arm64: Consolidate sysreg userspace accesses KVM: arm64: Rely on index_to_param() for size checks on userspace access KVM: arm64: Introduce generic get_user/set_user helpers for system registers KVM: arm64: Reorder handling of invariant sysregs from userspace KVM: arm64: Add get_reg_by_id() as a sys_reg_desc retrieving helper Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:58 +01:00
Marc Zyngier	c5332898dc	KVM: arm64: Descope kvm_arm_sys_reg_{get,set}_reg() Having kvm_arm_sys_reg_get_reg and co in kvm_host.h gives the impression that these functions are free to be called from anywhere. Not quite. They really are tied to out internal sysreg handling, and they would be better off in the sys_regs.h header, which is private. kvm_host.h could also get a bit of a diet, so let's just do that. Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	dc94f89ae6	Merge branch kvm-arm64/burn-the-flags into kvmarm-master/next * kvm-arm64/burn-the-flags: : . : Rework the per-vcpu flags to make them more manageable, : splitting them in different sets that have specific : uses: : : - configuration flags : - input to the world-switch : - state bookkeeping for the kernel itself : : The FP tracking is also simplified and tracked outside : of the flags as a separate state. : . KVM: arm64: Move the handling of !FP outside of the fast path KVM: arm64: Document why pause cannot be turned into a flag KVM: arm64: Reduce the size of the vcpu flag members KVM: arm64: Add build-time sanity checks for flags KVM: arm64: Warn when PENDING_EXCEPTION and INCREMENT_PC are set together KVM: arm64: Convert vcpu sysregs_loaded_on_cpu to a state flag KVM: arm64: Kill unused vcpu flags field KVM: arm64: Move vcpu WFIT flag to the state flag set KVM: arm64: Move vcpu ON_UNSUPPORTED_CPU flag to the state flag set KVM: arm64: Move vcpu SVE/SME flags to the state flag set KVM: arm64: Move vcpu debug/SPE/TRBE flags to the input flag set KVM: arm64: Move vcpu PC/Exception flags to the input flag set KVM: arm64: Move vcpu configuration flags into their own set KVM: arm64: Add three sets of flags to the vcpu state KVM: arm64: Add helpers to manipulate vcpu flags among a set KVM: arm64: Move FP state ownership from flag to a tristate KVM: arm64: Drop FP_FOREIGN_STATE from the hypervisor code Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-06-29 10:30:10 +01:00
Marc Zyngier	0fa4a3137e	KVM: arm64: Document why pause cannot be turned into a flag It would be tempting to turn the 'pause' state into a flag. However, this cannot easily be done as it is updated out of context, while all the flags expect to only be updated from the vcpu thread. Turning it into a flag would require to make all flag updates atomic, which isn't necessary desireable. Document this, and take this opportunity to move the field next to the flag sets, filling a hole in the vcpu structure. Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-06-29 10:23:51 +01:00
Marc Zyngier	54ddda919c	KVM: arm64: Reduce the size of the vcpu flag members Now that we can detect flags overflowing their container, reduce the size of all flag set members in the vcpu struct, turning them into 8bit quantities. Even with the FP state enum occupying 32bit, the whole of the state that was represented by flags is smaller by one byte. Profit! Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-06-29 10:23:46 +01:00
Marc Zyngier	5a3984f4ec	KVM: arm64: Add build-time sanity checks for flags Flags are great, but flags can also be dangerous: it is easy to encode a flag that is bigger than its container (unless the container is a u64), and it is easy to construct a flag value that doesn't fit in the mask that is associated with it. Add a couple of build-time sanity checks that ensure we catch these two cases. Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-06-29 10:23:41 +01:00
Marc Zyngier	30b6ab45f8	KVM: arm64: Convert vcpu sysregs_loaded_on_cpu to a state flag The aptly named boolean 'sysregs_loaded_on_cpu' tracks whether some of the vcpu system registers are resident on the physical CPU when running in VHE mode. This is obviously a flag in hidding, so let's convert it to a state flag, since this is solely a host concern (the hypervisor itself always knows which state we're in). Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-06-29 10:23:32 +01:00
Marc Zyngier	781e3ae148	KVM: arm64: Kill unused vcpu flags field Horray, we have now sorted all the preexisting flags, and the 'flags' field is now unused. Get rid of it while nobody is looking. Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-06-29 10:23:28 +01:00
Marc Zyngier	eebc538d8e	KVM: arm64: Move vcpu WFIT flag to the state flag set The host kernel uses the WFIT flag to remember that a vcpu has used this instruction and wake it up as required. Move it to the state set, as nothing in the hypervisor uses this information. Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-06-29 10:23:23 +01:00
Marc Zyngier	aff3ccd732	KVM: arm64: Move vcpu ON_UNSUPPORTED_CPU flag to the state flag set The ON_UNSUPPORTED_CPU flag is only there to track the sad fact that we have ended-up on a CPU where we cannot really run. Since this is only for the host kernel's use, move it to the state set. Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-06-29 10:23:19 +01:00
Marc Zyngier	0affa37fcd	KVM: arm64: Move vcpu SVE/SME flags to the state flag set The two HOST_{SVE,SME}_ENABLED are only used for the host kernel to track its own state across a vcpu run so that it can be fully restored. Move these flags to the so called state set. Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-06-29 10:23:14 +01:00
Marc Zyngier	b1da49088a	KVM: arm64: Move vcpu debug/SPE/TRBE flags to the input flag set The three debug flags (which deal with the debug registers, SPE and TRBE) all are input flags to the hypervisor code. Move them into the input set and convert them to the new accessors. Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-06-29 10:23:03 +01:00
Marc Zyngier	699bb2e0c6	KVM: arm64: Move vcpu PC/Exception flags to the input flag set The PC update flags (which also deal with exception injection) is one of the most complicated use of the flag we have. Make it more fool prof by: - moving it over to the new accessors and assign it to the input flag set - turn the combination of generic ELx flags with another flag indicating the target EL itself into an explicit set of flags for each EL and vector combination - add a new accessor to pend the exception This is otherwise a pretty straightformward conversion. Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-06-10 09:54:34 +01:00
Marc Zyngier	4c0680d394	KVM: arm64: Move vcpu configuration flags into their own set The KVM_ARM64_{GUEST_HAS_SVE,VCPU_SVE_FINALIZED,GUEST_HAS_PTRAUTH} flags are purely configuration flags. Once set, they are never cleared, but evaluated all over the code base. Move these three flags into the configuration set in one go, using the new accessors, and take this opportunity to drop the KVM_ARM64_ prefix which doesn't provide any help. Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-06-09 15:43:46 +01:00
Marc Zyngier	bcbfb588cf	KVM: arm64: Drop stale comment The layout of 'struct kvm_vcpu_arch' has evolved significantly since the initial port of KVM/arm64, so remove the stale comment suggesting that a prefix of the structure is used exclusively from assembly code. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220609121223.2551-7-will@kernel.org	2022-06-09 13:24:02 +01:00
Marc Zyngier	690bacb83b	KVM: arm64: Add three sets of flags to the vcpu state It so appears that each of the vcpu flags is really belonging to one of three categories: - a configuration flag, set once and for all - an input flag generated by the kernel for the hypervisor to use - a state flag that is only for the kernel's own bookkeeping As we are going to split all the existing flags into these three sets, introduce all three in one go. No functional change other than a bit of bloat... Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-06-09 12:02:12 +01:00
Marc Zyngier	e87abb73e5	KVM: arm64: Add helpers to manipulate vcpu flags among a set Careful analysis of the vcpu flags show that this is a mix of configuration, communication between the host and the hypervisor, as well as anciliary state that has no consistency. It'd be a lot better if we could split these flags into consistent categories. However, even if we split these flags apart, we want to make sure that each flag can only be applied to its own set, and not across sets. To achieve this, use a preprocessor hack so that each flag is always associated with: - the set that contains it, - a mask that describe all the bits that contain it (for a simple flag, this is the same thing as the flag itself, but we will eventually have values that cover multiple bits at once). Each flag is thus a triplet that is not directly usable as a value, but used by three helpers that allow the flag to be set, cleared, and fetched. By mandating the use of such helper, we can easily enforce that a flag can only be used with the set it belongs to. Finally, one last helper "unpacks" the raw value from the triplet that represents a flag, which is useful for multi-bit values that need to be enumerated (in a switch statement, for example). Further patches will start making use of this infrastructure. Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-06-09 12:02:06 +01:00
Marc Zyngier	f8077b0d59	KVM: arm64: Move FP state ownership from flag to a tristate The KVM FP code uses a pair of flags to denote three states: - FP_ENABLED set: the guest owns the FP state - FP_HOST set: the host owns the FP state - FP_ENABLED and FP_HOST clear: nobody owns the FP state at all and both flags set is an illegal state, which nothing ever checks for... As it turns out, this isn't really a good match for flags, and we'd be better off if this was a simpler tristate, each state having a name that actually reflect the state: - FP_STATE_FREE - FP_STATE_HOST_OWNED - FP_STATE_GUEST_OWNED Kill the two flags, and move over to an enum encoding these three states. This results in less confusing code, and less risk of ending up in the uncharted territory of a 4th state if we forget to clear one of the two flags. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Reiji Watanabe <reijiw@google.com>	2022-06-09 12:01:58 +01:00
Marc Zyngier	e9ada6c208	KVM: arm64: Drop FP_FOREIGN_STATE from the hypervisor code The vcpu KVM_ARM64_FP_FOREIGN_FPSTATE flag tracks the thread's own TIF_FOREIGN_FPSTATE so that we can evaluate just before running the vcpu whether it the FP regs contain something that is owned by the vcpu or not by updating the rest of the FP flags. We do this in the hypervisor code in order to make sure we're in a context where we are not interruptible. But we already have a hook in the run loop to generate this flag. We may as well update the FP flags directly and save the pointless flag tracking. Whilst we're at it, rename update_fp_enabled() to guest_owns_fp_regs() to indicate what the leftover of this helper actually do. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Reiji Watanabe <reijiw@google.com> Reviewed-by: Mark Brown <broonie@kernel.org>	2022-06-09 12:01:51 +01:00
Linus Torvalds	bf9095424d	S390: * ultravisor communication device driver * fix TEID on terminating storage key ops RISC-V: * Added Sv57x4 support for G-stage page table * Added range based local HFENCE functions * Added remote HFENCE functions based on VCPU requests * Added ISA extension registers in ONE_REG interface * Updated KVM RISC-V maintainers entry to cover selftests support ARM: * Add support for the ARMv8.6 WFxT extension * Guard pages for the EL2 stacks * Trap and emulate AArch32 ID registers to hide unsupported features * Ability to select and save/restore the set of hypercalls exposed to the guest * Support for PSCI-initiated suspend in collaboration with userspace * GICv3 register-based LPI invalidation support * Move host PMU event merging into the vcpu data structure * GICv3 ITS save/restore fixes * The usual set of small-scale cleanups and fixes x86: * New ioctls to get/set TSC frequency for a whole VM * Allow userspace to opt out of hypercall patching * Only do MSR filtering for MSRs accessed by rdmsr/wrmsr AMD SEV improvements: * Add KVM_EXIT_SHUTDOWN metadata for SEV-ES * V_TSC_AUX support Nested virtualization improvements for AMD: * Support for "nested nested" optimizations (nested vVMLOAD/VMSAVE, nested vGIF) * Allow AVIC to co-exist with a nested guest running * Fixes for LBR virtualizations when a nested guest is running, and nested LBR virtualization support * PAUSE filtering for nested hypervisors Guest support: * Decoupling of vcpu_is_preempted from PV spinlocks -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmKN9M4UHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroNLeAf+KizAlQwxEehHHeNyTkZuKyMawrD6 zsqAENR6i1TxiXe7fDfPFbO2NR0ZulQopHbD9mwnHJ+nNw0J4UT7g3ii1IAVcXPu rQNRGMVWiu54jt+lep8/gDg0JvPGKVVKLhxUaU1kdWT9PhIOC6lwpP3vmeWkUfRi PFL/TMT0M8Nfryi0zHB0tXeqg41BiXfqO8wMySfBAHUbpv8D53D2eXQL6YlMM0pL 2quB1HxHnpueE5vj3WEPQ3PCdy1M2MTfCDBJAbZGG78Ljx45FxSGoQcmiBpPnhJr C6UGP4ZDWpml5YULUoA70k5ylCbP+vI61U4vUtzEiOjHugpPV5wFKtx5nw== =ozWx -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "S390: - ultravisor communication device driver - fix TEID on terminating storage key ops RISC-V: - Added Sv57x4 support for G-stage page table - Added range based local HFENCE functions - Added remote HFENCE functions based on VCPU requests - Added ISA extension registers in ONE_REG interface - Updated KVM RISC-V maintainers entry to cover selftests support ARM: - Add support for the ARMv8.6 WFxT extension - Guard pages for the EL2 stacks - Trap and emulate AArch32 ID registers to hide unsupported features - Ability to select and save/restore the set of hypercalls exposed to the guest - Support for PSCI-initiated suspend in collaboration with userspace - GICv3 register-based LPI invalidation support - Move host PMU event merging into the vcpu data structure - GICv3 ITS save/restore fixes - The usual set of small-scale cleanups and fixes x86: - New ioctls to get/set TSC frequency for a whole VM - Allow userspace to opt out of hypercall patching - Only do MSR filtering for MSRs accessed by rdmsr/wrmsr AMD SEV improvements: - Add KVM_EXIT_SHUTDOWN metadata for SEV-ES - V_TSC_AUX support Nested virtualization improvements for AMD: - Support for "nested nested" optimizations (nested vVMLOAD/VMSAVE, nested vGIF) - Allow AVIC to co-exist with a nested guest running - Fixes for LBR virtualizations when a nested guest is running, and nested LBR virtualization support - PAUSE filtering for nested hypervisors Guest support: - Decoupling of vcpu_is_preempted from PV spinlocks" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (199 commits) KVM: x86: Fix the intel_pt PMI handling wrongly considered from guest KVM: selftests: x86: Sync the new name of the test case to .gitignore Documentation: kvm: reorder ARM-specific section about KVM_SYSTEM_EVENT_SUSPEND x86, kvm: use correct GFP flags for preemption disabled KVM: LAPIC: Drop pending LAPIC timer injection when canceling the timer x86/kvm: Alloc dummy async #PF token outside of raw spinlock KVM: x86: avoid calling x86 emulator without a decoded instruction KVM: SVM: Use kzalloc for sev ioctl interfaces to prevent kernel data leak x86/fpu: KVM: Set the base guest FPU uABI size to sizeof(struct kvm_xsave) s390/uv_uapi: depend on CONFIG_S390 KVM: selftests: x86: Fix test failure on arch lbr capable platforms KVM: LAPIC: Trace LAPIC timer expiration on every vmentry KVM: s390: selftest: Test suppression indication on key prot exception KVM: s390: Don't indicate suppression on dirtying, failing memop selftests: drivers/s390x: Add uvdevice tests drivers/s390/char: Add Ultravisor io device MAINTAINERS: Update KVM RISC-V entry to cover selftests support RISC-V: KVM: Introduce ISA extension register RISC-V: KVM: Cleanup stale TLB entries when host CPU changes RISC-V: KVM: Add remote HFENCE functions based on VCPU requests ...	2022-05-26 14:20:14 -07:00
Paolo Bonzini	47e8eec832	KVM/arm64 updates for 5.19 - Add support for the ARMv8.6 WFxT extension - Guard pages for the EL2 stacks - Trap and emulate AArch32 ID registers to hide unsupported features - Ability to select and save/restore the set of hypercalls exposed to the guest - Support for PSCI-initiated suspend in collaboration with userspace - GICv3 register-based LPI invalidation support - Move host PMU event merging into the vcpu data structure - GICv3 ITS save/restore fixes - The usual set of small-scale cleanups and fixes -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmKGAGsPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpDB/gQAMhyZ+wCG0OMEZhwFF6iDfxVEX2Kw8L41NtD a/e6LDWuIOGihItpRkYROc5myG74D7XckF2Bz3G7HJoU4vhwHOV/XulE26GFizoC O1GVRekeSUY81wgS1yfo0jojLupBkTjiq3SjTHoDP7GmCM0qDPBtA0QlMRzd2bMs Kx0+UUXZUHFSTXc7Lp4vqNH+tMp7se+yRx7hxm6PCM5zG+XYJjLxnsZ0qpchObgU 7f6YFojsLUs1SexgiUqJ1RChVQ+FkgICh5HyzORvGtHNNzK6D2sIbsW6nqMGAMql Kr3A5O/VOkCztSYnLxaa76/HqD21mvUrXvr3grhabNc7rOmuzWV0dDgr6c6wHKHb uNCtH4d7Ra06gUrEOrfsgLOLn0Zqik89y6aIlMsnTudMg9gMNgFHy1jz4LM7vMkY FS5AVj059heg2uJcfgTvzzcqneyuBLBmF3dS4coowO6oaj8SycpaEmP5e89zkPMI 1kk8d0e6RmXuCh/2AJ8GxxnKvBPgqp2mMKXOCJ8j4AmHEDX/CKpEBBqIWLKkplUU 8DGiOWJUtRZJg398dUeIpiVLoXJthMODjAnkKkuhiFcQbXomlwgg7YSnNAz6TRED Z7KR2leC247kapHnnagf02q2wED8pBeyrxbQPNdrHtSJ9Usm4nTkY443HgVTJW3s aTwPZAQ7 =mh7W -----END PGP SIGNATURE----- Merge tag 'kvmarm-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 5.19 - Add support for the ARMv8.6 WFxT extension - Guard pages for the EL2 stacks - Trap and emulate AArch32 ID registers to hide unsupported features - Ability to select and save/restore the set of hypercalls exposed to the guest - Support for PSCI-initiated suspend in collaboration with userspace - GICv3 register-based LPI invalidation support - Move host PMU event merging into the vcpu data structure - GICv3 ITS save/restore fixes - The usual set of small-scale cleanups and fixes [Due to the conflict, KVM_SYSTEM_EVENT_SEV_TERM is relocated from 4 to 6. - Paolo]	2022-05-25 05:09:23 -04:00
Catalin Marinas	0616ea3f1b	Merge branch 'for-next/esr-elx-64-bit' into for-next/core * for-next/esr-elx-64-bit: : Treat ESR_ELx as a 64-bit register. KVM: arm64: uapi: Add kvm_debug_exit_arch.hsr_high KVM: arm64: Treat ESR_EL2 as a 64-bit register arm64: Treat ESR_ELx as a 64-bit register arm64: compat: Do not treat syscall number as ESR_ELx for a bad syscall arm64: Make ESR_ELx_xVC_IMM_MASK compatible with assembly	2022-05-20 18:51:54 +01:00
Marc Zyngier	822ca7f82b	Merge branch kvm-arm64/misc-5.19 into kvmarm-master/next * kvm-arm64/misc-5.19: : . : Misc fixes and general improvements for KVMM/arm64: : : - Better handle out of sequence sysregs in the global tables : : - Remove a couple of unnecessary loads from constant pool : : - Drop unnecessary pKVM checks : : - Add all known M1 implementations to the SEIS workaround : : - Cleanup kerneldoc warnings : . KVM: arm64: vgic-v3: List M1 Pro/Max as requiring the SEIS workaround KVM: arm64: pkvm: Don't mask already zeroed FEAT_SVE KVM: arm64: pkvm: Drop unnecessary FP/SIMD trap handler KVM: arm64: nvhe: Eliminate kernel-doc warnings KVM: arm64: Avoid unnecessary absolute addressing via literals KVM: arm64: Print emulated register table name when it is unsorted KVM: arm64: Don't BUG_ON() if emulated register table is unsorted Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-05-16 17:48:36 +01:00
Marc Zyngier	8794b4f510	Merge branch kvm-arm64/per-vcpu-host-pmu-data into kvmarm-master/next * kvm-arm64/per-vcpu-host-pmu-data: : . : Pass the host PMU state in the vcpu to avoid the use of additional : shared memory between EL1 and EL2 (this obviously only applies : to nVHE and Protected setups). : : Patches courtesy of Fuad Tabba. : . KVM: arm64: pmu: Restore compilation when HW_PERF_EVENTS isn't selected KVM: arm64: Reenable pmu in Protected Mode KVM: arm64: Pass pmu events to hyp via vcpu KVM: arm64: Repack struct kvm_pmu to reduce size KVM: arm64: Wrapper for getting pmu_events Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-05-16 17:48:36 +01:00
Marc Zyngier	3b8e21e3c3	Merge branch kvm-arm64/psci-suspend into kvmarm-master/next * kvm-arm64/psci-suspend: : . : Add support for PSCI SYSTEM_SUSPEND and allow userspace to : filter the wake-up events. : : Patches courtesy of Oliver. : . Documentation: KVM: Fix title level for PSCI_SUSPEND selftests: KVM: Test SYSTEM_SUSPEND PSCI call selftests: KVM: Refactor psci_test to make it amenable to new tests selftests: KVM: Use KVM_SET_MP_STATE to power off vCPU in psci_test selftests: KVM: Create helper for making SMCCC calls selftests: KVM: Rename psci_cpu_on_test to psci_test KVM: arm64: Implement PSCI SYSTEM_SUSPEND KVM: arm64: Add support for userspace to suspend a vCPU KVM: arm64: Return a value from check_vcpu_requests() KVM: arm64: Rename the KVM_REQ_SLEEP handler KVM: arm64: Track vCPU power state using MP state values KVM: arm64: Dedupe vCPU power off helpers KVM: arm64: Don't depend on fallthrough to hide SYSTEM_RESET2 Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-05-16 17:48:20 +01:00
Marc Zyngier	0586e28aaa	Merge branch kvm-arm64/hcall-selection into kvmarm-master/next * kvm-arm64/hcall-selection: : . : Introduce a new set of virtual sysregs for userspace to : select the hypercalls it wants to see exposed to the guest. : : Patches courtesy of Raghavendra and Oliver. : . KVM: arm64: Fix hypercall bitmap writeback when vcpus have already run KVM: arm64: Hide KVM_REG_ARM_*_BMAP_BIT_COUNT from userspace Documentation: Fix index.rst after psci.rst renaming selftests: KVM: aarch64: Add the bitmap firmware registers to get-reg-list selftests: KVM: aarch64: Introduce hypercall ABI test selftests: KVM: Create helper for making SMCCC calls selftests: KVM: Rename psci_cpu_on_test to psci_test tools: Import ARM SMCCC definitions Docs: KVM: Add doc for the bitmap firmware registers Docs: KVM: Rename psci.rst to hypercalls.rst KVM: arm64: Add vendor hypervisor firmware register KVM: arm64: Add standard hypervisor firmware register KVM: arm64: Setup a framework for hypercall bitmap firmware registers KVM: arm64: Factor out firmware register handling from psci.c Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-05-16 17:47:03 +01:00
Marc Zyngier	20492a62b9	KVM: arm64: pmu: Restore compilation when HW_PERF_EVENTS isn't selected Moving kvm_pmu_events into the vcpu (and refering to it) broke the somewhat unusual case where the kernel has no support for a PMU at all. In order to solve this, move things around a bit so that we can easily avoid refering to the pmu structure outside of PMU-aware code. As a bonus, pmu.c isn't compiled in when HW_PERF_EVENTS isn't selected. Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/202205161814.KQHpOzsJ-lkp@intel.com	2022-05-16 13:42:41 +01:00
Fuad Tabba	84d751a019	KVM: arm64: Pass pmu events to hyp via vcpu Instead of the host accessing hyp data directly, pass the pmu events of the current cpu to hyp via the vcpu. This adds 64 bits (in two fields) to the vcpu that need to be synced before every vcpu run in nvhe and protected modes. However, it isolates the hypervisor from the host, which allows us to use pmu in protected mode in a subsequent patch. No visible side effects in behavior intended. Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220510095710.148178-4-tabba@google.com	2022-05-15 11:26:41 +01:00
Alexandru Elisei	f1f0c0cfea	KVM: arm64: Don't BUG_ON() if emulated register table is unsorted To emulate a register access, KVM uses a table of registers sorted by register encoding to speed up queries using binary search. When Linux boots, KVM checks that the table is sorted and uses a BUG_ON() statement to let the user know if it's not. The unfortunate side effect is that an unsorted sysreg table brings down the whole kernel, not just KVM, even though the rest of the kernel can function just fine without KVM. To make matters worse, on machines which lack a serial console, the user is left pondering why the machine is taking so long to boot. Improve this situation by returning an error from kvm_arch_init() if the sysreg tables are not in the correct order. The machine is still very much usable for the user, with the exception of virtualization, who can now easily determine what went wrong. A minor typo has also been corrected in the check_sysreg_table() function. Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220428103405.70884-2-alexandru.elisei@arm.com	2022-05-04 17:52:50 +01:00
Marc Zyngier	d25f30fe41	Merge branch kvm-arm64/aarch32-idreg-trap into kvmarm-master/next * kvm-arm64/aarch32-idreg-trap: : . : Add trapping/sanitising infrastructure for AArch32 systen registers, : allowing more control over what we actually expose (such as the PMU). : : Patches courtesy of Oliver and Alexandru. : . KVM: arm64: Fix new instances of 32bit ESRs KVM: arm64: Hide AArch32 PMU registers when not available KVM: arm64: Start trapping ID registers for 32 bit guests KVM: arm64: Plumb cp10 ID traps through the AArch64 sysreg handler KVM: arm64: Wire up CP15 feature registers to their AArch64 equivalents KVM: arm64: Don't write to Rt unless sys_reg emulation succeeds KVM: arm64: Return a bool from emulate_cp() Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-05-04 09:42:45 +01:00
Marc Zyngier	b2c4caf331	Merge branch kvm-arm64/wfxt into kvmarm-master/next * kvm-arm64/wfxt: : . : Add support for the WFET/WFIT instructions that provide the same : service as WFE/WFI, only with a timeout. : . KVM: arm64: Expose the WFXT feature to guests KVM: arm64: Offer early resume for non-blocking WFxT instructions KVM: arm64: Handle blocking WFIT instruction KVM: arm64: Introduce kvm_counter_compute_delta() helper KVM: arm64: Simplify kvm_cpu_has_pending_timer() arm64: Use WFxT for __delay() when possible arm64: Add wfet()/wfit() helpers arm64: Add HWCAP advertising FEAT_WFXT arm64: Add RV and RN fields for ESR_ELx_WFx_ISS arm64: Expand ESR_ELx_WFx_ISS_TI to match its ARMv8.7 definition Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-05-04 09:42:16 +01:00
Oliver Upton	bfbab44568	KVM: arm64: Implement PSCI SYSTEM_SUSPEND ARM DEN0022D.b 5.19 "SYSTEM_SUSPEND" describes a PSCI call that allows software to request that a system be placed in the deepest possible low-power state. Effectively, software can use this to suspend itself to RAM. Unfortunately, there really is no good way to implement a system-wide PSCI call in KVM. Any precondition checks done in the kernel will need to be repeated by userspace since there is no good way to protect a critical section that spans an exit to userspace. SYSTEM_RESET and SYSTEM_OFF are equally plagued by this issue, although no users have seemingly cared for the relatively long time these calls have been supported. The solution is to just make the whole implementation userspace's problem. Introduce a new system event, KVM_SYSTEM_EVENT_SUSPEND, that indicates to userspace a calling vCPU has invoked PSCI SYSTEM_SUSPEND. Additionally, add a CAP to get buy-in from userspace for this new exit type. Only advertise the SYSTEM_SUSPEND PSCI call if userspace has opted in. If a vCPU calls SYSTEM_SUSPEND, punt straight to userspace. Provide explicit documentation of userspace's responsibilites for the exit and point to the PSCI specification to describe the actual PSCI call. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220504032446.4133305-8-oupton@google.com	2022-05-04 09:28:45 +01:00
Oliver Upton	7b33a09d03	KVM: arm64: Add support for userspace to suspend a vCPU Introduce a new MP state, KVM_MP_STATE_SUSPENDED, which indicates a vCPU is in a suspended state. In the suspended state the vCPU will block until a wakeup event (pending interrupt) is recognized. Add a new system event type, KVM_SYSTEM_EVENT_WAKEUP, to indicate to userspace that KVM has recognized one such wakeup event. It is the responsibility of userspace to then make the vCPU runnable, or leave it suspended until the next wakeup event. Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220504032446.4133305-7-oupton@google.com	2022-05-04 09:28:45 +01:00
Oliver Upton	b171f9bbb1	KVM: arm64: Track vCPU power state using MP state values A subsequent change to KVM will add support for additional power states. Store the MP state by value rather than keeping track of it as a boolean. No functional change intended. Signed-off-by: Oliver Upton <oupton@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220504032446.4133305-4-oupton@google.com	2022-05-04 09:28:44 +01:00
Oliver Upton	1e5794295c	KVM: arm64: Dedupe vCPU power off helpers vcpu_power_off() and kvm_psci_vcpu_off() are equivalent; rename the former and replace all callsites to the latter. No functional change intended. Signed-off-by: Oliver Upton <oupton@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220504032446.4133305-3-oupton@google.com	2022-05-04 09:28:44 +01:00
Raghavendra Rao Ananta	b22216e1a6	KVM: arm64: Add vendor hypervisor firmware register Introduce the firmware register to hold the vendor specific hypervisor service calls (owner value 6) as a bitmap. The bitmap represents the features that'll be enabled for the guest, as configured by the user-space. Currently, this includes support for KVM-vendor features along with reading the UID, represented by bit-0, and Precision Time Protocol (PTP), represented by bit-1. Signed-off-by: Raghavendra Rao Ananta <rananta@google.com> Reviewed-by: Gavin Shan <gshan@redhat.com> [maz: tidy-up bitmap values] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220502233853.1233742-5-rananta@google.com	2022-05-03 21:30:19 +01:00
Raghavendra Rao Ananta	428fd6788d	KVM: arm64: Add standard hypervisor firmware register Introduce the firmware register to hold the standard hypervisor service calls (owner value 5) as a bitmap. The bitmap represents the features that'll be enabled for the guest, as configured by the user-space. Currently, this includes support only for Paravirtualized time, represented by bit-0. Signed-off-by: Raghavendra Rao Ananta <rananta@google.com> Reviewed-by: Gavin Shan <gshan@redhat.com> [maz: tidy-up bitmap values] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220502233853.1233742-4-rananta@google.com	2022-05-03 21:30:19 +01:00
Raghavendra Rao Ananta	05714cab7d	KVM: arm64: Setup a framework for hypercall bitmap firmware registers KVM regularly introduces new hypercall services to the guests without any consent from the userspace. This means, the guests can observe hypercall services in and out as they migrate across various host kernel versions. This could be a major problem if the guest discovered a hypercall, started using it, and after getting migrated to an older kernel realizes that it's no longer available. Depending on how the guest handles the change, there's a potential chance that the guest would just panic. As a result, there's a need for the userspace to elect the services that it wishes the guest to discover. It can elect these services based on the kernels spread across its (migration) fleet. To remedy this, extend the existing firmware pseudo-registers, such as KVM_REG_ARM_PSCI_VERSION, but by creating a new COPROC register space for all the hypercall services available. These firmware registers are categorized based on the service call owners, but unlike the existing firmware pseudo-registers, they hold the features supported in the form of a bitmap. During the VM initialization, the registers are set to upper-limit of the features supported by the corresponding registers. It's expected that the VMMs discover the features provided by each register via GET_ONE_REG, and write back the desired values using SET_ONE_REG. KVM allows this modification only until the VM has started. Some of the standard features are not mapped to any bits of the registers. But since they can recreate the original problem of making it available without userspace's consent, they need to be explicitly added to the case-list in kvm_hvc_call_default_allowed(). Any function-id that's not enabled via the bitmap, or not listed in kvm_hvc_call_default_allowed, will be returned as SMCCC_RET_NOT_SUPPORTED to the guest. Older userspace code can simply ignore the feature and the hypercall services will be exposed unconditionally to the guests, thus ensuring backward compatibility. In this patch, the framework adds the register only for ARM's standard secure services (owner value 4). Currently, this includes support only for ARM True Random Number Generator (TRNG) service, with bit-0 of the register representing mandatory features of v1.0. Other services are momentarily added in the upcoming patches. Signed-off-by: Raghavendra Rao Ananta <rananta@google.com> Reviewed-by: Gavin Shan <gshan@redhat.com> [maz: reduced the scope of some helpers, tidy-up bitmap max values, dropped error-only fast path] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220502233853.1233742-3-rananta@google.com	2022-05-03 21:30:19 +01:00
Oliver Upton	9369bc5c5e	KVM: arm64: Plumb cp10 ID traps through the AArch64 sysreg handler In order to enable HCR_EL2.TID3 for AArch32 guests KVM needs to handle traps where ESR_EL2.EC=0x8, which corresponds to an attempted VMRS access from an ID group register. Specifically, the MVFR{0-2} registers are accessed this way from AArch32. Conveniently, these registers are architecturally mapped to MVFR{0-2}_EL1 in AArch64. Furthermore, KVM already handles reads to these aliases in AArch64. Plumb VMRS read traps through to the general AArch64 system register handler. Signed-off-by: Oliver Upton <oupton@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220503060205.2823727-5-oupton@google.com	2022-05-03 11:14:34 +01:00
Sean Christopherson	f502cc568d	KVM: Add max_vcpus field in common 'struct kvm' For TDX guests, the maximum number of vcpus needs to be specified when the TDX guest VM is initialized (creating the TDX data corresponding to TDX guest) before creating vcpu. It needs to record the maximum number of vcpus on VM creation (KVM_CREATE_VM) and return error if the number of vcpus exceeds it Because there is already max_vcpu member in arm64 struct kvm_arch, move it to common struct kvm and initialize it to KVM_MAX_VCPUS before kvm_arch_init_vm() instead of adding it to x86 struct kvm_arch. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Message-Id: <e53234cdee6a92357d06c80c03d77c19cdefb804.1646422845.git.isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-02 11:42:42 -04:00
Alexandru Elisei	0b12620fdd	KVM: arm64: Treat ESR_EL2 as a 64-bit register ESR_EL2 was defined as a 32-bit register in the initial release of the ARM Architecture Manual for Armv8-A, and was later extended to 64 bits, with bits [63:32] RES0. ARMv8.7 introduced FEAT_LS64, which makes use of bits [36:32]. KVM treats ESR_EL1 as a 64-bit register when saving and restoring the guest context, but ESR_EL2 is handled as a 32-bit register. Start treating ESR_EL2 as a 64-bit register to allow KVM to make use of the most significant 32 bits in the future. The type chosen to represent ESR_EL2 is u64, as that is consistent with the notation KVM overwhelmingly uses today (u32), and how the rest of the registers are declared. Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220425114444.368693-5-alexandru.elisei@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-04-29 19:26:27 +01:00
Mark Brown	861262ab86	KVM: arm64: Handle SME host state when running guests While we don't currently support SME in guests we do currently support it for the host system so we need to take care of SME's impact, including the floating point register state, when running guests. Simiarly to SVE we need to manage the traps in CPACR_RL1, what is new is the handling of streaming mode and ZA. Normally we defer any handling of the floating point register state until the guest first uses it however if the system is in streaming mode FPSIMD and SVE operations may generate SME traps which we would need to distinguish from actual attempts by the guest to use SME. Rather than do this for the time being if we are in streaming mode when entering the guest we force the floating point state to be saved immediately and exit streaming mode, meaning that the guest won't generate SME traps for supported operations. We could handle ZA in the access trap similarly to the FPSIMD/SVE state without the disruption caused by streaming mode but for simplicity handle it the same way as streaming mode for now. This will be revisited when we support SME for guests (hopefully before SME hardware becomes available), for now it will only incur additional cost on systems with SME and even there only if streaming mode or ZA are enabled. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220419112247.711548-27-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-04-22 18:51:23 +01:00
Mark Brown	0033cd9339	arm64/sme: Implement ZA context switching Allocate space for storing ZA on first access to SME and use that to save and restore ZA state when context switching. We do this by using the vector form of the LDR and STR ZA instructions, these do not require streaming mode and have implementation recommendations that they avoid contention issues in shared SMCU implementations. Since ZA is architecturally guaranteed to be zeroed when enabled we do not need to explicitly zero ZA, either we will be restoring from a saved copy or trapping on first use of SME so we know that ZA must be disabled. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220419112247.711548-16-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-04-22 18:51:02 +01:00
Marc Zyngier	89f5074c50	KVM: arm64: Handle blocking WFIT instruction When trapping a blocking WFIT instruction, take it into account when computing the deadline of the background timer. The state is tracked with a new vcpu flag, and is gated by a new CPU capability, which isn't currently enabled. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220419182755.601427-6-maz@kernel.org	2022-04-20 13:24:45 +01:00
Reiji Watanabe	26bf74bd9f	KVM: arm64: mixed-width check should be skipped for uninitialized vCPUs KVM allows userspace to configure either all EL1 32bit or 64bit vCPUs for a guest. At vCPU reset, vcpu_allowed_register_width() checks if the vcpu's register width is consistent with all other vCPUs'. Since the checking is done even against vCPUs that are not initialized (KVM_ARM_VCPU_INIT has not been done) yet, the uninitialized vCPUs are erroneously treated as 64bit vCPU, which causes the function to incorrectly detect a mixed-width VM. Introduce KVM_ARCH_FLAG_EL1_32BIT and KVM_ARCH_FLAG_REG_WIDTH_CONFIGURED bits for kvm->arch.flags. A value of the EL1_32BIT bit indicates that the guest needs to be configured with all 32bit or 64bit vCPUs, and a value of the REG_WIDTH_CONFIGURED bit indicates if a value of the EL1_32BIT bit is valid (already set up). Values in those bits are set at the first KVM_ARM_VCPU_INIT for the guest based on KVM_ARM_VCPU_EL1_32BIT configuration for the vCPU. Check vcpu's register width against those new bits at the vcpu's KVM_ARM_VCPU_INIT (instead of against other vCPUs' register width). Fixes: `66e94d5caf` ("KVM: arm64: Prevent mixed-width VM creation") Signed-off-by: Reiji Watanabe <reijiw@google.com> Reviewed-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220329031924.619453-2-reijiw@google.com	2022-04-06 12:29:45 +01:00
Linus Torvalds	1ebdbeb03e	ARM: - Proper emulation of the OSLock feature of the debug architecture - Scalibility improvements for the MMU lock when dirty logging is on - New VMID allocator, which will eventually help with SVA in VMs - Better support for PMUs in heterogenous systems - PSCI 1.1 support, enabling support for SYSTEM_RESET2 - Implement CONFIG_DEBUG_LIST at EL2 - Make CONFIG_ARM64_ERRATUM_2077057 default y - Reduce the overhead of VM exit when no interrupt is pending - Remove traces of 32bit ARM host support from the documentation - Updated vgic selftests - Various cleanups, doc updates and spelling fixes RISC-V: - Prevent KVM_COMPAT from being selected - Optimize __kvm_riscv_switch_to() implementation - RISC-V SBI v0.3 support s390: - memop selftest - fix SCK locking - adapter interruptions virtualization for secure guests - add Claudio Imbrenda as maintainer - first step to do proper storage key checking x86: - Continue switching kvm_x86_ops to static_call(); introduce static_call_cond() and __static_call_ret0 when applicable. - Cleanup unused arguments in several functions - Synthesize AMD 0x80000021 leaf - Fixes and optimization for Hyper-V sparse-bank hypercalls - Implement Hyper-V's enlightened MSR bitmap for nested SVM - Remove MMU auditing - Eager splitting of page tables (new aka "TDP" MMU only) when dirty page tracking is enabled - Cleanup the implementation of the guest PGD cache - Preparation for the implementation of Intel IPI virtualization - Fix some segment descriptor checks in the emulator - Allow AMD AVIC support on systems with physical APIC ID above 255 - Better API to disable virtualization quirks - Fixes and optimizations for the zapping of page tables: - Zap roots in two passes, avoiding RCU read-side critical sections that last too long for very large guests backed by 4 KiB SPTEs. - Zap invalid and defunct roots asynchronously via concurrency-managed work queue. - Allowing yielding when zapping TDP MMU roots in response to the root's last reference being put. - Batch more TLB flushes with an RCU trick. Whoever frees the paging structure now holds RCU as a proxy for all vCPUs running in the guest, i.e. to prolongs the grace period on their behalf. It then kicks the the vCPUs out of guest mode before doing rcu_read_unlock(). Generic: - Introduce __vcalloc and use it for very large allocations that need memcg accounting -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmI4fdwUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroMq8gf/WoeVHtw2QlL5Mmz6McvRRmPAYPLV wLUIFNrRqRvd8Tw4kivzZoh/xTpwmnojv0YdK5SjKAiMjgv094YI1LrNp1JSPvmL pitocMkA10RSJNWHeEMg9cMSKH0rKiqeYl6S1e2XsdB+UZZ2BINOCVtvglmjTAvJ dFBdKdBkqjAUZbdXAGIvz4JEEER3N/LkFDKGaUGX+0QIQOzGBPIyLTxynxIDG6mt RViCCFyXdy5NkVp5hZFm96vQ2qAlWL9B9+iKruQN++82+oqWbeTdSqPhdwF7GyFz BfOv3gobQ2c4ef/aMLO5LswZ9joI1t/4kQbbAn6dNybpOAz/NXfDnbNefg== =keox -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "ARM: - Proper emulation of the OSLock feature of the debug architecture - Scalibility improvements for the MMU lock when dirty logging is on - New VMID allocator, which will eventually help with SVA in VMs - Better support for PMUs in heterogenous systems - PSCI 1.1 support, enabling support for SYSTEM_RESET2 - Implement CONFIG_DEBUG_LIST at EL2 - Make CONFIG_ARM64_ERRATUM_2077057 default y - Reduce the overhead of VM exit when no interrupt is pending - Remove traces of 32bit ARM host support from the documentation - Updated vgic selftests - Various cleanups, doc updates and spelling fixes RISC-V: - Prevent KVM_COMPAT from being selected - Optimize __kvm_riscv_switch_to() implementation - RISC-V SBI v0.3 support s390: - memop selftest - fix SCK locking - adapter interruptions virtualization for secure guests - add Claudio Imbrenda as maintainer - first step to do proper storage key checking x86: - Continue switching kvm_x86_ops to static_call(); introduce static_call_cond() and __static_call_ret0 when applicable. - Cleanup unused arguments in several functions - Synthesize AMD 0x80000021 leaf - Fixes and optimization for Hyper-V sparse-bank hypercalls - Implement Hyper-V's enlightened MSR bitmap for nested SVM - Remove MMU auditing - Eager splitting of page tables (new aka "TDP" MMU only) when dirty page tracking is enabled - Cleanup the implementation of the guest PGD cache - Preparation for the implementation of Intel IPI virtualization - Fix some segment descriptor checks in the emulator - Allow AMD AVIC support on systems with physical APIC ID above 255 - Better API to disable virtualization quirks - Fixes and optimizations for the zapping of page tables: - Zap roots in two passes, avoiding RCU read-side critical sections that last too long for very large guests backed by 4 KiB SPTEs. - Zap invalid and defunct roots asynchronously via concurrency-managed work queue. - Allowing yielding when zapping TDP MMU roots in response to the root's last reference being put. - Batch more TLB flushes with an RCU trick. Whoever frees the paging structure now holds RCU as a proxy for all vCPUs running in the guest, i.e. to prolongs the grace period on their behalf. It then kicks the the vCPUs out of guest mode before doing rcu_read_unlock(). Generic: - Introduce __vcalloc and use it for very large allocations that need memcg accounting" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (246 commits) KVM: use kvcalloc for array allocations KVM: x86: Introduce KVM_CAP_DISABLE_QUIRKS2 kvm: x86: Require const tsc for RT KVM: x86: synthesize CPUID leaf 0x80000021h if useful KVM: x86: add support for CPUID leaf 0x80000021 KVM: x86: do not use KVM_X86_OP_OPTIONAL_RET0 for get_mt_mask Revert "KVM: x86/mmu: Zap only TDP MMU leafs in kvm_zap_gfn_range()" kvm: x86/mmu: Flush TLB before zap_gfn_range releases RCU KVM: arm64: fix typos in comments KVM: arm64: Generalise VM features into a set of flags KVM: s390: selftests: Add error memop tests KVM: s390: selftests: Add more copy memop tests KVM: s390: selftests: Add named stages for memop test KVM: s390: selftests: Add macro as abstraction for MEM_OP KVM: s390: selftests: Split memop tests KVM: s390x: fix SCK locking RISC-V: KVM: Implement SBI HSM suspend call RISC-V: KVM: Add common kvm_riscv_vcpu_wfi() function RISC-V: Add SBI HSM suspend related defines RISC-V: KVM: Implement SBI v0.3 SRST extension ...	2022-03-24 11:58:57 -07:00
Marc Zyngier	06394531b4	KVM: arm64: Generalise VM features into a set of flags We currently deal with a set of booleans for VM features, while they could be better represented as set of flags contained in an unsigned long, similarily to what we are doing on the CPU side. Signed-off-by: Marc Zyngier <maz@kernel.org> [Oliver: Flag-ify the 'ran_once' boolean] Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220311174001.605719-2-oupton@google.com	2022-03-18 14:02:33 +00:00
James Morse	5bdf343760	KVM: arm64: Allow indirect vectors to be used without SPECTRE_V3A CPUs vulnerable to Spectre-BHB either need to make an SMC-CC firmware call from the vectors, or run a sequence of branches. This gets added to the hyp vectors. If there is no support for arch-workaround-1 in firmware, the indirect vector will be used. kvm_init_vector_slots() only initialises the two indirect slots if the platform is vulnerable to Spectre-v3a. pKVM's hyp_map_vectors() only initialises __hyp_bp_vect_base if the platform is vulnerable to Spectre-v3a. As there are about to more users of the indirect vectors, ensure their entries in hyp_spectre_vector_selector[] are always initialised, and __hyp_bp_vect_base defaults to the regular VA mapping. The Spectre-v3a check is moved to a helper kvm_system_needs_idmapped_vectors(), and merged with the code that creates the hyp mappings. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com>	2022-02-15 17:38:25 +00:00
Marc Zyngier	00e6dae00e	Merge branch kvm-arm64/pmu-bl into kvmarm-master/next * kvm-arm64/pmu-bl: : . : Improve PMU support on heterogeneous systems, courtesy of Alexandru Elisei : . KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical CPU KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute KVM: arm64: Keep a list of probed PMUs KVM: arm64: Keep a per-VM pointer to the default PMU perf: Fix wrong name in comment for struct perf_cpu_context KVM: arm64: Do not change the PMU event filter after a VCPU has run Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-02-08 17:54:41 +00:00
Alexandru Elisei	583cda1b0e	KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical CPU Userspace can assign a PMU to a VCPU with the KVM_ARM_VCPU_PMU_V3_SET_PMU device ioctl. If the VCPU is scheduled on a physical CPU which has a different PMU, the perf events needed to emulate a guest PMU won't be scheduled in and the guest performance counters will stop counting. Treat it as an userspace error and refuse to run the VCPU in this situation. Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220127161759.53553-7-alexandru.elisei@arm.com	2022-02-08 17:51:22 +00:00
Marc Zyngier	46b1878214	KVM: arm64: Keep a per-VM pointer to the default PMU As we are about to allow selection of the PMU exposed to a guest, start by keeping track of the default one instead of only the PMU version. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Link: https://lore.kernel.org/r/20220127161759.53553-4-alexandru.elisei@arm.com	2022-02-08 17:51:21 +00:00
Marc Zyngier	5177fe91e4	KVM: arm64: Do not change the PMU event filter after a VCPU has run Userspace can specify which events a guest is allowed to use with the KVM_ARM_VCPU_PMU_V3_FILTER attribute. The list of allowed events can be identified by a guest from reading the PMCEID{0,1}_EL0 registers. Changing the PMU event filter after a VCPU has run can cause reads of the registers performed before the filter is changed to return different values than reads performed with the new event filter in place. The architecture defines the two registers as read-only, and this behaviour contradicts that. Keep track when the first VCPU has run and deny changes to the PMU event filter to prevent this from happening. Signed-off-by: Marc Zyngier <maz@kernel.org> [ Alexandru E: Added commit message, updated ioctl documentation ] Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220127161759.53553-2-alexandru.elisei@arm.com	2022-02-08 17:51:21 +00:00
Marc Zyngier	ebca68972e	Merge branch kvm-arm64/vmid-allocator into kvmarm-master/next * kvm-arm64/vmid-allocator: : . : VMID allocation rewrite from Shameerali Kolothum Thodi, paving the : way for pinned VMIDs and SVA. : . KVM: arm64: Make active_vmids invalid on vCPU schedule out KVM: arm64: Align the VMID allocation with the arm64 ASID KVM: arm64: Make VMID bits accessible outside of allocator KVM: arm64: Introduce a new VMID allocator for KVM Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-02-08 14:58:38 +00:00
Shameer Kolothum	100b4f092f	KVM: arm64: Make active_vmids invalid on vCPU schedule out Like ASID allocator, we copy the active_vmids into the reserved_vmids on a rollover. But it's unlikely that every CPU will have a vCPU as current task and we may end up unnecessarily reserving the VMID space. Hence, set active_vmids to an invalid one when scheduling out a vCPU. Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211122121844.867-5-shameerali.kolothum.thodi@huawei.com	2022-02-08 14:57:04 +00:00
Julien Grall	3248136b36	KVM: arm64: Align the VMID allocation with the arm64 ASID At the moment, the VMID algorithm will send an SGI to all the CPUs to force an exit and then broadcast a full TLB flush and I-Cache invalidation. This patch uses the new VMID allocator. The benefits are: - Aligns with arm64 ASID algorithm. - CPUs are not forced to exit at roll-over. Instead, the VMID will be marked reserved and context invalidation is broadcasted. This will reduce the IPIs traffic. - More flexible to add support for pinned KVM VMIDs in the future. With the new algo, the code is now adapted: - The call to update_vmid() will be done with preemption disabled as the new algo requires to store information per-CPU. Signed-off-by: Julien Grall <julien.grall@arm.com> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211122121844.867-4-shameerali.kolothum.thodi@huawei.com	2022-02-08 14:57:03 +00:00
Shameer Kolothum	f8051e9609	KVM: arm64: Make VMID bits accessible outside of allocator Since we already set the kvm_arm_vmid_bits in the VMID allocator init function, make it accessible outside as well so that it can be used in the subsequent patch. Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211122121844.867-3-shameerali.kolothum.thodi@huawei.com	2022-02-08 14:46:28 +00:00
Shameer Kolothum	417838392f	KVM: arm64: Introduce a new VMID allocator for KVM A new VMID allocator for arm64 KVM use. This is based on arm64 ASID allocator algorithm. One major deviation from the ASID allocator is the way we flush the context. Unlike ASID allocator, we expect less frequent rollover in the case of VMIDs. Hence, instead of marking the CPU as flush_pending and issuing a local context invalidation on the next context switch, we broadcast TLB flush + I-cache invalidation over the inner shareable domain on rollover. Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211122121844.867-2-shameerali.kolothum.thodi@huawei.com	2022-02-08 14:46:28 +00:00
Marc Zyngier	2bb48074b3	Merge branch kvm-arm64/mmu-rwlock into kvmarm-master/next * kvm-arm64/mmu-rwlock: : . : MMU locking optimisations from Jing Zhang, allowing permission : relaxations to occur in parallel. : . KVM: selftests: Add vgic initialization for dirty log perf test for ARM KVM: arm64: Add fast path to handle permission relaxation during dirty logging KVM: arm64: Use read/write spin lock for MMU protection Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-02-08 14:29:29 +00:00
Jing Zhang	fcc5bf8963	KVM: arm64: Use read/write spin lock for MMU protection Replace MMU spinlock with rwlock and update all instances of the lock being acquired with a write lock acquisition. Future commit will add a fast path for permission relaxation during dirty logging under a read lock. Signed-off-by: Jing Zhang <jingzhangos@google.com> Tested-by: Fuad Tabba <tabba@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220118015703.3630552-2-jingzhangos@google.com	2022-02-08 14:27:52 +00:00
Oliver Upton	7dabf02f43	KVM: arm64: Emulate the OS Lock The OS lock blocks all debug exceptions at every EL. To date, KVM has not implemented the OS lock for its guests, despite the fact that it is mandatory per the architecture. Simple context switching between the guest and host is not appropriate, as its effects are not constrained to the guest context. Emulate the OS Lock by clearing MDE and SS in MDSCR_EL1, thereby blocking all but software breakpoint instructions. Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220203174159.2887882-5-oupton@google.com	2022-02-08 14:23:41 +00:00
Oliver Upton	d42e26716d	KVM: arm64: Stash OSLSR_EL1 in the cpu context An upcoming change to KVM will emulate the OS Lock from the PoV of the guest. Add OSLSR_EL1 to the cpu context and handle reads using the stored value. Define some mnemonics for for handling the OSLM field and use them to make the reset value of OSLSR_EL1 more readable. Wire up a custom handler for writes from userspace and prevent any of the invariant bits from changing. Note that the OSLK bit is not invariant and will be made writable by the aforementioned change. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220203174159.2887882-3-oupton@google.com	2022-02-08 14:23:41 +00:00
Linus Torvalds	79e06c4c49	RISCV: - Use common KVM implementation of MMU memory caches - SBI v0.2 support for Guest - Initial KVM selftests support - Fix to avoid spurious virtual interrupts after clearing hideleg CSR - Update email address for Anup and Atish ARM: - Simplification of the 'vcpu first run' by integrating it into KVM's 'pid change' flow - Refactoring of the FP and SVE state tracking, also leading to a simpler state and less shared data between EL1 and EL2 in the nVHE case - Tidy up the header file usage for the nvhe hyp object - New HYP unsharing mechanism, finally allowing pages to be unmapped from the Stage-1 EL2 page-tables - Various pKVM cleanups around refcounting and sharing - A couple of vgic fixes for bugs that would trigger once the vcpu xarray rework is merged, but not sooner - Add minimal support for ARMv8.7's PMU extension - Rework kvm_pgtable initialisation ahead of the NV work - New selftest for IRQ injection - Teach selftests about the lack of default IPA space and page sizes - Expand sysreg selftest to deal with Pointer Authentication - The usual bunch of cleanups and doc update s390: - fix sigp sense/start/stop/inconsistency - cleanups x86: - Clean up some function prototypes more - improved gfn_to_pfn_cache with proper invalidation, used by Xen emulation - add KVM_IRQ_ROUTING_XEN_EVTCHN and event channel delivery - completely remove potential TOC/TOU races in nested SVM consistency checks - update some PMCs on emulated instructions - Intel AMX support (joint work between Thomas and Intel) - large MMU cleanups - module parameter to disable PMU virtualization - cleanup register cache - first part of halt handling cleanups - Hyper-V enlightened MSR bitmap support for nested hypervisors Generic: - clean up Makefiles - introduce CONFIG_HAVE_KVM_DIRTY_RING - optimize memslot lookup using a tree - optimize vCPU array usage by converting to xarray -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmHhxvsUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroPZkAf+Nz92UL/5nNGcdHtE4m7AToMmitE9 bYkesf9BMQvAe5wjkABLuoHGi6ay4jabo4fiGzbdkiK7lO5YgfsWiMB3/MT5fl4E jRPzaVQabp3YZLM8UYCBmfUVuRj524S967SfSRe0AvYjDEH8y7klPf4+7sCsFT0/ Px9Vf2KGuOlf0eM78yKg4rGaF0jS22eLgXm6FfNMY8/e29ZAo/jyUmqBY+Z2xxZG aWhceDtSheW1jwLHLj3nOlQJvHTn8LVGXBE/R8Gda3ZjrBV2rKaDi4Fh+HD+dz86 2zVXwzQ7uck2CMW73GMoXMTWoKSHMyvlBOs1BdvBm4UsnGcXR+q8IFCeuQ== =s73m -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "RISCV: - Use common KVM implementation of MMU memory caches - SBI v0.2 support for Guest - Initial KVM selftests support - Fix to avoid spurious virtual interrupts after clearing hideleg CSR - Update email address for Anup and Atish ARM: - Simplification of the 'vcpu first run' by integrating it into KVM's 'pid change' flow - Refactoring of the FP and SVE state tracking, also leading to a simpler state and less shared data between EL1 and EL2 in the nVHE case - Tidy up the header file usage for the nvhe hyp object - New HYP unsharing mechanism, finally allowing pages to be unmapped from the Stage-1 EL2 page-tables - Various pKVM cleanups around refcounting and sharing - A couple of vgic fixes for bugs that would trigger once the vcpu xarray rework is merged, but not sooner - Add minimal support for ARMv8.7's PMU extension - Rework kvm_pgtable initialisation ahead of the NV work - New selftest for IRQ injection - Teach selftests about the lack of default IPA space and page sizes - Expand sysreg selftest to deal with Pointer Authentication - The usual bunch of cleanups and doc update s390: - fix sigp sense/start/stop/inconsistency - cleanups x86: - Clean up some function prototypes more - improved gfn_to_pfn_cache with proper invalidation, used by Xen emulation - add KVM_IRQ_ROUTING_XEN_EVTCHN and event channel delivery - completely remove potential TOC/TOU races in nested SVM consistency checks - update some PMCs on emulated instructions - Intel AMX support (joint work between Thomas and Intel) - large MMU cleanups - module parameter to disable PMU virtualization - cleanup register cache - first part of halt handling cleanups - Hyper-V enlightened MSR bitmap support for nested hypervisors Generic: - clean up Makefiles - introduce CONFIG_HAVE_KVM_DIRTY_RING - optimize memslot lookup using a tree - optimize vCPU array usage by converting to xarray" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (268 commits) x86/fpu: Fix inline prefix warnings selftest: kvm: Add amx selftest selftest: kvm: Move struct kvm_x86_state to header selftest: kvm: Reorder vcpu_load_state steps for AMX kvm: x86: Disable interception for IA32_XFD on demand x86/fpu: Provide fpu_sync_guest_vmexit_xfd_state() kvm: selftests: Add support for KVM_CAP_XSAVE2 kvm: x86: Add support for getting/setting expanded xstate buffer x86/fpu: Add uabi_size to guest_fpu kvm: x86: Add CPUID support for Intel AMX kvm: x86: Add XCR0 support for Intel AMX kvm: x86: Disable RDMSR interception of IA32_XFD_ERR kvm: x86: Emulate IA32_XFD_ERR for guest kvm: x86: Intercept #NM for saving IA32_XFD_ERR x86/fpu: Prepare xfd_err in struct fpu_guest kvm: x86: Add emulation for IA32_XFD x86/fpu: Provide fpu_update_guest_xfd() for IA32_XFD emulation kvm: x86: Enable dynamic xfeatures at KVM_SET_CPUID2 x86/fpu: Provide fpu_enable_guest_xfd_features() for KVM x86/fpu: Add guest support to xfd_enable_feature() ...	2022-01-16 16:15:14 +02:00
Paolo Bonzini	7fd55a02a4	KVM/arm64 updates for Linux 5.16 - Simplification of the 'vcpu first run' by integrating it into KVM's 'pid change' flow - Refactoring of the FP and SVE state tracking, also leading to a simpler state and less shared data between EL1 and EL2 in the nVHE case - Tidy up the header file usage for the nvhe hyp object - New HYP unsharing mechanism, finally allowing pages to be unmapped from the Stage-1 EL2 page-tables - Various pKVM cleanups around refcounting and sharing - A couple of vgic fixes for bugs that would trigger once the vcpu xarray rework is merged, but not sooner - Add minimal support for ARMv8.7's PMU extension - Rework kvm_pgtable initialisation ahead of the NV work - New selftest for IRQ injection - Teach selftests about the lack of default IPA space and page sizes - Expand sysreg selftest to deal with Pointer Authentication - The usual bunch of cleanups and doc update -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmHYIpgPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpDndsP/RsBmX6bmQnDEhaaqfGAxOETyq/my1eT9r/V 3Ax4fEqSFfD5yHbYvqNRC8ueycH4r8WAr4ACWDAI6XpS/pYx00nx2N+HCSgjGyQR FeXqITuGPEsn4NkGuPci0PFmI8rVUzanl1ugRGQAETVrZo2ZVH2uqKVGT8XOlu0J FB/0x6Z4vMuIgEXyfa+DZ8WdW1aCRgPU2oyOdSdWE57/grjyLJqk6EdMmLyaQ19E vz6vXuRnA/GQwOtByqYEnQ8a4VXsQedCMqg/f9mj0BxpDzxC1ps8Nrpv36aJXKUN LEXapP9bCWPW9LqaKAOZnQYrUIIEFHsCUom0n3reDHrgObA+jivpz75L8GEr3CdC Bv78N04Yymjpp2WA6CuO3r9HjL1nJ6tYqobXU2pvqln4nNC3Ukucjq9ZVuWgS6Hx qOZXgPcZ/HpS3l/U+dAu8yIcV2SchQXDudaq8BsfLd8M1bD+oirSBolZFSvz7MYZ 6+jtEDLUOEO5s4rXiJF46+MauxiELcjaewAEK4WwrS8NBwEyhYe9EPsYcQ5pcrQF QwAd1+y7oLfhpGHv5KJKWswfvbtlLCm6NOAhawq0UXM8bS+79tu0dGjiDzVPBuSf SyA3VtBSKxcpvCrljw9ubtjxvKrviE0MDvlmTP2B1NU+lwm8xRBiwUwOH29qP9zU HDeUj2fy =HkZk -----END PGP SIGNATURE----- Merge tag 'kvmarm-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for Linux 5.16 - Simplification of the 'vcpu first run' by integrating it into KVM's 'pid change' flow - Refactoring of the FP and SVE state tracking, also leading to a simpler state and less shared data between EL1 and EL2 in the nVHE case - Tidy up the header file usage for the nvhe hyp object - New HYP unsharing mechanism, finally allowing pages to be unmapped from the Stage-1 EL2 page-tables - Various pKVM cleanups around refcounting and sharing - A couple of vgic fixes for bugs that would trigger once the vcpu xarray rework is merged, but not sooner - Add minimal support for ARMv8.7's PMU extension - Rework kvm_pgtable initialisation ahead of the NV work - New selftest for IRQ injection - Teach selftests about the lack of default IPA space and page sizes - Expand sysreg selftest to deal with Pointer Authentication - The usual bunch of cleanups and doc update	2022-01-07 10:42:19 -05:00
Marc Zyngier	1c53a1ae36	Merge branch kvm-arm64/misc-5.17 into kvmarm-master/next * kvm-arm64/misc-5.17: : . : Misc fixes and improvements: : - Add minimal support for ARMv8.7's PMU extension : - Constify kvm_io_gic_ops : - Drop kvm_is_transparent_hugepage() prototype : - Drop unused workaround_flags field : - Rework kvm_pgtable initialisation : - Documentation fixes : - Replace open-coded SCTLR_EL1.EE useage with its defined macro : - Sysreg list selftest update to handle PAuth : - Include cleanups : . KVM: arm64: vgic: Replace kernel.h with the necessary inclusions KVM: arm64: Fix comment typo in kvm_vcpu_finalize_sve() KVM: arm64: selftests: get-reg-list: Add pauth configuration KVM: arm64: Fix comment on barrier in kvm_psci_vcpu_on() KVM: arm64: Fix comment for kvm_reset_vcpu() KVM: arm64: Use defined value for SCTLR_ELx_EE KVM: arm64: Rework kvm_pgtable initialisation KVM: arm64: Drop unused workaround_flags vcpu field Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-01-04 17:16:15 +00:00
Quentin Perret	52b28657eb	KVM: arm64: pkvm: Unshare guest structs during teardown Make use of the newly introduced unshare hypercall during guest teardown to unmap guest-related data structures from the hyp stage-1. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211215161232.1480836-15-qperret@google.com	2021-12-16 12:58:57 +00:00
Marc Zyngier	142ff9bddb	KVM: arm64: Drop unused workaround_flags vcpu field workaround_flags is a leftover from our earlier Spectre-v4 workaround implementation, and now serves no purpose. Get rid of the field and the corresponding asm-offset definition. Fixes: `29e8910a56` ("KVM: arm64: Simplify handling of ARCH_WORKAROUND_2") Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-12-08 14:54:07 +00:00
Sean Christopherson	005467e06b	KVM: Drop obsolete kvm_arch_vcpu_block_finish() Drop kvm_arch_vcpu_block_finish() now that all arch implementations are nops. No functional change intended. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211009021236.4122790-10-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-12-08 04:24:50 -05:00
Marc Zyngier	2d761dbf7f	Merge branch kvm-arm64/fpsimd-tracking into kvmarm-master/next * kvm-arm64/fpsimd-tracking: : . : Simplify the handling of both the FP/SIMD and SVE state by : removing the need for mapping the thread at EL2, and by : dropping the tracking of the host's SVE state which is : always invalid by construction. : . arm64/fpsimd: Document the use of TIF_FOREIGN_FPSTATE by KVM KVM: arm64: Stop mapping current thread_info at EL2 KVM: arm64: Introduce flag shadowing TIF_FOREIGN_FPSTATE KVM: arm64: Remove unused __sve_save_state KVM: arm64: Get rid of host SVE tracking/saving KVM: arm64: Reorder vcpu flag definitions Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-12-01 12:13:44 +00:00
Marc Zyngier	cc5705fb1b	KVM: arm64: Drop vcpu->arch.has_run_once for vcpu->pid With the transition to kvm_arch_vcpu_run_pid_change() to handle the "run once" activities, it becomes obvious that has_run_once is now an exact shadow of vcpu->pid. Replace vcpu->arch.has_run_once with a new vcpu_has_run_once() helper that directly checks for vcpu->pid, and get rid of the now unused field. Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-12-01 11:51:22 +00:00
Marc Zyngier	052f064d42	KVM: arm64: Move kvm_arch_vcpu_run_pid_change() out of line Having kvm_arch_vcpu_run_pid_change() inline doesn't bring anything to the table. Move it next to kvm_vcpu_first_run_init(), which will be convenient for what is next to come. Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-12-01 11:51:21 +00:00
Marc Zyngier	bee14bca73	KVM: arm64: Stop mapping current thread_info at EL2 Now that we can track an equivalent of TIF_FOREIGN_FPSTATE, drop the mapping of current's thread_info at EL2. Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-11-22 16:01:39 +00:00
Marc Zyngier	af9a0e21d8	KVM: arm64: Introduce flag shadowing TIF_FOREIGN_FPSTATE We currently have to maintain a mapping the thread_info structure at EL2 in order to be able to check the TIF_FOREIGN_FPSTATE flag. In order to eventually get rid of this, start with a vcpu flag that shadows the thread flag on each entry into the hypervisor. Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-11-22 16:01:39 +00:00
Marc Zyngier	8383741ab2	KVM: arm64: Get rid of host SVE tracking/saving The SVE host tracking in KVM is pretty involved. It relies on a set of flags tracking the ownership of the SVE register, as well as that of the EL0 access. It is also pretty scary: __hyp_sve_save_host() computes a thread_struct pointer and obtains a sve_state which gets directly accessed without further ado, even on nVHE. How can this even work? The answer to that is that it doesn't, and that this is mostly dead code. Closer examination shows that on executing a syscall, userspace loses its SVE state entirely. This is part of the ABI. Another thing to notice is that although the kernel provides helpers such as kernel_neon_begin()/end(), they only deal with the FP/NEON state, and not SVE. Given that you can only execute a guest as the result of a syscall, and that the kernel cannot use SVE by itself, it becomes pretty obvious that there is never any host SVE state to save, and that this code is only there to increase confusion. Get rid of the TIF_SVE tracking and host save infrastructure altogether. Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-11-22 15:30:27 +00:00
Marc Zyngier	892fd259cb	KVM: arm64: Reorder vcpu flag definitions The vcpu arch flags are in an interesting, semi random order. As I have made the mistake of reusing a flag once, let's rework this in an order that I find a bit less confusing. Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-11-22 15:25:20 +00:00
Sean Christopherson	17ed14eba2	KVM: arm64: Drop perf.c and fold its tiny bits of code into arm.c Call KVM's (un)register perf callbacks helpers directly from arm.c and delete perf.c No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20211111020738.2512932-17-seanjc@google.com	2021-11-17 14:49:11 +01:00
Sean Christopherson	e1bfc24577	KVM: Move x86's perf guest info callbacks to generic KVM Move x86's perf guest callbacks into common KVM, as they are semantically identical to arm64's callbacks (the only other such KVM callbacks). arm64 will convert to the common versions in a future patch. Implement the necessary arm64 arch hooks now to avoid having to provide stubs or a temporary #define (from x86) to avoid arm64 compilation errors when CONFIG_GUEST_PERF_EVENTS=y. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211111020738.2512932-13-seanjc@google.com	2021-11-17 14:49:10 +01:00
Sean Christopherson	2934e3d093	perf: Stop pretending that perf can handle multiple guest callbacks Drop the 'int' return value from the perf (un)register callbacks helpers and stop pretending perf can support multiple callbacks. The 'int' returns are not future proofing anything as none of the callers take action on an error. It's also not obvious that there will ever be co-tenant hypervisors, and if there are, that allowing multiple callbacks to be registered is desirable or even correct. Opportunistically rename callbacks=>cbs in the affected declarations to match their definitions. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20211111020738.2512932-5-seanjc@google.com	2021-11-17 14:49:07 +01:00
Paolo Bonzini	84886c262e	KVM/arm64 fixes for 5.16, take #1 - Fix the host S2 finalization by solely iterating over the memblocks instead of the whole IPA space - Tighten the return value of kvm_vcpu_preferred_target() now that 32bit support is long gone - Make sure the extraction of ESR_ELx.EC is limited to the architected bits - Comment fixups -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmGNhFYPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpDA18P/RUmhBWJhx/KE4atccFK6Iy+L4q0vdZehBJN v/lqMQzqj2DOxClFWTYrLt/GJGjxy9IQorW1F2FTLGWVUp4LgwUPtWJed7qmXYD9 loPq69eVc/c8uwtYRFEsfYSbIGHmwJr6WOO0oA7z8Q0HNusO7mLbSywmo4Uz8eYN hQHIBZ19WLCoNOz1SctxNfOj4RDav0ybKaR6XZbvuOHOd3wa8zgbNp9e495/ovtG agWhicFrBOccU7/mZhcGwP4wMI/A/lbxY7KicjrGFHjTCnCOCcFjB+smhhJsjQOr 4KCK4cBvFkXjTRgh764K+KftT+PPQU82ThOoI2S9ZP5v7KdZON0813u/NlXFW/eI fiJegq568AuQvcJ3RJtNSBV3pgJYVVnG1wlrLNhFmm9F9Wzq9BTw62rgHEV2+k/7 BTGzdHSluCKwxjX7CckEk8ZUsJHRdyApFJBwbPEZhngNg7ZVRqW9oGhO1Oxf8auP 6DlTlCkT436/54YtWiN/Ipp9CptMsPqbX2pQp1kKIgtypmZLjTEGZcYflVu3v2lr IJYizOfgjJcosjgF4K7hEJLnviow1u7u9S6Odl0XMJ0SJwD+MPkv5KAZLTdkmRa3 0bjayOtaOn1S3R3b3jXVn8v2FXmHAMTmkvboaSEc2bQ9EuTODgBxwXlm8E6TBdoY Btbmm7DU =IxVi -----END PGP SIGNATURE----- Merge tag 'kvmarm-fixes-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master KVM/arm64 fixes for 5.16, take #1 - Fix the host S2 finalization by solely iterating over the memblocks instead of the whole IPA space - Tighten the return value of kvm_vcpu_preferred_target() now that 32bit support is long gone - Make sure the extraction of ESR_ELx.EC is limited to the architected bits - Comment fixups	2021-11-12 16:01:55 -05:00
YueHaibing	08e873cb70	KVM: arm64: Change the return type of kvm_vcpu_preferred_target() kvm_vcpu_preferred_target() always return 0 because kvm_target_cpu() never returns a negative error code. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211105011500.16280-1-yuehaibing@huawei.com	2021-11-08 10:48:47 +00:00
Paolo Bonzini	4e33868433	KVM/arm64 updates for Linux 5.16 - More progress on the protected VM front, now with the full fixed feature set as well as the limitation of some hypercalls after initialisation. - Cleanup of the RAZ/WI sysreg handling, which was pointlessly complicated - Fixes for the vgic placement in the IPA space, together with a bunch of selftests - More memcg accounting of the memory allocated on behalf of a guest - Timer and vgic selftests - Workarounds for the Apple M1 broken vgic implementation - KConfig cleanups - New kvmarm.mode=none option, for those who really dislike us -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmF7u5YPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpD6w8QAIKDLJCTqkxv5Vh4ZSmtXxg4gTZMBlg8oSQ8 sVL639aqBvFe3A6Vmz6IwBm+NT7Sm1zxkuH9qHzVR1gmXq0oLYNrIuyrzRW8PvqO hIkSRRoVsf03755TmkxwR7/2jAFxb6FhEVAy6VWdQyI44orihIPvMp8aTIq+jvU+ XoNGb/rPf9HpSUtvuaHYvZhSZBhoi5dRnkr33R1+VR69n7Axs8lm905xcl6Pt0a0 QqYZWQvFu/BXPyNflG7LUsegRF/iiV2vNTbNNowkzlV5suqxBpJAp6ApDL/gWrHv ya/6cMqicSjBIkWnawhXY98w6/5xfzK4IV/zc00FNWOlUdVP89Thqrgc8EkigS9R BGcxFFqj41snr+ensSBBIkNtV+dBX52H3rUE0F9seiTXm8QWI86JobdeNadT8tUP TXdOeCUcA+cp4Ngln18lsbOEaBkPA5H1po1nUFPHbKnVOxnqXScB7E/xF6rAbryV m+Z+oidU7MyS/Ev/Da0ww/XFx7cs2ez9EgeQvjcdFAvUMqS6kcXEExvgGYlm+KRQ GBMKPLCNHKdflMANoSpol7MZUmPJ45XoWKW1rntj2r9X+oJW2Z2hEx32xrWDJdqK ixnbjog5kNZb0CjLGsUC90lo2hpRJecaLhAjgTLYaNC1QxGPrt92eat6gnwuMTBc mpADqi7w =qBAO -----END PGP SIGNATURE----- Merge tag 'kvmarm-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for Linux 5.16 - More progress on the protected VM front, now with the full fixed feature set as well as the limitation of some hypercalls after initialisation. - Cleanup of the RAZ/WI sysreg handling, which was pointlessly complicated - Fixes for the vgic placement in the IPA space, together with a bunch of selftests - More memcg accounting of the memory allocated on behalf of a guest - Timer and vgic selftests - Workarounds for the Apple M1 broken vgic implementation - KConfig cleanups - New kvmarm.mode=none option, for those who really dislike us	2021-10-31 02:28:48 -04:00
Marc Zyngier	be08c3cf3c	Merge branch kvm-arm64/pkvm/fixed-features into kvmarm-master/next * kvm-arm64/pkvm/fixed-features: (22 commits) : . : Add the pKVM fixed feature that allows a bunch of exceptions : to either be forbidden or be easily handled at EL2. : . KVM: arm64: pkvm: Give priority to standard traps over pvm handling KVM: arm64: pkvm: Pass vpcu instead of kvm to kvm_get_exit_handler_array() KVM: arm64: pkvm: Move kvm_handle_pvm_restricted around KVM: arm64: pkvm: Consolidate include files KVM: arm64: pkvm: Preserve pending SError on exit from AArch32 KVM: arm64: pkvm: Handle GICv3 traps as required KVM: arm64: pkvm: Drop sysregs that should never be routed to the host KVM: arm64: pkvm: Drop AArch32-specific registers KVM: arm64: pkvm: Make the ERR/ERX*_EL1 registers RAZ/WI KVM: arm64: pkvm: Use a single function to expose all id-regs KVM: arm64: Fix early exit ptrauth handling KVM: arm64: Handle protected guests at 32 bits KVM: arm64: Trap access to pVM restricted features KVM: arm64: Move sanitized copies of CPU features KVM: arm64: Initialize trap registers for protected VMs KVM: arm64: Add handlers for protected VM System Registers KVM: arm64: Simplify masking out MTE in feature id reg KVM: arm64: Add missing field descriptor for MDCR_EL2 KVM: arm64: Pass struct kvm to per-EC handlers KVM: arm64: Move early handlers to per-EC handlers ... Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-10-18 17:20:50 +01:00
Fuad Tabba	2a0c343386	KVM: arm64: Initialize trap registers for protected VMs Protected VMs have more restricted features that need to be trapped. Moreover, the host should not be trusted to set the appropriate trapping registers and their values. Initialize the trapping registers, i.e., hcr_el2, mdcr_el2, and cptr_el2 at EL2 for protected guests, based on the values of the guest's feature id registers. No functional change intended as trap handlers introduced in the previous patch are still not hooked in to the guest exit handlers. Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211010145636.1950948-9-tabba@google.com	2021-10-11 14:57:29 +01:00
Marc Zyngier	b6a68b97af	KVM: arm64: Allow KVM to be disabled from the command line Although KVM can be compiled out of the kernel, it cannot be disabled at runtime. Allow this possibility by introducing a new mode that will prevent KVM from initialising. This is useful in the (limited) circumstances where you don't want KVM to be available (what is wrong with you?), or when you want to install another hypervisor instead (good luck with that). Reviewed-by: David Brazdil <dbrazdil@google.com> Acked-by: Will Deacon <will@kernel.org> Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Andrew Scull <ascull@google.com> Link: https://lore.kernel.org/r/20211001170553.3062988-1-maz@kernel.org	2021-10-11 09:48:47 +01:00
Juergen Gross	78b497f2e6	kvm: use kvfree() in kvm_arch_free_vm() By switching from kfree() to kvfree() in kvm_arch_free_vm() Arm64 can use the common variant. This can be accomplished by adding another macro __KVM_HAVE_ARCH_VM_FREE, which will be used only by x86 for now. Further simplification can be achieved by adding __kvm_arch_free_vm() doing the common part. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Juergen Gross <jgross@suse.com> Message-Id: <20210903130808.30142-5-jgross@suse.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-01 03:44:57 -04:00
Marc Zyngier	7c7b363d62	Merge branch kvm-arm64/pkvm-fixed-features-prologue into kvmarm-master/next * kvm-arm64/pkvm-fixed-features-prologue: : Rework a bunch of common infrastructure as a prologue : to Fuad Tabba's protected VM fixed feature series. KVM: arm64: Upgrade trace_kvm_arm_set_dreg32() to 64bit KVM: arm64: Add config register bit definitions KVM: arm64: Add feature register flag definitions KVM: arm64: Track value of cptr_el2 in struct kvm_vcpu_arch KVM: arm64: Keep mdcr_el2's value as set by __init_el2_debug KVM: arm64: Restore mdcr_el2 from vcpu KVM: arm64: Refactor sys_regs.h,c for nVHE reuse KVM: arm64: Fix names of config register fields KVM: arm64: MDCR_EL2 is a 64-bit register KVM: arm64: Remove trailing whitespace in comment KVM: arm64: placeholder to check if VM is protected Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-08-20 12:23:53 +01:00
Marc Zyngier	ca3385a507	Merge branch kvm-arm64/generic-entry into kvmarm-master/next Switch KVM/arm64 to the generic entry code, courtesy of Oliver Upton * kvm-arm64/generic-entry: KVM: arm64: Use generic KVM xfer to guest work function entry: KVM: Allow use of generic KVM entry w/o full generic support KVM: arm64: Record number of signal exits as a vCPU stat Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-08-20 12:23:09 +01:00
Marc Zyngier	3ce5db8a59	Merge branch kvm-arm64/misc-5.15 into kvmarm-master/next * kvm-arm64/misc-5.15: : Misc improvements for 5.15: : : - Account the number of VMID-wide TLB invalidations as : remote TLB flushes : - Fix comments in the VGIC code : - Cleanup the PMU IMPDEF identification : - Streamline the TGRAN2 usage : - Avoid advertising a 52bit IPA range for non-64KB configs : - Avoid spurious signalling when a HW-mapped interrupt is in the : A+P state on entry, and in the P state on exit, but that the : physical line is not pending anymore. : - Bunch of minor cleanups KVM: arm64: vgic: Resample HW pending state on deactivation KVM: arm64: vgic: Drop WARN from vgic_get_irq KVM: arm64: Drop unused REQUIRES_VIRT KVM: arm64: Drop check_kvm_target_cpu() based percpu probe KVM: arm64: Drop init_common_resources() KVM: arm64: Use ARM64_MIN_PARANGE_BITS as the minimum supported IPA arm64/mm: Add remaining ID_AA64MMFR0_PARANGE_ macros KVM: arm64: Restrict IPA size to maximum 48 bits on 4K and 16K page size arm64/mm: Define ID_AA64MMFR0_TGRAN_2_SHIFT KVM: arm64: perf: Replace '0xf' instances with ID_AA64DFR0_PMUVER_IMP_DEF KVM: arm64: Fix comments related to GICv2 PMR reporting KVM: arm64: Count VMID-wide TLB invalidations arm64/kexec: Test page size support with new TGRAN range values Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-08-20 12:14:56 +01:00
Fuad Tabba	cd496228fd	KVM: arm64: Track value of cptr_el2 in struct kvm_vcpu_arch Track the baseline guest value for cptr_el2 in struct kvm_vcpu_arch, similar to the other registers that control traps. Use this value when setting cptr_el2 for the guest. Currently this value is unchanged (CPTR_EL2_DEFAULT), but future patches will set trapping bits based on features supported for the guest. No functional change intended. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210817081134.2918285-9-tabba@google.com	2021-08-20 11:12:17 +01:00
Fuad Tabba	1460b4b25f	KVM: arm64: Restore mdcr_el2 from vcpu On deactivating traps, restore the value of mdcr_el2 from the newly created and preserved host value vcpu context, rather than directly reading the hardware register. Up until and including this patch the two values are the same, i.e., the hardware register and the vcpu one. A future patch will be changing the value of mdcr_el2 on activating traps, and this ensures that its value will be restored. No functional change intended. Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210817081134.2918285-7-tabba@google.com	2021-08-20 11:12:17 +01:00
Fuad Tabba	d6c850dd6c	KVM: arm64: MDCR_EL2 is a 64-bit register Fix the places in KVM that treat MDCR_EL2 as a 32-bit register. More recent features (e.g., FEAT_SPEv1p2) use bits above 31. No functional change intended. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210817081134.2918285-4-tabba@google.com	2021-08-20 11:12:17 +01:00
Fuad Tabba	2ea7f65580	KVM: arm64: placeholder to check if VM is protected Add a function to check whether a VM is protected (under pKVM). Since the creation of protected VMs isn't enabled yet, this is a placeholder that always returns false. The intention is for this to become a check for protected VMs in the future (see Will's RFC). No functional change intended. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/kvmarm/20210603183347.1695-1-will@kernel.org/ Link: https://lore.kernel.org/r/20210817081134.2918285-2-tabba@google.com	2021-08-20 11:12:05 +01:00
Oliver Upton	fe5161d2c3	KVM: arm64: Record number of signal exits as a vCPU stat Most other architectures that implement KVM record a statistic indicating the number of times a vCPU has exited due to a pending signal. Add support for that stat to arm64. Reviewed-by: Jing Zhang <jingzhangos@google.com> Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210802192809.1851010-2-oupton@google.com	2021-08-19 11:19:41 +01:00
Anshuman Khandual	6b7982fefc	KVM: arm64: Drop check_kvm_target_cpu() based percpu probe kvm_target_cpu() never returns a negative error code, so check_kvm_target() would never have 'ret' filled with a negative error code. Hence the percpu probe via check_kvm_target_cpu() does not make sense as its never going to find an unsupported CPU, forcing kvm_arch_init() to exit early. Hence lets just drop this percpu probe (and also check_kvm_target_cpu()) altogether. While here, this also changes kvm_target_cpu() return type to a u32, making it explicit that an error code will not be returned from this function. Cc: Marc Zyngier <maz@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: kvmarm@lists.cs.columbia.edu Cc: linux-kernel@vger.kernel.org Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/1628744994-16623-5-git-send-email-anshuman.khandual@arm.com	2021-08-18 09:26:07 +01:00
Marc Zyngier	7a3ba3095a	KVM: arm64: Remove PMSWINC_EL0 shadow register We keep an entry for the PMSWINC_EL0 register in the vcpu structure, while never writing anything there outside of reset. Given that the register is defined as write-only, that we always trap when this register is accessed, there is little point in saving anything anyway. Get rid of the entry, and save a mighty 8 bytes per vcpu structure. We still need to keep it exposed to userspace in order to preserve backward compatibility with previously saved VMs. Since userspace cannot expect any effect of writing to PMSWINC_EL0, treat the register as RAZ/WI for the purpose of userspace access. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210719123902.1493805-5-maz@kernel.org	2021-08-02 14:26:34 +01:00
Paolo Bonzini	b8917b4ae4	KVM/arm64 updates for v5.14. - Add MTE support in guests, complete with tag save/restore interface - Reduce the impact of CMOs by moving them in the page-table code - Allow device block mappings at stage-2 - Reduce the footprint of the vmemmap in protected mode - Support the vGIC on dumb systems such as the Apple M1 - Add selftest infrastructure to support multiple configuration and apply that to PMU/non-PMU setups - Add selftests for the debug architecture - The usual crop of PMU fixes -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmDV2bEPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpDEr8P/ivwROx5NwGcHGmU5RfUCT3aFqhtVHHwD/lu jPcgoO61kz9TelOu6QRaVuK+mVHxcq3iP4R8nPq/QCkUlEXTmK2xkyhXhGXSYpH4 6jM8+BbC3eG7iAxx6H0UM4JTl4Riwat6ZZtXpWEWs9TKqOHOQYFpMkxSttwVZ1CZ SjbtFvXLEdzKn6PzUWnKdBNMV/mHsdAtohZit9oJOc4ttc8072XxETQ4TFQ+MSvA j9zY9QPmWzgcZnotqRRu9sbTGO2vxtXuUtY3sjdD8+C9OgSe9qvpnNjymcmfwaMu 1fBkfh65oaO4ItJBdGOUOoEcFqwN5imPiI7CB/O+ZYkO9sBCuTUPSQwPkyiwXb9r bUkTaQw2nZiNWsqR1x07fQ2sGYbMp5mnmgmqiV4MUWkLmFp9LZATCWYTTn24cBNS 6SjVP6/8S0r3EhLnYjH0Pn1we5PooU1EF6RlCAd3ewYoo+9fPnwjNYwIWH5i5wB7 +tnei44NACAw9cfbos+BYQQ/dY15OSFzLzIMomlabB7OpXOdDg3H6tJnPbFwWwXb 9nF8XdHqxeDVVVrDCAx1BSodSXm9xqgnQM2RDGTUnpVcAfqAr3MXX6VsyKQDzj8T QXF9qOVCBAABv6BXAvSQ6mvMJZDUVbUPEPhf7kXzF46JsRd6A7wWoU/OnMGHQ/w7 wjvH8HVy =fWBV -----END PGP SIGNATURE----- Merge tag 'kvmarm-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for v5.14. - Add MTE support in guests, complete with tag save/restore interface - Reduce the impact of CMOs by moving them in the page-table code - Allow device block mappings at stage-2 - Reduce the footprint of the vmemmap in protected mode - Support the vGIC on dumb systems such as the Apple M1 - Add selftest infrastructure to support multiple configuration and apply that to PMU/non-PMU setups - Add selftests for the debug architecture - The usual crop of PMU fixes	2021-06-25 11:24:24 -04:00
Jing Zhang	0193cc908b	KVM: stats: Separate generic stats from architecture specific ones Generic KVM stats are those collected in architecture independent code or those supported by all architectures; put all generic statistics in a separate structure. This ensures that they are defined the same way in the statistics API which is being added, removing duplication among different architectures in the declaration of the descriptors. No functional change intended. Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210618222709.1858088-2-jingzhangos@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-06-24 11:47:56 -04:00
Marc Zyngier	9f03db6673	Merge branch kvm-arm64/mmu/mte into kvmarm-master/next KVM/arm64 support for MTE, courtesy of Steven Price. It allows the guest to use memory tagging, and offers a new userspace API to save/restore the tags. * kvm-arm64/mmu/mte: KVM: arm64: Document MTE capability and ioctl KVM: arm64: Add ioctl to fetch/store tags in a guest KVM: arm64: Expose KVM_ARM_CAP_MTE KVM: arm64: Save/restore MTE registers KVM: arm64: Introduce MTE VM feature arm64: mte: Sync tags for pages where PTE is untagged Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-06-22 15:09:34 +01:00
Steven Price	f0376edb1d	KVM: arm64: Add ioctl to fetch/store tags in a guest The VMM may not wish to have it's own mapping of guest memory mapped with PROT_MTE because this causes problems if the VMM has tag checking enabled (the guest controls the tags in physical RAM and it's unlikely the tags are correct for the VMM). Instead add a new ioctl which allows the VMM to easily read/write the tags from guest memory, allowing the VMM's mapping to be non-PROT_MTE while the VMM can still read/write the tags for the purpose of migration. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210621111716.37157-6-steven.price@arm.com	2021-06-22 14:08:06 +01:00
Steven Price	e1f358b504	KVM: arm64: Save/restore MTE registers Define the new system registers that MTE introduces and context switch them. The MTE feature is still hidden from the ID register as it isn't supported in a VM yet. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210621111716.37157-4-steven.price@arm.com	2021-06-22 14:08:05 +01:00
Steven Price	ea7fc1bb1c	KVM: arm64: Introduce MTE VM feature Add a new VM feature 'KVM_ARM_CAP_MTE' which enables memory tagging for a VM. This will expose the feature to the guest and automatically tag memory pages touched by the VM as PG_mte_tagged (and clear the tag storage) to ensure that the guest cannot see stale tags, and so that the tags are correctly saved/restored across swap. Actually exposing the new capability to user space happens in a later patch. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> [maz: move VM_SHARED sampling into the critical section] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210621111716.37157-3-steven.price@arm.com	2021-06-22 14:08:05 +01:00
Marc Zyngier	d0c94c4979	KVM: arm64: Restore PMU configuration on first run Restoring a guest with an active virtual PMU results in no perf counters being instanciated on the host side. Not quite what you'd expect from a restore. In order to fix this, force a writeback of PMCR_EL0 on the first run of a vcpu (using a new request so that it happens once the vcpu has been loaded). This will in turn create all the host-side counters that were missing. Reported-by: Jinank Jain <jinankj@amazon.de> Tested-by: Jinank Jain <jinankj@amazon.de> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/87wnrbylxv.wl-maz@kernel.org Link: https://lore.kernel.org/r/b53dfcf9bbc4db7f96154b1cd5188d72b9766358.camel@amazon.de	2021-06-18 14:18:37 +01:00
Paolo Bonzini	e3cb6fa0e2	KVM: switch per-VM stats to u64 Make them the same type as vCPU stats. There is no reason to limit the counters to unsigned long. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-06-17 14:25:27 -04:00
Paolo Bonzini	c4f71901d5	KVM/arm64 updates for Linux 5.13 New features: - Stage-2 isolation for the host kernel when running in protected mode - Guest SVE support when running in nVHE mode - Force W^X hypervisor mappings in nVHE mode - ITS save/restore for guests using direct injection with GICv4.1 - nVHE panics now produce readable backtraces - Guest support for PTP using the ptp_kvm driver - Performance improvements in the S2 fault handler - Alexandru is now a reviewer (not really a new feature...) Fixes: - Proper emulation of the GICR_TYPER register - Handle the complete set of relocation in the nVHE EL2 object - Get rid of the oprofile dependency in the PMU code (and of the oprofile body parts at the same time) - Debug and SPE fixes - Fix vcpu reset -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmCCpuAPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpD2G8QALWQYeBggKnNmAJfuihzZ2WariBmgcENs2R2 qNZ/Py6dIF+b69P68nmgrEV1x2Kp35cPJbBwXnnrS4FCB5tk0b8YMaj00QbiRIYV UXbPxQTmYO1KbevpoEcw8NmR4bZJ/hRYPuzcQG7CCMKIZw0zj2cMcBofzQpTOAp/ CgItdcv7at3iwamQatfU9vUmC0nDdnjdIwSxTAJOYMVV1ENwtnYSNgZVo4XLTg7n xR/5Qx27PKBJw7GyTRAIIxKAzNXG2tDL+GVIHe4AnRp3z3La8sr6PJf7nz9MCmco ISgeY7EGQINzmm4LahpnV+2xwwxOWo8QotxRFGNuRTOBazfARyAbp97yJ6eXJUpa j0qlg3xK9neyIIn9BQKkKx4sY9V45yqkuVDsK6odmqPq3EE01IMTRh1N/XQi+sTF iGrlM3ZW4AjlT5zgtT9US/FRXeDKoYuqVCObJeXZdm3sJSwEqTAs0JScnc0YTsh7 m30CODnomfR2y5X6GoaubbQ0wcZ2I20K1qtIm+2F6yzD5P1/3Yi8HbXMxsSWyYWZ 1ldoSa+ZUQlzV9Ot0S3iJ4PkphLKmmO96VlxE2+B5gQG50PZkLzsr8bVyYOuJC8p T83xT9xd07cy+FcGgF9veZL99Y6BLHMa6ZwFUolYNbzJxqrmqyR1aiJMEBIcX+aP ACeKW1w5 =fpey -----END PGP SIGNATURE----- Merge tag 'kvmarm-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for Linux 5.13 New features: - Stage-2 isolation for the host kernel when running in protected mode - Guest SVE support when running in nVHE mode - Force W^X hypervisor mappings in nVHE mode - ITS save/restore for guests using direct injection with GICv4.1 - nVHE panics now produce readable backtraces - Guest support for PTP using the ptp_kvm driver - Performance improvements in the S2 fault handler - Alexandru is now a reviewer (not really a new feature...) Fixes: - Proper emulation of the GICR_TYPER register - Handle the complete set of relocation in the nVHE EL2 object - Get rid of the oprofile dependency in the PMU code (and of the oprofile body parts at the same time) - Debug and SPE fixes - Fix vcpu reset	2021-04-23 07:41:17 -04:00
Sean Christopherson	b4c5936c47	KVM: Kill off the old hva-based MMU notifier callbacks Yank out the hva-based MMU notifier APIs now that all architectures that use the notifiers have moved to the gfn-based APIs. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210402005658.3024832-7-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:31:08 -04:00
Sean Christopherson	cd4c718352	KVM: arm64: Convert to the gfn-based MMU notifier callbacks Move arm64 to the gfn-base MMU notifier APIs, which do the hva->gfn lookup in common code. No meaningful functional change intended, though the exact order of operations is slightly different since the memslot lookups occur before calling into arch code. Reviewed-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210402005658.3024832-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:31:07 -04:00
Maxim Levitsky	fa18aca927	KVM: aarch64: implement KVM_CAP_SET_GUEST_DEBUG2 Move KVM_GUESTDBG_VALID_MASK to kvm_host.h and use it to return the value of this capability. Compile tested only. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210401135451.1004564-5-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:31:03 -04:00
Sean Christopherson	5f7c292b89	KVM: Move prototypes for MMU notifier callbacks to generic code Move the prototypes for the MMU notifier callbacks out of arch code and into common code. There is no benefit to having each arch replicate the prototypes since any deviation from the invocation in common code will explode. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210326021957.1424875-9-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:30:55 -04:00
Marc Zyngier	3d63ef4d52	Merge branch 'kvm-arm64/memslot-fixes' into kvmarm-master/next Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-04-13 15:35:40 +01:00
Marc Zyngier	ac5ce2456e	Merge branch 'kvm-arm64/host-stage2' into kvmarm-master/next Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-04-13 15:35:09 +01:00
Marc Zyngier	fbb31e5f3a	Merge branch 'kvm-arm64/debug-5.13' into kvmarm-master/next Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-04-13 15:34:15 +01:00
Alexandru Elisei	263d6287da	KVM: arm64: Initialize VCPU mdcr_el2 before loading it When a VCPU is created, the kvm_vcpu struct is initialized to zero in kvm_vm_ioctl_create_vcpu(). On VHE systems, the first time vcpu.arch.mdcr_el2 is loaded on hardware is in vcpu_load(), before it is set to a sensible value in kvm_arm_setup_debug() later in the run loop. The result is that KVM executes for a short time with MDCR_EL2 set to zero. This has several unintended consequences: * Setting MDCR_EL2.HPMN to 0 is constrained unpredictable according to ARM DDI 0487G.a, page D13-3820. The behavior specified by the architecture in this case is for the PE to behave as if MDCR_EL2.HPMN is set to a value less than or equal to PMCR_EL0.N, which means that an unknown number of counters are now disabled by MDCR_EL2.HPME, which is zero. * The host configuration for the other debug features controlled by MDCR_EL2 is temporarily lost. This has been harmless so far, as Linux doesn't use the other fields, but that might change in the future. Let's avoid both issues by initializing the VCPU's mdcr_el2 field in kvm_vcpu_vcpu_first_run_init(), thus making sure that the MDCR_EL2 register has a consistent value after each vcpu_load(). Fixes: `d5a21bcc29` ("KVM: arm64: Move common VHE/non-VHE trap config in separate functions") Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210407144857.199746-3-alexandru.elisei@arm.com	2021-04-07 16:40:03 +01:00
Gavin Shan	eab6214847	KVM: arm64: Hide kvm_mmu_wp_memory_region() We needn't expose the function as it's only used by mmu.c since it was introduced by commit `c64735554c` ("KVM: arm: Add initial dirty page locking support"). Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210316041126.81860-2-gshan@redhat.com	2021-04-07 14:33:22 +01:00
Suzuki K Poulose	a1319260bf	arm64: KVM: Enable access to TRBE support for host For a nvhe host, the EL2 must allow the EL1&0 translation regime for TraceBuffer (MDCR_EL2.E2TB == 0b11). This must be saved/restored over a trip to the guest. Also, before entering the guest, we must flush any trace data if the TRBE was enabled. And we must prohibit the generation of trace while we are in EL1 by clearing the TRFCR_EL1. For vhe, the EL2 must prevent the EL1 access to the Trace Buffer. The MDCR_EL2 bit definitions for TRBE are available here : https://developer.arm.com/documentation/ddi0601/2020-12/AArch64-Registers/ Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210405164307.1720226-8-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>	2021-04-06 16:05:28 -06:00
Suzuki K Poulose	d2602bb4f5	KVM: arm64: Move SPE availability check to VCPU load At the moment, we check the availability of SPE on the given CPU (i.e, SPE is implemented and is allowed at the host) during every guest entry. This can be optimized a bit by moving the check to vcpu_load time and recording the availability of the feature on the current CPU via a new flag. This will also be useful for adding the TRBE support. Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Alexandru Elisei <Alexandru.Elisei@arm.com> Cc: James Morse <james.morse@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210405164307.1720226-7-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>	2021-04-06 16:05:20 -06:00
Marc Zyngier	7c4199375a	KVM: arm64: Drop the CPU_FTR_REG_HYP_COPY infrastructure Now that the read_ctr macro has been specialised for nVHE, the whole CPU_FTR_REG_HYP_COPY infrastrcture looks completely overengineered. Simplify it by populating the two u64 quantities (MMFR0 and 1) that the hypervisor need. Reviewed-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-03-25 11:01:03 +00:00
Quentin Perret	cfb1a98de7	KVM: arm64: Use kvm_arch in kvm_s2_mmu In order to make use of the stage 2 pgtable code for the host stage 2, change kvm_s2_mmu to use a kvm_arch pointer in lieu of the kvm pointer, as the host will have the former but not the latter. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-21-qperret@google.com	2021-03-19 12:01:21 +00:00
Quentin Perret	f320bc742b	KVM: arm64: Prepare the creation of s1 mappings at EL2 When memory protection is enabled, the EL2 code needs the ability to create and manage its own page-table. To do so, introduce a new set of hypercalls to bootstrap a memory management system at EL2. This leads to the following boot flow in nVHE Protected mode: 1. the host allocates memory for the hypervisor very early on, using the memblock API; 2. the host creates a set of stage 1 page-table for EL2, installs the EL2 vectors, and issues the __pkvm_init hypercall; 3. during __pkvm_init, the hypervisor re-creates its stage 1 page-table and stores it in the memory pool provided by the host; 4. the hypervisor then extends its stage 1 mappings to include a vmemmap in the EL2 VA space, hence allowing to use the buddy allocator introduced in a previous patch; 5. the hypervisor jumps back in the idmap page, switches from the host-provided page-table to the new one, and wraps up its initialization by enabling the new allocator, before returning to the host. 6. the host can free the now unused page-table created for EL2, and will now need to issue hypercalls to make changes to the EL2 stage 1 mappings instead of modifying them directly. Note that for the sake of simplifying the review, this patch focuses on the hypervisor side of things. In other words, this only implements the new hypercalls, but does not make use of them from the host yet. The host-side changes will follow in a subsequent patch. Credits to Will for __pkvm_init_switch_pgd. Acked-by: Will Deacon <will@kernel.org> Co-authored-by: Will Deacon <will@kernel.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-18-qperret@google.com	2021-03-19 12:01:21 +00:00
Quentin Perret	7a440cc783	KVM: arm64: Enable access to sanitized CPU features at EL2 Introduce the infrastructure in KVM enabling to copy CPU feature registers into EL2-owned data-structures, to allow reading sanitised values directly at EL2 in nVHE. Given that only a subset of these features are being read by the hypervisor, the ones that need to be copied are to be listed under <asm/kvm_cpufeature.h> together with the name of the nVHE variable that will hold the copy. This introduces only the infrastructure enabling this copy. The first users will follow shortly. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-14-qperret@google.com	2021-03-19 12:01:20 +00:00
Quentin Perret	40a50853d3	KVM: arm64: Make kvm_call_hyp() a function call at Hyp kvm_call_hyp() has some logic to issue a function call or a hypercall depending on the EL at which the kernel is running. However, all the code compiled under __KVM_NVHE_HYPERVISOR__ is guaranteed to only run at EL2 which allows us to simplify. Add ifdefery to kvm_host.h to simplify kvm_call_hyp() in .hyp.text. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-9-qperret@google.com	2021-03-19 12:01:20 +00:00
Daniel Kiss	6e94095c55	KVM: arm64: Enable SVE support for nVHE Now that KVM is equipped to deal with SVE on nVHE, remove the code preventing it from being used as well as the bits of documentation that were mentioning the incompatibility. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Daniel Kiss <daniel.kiss@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-03-18 14:23:19 +00:00
Marc Zyngier	468f3477ef	KVM: arm64: Introduce vcpu_sve_vq() helper The KVM code contains a number of "sve_vq_from_vl(vcpu->arch.sve_max_vl)" instances, and we are about to add more. Introduce vcpu_sve_vq() as a shorthand for this expression. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-03-18 11:24:10 +00:00
Marc Zyngier	985d3a1bea	KVM: arm64: Let vcpu_sve_pffr() handle HYP VAs The vcpu_sve_pffr() returns a pointer, which can be an interesting thing to do on nVHE. Wrap the pointer with kern_hyp_va(), and take this opportunity to remove the unnecessary casts (sve_state being a void *). Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-03-18 11:24:00 +00:00
Paolo Bonzini	8c6e67bec3	KVM/arm64 updates for Linux 5.12 - Make the nVHE EL2 object relocatable, resulting in much more maintainable code - Handle concurrent translation faults hitting the same page in a more elegant way - Support for the standard TRNG hypervisor call - A bunch of small PMU/Debug fixes - Allow the disabling of symbol export from assembly code - Simplification of the early init hypercall handling -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmAmjqEPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpDoUEQAIrJ7YF4v4gz06a0HG9+b6fbmykHyxlG7jfm trvctfaiKzOybKoY5odPpNFzhbYOOdXXqYipyTHGwBYtGSy9G/9SjMKSUrfln2Ni lr1wBqapr9TE+SVKoR8pWWuZxGGbHVa7brNuMbMsMi1wwAsM2/n70H9PXrdq3QiK Ge1DWLso2oEfhtTwqNKa4dwB2MHjBhBFhhq+Nq5pslm6mmxJaYqz7pyBmw/C+2cc oU/6kpAa1yPAauptWXtYXJYOMHihxgEa1IdK3Gl0hUyFyu96xVkwH/KFsj+bRs23 QGGCSdy4313hzaoGaSOTK22R98Aeg0wI9a6tcCBvVVjTAztnlu1FPtUZr8e/F7uc +r8xVJUJFiywt3Zktf/D7YDK9LuMMqFnj0BkI4U9nIBY59XZRNhENsBCmjru5lnL iXa5cuta03H4emfssIChLpgn0XHFas6t5dFXBPGbXyw0qsQchTw98iQX9LVxefUK rOUGPIN4nE9ESRIZe0SPlAVeCtNP8cLH7+0YG9MJ1QeDVYaUsnvy9Ln/ox+514mR 5y2KJ6y7xnLB136SKCzPDDloYtz7BDiJq6a/RPiXKGheKoxy+N+BSe58yWCqFZYE Fx/cGUr7oSg39U7gCboog6BDp5e2CXBfbRllg6P47bZFfdPNwzNEzHvk49VltMxx Rl2W05bk =6EwV -----END PGP SIGNATURE----- Merge tag 'kvmarm-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for Linux 5.12 - Make the nVHE EL2 object relocatable, resulting in much more maintainable code - Handle concurrent translation faults hitting the same page in a more elegant way - Support for the standard TRNG hypervisor call - A bunch of small PMU/Debug fixes - Allow the disabling of symbol export from assembly code - Simplification of the early init hypercall handling	2021-02-12 11:23:44 -05:00
Vitaly Kuznetsov	4fc096a99e	KVM: Raise the maximum number of user memslots Current KVM_USER_MEM_SLOTS limits are arch specific (512 on Power, 509 on x86, 32 on s390, 16 on MIPS) but they don't really need to be. Memory slots are allocated dynamically in KVM when added so the only real limitation is 'id_to_index' array which is 'short'. We don't have any other KVM_MEM_SLOTS_NUM/KVM_USER_MEM_SLOTS-sized statically defined structures. Low KVM_USER_MEM_SLOTS can be a limiting factor for some configurations. In particular, when QEMU tries to start a Windows guest with Hyper-V SynIC enabled and e.g. 256 vCPUs the limit is hit as SynIC requires two pages per vCPU and the guest is free to pick any GFN for each of them, this fragments memslots as QEMU wants to have a separate memslot for each of these pages (which are supposed to act as 'overlay' pages). Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210127175731.2020089-3-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-02-09 08:17:08 -05:00
Ard Biesheuvel	a8e190cdae	KVM: arm64: Implement the TRNG hypervisor call Provide a hypervisor implementation of the ARM architected TRNG firmware interface described in ARM spec DEN0098. All function IDs are implemented, including both 32-bit and 64-bit versions of the TRNG_RND service, which is the centerpiece of the API. The API is backed by the kernel's entropy pool only, to avoid guests draining more precious direct entropy sources. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> [Andre: minor fixes, drop arch_get_random() usage] Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210106103453.152275-6-andre.przywara@arm.com	2021-01-25 22:19:31 +00:00
Marc Zyngier	767c973f2e	KVM: arm64: Declutter host PSCI 0.1 handling Although there is nothing wrong with the current host PSCI relay implementation, we can clean it up and remove some of the helpers that do not improve the overall readability of the legacy PSCI 0.1 handling. Opportunity is taken to turn the bitmap into a set of booleans, and creative use of preprocessor macros make init and check more concise/readable. Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-12-22 12:56:44 +00:00
David Brazdil	61fe0c37af	KVM: arm64: Minor cleanup of hyp variables used in host Small cleanup moving declarations of hyp-exported variables to kvm_host.h and using macros to avoid having to refer to them with kvm_nvhe_sym() in host. No functional change intended. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201208142452.87237-5-dbrazdil@google.com	2020-12-22 10:49:01 +00:00
David Brazdil	ff367fe473	KVM: arm64: Prevent use of invalid PSCI v0.1 function IDs PSCI driver exposes a struct containing the PSCI v0.1 function IDs configured in the DT. However, the struct does not convey the information whether these were set from DT or contain the default value zero. This could be a problem for PSCI proxy in KVM protected mode. Extend config passed to KVM with a bit mask with individual bits set depending on whether the corresponding function pointer in psci_ops is set, eg. set bit for PSCI_CPU_SUSPEND if psci_ops.cpu_suspend != NULL. Previously config was split into multiple global variables. Put everything into a single struct for convenience. Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201208142452.87237-2-dbrazdil@google.com	2020-12-22 10:47:59 +00:00
Marc Zyngier	3a514592b6	Merge remote-tracking branch 'origin/kvm-arm64/psci-relay' into kvmarm-master/next Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-12-09 10:00:24 +00:00
David Brazdil	3eb681fba2	KVM: arm64: Add ARM64_KVM_PROTECTED_MODE CPU capability Expose the boolean value whether the system is running with KVM in protected mode (nVHE + kernel param). CPU capability was selected over a global variable to allow use in alternatives. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201202184122.26046-3-dbrazdil@google.com	2020-12-04 08:44:19 +00:00
David Brazdil	d8b369c4e3	KVM: arm64: Add kvm-arm.mode early kernel parameter Add an early parameter that allows users to select the mode of operation for KVM/arm64. For now, the only supported value is "protected". By passing this flag users opt into the hypervisor placing additional restrictions on the host kernel. These allow the hypervisor to spawn guests whose state is kept private from the host. Restrictions will include stage-2 address translation to prevent host from accessing guest memory, filtering its SMC calls, etc. Without this parameter, the default behaviour remains selecting VHE/nVHE based on hardware support and CONFIG_ARM64_VHE. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201202184122.26046-2-dbrazdil@google.com	2020-12-04 08:43:43 +00:00
Marc Zyngier	f86e54653e	Merge remote-tracking branch 'origin/kvm-arm64/csv3' into kvmarm-master/queue Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-12-03 19:12:24 +00:00
Marc Zyngier	4f1df628d4	KVM: arm64: Advertise ID_AA64PFR0_EL1.CSV3=1 if the CPUs are Meltdown-safe Cores that predate the introduction of ID_AA64PFR0_EL1.CSV3 to the ARMv8 architecture have this field set to 0, even of some of them are not affected by the vulnerability. The kernel maintains a list of unaffected cores (A53, A55 and a few others) so that it doesn't impose an expensive mitigation uncessarily. As we do for CSV2, let's expose the CSV3 property to guests that run on HW that is effectively not vulnerable. This can be reset to zero by writing to the ID register from userspace, ensuring that VMs can be migrated despite the new property being set. Reported-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-11-30 16:02:53 +00:00
Marc Zyngier	90f0e16c64	Merge branch 'kvm-arm64/misc-5.11' into kvmarm-master/next Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-11-27 19:48:24 +00:00
Will Deacon	bf118a5cb7	KVM: arm64: Remove unused __extended_idmap_trampoline() prototype __extended_idmap_trampoline() was removed a long time ago by `3421e9d88d` ("arm64: KVM: Simplify HYP init/teardown") so remove the unused function prototype. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201118194402.2892-4-will@kernel.org	2020-11-27 18:59:05 +00:00
Will Deacon	36fb4cd55f	KVM: arm64: Remove kvm_arch_vm_ioctl_check_extension() kvm_arch_vm_ioctl_check_extension() is only called from kvm_vm_ioctl_check_extension(), so we can inline it and remove the extra function. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201118194402.2892-3-will@kernel.org	2020-11-27 18:59:05 +00:00
Will Deacon	8d14797b53	KVM: arm64: Move 'struct kvm_arch_memory_slot' out of uapi/ 'struct kvm_arch_memory_slot' isn't part of the user ABI, so move it out of the uapi/ headers in case we start using it in future and accidentally back ourselves into a corner. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201118194402.2892-2-will@kernel.org	2020-11-27 18:59:05 +00:00
Marc Zyngier	6e5d8c713d	Merge branch 'kvm-arm64/pmu-undef' into kvmarm-master/next Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-11-27 11:46:47 +00:00
Marc Zyngier	14bda7a927	KVM: arm64: Add kvm_vcpu_has_pmu() helper There are a number of places where we check for the KVM_ARM_VCPU_PMU_V3 feature. Wrap this check into a new kvm_vcpu_has_pmu(), and use it at the existing locations. No functional change. Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-11-27 11:39:14 +00:00
Marc Zyngier	149f120edb	Merge branch 'kvm-arm64/copro-no-more' into kvmarm-master/next Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-11-27 11:33:16 +00:00
Marc Zyngier	37da329ed6	Merge branch 'kvm-arm64/el2-pc' into kvmarm-master/next Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-11-27 11:33:10 +00:00
Marc Zyngier	23711a5e66	KVM: arm64: Allow setting of ID_AA64PFR0_EL1.CSV2 from userspace We now expose ID_AA64PFR0_EL1.CSV2=1 to guests running on hosts that are immune to Spectre-v2, but that don't have this field set, most likely because they predate the specification. However, this prevents the migration of guests that have started on a host the doesn't fake this CSV2 setting to one that does, as KVM rejects the write to ID_AA64PFR0_EL2 on the grounds that it isn't what is already there. In order to fix this, allow userspace to set this field as long as this doesn't result in a promising more than what is already there (setting CSV2 to 0 is acceptable, but setting it to 1 when it is already set to 0 isn't). Fixes: `e1026237f9` ("KVM: arm64: Set CSV2 for guests on hardware unaffected by Spectre-v2") Reported-by: Peng Liang <liangpeng10@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201110141308.451654-2-maz@kernel.org	2020-11-12 21:22:22 +00:00
Marc Zyngier	6ac4a5ac50	KVM: arm64: Drop kvm_coproc.h kvm_coproc.h used to serve as a compatibility layer for the files shared between the 32 and 64 bit ports. Another one bites the dust... Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-11-10 11:22:52 +00:00
Marc Zyngier	5f7e02aebd	KVM: arm64: Drop legacy copro shadow register Finally remove one of the biggest 32bit legacy: the copro shadow mapping. We won't missit. Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-11-10 11:22:52 +00:00
Marc Zyngier	1da42c34d7	KVM: arm64: Map AArch32 cp14 register to AArch64 sysregs Similarly to what has been done on the cp15 front, repaint the debug registers to use their AArch64 counterparts. This results in some simplification as we can remove the 32bit-specific accessors. Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-11-10 11:22:51 +00:00
Marc Zyngier	4ff3fc316d	KVM: arm64: Move AArch32 exceptions over to AArch64 sysregs The use of the AArch32-specific accessors have always been a bit annoying on 64bit, and it is time for a change. Let's move the AArch32 exception injection over to the AArch64 encoding, which requires us to split the two halves of FAR_EL1 into DFAR and IFAR. This enables us to drop the preempt_disable() games on VHE, and to kill the last user of the vcpu_cp15() macro. Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-11-10 11:22:51 +00:00
Marc Zyngier	ca4e514774	KVM: arm64: Introduce handling of AArch32 TTBCR2 traps ARMv8.2 introduced TTBCR2, which shares TCR_EL1 with TTBCR. Gracefully handle traps to this register when HCR_EL2.TVM is set. Cc: stable@vger.kernel.org Reported-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-11-10 11:22:50 +00:00
Marc Zyngier	e650b64f1a	KVM: arm64: Add basic hooks for injecting exceptions from EL2 Add the basic infrastructure to describe injection of exceptions into a guest. So far, nothing uses this code path. Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-11-10 08:34:25 +00:00
Marc Zyngier	21c810017c	KVM: arm64: Move VHE direct sysreg accessors into kvm_host.h As we are about to need to access system registers from the HYP code based on their internal encoding, move the direct sysreg accessors to a common include file, with a VHE-specific guard. No functionnal change. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-11-10 08:34:25 +00:00
Marc Zyngier	cdb5e02ed1	KVM: arm64: Make kvm_skip_instr() and co private to HYP In an effort to remove the vcpu PC manipulations from EL1 on nVHE systems, move kvm_skip_instr() to be HYP-specific. EL1's intent to increment PC post emulation is now signalled via a flag in the vcpu structure. Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-11-10 08:34:24 +00:00
Marc Zyngier	4a1c2c7f63	KVM: arm64: Fix AArch32 handling of DBGD{CCINT,SCRext} and DBGVCR The DBGD{CCINT,SCRext} and DBGVCR register entries in the cp14 array are missing their target register, resulting in all accesses being targetted at the guard sysreg (indexed by __INVALID_SYSREG__). Point the emulation code at the actual register entries. Fixes: `bdfb4b389c` ("arm64: KVM: add trap handlers for AArch32 debug registers") Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20201029172409.2768336-1-maz@kernel.org	2020-10-29 19:49:03 +00:00
Marc Zyngier	14ef9d0492	Merge branch 'kvm-arm64/hyp-pcpu' into kvmarm-master/next Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-09-30 14:05:35 +01:00
Marc Zyngier	816c347f3a	Merge remote-tracking branch 'arm64/for-next/ghostbusters' into kvm-arm64/hyp-pcpu Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-09-30 09:48:30 +01:00
David Brazdil	2a1198c9b4	kvm: arm64: Create separate instances of kvm_host_data for VHE/nVHE Host CPU context is stored in a global per-cpu variable `kvm_host_data`. In preparation for introducing independent per-CPU region for nVHE hyp, create two separate instances of `kvm_host_data`, one for VHE and one for nVHE. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200922204910.7265-9-dbrazdil@google.com	2020-09-30 08:37:13 +01:00
Marc Zyngier	7311467702	KVM: arm64: Get rid of kvm_arm_have_ssbd() kvm_arm_have_ssbd() is now completely unused, get rid of it. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>	2020-09-29 16:08:17 +01:00
Will Deacon	d4647f0a2a	arm64: Rewrite Spectre-v2 mitigation code The Spectre-v2 mitigation code is pretty unwieldy and hard to maintain. This is largely due to it being written hastily, without much clue as to how things would pan out, and also because it ends up mixing policy and state in such a way that it is very difficult to figure out what's going on. Rewrite the Spectre-v2 mitigation so that it clearly separates state from policy and follows a more structured approach to handling the mitigation. Signed-off-by: Will Deacon <will@kernel.org>	2020-09-29 16:08:15 +01:00
Marc Zyngier	2e02cbb236	Merge branch 'kvm-arm64/pmu-5.9' into kvmarm-master/next Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-09-29 15:12:54 +01:00
Marc Zyngier	d7eec2360e	KVM: arm64: Add PMU event filtering infrastructure It can be desirable to expose a PMU to a guest, and yet not want the guest to be able to count some of the implemented events (because this would give information on shared resources, for example. For this, let's extend the PMUv3 device API, and offer a way to setup a bitmap of the allowed events (the default being no bitmap, and thus no filtering). Userspace can thus allow/deny ranges of event. The default policy depends on the "polarity" of the first filter setup (default deny if the filter allows events, and default allow if the filter denies events). This allows to setup exactly what is allowed for a given guest. Note that although the ioctl is per-vcpu, the map of allowed events is global to the VM (it can be setup from any vcpu until the vcpu PMU is initialized). Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-09-29 14:19:39 +01:00
Marc Zyngier	fd65a3b5f8	KVM: arm64: Use event mask matching architecture revision The PMU code suffers from a small defect where we assume that the event number provided by the guest is always 16 bit wide, even if the CPU only implements the ARMv8.0 architecture. This isn't really problematic in the sense that the event number ends up in a system register, cropping it to the right width, but still this needs fixing. In order to make it work, let's probe the version of the PMU that the guest is going to use. This is done by temporarily creating a kernel event and looking at the PMUVer field that has been saved at probe time in the associated arm_pmu structure. This in turn gets saved in the kvm structure, and subsequently used to compute the event mask that gets used throughout the PMU code. Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-09-29 14:19:38 +01:00
Marc Zyngier	81867b75db	Merge branch 'kvm-arm64/nvhe-hyp-context' into kvmarm-master/next Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-09-16 10:59:17 +01:00
Andrew Scull	04e4caa8d3	KVM: arm64: nVHE: Migrate hyp-init to SMCCC To complete the transition to SMCCC, the hyp initialization is given a function ID. This looks neater than comparing the hyp stub function IDs to the page table physical address. Some care is taken to only clobber x0-3 before the host context is saved as only those registers can be clobbered accoring to SMCCC. Fortunately, only a few acrobatics are needed. The possible new tpidr_el2 is moved to the argument in x2 so that it can be stashed in tpidr_el2 early to free up a scratch register. The page table configuration then makes use of x0-2. Signed-off-by: Andrew Scull <ascull@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200915104643.2543892-19-ascull@google.com	2020-09-15 18:39:04 +01:00
Andrew Scull	054698316d	KVM: arm64: nVHE: Migrate hyp interface to SMCCC Rather than passing arbitrary function pointers to run at hyp, define and equivalent set of SMCCC functions. Since the SMCCC functions are strongly tied to the original function prototypes, it is not expected for the host to ever call an invalid ID but a warning is raised if this does ever occur. As __kvm_vcpu_run is used for every switch between the host and a guest, it is explicitly singled out to be identified before the other function IDs to improve the performance of the hot path. Signed-off-by: Andrew Scull <ascull@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200915104643.2543892-18-ascull@google.com	2020-09-15 18:39:04 +01:00
Andrew Scull	d7ca1079d8	KVM: arm64: Remove kvm_host_data_t typedef The kvm_host_data_t typedef is used inconsistently and goes against the kernel's coding style. Remove it in favour of the full struct specifier. Signed-off-by: Andrew Scull <ascull@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200915104643.2543892-4-ascull@google.com	2020-09-15 18:39:01 +01:00
Marc Zyngier	ae8bd85ca8	Merge branch 'kvm-arm64/pt-new' into kvmarm-master/next Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-09-11 15:54:30 +01:00
Will Deacon	74cfa7ea66	KVM: arm64: Remove unused 'pgd' field from 'struct kvm_s2_mmu' The stage-2 page-tables are entirely encapsulated by the 'pgt' field of 'struct kvm_s2_mmu', so remove the unused 'pgd' field. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20200911132529.19844-21-will@kernel.org	2020-09-11 15:51:15 +01:00
Will Deacon	71233d05f4	KVM: arm64: Add support for creating kernel-agnostic stage-2 page tables Introduce alloc() and free() functions to the generic page-table code for guest stage-2 page-tables and plumb these into the existing KVM page-table allocator. Subsequent patches will convert other operations within the KVM allocator over to the generic code. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20200911132529.19844-6-will@kernel.org	2020-09-11 15:51:13 +01:00
Will Deacon	fdfe7cbd58	KVM: Pass MMU notifier range flags to kvm_unmap_hva_range() The 'flags' field of 'struct mmu_notifier_range' is used to indicate whether invalidate_range_{start,end}() are permitted to block. In the case of kvm_mmu_notifier_invalidate_range_start(), this field is not forwarded on to the architecture-specific implementation of kvm_unmap_hva_range() and therefore the backend cannot sensibly decide whether or not to block. Add an extra 'flags' parameter to kvm_unmap_hva_range() so that architectures are aware as to whether or not they are permitted to block. Cc: <stable@vger.kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will@kernel.org> Message-Id: <20200811102725.7121-2-will@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-08-21 18:03:47 -04:00
Andrew Jones	004a01241c	arm64/x86: KVM: Introduce steal-time cap arm64 requires a vcpu fd (KVM_HAS_DEVICE_ATTR vcpu ioctl) to probe support for steal-time. However this is unnecessary, as only a KVM fd is required, and it complicates userspace (userspace may prefer delaying vcpu creation until after feature probing). Introduce a cap that can be checked instead. While x86 can already probe steal-time support with a kvm fd (KVM_GET_SUPPORTED_CPUID), we add the cap there too for consistency. Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20200804170604.42662-7-drjones@redhat.com	2020-08-21 14:05:19 +01:00
Andrew Jones	53f985584e	KVM: arm64: pvtime: Fix stolen time accounting across migration When updating the stolen time we should always read the current stolen time from the user provided memory, not from a kernel cache. If we use a cache then we'll end up resetting stolen time to zero on the first update after migration. Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200804170604.42662-5-drjones@redhat.com	2020-08-21 14:04:14 +01:00
Paolo Bonzini	0378daef0c	KVM/arm64 updates for Linux 5.9: - Split the VHE and nVHE hypervisor code bases, build the EL2 code separately, allowing for the VHE code to now be built with instrumentation - Level-based TLB invalidation support - Restructure of the vcpu register storage to accomodate the NV code - Pointer Authentication available for guests on nVHE hosts - Simplification of the system register table parsing - MMU cleanups and fixes - A number of post-32bit cleanups and other fixes -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl8q5DEPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpDQFAP/jtscnC5OxEOoGNW1gvg/1QI/BuU4zLvqQL1 OEW72fUQlil7tmF/CbLLKnsBpxKmzO02C3wDdg3oaRi884bRtTXdok0nsFuCvrZD u/wrlMnP0zTjjk1uwIFfZJTx+nnUiT0jC6ffvGxB/jnTJk/8atvOUFL7ODFEfixz mS5g1jwwJkRmWKESFg7KGSghKuwXTvo4HVWCfME+t1rQwAa03stXFV8H5tkU6+cG BRIssxo7BkAV2AozwL7hgl/M6wd6QvbOrYJqgb67+sQ8qts0YNne96NN3InMedb1 RENyDssXlA+VI0HoYyEbYnPtFy1Hoj1lOGDZLEZAEH1qcmWrV+hApnoSXSmuofvn QlfOWCyd92CZySu21MALRUVXbrKkA3zT2b9R93A5z7iEBPY+Wk0ryJCO6IxdZzF8 48LNjtzb/Kd0SMU/issJlw+u6fJvLbpnSzXNsYYhiiTMUE9cbu2SEkq0SkonH0a4 d3V8UifZyeffXsOfOAG0DJZOu/fWZp1/I3tfzujtG9rCb+jTQueJ4E1cFYrwSO6b sFNyiI1AzlwcCippG08zSUX61nGfKXBuMXuhIlMRk7GeiF95DmSXuxEgYndZX9I+ E6zJr1iQk/1lrip41svDIIOBHuMbIeD/w1bsOKi7Zoa270MxB4r2Z3IqRMgosoE5 l4YO9pl1 =Ukr4 -----END PGP SIGNATURE----- Merge tag 'kvmarm-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-next-5.6 KVM/arm64 updates for Linux 5.9: - Split the VHE and nVHE hypervisor code bases, build the EL2 code separately, allowing for the VHE code to now be built with instrumentation - Level-based TLB invalidation support - Restructure of the vcpu register storage to accomodate the NV code - Pointer Authentication available for guests on nVHE hosts - Simplification of the system register table parsing - MMU cleanups and fixes - A number of post-32bit cleanups and other fixes	2020-08-09 12:58:23 -04:00
Linus Torvalds	921d2597ab	s390: implement diag318 x86: * Report last CPU for debugging * Emulate smaller MAXPHYADDR in the guest than in the host * .noinstr and tracing fixes from Thomas * nested SVM page table switching optimization and fixes Generic: * Unify shadow MMU cache data structures across architectures -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl8pC+oUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroNcOwgAjomqtEqQNlp7DdZT7VyyklzbxX1/ ud7v+oOJ8K4sFlf64lSthjPo3N9rzZCcw+yOXmuyuITngXOGc3tzIwXpCzpLtuQ1 WO1Ql3B/2dCi3lP5OMmsO1UAZqy9pKLg1dfeYUPk48P5+p7d/NPmk+Em5kIYzKm5 JsaHfCp2EEXomwmljNJ8PQ1vTjIQSSzlgYUBZxmCkaaX7zbEUMtxAQCStHmt8B84 33LczwXBm3viSWrzsoBV37I70+tseugiSGsCfUyupXOvq55d6D9FCqtCb45Hn4Vh Ik8ggKdalsk/reiGEwNw1/3nr6mRMkHSbl+Mhc4waOIFf9dn0urgQgOaDg== =YVx0 -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM updates from Paolo Bonzini: "s390: - implement diag318 x86: - Report last CPU for debugging - Emulate smaller MAXPHYADDR in the guest than in the host - .noinstr and tracing fixes from Thomas - nested SVM page table switching optimization and fixes Generic: - Unify shadow MMU cache data structures across architectures" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (127 commits) KVM: SVM: Fix sev_pin_memory() error handling KVM: LAPIC: Set the TDCR settable bits KVM: x86: Specify max TDP level via kvm_configure_mmu() KVM: x86/mmu: Rename max_page_level to max_huge_page_level KVM: x86: Dynamically calculate TDP level from max level and MAXPHYADDR KVM: VXM: Remove temporary WARN on expected vs. actual EPTP level mismatch KVM: x86: Pull the PGD's level from the MMU instead of recalculating it KVM: VMX: Make vmx_load_mmu_pgd() static KVM: x86/mmu: Add separate helper for shadow NPT root page role calc KVM: VMX: Drop a duplicate declaration of construct_eptp() KVM: nSVM: Correctly set the shadow NPT root level in its MMU role KVM: Using macros instead of magic values MIPS: KVM: Fix build error caused by 'kvm_run' cleanup KVM: nSVM: remove nonsensical EXITINFO1 adjustment on nested NPF KVM: x86: Add a capability for GUEST_MAXPHYADDR < HOST_MAXPHYADDR support KVM: VMX: optimize #PF injection when MAXPHYADDR does not match KVM: VMX: Add guest physical address check in EPT violation and misconfig KVM: VMX: introduce vmx_need_pf_intercept KVM: x86: update exception bitmap on CPUID changes KVM: x86: rename update_bp_intercept to update_exception_bitmap ...	2020-08-06 12:59:31 -07:00
Marc Zyngier	bf4086b1a1	KVM: arm64: Prevent vcpu_has_ptrauth from generating OOL functions So far, vcpu_has_ptrauth() is implemented in terms of system_supports_*_auth() calls, which are declared "inline". In some specific conditions (clang and SCS), the "inline" very much turns into an "out of line", which leads to a fireworks when this predicate is evaluated on a non-VHE system (right at the beginning of __hyp_handle_ptrauth). Instead, make sure vcpu_has_ptrauth gets expanded inline by directly using the cpus_have_final_cap() helpers, which are __always_inline, generate much better code, and are the only thing that make sense when running at EL2 on a nVHE system. Fixes: `29eb5a3c57` ("KVM: arm64: Handle PtrAuth traps early") Reported-by: Nathan Chancellor <natechancellor@gmail.com> Reported-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Link: https://lore.kernel.org/r/20200722162231.3689767-1-maz@kernel.org	2020-07-28 09:03:57 +01:00
Tianjia Zhang	74cc7e0c35	KVM: arm64: clean up redundant 'kvm_run' parameters In the current kvm version, 'kvm_run' has been included in the 'kvm_vcpu' structure. For historical reasons, many kvm-related function parameters retain the 'kvm_run' and 'kvm_vcpu' parameters at the same time. This patch does a unified cleanup of these remaining redundant parameters. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200623131418.31473-3-tianjia.zhang@linux.alibaba.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-07-10 04:26:40 -04:00
Sean Christopherson	c1a33aebe9	KVM: arm64: Use common KVM implementation of MMU memory caches Move to the common MMU memory cache implementation now that the common code and arm64's existing code are semantically compatible. No functional change intended. Cc: Marc Zyngier <maz@kernel.org> Suggested-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200703023545.8771-19-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-07-09 13:29:43 -04:00
Sean Christopherson	e539451b7e	KVM: arm64: Use common code's approach for __GFP_ZERO with memory caches Add a "gfp_zero" member to arm64's 'struct kvm_mmu_memory_cache' to make the struct and its usage compatible with the common 'struct kvm_mmu_memory_cache' in linux/kvm_host.h. This will minimize code churn when arm64 moves to the common implementation in a future patch, at the cost of temporarily having somewhat silly code. Note, GFP_PGTABLE_USER is equivalent to GFP_KERNEL_ACCOUNT \| GFP_ZERO: #define GFP_PGTABLE_USER (GFP_PGTABLE_KERNEL \| __GFP_ACCOUNT) \| -> #define GFP_PGTABLE_KERNEL (GFP_KERNEL \| __GFP_ZERO) == GFP_KERNEL \| __GFP_ACCOUNT \| __GFP_ZERO versus #define GFP_KERNEL_ACCOUNT (GFP_KERNEL \| __GFP_ACCOUNT) with __GFP_ZERO explicitly OR'd in == GFP_KERNEL \| __GFP_ACCOUNT \| __GFP_ZERO No functional change intended. Tested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200703023545.8771-18-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-07-09 13:29:43 -04:00
Marc Zyngier	41ce82f63c	KVM: arm64: timers: Move timer registers to the sys_regs file Move the timer gsisters to the sysreg file. This will further help when they are directly changed by a nesting hypervisor in the VNCR page. This requires moving the initialisation of the timer struct so that some of the helpers (such as arch_timer_ctx_index) can work correctly at an early stage. Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-07-07 09:28:38 +01:00
Marc Zyngier	710f198218	KVM: arm64: Move SPSR_EL1 to the system register array SPSR_EL1 being a VNCR-capable register with ARMv8.4-NV, move it to the sysregs array and update the accessors. Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-07-07 09:28:38 +01:00
Marc Zyngier	fd85b66789	KVM: arm64: Disintegrate SPSR array As we're about to move SPSR_EL1 into the VNCR page, we need to disassociate it from the rest of the 32bit cruft. Let's break the array into individual fields. Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-07-07 09:28:38 +01:00
Marc Zyngier	1bded23ea7	KVM: arm64: Move SP_EL1 to the system register array SP_EL1 being a VNCR-capable register with ARMv8.4-NV, move it to the system register array and update the accessors. Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-07-07 09:28:38 +01:00
Marc Zyngier	98909e6d1c	KVM: arm64: Move ELR_EL1 to the system register array As ELR-EL1 is a VNCR-capable register with ARMv8.4-NV, let's move it to the sys_regs array and repaint the accessors. While we're at it, let's kill the now useless accessors used only on the fault injection path. Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-07-07 09:28:38 +01:00
Marc Zyngier	e47c2055c6	KVM: arm64: Make struct kvm_regs userspace-only struct kvm_regs is used by userspace to indicate which register gets accessed by the {GET,SET}_ONE_REG API. But as we're about to refactor the layout of the in-kernel register structures, we need the kernel to move away from it. Let's make kvm_regs userspace only, and let the kernel map it to its own internal representation. Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-07-07 09:28:38 +01:00
Marc Zyngier	71071acfd3	KVM: arm64: hyp: Use ctxt_sys_reg/__vcpu_sys_reg instead of raw sys_regs access Switch the hypervisor code to using ctxt_sys_reg/__vcpu_sys_reg instead of raw sys_regs accesses. No intended functionnal change. Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-07-07 09:28:37 +01:00
Marc Zyngier	1b422dd7fc	KVM: arm64: Introduce accessor for ctxt->sys_reg In order to allow the disintegration of the per-vcpu sysreg array, let's introduce a new helper (ctxt_sys_reg()) that returns the in-memory copy of a system register, picked from a given context. __vcpu_sys_reg() is rewritten to use this helper. Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-07-07 09:28:37 +01:00
Christoffer Dall	a0e50aa3f4	KVM: arm64: Factor out stage 2 page table data from struct kvm As we are about to reuse our stage 2 page table manipulation code for shadow stage 2 page tables in the context of nested virtualization, we are going to manage multiple stage 2 page tables for a single VM. This requires some pretty invasive changes to our data structures, which moves the vmid and pgd pointers into a separate structure and change pretty much all of our mmu code to operate on this structure instead. The new structure is called struct kvm_s2_mmu. There is no intended functional change by this patch alone. Reviewed-by: James Morse <james.morse@arm.com> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> [Designed data structure layout in collaboration] Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Co-developed-by: Marc Zyngier <maz@kernel.org> [maz: Moved the last_vcpu_ran down to the S2 MMU structure as well] Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-07-07 09:28:37 +01:00
David Brazdil	13aeb9b400	KVM: arm64: Split hyp/sysreg-sr.c to VHE/nVHE sysreg-sr.c contains KVM's code for saving/restoring system registers, with some code shared between VHE/nVHE. These common routines are moved to a header file, VHE-specific code is moved to vhe/sysreg-sr.c and nVHE-specific code to nvhe/sysreg-sr.c. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200625131420.71444-12-dbrazdil@google.com	2020-07-05 18:38:29 +01:00
Andrew Scull	f50b6f6ae1	KVM: arm64: Handle calls to prefixed hyp functions Once hyp functions are moved to a hyp object, they will have prefixed symbols. This change declares and gets the address of the prefixed version for calls to the hyp functions. To aid migration, the hyp functions that have not yet moved have their prefixed versions aliased to their non-prefixed version. This begins with all the hyp functions being listed and will reduce to none of them once the migration is complete. Signed-off-by: Andrew Scull <ascull@google.com> [David: Extracted kvm_call_hyp nVHE branches into own helper macros, added comments around symbol aliases.] Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200625131420.71444-6-dbrazdil@google.com	2020-07-05 18:38:04 +01:00
Paolo Bonzini	49b3deaad3	KVM/arm64 fixes for Linux 5.8, take #1 * 32bit VM fixes: - Fix embarassing mapping issue between AArch32 CSSELR and AArch64 ACTLR - Add ACTLR2 support for AArch32 - Get rid of the useless ACTLR_EL1 save/restore - Fix CP14/15 accesses for AArch32 guests on BE hosts - Ensure that we don't loose any state when injecting a 32bit exception when running on a VHE host * 64bit VM fixes: - Fix PtrAuth host saving happening in preemptible contexts - Optimize PtrAuth lazy enable - Drop vcpu to cpu context pointer - Fix sparse warnings for HYP per-CPU accesses -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl7h6r8PHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpDE3gP/iogqGjZasUIwk4gdIc4IaxxNsfTYJFIh5uw sedAqwCQg3OftX0jptp6GhI3ZIG5UPuGDM7f3aio6i02pjx6bfBxGJ9AXqNcp6gN WcECHsAfzHUScznRhBbVflKkOF4dzfzyiutnMdknihePOyO9drwdvzXuJa37cs52 tsCneP9xQ/vQWdqu42uPS7HtSepSa/Lf/qeKGaTDWQIvNYGI3PctQvRAxx4FNHc/ SMUpS5zdTFceVoya/2+azTJ24R1lbwlPwaw2WoaghB+QmREKN8uMKy5kjrO5YUnH 8BtjESiNBI2CZYSwcxFt+QNA6EmymwDwfrmOE+7iBCZelOLWLVYbJ7icKX3kT731 gts5PBD8JlZWAnbH/Mbo4qngXJwHaijA38Bt8rvSphI0aK6iOU6DP5BuOurzNRde XczDYq3lqdCC2ynROjRpH4paVo7s0sBjjgZ7OsWqsw9uRAogwTkVE2sEi4HdqNAH JHhIHEKj7t/bRtzneXVk6ngoezIs6sIdcqrUZ+rAMnmMHbrzBoEqnlrlQ7e2/UXY yvY5Yc3/H2pKRCK/KznOi1nVG+xUZp4RZp552pwULF+JVbmMHIOxn3IxiejfMZVx czD5cxMcgMWa14ZZRN0DynT9wCg+s+MGaKGR6STyudVYHFBTr7hrsuM1zq/neMQf JcUBVUot =I2Li -----END PGP SIGNATURE----- Merge tag 'kvmarm-fixes-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for Linux 5.8, take #1 * 32bit VM fixes: - Fix embarassing mapping issue between AArch32 CSSELR and AArch64 ACTLR - Add ACTLR2 support for AArch32 - Get rid of the useless ACTLR_EL1 save/restore - Fix CP14/15 accesses for AArch32 guests on BE hosts - Ensure that we don't loose any state when injecting a 32bit exception when running on a VHE host * 64bit VM fixes: - Fix PtrAuth host saving happening in preemptible contexts - Optimize PtrAuth lazy enable - Drop vcpu to cpu context pointer - Fix sparse warnings for HYP per-CPU accesses	2020-06-11 14:02:32 -04:00
Marc Zyngier	15c99816ed	Merge branch 'kvm-arm64/ptrauth-fixes' into kvmarm-master/next Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-06-10 19:10:40 +01:00
Marc Zyngier	3204be4109	KVM: arm64: Make vcpu_cp1x() work on Big Endian hosts AArch32 CP1x registers are overlayed on their AArch64 counterparts in the vcpu struct. This leads to an interesting problem as they are stored in their CPU-local format, and thus a CP1x register doesn't "hit" the lower 32bit portion of the AArch64 register on a BE host. To workaround this unfortunate situation, introduce a bias trick in the vcpu_cp1x() accessors which picks the correct half of the 64bit register. Cc: stable@vger.kernel.org Reported-by: James Morse <james.morse@arm.com> Tested-by: James Morse <james.morse@arm.com> Acked-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-06-10 16:03:57 +01:00
Marc Zyngier	07da1ffaa1	KVM: arm64: Remove host_cpu_context member from vcpu structure For very long, we have kept this pointer back to the per-cpu host state, despite having working per-cpu accessors at EL2 for some time now. Recent investigations have shown that this pointer is easy to abuse in preemptible context, which is a sure sign that it would better be gone. Not to mention that a per-cpu pointer is faster to access at all times. Reported-by: Andrew Scull <ascull@google.com> Acked-by: Mark Rutland <mark.rutland@arm.com Reviewed-by: Andrew Scull <ascull@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-06-09 10:59:52 +01:00
Linus Torvalds	039aeb9deb	ARM: - Move the arch-specific code into arch/arm64/kvm - Start the post-32bit cleanup - Cherry-pick a few non-invasive pre-NV patches x86: - Rework of TLB flushing - Rework of event injection, especially with respect to nested virtualization - Nested AMD event injection facelift, building on the rework of generic code and fixing a lot of corner cases - Nested AMD live migration support - Optimization for TSC deadline MSR writes and IPIs - Various cleanups - Asynchronous page fault cleanups (from tglx, common topic branch with tip tree) - Interrupt-based delivery of asynchronous "page ready" events (host side) - Hyper-V MSRs and hypercalls for guest debugging - VMX preemption timer fixes s390: - Cleanups Generic: - switch vCPU thread wakeup from swait to rcuwait The other architectures, and the guest side of the asynchronous page fault work, will come next week. -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl7VJcYUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroPf6QgAq4wU5wdd1lTGz/i3DIhNVJNJgJlp ozLzRdMaJbdbn5RpAK6PEBd9+pt3+UlojpFB3gpJh2Nazv2OzV4yLQgXXXyyMEx1 5Hg7b4UCJYDrbkCiegNRv7f/4FWDkQ9dx++RZITIbxeskBBCEI+I7GnmZhGWzuC4 7kj4ytuKAySF2OEJu0VQF6u0CvrNYfYbQIRKBXjtOwuRK4Q6L63FGMJpYo159MBQ asg3B1jB5TcuGZ9zrjL5LkuzaP4qZZHIRs+4kZsH9I6MODHGUxKonrkablfKxyKy CFK+iaHCuEXXty5K0VmWM3nrTfvpEjVjbMc7e1QGBQ5oXsDM0pqn84syRg== =v7Wn -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "ARM: - Move the arch-specific code into arch/arm64/kvm - Start the post-32bit cleanup - Cherry-pick a few non-invasive pre-NV patches x86: - Rework of TLB flushing - Rework of event injection, especially with respect to nested virtualization - Nested AMD event injection facelift, building on the rework of generic code and fixing a lot of corner cases - Nested AMD live migration support - Optimization for TSC deadline MSR writes and IPIs - Various cleanups - Asynchronous page fault cleanups (from tglx, common topic branch with tip tree) - Interrupt-based delivery of asynchronous "page ready" events (host side) - Hyper-V MSRs and hypercalls for guest debugging - VMX preemption timer fixes s390: - Cleanups Generic: - switch vCPU thread wakeup from swait to rcuwait The other architectures, and the guest side of the asynchronous page fault work, will come next week" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (256 commits) KVM: selftests: fix rdtsc() for vmx_tsc_adjust_test KVM: check userspace_addr for all memslots KVM: selftests: update hyperv_cpuid with SynDBG tests x86/kvm/hyper-v: Add support for synthetic debugger via hypercalls x86/kvm/hyper-v: enable hypercalls regardless of hypercall page x86/kvm/hyper-v: Add support for synthetic debugger interface x86/hyper-v: Add synthetic debugger definitions KVM: selftests: VMX preemption timer migration test KVM: nVMX: Fix VMX preemption timer migration x86/kvm/hyper-v: Explicitly align hcall param for kvm_hyperv_exit KVM: x86/pmu: Support full width counting KVM: x86/pmu: Tweak kvm_pmu_get_msr to pass 'struct msr_data' in KVM: x86: announce KVM_FEATURE_ASYNC_PF_INT KVM: x86: acknowledgment mechanism for async pf page ready notifications KVM: x86: interrupt based APF 'page ready' event delivery KVM: introduce kvm_read_guest_offset_cached() KVM: rename kvm_arch_can_inject_async_page_present() to kvm_arch_can_dequeue_async_page_present() KVM: x86: extend struct kvm_vcpu_pv_apf_data with token info Revert "KVM: async_pf: Fix #DF due to inject "Page not Present" and "Page Ready" exceptions simultaneously" KVM: VMX: Replace zero-length array with flexible-array ...	2020-06-03 15:13:47 -07:00
Paolo Bonzini	380609445c	KVM/arm64 updates for Linux 5.8: - Move the arch-specific code into arch/arm64/kvm - Start the post-32bit cleanup - Cherry-pick a few non-invasive pre-NV patches -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl7RLp8PHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpD/iAQAJOHsS1PT9y/Gefam5os9FqKpogj68e3rx9k XfPcweexBVqmDWSI4vmL9xHW2F7z4EwAE4dIDsTCKHpihK30+jH8l12tOJBz35yp MR1hYjv43F54xzKkkuP4F4wo3Ygg4ipjHZPReGkaGj1QOQs6N/YKa1aSSYfzkzCz VLCSqPQz45CkGPYEGwuPn13AjHqGQAwPhteJNAoCxViw1KAldmoqDk6kbKB+b+7a 2oIvxiTZejICsgSX6UvqQYNG52AyZ/5Daq8iraaigQ8sGyKr+/2Yi+3RUUH6p7ns aCsictk+RS3BzMAKDw6MPYc7OhJBhxQEV1pdiPpt0tpS4L9LNmBagKzlaBKZhwdr dYDAjOlbgZZUJpKnlBAipuVlQySHdm2WjXr4msdY69D7OGxmkzU/zkSIokqdA2hr MuL5W1v2Z1UpxyVltb+c/4lPcFZNnRI0Mz1WcvliEojlf2zzKYMcBAl3bTiAuil5 aTT2+1G0OSCfUfr8Zart4LoAHeczw4zG/Pern+hl92eMXUlX3pIcqzQaLtVmmEE/ ecPShMowKsXOOGGp/T8Q04N1fr6KzmufP5+kgJDFZfo6iJ6r5uQ9G8nuLmp3wQOX c9mNCwdSxrFBTJ10KfLHquKqwfl18VXzKDx1pzO5nSupmKWfWZ5YFO8j2709e83x R42MqKEG =aD+9 -----END PGP SIGNATURE----- Merge tag 'kvmarm-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for Linux 5.8: - Move the arch-specific code into arch/arm64/kvm - Start the post-32bit cleanup - Cherry-pick a few non-invasive pre-NV patches	2020-06-01 04:26:27 -04:00
Will Deacon	c350717ec7	Merge branch 'for-next/kvm/errata' into for-next/core KVM CPU errata rework (Andrew Scull and Marc Zyngier) * for-next/kvm/errata: KVM: arm64: Move __load_guest_stage2 to kvm_mmu.h arm64: Unify WORKAROUND_SPECULATIVE_AT_{NVHE,VHE}	2020-05-28 18:02:51 +01:00
Marc Zyngier	b130a8f70c	KVM: arm64: Check advertised Stage-2 page size capability With ARMv8.5-GTG, the hardware (or more likely a hypervisor) can advertise the supported Stage-2 page sizes. Let's check this at boot time. Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>	2020-05-28 17:28:51 +01:00
Marc Zyngier	8f7f4fe756	KVM: arm64: Drop obsolete comment about sys_reg ordering The general comment about keeping the enum order in sync with the save/restore code has been obsolete for many years now. Just drop it. Note that there are other ordering requirements in the enum, such as the PtrAuth and PMU registers, which are still valid. Reported-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-05-28 13:16:57 +01:00
David Brazdil	71b3ec5f22	KVM: arm64: Clean up cpu_init_hyp_mode() Pull bits of code to the only place where it is used. Remove empty function __cpu_init_stage2(). Remove redundant has_vhe() check since this function is nVHE-only. No functional changes intended. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200515152056.83158-1-dbrazdil@google.com	2020-05-25 16:15:47 +01:00
Keqian Zhu	c862626e19	KVM: arm64: Support enabling dirty log gradually in small chunks There is already support of enabling dirty log gradually in small chunks for x86 in commit `3c9bd4006b` ("KVM: x86: enable dirty log gradually in small chunks"). This adds support for arm64. x86 still writes protect all huge pages when DIRTY_LOG_INITIALLY_ALL_SET is enabled. However, for arm64, both huge pages and normal pages can be write protected gradually by userspace. Under the Huawei Kunpeng 920 2.6GHz platform, I did some tests on 128G Linux VMs with different page size. The memory pressure is 127G in each case. The time taken of memory_global_dirty_log_start in QEMU is listed below: Page Size Before After Optimization 4K 650ms 1.8ms 2M 4ms 1.8ms 1G 2ms 1.8ms Besides the time reduction, the biggest improvement is that we will minimize the performance side effect (because of dissolving huge pages and marking memslots dirty) on guest after enabling dirty log. Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200413122023.52583-1-zhukeqian1@huawei.com	2020-05-16 15:05:02 +01:00
David Matlack	cb953129bf	kvm: add halt-polling cpu usage stats Two new stats for exposing halt-polling cpu usage: halt_poll_success_ns halt_poll_fail_ns Thus sum of these 2 stats is the total cpu time spent polling. "success" means the VCPU polled until a virtual interrupt was delivered. "fail" means the VCPU had to schedule out (either because the maximum poll time was reached or it needed to yield the CPU). To avoid touching every arch's kvm_vcpu_stat struct, only update and export halt-polling cpu usage stats if we're on x86. Exporting cpu usage as a u64 and in nanoseconds means we will overflow at ~500 years, which seems reasonably large. Signed-off-by: David Matlack <dmatlack@google.com> Signed-off-by: Jon Cargille <jcargill@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Message-Id: <20200508182240.68440-1-jcargill@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-05-15 12:26:26 -04:00
Andrew Scull	02ab1f5018	arm64: Unify WORKAROUND_SPECULATIVE_AT_{NVHE,VHE} Errata 1165522, 1319367 and 1530923 each allow TLB entries to be allocated as a result of a speculative AT instruction. In order to avoid mandating VHE on certain affected CPUs, apply the workaround to both the nVHE and the VHE case for all affected CPUs. Signed-off-by: Andrew Scull <ascull@google.com> Acked-by: Will Deacon <will@kernel.org> CC: Marc Zyngier <maz@kernel.org> CC: James Morse <james.morse@arm.com> CC: Suzuki K Poulose <suzuki.poulose@arm.com> CC: Will Deacon <will@kernel.org> CC: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20200504094858.108917-1-ascull@google.com Signed-off-by: Will Deacon <will@kernel.org>	2020-05-04 16:05:47 +01:00
Marc Zyngier	d9c3872cd2	KVM: arm64: GICv4.1: Reload VLPI configuration on distributor enable/disable Each time a Group-enable bit gets flipped, the state of these bits needs to be forwarded to the hardware. This is a pretty heavy handed operation, requiring all vcpus to reload their GICv4 configuration. It is thus implemented as a new request type. These enable bits are programmed into the HW by setting the VGrp{0,1}En fields of GICR_VPENDBASER when the vPEs are made resident again. Of course, we only support Group-1 for now... Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200304203330.4967-22-maz@kernel.org	2020-03-24 12:15:51 +00:00
Paolo Bonzini	e951445f4d	KVM/arm fixes for 5.6, take #1 - Fix compilation on 32bit - Move VHE guest entry/exit into the VHE-specific entry code - Make sure all functions called by the non-VHE HYP code is tagged as __always_inline -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl5VsNMPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpDLhUQAIsecO9IyYjy1J0Q5AxaKLL7NuKYlAaty2xX uY6UkTfPNsEaHFXSGYXWPDxrmkgArp2wuy4WVQB59Om00+LE7h9kiz7+xKpcUy1G UoHa5mzMlqoOeUIWO/oSU6LYHhYDnIpHTDco93YrscU4nNRevJZ/GVeuQeMblzuZ Sg7cWc+0V43FXUt9Jw8BsNhXH/D0l0p3v86p7GZLcSfFAccO62YfOwC8J/znLPym 4S+O9RYQkCczvzFeQVYQwqImOAunaOb0OzERUbm8icOF6ekYGwywjrtlmAC/3q+q 1g/te1yfwQ8fpprWl4QSH0sQVdfAcxdDZqcWtN2LhNaEShZtNa5yKpsRGn1V0eAS tIO8eexAKCXoASHrrwfSkizYjRAeDabmodBQmS50/isY9OdBE2tDel+BLrCjzBJ2 hABwEZ3Q78216EuoqsZqWaEUZ3ck0iSW3IcXglmHE4TC8Iq6dwskvOPjay+msHr9 dcHDCxFIN4jzv9QcpKN8LkxfmW0Us28bzap3OhKfrz0nv7b4n+j0q1xbKL1QnN/l RcDPW0dQeXuX9vYMeYIUDQcV4IgTUkF6IPDCRW7KCApi98HfPTbrfQ97nir79zDp pD8NXaNFr4PtxJoheYYia3sjZMt/fgfvP2dM32iOpsMu7W1FXdfQN7heNSc6MQmO ciyhf/mj =NpPo -----END PGP SIGNATURE----- Merge tag 'kvmarm-fixes-5.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm fixes for 5.6, take #1 - Fix compilation on 32bit - Move VHE guest entry/exit into the VHE-specific entry code - Make sure all functions called by the non-VHE HYP code is tagged as __always_inline	2020-02-28 11:50:06 +01:00

... 3 4 5 6 7 ...

655 Commits