linux-loongson

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson synced 2025-09-01 23:46:45 +00:00

Author	SHA1	Message	Date
Fuad Tabba	5db1bef933	KVM: arm64: Track SVE state in the hypervisor vcpu structure When dealing with a guest with SVE enabled, make sure the host SVE state is pinned at EL2 S1, and that the hypervisor vCPU state is correctly initialised (and then unpinned on teardown). Co-authored-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20250416152648.2982950-2-qperret@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-04-28 09:23:46 +01:00
Oliver Upton	ca19dd4323	Merge branch 'kvm-arm64/pkvm-6.15' into kvmarm/next * kvm-arm64/pkvm-6.15: : pKVM updates for 6.15 : : - SecPageTable stats for stage-2 table pages allocated by the protected : hypervisor (Vincent Donnefort) : : - HCRX_EL2 trap + vCPU initialization fixes for pKVM (Fuad Tabba) KVM: arm64: Create each pKVM hyp vcpu after its corresponding host vcpu KVM: arm64: Factor out pKVM hyp vcpu creation to separate function KVM: arm64: Initialize HCRX_EL2 traps in pKVM KVM: arm64: Factor out setting HCRX_EL2 traps into separate function KVM: arm64: Count pKVM stage-2 usage in secondary pagetable stats KVM: arm64: Distinct pKVM teardown memcache for stage-2 KVM: arm64: Add flags to kvm_hyp_memcache Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2025-03-19 14:54:40 -07:00
Fuad Tabba	1eab115486	KVM: arm64: Create each pKVM hyp vcpu after its corresponding host vcpu Instead of creating and initializing _all_ hyp vcpus in pKVM when the first host vcpu runs for the first time, initialize _each_ hyp vcpu in conjunction with its corresponding host vcpu. Some of the host vcpu state (e.g., system registers and traps values) is not initialized until the first time the host vcpu is run. Therefore, initializing a hyp vcpu before its corresponding host vcpu has run for the first time might not view the complete host state of these vcpus. Additionally, this behavior is inline with non-protected modes. Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20250314111832.4137161-5-tabba@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2025-03-14 16:06:03 -07:00
Fuad Tabba	066daa8d3b	KVM: arm64: Initialize HCRX_EL2 traps in pKVM Initialize and set the traps controlled by the HCRX_EL2 in pKVM. Reviewed-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20250314111832.4137161-3-tabba@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2025-03-14 16:00:49 -07:00
Vincent Donnefort	8c0d7d14c5	KVM: arm64: Distinct pKVM teardown memcache for stage-2 In order to account for memory dedicated to the stage-2 page-tables, use a separated memcache when tearing down the VM. Meanwhile rename reclaim_guest_pages to reflect the fact it only reclaim page-table pages. Signed-off-by: Vincent Donnefort <vdonnefort@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250313114038.1502357-3-vdonnefort@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2025-03-14 00:56:29 -07:00
Oliver Upton	03e1b89d05	KVM: arm64: Copy MIDR_EL1 into hyp VM when it is writable KVM recently added a capability that allows userspace to override the 'implementation ID' registers presented to the VM. MIDR_EL1 is a special example, where the hypervisor can directly set the value when read from EL1 using VPIDR_EL2. Copy the VM-wide value for MIDR_EL1 into the hyp VM for non-protected guests when the capability is enabled so VPIDR_EL2 gets set up correctly. Reported-by: Mark Brown <broonie@kernel.org> Closes: https://lore.kernel.org/kvmarm/ac594b9c-4bbb-46c8-9391-e7a68ce4de5b@sirena.org.uk/ Fixes: `3adaee7830` ("KVM: arm64: Allow userspace to change the implementation ID registers") Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250305230825.484091-3-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2025-03-05 16:56:57 -08:00
Oliver Upton	9d91227364	KVM: arm64: Copy guest CTR_EL0 into hyp VM Since commit `2843cae266` ("KVM: arm64: Treat CTR_EL0 as a VM feature ID register") KVM has allowed userspace to configure the VM-wide view of CTR_EL0, falling back to trap-n-emulate if the value doesn't match hardware. It appears that this has worked by chance in protected-mode for some time, and on systems with FEAT_EVT protected-mode unconditionally sets TID4 (i.e. TID2 traps sans CTR_EL0). Forward the guest CTR_EL0 value through to the hyp VM and align the TID2/TID4 configuration with the non-protected setup. Fixes: `2843cae266` ("KVM: arm64: Treat CTR_EL0 as a VM feature ID register") Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250305230825.484091-2-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2025-03-05 16:55:41 -08:00
Marc Zyngier	e880b16efb	Merge branch kvm-arm64/pkvm-fixed-features-6.14 into kvmarm-master/next * kvm-arm64/pkvm-fixed-features-6.14: (24 commits) : . : Complete rework of the pKVM handling of features, catching up : with the rest of the code deals with it these days. : Patches courtesy of Fuad Tabba. From the cover letter: : : "This patch series uses the vm's feature id registers to track the : supported features, a framework similar to nested virt to set the : trap values, and removes the need to store cptr_el2 per vcpu in : favor of setting its value when traps are activated, as VHE mode : does." : : This branch drags the arm64/for-next/cpufeature branch to solve : ugly conflicts in -next. : . KVM: arm64: Fix FEAT_MTE in pKVM KVM: arm64: Use kvm_vcpu_has_feature() directly for struct kvm KVM: arm64: Convert the SVE guest vcpu flag to a vm flag KVM: arm64: Remove PtrAuth guest vcpu flag KVM: arm64: Fix the value of the CPTR_EL2 RES1 bitmask for nVHE KVM: arm64: Refactor kvm_reset_cptr_el2() KVM: arm64: Calculate cptr_el2 traps on activating traps KVM: arm64: Remove redundant setting of HCR_EL2 trap bit KVM: arm64: Remove fixed_config.h header KVM: arm64: Rework specifying restricted features for protected VMs KVM: arm64: Set protected VM traps based on its view of feature registers KVM: arm64: Fix RAS trapping in pKVM for protected VMs KVM: arm64: Initialize feature id registers for protected VMs KVM: arm64: Use KVM extension checks for allowed protected VM capabilities KVM: arm64: Remove KVM_ARM_VCPU_POWER_OFF from protected VMs allowed features in pKVM KVM: arm64: Move checking protected vcpu features to a separate function KVM: arm64: Group setting traps for protected VMs by control register KVM: arm64: Consolidate allowed and restricted VM feature checks arm64/sysreg: Get rid of CPACR_ELx SysregFields arm64/sysreg: Convert *_EL12 accessors to Mapping ... Signed-off-by: Marc Zyngier <maz@kernel.org> # Conflicts: # arch/arm64/kvm/fpsimd.c # arch/arm64/kvm/hyp/nvhe/pkvm.c	2025-01-12 10:40:10 +00:00
Vladimir Murzin	b7f345fbc3	KVM: arm64: Fix FEAT_MTE in pKVM Make sure we do not trap access to Allocation Tags. Fixes: `b56680de9c` ("KVM: arm64: Initialize trap register values in hyp in pKVM") Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20250106112421.65355-1-vladimir.murzin@arm.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2025-01-08 10:24:32 +00:00
Fuad Tabba	41d6028e28	KVM: arm64: Convert the SVE guest vcpu flag to a vm flag The vcpu flag GUEST_HAS_SVE is per-vcpu, but it is based on what is now a per-vm feature. Make the flag per-vm. Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20241216105057.579031-17-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-12-20 13:54:09 +00:00
Fuad Tabba	c5c1763596	KVM: arm64: Remove PtrAuth guest vcpu flag The vcpu flag GUEST_HAS_PTRAUTH is always associated with the vcpu PtrAuth features, which are defined per vm rather than per vcpu. Remove the flag, and replace it with checks for the features instead. Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20241216105057.579031-16-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-12-20 13:54:06 +00:00
Fuad Tabba	2fd5b4b0e7	KVM: arm64: Calculate cptr_el2 traps on activating traps Similar to VHE, calculate the value of cptr_el2 from scratch on activate traps. This removes the need to store cptr_el2 in every vcpu structure. Moreover, some traps, such as whether the guest owns the fp registers, need to be set on every vcpu run. Reported-by: James Clark <james.clark@linaro.org> Fixes: `5294afdbf4` ("KVM: arm64: Exclude FP ownership from kvm_vcpu_arch") Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20241216105057.579031-13-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-12-20 13:53:57 +00:00
Fuad Tabba	092e7b2c3b	KVM: arm64: Remove redundant setting of HCR_EL2 trap bit In hVHE mode, HCR_E2H should be set for both protected and non-protected VMs. Since commit `b56680de9c` ("KVM: arm64: Initialize trap register values in hyp in pKVM"), this has been fixed, and the setting of the flag here is redundant. Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20241216105057.579031-12-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-12-20 13:53:55 +00:00
Fuad Tabba	81403c8d04	KVM: arm64: Remove fixed_config.h header The few remaining items needed in fixed_config.h are better suited for pkvm.h. Move them there and delete it. No functional change intended. Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20241216105057.579031-11-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-12-20 13:53:53 +00:00
Fuad Tabba	0401f7e76d	KVM: arm64: Set protected VM traps based on its view of feature registers Now that the VM's feature id registers are initialized with the values of the supported features, use those values to determine which traps to set using kvm_has_feature(). Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20241216105057.579031-9-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-12-20 13:53:00 +00:00
Fuad Tabba	9df9186f8d	KVM: arm64: Fix RAS trapping in pKVM for protected VMs Trap RAS in pKVM if not supported at all for protected VMs. The RAS version doesn't matter in this case. Fixes: `2a0c343386` ("KVM: arm64: Initialize trap registers for protected VMs") Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20241216105057.579031-8-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-12-20 13:52:58 +00:00
Fuad Tabba	7ba5b8f804	KVM: arm64: Initialize feature id registers for protected VMs The hypervisor maintains the state of protected VMs. Initialize the values for feature ID registers for protected VMs, to be used when setting traps and when advertising features to protected VMs. Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20241216105057.579031-7-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-12-20 13:52:50 +00:00
Fuad Tabba	a3163dca48	KVM: arm64: Use KVM extension checks for allowed protected VM capabilities Use KVM extension checks as the source for determining which capabilities are allowed for protected VMs. KVM extension checks is the natural place for this, since it is also the interface exposed to users. Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20241216105057.579031-6-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-12-20 13:45:25 +00:00
Fuad Tabba	27f5cf8ad5	KVM: arm64: Remove KVM_ARM_VCPU_POWER_OFF from protected VMs allowed features in pKVM The hypervisor is responsible for the power state of protected VMs in pKVM. Therefore, remove KVM_ARM_VCPU_POWER_OFF from the list of allowed features for protected VMs. Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20241216105057.579031-5-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-12-20 13:45:25 +00:00
Fuad Tabba	1fea164ccf	KVM: arm64: Move checking protected vcpu features to a separate function At the moment, checks for supported vcpu features for protected VMs are build-time bugs. In the following patch, they will become runtime checks based on the vcpu's features registers. Therefore, consolidate them into one function that would return an error if it encounters an unsupported feature. Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20241216105057.579031-4-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-12-20 13:45:25 +00:00
Fuad Tabba	f50758260b	KVM: arm64: Group setting traps for protected VMs by control register Group setting protected VM traps by control register rather than feature id register, since some trap values (e.g., PAuth), depend on more than one feature id register. Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20241216105057.579031-3-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-12-20 13:45:21 +00:00
Marc Zyngier	2589dbd727	KVM: arm64: Consolidate allowed and restricted VM feature checks The definitions for features allowed and allowed with restrictions for protected guests, which are based on feature registers, were defined and checked for separately, even though they are handled in the same way. This could result in missing checks for certain features, e.g., pointer authentication, causing traps for allowed features. Consolidate the definitions into one. Use that new definition to construct the guest view of the feature registers for consistency. Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20241216105057.579031-2-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-12-20 13:39:10 +00:00
Quentin Perret	72db3d3fba	KVM: arm64: Introduce __pkvm_host_unshare_guest() In preparation for letting the host unmap pages from non-protected guests, introduce a new hypercall implementing the host-unshare-guest transition. Tested-by: Fuad Tabba <tabba@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20241218194059.3670226-12-qperret@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-12-20 09:44:00 +00:00
Quentin Perret	d0bd3e6570	KVM: arm64: Introduce __pkvm_host_share_guest() In preparation for handling guest stage-2 mappings at EL2, introduce a new pKVM hypercall allowing to share pages with non-protected guests. Tested-by: Fuad Tabba <tabba@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20241218194059.3670226-11-qperret@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-12-20 09:44:00 +00:00
Marc Zyngier	f7d03fcbf1	KVM: arm64: Introduce __pkvm_vcpu_{load,put}() Rather than look-up the hyp vCPU on every run hypercall at EL2, introduce a per-CPU 'loaded_hyp_vcpu' tracking variable which is updated by a pair of load/put hypercalls called directly from kvm_arch_vcpu_{load,put}() when pKVM is enabled. Tested-by: Fuad Tabba <tabba@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20241218194059.3670226-10-qperret@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-12-20 09:44:00 +00:00
Quentin Perret	99996d575e	KVM: arm64: Add {get,put}_pkvm_hyp_vm() helpers In preparation for accessing pkvm_hyp_vm structures at EL2 in a context where we can't always expect a vCPU to be loaded (e.g. MMU notifiers), introduce get/put helpers to get temporary references to hyp VMs from any context. Tested-by: Fuad Tabba <tabba@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20241218194059.3670226-9-qperret@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-12-20 09:44:00 +00:00
James Clark	d798bc6f3c	arm64: Fix usage of new shifted MDCR_EL2 values Since the linked fixes commit, these masks are already shifted so remove the shifts. One issue that this fixes is SPE and TRBE not being available anymore: arm_spe_pmu arm,spe-v1: profiling buffer owned by higher exception level Fixes: `641630313e` ("arm64: sysreg: Migrate MDCR_EL2 definition to table") Signed-off-by: James Clark <james.clark@linaro.org> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20241122164636.2944180-1-james.clark@linaro.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-11-26 06:31:36 -08:00
Fuad Tabba	b56680de9c	KVM: arm64: Initialize trap register values in hyp in pKVM Handle the initialization of trap registers at the hypervisor in pKVM, even for non-protected guests. The host is not trusted with the values of the trap registers, regardless of the VM type. Therefore, when switching between the host and the guests, only flush the HCR_EL2 TWI and TWE bits. The host is allowed to configure these for opportunistic scheduling, as neither affects the protection of VMs or the hypervisor. Reported-by: Will Deacon <will@kernel.org> Fixes: `814ad8f96e` ("KVM: arm64: Drop trapping of PAuth instructions/keys") Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20241018074833.2563674-5-tabba@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-10-31 18:45:24 +00:00
Fuad Tabba	cb0c272ace	KVM: arm64: Initialize the hypervisor's VM state at EL2 Do not trust the state of the VM as provided by the host when initializing the hypervisor's view of the VM sate. Initialize it instead at EL2 to a known good and safe state, as pKVM already does with hypervisor VCPU states. Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20241018074833.2563674-4-tabba@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-10-31 18:45:24 +00:00
Fuad Tabba	0546d4a925	KVM: arm64: Move pkvm_vcpu_init_traps() to init_pkvm_hyp_vcpu() Move pkvm_vcpu_init_traps() to the initialization of the hypervisor's vcpu state in init_pkvm_hyp_vcpu(), and remove the associated hypercall. In protected mode, traps need to be initialized whenever a VCPU is initialized anyway, and not only for protected VMs. This also saves an unnecessary hypercall. Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20241018074833.2563674-2-tabba@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-10-31 18:45:24 +00:00
Vincent Donnefort	78fee4198b	KVM: arm64: Fix __pkvm_init_vcpu cptr_el2 error path On an error, hyp_vcpu will be accessed while this memory has already been relinquished to the host and unmapped from the hypervisor. Protect the CPTR assignment with an early return. Fixes: `b5b9955617` ("KVM: arm64: Eagerly restore host fpsimd/sve state in pKVM") Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Vincent Donnefort <vdonnefort@google.com> Link: https://lore.kernel.org/r/20240919110500.2345927-1-vdonnefort@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-10-01 15:25:23 +01:00
Anshuman Khandual	42b9fed388	KVM: arm64: Replace custom macros with fields from ID_AA64PFR0_EL1 This replaces custom macros usage (i.e ID_AA64PFR0_EL1_ELx_64BIT_ONLY and ID_AA64PFR0_EL1_ELx_32BIT_64BIT) and instead directly uses register fields from ID_AA64PFR0_EL1 sysreg definition. Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: kvmarm@lists.linux.dev Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20240613102710.3295108-2-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-07-04 12:21:12 +01:00
Fuad Tabba	a69283ae1d	KVM: arm64: Refactor CPACR trap bit setting/clearing to use ELx format When setting/clearing CPACR bits for EL0 and EL1, use the ELx format of the bits, which covers both. This makes the code clearer, and reduces the chances of accidentally missing a bit. No functional change intended. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20240603122852.3923848-9-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-06-04 15:06:33 +01:00
Fuad Tabba	1696fc2174	KVM: arm64: Consolidate initializing the host data's fpsimd_state/sve in pKVM Now that we have introduced finalize_init_hyp_mode(), lets consolidate the initializing of the host_data fpsimd_state and sve state. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240603122852.3923848-8-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-06-04 15:06:33 +01:00
Fuad Tabba	b5b9955617	KVM: arm64: Eagerly restore host fpsimd/sve state in pKVM When running in protected mode we don't want to leak protected guest state to the host, including whether a guest has used fpsimd/sve. Therefore, eagerly restore the host state on guest exit when running in protected mode, which happens only if the guest has used fpsimd/sve. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20240603122852.3923848-7-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-06-04 15:06:33 +01:00
Fuad Tabba	66d5b53e20	KVM: arm64: Allocate memory mapped at hyp for host sve state in pKVM Protected mode needs to maintain (save/restore) the host's sve state, rather than relying on the host kernel to do that. This is to avoid leaking information to the host about guests and the type of operations they are performing. As a first step towards that, allocate memory mapped at hyp, per cpu, for the host sve state. The following patch will use this memory to save/restore the host state. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20240603122852.3923848-6-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-06-04 15:06:33 +01:00
Fuad Tabba	40458a66af	KVM: arm64: Fix comment for __pkvm_vcpu_init_traps() Fix the comment to clarify that __pkvm_vcpu_init_traps() initializes traps for all VMs in protected mode, and not only for protected VMs. Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-14-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-05-01 16:48:14 +01:00
Quentin Perret	cb16301626	KVM: arm64: Issue CMOs when tearing down guest s2 pages On the guest teardown path, pKVM will zero the pages used to back the guest data structures before returning them to the host as they may contain secrets (e.g. in the vCPU registers). However, the zeroing is done using a cacheable alias, and CMOs are missing, hence giving the host a potential opportunity to read the original content of the guest structs from memory. Fix this by issuing CMOs after zeroing the pages. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-6-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-05-01 16:46:58 +01:00
Fuad Tabba	4c22a40dd9	KVM: arm64: Initialize the kvm host data's fpsimd_state pointer in pKVM Since the host_fpsimd_state has been removed from kvm_vcpu_arch, it isn't pointing to the hyp's version of the host fp_regs in protected mode. Initialize the host_data fpsimd_state point to the host_data's context fp_regs on pKVM initialization. Fixes: `51e09b5572` ("KVM: arm64: Exclude host_fpsimd_state pointer from kvm_vcpu_arch") Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-2-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-05-01 16:46:58 +01:00
Linus Torvalds	09d1c6a80f	Generic: - Use memdup_array_user() to harden against overflow. - Unconditionally advertise KVM_CAP_DEVICE_CTRL for all architectures. - Clean up Kconfigs that all KVM architectures were selecting - New functionality around "guest_memfd", a new userspace API that creates an anonymous file and returns a file descriptor that refers to it. guest_memfd files are bound to their owning virtual machine, cannot be mapped, read, or written by userspace, and cannot be resized. guest_memfd files do however support PUNCH_HOLE, which can be used to switch a memory area between guest_memfd and regular anonymous memory. - New ioctl KVM_SET_MEMORY_ATTRIBUTES allowing userspace to specify per-page attributes for a given page of guest memory; right now the only attribute is whether the guest expects to access memory via guest_memfd or not, which in Confidential SVMs backed by SEV-SNP, TDX or ARM64 pKVM is checked by firmware or hypervisor that guarantees confidentiality (AMD PSP, Intel TDX module, or EL2 in the case of pKVM). x86: - Support for "software-protected VMs" that can use the new guest_memfd and page attributes infrastructure. This is mostly useful for testing, since there is no pKVM-like infrastructure to provide a meaningfully reduced TCB. - Fix a relatively benign off-by-one error when splitting huge pages during CLEAR_DIRTY_LOG. - Fix a bug where KVM could incorrectly test-and-clear dirty bits in non-leaf TDP MMU SPTEs if a racing thread replaces a huge SPTE with a non-huge SPTE. - Use more generic lockdep assertions in paths that don't actually care about whether the caller is a reader or a writer. - let Xen guests opt out of having PV clock reported as "based on a stable TSC", because some of them don't expect the "TSC stable" bit (added to the pvclock ABI by KVM, but never set by Xen) to be set. - Revert a bogus, made-up nested SVM consistency check for TLB_CONTROL. - Advertise flush-by-ASID support for nSVM unconditionally, as KVM always flushes on nested transitions, i.e. always satisfies flush requests. This allows running bleeding edge versions of VMware Workstation on top of KVM. - Sanity check that the CPU supports flush-by-ASID when enabling SEV support. - On AMD machines with vNMI, always rely on hardware instead of intercepting IRET in some cases to detect unmasking of NMIs - Support for virtualizing Linear Address Masking (LAM) - Fix a variety of vPMU bugs where KVM fail to stop/reset counters and other state prior to refreshing the vPMU model. - Fix a double-overflow PMU bug by tracking emulated counter events using a dedicated field instead of snapshotting the "previous" counter. If the hardware PMC count triggers overflow that is recognized in the same VM-Exit that KVM manually bumps an event count, KVM would pend PMIs for both the hardware-triggered overflow and for KVM-triggered overflow. - Turn off KVM_WERROR by default for all configs so that it's not inadvertantly enabled by non-KVM developers, which can be problematic for subsystems that require no regressions for W=1 builds. - Advertise all of the host-supported CPUID bits that enumerate IA32_SPEC_CTRL "features". - Don't force a masterclock update when a vCPU synchronizes to the current TSC generation, as updating the masterclock can cause kvmclock's time to "jump" unexpectedly, e.g. when userspace hotplugs a pre-created vCPU. - Use RIP-relative address to read kvm_rebooting in the VM-Enter fault paths, partly as a super minor optimization, but mostly to make KVM play nice with position independent executable builds. - Guard KVM-on-HyperV's range-based TLB flush hooks with an #ifdef on CONFIG_HYPERV as a minor optimization, and to self-document the code. - Add CONFIG_KVM_HYPERV to allow disabling KVM support for HyperV "emulation" at build time. ARM64: - LPA2 support, adding 52bit IPA/PA capability for 4kB and 16kB base granule sizes. Branch shared with the arm64 tree. - Large Fine-Grained Trap rework, bringing some sanity to the feature, although there is more to come. This comes with a prefix branch shared with the arm64 tree. - Some additional Nested Virtualization groundwork, mostly introducing the NV2 VNCR support and retargetting the NV support to that version of the architecture. - A small set of vgic fixes and associated cleanups. Loongarch: - Optimization for memslot hugepage checking - Cleanup and fix some HW/SW timer issues - Add LSX/LASX (128bit/256bit SIMD) support RISC-V: - KVM_GET_REG_LIST improvement for vector registers - Generate ISA extension reg_list using macros in get-reg-list selftest - Support for reporting steal time along with selftest s390: - Bugfixes Selftests: - Fix an annoying goof where the NX hugepage test prints out garbage instead of the magic token needed to run the test. - Fix build errors when a header is delete/moved due to a missing flag in the Makefile. - Detect if KVM bugged/killed a selftest's VM and print out a helpful message instead of complaining that a random ioctl() failed. - Annotate the guest printf/assert helpers with __printf(), and fix the various bugs that were lurking due to lack of said annotation. There are two non-KVM patches buried in the middle of guest_memfd support: fs: Rename anon_inode_getfile_secure() and anon_inode_getfd_secure() mm: Add AS_UNMOVABLE to mark mapping as completely unmovable The first is small and mostly suggested-by Christian Brauner; the second a bit less so but it was written by an mm person (Vlastimil Babka). -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmWcMWkUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroO15gf/WLmmg3SET6Uzw9iEq2xo28831ZA+ 6kpILfIDGKozV5safDmMvcInlc/PTnqOFrsKyyN4kDZ+rIJiafJdg/loE0kPXBML wdR+2ix5kYI1FucCDaGTahskBDz8Lb/xTpwGg9BFLYFNmuUeHc74o6GoNvr1uliE 4kLZL2K6w0cSMPybUD+HqGaET80ZqPwecv+s1JL+Ia0kYZJONJifoHnvOUJ7DpEi rgudVdgzt3EPjG0y1z6MjvDBXTCOLDjXajErlYuZD3Ej8N8s59Dh2TxOiDNTLdP4 a4zjRvDmgyr6H6sz+upvwc7f4M4p+DBvf+TkWF54mbeObHUYliStqURIoA== =66Ws -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "Generic: - Use memdup_array_user() to harden against overflow. - Unconditionally advertise KVM_CAP_DEVICE_CTRL for all architectures. - Clean up Kconfigs that all KVM architectures were selecting - New functionality around "guest_memfd", a new userspace API that creates an anonymous file and returns a file descriptor that refers to it. guest_memfd files are bound to their owning virtual machine, cannot be mapped, read, or written by userspace, and cannot be resized. guest_memfd files do however support PUNCH_HOLE, which can be used to switch a memory area between guest_memfd and regular anonymous memory. - New ioctl KVM_SET_MEMORY_ATTRIBUTES allowing userspace to specify per-page attributes for a given page of guest memory; right now the only attribute is whether the guest expects to access memory via guest_memfd or not, which in Confidential SVMs backed by SEV-SNP, TDX or ARM64 pKVM is checked by firmware or hypervisor that guarantees confidentiality (AMD PSP, Intel TDX module, or EL2 in the case of pKVM). x86: - Support for "software-protected VMs" that can use the new guest_memfd and page attributes infrastructure. This is mostly useful for testing, since there is no pKVM-like infrastructure to provide a meaningfully reduced TCB. - Fix a relatively benign off-by-one error when splitting huge pages during CLEAR_DIRTY_LOG. - Fix a bug where KVM could incorrectly test-and-clear dirty bits in non-leaf TDP MMU SPTEs if a racing thread replaces a huge SPTE with a non-huge SPTE. - Use more generic lockdep assertions in paths that don't actually care about whether the caller is a reader or a writer. - let Xen guests opt out of having PV clock reported as "based on a stable TSC", because some of them don't expect the "TSC stable" bit (added to the pvclock ABI by KVM, but never set by Xen) to be set. - Revert a bogus, made-up nested SVM consistency check for TLB_CONTROL. - Advertise flush-by-ASID support for nSVM unconditionally, as KVM always flushes on nested transitions, i.e. always satisfies flush requests. This allows running bleeding edge versions of VMware Workstation on top of KVM. - Sanity check that the CPU supports flush-by-ASID when enabling SEV support. - On AMD machines with vNMI, always rely on hardware instead of intercepting IRET in some cases to detect unmasking of NMIs - Support for virtualizing Linear Address Masking (LAM) - Fix a variety of vPMU bugs where KVM fail to stop/reset counters and other state prior to refreshing the vPMU model. - Fix a double-overflow PMU bug by tracking emulated counter events using a dedicated field instead of snapshotting the "previous" counter. If the hardware PMC count triggers overflow that is recognized in the same VM-Exit that KVM manually bumps an event count, KVM would pend PMIs for both the hardware-triggered overflow and for KVM-triggered overflow. - Turn off KVM_WERROR by default for all configs so that it's not inadvertantly enabled by non-KVM developers, which can be problematic for subsystems that require no regressions for W=1 builds. - Advertise all of the host-supported CPUID bits that enumerate IA32_SPEC_CTRL "features". - Don't force a masterclock update when a vCPU synchronizes to the current TSC generation, as updating the masterclock can cause kvmclock's time to "jump" unexpectedly, e.g. when userspace hotplugs a pre-created vCPU. - Use RIP-relative address to read kvm_rebooting in the VM-Enter fault paths, partly as a super minor optimization, but mostly to make KVM play nice with position independent executable builds. - Guard KVM-on-HyperV's range-based TLB flush hooks with an #ifdef on CONFIG_HYPERV as a minor optimization, and to self-document the code. - Add CONFIG_KVM_HYPERV to allow disabling KVM support for HyperV "emulation" at build time. ARM64: - LPA2 support, adding 52bit IPA/PA capability for 4kB and 16kB base granule sizes. Branch shared with the arm64 tree. - Large Fine-Grained Trap rework, bringing some sanity to the feature, although there is more to come. This comes with a prefix branch shared with the arm64 tree. - Some additional Nested Virtualization groundwork, mostly introducing the NV2 VNCR support and retargetting the NV support to that version of the architecture. - A small set of vgic fixes and associated cleanups. Loongarch: - Optimization for memslot hugepage checking - Cleanup and fix some HW/SW timer issues - Add LSX/LASX (128bit/256bit SIMD) support RISC-V: - KVM_GET_REG_LIST improvement for vector registers - Generate ISA extension reg_list using macros in get-reg-list selftest - Support for reporting steal time along with selftest s390: - Bugfixes Selftests: - Fix an annoying goof where the NX hugepage test prints out garbage instead of the magic token needed to run the test. - Fix build errors when a header is delete/moved due to a missing flag in the Makefile. - Detect if KVM bugged/killed a selftest's VM and print out a helpful message instead of complaining that a random ioctl() failed. - Annotate the guest printf/assert helpers with __printf(), and fix the various bugs that were lurking due to lack of said annotation" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (185 commits) x86/kvm: Do not try to disable kvmclock if it was not enabled KVM: x86: add missing "depends on KVM" KVM: fix direction of dependency on MMU notifiers KVM: introduce CONFIG_KVM_COMMON KVM: arm64: Add missing memory barriers when switching to pKVM's hyp pgd KVM: arm64: vgic-its: Avoid potential UAF in LPI translation cache RISC-V: KVM: selftests: Add get-reg-list test for STA registers RISC-V: KVM: selftests: Add steal_time test support RISC-V: KVM: selftests: Add guest_sbi_probe_extension RISC-V: KVM: selftests: Move sbi_ecall to processor.c RISC-V: KVM: Implement SBI STA extension RISC-V: KVM: Add support for SBI STA registers RISC-V: KVM: Add support for SBI extension registers RISC-V: KVM: Add SBI STA info to vcpu_arch RISC-V: KVM: Add steal-update vcpu request RISC-V: KVM: Add SBI STA extension skeleton RISC-V: paravirt: Implement steal-time support RISC-V: Add SBI STA extension definitions RISC-V: paravirt: Add skeleton for pv-time support RISC-V: KVM: Fix indentation in kvm_riscv_vcpu_set_reg_csr() ...	2024-01-17 13:03:37 -08:00
Fuad Tabba	9d52612690	KVM: arm64: Trap external trace for protected VMs pKVM does not support external trace for protected VMs. Trap external trace, and add the ExtTrcBuff to make it possible to check for the feature. Signed-off-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20231214100158.2305400-18-tabba@google.com	2023-12-18 11:25:51 +00:00
Marc Zyngier	ced242ba9d	KVM: arm64: Remove VPIPT I-cache handling We have some special handling for VPIPT I-cache in critical parts of the cache and TLB maintenance. Remove it. Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20231204143606.1806432-2-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2023-12-05 11:38:03 +00:00
Marc Zyngier	fe49fd940e	KVM: arm64: Move VTCR_EL2 into struct s2_mmu We currently have a global VTCR_EL2 value for each guest, even if the guest uses NV. This implies that the guest's own S2 must fit in the host's. This is odd, for multiple reasons: - the PARange values and the number of IPA bits don't necessarily match: you can have 33 bits of IPA space, and yet you can only describe 32 or 36 bits of PARange - When userspace set the IPA space, it creates a contract with the kernel saying "this is the IPA space I'm prepared to handle". At no point does it constraint the guest's own IPA space as long as the guest doesn't try to use a [I]PA outside of the IPA space set by userspace - We don't even try to hide the value of ID_AA64MMFR0_EL1.PARange. And then there is the consequence of the above: if a guest tries to create a S2 that has for input address something that is larger than the IPA space defined by the host, we inject a fatal exception. This is no good. For all intent and purposes, a guest should be able to have the S2 it really wants, as long as the output address of that S2 isn't outside of the IPA space. For that, we need to have a per-s2_mmu VTCR_EL2 setting, which allows us to represent the full PARange. Move the vctr field into the s2_mmu structure, which has no impact whatsoever, except for NV. Note that once we are able to override ID_AA64MMFR0_EL1.PARange from userspace, we'll also be able to restrict the size of the shadow S2 that NV uses. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20231012205108.3937270-1-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-10-23 18:48:46 +00:00
Marc Zyngier	38cba55008	KVM: arm64: Force HCR_E2H in guest context when ARM64_KVM_HVHE is set Also make sure HCR_EL2.E2H is set when switching HCR_EL2 in guest context. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230609162200.2024064-16-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-12 23:17:24 +00:00
Marc Zyngier	75c76ab5a6	KVM: arm64: Rework CPTR_EL2 programming for HVHE configuration Just like we repainted the early arm64 code, we need to update the CPTR_EL2 accesses that are taking place in the nVHE code when hVHE is used, making them look as if they were CPACR_EL1 accesses. Just like the VHE code. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230609162200.2024064-14-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2023-06-12 23:17:24 +00:00
Will Deacon	be66e67f17	KVM: arm64: Use the pKVM hyp vCPU structure in handle___kvm_vcpu_run() As a stepping stone towards deprivileging the host's access to the guest's vCPU structures, introduce some naive flush/sync routines to copy most of the host vCPU into the hyp vCPU on vCPU run and back again on return to EL1. This allows us to run using the pKVM hyp structures when KVM is initialised in protected mode. Tested-by: Vincent Donnefort <vdonnefort@google.com> Co-developed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Fuad Tabba <tabba@google.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221110190259.26861-27-will@kernel.org	2022-11-11 17:19:35 +00:00
Will Deacon	73f38ef2ae	KVM: arm64: Maintain a copy of 'kvm_arm_vmid_bits' at EL2 Sharing 'kvm_arm_vmid_bits' between EL1 and EL2 allows the host to modify the variable arbitrarily, potentially leading to all sorts of shenanians as this is used to configure the VTTBR register for the guest stage-2. In preparation for unmapping host sections entirely from EL2, maintain a copy of 'kvm_arm_vmid_bits' in the pKVM hypervisor and initialise it from the host value while it is still trusted. Tested-by: Vincent Donnefort <vdonnefort@google.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221110190259.26861-23-will@kernel.org	2022-11-11 17:19:35 +00:00
Quentin Perret	f41dff4efb	KVM: arm64: Return guest memory from EL2 via dedicated teardown memcache Rather than relying on the host to free the previously-donated pKVM hypervisor VM pages explicitly on teardown, introduce a dedicated teardown memcache which allows the host to reclaim guest memory resources without having to keep track of all of the allocations made by the pKVM hypervisor at EL2. Tested-by: Vincent Donnefort <vdonnefort@google.com> Co-developed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Fuad Tabba <tabba@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Will Deacon <will@kernel.org> [maz: dropped __maybe_unused from unmap_donated_memory_noclear()] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221110190259.26861-21-will@kernel.org	2022-11-11 17:18:58 +00:00
Will Deacon	13e248aab7	KVM: arm64: Provide I-cache invalidation by virtual address at EL2 In preparation for handling cache maintenance of guest pages from within the pKVM hypervisor at EL2, introduce an EL2 copy of icache_inval_pou() which will later be plumbed into the stage-2 page-table cache maintenance callbacks, ensuring that the initial contents of pages mapped as executable into the guest stage-2 page-table is visible to the instruction fetcher. Tested-by: Vincent Donnefort <vdonnefort@google.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221110190259.26861-17-will@kernel.org	2022-11-11 17:16:25 +00:00
Fuad Tabba	9d0c063a4d	KVM: arm64: Instantiate pKVM hypervisor VM and vCPU structures from EL1 With the pKVM hypervisor at EL2 now offering hypercalls to the host for creating and destroying VM and vCPU structures, plumb these in to the existing arm64 KVM backend to ensure that the hypervisor data structures are allocated and initialised on first vCPU run for a pKVM guest. In the host, 'struct kvm_protected_vm' is introduced to hold the handle of the pKVM VM instance as well as to track references to the memory donated to the hypervisor so that it can be freed back to the host allocator following VM teardown. The stage-2 page-table, hypervisor VM and vCPU structures are allocated separately so as to avoid the need for a large physically-contiguous allocation in the host at run-time. Tested-by: Vincent Donnefort <vdonnefort@google.com> Signed-off-by: Fuad Tabba <tabba@google.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221110190259.26861-14-will@kernel.org	2022-11-11 17:16:24 +00:00

1 2

61 Commits