Commit Graph

582 Commits

Author SHA1 Message Date
Marc Zyngier
4d4f52052b KVM: arm64: nv: Drop EL12 register traps that are redirected to VNCR
With FEAT_NV2, a bunch of system register writes are turned into
memory writes. This is specially the fate of the EL12 registers
that the guest hypervisor manipulates out of context.

Remove the trap descriptors for those, as they are never going
to be used again.

Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-12-19 09:51:06 +00:00
Marc Zyngier
3ed0b5123c KVM: arm64: nv: Compute NV view of idregs as a one-off
Now that we have a full copy of the idregs for each VM, there is
no point in repainting the sysregs on each access. Instead, we
can simply perform the transmation as a one-off and be done
with it.

Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-12-19 09:51:00 +00:00
Fuad Tabba
676f482354 KVM: arm64: Handle HAFGRTR_EL2 trapping in nested virt
Add the encodings to fine grain trapping fields for HAFGRTR_EL2
and add the associated handling code in nested virt. Based on
DDI0601 2023-09. Add the missing field definitions as well,
both to generate the correct RES0 mask and to be able to toggle
their FGT bits.

Also add the code for handling FGT trapping, reading of the
register, to nested virt.

Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231214100158.2305400-10-tabba@google.com
2023-12-18 11:25:50 +00:00
James Clark
62e1f212e5 arm: perf/kvm: Use GENMASK for ARMV8_PMU_PMCR_N
This is so that FIELD_GET and FIELD_PREP can be used and that the fields
are in a consistent format to arm64/tools/sysreg

Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20231211161331.1277825-3-james.clark@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-12-12 09:46:21 +00:00
Linus Torvalds
6803bd7956 ARM:
* Generalized infrastructure for 'writable' ID registers, effectively
   allowing userspace to opt-out of certain vCPU features for its guest
 
 * Optimization for vSGI injection, opportunistically compressing MPIDR
   to vCPU mapping into a table
 
 * Improvements to KVM's PMU emulation, allowing userspace to select
   the number of PMCs available to a VM
 
 * Guest support for memory operation instructions (FEAT_MOPS)
 
 * Cleanups to handling feature flags in KVM_ARM_VCPU_INIT, squashing
   bugs and getting rid of useless code
 
 * Changes to the way the SMCCC filter is constructed, avoiding wasted
   memory allocations when not in use
 
 * Load the stage-2 MMU context at vcpu_load() for VHE systems, reducing
   the overhead of errata mitigations
 
 * Miscellaneous kernel and selftest fixes
 
 LoongArch:
 
 * New architecture.  The hardware uses the same model as x86, s390
   and RISC-V, where guest/host mode is orthogonal to supervisor/user
   mode.  The virtualization extensions are very similar to MIPS,
   therefore the code also has some similarities but it's been cleaned
   up to avoid some of the historical bogosities that are found in
   arch/mips.  The kernel emulates MMU, timer and CSR accesses, while
   interrupt controllers are only emulated in userspace, at least for
   now.
 
 RISC-V:
 
 * Support for the Smstateen and Zicond extensions
 
 * Support for virtualizing senvcfg
 
 * Support for virtualized SBI debug console (DBCN)
 
 S390:
 
 * Nested page table management can be monitored through tracepoints
   and statistics
 
 x86:
 
 * Fix incorrect handling of VMX posted interrupt descriptor in KVM_SET_LAPIC,
   which could result in a dropped timer IRQ
 
 * Avoid WARN on systems with Intel IPI virtualization
 
 * Add CONFIG_KVM_MAX_NR_VCPUS, to allow supporting up to 4096 vCPUs without
   forcing more common use cases to eat the extra memory overhead.
 
 * Add virtualization support for AMD SRSO mitigation (IBPB_BRTYPE and
   SBPB, aka Selective Branch Predictor Barrier).
 
 * Fix a bug where restoring a vCPU snapshot that was taken within 1 second of
   creating the original vCPU would cause KVM to try to synchronize the vCPU's
   TSC and thus clobber the correct TSC being set by userspace.
 
 * Compute guest wall clock using a single TSC read to avoid generating an
   inaccurate time, e.g. if the vCPU is preempted between multiple TSC reads.
 
 * "Virtualize" HWCR.TscFreqSel to make Linux guests happy, which complain
   about a "Firmware Bug" if the bit isn't set for select F/M/S combos.
   Likewise "virtualize" (ignore) MSR_AMD64_TW_CFG to appease Windows Server
   2022.
 
 * Don't apply side effects to Hyper-V's synthetic timer on writes from
   userspace to fix an issue where the auto-enable behavior can trigger
   spurious interrupts, i.e. do auto-enabling only for guest writes.
 
 * Remove an unnecessary kick of all vCPUs when synchronizing the dirty log
   without PML enabled.
 
 * Advertise "support" for non-serializing FS/GS base MSR writes as appropriate.
 
 * Harden the fast page fault path to guard against encountering an invalid
   root when walking SPTEs.
 
 * Omit "struct kvm_vcpu_xen" entirely when CONFIG_KVM_XEN=n.
 
 * Use the fast path directly from the timer callback when delivering Xen
   timer events, instead of waiting for the next iteration of the run loop.
   This was not done so far because previously proposed code had races,
   but now care is taken to stop the hrtimer at critical points such as
   restarting the timer or saving the timer information for userspace.
 
 * Follow the lead of upstream Xen and ignore the VCPU_SSHOTTMR_future flag.
 
 * Optimize injection of PMU interrupts that are simultaneous with NMIs.
 
 * Usual handful of fixes for typos and other warts.
 
 x86 - MTRR/PAT fixes and optimizations:
 
 * Clean up code that deals with honoring guest MTRRs when the VM has
   non-coherent DMA and host MTRRs are ignored, i.e. EPT is enabled.
 
 * Zap EPT entries when non-coherent DMA assignment stops/start to prevent
   using stale entries with the wrong memtype.
 
 * Don't ignore guest PAT for CR0.CD=1 && KVM_X86_QUIRK_CD_NW_CLEARED=y.
   This was done as a workaround for virtual machine BIOSes that did not
   bother to clear CR0.CD (because ancient KVM/QEMU did not bother to
   set it, in turn), and there's zero reason to extend the quirk to
   also ignore guest PAT.
 
 x86 - SEV fixes:
 
 * Report KVM_EXIT_SHUTDOWN instead of EINVAL if KVM intercepts SHUTDOWN while
   running an SEV-ES guest.
 
 * Clean up the recognition of emulation failures on SEV guests, when KVM would
   like to "skip" the instruction but it had already been partially emulated.
   This makes it possible to drop a hack that second guessed the (insufficient)
   information provided by the emulator, and just do the right thing.
 
 Documentation:
 
 * Various updates and fixes, mostly for x86
 
 * MTRR and PAT fixes and optimizations:
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmVBZc0UHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroP1LQf+NgsmZ1lkGQlKdSdijoQ856w+k0or
 l2SV1wUwiEdFPSGK+RTUlHV5Y1ni1dn/CqCVIJZKEI3ZtZ1m9/4HKIRXvbMwFHIH
 hx+E4Lnf8YUjsGjKTLd531UKcpphztZavQ6pXLEwazkSkDEra+JIKtooI8uU+9/p
 bd/eF1V+13a8CHQf1iNztFJVxqBJbVlnPx4cZDRQQvewskIDGnVDtwbrwCUKGtzD
 eNSzhY7si6O2kdQNkuA8xPhg29dYX9XLaCK2K1l8xOUm8WipLdtF86GAKJ5BVuOL
 6ek/2QCYjZ7a+coAZNfgSEUi8JmFHEqCo7cnKmWzPJp+2zyXsdudqAhT1g==
 =UIxm
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM:

   - Generalized infrastructure for 'writable' ID registers, effectively
     allowing userspace to opt-out of certain vCPU features for its
     guest

   - Optimization for vSGI injection, opportunistically compressing
     MPIDR to vCPU mapping into a table

   - Improvements to KVM's PMU emulation, allowing userspace to select
     the number of PMCs available to a VM

   - Guest support for memory operation instructions (FEAT_MOPS)

   - Cleanups to handling feature flags in KVM_ARM_VCPU_INIT, squashing
     bugs and getting rid of useless code

   - Changes to the way the SMCCC filter is constructed, avoiding wasted
     memory allocations when not in use

   - Load the stage-2 MMU context at vcpu_load() for VHE systems,
     reducing the overhead of errata mitigations

   - Miscellaneous kernel and selftest fixes

  LoongArch:

   - New architecture for kvm.

     The hardware uses the same model as x86, s390 and RISC-V, where
     guest/host mode is orthogonal to supervisor/user mode. The
     virtualization extensions are very similar to MIPS, therefore the
     code also has some similarities but it's been cleaned up to avoid
     some of the historical bogosities that are found in arch/mips. The
     kernel emulates MMU, timer and CSR accesses, while interrupt
     controllers are only emulated in userspace, at least for now.

  RISC-V:

   - Support for the Smstateen and Zicond extensions

   - Support for virtualizing senvcfg

   - Support for virtualized SBI debug console (DBCN)

  S390:

   - Nested page table management can be monitored through tracepoints
     and statistics

  x86:

   - Fix incorrect handling of VMX posted interrupt descriptor in
     KVM_SET_LAPIC, which could result in a dropped timer IRQ

   - Avoid WARN on systems with Intel IPI virtualization

   - Add CONFIG_KVM_MAX_NR_VCPUS, to allow supporting up to 4096 vCPUs
     without forcing more common use cases to eat the extra memory
     overhead.

   - Add virtualization support for AMD SRSO mitigation (IBPB_BRTYPE and
     SBPB, aka Selective Branch Predictor Barrier).

   - Fix a bug where restoring a vCPU snapshot that was taken within 1
     second of creating the original vCPU would cause KVM to try to
     synchronize the vCPU's TSC and thus clobber the correct TSC being
     set by userspace.

   - Compute guest wall clock using a single TSC read to avoid
     generating an inaccurate time, e.g. if the vCPU is preempted
     between multiple TSC reads.

   - "Virtualize" HWCR.TscFreqSel to make Linux guests happy, which
     complain about a "Firmware Bug" if the bit isn't set for select
     F/M/S combos. Likewise "virtualize" (ignore) MSR_AMD64_TW_CFG to
     appease Windows Server 2022.

   - Don't apply side effects to Hyper-V's synthetic timer on writes
     from userspace to fix an issue where the auto-enable behavior can
     trigger spurious interrupts, i.e. do auto-enabling only for guest
     writes.

   - Remove an unnecessary kick of all vCPUs when synchronizing the
     dirty log without PML enabled.

   - Advertise "support" for non-serializing FS/GS base MSR writes as
     appropriate.

   - Harden the fast page fault path to guard against encountering an
     invalid root when walking SPTEs.

   - Omit "struct kvm_vcpu_xen" entirely when CONFIG_KVM_XEN=n.

   - Use the fast path directly from the timer callback when delivering
     Xen timer events, instead of waiting for the next iteration of the
     run loop. This was not done so far because previously proposed code
     had races, but now care is taken to stop the hrtimer at critical
     points such as restarting the timer or saving the timer information
     for userspace.

   - Follow the lead of upstream Xen and ignore the VCPU_SSHOTTMR_future
     flag.

   - Optimize injection of PMU interrupts that are simultaneous with
     NMIs.

   - Usual handful of fixes for typos and other warts.

  x86 - MTRR/PAT fixes and optimizations:

   - Clean up code that deals with honoring guest MTRRs when the VM has
     non-coherent DMA and host MTRRs are ignored, i.e. EPT is enabled.

   - Zap EPT entries when non-coherent DMA assignment stops/start to
     prevent using stale entries with the wrong memtype.

   - Don't ignore guest PAT for CR0.CD=1 && KVM_X86_QUIRK_CD_NW_CLEARED=y

     This was done as a workaround for virtual machine BIOSes that did
     not bother to clear CR0.CD (because ancient KVM/QEMU did not bother
     to set it, in turn), and there's zero reason to extend the quirk to
     also ignore guest PAT.

  x86 - SEV fixes:

   - Report KVM_EXIT_SHUTDOWN instead of EINVAL if KVM intercepts
     SHUTDOWN while running an SEV-ES guest.

   - Clean up the recognition of emulation failures on SEV guests, when
     KVM would like to "skip" the instruction but it had already been
     partially emulated. This makes it possible to drop a hack that
     second guessed the (insufficient) information provided by the
     emulator, and just do the right thing.

  Documentation:

   - Various updates and fixes, mostly for x86

   - MTRR and PAT fixes and optimizations"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (164 commits)
  KVM: selftests: Avoid using forced target for generating arm64 headers
  tools headers arm64: Fix references to top srcdir in Makefile
  KVM: arm64: Add tracepoint for MMIO accesses where ISV==0
  KVM: arm64: selftest: Perform ISB before reading PAR_EL1
  KVM: arm64: selftest: Add the missing .guest_prepare()
  KVM: arm64: Always invalidate TLB for stage-2 permission faults
  KVM: x86: Service NMI requests after PMI requests in VM-Enter path
  KVM: arm64: Handle AArch32 SPSR_{irq,abt,und,fiq} as RAZ/WI
  KVM: arm64: Do not let a L1 hypervisor access the *32_EL2 sysregs
  KVM: arm64: Refine _EL2 system register list that require trap reinjection
  arm64: Add missing _EL2 encodings
  arm64: Add missing _EL12 encodings
  KVM: selftests: aarch64: vPMU test for validating user accesses
  KVM: selftests: aarch64: vPMU register test for unimplemented counters
  KVM: selftests: aarch64: vPMU register test for implemented counters
  KVM: selftests: aarch64: Introduce vpmu_counter_access test
  tools: Import arm_pmuv3.h
  KVM: arm64: PMU: Allow userspace to limit PMCR_EL0.N for the guest
  KVM: arm64: Sanitize PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR} before first run
  KVM: arm64: Add {get,set}_user for PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR}
  ...
2023-11-02 15:45:15 -10:00
Linus Torvalds
56ec8e4cd8 arm64 updates for 6.7:
* Major refactoring of the CPU capability detection logic resulting in
   the removal of the cpus_have_const_cap() function and migrating the
   code to "alternative" branches where possible
 
 * Backtrace/kgdb: use IPIs and pseudo-NMI
 
 * Perf and PMU:
 
   - Add support for Ampere SoC PMUs
 
   - Multi-DTC improvements for larger CMN configurations with multiple
     Debug & Trace Controllers
 
   - Rework the Arm CoreSight PMU driver to allow separate registration of
     vendor backend modules
 
   - Fixes: add missing MODULE_DEVICE_TABLE to the amlogic perf
     driver; use device_get_match_data() in the xgene driver; fix NULL
     pointer dereference in the hisi driver caused by calling
     cpuhp_state_remove_instance(); use-after-free in the hisi driver
 
 * HWCAP updates:
 
   - FEAT_SVE_B16B16 (BFloat16)
 
   - FEAT_LRCPC3 (release consistency model)
 
   - FEAT_LSE128 (128-bit atomic instructions)
 
 * SVE: remove a couple of pseudo registers from the cpufeature code.
   There is logic in place already to detect mismatched SVE features
 
 * Miscellaneous:
 
   - Reduce the default swiotlb size (currently 64MB) if no ZONE_DMA
     bouncing is needed. The buffer is still required for small kmalloc()
     buffers
 
   - Fix module PLT counting with !RANDOMIZE_BASE
 
   - Restrict CPU_BIG_ENDIAN to LLVM IAS 15.x or newer move
     synchronisation code out of the set_ptes() loop
 
   - More compact cpufeature displaying enabled cores
 
   - Kselftest updates for the new CPU features
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmU7/QUACgkQa9axLQDI
 XvEx3xAAjICmHm+ryKJxS1IGXLYu2DXMcHUjeW6w1SxkK/vKhTMlHRx/CIWDze2l
 eENu7TcDLtTw+Gv9kqg30TSwzLfJhP9oFpX2T5TKkh5qlJlbz8fBtm+as14DTLCZ
 p2sra3J0w4B5JwTVqnj2RHOlEftMKvbyLGRkz3ve6wIUbsp5pXMkxAd/k3wOf0lC
 m6d9w1OMA2sOsw9YCgjcCNQGEzFMJk+13w7K+4w6A8Djn/Jxkt4fAFVn2ZlCiZzD
 NA2lTDWJqGmeGHo3iFdCTensWXmWTqjzxsNEf7PyBk5mBOdzDVxlTfEL7vnJg7gf
 BlTQ/nhIpra7rHQ9q2rwqEzbF+4Tn3uWlQfdDb7+/4goPjDh7tlBhEOYyOwTCEIT
 0t9cCSvBmSCKeXC3lKWWtJ+QJKhZHSmXN84EotTs65KyyfIsi4RuSezvV/+aIL86
 06sHYlYxETuujZP1cgOjf69Wsdsgizx0mqXJXf/xOjp22HFDcL4Bki6Rgi6t5OZj
 GEHG15kSE+eJ+RIpxpuAN8fdrlxYubsVLIksCqK7cZf9zXbQGIlifKAIrYiEx6kz
 FD+o+j/5niRWR6yJZCtCcGxqpSlwnYWPqc1Ds0GES8A/BphWMPozXUAZ0ll4Fnp1
 yyR2/Due/eBsCNESn579kP8989rashubB8vxvdx2fcWVtLC7VgE=
 =QaEo
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
 "No major architecture features this time around, just some new HWCAP
  definitions, support for the Ampere SoC PMUs and a few fixes/cleanups.

  The bulk of the changes is reworking of the CPU capability checking
  code (cpus_have_cap() etc).

   - Major refactoring of the CPU capability detection logic resulting
     in the removal of the cpus_have_const_cap() function and migrating
     the code to "alternative" branches where possible

   - Backtrace/kgdb: use IPIs and pseudo-NMI

   - Perf and PMU:

      - Add support for Ampere SoC PMUs

      - Multi-DTC improvements for larger CMN configurations with
        multiple Debug & Trace Controllers

      - Rework the Arm CoreSight PMU driver to allow separate
        registration of vendor backend modules

      - Fixes: add missing MODULE_DEVICE_TABLE to the amlogic perf
        driver; use device_get_match_data() in the xgene driver; fix
        NULL pointer dereference in the hisi driver caused by calling
        cpuhp_state_remove_instance(); use-after-free in the hisi driver

   - HWCAP updates:

      - FEAT_SVE_B16B16 (BFloat16)

      - FEAT_LRCPC3 (release consistency model)

      - FEAT_LSE128 (128-bit atomic instructions)

   - SVE: remove a couple of pseudo registers from the cpufeature code.
     There is logic in place already to detect mismatched SVE features

   - Miscellaneous:

      - Reduce the default swiotlb size (currently 64MB) if no ZONE_DMA
        bouncing is needed. The buffer is still required for small
        kmalloc() buffers

      - Fix module PLT counting with !RANDOMIZE_BASE

      - Restrict CPU_BIG_ENDIAN to LLVM IAS 15.x or newer move
        synchronisation code out of the set_ptes() loop

      - More compact cpufeature displaying enabled cores

      - Kselftest updates for the new CPU features"

 * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (83 commits)
  arm64: Restrict CPU_BIG_ENDIAN to GNU as or LLVM IAS 15.x or newer
  arm64: module: Fix PLT counting when CONFIG_RANDOMIZE_BASE=n
  arm64, irqchip/gic-v3, ACPI: Move MADT GICC enabled check into a helper
  perf: hisi: Fix use-after-free when register pmu fails
  drivers/perf: hisi_pcie: Initialize event->cpu only on success
  drivers/perf: hisi_pcie: Check the type first in pmu::event_init()
  arm64: cpufeature: Change DBM to display enabled cores
  arm64: cpufeature: Display the set of cores with a feature
  perf/arm-cmn: Enable per-DTC counter allocation
  perf/arm-cmn: Rework DTC counters (again)
  perf/arm-cmn: Fix DTC domain detection
  drivers: perf: arm_pmuv3: Drop some unused arguments from armv8_pmu_init()
  drivers: perf: arm_pmuv3: Read PMMIR_EL1 unconditionally
  drivers/perf: hisi: use cpuhp_state_remove_instance_nocalls() for hisi_hns3_pmu uninit process
  clocksource/drivers/arm_arch_timer: limit XGene-1 workaround
  arm64: Remove system_uses_lse_atomics()
  arm64: Mark the 'addr' argument to set_ptes() and __set_pte_at() as unused
  drivers/perf: xgene: Use device_get_match_data()
  perf/amlogic: add missing MODULE_DEVICE_TABLE
  arm64/mm: Hoist synchronization out of set_ptes() loop
  ...
2023-11-01 09:34:55 -10:00
Paolo Bonzini
45b890f768 KVM/arm64 updates for 6.7
- Generalized infrastructure for 'writable' ID registers, effectively
    allowing userspace to opt-out of certain vCPU features for its guest
 
  - Optimization for vSGI injection, opportunistically compressing MPIDR
    to vCPU mapping into a table
 
  - Improvements to KVM's PMU emulation, allowing userspace to select
    the number of PMCs available to a VM
 
  - Guest support for memory operation instructions (FEAT_MOPS)
 
  - Cleanups to handling feature flags in KVM_ARM_VCPU_INIT, squashing
    bugs and getting rid of useless code
 
  - Changes to the way the SMCCC filter is constructed, avoiding wasted
    memory allocations when not in use
 
  - Load the stage-2 MMU context at vcpu_load() for VHE systems, reducing
    the overhead of errata mitigations
 
  - Miscellaneous kernel and selftest fixes
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSNXHjWXuzMZutrKNKivnWIJHzdFgUCZUFJRgAKCRCivnWIJHzd
 FtgYAP9cMsc1Mhlw3jNQnTc6q0cbTulD/SoEDPUat1dXMqjs+gEAnskwQTrTX834
 fgGQeCAyp7Gmar+KeP64H0xm8kPSpAw=
 =R4M7
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 updates for 6.7

 - Generalized infrastructure for 'writable' ID registers, effectively
   allowing userspace to opt-out of certain vCPU features for its guest

 - Optimization for vSGI injection, opportunistically compressing MPIDR
   to vCPU mapping into a table

 - Improvements to KVM's PMU emulation, allowing userspace to select
   the number of PMCs available to a VM

 - Guest support for memory operation instructions (FEAT_MOPS)

 - Cleanups to handling feature flags in KVM_ARM_VCPU_INIT, squashing
   bugs and getting rid of useless code

 - Changes to the way the SMCCC filter is constructed, avoiding wasted
   memory allocations when not in use

 - Load the stage-2 MMU context at vcpu_load() for VHE systems, reducing
   the overhead of errata mitigations

 - Miscellaneous kernel and selftest fixes
2023-10-31 16:37:07 -04:00
Oliver Upton
123f42f0ad Merge branch kvm-arm64/pmu_pmcr_n into kvmarm/next
* kvm-arm64/pmu_pmcr_n:
  : User-defined PMC limit, courtesy Raghavendra Rao Ananta
  :
  : Certain VMMs may want to reserve some PMCs for host use while running a
  : KVM guest. This was a bit difficult before, as KVM advertised all
  : supported counters to the guest. Userspace can now limit the number of
  : advertised PMCs by writing to PMCR_EL0.N, as KVM's sysreg and PMU
  : emulation enforce the specified limit for handling guest accesses.
  KVM: selftests: aarch64: vPMU test for validating user accesses
  KVM: selftests: aarch64: vPMU register test for unimplemented counters
  KVM: selftests: aarch64: vPMU register test for implemented counters
  KVM: selftests: aarch64: Introduce vpmu_counter_access test
  tools: Import arm_pmuv3.h
  KVM: arm64: PMU: Allow userspace to limit PMCR_EL0.N for the guest
  KVM: arm64: Sanitize PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR} before first run
  KVM: arm64: Add {get,set}_user for PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR}
  KVM: arm64: PMU: Set PMCR_EL0.N for vCPU based on the associated PMU
  KVM: arm64: PMU: Add a helper to read a vCPU's PMCR_EL0
  KVM: arm64: Select default PMU in KVM_ARM_VCPU_INIT handler
  KVM: arm64: PMU: Introduce helpers to set the guest's PMU

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:24:19 +00:00
Oliver Upton
53ce49ea75 Merge branch kvm-arm64/mops into kvmarm/next
* kvm-arm64/mops:
  : KVM support for MOPS, courtesy of Kristina Martsenko
  :
  : MOPS adds new instructions for accelerating memcpy(), memset(), and
  : memmove() operations in hardware. This series brings virtualization
  : support for KVM guests, and allows VMs to run on asymmetrict systems
  : that may have different MOPS implementations.
  KVM: arm64: Expose MOPS instructions to guests
  KVM: arm64: Add handler for MOPS exceptions

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:21:19 +00:00
Oliver Upton
a87a36436c Merge branch kvm-arm64/writable-id-regs into kvmarm/next
* kvm-arm64/writable-id-regs:
  : Writable ID registers, courtesy of Jing Zhang
  :
  : This series significantly expands the architectural feature set that
  : userspace can manipulate via the ID registers. A new ioctl is defined
  : that makes the mutable fields in the ID registers discoverable to
  : userspace.
  KVM: selftests: Avoid using forced target for generating arm64 headers
  tools headers arm64: Fix references to top srcdir in Makefile
  KVM: arm64: selftests: Test for setting ID register from usersapce
  tools headers arm64: Update sysreg.h with kernel sources
  KVM: selftests: Generate sysreg-defs.h and add to include path
  perf build: Generate arm64's sysreg-defs.h and add to include path
  tools: arm64: Add a Makefile for generating sysreg-defs.h
  KVM: arm64: Document vCPU feature selection UAPIs
  KVM: arm64: Allow userspace to change ID_AA64ZFR0_EL1
  KVM: arm64: Allow userspace to change ID_AA64PFR0_EL1
  KVM: arm64: Allow userspace to change ID_AA64MMFR{0-2}_EL1
  KVM: arm64: Allow userspace to change ID_AA64ISAR{0-2}_EL1
  KVM: arm64: Bump up the default KVM sanitised debug version to v8p8
  KVM: arm64: Reject attempts to set invalid debug arch version
  KVM: arm64: Advertise selected DebugVer in DBGDIDR.Version
  KVM: arm64: Use guest ID register values for the sake of emulation
  KVM: arm64: Document KVM_ARM_GET_REG_WRITABLE_MASKS
  KVM: arm64: Allow userspace to get the writable masks for feature ID registers

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:21:09 +00:00
Oliver Upton
51e6079614 Merge branch kvm-arm64/nv-trap-fixes into kvmarm/next
* kvm-arm64/nv-trap-fixes:
  : NV trap forwarding fixes, courtesy Miguel Luis and Marc Zyngier
  :
  :  - Explicitly define the effects of HCR_EL2.NV on EL2 sysregs in the
  :    NV trap encoding
  :
  :  - Make EL2 registers that access AArch32 guest state UNDEF or RAZ/WI
  :    where appropriate for NV guests
  KVM: arm64: Handle AArch32 SPSR_{irq,abt,und,fiq} as RAZ/WI
  KVM: arm64: Do not let a L1 hypervisor access the *32_EL2 sysregs
  KVM: arm64: Refine _EL2 system register list that require trap reinjection
  arm64: Add missing _EL2 encodings
  arm64: Add missing _EL12 encodings

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:18:46 +00:00
Oliver Upton
7ff7dfe946 Merge branch kvm-arm64/pmevtyper-filter into kvmarm/next
* kvm-arm64/pmevtyper-filter:
  : Fixes to KVM's handling of the PMUv3 exception level filtering bits
  :
  :  - NSH (count at EL2) and M (count at EL3) should be stateful when the
  :    respective EL is advertised in the ID registers but have no effect on
  :    event counting.
  :
  :  - NSU and NSK modify the event filtering of EL0 and EL1, respectively.
  :    Though the kernel may not use these bits, other KVM guests might.
  :    Implement these bits exactly as written in the pseudocode if EL3 is
  :    advertised.
  KVM: arm64: Add PMU event filter bits required if EL3 is implemented
  KVM: arm64: Make PMEVTYPER<n>_EL0.NSH RES0 if EL2 isn't advertised

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-30 20:18:23 +00:00
Marc Zyngier
3f7915ccc9 KVM: arm64: Handle AArch32 SPSR_{irq,abt,und,fiq} as RAZ/WI
When trapping accesses from a NV guest that tries to access
SPSR_{irq,abt,und,fiq}, make sure we handle them as RAZ/WI,
as if AArch32 wasn't implemented.

This involves a bit of repainting to make the visibility
handler more generic.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231023095444.1587322-6-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-25 00:25:06 +00:00
Marc Zyngier
c7d11a61c7 KVM: arm64: Do not let a L1 hypervisor access the *32_EL2 sysregs
DBGVCR32_EL2, DACR32_EL2, IFSR32_EL2 and FPEXC32_EL2 are required to
UNDEF when AArch32 isn't implemented, which is definitely the case when
running NV.

Given that this is the only case where these registers can trap,
unconditionally inject an UNDEF exception.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20231023095444.1587322-5-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-25 00:24:57 +00:00
Reiji Watanabe
ea9ca904d2 KVM: arm64: PMU: Allow userspace to limit PMCR_EL0.N for the guest
KVM does not yet support userspace modifying PMCR_EL0.N (With
the previous patch, KVM ignores what is written by userspace).
Add support userspace limiting PMCR_EL0.N.

Disallow userspace to set PMCR_EL0.N to a value that is greater
than the host value as KVM doesn't support more event counters
than what the host HW implements. Also, make this register
immutable after the VM has started running. To maintain the
existing expectations, instead of returning an error, KVM
returns a success for these two cases.

Finally, ignore writes to read-only bits that are cleared on
vCPU reset, and RES{0,1} bits (including writable bits that
KVM doesn't support yet), as those bits shouldn't be modified
(at least with the current KVM).

Co-developed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Link: https://lore.kernel.org/r/20231020214053.2144305-8-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 22:59:30 +00:00
Raghavendra Rao Ananta
a45f41d754 KVM: arm64: Add {get,set}_user for PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR}
For unimplemented counters, the bits in PM{C,I}NTEN{SET,CLR} and
PMOVS{SET,CLR} registers are expected to RAZ. To honor this,
explicitly implement the {get,set}_user functions for these
registers to mask out unimplemented counters for userspace reads
and writes.

Co-developed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Link: https://lore.kernel.org/r/20231020214053.2144305-6-rananta@google.com
[Oliver: drop unnecessary locking]
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 22:59:30 +00:00
Raghavendra Rao Ananta
4d20debf9c KVM: arm64: PMU: Set PMCR_EL0.N for vCPU based on the associated PMU
The number of PMU event counters is indicated in PMCR_EL0.N.
For a vCPU with PMUv3 configured, the value is set to the same
value as the current PE on every vCPU reset.  Unless the vCPU is
pinned to PEs that has the PMU associated to the guest from the
initial vCPU reset, the value might be different from the PMU's
PMCR_EL0.N on heterogeneous PMU systems.

Fix this by setting the vCPU's PMCR_EL0.N to the PMU's PMCR_EL0.N
value. Track the PMCR_EL0.N per guest, as only one PMU can be set
for the guest (PMCR_EL0.N must be the same for all vCPUs of the
guest), and it is convenient for updating the value.

To achieve this, the patch introduces a helper,
kvm_arm_pmu_get_max_counters(), that reads the maximum number of
counters from the arm_pmu associated to the VM. Make the function
global as upcoming patches will be interested to know the value
while setting the PMCR.N of the guest from userspace.

KVM does not yet support userspace modifying PMCR_EL0.N.
The following patch will add support for that.

Reviewed-by: Sebastian Ott <sebott@redhat.com>
Co-developed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Link: https://lore.kernel.org/r/20231020214053.2144305-5-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 22:59:30 +00:00
Reiji Watanabe
57fc267f1b KVM: arm64: PMU: Add a helper to read a vCPU's PMCR_EL0
Add a helper to read a vCPU's PMCR_EL0, and use it whenever KVM
reads a vCPU's PMCR_EL0.

Currently, the PMCR_EL0 value is tracked per vCPU. The following
patches will make (only) PMCR_EL0.N track per guest. Having the
new helper will be useful to combine the PMCR_EL0.N field
(tracked per guest) and the other fields (tracked per vCPU)
to provide the value of PMCR_EL0.

No functional change intended.

Reviewed-by: Sebastian Ott <sebott@redhat.com>
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231020214053.2144305-4-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 22:59:29 +00:00
Oliver Upton
bc512d6a9b KVM: arm64: Make PMEVTYPER<n>_EL0.NSH RES0 if EL2 isn't advertised
The NSH bit, which filters event counting at EL2, is required by the
architecture if an implementation has EL2. Even though KVM doesn't
support nested virt yet, it makes no effort to hide the existence of EL2
from the ID registers. Userspace can, however, change the value of PFR0
to hide EL2. Align KVM's sysreg emulation with the architecture and make
NSH RES0 if EL2 isn't advertised. Keep in mind the bit is ignored when
constructing the backing perf event.

While at it, build the event type mask using explicit field definitions
instead of relying on ARMV8_PMU_EVTYPE_MASK. KVM probably should've been
doing this in the first place, as it avoids changes to the
aforementioned mask affecting sysreg emulation.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20231019185618.3442949-2-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 19:26:14 +00:00
Mark Rutland
d8569fba13 arm64: kvm: Use cpus_have_final_cap() explicitly
Much of the arm64 KVM code uses cpus_have_const_cap() to check for
cpucaps, but this is unnecessary and it would be preferable to use
cpus_have_final_cap().

For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.

Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on when their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.

KVM is initialized after cpucaps have been finalized and alternatives
have been patched. Since commit:

  d86de40dec ("arm64: cpufeature: upgrade hyp caps to final")

... use of cpus_have_const_cap() in hyp code is automatically converted
to use cpus_have_final_cap():

| static __always_inline bool cpus_have_const_cap(int num)
| {
| 	if (is_hyp_code())
| 		return cpus_have_final_cap(num);
| 	else if (system_capabilities_finalized())
| 		return __cpus_have_const_cap(num);
| 	else
| 		return cpus_have_cap(num);
| }

Thus, converting hyp code to use cpus_have_final_cap() directly will not
result in any functional change.

Non-hyp KVM code is also not executed until cpucaps have been finalized,
and it would be preferable to extent the same treatment to this code and
use cpus_have_final_cap() directly.

This patch converts instances of cpus_have_const_cap() in KVM-only code
over to cpus_have_final_cap(). As all of this code runs after cpucaps
have been finalized, there should be no functional change as a result of
this patch, but the redundant instructions generated by
cpus_have_const_cap() will be removed from the non-hyp KVM code.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-10-16 12:57:56 +01:00
Joey Gouly
839d90357b KVM: arm64: POR{E0}_EL1 do not need trap handlers
These will not be trapped by KVM, so don't need a handler.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231012123459.2820835-3-joey.gouly@arm.com
2023-10-12 16:41:02 +01:00
Kristina Martsenko
e0bb80c62c KVM: arm64: Expose MOPS instructions to guests
Expose the Armv8.8 FEAT_MOPS feature to guests in the ID register and
allow the MOPS instructions to be run in a guest. Only expose MOPS if
the whole system supports it.

Note, it is expected that guests do not use these instructions on MMIO,
similarly to other instructions where ESR_EL2.ISV==0 such as LDP/STP.

Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230922112508.1774352-3-kristina.martsenko@arm.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-09 19:54:25 +00:00
Oliver Upton
f89fbb350d KVM: arm64: Allow userspace to change ID_AA64ZFR0_EL1
All known fields in ID_AA64ZFR0_EL1 describe the unprivileged
instructions supported by the PE's SVE implementation. Allow userspace
to pick and choose the advertised feature set, though nothing stops the
guest from using undisclosed instructions.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231003230408.3405722-10-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-04 17:11:50 +00:00
Jing Zhang
8cfd5be88e KVM: arm64: Allow userspace to change ID_AA64PFR0_EL1
Allow userspace to change the guest-visible value of the register with
some severe limitation:

 - No changes to features not virtualized by KVM (AMU, MPAM, RAS)

 - Short of full GICv2 emulation in kernel, hiding GICv3 from the guest
   makes absolutely no sense.

 - FP is effectively assumed for KVM VMs.

Signed-off-by: Jing Zhang <jingzhangos@google.com>
[oliver: restrict features that are illogical to change]
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231003230408.3405722-9-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-04 17:11:50 +00:00
Jing Zhang
d5a32b60dc KVM: arm64: Allow userspace to change ID_AA64MMFR{0-2}_EL1
Allow userspace to modify the guest-visible values of these ID
registers. Prevent changes to any of the virtualization features until
KVM picks up support for nested and we have a handle on managing NV
features.

Signed-off-by: Jing Zhang <jingzhangos@google.com>
[oliver: prevent changes to EL2 features for now]
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231003230408.3405722-8-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-04 17:11:50 +00:00
Oliver Upton
56d77aa8bd KVM: arm64: Allow userspace to change ID_AA64ISAR{0-2}_EL1
Almost all of the features described by the ISA registers have no KVM
involvement. Allow userspace to change the value of these registers with
a couple exceptions:

 - MOPS is not writable as KVM does not currently virtualize FEAT_MOPS.

 - The PAuth fields are not writable as KVM requires both address and
   generic authentication be enabled.

Co-developed-by: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231003230408.3405722-7-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-04 17:11:50 +00:00
Oliver Upton
9f9917bc71 KVM: arm64: Bump up the default KVM sanitised debug version to v8p8
Since ID_AA64DFR0_EL1 and ID_DFR0_EL1 are now writable from userspace,
it is safe to bump up the default KVM sanitised debug version to v8p8.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231003230408.3405722-6-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-04 17:11:45 +00:00
Oliver Upton
a9bc4a1c1e KVM: arm64: Reject attempts to set invalid debug arch version
The debug architecture is mandatory in ARMv8, so KVM should not allow
userspace to configure a vCPU with less than that. Of course, this isn't
handled elegantly by the generic ID register plumbing, as the respective
ID register fields have a nonzero starting value.

Add an explicit check for debug versions less than v8 of the
architecture.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231003230408.3405722-5-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-04 17:11:39 +00:00
Oliver Upton
5a23e5c7cb KVM: arm64: Advertise selected DebugVer in DBGDIDR.Version
Much like we do for other fields, extract the Debug architecture version
from the ID register to populate the corresponding field in DBGDIDR.
Rewrite the existing sysreg field extractors to use SYS_FIELD_GET() for
consistency.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-04 17:10:15 +00:00
Jing Zhang
8b6958d6ac KVM: arm64: Use guest ID register values for the sake of emulation
Since KVM now supports per-VM ID registers, use per-VM ID register
values for the sake of emulation for DBGDIDR and LORegion.

Signed-off-by: Jing Zhang <jingzhangos@google.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231003230408.3405722-4-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-04 17:10:15 +00:00
Jing Zhang
3f9cd0ca84 KVM: arm64: Allow userspace to get the writable masks for feature ID registers
While the Feature ID range is well defined and pretty large, it isn't
inconceivable that the architecture will eventually grow some other
ranges that will need to similarly be described to userspace.

Add a VM ioctl to allow userspace to get writable masks for feature ID
registers in below system register space:
op0 = 3, op1 = {0, 1, 3}, CRn = 0, CRm = {0 - 7}, op2 = {0 - 7}
This is used to support mix-and-match userspace and kernels for writable
ID registers, where userspace may want to know upfront whether it can
actually tweak the contents of an idreg or not.

Add a new capability (KVM_CAP_ARM_SUPPORTED_FEATURE_ID_RANGES) that
returns a bitmap of the valid ranges, which can subsequently be
retrieved, one at a time by setting the index of the set bit as the
range identifier.

Suggested-by: Marc Zyngier <maz@kernel.org>
Suggested-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231003230408.3405722-2-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-04 17:09:39 +00:00
Oliver Upton
7b424ffcd4 KVM: arm64: Don't use kerneldoc comment for arm64_check_features()
A double-asterisk opening mark to the comment (i.e. '/**') indicates a
comment block is in the kerneldoc format. There's automation in place to
validate that kerneldoc blocks actually adhere to the formatting rules.

The function comment for arm64_check_features() isn't kerneldoc; use a
'regular' comment to silence automation warnings.

Link: https://lore.kernel.org/all/202309112251.e25LqfcK-lkp@intel.com/
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20230913165645.2319017-1-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-09-18 17:08:21 +00:00
Marc Zyngier
03fb54d0aa KVM: arm64: nv: Add support for HCRX_EL2
HCRX_EL2 has an interesting effect on HFGITR_EL2, as it conditions
the traps of TLBI*nXS.

Expand the FGT support to add a new Fine Grained Filter that will
get checked when the instruction gets trapped, allowing the shadow
register to override the trap as needed.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-29-maz@kernel.org
2023-08-17 10:00:28 +01:00
Marc Zyngier
e58ec47bf6 KVM: arm64: nv: Add trap forwarding infrastructure
A significant part of what a NV hypervisor needs to do is to decide
whether a trap from a L2+ guest has to be forwarded to a L1 guest
or handled locally. This is done by checking for the trap bits that
the guest hypervisor has set and acting accordingly, as described by
the architecture.

A previous approach was to sprinkle a bunch of checks in all the
system register accessors, but this is pretty error prone and doesn't
help getting an overview of what is happening.

Instead, implement a set of global tables that describe a trap bit,
combinations of trap bits, behaviours on trap, and what bits must
be evaluated on a system register trap.

Although this is painful to describe, this allows to specify each
and every control bit in a static manner. To make it efficient,
the table is inserted in an xarray that is global to the system,
and checked each time we trap a system register while running
a L2 guest.

Add the basic infrastructure for now, while additional patches will
implement configuration registers.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-15-maz@kernel.org
2023-08-17 10:00:27 +01:00
Marc Zyngier
50d2fe4648 KVM: arm64: nv: Add FGT registers
Add the 5 registers covering FEAT_FGT. The AMU-related registers
are currently left out as we don't have a plan for them. Yet.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230815183903.2735724-13-maz@kernel.org
2023-08-17 10:00:27 +01:00
Marc Zyngier
484f86824a KVM: arm64: Correctly handle ACCDATA_EL1 traps
As we blindly reset some HFGxTR_EL2 bits to 0, we also randomly trap
unsuspecting sysregs that have their trap bits with a negative
polarity.

ACCDATA_EL1 is one such register that can be accessed by the guest,
causing a splat on the host as we don't have a proper handler for
it.

Adding such handler addresses the issue, though there are a number
of other registers missing as the current architecture documentation
doesn't describe them yet.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230815183903.2735724-11-maz@kernel.org
2023-08-17 10:00:27 +01:00
Xiang Chen
9d2a55b403 KVM: arm64: Fix the name of sys_reg_desc related to PMU
For those PMU system registers defined in sys_reg_descs[], use macro
PMU_SYS_REG() / PMU_PMEVCNTR_EL0 / PMU_PMEVTYPER_EL0 to define them, and
later two macros call macro PMU_SYS_REG() actually.
Currently the input parameter of PMU_SYS_REG() is another macro which is
calculation formula of the value of system registers, so for example, if
we want to "SYS_PMINTENSET_EL1" as the name of sys register, actually
the name we get is as following:
(((3) << 19) | ((0) << 16) | ((9) << 12) | ((14) << 8) | ((1) << 5))
The name of system register is used in some tracepoints such as
trace_kvm_sys_access(), if not set correctly, we need to analyze the
inaccurate name to get the exact name (which also is inconsistent with
other system registers), and also the inaccurate name occupies more space.

To fix the issue, use the name as a input parameter of PMU_SYS_REG like
MTE_REG or EL2_REG.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/1689305920-170523-1-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-07-14 23:34:05 +00:00
Oliver Upton
6d4f9236cd KVM: arm64: Correctly handle RES0 bits PMEVTYPER<n>_EL0.evtCount
The PMU event ID varies from 10 to 16 bits, depending on the PMU
version. If the PMU only supports 10 bits of event ID, bits [15:10] of
the evtCount field behave as RES0.

While the actual PMU emulation code gets this right (i.e. RES0 bits are
masked out when programming the perf event), the sysreg emulation writes
an unmasked value to the in-memory cpu context. The net effect is that
guest reads and writes of PMEVTYPER<n>_EL0 will see non-RES0 behavior in
the reserved bits of the field.

As it so happens, kvm_pmu_set_counter_event_type() already writes a
masked value to the in-memory context that gets overwritten by
access_pmu_evtyper(). Fix the issue by removing the unnecessary (and
incorrect) register write in access_pmu_evtyper().

Reviewed-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Link: https://lore.kernel.org/r/20230713221649.3889210-1-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-07-14 23:28:58 +00:00
Linus Torvalds
e8069f5a8e ARM64:
* Eager page splitting optimization for dirty logging, optionally
   allowing for a VM to avoid the cost of hugepage splitting in the stage-2
   fault path.
 
 * Arm FF-A proxy for pKVM, allowing a pKVM host to safely interact with
   services that live in the Secure world. pKVM intervenes on FF-A calls
   to guarantee the host doesn't misuse memory donated to the hyp or a
   pKVM guest.
 
 * Support for running the split hypervisor with VHE enabled, known as
   'hVHE' mode. This is extremely useful for testing the split
   hypervisor on VHE-only systems, and paves the way for new use cases
   that depend on having two TTBRs available at EL2.
 
 * Generalized framework for configurable ID registers from userspace.
   KVM/arm64 currently prevents arbitrary CPU feature set configuration
   from userspace, but the intent is to relax this limitation and allow
   userspace to select a feature set consistent with the CPU.
 
 * Enable the use of Branch Target Identification (FEAT_BTI) in the
   hypervisor.
 
 * Use a separate set of pointer authentication keys for the hypervisor
   when running in protected mode, as the host is untrusted at runtime.
 
 * Ensure timer IRQs are consistently released in the init failure
   paths.
 
 * Avoid trapping CTR_EL0 on systems with Enhanced Virtualization Traps
   (FEAT_EVT), as it is a register commonly read from userspace.
 
 * Erratum workaround for the upcoming AmpereOne part, which has broken
   hardware A/D state management.
 
 RISC-V:
 
 * Redirect AMO load/store misaligned traps to KVM guest
 
 * Trap-n-emulate AIA in-kernel irqchip for KVM guest
 
 * Svnapot support for KVM Guest
 
 s390:
 
 * New uvdevice secret API
 
 * CMM selftest and fixes
 
 * fix racy access to target CPU for diag 9c
 
 x86:
 
 * Fix missing/incorrect #GP checks on ENCLS
 
 * Use standard mmu_notifier hooks for handling APIC access page
 
 * Drop now unnecessary TR/TSS load after VM-Exit on AMD
 
 * Print more descriptive information about the status of SEV and SEV-ES during
   module load
 
 * Add a test for splitting and reconstituting hugepages during and after
   dirty logging
 
 * Add support for CPU pinning in demand paging test
 
 * Add support for AMD PerfMonV2, with a variety of cleanups and minor fixes
   included along the way
 
 * Add a "nx_huge_pages=never" option to effectively avoid creating NX hugepage
   recovery threads (because nx_huge_pages=off can be toggled at runtime)
 
 * Move handling of PAT out of MTRR code and dedup SVM+VMX code
 
 * Fix output of PIC poll command emulation when there's an interrupt
 
 * Add a maintainer's handbook to document KVM x86 processes, preferred coding
   style, testing expectations, etc.
 
 * Misc cleanups, fixes and comments
 
 Generic:
 
 * Miscellaneous bugfixes and cleanups
 
 Selftests:
 
 * Generate dependency files so that partial rebuilds work as expected
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmSgHrIUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroORcAf+KkBlXwQMf+Q0Hy6Mfe0OtkKmh0Ae
 6HJ6dsuMfOHhWv5kgukh+qvuGUGzHq+gpVKmZg2yP3h3cLHOLUAYMCDm+rjXyjsk
 F4DbnJLfxq43Pe9PHRKFxxSecRcRYCNox0GD5UYL4PLKcH0FyfQrV+HVBK+GI8L3
 FDzUcyJkR12Lcj1qf++7fsbzfOshL0AJPmidQCoc6wkLJpUEr/nYUqlI1Kx3YNuQ
 LKmxFHS4l4/O/px3GKNDrLWDbrVlwciGIa3GZLS52PZdW3mAqT+cqcPcYK6SW71P
 m1vE80VbNELX5q3YSRoOXtedoZ3Pk97LEmz/xQAsJ/jri0Z5Syk0Ok0m/Q==
 =AMXp
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM64:

   - Eager page splitting optimization for dirty logging, optionally
     allowing for a VM to avoid the cost of hugepage splitting in the
     stage-2 fault path.

   - Arm FF-A proxy for pKVM, allowing a pKVM host to safely interact
     with services that live in the Secure world. pKVM intervenes on
     FF-A calls to guarantee the host doesn't misuse memory donated to
     the hyp or a pKVM guest.

   - Support for running the split hypervisor with VHE enabled, known as
     'hVHE' mode. This is extremely useful for testing the split
     hypervisor on VHE-only systems, and paves the way for new use cases
     that depend on having two TTBRs available at EL2.

   - Generalized framework for configurable ID registers from userspace.
     KVM/arm64 currently prevents arbitrary CPU feature set
     configuration from userspace, but the intent is to relax this
     limitation and allow userspace to select a feature set consistent
     with the CPU.

   - Enable the use of Branch Target Identification (FEAT_BTI) in the
     hypervisor.

   - Use a separate set of pointer authentication keys for the
     hypervisor when running in protected mode, as the host is untrusted
     at runtime.

   - Ensure timer IRQs are consistently released in the init failure
     paths.

   - Avoid trapping CTR_EL0 on systems with Enhanced Virtualization
     Traps (FEAT_EVT), as it is a register commonly read from userspace.

   - Erratum workaround for the upcoming AmpereOne part, which has
     broken hardware A/D state management.

  RISC-V:

   - Redirect AMO load/store misaligned traps to KVM guest

   - Trap-n-emulate AIA in-kernel irqchip for KVM guest

   - Svnapot support for KVM Guest

  s390:

   - New uvdevice secret API

   - CMM selftest and fixes

   - fix racy access to target CPU for diag 9c

  x86:

   - Fix missing/incorrect #GP checks on ENCLS

   - Use standard mmu_notifier hooks for handling APIC access page

   - Drop now unnecessary TR/TSS load after VM-Exit on AMD

   - Print more descriptive information about the status of SEV and
     SEV-ES during module load

   - Add a test for splitting and reconstituting hugepages during and
     after dirty logging

   - Add support for CPU pinning in demand paging test

   - Add support for AMD PerfMonV2, with a variety of cleanups and minor
     fixes included along the way

   - Add a "nx_huge_pages=never" option to effectively avoid creating NX
     hugepage recovery threads (because nx_huge_pages=off can be toggled
     at runtime)

   - Move handling of PAT out of MTRR code and dedup SVM+VMX code

   - Fix output of PIC poll command emulation when there's an interrupt

   - Add a maintainer's handbook to document KVM x86 processes,
     preferred coding style, testing expectations, etc.

   - Misc cleanups, fixes and comments

  Generic:

   - Miscellaneous bugfixes and cleanups

  Selftests:

   - Generate dependency files so that partial rebuilds work as
     expected"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (153 commits)
  Documentation/process: Add a maintainer handbook for KVM x86
  Documentation/process: Add a label for the tip tree handbook's coding style
  KVM: arm64: Fix misuse of KVM_ARM_VCPU_POWER_OFF bit index
  RISC-V: KVM: Remove unneeded semicolon
  RISC-V: KVM: Allow Svnapot extension for Guest/VM
  riscv: kvm: define vcpu_sbi_ext_pmu in header
  RISC-V: KVM: Expose IMSIC registers as attributes of AIA irqchip
  RISC-V: KVM: Add in-kernel virtualization of AIA IMSIC
  RISC-V: KVM: Expose APLIC registers as attributes of AIA irqchip
  RISC-V: KVM: Add in-kernel emulation of AIA APLIC
  RISC-V: KVM: Implement device interface for AIA irqchip
  RISC-V: KVM: Skeletal in-kernel AIA irqchip support
  RISC-V: KVM: Set kvm_riscv_aia_nr_hgei to zero
  RISC-V: KVM: Add APLIC related defines
  RISC-V: KVM: Add IMSIC related defines
  RISC-V: KVM: Implement guest external interrupt line management
  KVM: x86: Remove PRIx* definitions as they are solely for user space
  s390/uv: Update query for secret-UVCs
  s390/uv: replace scnprintf with sysfs_emit
  s390/uvdevice: Add 'Lock Secret Store' UVC
  ...
2023-07-03 15:32:22 -07:00
Paolo Bonzini
cc744042d9 KVM/arm64 updates for 6.5
- Eager page splitting optimization for dirty logging, optionally
    allowing for a VM to avoid the cost of block splitting in the stage-2
    fault path.
 
  - Arm FF-A proxy for pKVM, allowing a pKVM host to safely interact with
    services that live in the Secure world. pKVM intervenes on FF-A calls
    to guarantee the host doesn't misuse memory donated to the hyp or a
    pKVM guest.
 
  - Support for running the split hypervisor with VHE enabled, known as
    'hVHE' mode. This is extremely useful for testing the split
    hypervisor on VHE-only systems, and paves the way for new use cases
    that depend on having two TTBRs available at EL2.
 
  - Generalized framework for configurable ID registers from userspace.
    KVM/arm64 currently prevents arbitrary CPU feature set configuration
    from userspace, but the intent is to relax this limitation and allow
    userspace to select a feature set consistent with the CPU.
 
  - Enable the use of Branch Target Identification (FEAT_BTI) in the
    hypervisor.
 
  - Use a separate set of pointer authentication keys for the hypervisor
    when running in protected mode, as the host is untrusted at runtime.
 
  - Ensure timer IRQs are consistently released in the init failure
    paths.
 
  - Avoid trapping CTR_EL0 on systems with Enhanced Virtualization Traps
    (FEAT_EVT), as it is a register commonly read from userspace.
 
  - Erratum workaround for the upcoming AmpereOne part, which has broken
    hardware A/D state management.
 
 As a consequence of the hVHE series reworking the arm64 software
 features framework, the for-next/module-alloc branch from the arm64 tree
 comes along for the ride.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSNXHjWXuzMZutrKNKivnWIJHzdFgUCZJWrhwAKCRCivnWIJHzd
 Fs07AP9xliv5yIoQtRgPZXc0QDPyUm7zg8f5KDgqCVJtyHXcvAEAmmerBr39nbPc
 XoMXALKx6NGt836P0C+bhvRcHdFPGwE=
 =c/Xh
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 updates for 6.5

 - Eager page splitting optimization for dirty logging, optionally
   allowing for a VM to avoid the cost of block splitting in the stage-2
   fault path.

 - Arm FF-A proxy for pKVM, allowing a pKVM host to safely interact with
   services that live in the Secure world. pKVM intervenes on FF-A calls
   to guarantee the host doesn't misuse memory donated to the hyp or a
   pKVM guest.

 - Support for running the split hypervisor with VHE enabled, known as
   'hVHE' mode. This is extremely useful for testing the split
   hypervisor on VHE-only systems, and paves the way for new use cases
   that depend on having two TTBRs available at EL2.

 - Generalized framework for configurable ID registers from userspace.
   KVM/arm64 currently prevents arbitrary CPU feature set configuration
   from userspace, but the intent is to relax this limitation and allow
   userspace to select a feature set consistent with the CPU.

 - Enable the use of Branch Target Identification (FEAT_BTI) in the
   hypervisor.

 - Use a separate set of pointer authentication keys for the hypervisor
   when running in protected mode, as the host is untrusted at runtime.

 - Ensure timer IRQs are consistently released in the init failure
   paths.

 - Avoid trapping CTR_EL0 on systems with Enhanced Virtualization Traps
   (FEAT_EVT), as it is a register commonly read from userspace.

 - Erratum workaround for the upcoming AmpereOne part, which has broken
   hardware A/D state management.

As a consequence of the hVHE series reworking the arm64 software
features framework, the for-next/module-alloc branch from the arm64 tree
comes along for the ride.
2023-07-01 07:04:29 -04:00
Linus Torvalds
2605e80d34 arm64 updates for 6.5:
- Support for the Armv8.9 Permission Indirection Extensions. While this
   feature doesn't add new functionality, it enables future support for
   Guarded Control Stacks (GCS) and Permission Overlays.
 
 - User-space support for the Armv8.8 memcpy/memset instructions.
 
 - arm64 perf: support the HiSilicon SoC uncore PMU, Arm CMN sysfs
   identifier, support for the NXP i.MX9 SoC DDRC PMU, fixes and
   cleanups.
 
 - Removal of superfluous ISBs on context switch (following retrospective
   architecture tightening).
 
 - Decode the ISS2 register during faults for additional information to
   help with debugging.
 
 - KPTI clean-up/simplification of the trampoline exit code.
 
 - Addressing several -Wmissing-prototype warnings.
 
 - Kselftest improvements for signal handling and ptrace.
 
 - Fix TPIDR2_EL0 restoring on sigreturn
 
 - Clean-up, robustness improvements of the module allocation code.
 
 - More sysreg conversions to the automatic register/bitfields
   generation.
 
 - CPU capabilities handling cleanup.
 
 - Arm documentation updates: ACPI, ptdump.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmSZyXwACgkQa9axLQDI
 XvEM3BAAkMzHGTDhNVNGLSO07PVmdzTiuoNFlfX7bktdIb+El76VhGXhHeEywTje
 wAq9JIYBf/Src2HbgZLwuly8Fn2vCrhyp++bRJW82o9SiBnx91+0mH7zLf+XHiQ4
 FHKZxvaE6PaDc9o8WXr+IeucPRb5W2HgH37mktxh7ShMLsxorwS94V1oL29A2mV9
 t4XQY7/tdmrDKMKMuQnIr1DurNXBhJ1OKvDnSN/Zzm96JOU/QQ32N2wEE7Y0aHOh
 bBzClksx2mguQqV515mySGFe5yy9NqaAfx2hTAciq+1rwbiCSjqQQmEswoUH8WLX
 JNLylxADWT2qXThFe8W6uyFzEshSAoI1yKxlCGuOsQpu4sFJtR8oh8dDj5669g4Y
 j0jR87r9rWm0iyYI5I+XDMxFVyuh2eFInvjtynRbj+mtS3f/SkO8fXG6Uya+I76C
 UGLlBUKnLr/zHuIGN0LE/V4dYTqsi9EtHoc2Am2xCZsS9jqkxKJG8C93Zsm4GlJC
 OcUtBSjW0rYJq+tLk0yhR6hbh59QbiRh05KnZsPpOKi8purlKSL9ZNPRi7TndLdm
 HjHUY+vQwNIpPIb6pyK4aYZuTdGEQIsQykQ8CULiIGlHi7kc4g9029ouLc5bBAeU
 mU8D62I2ztzPoYljYWNtO7K6g/Dq8c4lpsaMAJ+1Wp2iq2xBJjo=
 =rNBK
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
 "Notable features are user-space support for the memcpy/memset
  instructions and the permission indirection extension.

   - Support for the Armv8.9 Permission Indirection Extensions. While
     this feature doesn't add new functionality, it enables future
     support for Guarded Control Stacks (GCS) and Permission Overlays

   - User-space support for the Armv8.8 memcpy/memset instructions

   - arm64 perf: support the HiSilicon SoC uncore PMU, Arm CMN sysfs
     identifier, support for the NXP i.MX9 SoC DDRC PMU, fixes and
     cleanups

   - Removal of superfluous ISBs on context switch (following
     retrospective architecture tightening)

   - Decode the ISS2 register during faults for additional information
     to help with debugging

   - KPTI clean-up/simplification of the trampoline exit code

   - Addressing several -Wmissing-prototype warnings

   - Kselftest improvements for signal handling and ptrace

   - Fix TPIDR2_EL0 restoring on sigreturn

   - Clean-up, robustness improvements of the module allocation code

   - More sysreg conversions to the automatic register/bitfields
     generation

   - CPU capabilities handling cleanup

   - Arm documentation updates: ACPI, ptdump"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (124 commits)
  kselftest/arm64: Add a test case for TPIDR2 restore
  arm64/signal: Restore TPIDR2 register rather than memory state
  arm64: alternatives: make clean_dcache_range_nopatch() noinstr-safe
  Documentation/arm64: Add ptdump documentation
  arm64: hibernate: remove WARN_ON in save_processor_state
  kselftest/arm64: Log signal code and address for unexpected signals
  docs: perf: Fix warning from 'make htmldocs' in hisi-pmu.rst
  arm64/fpsimd: Exit streaming mode when flushing tasks
  docs: perf: Add new description for HiSilicon UC PMU
  drivers/perf: hisi: Add support for HiSilicon UC PMU driver
  drivers/perf: hisi: Add support for HiSilicon H60PA and PAv3 PMU driver
  perf: arm_cspmu: Add missing MODULE_DEVICE_TABLE
  perf/arm-cmn: Add sysfs identifier
  perf/arm-cmn: Revamp model detection
  perf/arm_dmc620: Add cpumask
  arm64: mm: fix VA-range sanity check
  arm64/mm: remove now-superfluous ISBs from TTBR writes
  Documentation/arm64: Update ACPI tables from BBR
  Documentation/arm64: Update references in arm-acpi
  Documentation/arm64: Update ARM and arch reference
  ...
2023-06-26 17:11:53 -07:00
Catalin Marinas
abc17128c8 Merge branch 'for-next/feat_s1pie' into for-next/core
* for-next/feat_s1pie:
  : Support for the Armv8.9 Permission Indirection Extensions (stage 1 only)
  KVM: selftests: get-reg-list: add Permission Indirection registers
  KVM: selftests: get-reg-list: support ID register features
  arm64: Document boot requirements for PIE
  arm64: transfer permission indirection settings to EL2
  arm64: enable Permission Indirection Extension (PIE)
  arm64: add encodings of PIRx_ELx registers
  arm64: disable EL2 traps for PIE
  arm64: reorganise PAGE_/PROT_ macros
  arm64: add PTE_WRITE to PROT_SECT_NORMAL
  arm64: add PTE_UXN/PTE_WRITE to SWAPPER_*_FLAGS
  KVM: arm64: expose ID_AA64MMFR3_EL1 to guests
  KVM: arm64: Save/restore PIE registers
  KVM: arm64: Save/restore TCR2_EL1
  arm64: cpufeature: add Permission Indirection Extension cpucap
  arm64: cpufeature: add TCR2 cpucap
  arm64: cpufeature: add system register ID_AA64MMFR3
  arm64/sysreg: add PIR*_ELx registers
  arm64/sysreg: update HCRX_EL2 register
  arm64/sysreg: add system registers TCR2_ELx
  arm64/sysreg: Add ID register ID_AA64MMFR3
2023-06-23 18:34:16 +01:00
Catalin Marinas
f42039d10b Merge branches 'for-next/kpti', 'for-next/missing-proto-warn', 'for-next/iss2-decode', 'for-next/kselftest', 'for-next/misc', 'for-next/feat_mops', 'for-next/module-alloc', 'for-next/sysreg', 'for-next/cpucap', 'for-next/acpi', 'for-next/kdump', 'for-next/acpi-doc', 'for-next/doc' and 'for-next/tpidr2-fix', remote-tracking branch 'arm64/for-next/perf' into for-next/core
* arm64/for-next/perf:
  docs: perf: Fix warning from 'make htmldocs' in hisi-pmu.rst
  docs: perf: Add new description for HiSilicon UC PMU
  drivers/perf: hisi: Add support for HiSilicon UC PMU driver
  drivers/perf: hisi: Add support for HiSilicon H60PA and PAv3 PMU driver
  perf: arm_cspmu: Add missing MODULE_DEVICE_TABLE
  perf/arm-cmn: Add sysfs identifier
  perf/arm-cmn: Revamp model detection
  perf/arm_dmc620: Add cpumask
  dt-bindings: perf: fsl-imx-ddr: Add i.MX93 compatible
  drivers/perf: imx_ddr: Add support for NXP i.MX9 SoC DDRC PMU driver
  perf/arm_cspmu: Decouple APMT dependency
  perf/arm_cspmu: Clean up ACPI dependency
  ACPI/APMT: Don't register invalid resource
  perf/arm_cspmu: Fix event attribute type
  perf: arm_cspmu: Set irq affinitiy only if overflow interrupt is used
  drivers/perf: hisi: Don't migrate perf to the CPU going to teardown
  drivers/perf: apple_m1: Force 63bit counters for M2 CPUs
  perf/arm-cmn: Fix DTC reset
  perf: qcom_l2_pmu: Make l2_cache_pmu_probe_cluster() more robust
  perf/arm-cci: Slightly optimize cci_pmu_sync_counters()

* for-next/kpti:
  : Simplify KPTI trampoline exit code
  arm64: entry: Simplify tramp_alias macro and tramp_exit routine
  arm64: entry: Preserve/restore X29 even for compat tasks

* for-next/missing-proto-warn:
  : Address -Wmissing-prototype warnings
  arm64: add alt_cb_patch_nops prototype
  arm64: move early_brk64 prototype to header
  arm64: signal: include asm/exception.h
  arm64: kaslr: add kaslr_early_init() declaration
  arm64: flush: include linux/libnvdimm.h
  arm64: module-plts: inline linux/moduleloader.h
  arm64: hide unused is_valid_bugaddr()
  arm64: efi: add efi_handle_corrupted_x18 prototype
  arm64: cpuidle: fix #ifdef for acpi functions
  arm64: kvm: add prototypes for functions called in asm
  arm64: spectre: provide prototypes for internal functions
  arm64: move cpu_suspend_set_dbg_restorer() prototype to header
  arm64: avoid prototype warnings for syscalls
  arm64: add scs_patch_vmlinux prototype
  arm64: xor-neon: mark xor_arm64_neon_*() static

* for-next/iss2-decode:
  : Add decode of ISS2 to data abort reports
  arm64/esr: Add decode of ISS2 to data abort reporting
  arm64/esr: Use GENMASK() for the ISS mask

* for-next/kselftest:
  : Various arm64 kselftest improvements
  kselftest/arm64: Log signal code and address for unexpected signals
  kselftest/arm64: Add a smoke test for ptracing hardware break/watch points

* for-next/misc:
  : Miscellaneous patches
  arm64: alternatives: make clean_dcache_range_nopatch() noinstr-safe
  arm64: hibernate: remove WARN_ON in save_processor_state
  arm64/fpsimd: Exit streaming mode when flushing tasks
  arm64: mm: fix VA-range sanity check
  arm64/mm: remove now-superfluous ISBs from TTBR writes
  arm64: consolidate rox page protection logic
  arm64: set __exception_irq_entry with __irq_entry as a default
  arm64: syscall: unmask DAIF for tracing status
  arm64: lockdep: enable checks for held locks when returning to userspace
  arm64/cpucaps: increase string width to properly format cpucaps.h
  arm64/cpufeature: Use helper for ECV CNTPOFF cpufeature

* for-next/feat_mops:
  : Support for ARMv8.8 memcpy instructions in userspace
  kselftest/arm64: add MOPS to hwcap test
  arm64: mops: allow disabling MOPS from the kernel command line
  arm64: mops: detect and enable FEAT_MOPS
  arm64: mops: handle single stepping after MOPS exception
  arm64: mops: handle MOPS exceptions
  KVM: arm64: hide MOPS from guests
  arm64: mops: don't disable host MOPS instructions from EL2
  arm64: mops: document boot requirements for MOPS
  KVM: arm64: switch HCRX_EL2 between host and guest
  arm64: cpufeature: detect FEAT_HCX
  KVM: arm64: initialize HCRX_EL2

* for-next/module-alloc:
  : Make the arm64 module allocation code more robust (clean-up, VA range expansion)
  arm64: module: rework module VA range selection
  arm64: module: mandate MODULE_PLTS
  arm64: module: move module randomization to module.c
  arm64: kaslr: split kaslr/module initialization
  arm64: kasan: remove !KASAN_VMALLOC remnants
  arm64: module: remove old !KASAN_VMALLOC logic

* for-next/sysreg: (21 commits)
  : More sysreg conversions to automatic generation
  arm64/sysreg: Convert TRBIDR_EL1 register to automatic generation
  arm64/sysreg: Convert TRBTRG_EL1 register to automatic generation
  arm64/sysreg: Convert TRBMAR_EL1 register to automatic generation
  arm64/sysreg: Convert TRBSR_EL1 register to automatic generation
  arm64/sysreg: Convert TRBBASER_EL1 register to automatic generation
  arm64/sysreg: Convert TRBPTR_EL1 register to automatic generation
  arm64/sysreg: Convert TRBLIMITR_EL1 register to automatic generation
  arm64/sysreg: Rename TRBIDR_EL1 fields per auto-gen tools format
  arm64/sysreg: Rename TRBTRG_EL1 fields per auto-gen tools format
  arm64/sysreg: Rename TRBMAR_EL1 fields per auto-gen tools format
  arm64/sysreg: Rename TRBSR_EL1 fields per auto-gen tools format
  arm64/sysreg: Rename TRBBASER_EL1 fields per auto-gen tools format
  arm64/sysreg: Rename TRBPTR_EL1 fields per auto-gen tools format
  arm64/sysreg: Rename TRBLIMITR_EL1 fields per auto-gen tools format
  arm64/sysreg: Convert OSECCR_EL1 to automatic generation
  arm64/sysreg: Convert OSDTRTX_EL1 to automatic generation
  arm64/sysreg: Convert OSDTRRX_EL1 to automatic generation
  arm64/sysreg: Convert OSLAR_EL1 to automatic generation
  arm64/sysreg: Standardise naming of bitfield constants in OSL[AS]R_EL1
  arm64/sysreg: Convert MDSCR_EL1 to automatic register generation
  ...

* for-next/cpucap:
  : arm64 cpucap clean-up
  arm64: cpufeature: fold cpus_set_cap() into update_cpu_capabilities()
  arm64: cpufeature: use cpucap naming
  arm64: alternatives: use cpucap naming
  arm64: standardise cpucap bitmap names

* for-next/acpi:
  : Various arm64-related ACPI patches
  ACPI: bus: Consolidate all arm specific initialisation into acpi_arm_init()

* for-next/kdump:
  : Simplify the crashkernel reservation behaviour of crashkernel=X,high on arm64
  arm64: add kdump.rst into index.rst
  Documentation: add kdump.rst to present crashkernel reservation on arm64
  arm64: kdump: simplify the reservation behaviour of crashkernel=,high

* for-next/acpi-doc:
  : Update ACPI documentation for Arm systems
  Documentation/arm64: Update ACPI tables from BBR
  Documentation/arm64: Update references in arm-acpi
  Documentation/arm64: Update ARM and arch reference

* for-next/doc:
  : arm64 documentation updates
  Documentation/arm64: Add ptdump documentation

* for-next/tpidr2-fix:
  : Fix the TPIDR2_EL0 register restoring on sigreturn
  kselftest/arm64: Add a test case for TPIDR2 restore
  arm64/signal: Restore TPIDR2 register rather than memory state
2023-06-23 18:32:20 +01:00
Oliver Upton
89a734b54c Merge branch kvm-arm64/configurable-id-regs into kvmarm/next
* kvm-arm64/configurable-id-regs:
  : Configurable ID register infrastructure, courtesy of Jing Zhang
  :
  : Create generalized infrastructure for allowing userspace to select the
  : supported feature set for a VM, so long as the feature set is a subset
  : of what hardware + KVM allows. This does not add any new features that
  : are user-configurable, and instead focuses on the necessary refactoring
  : to enable future work.
  :
  : As a consequence of the series, feature asymmetry is now deliberately
  : disallowed for KVM. It is unlikely that VMMs ever configured VMs with
  : asymmetry, nor does it align with the kernel's overall stance that
  : features must be uniform across all cores in the system.
  :
  : Furthermore, KVM incorrectly advertised an IMP_DEF PMU to guests for
  : some time. Migrations from affected kernels was supported by explicitly
  : allowing such an ID register value from userspace, and forwarding that
  : along to the guest. KVM now allows an IMP_DEF PMU version to be restored
  : through the ID register interface, but reinterprets the user value as
  : not implemented (0).
  KVM: arm64: Rip out the vestiges of the 'old' ID register scheme
  KVM: arm64: Handle ID register reads using the VM-wide values
  KVM: arm64: Use generic sanitisation for ID_AA64PFR0_EL1
  KVM: arm64: Use generic sanitisation for ID_(AA64)DFR0_EL1
  KVM: arm64: Use arm64_ftr_bits to sanitise ID register writes
  KVM: arm64: Save ID registers' sanitized value per guest
  KVM: arm64: Reuse fields of sys_reg_desc for idreg
  KVM: arm64: Rewrite IMPDEF PMU version as NI
  KVM: arm64: Make vCPU feature flags consistent VM-wide
  KVM: arm64: Relax invariance of KVM_ARM_VCPU_POWER_OFF
  KVM: arm64: Separate out feature sanitisation and initialisation

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-06-15 13:05:11 +00:00
Oliver Upton
686672407e KVM: arm64: Rip out the vestiges of the 'old' ID register scheme
There's no longer a need for the baggage of the old scheme for handling
configurable ID register fields. Rip it all out in favor of the
generalized infrastructure.

Link: https://lore.kernel.org/r/20230609190054.1542113-12-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-06-15 12:55:35 +00:00
Oliver Upton
6db7af0d5b KVM: arm64: Handle ID register reads using the VM-wide values
Everything is in place now to use the generic ID register
infrastructure. Use the VM-wide values to service ID register reads.
The ID registers are invariant after the VM has started, so there is no
need for locking in that case. This is rather desirable for VM live
migration, as the needless lock contention could prolong the VM blackout
period.

Link: https://lore.kernel.org/r/20230609190054.1542113-11-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-06-15 12:55:35 +00:00
Jing Zhang
c39f5974d3 KVM: arm64: Use generic sanitisation for ID_AA64PFR0_EL1
KVM allows userspace to write to the CSV2 and CSV3 fields of
ID_AA64PFR0_EL1 so long as it doesn't over-promise on the
Spectre/Meltdown mitigation state. Switch over to the new way of the
world for screening user writes. Leave the old plumbing in place until
we actually start handling ID register reads from the VM-wide values.

Signed-off-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230609190054.1542113-10-oliver.upton@linux.dev
[Oliver: split from monster patch, added commit description]
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-06-15 12:55:29 +00:00
Jing Zhang
c118cead07 KVM: arm64: Use generic sanitisation for ID_(AA64)DFR0_EL1
KVM allows userspace to specify a PMU version for the guest by writing
to the corresponding ID registers. Currently the validation of these
writes is done manuallly, but there's no reason we can't switch over to
the generic sanitisation infrastructure.

Start screening user writes through arm64_check_features() to prevent
userspace from over-promising in terms of vPMU support. Leave the old
masking in place for now, as we aren't completely ready to serve reads
from the VM-wide values.

Signed-off-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230609190054.1542113-9-oliver.upton@linux.dev
[Oliver: split off from monster patch, cleaned up handling of NI vPMU
 values, wrote commit description]
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-06-15 12:55:20 +00:00
Jing Zhang
2e8bf0cbd0 KVM: arm64: Use arm64_ftr_bits to sanitise ID register writes
Rather than reinventing the wheel in KVM to do ID register sanitisation
we can rely on the work already done in the core kernel. Implement a
generalized sanitisation of ID registers based on the combination of the
arm64_ftr_bits definitions from the core kernel and (optionally) a set
of KVM-specific overrides.

This all amounts to absolutely nothing for now, but will be used in
subsequent changes to realize user-configurable ID registers.

Signed-off-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230609190054.1542113-8-oliver.upton@linux.dev
[Oliver: split off from monster patch, rewrote commit description,
 reworked RAZ handling, return EINVAL to userspace]
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-06-15 12:55:20 +00:00
Jing Zhang
4733414690 KVM: arm64: Save ID registers' sanitized value per guest
Initialize the default ID register values upon the first call to
KVM_ARM_VCPU_INIT. The vCPU feature flags are finalized at that point,
so it is possible to determine the maximum feature set supported by a
particular VM configuration. Do nothing with these values for now, as we
need to rework the plumbing of what's already writable to be compatible
with the generic infrastructure.

Co-developed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230609190054.1542113-7-oliver.upton@linux.dev
[Oliver: Hoist everything into KVM_ARM_VCPU_INIT time, so the features
 are final]
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-06-15 12:55:08 +00:00
Marc Zyngier
75c76ab5a6 KVM: arm64: Rework CPTR_EL2 programming for HVHE configuration
Just like we repainted the early arm64 code, we need to update
the CPTR_EL2 accesses that are taking place in the nVHE code
when hVHE is used, making them look as if they were CPACR_EL1
accesses. Just like the VHE code.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230609162200.2024064-14-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-06-12 23:17:24 +00:00
Jing Zhang
d86cde6e33 KVM: arm64: Reuse fields of sys_reg_desc for idreg
sys_reg_desc::{reset, val} are presently unused for ID register
descriptors. Repurpose these fields to support user-configurable ID
registers.
Use the ::reset() function pointer to return the sanitised value of a
given ID register, optionally with KVM-specific feature sanitisation.
Additionally, keep a mask of writable register fields in ::val.

Signed-off-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230609190054.1542113-6-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-06-12 23:08:33 +00:00
Oliver Upton
f90f9360c3 KVM: arm64: Rewrite IMPDEF PMU version as NI
KVM allows userspace to write an IMPDEF PMU version to the corresponding
32bit and 64bit ID register fields for the sake of backwards
compatibility with kernels that lacked commit 3d0dba5764 ("KVM: arm64:
PMU: Move the ID_AA64DFR0_EL1.PMUver limit to VM creation"). Plumbing
that IMPDEF PMU version through to the gues is getting in the way of
progress, and really doesn't any sense in the first place.

Bite the bullet and reinterpret the IMPDEF PMU version as NI (0) for
userspace writes. Additionally, spill the dirty details into a comment.

Link: https://lore.kernel.org/r/20230609190054.1542113-5-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-06-12 23:08:33 +00:00
Mark Brown
187de7c2aa arm64/sysreg: Standardise naming of bitfield constants in OSL[AS]R_EL1
Our standard scheme for naming the constants for bitfields in system
registers includes _ELx in the name but not the SYS_, update the
constants for OSL[AS]R_EL1 to follow this convention.

Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20230419-arm64-syreg-gen-v2-3-4c6add1f6257@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-06 17:43:08 +01:00
Joey Gouly
8ef67c67e6 KVM: arm64: expose ID_AA64MMFR3_EL1 to guests
Now that KVM context switches the appropriate registers, expose ID_AA64MMFR3_EL1
to guests to allow them to use the new features.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Zenghui Yu <yuzenghui@huawei.com>
Cc: Will Deacon <will@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230606145859.697944-11-joey.gouly@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-06 16:52:41 +01:00
Joey Gouly
86f9de9db1 KVM: arm64: Save/restore PIE registers
Define the new system registers that PIE introduces and context switch them.
The PIE feature is still hidden from the ID register, and not exposed to a VM.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Zenghui Yu <yuzenghui@huawei.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230606145859.697944-10-joey.gouly@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-06 16:52:40 +01:00
Joey Gouly
fbff560682 KVM: arm64: Save/restore TCR2_EL1
Define the new system register TCR2_EL1 and context switch it.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Zenghui Yu <yuzenghui@huawei.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230606145859.697944-9-joey.gouly@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-06 16:52:40 +01:00
Kristina Martsenko
3172613fbc KVM: arm64: hide MOPS from guests
As FEAT_MOPS is not supported in guests yet, hide it from the ID
registers for guests.

The MOPS instructions are UNDEFINED in guests as HCRX_EL2.MSCEn is not
set in HCRX_GUEST_FLAGS, and will take an exception to EL1 if executed.

Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20230509142235.3284028-7-kristina.martsenko@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-05 17:05:31 +01:00
Marc Zyngier
d282fa3c5c KVM: arm64: Handle trap of tagged Set/Way CMOs
We appear to have missed the Set/Way CMOs when adding MTE support.
Not that we really expect anyone to use them, but you never know
what stupidity some people can come up with...

Treat these mostly like we deal with the classic S/W CMOs, only
with an additional check that MTE really is enabled.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20230515204601.1270428-3-maz@kernel.org
2023-05-24 13:45:18 +01:00
Linus Torvalds
c8c655c34e s390:
* More phys_to_virt conversions
 
 * Improvement of AP management for VSIE (nested virtualization)
 
 ARM64:
 
 * Numerous fixes for the pathological lock inversion issue that
   plagued KVM/arm64 since... forever.
 
 * New framework allowing SMCCC-compliant hypercalls to be forwarded
   to userspace, hopefully paving the way for some more features
   being moved to VMMs rather than be implemented in the kernel.
 
 * Large rework of the timer code to allow a VM-wide offset to be
   applied to both virtual and physical counters as well as a
   per-timer, per-vcpu offset that complements the global one.
   This last part allows the NV timer code to be implemented on
   top.
 
 * A small set of fixes to make sure that we don't change anything
   affecting the EL1&0 translation regime just after having having
   taken an exception to EL2 until we have executed a DSB. This
   ensures that speculative walks started in EL1&0 have completed.
 
 * The usual selftest fixes and improvements.
 
 KVM x86 changes for 6.4:
 
 * Optimize CR0.WP toggling by avoiding an MMU reload when TDP is enabled,
   and by giving the guest control of CR0.WP when EPT is enabled on VMX
   (VMX-only because SVM doesn't support per-bit controls)
 
 * Add CR0/CR4 helpers to query single bits, and clean up related code
   where KVM was interpreting kvm_read_cr4_bits()'s "unsigned long" return
   as a bool
 
 * Move AMD_PSFD to cpufeatures.h and purge KVM's definition
 
 * Avoid unnecessary writes+flushes when the guest is only adding new PTEs
 
 * Overhaul .sync_page() and .invlpg() to utilize .sync_page()'s optimizations
   when emulating invalidations
 
 * Clean up the range-based flushing APIs
 
 * Revamp the TDP MMU's reaping of Accessed/Dirty bits to clear a single
   A/D bit using a LOCK AND instead of XCHG, and skip all of the "handle
   changed SPTE" overhead associated with writing the entire entry
 
 * Track the number of "tail" entries in a pte_list_desc to avoid having
   to walk (potentially) all descriptors during insertion and deletion,
   which gets quite expensive if the guest is spamming fork()
 
 * Disallow virtualizing legacy LBRs if architectural LBRs are available,
   the two are mutually exclusive in hardware
 
 * Disallow writes to immutable feature MSRs (notably PERF_CAPABILITIES)
   after KVM_RUN, similar to CPUID features
 
 * Overhaul the vmx_pmu_caps selftest to better validate PERF_CAPABILITIES
 
 * Apply PMU filters to emulated events and add test coverage to the
   pmu_event_filter selftest
 
 x86 AMD:
 
 * Add support for virtual NMIs
 
 * Fixes for edge cases related to virtual interrupts
 
 x86 Intel:
 
 * Don't advertise XTILE_CFG in KVM_GET_SUPPORTED_CPUID if XTILE_DATA is
   not being reported due to userspace not opting in via prctl()
 
 * Fix a bug in emulation of ENCLS in compatibility mode
 
 * Allow emulation of NOP and PAUSE for L2
 
 * AMX selftests improvements
 
 * Misc cleanups
 
 MIPS:
 
 * Constify MIPS's internal callbacks (a leftover from the hardware enabling
   rework that landed in 6.3)
 
 Generic:
 
 * Drop unnecessary casts from "void *" throughout kvm_main.c
 
 * Tweak the layout of "struct kvm_mmu_memory_cache" to shrink the struct
   size by 8 bytes on 64-bit kernels by utilizing a padding hole
 
 Documentation:
 
 * Fix goof introduced by the conversion to rST
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmRNExkUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNyjwf+MkzDael9y9AsOZoqhEZ5OsfQYJ32
 Im5ZVYsPRU2K5TuoWql6meIihgclCj1iIU32qYHa2F1WYt2rZ72rJp+HoY8b+TaI
 WvF0pvNtqQyg3iEKUBKPA4xQ6mj7RpQBw86qqiCHmlfNt0zxluEGEPxH8xrWcfhC
 huDQ+NUOdU7fmJ3rqGitCvkUbCuZNkw3aNPR8dhU8RAWrwRzP2hBOmdxIeo81WWY
 XMEpJSijbGpXL9CvM0Jz9nOuMJwZwCCBGxg1vSQq0xTfLySNMxzvWZC2GFaBjucb
 j0UOQ7yE0drIZDVhd3sdNslubXXU6FcSEzacGQb9aigMUon3Tem9SHi7Kw==
 =S2Hq
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "s390:

   - More phys_to_virt conversions

   - Improvement of AP management for VSIE (nested virtualization)

  ARM64:

   - Numerous fixes for the pathological lock inversion issue that
     plagued KVM/arm64 since... forever.

   - New framework allowing SMCCC-compliant hypercalls to be forwarded
     to userspace, hopefully paving the way for some more features being
     moved to VMMs rather than be implemented in the kernel.

   - Large rework of the timer code to allow a VM-wide offset to be
     applied to both virtual and physical counters as well as a
     per-timer, per-vcpu offset that complements the global one. This
     last part allows the NV timer code to be implemented on top.

   - A small set of fixes to make sure that we don't change anything
     affecting the EL1&0 translation regime just after having having
     taken an exception to EL2 until we have executed a DSB. This
     ensures that speculative walks started in EL1&0 have completed.

   - The usual selftest fixes and improvements.

  x86:

   - Optimize CR0.WP toggling by avoiding an MMU reload when TDP is
     enabled, and by giving the guest control of CR0.WP when EPT is
     enabled on VMX (VMX-only because SVM doesn't support per-bit
     controls)

   - Add CR0/CR4 helpers to query single bits, and clean up related code
     where KVM was interpreting kvm_read_cr4_bits()'s "unsigned long"
     return as a bool

   - Move AMD_PSFD to cpufeatures.h and purge KVM's definition

   - Avoid unnecessary writes+flushes when the guest is only adding new
     PTEs

   - Overhaul .sync_page() and .invlpg() to utilize .sync_page()'s
     optimizations when emulating invalidations

   - Clean up the range-based flushing APIs

   - Revamp the TDP MMU's reaping of Accessed/Dirty bits to clear a
     single A/D bit using a LOCK AND instead of XCHG, and skip all of
     the "handle changed SPTE" overhead associated with writing the
     entire entry

   - Track the number of "tail" entries in a pte_list_desc to avoid
     having to walk (potentially) all descriptors during insertion and
     deletion, which gets quite expensive if the guest is spamming
     fork()

   - Disallow virtualizing legacy LBRs if architectural LBRs are
     available, the two are mutually exclusive in hardware

   - Disallow writes to immutable feature MSRs (notably
     PERF_CAPABILITIES) after KVM_RUN, similar to CPUID features

   - Overhaul the vmx_pmu_caps selftest to better validate
     PERF_CAPABILITIES

   - Apply PMU filters to emulated events and add test coverage to the
     pmu_event_filter selftest

   - AMD SVM:
       - Add support for virtual NMIs
       - Fixes for edge cases related to virtual interrupts

   - Intel AMX:
       - Don't advertise XTILE_CFG in KVM_GET_SUPPORTED_CPUID if
         XTILE_DATA is not being reported due to userspace not opting in
         via prctl()
       - Fix a bug in emulation of ENCLS in compatibility mode
       - Allow emulation of NOP and PAUSE for L2
       - AMX selftests improvements
       - Misc cleanups

  MIPS:

   - Constify MIPS's internal callbacks (a leftover from the hardware
     enabling rework that landed in 6.3)

  Generic:

   - Drop unnecessary casts from "void *" throughout kvm_main.c

   - Tweak the layout of "struct kvm_mmu_memory_cache" to shrink the
     struct size by 8 bytes on 64-bit kernels by utilizing a padding
     hole

  Documentation:

   - Fix goof introduced by the conversion to rST"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (211 commits)
  KVM: s390: pci: fix virtual-physical confusion on module unload/load
  KVM: s390: vsie: clarifications on setting the APCB
  KVM: s390: interrupt: fix virtual-physical confusion for next alert GISA
  KVM: arm64: Have kvm_psci_vcpu_on() use WRITE_ONCE() to update mp_state
  KVM: arm64: Acquire mp_state_lock in kvm_arch_vcpu_ioctl_vcpu_init()
  KVM: selftests: Test the PMU event "Instructions retired"
  KVM: selftests: Copy full counter values from guest in PMU event filter test
  KVM: selftests: Use error codes to signal errors in PMU event filter test
  KVM: selftests: Print detailed info in PMU event filter asserts
  KVM: selftests: Add helpers for PMC asserts in PMU event filter test
  KVM: selftests: Add a common helper for the PMU event filter guest code
  KVM: selftests: Fix spelling mistake "perrmited" -> "permitted"
  KVM: arm64: vhe: Drop extra isb() on guest exit
  KVM: arm64: vhe: Synchronise with page table walker on MMU update
  KVM: arm64: pkvm: Document the side effects of kvm_flush_dcache_to_poc()
  KVM: arm64: nvhe: Synchronise with page table walker on TLBI
  KVM: arm64: Handle 32bit CNTPCTSS traps
  KVM: arm64: nvhe: Synchronise with page table walker on vcpu run
  KVM: arm64: vgic: Don't acquire its_lock before config_lock
  KVM: selftests: Add test to verify KVM's supported XCR0
  ...
2023-05-01 12:06:20 -07:00
Marc Zyngier
a6610435ac KVM: arm64: Handle 32bit CNTPCTSS traps
When CNTPOFF isn't implemented and that we have a non-zero counter
offset, CNTPCT and CNTPCTSS are trapped. We properly handle the
former, but not the latter, as it is not present in the sysreg
table (despite being actually handled in the code). Bummer.

Just populate the cp15_64 table with the missing register.

Reported-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-04-13 14:23:42 +01:00
Marc Zyngier
c605ee2450 KVM: arm64: timers: Allow physical offset without CNTPOFF_EL2
CNTPOFF_EL2 is awesome, but it is mostly vapourware, and no publicly
available implementation has it. So for the common mortals, let's
implement the emulated version of this thing.

It means trapping accesses to the physical counter and timer, and
emulate some of it as necessary.

As for CNTPOFF_EL2, nobody sets the offset yet.

Reviewed-by: Colton Lewis <coltonlewis@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230330174800.2677007-6-maz@kernel.org
2023-03-30 19:01:09 +01:00
Reiji Watanabe
f9ea835e99 KVM: arm64: PMU: Restore the guest's EL0 event counting after migration
Currently, with VHE, KVM enables the EL0 event counting for the
guest on vcpu_load() or KVM enables it as a part of the PMU
register emulation process, when needed.  However, in the migration
case (with VHE), the same handling is lacking, as vPMU register
values that were restored by userspace haven't been propagated yet
(the PMU events haven't been created) at the vcpu load-time on the
first KVM_RUN (kvm_vcpu_pmu_restore_guest() called from vcpu_load()
on the first KVM_RUN won't do anything as events_{guest,host} of
kvm_pmu_events are still zero).

So, with VHE, enable the guest's EL0 event counting on the first
KVM_RUN (after the migration) when needed.  More specifically,
have kvm_pmu_handle_pmcr() call kvm_vcpu_pmu_restore_guest()
so that kvm_pmu_handle_pmcr() on the first KVM_RUN can take
care of it.

Fixes: d0c94c4979 ("KVM: arm64: Restore PMU configuration on first run")
Cc: stable@vger.kernel.org
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Link: https://lore.kernel.org/r/20230329023944.2488484-1-reijiw@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-03-30 06:14:43 +00:00
Reiji Watanabe
9228b26194 KVM: arm64: PMU: Fix GET_ONE_REG for vPMC regs to return the current value
Have KVM_GET_ONE_REG for vPMU counter (vPMC) registers (PMCCNTR_EL0
and PMEVCNTR<n>_EL0) return the sum of the register value in the sysreg
file and the current perf event counter value.

Values of vPMC registers are saved in sysreg files on certain occasions.
These saved values don't represent the current values of the vPMC
registers if the perf events for the vPMCs count events after the save.
The current values of those registers are the sum of the sysreg file
value and the current perf event counter value.  But, when userspace
reads those registers (using KVM_GET_ONE_REG), KVM returns the sysreg
file value to userspace (not the sum value).

Fix this to return the sum value for KVM_GET_ONE_REG.

Fixes: 051ff581ce ("arm64: KVM: Add access handler for event counter register")
Cc: stable@vger.kernel.org
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Link: https://lore.kernel.org/r/20230313033208.1475499-1-reijiw@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-03-13 18:05:40 +00:00
Oliver Upton
0d3b2b4d23 Merge branch kvm-arm64/nv-prefix into kvmarm/next
* kvm-arm64/nv-prefix:
  : Preamble to NV support, courtesy of Marc Zyngier.
  :
  : This brings in a set of prerequisite patches for supporting nested
  : virtualization in KVM/arm64. Of course, there is a long way to go until
  : NV is actually enabled in KVM.
  :
  :  - Introduce cpucap / vCPU feature flag to pivot the NV code on
  :
  :  - Add support for EL2 vCPU register state
  :
  :  - Basic nested exception handling
  :
  :  - Hide unsupported features from the ID registers for NV-capable VMs
  KVM: arm64: nv: Use reg_to_encoding() to get sysreg ID
  KVM: arm64: nv: Only toggle cache for virtual EL2 when SCTLR_EL2 changes
  KVM: arm64: nv: Filter out unsupported features from ID regs
  KVM: arm64: nv: Emulate EL12 register accesses from the virtual EL2
  KVM: arm64: nv: Allow a sysreg to be hidden from userspace only
  KVM: arm64: nv: Emulate PSTATE.M for a guest hypervisor
  KVM: arm64: nv: Add accessors for SPSR_EL1, ELR_EL1 and VBAR_EL1 from virtual EL2
  KVM: arm64: nv: Handle SMCs taken from virtual EL2
  KVM: arm64: nv: Handle trapped ERET from virtual EL2
  KVM: arm64: nv: Inject HVC exceptions to the virtual EL2
  KVM: arm64: nv: Support virtual EL2 exceptions
  KVM: arm64: nv: Handle HCR_EL2.NV system register traps
  KVM: arm64: nv: Add nested virt VCPU primitives for vEL2 VCPU state
  KVM: arm64: nv: Add EL2 system registers to vcpu context
  KVM: arm64: nv: Allow userspace to set PSR_MODE_EL2x
  KVM: arm64: nv: Reset VCPU to EL2 registers if VCPU nested virt is set
  KVM: arm64: nv: Introduce nested virtualization VCPU feature
  KVM: arm64: Use the S2 MMU context to iterate over S2 table
  arm64: Add ARM64_HAS_NESTED_VIRT cpufeature

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-13 23:33:41 +00:00
Oliver Upton
022d3f0800 Merge branch kvm-arm64/misc into kvmarm/next
* kvm-arm64/misc:
  : Miscellaneous updates
  :
  :  - Convert CPACR_EL1_TTA to the new, generated system register
  :    definitions.
  :
  :  - Serialize toggling CPACR_EL1.SMEN to avoid unexpected exceptions when
  :    accessing SVCR in the host.
  :
  :  - Avoid quiescing the guest if a vCPU accesses its own redistributor's
  :    SGIs/PPIs, eliminating the need to IPI. Largely an optimization for
  :    nested virtualization, as the L1 accesses the affected registers
  :    rather often.
  :
  :  - Conversion to kstrtobool()
  :
  :  - Common definition of INVALID_GPA across architectures
  :
  :  - Enable CONFIG_USERFAULTFD for CI runs of KVM selftests
  KVM: arm64: Fix non-kerneldoc comments
  KVM: selftests: Enable USERFAULTFD
  KVM: selftests: Remove redundant setbuf()
  arm64/sysreg: clean up some inconsistent indenting
  KVM: MMU: Make the definition of 'INVALID_GPA' common
  KVM: arm64: vgic-v3: Use kstrtobool() instead of strtobool()
  KVM: arm64: vgic-v3: Limit IPI-ing when accessing GICR_{C,S}ACTIVER0
  KVM: arm64: Synchronize SMEN on vcpu schedule out
  KVM: arm64: Kill CPACR_EL1_TTA definition

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-13 23:33:25 +00:00
Oliver Upton
1b915210d9 Merge branch kvm-arm64/nv-timer-improvements into kvmarm/next
* kvm-arm64/nv-timer-improvements:
  : Timer emulation improvements, courtesy of Marc Zyngier.
  :
  :  - Avoid re-arming an hrtimer for a guest timer that is already pending
  :
  :  - Only reload the affected timer context when emulating a sysreg access
  :    instead of both the virtual/physical timers.
  KVM: arm64: timers: Don't BUG() on unhandled timer trap
  KVM: arm64: Reduce overhead of trapped timer sysreg accesses
  KVM: arm64: Don't arm a hrtimer for an already pending timer

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-13 23:26:21 +00:00
Oliver Upton
e8789ab704 Merge branch kvm-arm64/virtual-cache-geometry into kvmarm/next
* kvm-arm64/virtual-cache-geometry:
  : Virtualized cache geometry for KVM guests, courtesy of Akihiko Odaki.
  :
  : KVM/arm64 has always exposed the host cache geometry directly to the
  : guest, even though non-secure software should never perform CMOs by
  : Set/Way. This was slightly wrong, as the cache geometry was derived from
  : the PE on which the vCPU thread was running and not a sanitized value.
  :
  : All together this leads to issues migrating VMs on heterogeneous
  : systems, as the cache geometry saved/restored could be inconsistent.
  :
  : KVM/arm64 now presents 1 level of cache with 1 set and 1 way. The cache
  : geometry is entirely controlled by userspace, such that migrations from
  : older kernels continue to work.
  KVM: arm64: Mark some VM-scoped allocations as __GFP_ACCOUNT
  KVM: arm64: Normalize cache configuration
  KVM: arm64: Mask FEAT_CCIDX
  KVM: arm64: Always set HCR_TID2
  arm64/cache: Move CLIDR macro definitions
  arm64/sysreg: Add CCSIDR2_EL1
  arm64/sysreg: Convert CCSIDR_EL1 to automatic generation
  arm64: Allow the definition of UNKNOWN system register fields

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-13 22:32:40 +00:00
Oliver Upton
92425e058a Merge branch kvm/kvm-hw-enable-refactor into kvmarm/next
Merge the kvm_init() + hardware enable rework to avoid conflicts
with kvmarm.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-13 22:28:34 +00:00
Marc Zyngier
9f75b6d447 KVM: arm64: nv: Filter out unsupported features from ID regs
As there is a number of features that we either can't support,
or don't want to support right away with NV, let's add some
basic filtering so that we don't advertize silly things to the
EL2 guest.

Whilst we are at it, advertize FEAT_TTL as well as FEAT_GTG, which
the NV implementation will implement.

Reviewed-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-18-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-11 10:13:30 +00:00
Jintack Lim
280b748e87 KVM: arm64: nv: Emulate EL12 register accesses from the virtual EL2
With HCR_EL2.NV bit set, accesses to EL12 registers in the virtual EL2
trap to EL2. Handle those traps just like we do for EL1 registers.

One exception is CNTKCTL_EL12. We don't trap on CNTKCTL_EL1 for non-VHE
virtual EL2 because we don't have to. However, accessing CNTKCTL_EL12
will trap since it's one of the EL12 registers controlled by HCR_EL2.NV
bit.  Therefore, add a handler for it and don't treat it as a
non-trap-registers when preparing a shadow context.

These registers, being only a view on their EL1 counterpart, are
permanently hidden from userspace.

Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
[maz: EL12_REG(), register visibility]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-17-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-11 10:13:30 +00:00
Marc Zyngier
e6b367db0f KVM: arm64: nv: Allow a sysreg to be hidden from userspace only
So far, we never needed to distinguish between registers hidden
from userspace and being hidden from a guest (they are always
either visible to both, or hidden from both).

With NV, we have the ugly case of the EL02 and EL12 registers,
which are only a view on the EL0 and EL1 registers. It makes
absolutely no sense to expose them to userspace, since it
already has the canonical view.

Add a new visibility flag (REG_HIDDEN_USER) and a new helper that
checks for it and REG_HIDDEN when checking whether to expose
a sysreg to userspace. Subsequent patches will make use of it.

Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-16-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-11 10:13:29 +00:00
Jintack Lim
9da117eec9 KVM: arm64: nv: Add accessors for SPSR_EL1, ELR_EL1 and VBAR_EL1 from virtual EL2
For the same reason we trap virtual memory register accesses at virtual
EL2, we need to trap SPSR_EL1, ELR_EL1 and VBAR_EL1 accesses. ARM v8.3
introduces the HCR_EL2.NV1 bit to be able to trap on those register
accesses in EL1. Do not set this bit until the whole nesting support is
completed, which happens further down the line...

Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-14-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-11 10:13:29 +00:00
Jintack Lim
6ff9dc238a KVM: arm64: nv: Handle HCR_EL2.NV system register traps
ARM v8.3 introduces a new bit in the HCR_EL2, which is the NV bit. When
this bit is set, accessing EL2 registers in EL1 traps to EL2. In
addition, executing the following instructions in EL1 will trap to EL2:
tlbi, at, eret, and msr/mrs instructions to access SP_EL1. Most of the
instructions that trap to EL2 with the NV bit were undef at EL1 prior to
ARM v8.3. The only instruction that was not undef is eret.

This patch sets up a handler for EL2 registers and SP_EL1 register
accesses at EL1. The host hypervisor keeps those register values in
memory, and will emulate their behavior.

This patch doesn't set the NV bit yet. It will be set in a later patch
once nested virtualization support is completed.

Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
[maz: EL2_REG() macros]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-9-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-11 09:16:11 +00:00
Oliver Upton
5f623a598d KVM: arm64: Mark some VM-scoped allocations as __GFP_ACCOUNT
Generally speaking, any memory allocations that can be associated with a
particular VM should be charged to the cgroup of its process.
Nonetheless, there are a couple spots in KVM/arm64 that aren't currently
accounted:

 - the ccsidr array containing the virtualized cache hierarchy

 - the cpumask of supported cpus, for use of the vPMU on heterogeneous
   systems

Go ahead and set __GFP_ACCOUNT for these allocations.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Link: https://lore.kernel.org/r/20230206235229.4174711-1-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-07 13:56:18 +00:00
Jiapeng Chong
242b6f34b5 arm64/sysreg: clean up some inconsistent indenting
No functional modification involved.

./arch/arm64/kvm/sys_regs.c:80:2-9: code aligned with following code on line 82.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3897
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230131082703.118101-1-jiapeng.chong@linux.alibaba.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-01-31 19:43:27 +00:00
Marc Zyngier
ba82e06cf7 KVM: arm64: timers: Don't BUG() on unhandled timer trap
Although not handling a trap is a pretty bad situation to be in,
panicing the kernel isn't useful and provides no valuable
information to help debugging the situation.

Instead, dump the encoding of the unhandled sysreg, and inject
an UNDEF in the guest. At least, this gives a user an opportunity
to report the issue with some information to help debugging it.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230112123829.458912-4-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-01-26 18:48:47 +00:00
Akihiko Odaki
7af0c2534f KVM: arm64: Normalize cache configuration
Before this change, the cache configuration of the physical CPU was
exposed to vcpus. This is problematic because the cache configuration a
vcpu sees varies when it migrates between vcpus with different cache
configurations.

Fabricate cache configuration from the sanitized value, which holds the
CTR_EL0 value the userspace sees regardless of which physical CPU it
resides on.

CLIDR_EL1 and CCSIDR_EL1 are now writable from the userspace so that
the VMM can restore the values saved with the old kernel.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Link: https://lore.kernel.org/r/20230112023852.42012-8-akihiko.odaki@daynix.com
[ Oliver: Squash Marc's fix for CCSIDR_EL1.LineSize when set from userspace ]
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-01-21 18:09:23 +00:00
Akihiko Odaki
bf48040cd9 KVM: arm64: Mask FEAT_CCIDX
The CCSIDR access handler masks the associativity bits according to the
bit layout for processors without FEAT_CCIDX. KVM also assumes CCSIDR is
32-bit where it will be 64-bit if FEAT_CCIDX is enabled. Mask FEAT_CCIDX
so that these assumptions hold.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Link: https://lore.kernel.org/r/20230112023852.42012-7-akihiko.odaki@daynix.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-01-12 21:07:43 +00:00
Sean Christopherson
8d20bd6381 KVM: x86: Unify pr_fmt to use module name for all KVM modules
Define pr_fmt using KBUILD_MODNAME for all KVM x86 code so that printks
use consistent formatting across common x86, Intel, and AMD code.  In
addition to providing consistent print formatting, using KBUILD_MODNAME,
e.g. kvm_amd and kvm_intel, allows referencing SVM and VMX (and SEV and
SGX and ...) as technologies without generating weird messages, and
without causing naming conflicts with other kernel code, e.g. "SEV: ",
"tdx: ", "sgx: " etc.. are all used by the kernel for non-KVM subsystems.

Opportunistically move away from printk() for prints that need to be
modified anyways, e.g. to drop a manual "kvm: " prefix.

Opportunistically convert a few SGX WARNs that are similarly modified to
WARN_ONCE; in the very unlikely event that the WARNs fire, odds are good
that they would fire repeatedly and spam the kernel log without providing
unique information in each print.

Note, defining pr_fmt yields undesirable results for code that uses KVM's
printk wrappers, e.g. vcpu_unimpl().  But, that's a pre-existing problem
as SVM/kvm_amd already defines a pr_fmt, and thankfully use of KVM's
wrappers is relatively limited in KVM x86 code.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20221130230934.1014142-35-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-29 15:47:35 -05:00
James Clark
aff234839f KVM: arm64: PMU: Fix PMCR_EL0 reset value
ARMV8_PMU_PMCR_N_MASK is an unshifted value which results in the wrong
reset value for PMCR_EL0, so shift it to fix it.

This fixes the following error when running qemu:

  $ qemu-system-aarch64 -cpu host -machine type=virt,accel=kvm -kernel ...

  target/arm/helper.c:1813: pmevcntr_rawwrite: Assertion `counter < pmu_num_counters(env)' failed.

Fixes: 292e8f1494 ("KVM: arm64: PMU: Simplify PMCR_EL0 reset handling")
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221209164446.1972014-2-james.clark@arm.com
2022-12-12 09:07:14 +00:00
Marc Zyngier
753d734f3f Merge remote-tracking branch 'arm64/for-next/sysregs' into kvmarm-master/next
Merge arm64's sysreg repainting branch to avoid too many
ugly conflicts...

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-12-05 14:39:53 +00:00
James Morse
f4f5969e35 arm64/sysreg: Standardise naming for ID_DFR0_EL1
To convert the 32bit id registers to use the sysreg generation, they
must first have a regular pattern, to match the symbols the script
generates.

Ensure symbols for the ID_DFR0_EL1 register have an _EL1 suffix,
and use lower-case for feature names where the arm-arm does the same.

The arm-arm has feature names for some of the ID_DFR0_EL1.PerMon encodings.
Use these feature names in preference to the '8_4' indication of the
architecture version they were introduced in.

No functional change.

Signed-off-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20221130171637.718182-12-james.morse@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-12-01 15:53:14 +00:00
Marc Zyngier
64d6820d64 KVM: arm64: PMU: Sanitise PMCR_EL0.LP on first vcpu run
Userspace can play some dirty tricks on us by selecting a given
PMU version (such as PMUv3p5), restore a PMCR_EL0 value that
has PMCR_EL0.LP set, and then switch the PMU version to PMUv3p1,
for example. In this situation, we end-up with PMCR_EL0.LP being
set and spreading havoc in the PMU emulation.

This is specially hard as the first two step can be done on
one vcpu and the third step on another, meaning that we need
to sanitise *all* vcpus when the PMU version is changed.

In orer to avoid a pretty complicated locking situation,
defer the sanitisation of PMCR_EL0 to the point where the
vcpu is actually run for the first tine, using the existing
KVM_REQ_RELOAD_PMU request that calls into kvm_pmu_handle_pmcr().

There is still an obscure corner case where userspace could
do the above trick, and then save the VM without running it.
They would then observe an inconsistent state (PMUv3.1 + LP set),
but that state will be fixed on the first run anyway whenever
the guest gets restored on a host.

Reported-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-11-28 14:04:08 +00:00
Marc Zyngier
292e8f1494 KVM: arm64: PMU: Simplify PMCR_EL0 reset handling
Resetting PMCR_EL0 is a pretty involved process that includes
poisoning some of the writable bits, just because we can.

It makes it hard to reason about about what gets configured,
and just resetting things to 0 seems like a much saner option.

Reduce reset_pmcr() to just preserving PMCR_EL0.N from the host,
and setting PMCR_EL0.LC if we don't support AArch32.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-11-28 14:04:08 +00:00
Marc Zyngier
11af4c3716 KVM: arm64: PMU: Implement PMUv3p5 long counter support
PMUv3p5 (which is mandatory with ARMv8.5) comes with some extra
features:

- All counters are 64bit

- The overflow point is controlled by the PMCR_EL0.LP bit

Add the required checks in the helpers that control counter
width and overflow, as well as the sysreg handling for the LP
bit. A new kvm_pmu_is_3p5() helper makes it easy to spot the
PMUv3p5 specific handling.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221113163832.3154370-14-maz@kernel.org
2022-11-19 12:56:39 +00:00
Marc Zyngier
d82e0dfdfd KVM: arm64: PMU: Allow ID_DFR0_EL1.PerfMon to be set from userspace
Allow userspace to write ID_DFR0_EL1, on the condition that only
the PerfMon field can be altered and be something that is compatible
with what was computed for the AArch64 view of the guest.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221113163832.3154370-13-maz@kernel.org
2022-11-19 12:56:39 +00:00
Marc Zyngier
60e651ff1f KVM: arm64: PMU: Allow ID_AA64DFR0_EL1.PMUver to be set from userspace
Allow userspace to write ID_AA64DFR0_EL1, on the condition that only
the PMUver field can be altered and be at most the one that was
initially computed for the guest.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221113163832.3154370-12-maz@kernel.org
2022-11-19 12:56:39 +00:00
Marc Zyngier
3d0dba5764 KVM: arm64: PMU: Move the ID_AA64DFR0_EL1.PMUver limit to VM creation
As further patches will enable the selection of a PMU revision
from userspace, sample the supported PMU revision at VM creation
time, rather than building each time the ID_AA64DFR0_EL1 register
is accessed.

This shouldn't result in any change in behaviour.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221113163832.3154370-11-maz@kernel.org
2022-11-19 12:56:39 +00:00
Marc Zyngier
b04b331502 Merge remote-tracking branch 'arm64/for-next/sysreg' into kvmarm-master/next
Merge arm64/for-next/sysreg in order to avoid upstream conflicts
due to the never ending sysreg repainting...

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-09-19 09:45:00 +01:00
Mark Brown
121a8fc088 arm64/sysreg: Use feature numbering for PMU and SPE revisions
Currently the kernel refers to the versions of the PMU and SPE features by
the version of the architecture where those features were updated but the
ARM refers to them using the FEAT_ names for the features. To improve
consistency and help with updating for newer features and since v9 will
make our current naming scheme a bit more confusing update the macros
identfying features to use the FEAT_ based scheme.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220910163354.860255-4-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-16 12:38:57 +01:00
Mark Brown
fcf37b38ff arm64/sysreg: Add _EL1 into ID_AA64DFR0_EL1 definition names
Normally we include the full register name in the defines for fields within
registers but this has not been followed for ID registers. In preparation
for automatic generation of defines add the _EL1s into the defines for
ID_AA64DFR0_EL1 to follow the convention. No functional changes.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220910163354.860255-3-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-16 12:38:57 +01:00
Mark Brown
c0357a73fa arm64/sysreg: Align field names in ID_AA64DFR0_EL1 with architecture
The naming scheme the architecture uses for the fields in ID_AA64DFR0_EL1
does not align well with kernel conventions, using as it does a lot of
MixedCase in various arrangements. In preparation for automatically
generating the defines for this register rename the defines used to match
what is in the architecture.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220910163354.860255-2-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-16 12:38:57 +01:00
Oliver Upton
d5efec7ed8 KVM: arm64: Treat 32bit ID registers as RAZ/WI on 64bit-only system
One of the oddities of the architecture is that the AArch64 views of the
AArch32 ID registers are UNKNOWN if AArch32 isn't implemented at any EL.
Nonetheless, KVM exposes these registers to userspace for the sake of
save/restore. It is possible that the UNKNOWN value could differ between
systems, leading to a rejected write from userspace.

Avoid the issue altogether by handling the AArch32 ID registers as
RAZ/WI when on an AArch64-only system.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220913094441.3957645-7-oliver.upton@linux.dev
2022-09-14 11:36:16 +01:00
Oliver Upton
4de06e4c1d KVM: arm64: Add a visibility bit to ignore user writes
We're about to ignore writes to AArch32 ID registers on AArch64-only
systems. Add a bit to indicate a register is handled as write ignore
when accessed from userspace.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220913094441.3957645-6-oliver.upton@linux.dev
2022-09-14 11:36:16 +01:00
Oliver Upton
cdd5036d04 KVM: arm64: Drop raz parameter from read_id_reg()
There is no longer a need for caller-specified RAZ visibility. Hoist the
call to sysreg_visible_as_raz() into read_id_reg() and drop the
parameter.

No functional change intended.

Suggested-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220913094441.3957645-4-oliver.upton@linux.dev
2022-09-14 11:36:16 +01:00
Oliver Upton
4782ccc8ef KVM: arm64: Remove internal accessor helpers for id regs
The internal accessors are only ever called once. Dump out their
contents in the caller.

No functional change intended.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220913094441.3957645-3-oliver.upton@linux.dev
2022-09-14 11:36:16 +01:00
Oliver Upton
34b4d20399 KVM: arm64: Use visibility hook to treat ID regs as RAZ
The generic id reg accessors already handle RAZ registers by way of the
visibility hook. Add a visibility hook that returns REG_RAZ
unconditionally and throw out the RAZ specific accessors.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220913094441.3957645-2-oliver.upton@linux.dev
2022-09-14 11:36:16 +01:00
Kristina Martsenko
6fcd019359 arm64/sysreg: Standardise naming for ID_AA64MMFR1_EL1 fields
In preparation for converting the ID_AA64MMFR1_EL1 system register
defines to automatic generation, rename them to follow the conventions
used by other automatically generated registers:

 * Add _EL1 in the register name.

 * Rename fields to match the names in the ARM ARM:
   * LOR -> LO
   * HPD -> HPDS
   * VHE -> VH
   * HADBS -> HAFDBS
   * SPECSEI -> SpecSEI
   * VMIDBITS -> VMIDBits

There should be no functional change as a result of this patch.

Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-11-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-09 10:59:03 +01:00
Mark Brown
6ca2b9ca45 arm64/sysreg: Add _EL1 into ID_AA64PFR1_EL1 constant names
Our standard is to include the _EL1 in the constant names for registers but
we did not do that for ID_AA64PFR1_EL1, update to do so in preparation for
conversion to automatic generation. No functional change.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-8-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-09 10:59:02 +01:00
Mark Brown
55adc08d7e arm64/sysreg: Add _EL1 into ID_AA64PFR0_EL1 definition names
Normally we include the full register name in the defines for fields within
registers but this has not been followed for ID registers. In preparation
for automatic generation of defines add the _EL1s into the defines for
ID_AA64PFR0_EL1 to follow the convention. No functional changes.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-7-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-09 10:59:02 +01:00
Paolo Bonzini
959d6c4ae2 KVM/arm64 fixes for 6.0, take #1
- Fix unexpected sign extension of KVM_ARM_DEVICE_ID_MASK
 
 - Tidy-up handling of AArch32 on asymmetric systems
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmL+RsgPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDHrUP/3IYZ0LnYUZBImSU/YTPL5yYzdSVAuMNcdRQ
 EgvLQKwP+JSrmd7B7wZ4MhY1LheKjpNmmuSqTRsZOHb/yBmnh3+ao5n2gqusYQeJ
 PCuLYjeF7ZU5fGIrPAW6BW0BFlmYMVbTrC6SEMhZsisBhna44jrrWgkBz9mOsXE/
 YcDWv8kP15lisuQzMvnYxmZobbVgSJ3KgQY4/Dp6vyKMR8ULujCxziFV5R4RD0xP
 Ay8wnxtMUymx9P6sZsd6Vwi5h1MUXOOoI4He7+8ejIfoMOManIMOIq4PDQhINwQv
 tGysDmQavftSbUkXJ1VB+8cJ/9KufzwKxFoc5WqGk1y14QulyBNyb/XR3UtORe1n
 bitINTTkqibHY6fdQJA7z1sD0jaEAh/xNwO1Gq0BS40o4XVQDv2BjdQir9TEdlZO
 tsZKVaFpN3UZe681ru12No8YzQDhpuLH65gDHDjLaftH99WKsrSwMZLoEjqZTlM/
 vH/9acd4UB+9zMGTpN2tJ//2cq6g3JoUC7jJIQB1oGStHX0/7AxKMlabR2xHmt9E
 4CmJND9RLK6+yEelagxOAYMQfnCdj6pW/3bhvAmsZWh0t3fNxCeBBXFr2I5os+E9
 hV0FYx4PG9GtorMSqudCDsP83SDIxCluNZ5iM8t1suSn3dFhk5bDFChS86XGuqQe
 XxHQ6JTF
 =r5iq
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 6.0, take #1

- Fix unexpected sign extension of KVM_ARM_DEVICE_ID_MASK

- Tidy-up handling of AArch32 on asymmetric systems
2022-08-19 05:43:53 -04:00
Oliver Upton
f3c6efc72f KVM: arm64: Treat PMCR_EL1.LC as RES1 on asymmetric systems
KVM does not support AArch32 on asymmetric systems. To that end, enforce
AArch64-only behavior on PMCR_EL1.LC when on an asymmetric system.

Fixes: 2122a83331 ("arm64: Allow mismatched 32-bit EL0 support")
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220816192554.1455559-2-oliver.upton@linux.dev
2022-08-17 10:29:07 +01:00
Linus Torvalds
7c5c3a6177 ARM:
* Unwinder implementations for both nVHE modes (classic and
   protected), complete with an overflow stack
 
 * Rework of the sysreg access from userspace, with a complete
   rewrite of the vgic-v3 view to allign with the rest of the
   infrastructure
 
 * Disagregation of the vcpu flags in separate sets to better track
   their use model.
 
 * A fix for the GICv2-on-v3 selftest
 
 * A small set of cosmetic fixes
 
 RISC-V:
 
 * Track ISA extensions used by Guest using bitmap
 
 * Added system instruction emulation framework
 
 * Added CSR emulation framework
 
 * Added gfp_custom flag in struct kvm_mmu_memory_cache
 
 * Added G-stage ioremap() and iounmap() functions
 
 * Added support for Svpbmt inside Guest
 
 s390:
 
 * add an interface to provide a hypervisor dump for secure guests
 
 * improve selftests to use TAP interface
 
 * enable interpretive execution of zPCI instructions (for PCI passthrough)
 
 * First part of deferred teardown
 
 * CPU Topology
 
 * PV attestation
 
 * Minor fixes
 
 x86:
 
 * Permit guests to ignore single-bit ECC errors
 
 * Intel IPI virtualization
 
 * Allow getting/setting pending triple fault with KVM_GET/SET_VCPU_EVENTS
 
 * PEBS virtualization
 
 * Simplify PMU emulation by just using PERF_TYPE_RAW events
 
 * More accurate event reinjection on SVM (avoid retrying instructions)
 
 * Allow getting/setting the state of the speaker port data bit
 
 * Refuse starting the kvm-intel module if VM-Entry/VM-Exit controls are inconsistent
 
 * "Notify" VM exit (detect microarchitectural hangs) for Intel
 
 * Use try_cmpxchg64 instead of cmpxchg64
 
 * Ignore benign host accesses to PMU MSRs when PMU is disabled
 
 * Allow disabling KVM's "MONITOR/MWAIT are NOPs!" behavior
 
 * Allow NX huge page mitigation to be disabled on a per-vm basis
 
 * Port eager page splitting to shadow MMU as well
 
 * Enable CMCI capability by default and handle injected UCNA errors
 
 * Expose pid of vcpu threads in debugfs
 
 * x2AVIC support for AMD
 
 * cleanup PIO emulation
 
 * Fixes for LLDT/LTR emulation
 
 * Don't require refcounted "struct page" to create huge SPTEs
 
 * Miscellaneous cleanups:
 ** MCE MSR emulation
 ** Use separate namespaces for guest PTEs and shadow PTEs bitmasks
 ** PIO emulation
 ** Reorganize rmap API, mostly around rmap destruction
 ** Do not workaround very old KVM bugs for L0 that runs with nesting enabled
 ** new selftests API for CPUID
 
 Generic:
 
 * Fix races in gfn->pfn cache refresh; do not pin pages tracked by the cache
 
 * new selftests API using struct kvm_vcpu instead of a (vm, id) tuple
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmLnyo4UHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMtQQf/XjVWiRcWLPR9dqzRM/vvRXpiG+UL
 jU93R7m6ma99aqTtrxV/AE+kHgamBlma3Cwo+AcWk9uCVNbIhFjv2YKg6HptKU0e
 oJT3zRYp+XIjEo7Kfw+TwroZbTlG6gN83l1oBLFMqiFmHsMLnXSI2mm8MXyi3dNB
 vR2uIcTAl58KIprqNNsYJ2dNn74ogOMiXYx9XzoA9/5Xb6c0h4rreHJa5t+0s9RO
 Gz7Io3PxumgsbJngjyL1Ve5oxhlIAcZA8DU0PQmjxo3eS+k6BcmavGFd45gNL5zg
 iLpCh4k86spmzh8CWkAAwWPQE4dZknK6jTctJc0OFVad3Z7+X7n0E8TFrA==
 =PM8o
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "Quite a large pull request due to a selftest API overhaul and some
  patches that had come in too late for 5.19.

  ARM:

   - Unwinder implementations for both nVHE modes (classic and
     protected), complete with an overflow stack

   - Rework of the sysreg access from userspace, with a complete rewrite
     of the vgic-v3 view to allign with the rest of the infrastructure

   - Disagregation of the vcpu flags in separate sets to better track
     their use model.

   - A fix for the GICv2-on-v3 selftest

   - A small set of cosmetic fixes

  RISC-V:

   - Track ISA extensions used by Guest using bitmap

   - Added system instruction emulation framework

   - Added CSR emulation framework

   - Added gfp_custom flag in struct kvm_mmu_memory_cache

   - Added G-stage ioremap() and iounmap() functions

   - Added support for Svpbmt inside Guest

  s390:

   - add an interface to provide a hypervisor dump for secure guests

   - improve selftests to use TAP interface

   - enable interpretive execution of zPCI instructions (for PCI
     passthrough)

   - First part of deferred teardown

   - CPU Topology

   - PV attestation

   - Minor fixes

  x86:

   - Permit guests to ignore single-bit ECC errors

   - Intel IPI virtualization

   - Allow getting/setting pending triple fault with
     KVM_GET/SET_VCPU_EVENTS

   - PEBS virtualization

   - Simplify PMU emulation by just using PERF_TYPE_RAW events

   - More accurate event reinjection on SVM (avoid retrying
     instructions)

   - Allow getting/setting the state of the speaker port data bit

   - Refuse starting the kvm-intel module if VM-Entry/VM-Exit controls
     are inconsistent

   - "Notify" VM exit (detect microarchitectural hangs) for Intel

   - Use try_cmpxchg64 instead of cmpxchg64

   - Ignore benign host accesses to PMU MSRs when PMU is disabled

   - Allow disabling KVM's "MONITOR/MWAIT are NOPs!" behavior

   - Allow NX huge page mitigation to be disabled on a per-vm basis

   - Port eager page splitting to shadow MMU as well

   - Enable CMCI capability by default and handle injected UCNA errors

   - Expose pid of vcpu threads in debugfs

   - x2AVIC support for AMD

   - cleanup PIO emulation

   - Fixes for LLDT/LTR emulation

   - Don't require refcounted "struct page" to create huge SPTEs

   - Miscellaneous cleanups:
      - MCE MSR emulation
      - Use separate namespaces for guest PTEs and shadow PTEs bitmasks
      - PIO emulation
      - Reorganize rmap API, mostly around rmap destruction
      - Do not workaround very old KVM bugs for L0 that runs with nesting enabled
      - new selftests API for CPUID

  Generic:

   - Fix races in gfn->pfn cache refresh; do not pin pages tracked by
     the cache

   - new selftests API using struct kvm_vcpu instead of a (vm, id)
     tuple"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (606 commits)
  selftests: kvm: set rax before vmcall
  selftests: KVM: Add exponent check for boolean stats
  selftests: KVM: Provide descriptive assertions in kvm_binary_stats_test
  selftests: KVM: Check stat name before other fields
  KVM: x86/mmu: remove unused variable
  RISC-V: KVM: Add support for Svpbmt inside Guest/VM
  RISC-V: KVM: Use PAGE_KERNEL_IO in kvm_riscv_gstage_ioremap()
  RISC-V: KVM: Add G-stage ioremap() and iounmap() functions
  KVM: Add gfp_custom flag in struct kvm_mmu_memory_cache
  RISC-V: KVM: Add extensible CSR emulation framework
  RISC-V: KVM: Add extensible system instruction emulation framework
  RISC-V: KVM: Factor-out instruction emulation into separate sources
  RISC-V: KVM: move preempt_disable() call in kvm_arch_vcpu_ioctl_run
  RISC-V: KVM: Make kvm_riscv_guest_timer_init a void function
  RISC-V: KVM: Fix variable spelling mistake
  RISC-V: KVM: Improve ISA extension by using a bitmap
  KVM, x86/mmu: Fix the comment around kvm_tdp_mmu_zap_leafs()
  KVM: SVM: Dump Virtual Machine Save Area (VMSA) to klog
  KVM: x86/mmu: Treat NX as a valid SPTE bit for NPT
  KVM: x86: Do not block APIC write for non ICR registers
  ...
2022-08-04 14:59:54 -07:00
Marc Zyngier
ae98a4a989 Merge branch kvm-arm64/sysreg-cleanup-5.20 into kvmarm-master/next
* kvm-arm64/sysreg-cleanup-5.20:
  : .
  : Long overdue cleanup of the sysreg userspace access,
  : with extra scrubbing on the vgic side of things.
  : From the cover letter:
  :
  : "Schspa Shi recently reported[1] that some of the vgic code interacting
  : with userspace was reading uninitialised stack memory, and although
  : that read wasn't used any further, it prompted me to revisit this part
  : of the code.
  :
  : Needless to say, this area of the kernel is pretty crufty, and shows a
  : bunch of issues in other parts of the KVM/arm64 infrastructure. This
  : series tries to remedy a bunch of them:
  :
  : - Sanitise the way we deal with sysregs from userspace: at the moment,
  :   each and every .set_user/.get_user callback has to implement its own
  :   userspace accesses (directly or indirectly). It'd be much better if
  :   that was centralised so that we can reason about it.
  :
  : - Enforce that all AArch64 sysregs are 64bit. Always. This was sort of
  :   implied by the code, but it took some effort to convince myself that
  :   this was actually the case.
  :
  : - Move the vgic-v3 sysreg userspace accessors to the userspace
  :   callbacks instead of hijacking the vcpu trap callback. This allows
  :   us to reuse the sysreg infrastructure.
  :
  : - Consolidate userspace accesses for both GICv2, GICv3 and common code
  :   as much as possible.
  :
  : - Cleanup a bunch of not-very-useful helpers, tidy up some of the code
  :   as we touch it.
  :
  : [1] https://lore.kernel.org/r/m2h740zz1i.fsf@gmail.com"
  : .
  KVM: arm64: Get rid or outdated comments
  KVM: arm64: Descope kvm_arm_sys_reg_{get,set}_reg()
  KVM: arm64: Get rid of find_reg_by_id()
  KVM: arm64: vgic: Tidy-up calls to vgic_{get,set}_common_attr()
  KVM: arm64: vgic: Consolidate userspace access for base address setting
  KVM: arm64: vgic-v2: Add helper for legacy dist/cpuif base address setting
  KVM: arm64: vgic: Use {get,put}_user() instead of copy_{from.to}_user
  KVM: arm64: vgic-v2: Consolidate userspace access for MMIO registers
  KVM: arm64: vgic-v3: Consolidate userspace access for MMIO registers
  KVM: arm64: vgic-v3: Use u32 to manage the line level from userspace
  KVM: arm64: vgic-v3: Convert userspace accessors over to FIELD_GET/FIELD_PREP
  KVM: arm64: vgic-v3: Make the userspace accessors use sysreg API
  KVM: arm64: vgic-v3: Push user access into vgic_v3_cpu_sysregs_uaccess()
  KVM: arm64: vgic-v3: Simplify vgic_v3_has_cpu_sysregs_attr()
  KVM: arm64: Get rid of reg_from/to_user()
  KVM: arm64: Consolidate sysreg userspace accesses
  KVM: arm64: Rely on index_to_param() for size checks on userspace access
  KVM: arm64: Introduce generic get_user/set_user helpers for system registers
  KVM: arm64: Reorder handling of invariant sysregs from userspace
  KVM: arm64: Add get_reg_by_id() as a sys_reg_desc retrieving helper

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-07-17 11:55:58 +01:00
Marc Zyngier
4274d42716 KVM: arm64: Get rid or outdated comments
Once apon a time, the 32bit KVM/arm port was the reference, while
the arm64 version was the new kid on the block, without a clear
future... This was a long time ago.

"The times, they are a-changing."

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-07-17 11:55:34 +01:00
Marc Zyngier
f6dddbb255 KVM: arm64: Get rid of find_reg_by_id()
This helper doesn't have a user anymore, let's get rid of it.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-07-17 11:55:33 +01:00
Marc Zyngier
5a420ed964 KVM: arm64: Get rid of reg_from/to_user()
These helpers are only used by the invariant stuff now, and while
they pretend to support non-64bit registers, this only serves as
a way to scare the casual reviewer...

Replace these helpers with our good friends get/put_user(), and
don't look back.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-07-17 11:55:33 +01:00
Marc Zyngier
978ceeb3e4 KVM: arm64: Consolidate sysreg userspace accesses
Until now, the .set_user and .get_user callbacks have to implement
(directly or not) the userspace memory accesses. Although this gives
us maximem flexibility, this is also a maintenance burden, making it
hard to audit, and I'd feel much better if it was all located in
a single place.

So let's do just that, simplifying most of the function signatures
in the process (the callbacks are now only concerned with the
data itself, and not with userspace).

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-07-17 11:55:33 +01:00
Marc Zyngier
e48407ff97 KVM: arm64: Rely on index_to_param() for size checks on userspace access
index_to_param() already checks that we use 64bit accesses for all
registers accessed from userspace.

However, we have extra checks in other places (such as index_to_params),
which is pretty confusing. Get rid off these redundant checks.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-07-17 11:55:33 +01:00
Marc Zyngier
ba23aec9f4 KVM: arm64: Introduce generic get_user/set_user helpers for system registers
The userspace access to the system registers is done using helpers
that hardcode the table that is looked up. extract some generic
helpers from this, moving the handling of hidden sysregs into
the core code.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-07-17 11:55:33 +01:00
Marc Zyngier
1deeffb559 KVM: arm64: Reorder handling of invariant sysregs from userspace
In order to allow some further refactor of the sysreg helpers,
move the handling of invariant sysreg to occur before we handle
all the other ones.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-07-17 11:55:33 +01:00
Marc Zyngier
da8d120fba KVM: arm64: Add get_reg_by_id() as a sys_reg_desc retrieving helper
find_reg_by_id() requires a sys_reg_param as input, which most
users provide as a on-stack variable, but don't make any use of
the result.

Provide a helper that doesn't have this requirement and simplify
the callers (all but one).

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-07-17 11:55:32 +01:00
Mark Brown
b2d71f275d arm64/sysreg: Add _EL1 into ID_AA64ISAR2_EL1 definition names
Normally we include the full register name in the defines for fields within
registers but this has not been followed for ID registers. In preparation
for automatic generation of defines add the _EL1s into the defines for
ID_AA64ISAR2_EL1 to follow the convention. No functional changes.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220704170302.2609529-17-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-07-05 11:45:46 +01:00
Mark Brown
aa50479b4f arm64/sysreg: Add _EL1 into ID_AA64ISAR1_EL1 definition names
Normally we include the full register name in the defines for fields within
registers but this has not been followed for ID registers. In preparation
for automatic generation of defines add the _EL1s into the defines for
ID_AA64ISAR1_EL1 to follow the convention. No functional changes.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220704170302.2609529-16-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-07-05 11:45:46 +01:00
Mark Brown
9a2f3290bb arm64/sysreg: Standardise naming for WFxT defines
The defines for WFxT refer to the feature as WFXT and use SUPPORTED rather
than IMP. In preparation for automatic generation of defines update these
to be more standard. No functional changes.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220704170302.2609529-12-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-07-05 11:45:46 +01:00
Marc Zyngier
30b6ab45f8 KVM: arm64: Convert vcpu sysregs_loaded_on_cpu to a state flag
The aptly named boolean 'sysregs_loaded_on_cpu' tracks whether
some of the vcpu system registers are resident on the physical
CPU when running in VHE mode.

This is obviously a flag in hidding, so let's convert it to
a state flag, since this is solely a host concern (the hypervisor
itself always knows which state we're in).

Reviewed-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-06-29 10:23:32 +01:00
Marc Zyngier
b1da49088a KVM: arm64: Move vcpu debug/SPE/TRBE flags to the input flag set
The three debug flags (which deal with the debug registers, SPE and
TRBE) all are input flags to the hypervisor code.

Move them into the input set and convert them to the new accessors.

Reviewed-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-06-29 10:23:03 +01:00
Linus Torvalds
bf9095424d S390:
* ultravisor communication device driver
 
 * fix TEID on terminating storage key ops
 
 RISC-V:
 
 * Added Sv57x4 support for G-stage page table
 
 * Added range based local HFENCE functions
 
 * Added remote HFENCE functions based on VCPU requests
 
 * Added ISA extension registers in ONE_REG interface
 
 * Updated KVM RISC-V maintainers entry to cover selftests support
 
 ARM:
 
 * Add support for the ARMv8.6 WFxT extension
 
 * Guard pages for the EL2 stacks
 
 * Trap and emulate AArch32 ID registers to hide unsupported features
 
 * Ability to select and save/restore the set of hypercalls exposed
   to the guest
 
 * Support for PSCI-initiated suspend in collaboration with userspace
 
 * GICv3 register-based LPI invalidation support
 
 * Move host PMU event merging into the vcpu data structure
 
 * GICv3 ITS save/restore fixes
 
 * The usual set of small-scale cleanups and fixes
 
 x86:
 
 * New ioctls to get/set TSC frequency for a whole VM
 
 * Allow userspace to opt out of hypercall patching
 
 * Only do MSR filtering for MSRs accessed by rdmsr/wrmsr
 
 AMD SEV improvements:
 
 * Add KVM_EXIT_SHUTDOWN metadata for SEV-ES
 
 * V_TSC_AUX support
 
 Nested virtualization improvements for AMD:
 
 * Support for "nested nested" optimizations (nested vVMLOAD/VMSAVE,
   nested vGIF)
 
 * Allow AVIC to co-exist with a nested guest running
 
 * Fixes for LBR virtualizations when a nested guest is running,
   and nested LBR virtualization support
 
 * PAUSE filtering for nested hypervisors
 
 Guest support:
 
 * Decoupling of vcpu_is_preempted from PV spinlocks
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmKN9M4UHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNLeAf+KizAlQwxEehHHeNyTkZuKyMawrD6
 zsqAENR6i1TxiXe7fDfPFbO2NR0ZulQopHbD9mwnHJ+nNw0J4UT7g3ii1IAVcXPu
 rQNRGMVWiu54jt+lep8/gDg0JvPGKVVKLhxUaU1kdWT9PhIOC6lwpP3vmeWkUfRi
 PFL/TMT0M8Nfryi0zHB0tXeqg41BiXfqO8wMySfBAHUbpv8D53D2eXQL6YlMM0pL
 2quB1HxHnpueE5vj3WEPQ3PCdy1M2MTfCDBJAbZGG78Ljx45FxSGoQcmiBpPnhJr
 C6UGP4ZDWpml5YULUoA70k5ylCbP+vI61U4vUtzEiOjHugpPV5wFKtx5nw==
 =ozWx
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "S390:

   - ultravisor communication device driver

   - fix TEID on terminating storage key ops

  RISC-V:

   - Added Sv57x4 support for G-stage page table

   - Added range based local HFENCE functions

   - Added remote HFENCE functions based on VCPU requests

   - Added ISA extension registers in ONE_REG interface

   - Updated KVM RISC-V maintainers entry to cover selftests support

  ARM:

   - Add support for the ARMv8.6 WFxT extension

   - Guard pages for the EL2 stacks

   - Trap and emulate AArch32 ID registers to hide unsupported features

   - Ability to select and save/restore the set of hypercalls exposed to
     the guest

   - Support for PSCI-initiated suspend in collaboration with userspace

   - GICv3 register-based LPI invalidation support

   - Move host PMU event merging into the vcpu data structure

   - GICv3 ITS save/restore fixes

   - The usual set of small-scale cleanups and fixes

  x86:

   - New ioctls to get/set TSC frequency for a whole VM

   - Allow userspace to opt out of hypercall patching

   - Only do MSR filtering for MSRs accessed by rdmsr/wrmsr

  AMD SEV improvements:

   - Add KVM_EXIT_SHUTDOWN metadata for SEV-ES

   - V_TSC_AUX support

  Nested virtualization improvements for AMD:

   - Support for "nested nested" optimizations (nested vVMLOAD/VMSAVE,
     nested vGIF)

   - Allow AVIC to co-exist with a nested guest running

   - Fixes for LBR virtualizations when a nested guest is running, and
     nested LBR virtualization support

   - PAUSE filtering for nested hypervisors

  Guest support:

   - Decoupling of vcpu_is_preempted from PV spinlocks"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (199 commits)
  KVM: x86: Fix the intel_pt PMI handling wrongly considered from guest
  KVM: selftests: x86: Sync the new name of the test case to .gitignore
  Documentation: kvm: reorder ARM-specific section about KVM_SYSTEM_EVENT_SUSPEND
  x86, kvm: use correct GFP flags for preemption disabled
  KVM: LAPIC: Drop pending LAPIC timer injection when canceling the timer
  x86/kvm: Alloc dummy async #PF token outside of raw spinlock
  KVM: x86: avoid calling x86 emulator without a decoded instruction
  KVM: SVM: Use kzalloc for sev ioctl interfaces to prevent kernel data leak
  x86/fpu: KVM: Set the base guest FPU uABI size to sizeof(struct kvm_xsave)
  s390/uv_uapi: depend on CONFIG_S390
  KVM: selftests: x86: Fix test failure on arch lbr capable platforms
  KVM: LAPIC: Trace LAPIC timer expiration on every vmentry
  KVM: s390: selftest: Test suppression indication on key prot exception
  KVM: s390: Don't indicate suppression on dirtying, failing memop
  selftests: drivers/s390x: Add uvdevice tests
  drivers/s390/char: Add Ultravisor io device
  MAINTAINERS: Update KVM RISC-V entry to cover selftests support
  RISC-V: KVM: Introduce ISA extension register
  RISC-V: KVM: Cleanup stale TLB entries when host CPU changes
  RISC-V: KVM: Add remote HFENCE functions based on VCPU requests
  ...
2022-05-26 14:20:14 -07:00
Linus Torvalds
143a6252e1 arm64 updates for 5.19:
- Initial support for the ARMv9 Scalable Matrix Extension (SME). SME
   takes the approach used for vectors in SVE and extends this to provide
   architectural support for matrix operations. No KVM support yet, SME
   is disabled in guests.
 
 - Support for crashkernel reservations above ZONE_DMA via the
   'crashkernel=X,high' command line option.
 
 - btrfs search_ioctl() fix for live-lock with sub-page faults.
 
 - arm64 perf updates: support for the Hisilicon "CPA" PMU for monitoring
   coherent I/O traffic, support for Arm's CMN-650 and CMN-700
   interconnect PMUs, minor driver fixes, kerneldoc cleanup.
 
 - Kselftest updates for SME, BTI, MTE.
 
 - Automatic generation of the system register macros from a 'sysreg'
   file describing the register bitfields.
 
 - Update the type of the function argument holding the ESR_ELx register
   value to unsigned long to match the architecture register size
   (originally 32-bit but extended since ARMv8.0).
 
 - stacktrace cleanups.
 
 - ftrace cleanups.
 
 - Miscellaneous updates, most notably: arm64-specific huge_ptep_get(),
   avoid executable mappings in kexec/hibernate code, drop TLB flushing
   from get_clear_flush() (and rename it to get_clear_contig()),
   ARCH_NR_GPIO bumped to 2048 for ARCH_APPLE.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmKH19IACgkQa9axLQDI
 XvEFWg//bf0p6zjeNaOJmBbyVFsXsVyYiEaLUpFPUs3oB+81s2YZ+9i1rgMrNCft
 EIDQ9+/HgScKxJxnzWf68heMdcBDbk76VJtLALExbge6owFsjByQDyfb/b3v/bLd
 ezAcGzc6G5/FlI1IP7ct4Z9MnQry4v5AG8lMNAHjnf6GlBS/tYNAqpmj8HpQfgRQ
 ZbhfZ8Ayu3TRSLWL39NHVevpmxQm/bGcpP3Q9TtjUqg0r1FQ5sK/LCqOksueIAzT
 UOgUVYWSFwTpLEqbYitVqgERQp9LiLoK5RmNYCIEydfGM7+qmgoxofSq5e2hQtH2
 SZM1XilzsZctRbBbhMit1qDBqMlr/XAy/R5FO0GauETVKTaBhgtj6mZGyeC9nU/+
 RGDljaArbrOzRwMtSuXF+Fp6uVo5spyRn1m8UT/k19lUTdrV9z6EX5Fzuc4Mnhed
 oz4iokbl/n8pDObXKauQspPA46QpxUYhrAs10B/ELc3yyp/Qj3jOfzYHKDNFCUOq
 HC9mU+YiO9g2TbYgCrrFM6Dah2E8fU6/cR0ZPMeMgWK4tKa+6JMEINYEwak9e7M+
 8lZnvu3ntxiJLN+PrPkiPyG+XBh2sux1UfvNQ+nw4Oi9xaydeX7PCbQVWmzTFmHD
 q7UPQ8220e2JNCha9pULS8cxDLxiSksce06DQrGXwnHc1Ir7T04=
 =0DjE
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - Initial support for the ARMv9 Scalable Matrix Extension (SME).

   SME takes the approach used for vectors in SVE and extends this to
   provide architectural support for matrix operations. No KVM support
   yet, SME is disabled in guests.

 - Support for crashkernel reservations above ZONE_DMA via the
   'crashkernel=X,high' command line option.

 - btrfs search_ioctl() fix for live-lock with sub-page faults.

 - arm64 perf updates: support for the Hisilicon "CPA" PMU for
   monitoring coherent I/O traffic, support for Arm's CMN-650 and
   CMN-700 interconnect PMUs, minor driver fixes, kerneldoc cleanup.

 - Kselftest updates for SME, BTI, MTE.

 - Automatic generation of the system register macros from a 'sysreg'
   file describing the register bitfields.

 - Update the type of the function argument holding the ESR_ELx register
   value to unsigned long to match the architecture register size
   (originally 32-bit but extended since ARMv8.0).

 - stacktrace cleanups.

 - ftrace cleanups.

 - Miscellaneous updates, most notably: arm64-specific huge_ptep_get(),
   avoid executable mappings in kexec/hibernate code, drop TLB flushing
   from get_clear_flush() (and rename it to get_clear_contig()),
   ARCH_NR_GPIO bumped to 2048 for ARCH_APPLE.

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (145 commits)
  arm64/sysreg: Generate definitions for FAR_ELx
  arm64/sysreg: Generate definitions for DACR32_EL2
  arm64/sysreg: Generate definitions for CSSELR_EL1
  arm64/sysreg: Generate definitions for CPACR_ELx
  arm64/sysreg: Generate definitions for CONTEXTIDR_ELx
  arm64/sysreg: Generate definitions for CLIDR_EL1
  arm64/sve: Move sve_free() into SVE code section
  arm64: Kconfig.platforms: Add comments
  arm64: Kconfig: Fix indentation and add comments
  arm64: mm: avoid writable executable mappings in kexec/hibernate code
  arm64: lds: move special code sections out of kernel exec segment
  arm64/hugetlb: Implement arm64 specific huge_ptep_get()
  arm64/hugetlb: Use ptep_get() to get the pte value of a huge page
  arm64: kdump: Do not allocate crash low memory if not needed
  arm64/sve: Generate ZCR definitions
  arm64/sme: Generate defintions for SVCR
  arm64/sme: Generate SMPRI_EL1 definitions
  arm64/sme: Automatically generate SMPRIMAP_EL2 definitions
  arm64/sme: Automatically generate SMIDR_EL1 defines
  arm64/sme: Automatically generate defines for SMCR
  ...
2022-05-23 21:06:11 -07:00
Catalin Marinas
0616ea3f1b Merge branch 'for-next/esr-elx-64-bit' into for-next/core
* for-next/esr-elx-64-bit:
  : Treat ESR_ELx as a 64-bit register.
  KVM: arm64: uapi: Add kvm_debug_exit_arch.hsr_high
  KVM: arm64: Treat ESR_EL2 as a 64-bit register
  arm64: Treat ESR_ELx as a 64-bit register
  arm64: compat: Do not treat syscall number as ESR_ELx for a bad syscall
  arm64: Make ESR_ELx_xVC_IMM_MASK compatible with assembly
2022-05-20 18:51:54 +01:00
Mark Brown
ec0067a63e arm64/sme: Remove _EL0 from name of SVCR - FIXME sysreg.h
The defines for SVCR call it SVCR_EL0 however the architecture calls the
register SVCR with no _EL0 suffix. In preparation for generating the sysreg
definitions rename to match the architecture, no functional change.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220510161208.631259-6-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16 19:50:20 +01:00
Marc Zyngier
822ca7f82b Merge branch kvm-arm64/misc-5.19 into kvmarm-master/next
* kvm-arm64/misc-5.19:
  : .
  : Misc fixes and general improvements for KVMM/arm64:
  :
  : - Better handle out of sequence sysregs in the global tables
  :
  : - Remove a couple of unnecessary loads from constant pool
  :
  : - Drop unnecessary pKVM checks
  :
  : - Add all known M1 implementations to the SEIS workaround
  :
  : - Cleanup kerneldoc warnings
  : .
  KVM: arm64: vgic-v3: List M1 Pro/Max as requiring the SEIS workaround
  KVM: arm64: pkvm: Don't mask already zeroed FEAT_SVE
  KVM: arm64: pkvm: Drop unnecessary FP/SIMD trap handler
  KVM: arm64: nvhe: Eliminate kernel-doc warnings
  KVM: arm64: Avoid unnecessary absolute addressing via literals
  KVM: arm64: Print emulated register table name when it is unsorted
  KVM: arm64: Don't BUG_ON() if emulated register table is unsorted

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-05-16 17:48:36 +01:00
Marc Zyngier
5163373af1 KVM: arm64: vgic-v3: Consistently populate ID_AA64PFR0_EL1.GIC
When adding support for the slightly wonky Apple M1, we had to
populate ID_AA64PFR0_EL1.GIC==1 to present something to the guest,
as the HW itself doesn't advertise the feature.

However, we gated this on the in-kernel irqchip being created.
This causes some trouble for QEMU, which snapshots the state of
the registers before creating a virtual GIC, and then tries to
restore these registers once the GIC has been created.  Obviously,
between the two stages, ID_AA64PFR0_EL1.GIC has changed value,
and the write fails.

The fix is to actually emulate the HW, and always populate the
field if the HW is capable of it.

Fixes: 562e530fd7 ("KVM: arm64: Force ID_AA64PFR0_EL1.GIC=1 when exposing a virtual GICv3")
Cc: stable@vger.kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Oliver Upton <oupton@google.com>
Link: https://lore.kernel.org/r/20220503211424.3375263-1-maz@kernel.org
2022-05-15 12:12:14 +01:00
Alexandru Elisei
325031d4f3 KVM: arm64: Print emulated register table name when it is unsorted
When a sysreg table entry is out-of-order, KVM attempts to print the
address of the table:

[    0.143911] kvm [1]: sys_reg table (____ptrval____) out of order (1)

Printing the name of the table instead of a pointer is more helpful in this
case. The message has also been slightly tweaked to be point out the
offending entry (and to match the missing reset error message):

[    0.143891] kvm [1]: sys_reg table sys_reg_descs+0x50/0x7490 entry 1 out of order

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220428103405.70884-3-alexandru.elisei@arm.com
2022-05-04 17:52:56 +01:00
Alexandru Elisei
f1f0c0cfea KVM: arm64: Don't BUG_ON() if emulated register table is unsorted
To emulate a register access, KVM uses a table of registers sorted by
register encoding to speed up queries using binary search.

When Linux boots, KVM checks that the table is sorted and uses a BUG_ON()
statement to let the user know if it's not. The unfortunate side effect is
that an unsorted sysreg table brings down the whole kernel, not just KVM,
even though the rest of the kernel can function just fine without KVM. To
make matters worse, on machines which lack a serial console, the user is
left pondering why the machine is taking so long to boot.

Improve this situation by returning an error from kvm_arch_init() if the
sysreg tables are not in the correct order. The machine is still very much
usable for the user, with the exception of virtualization, who can now
easily determine what went wrong.

A minor typo has also been corrected in the check_sysreg_table() function.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220428103405.70884-2-alexandru.elisei@arm.com
2022-05-04 17:52:50 +01:00
Marc Zyngier
d25f30fe41 Merge branch kvm-arm64/aarch32-idreg-trap into kvmarm-master/next
* kvm-arm64/aarch32-idreg-trap:
  : .
  : Add trapping/sanitising infrastructure for AArch32 systen registers,
  : allowing more control over what we actually expose (such as the PMU).
  :
  : Patches courtesy of Oliver and Alexandru.
  : .
  KVM: arm64: Fix new instances of 32bit ESRs
  KVM: arm64: Hide AArch32 PMU registers when not available
  KVM: arm64: Start trapping ID registers for 32 bit guests
  KVM: arm64: Plumb cp10 ID traps through the AArch64 sysreg handler
  KVM: arm64: Wire up CP15 feature registers to their AArch64 equivalents
  KVM: arm64: Don't write to Rt unless sys_reg emulation succeeds
  KVM: arm64: Return a bool from emulate_cp()

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-05-04 09:42:45 +01:00
Marc Zyngier
b2c4caf331 Merge branch kvm-arm64/wfxt into kvmarm-master/next
* kvm-arm64/wfxt:
  : .
  : Add support for the WFET/WFIT instructions that provide the same
  : service as WFE/WFI, only with a timeout.
  : .
  KVM: arm64: Expose the WFXT feature to guests
  KVM: arm64: Offer early resume for non-blocking WFxT instructions
  KVM: arm64: Handle blocking WFIT instruction
  KVM: arm64: Introduce kvm_counter_compute_delta() helper
  KVM: arm64: Simplify kvm_cpu_has_pending_timer()
  arm64: Use WFxT for __delay() when possible
  arm64: Add wfet()/wfit() helpers
  arm64: Add HWCAP advertising FEAT_WFXT
  arm64: Add RV and RN fields for ESR_ELx_WFx_ISS
  arm64: Expand ESR_ELx_WFx_ISS_TI to match its ARMv8.7 definition

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-05-04 09:42:16 +01:00
Marc Zyngier
ee87a9bd65 KVM: arm64: Fix new instances of 32bit ESRs
Fix the new instances of ESR being described as a u32, now that
we consistently are using a u64 for this register.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-05-04 08:01:05 +01:00
Alexandru Elisei
a9e192cd4f KVM: arm64: Hide AArch32 PMU registers when not available
commit 11663111cd ("KVM: arm64: Hide PMU registers from userspace when
not available") hid the AArch64 PMU registers from userspace and guest
when the PMU VCPU feature was not set. Do the same when the PMU
registers are accessed by an AArch32 guest. While we're at it, rename
the previously unused AA32_ZEROHIGH to AA32_DIRECT to match the behavior
of get_access_mask().

Now that KVM emulates ID_DFR0 and hides the PMU from the guest when the
feature is not set, it is safe to inject to inject an undefined exception
when the PMU is not present, as that corresponds to the architected
behaviour.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
[Oliver - Add AA32_DIRECT to match the zero value of the enum]
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220503060205.2823727-7-oupton@google.com
2022-05-03 11:17:41 +01:00
Oliver Upton
9369bc5c5e KVM: arm64: Plumb cp10 ID traps through the AArch64 sysreg handler
In order to enable HCR_EL2.TID3 for AArch32 guests KVM needs to handle
traps where ESR_EL2.EC=0x8, which corresponds to an attempted VMRS
access from an ID group register. Specifically, the MVFR{0-2} registers
are accessed this way from AArch32. Conveniently, these registers are
architecturally mapped to MVFR{0-2}_EL1 in AArch64. Furthermore, KVM
already handles reads to these aliases in AArch64.

Plumb VMRS read traps through to the general AArch64 system register
handler.

Signed-off-by: Oliver Upton <oupton@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220503060205.2823727-5-oupton@google.com
2022-05-03 11:14:34 +01:00
Oliver Upton
e651976667 KVM: arm64: Wire up CP15 feature registers to their AArch64 equivalents
KVM currently does not trap ID register accesses from an AArch32 EL1.
This is painful for a couple of reasons. Certain unimplemented features
are visible to AArch32 EL1, as we limit PMU to version 3 and the debug
architecture to v8.0. Additionally, we attempt to paper over
heterogeneous systems by using register values that are safe
system-wide. All this hard work is completely sidestepped because KVM
does not set TID3 for AArch32 guests.

Fix up handling of CP15 feature registers by simply rerouting to their
AArch64 aliases. Punt setting HCR_EL2.TID3 to a later change, as we need
to fix up the oddball CP10 feature registers still.

Signed-off-by: Oliver Upton <oupton@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220503060205.2823727-4-oupton@google.com
2022-05-03 11:14:33 +01:00
Oliver Upton
28eda7b5e8 KVM: arm64: Don't write to Rt unless sys_reg emulation succeeds
emulate_sys_reg() returns 1 unconditionally, even though a a system
register access can fail. Furthermore, kvm_handle_sys_reg() writes to Rt
for every register read, regardless of if it actually succeeded.

Though this pattern is safe (as params.regval is initialized with the
current value of Rt) it is a bit ugly. Indicate failure if the register
access could not be emulated and only write to Rt on success.

Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220503060205.2823727-3-oupton@google.com
2022-05-03 11:14:33 +01:00
Oliver Upton
001bb81999 KVM: arm64: Return a bool from emulate_cp()
KVM indicates success/failure in several ways, but generally an integer
is used when conditionally bouncing to userspace is involved. That is
not the case from emulate_cp(); just use a bool instead.

No functional change intended.

Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220503060205.2823727-2-oupton@google.com
2022-05-03 11:14:33 +01:00
Alexandru Elisei
0b12620fdd KVM: arm64: Treat ESR_EL2 as a 64-bit register
ESR_EL2 was defined as a 32-bit register in the initial release of the
ARM Architecture Manual for Armv8-A, and was later extended to 64 bits,
with bits [63:32] RES0. ARMv8.7 introduced FEAT_LS64, which makes use of
bits [36:32].

KVM treats ESR_EL1 as a 64-bit register when saving and restoring the
guest context, but ESR_EL2 is handled as a 32-bit register. Start
treating ESR_EL2 as a 64-bit register to allow KVM to make use of the
most significant 32 bits in the future.

The type chosen to represent ESR_EL2 is u64, as that is consistent with the
notation KVM overwhelmingly uses today (u32), and how the rest of the
registers are declared.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-5-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-29 19:26:27 +01:00
Mark Brown
90807748ca KVM: arm64: Hide SME system registers from guests
For the time being we do not support use of SME by KVM guests, support for
this will be enabled in future. In order to prevent any side effects or
side channels via the new system registers, including the EL0 read/write
register TPIDR2, explicitly undefine all the system registers added by
SME and mask out the SME bitfield in SYS_ID_AA64PFR1.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220419112247.711548-25-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-22 18:51:22 +01:00
Marc Zyngier
06e0b80258 KVM: arm64: Expose the WFXT feature to guests
Plumb in the capability, and expose WFxT to guests when available.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220419182755.601427-8-maz@kernel.org
2022-04-20 13:24:45 +01:00
Linus Torvalds
1ebdbeb03e ARM:
- Proper emulation of the OSLock feature of the debug architecture
 
 - Scalibility improvements for the MMU lock when dirty logging is on
 
 - New VMID allocator, which will eventually help with SVA in VMs
 
 - Better support for PMUs in heterogenous systems
 
 - PSCI 1.1 support, enabling support for SYSTEM_RESET2
 
 - Implement CONFIG_DEBUG_LIST at EL2
 
 - Make CONFIG_ARM64_ERRATUM_2077057 default y
 
 - Reduce the overhead of VM exit when no interrupt is pending
 
 - Remove traces of 32bit ARM host support from the documentation
 
 - Updated vgic selftests
 
 - Various cleanups, doc updates and spelling fixes
 
 RISC-V:
 
 - Prevent KVM_COMPAT from being selected
 
 - Optimize __kvm_riscv_switch_to() implementation
 
 - RISC-V SBI v0.3 support
 
 s390:
 
 - memop selftest
 
 - fix SCK locking
 
 - adapter interruptions virtualization for secure guests
 
 - add Claudio Imbrenda as maintainer
 
 - first step to do proper storage key checking
 
 x86:
 
 - Continue switching kvm_x86_ops to static_call(); introduce
   static_call_cond() and __static_call_ret0 when applicable.
 
 - Cleanup unused arguments in several functions
 
 - Synthesize AMD 0x80000021 leaf
 
 - Fixes and optimization for Hyper-V sparse-bank hypercalls
 
 - Implement Hyper-V's enlightened MSR bitmap for nested SVM
 
 - Remove MMU auditing
 
 - Eager splitting of page tables (new aka "TDP" MMU only) when dirty
   page tracking is enabled
 
 - Cleanup the implementation of the guest PGD cache
 
 - Preparation for the implementation of Intel IPI virtualization
 
 - Fix some segment descriptor checks in the emulator
 
 - Allow AMD AVIC support on systems with physical APIC ID above 255
 
 - Better API to disable virtualization quirks
 
 - Fixes and optimizations for the zapping of page tables:
 
   - Zap roots in two passes, avoiding RCU read-side critical sections
     that last too long for very large guests backed by 4 KiB SPTEs.
 
   - Zap invalid and defunct roots asynchronously via concurrency-managed
     work queue.
 
   - Allowing yielding when zapping TDP MMU roots in response to the root's
     last reference being put.
 
   - Batch more TLB flushes with an RCU trick.  Whoever frees the paging
     structure now holds RCU as a proxy for all vCPUs running in the guest,
     i.e. to prolongs the grace period on their behalf.  It then kicks the
     the vCPUs out of guest mode before doing rcu_read_unlock().
 
 Generic:
 
 - Introduce __vcalloc and use it for very large allocations that
   need memcg accounting
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmI4fdwUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMq8gf/WoeVHtw2QlL5Mmz6McvRRmPAYPLV
 wLUIFNrRqRvd8Tw4kivzZoh/xTpwmnojv0YdK5SjKAiMjgv094YI1LrNp1JSPvmL
 pitocMkA10RSJNWHeEMg9cMSKH0rKiqeYl6S1e2XsdB+UZZ2BINOCVtvglmjTAvJ
 dFBdKdBkqjAUZbdXAGIvz4JEEER3N/LkFDKGaUGX+0QIQOzGBPIyLTxynxIDG6mt
 RViCCFyXdy5NkVp5hZFm96vQ2qAlWL9B9+iKruQN++82+oqWbeTdSqPhdwF7GyFz
 BfOv3gobQ2c4ef/aMLO5LswZ9joI1t/4kQbbAn6dNybpOAz/NXfDnbNefg==
 =keox
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM:
   - Proper emulation of the OSLock feature of the debug architecture

   - Scalibility improvements for the MMU lock when dirty logging is on

   - New VMID allocator, which will eventually help with SVA in VMs

   - Better support for PMUs in heterogenous systems

   - PSCI 1.1 support, enabling support for SYSTEM_RESET2

   - Implement CONFIG_DEBUG_LIST at EL2

   - Make CONFIG_ARM64_ERRATUM_2077057 default y

   - Reduce the overhead of VM exit when no interrupt is pending

   - Remove traces of 32bit ARM host support from the documentation

   - Updated vgic selftests

   - Various cleanups, doc updates and spelling fixes

  RISC-V:
   - Prevent KVM_COMPAT from being selected

   - Optimize __kvm_riscv_switch_to() implementation

   - RISC-V SBI v0.3 support

  s390:
   - memop selftest

   - fix SCK locking

   - adapter interruptions virtualization for secure guests

   - add Claudio Imbrenda as maintainer

   - first step to do proper storage key checking

  x86:
   - Continue switching kvm_x86_ops to static_call(); introduce
     static_call_cond() and __static_call_ret0 when applicable.

   - Cleanup unused arguments in several functions

   - Synthesize AMD 0x80000021 leaf

   - Fixes and optimization for Hyper-V sparse-bank hypercalls

   - Implement Hyper-V's enlightened MSR bitmap for nested SVM

   - Remove MMU auditing

   - Eager splitting of page tables (new aka "TDP" MMU only) when dirty
     page tracking is enabled

   - Cleanup the implementation of the guest PGD cache

   - Preparation for the implementation of Intel IPI virtualization

   - Fix some segment descriptor checks in the emulator

   - Allow AMD AVIC support on systems with physical APIC ID above 255

   - Better API to disable virtualization quirks

   - Fixes and optimizations for the zapping of page tables:

      - Zap roots in two passes, avoiding RCU read-side critical
        sections that last too long for very large guests backed by 4
        KiB SPTEs.

      - Zap invalid and defunct roots asynchronously via
        concurrency-managed work queue.

      - Allowing yielding when zapping TDP MMU roots in response to the
        root's last reference being put.

      - Batch more TLB flushes with an RCU trick. Whoever frees the
        paging structure now holds RCU as a proxy for all vCPUs running
        in the guest, i.e. to prolongs the grace period on their behalf.
        It then kicks the the vCPUs out of guest mode before doing
        rcu_read_unlock().

  Generic:
   - Introduce __vcalloc and use it for very large allocations that need
     memcg accounting"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (246 commits)
  KVM: use kvcalloc for array allocations
  KVM: x86: Introduce KVM_CAP_DISABLE_QUIRKS2
  kvm: x86: Require const tsc for RT
  KVM: x86: synthesize CPUID leaf 0x80000021h if useful
  KVM: x86: add support for CPUID leaf 0x80000021
  KVM: x86: do not use KVM_X86_OP_OPTIONAL_RET0 for get_mt_mask
  Revert "KVM: x86/mmu: Zap only TDP MMU leafs in kvm_zap_gfn_range()"
  kvm: x86/mmu: Flush TLB before zap_gfn_range releases RCU
  KVM: arm64: fix typos in comments
  KVM: arm64: Generalise VM features into a set of flags
  KVM: s390: selftests: Add error memop tests
  KVM: s390: selftests: Add more copy memop tests
  KVM: s390: selftests: Add named stages for memop test
  KVM: s390: selftests: Add macro as abstraction for MEM_OP
  KVM: s390: selftests: Split memop tests
  KVM: s390x: fix SCK locking
  RISC-V: KVM: Implement SBI HSM suspend call
  RISC-V: KVM: Add common kvm_riscv_vcpu_wfi() function
  RISC-V: Add SBI HSM suspend related defines
  RISC-V: KVM: Implement SBI v0.3 SRST extension
  ...
2022-03-24 11:58:57 -07:00
Vladimir Murzin
def8c222f0 arm64: Add support of PAuth QARMA3 architected algorithm
QARMA3 is relaxed version of the QARMA5 algorithm which expected to
reduce the latency of calculation while still delivering a suitable
level of security.

Support for QARMA3 can be discovered via ID_AA64ISAR2_EL1

    APA3, bits [15:12] Indicates whether the QARMA3 algorithm is
                       implemented in the PE for address
                       authentication in AArch64 state.

    GPA3, bits [11:8]  Indicates whether the QARMA3 algorithm is
                       implemented in the PE for generic code
                       authentication in AArch64 state.

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220224124952.119612-4-vladimir.murzin@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-02-25 13:38:52 +00:00
Oliver Upton
7dabf02f43 KVM: arm64: Emulate the OS Lock
The OS lock blocks all debug exceptions at every EL. To date, KVM has
not implemented the OS lock for its guests, despite the fact that it is
mandatory per the architecture. Simple context switching between the
guest and host is not appropriate, as its effects are not constrained to
the guest context.

Emulate the OS Lock by clearing MDE and SS in MDSCR_EL1, thereby
blocking all but software breakpoint instructions.

Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220203174159.2887882-5-oupton@google.com
2022-02-08 14:23:41 +00:00
Oliver Upton
f24adc65c5 KVM: arm64: Allow guest to set the OSLK bit
Allow writes to OSLAR and forward the OSLK bit to OSLSR. Do nothing with
the value for now.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220203174159.2887882-4-oupton@google.com
2022-02-08 14:23:41 +00:00
Oliver Upton
d42e26716d KVM: arm64: Stash OSLSR_EL1 in the cpu context
An upcoming change to KVM will emulate the OS Lock from the PoV of the
guest. Add OSLSR_EL1 to the cpu context and handle reads using the
stored value. Define some mnemonics for for handling the OSLM field and
use them to make the reset value of OSLSR_EL1 more readable.

Wire up a custom handler for writes from userspace and prevent any of
the invariant bits from changing. Note that the OSLK bit is not
invariant and will be made writable by the aforementioned change.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220203174159.2887882-3-oupton@google.com
2022-02-08 14:23:41 +00:00
Oliver Upton
e2ffceaae5 KVM: arm64: Correctly treat writes to OSLSR_EL1 as undefined
Writes to OSLSR_EL1 are UNDEFINED and should never trap from EL1 to
EL2, but the kvm trap handler for OSLSR_EL1 handles writes via
ignore_write(). This is confusing to readers of code, but should have
no functional impact.

For clarity, use write_to_read_only() rather than ignore_write(). If a
trap is unexpectedly taken to EL2 in violation of the architecture, this
will WARN_ONCE() and inject an undef into the guest.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
[adopted Mark's changelog suggestion, thanks!]
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220203174159.2887882-2-oupton@google.com
2022-02-08 14:23:40 +00:00
Joey Gouly
9e45365f14 arm64: add ID_AA64ISAR2_EL1 sys register
This is a new ID register, introduced in 8.7.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Reiji Watanabe <reijiw@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211210165432.8106-3-joey.gouly@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-12-13 18:53:00 +00:00
Marc Zyngier
be08c3cf3c Merge branch kvm-arm64/pkvm/fixed-features into kvmarm-master/next
* kvm-arm64/pkvm/fixed-features: (22 commits)
  : .
  : Add the pKVM fixed feature that allows a bunch of exceptions
  : to either be forbidden or be easily handled at EL2.
  : .
  KVM: arm64: pkvm: Give priority to standard traps over pvm handling
  KVM: arm64: pkvm: Pass vpcu instead of kvm to kvm_get_exit_handler_array()
  KVM: arm64: pkvm: Move kvm_handle_pvm_restricted around
  KVM: arm64: pkvm: Consolidate include files
  KVM: arm64: pkvm: Preserve pending SError on exit from AArch32
  KVM: arm64: pkvm: Handle GICv3 traps as required
  KVM: arm64: pkvm: Drop sysregs that should never be routed to the host
  KVM: arm64: pkvm: Drop AArch32-specific registers
  KVM: arm64: pkvm: Make the ERR/ERX*_EL1 registers RAZ/WI
  KVM: arm64: pkvm: Use a single function to expose all id-regs
  KVM: arm64: Fix early exit ptrauth handling
  KVM: arm64: Handle protected guests at 32 bits
  KVM: arm64: Trap access to pVM restricted features
  KVM: arm64: Move sanitized copies of CPU features
  KVM: arm64: Initialize trap registers for protected VMs
  KVM: arm64: Add handlers for protected VM System Registers
  KVM: arm64: Simplify masking out MTE in feature id reg
  KVM: arm64: Add missing field descriptor for MDCR_EL2
  KVM: arm64: Pass struct kvm to per-EC handlers
  KVM: arm64: Move early handlers to per-EC handlers
  ...

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-10-18 17:20:50 +01:00
Marc Zyngier
20a3043075 Merge branch kvm-arm64/vgic-fixes-5.16 into kvmarm-master/next
* kvm-arm64/vgic-fixes-5.16:
  : .
  : Multiple updates to the GICv3 emulation in order to better support
  : the dreadful Apple M1 that only implements half of it, and in a
  : broken way...
  : .
  KVM: arm64: vgic-v3: Align emulated cpuif LPI state machine with the pseudocode
  KVM: arm64: vgic-v3: Don't advertise ICC_CTLR_EL1.SEIS
  KVM: arm64: vgic-v3: Reduce common group trapping to ICV_DIR_EL1 when possible
  KVM: arm64: vgic-v3: Work around GICv3 locally generated SErrors
  KVM: arm64: Force ID_AA64PFR0_EL1.GIC=1 when exposing a virtual GICv3

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-10-17 11:10:14 +01:00
Marc Zyngier
562e530fd7 KVM: arm64: Force ID_AA64PFR0_EL1.GIC=1 when exposing a virtual GICv3
Until now, we always let ID_AA64PFR0_EL1.GIC reflect the value
visible on the host, even if we were running a GICv2-enabled VM
on a GICv3+compat host.

That's fine, but we also now have the case of a host that does not
expose ID_AA64PFR0_EL1.GIC==1 despite having a vGIC. Yes, this is
confusing. Thank you M1.

Let's go back to first principles and expose ID_AA64PFR0_EL1.GIC=1
when a GICv3 is exposed to the guest. This also hides a GICv4.1
CPU interface from the guest which has no business knowing about
the v4.1 extension.

Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211010150910.2911495-2-maz@kernel.org
2021-10-17 11:06:36 +01:00
Fuad Tabba
16dd1fbb12 KVM: arm64: Simplify masking out MTE in feature id reg
Simplify code for hiding MTE support in feature id register when
MTE is not enabled/supported by KVM.

No functional change intended.

Signed-off-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211010145636.1950948-7-tabba@google.com
2021-10-11 14:57:29 +01:00
Alexandru Elisei
ebf6aa8c04 KVM: arm64: Replace get_raz_id_reg() with get_raz_reg()
Reading a RAZ ID register isn't different from reading any other RAZ
register, so get rid of get_raz_id_reg() and replace it with get_raz_reg(),
which does the same thing, but does it without going through two layers of
indirection.

No functional change.

Suggested-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211011105840.155815-4-alexandru.elisei@arm.com
2021-10-11 14:13:59 +01:00
Alexandru Elisei
5a43097623 KVM: arm64: Use get_raz_reg() for userspace reads of PMSWINC_EL0
PMSWINC_EL0 is a write-only register and was initially part of the VCPU
register state, but was later removed in commit 7a3ba3095a ("KVM:
arm64: Remove PMSWINC_EL0 shadow register"). To prevent regressions, the
register was kept accessible from userspace as Read-As-Zero (RAZ).

The read function that is used to handle userspace reads of this
register is get_raz_id_reg(), which, while technically correct, as it
returns 0, it is not semantically correct, as PMSWINC_EL0 is not an ID
register as the function name suggests.

Add a new function, get_raz_reg(), to use it as the accessor for
PMSWINC_EL0, as to not conflate get_raz_id_reg() to handle other types
of registers.

No functional change intended.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211011105840.155815-3-alexandru.elisei@arm.com
2021-10-11 14:13:59 +01:00
Alexandru Elisei
00d5101b25 KVM: arm64: Return early from read_id_reg() if register is RAZ
If read_id_reg() is called for an ID register which is Read-As-Zero (RAZ),
it initializes the return value to zero, then goes through a list of
registers which require special handling before returning the final value.

By not returning as soon as it checks that the register should be RAZ, the
function creates the opportunity for bugs, if, for example, a patch changes
a register to RAZ (like has happened with PMSWINC_EL0 in commit
11663111cd), but doesn't remove the special handling from read_id_reg();
or if a register is RAZ in certain situations, but readable in others.

Return early to make it impossible for a RAZ register to be anything other
than zero.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211011105840.155815-2-alexandru.elisei@arm.com
2021-10-11 14:13:58 +01:00
Marc Zyngier
7c7b363d62 Merge branch kvm-arm64/pkvm-fixed-features-prologue into kvmarm-master/next
* kvm-arm64/pkvm-fixed-features-prologue:
  : Rework a bunch of common infrastructure as a prologue
  : to Fuad Tabba's protected VM fixed feature series.
  KVM: arm64: Upgrade trace_kvm_arm_set_dreg32() to 64bit
  KVM: arm64: Add config register bit definitions
  KVM: arm64: Add feature register flag definitions
  KVM: arm64: Track value of cptr_el2 in struct kvm_vcpu_arch
  KVM: arm64: Keep mdcr_el2's value as set by __init_el2_debug
  KVM: arm64: Restore mdcr_el2 from vcpu
  KVM: arm64: Refactor sys_regs.h,c for nVHE reuse
  KVM: arm64: Fix names of config register fields
  KVM: arm64: MDCR_EL2 is a 64-bit register
  KVM: arm64: Remove trailing whitespace in comment
  KVM: arm64: placeholder to check if VM is protected

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-08-20 12:23:53 +01:00
Fuad Tabba
f76f89e2f7 KVM: arm64: Refactor sys_regs.h,c for nVHE reuse
Refactor sys_regs.h and sys_regs.c to make it easier to reuse
common code. It will be used in nVHE in a later patch.

Note that the refactored code uses __inline_bsearch for find_reg
instead of bsearch to avoid copying the bsearch code for nVHE.

No functional change intended.

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210817081134.2918285-6-tabba@google.com
2021-08-20 11:12:17 +01:00
Fuad Tabba
e6bc555c96 KVM: arm64: Remove trailing whitespace in comment
Remove trailing whitespace from comment in trap_dbgauthstatus_el1().

No functional change intended.

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210817081134.2918285-3-tabba@google.com
2021-08-20 11:12:16 +01:00
Marc Zyngier
7a3ba3095a KVM: arm64: Remove PMSWINC_EL0 shadow register
We keep an entry for the PMSWINC_EL0 register in the vcpu structure,
while *never* writing anything there outside of reset.

Given that the register is defined as write-only, that we always
trap when this register is accessed, there is little point in saving
anything anyway.

Get rid of the entry, and save a mighty 8 bytes per vcpu structure.

We still need to keep it exposed to userspace in order to preserve
backward compatibility with previously saved VMs. Since userspace
cannot expect any effect of writing to PMSWINC_EL0, treat the
register as RAZ/WI for the purpose of userspace access.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210719123902.1493805-5-maz@kernel.org
2021-08-02 14:26:34 +01:00
Marc Zyngier
f5eff40058 KVM: arm64: Drop unnecessary masking of PMU registers
We always sanitise our PMU sysreg on the write side, so there
is no need to do it on the read side as well.

Drop the unnecessary masking.

Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210719123902.1493805-3-maz@kernel.org
2021-08-02 14:26:33 +01:00
Marc Zyngier
0ab410a93d KVM: arm64: Narrow PMU sysreg reset values to architectural requirements
A number of the PMU sysregs expose reset values that are not
compliant with the architecture (set bits in the RES0 ranges,
for example).

This in turn has the effect that we need to pointlessly mask
some register fields when using them.

Let's start by making sure we don't have illegal values in the
shadow registers at reset time. This affects all the registers
that dedicate one bit per counter, the counters themselves,
PMEVTYPERn_EL0 and PMSELR_EL0.

Reported-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/20210719123902.1493805-2-maz@kernel.org
2021-08-02 14:26:33 +01:00
Steven Price
673638f434 KVM: arm64: Expose KVM_ARM_CAP_MTE
It's now safe for the VMM to enable MTE in a guest, so expose the
capability to user space.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210621111716.37157-5-steven.price@arm.com
2021-06-22 14:08:06 +01:00
Steven Price
e1f358b504 KVM: arm64: Save/restore MTE registers
Define the new system registers that MTE introduces and context switch
them. The MTE feature is still hidden from the ID register as it isn't
supported in a VM yet.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210621111716.37157-4-steven.price@arm.com
2021-06-22 14:08:05 +01:00
Steven Price
ea7fc1bb1c KVM: arm64: Introduce MTE VM feature
Add a new VM feature 'KVM_ARM_CAP_MTE' which enables memory tagging
for a VM. This will expose the feature to the guest and automatically
tag memory pages touched by the VM as PG_mte_tagged (and clear the tag
storage) to ensure that the guest cannot see stale tags, and so that
the tags are correctly saved/restored across swap.

Actually exposing the new capability to user space happens in a later
patch.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
[maz: move VM_SHARED sampling into the critical section]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210621111716.37157-3-steven.price@arm.com
2021-06-22 14:08:05 +01:00
Marc Zyngier
cb853ded1d KVM: arm64: Fix debug register indexing
Commit 03fdfb2690 ("KVM: arm64: Don't write junk to sysregs on
reset") flipped the register number to 0 for all the debug registers
in the sysreg table, hereby indicating that these registers live
in a separate shadow structure.

However, the author of this patch failed to realise that all the
accessors are using that particular index instead of the register
encoding, resulting in all the registers hitting index 0. Not quite
a valid implementation of the architecture...

Address the issue by fixing all the accessors to use the CRm field
of the encoding, which contains the debug register index.

Fixes: 03fdfb2690 ("KVM: arm64: Don't write junk to sysregs on reset")
Reported-by: Ricardo Koller <ricarkol@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
2021-05-15 10:27:59 +01:00
Marc Zyngier
fbb31e5f3a Merge branch 'kvm-arm64/debug-5.13' into kvmarm-master/next
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-04-13 15:34:15 +01:00
Alexandru Elisei
96f4f6809b KVM: arm64: Don't advertise FEAT_SPE to guests
Even though KVM sets up MDCR_EL2 to trap accesses to the SPE buffer and
sampling control registers and to inject an undefined exception, the
presence of FEAT_SPE is still advertised in the ID_AA64DFR0_EL1 register,
if the hardware supports it. Getting an undefined exception when accessing
a register usually happens for a hardware feature which is not implemented,
and indeed this is how PMU emulation is handled when the virtual machine
has been created without the KVM_ARM_VCPU_PMU_V3 feature. Let's be
consistent and never advertise FEAT_SPE, because KVM doesn't have support
for emulating it yet.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210409152154.198566-3-alexandru.elisei@arm.com
2021-04-11 09:46:13 +01:00
Alexandru Elisei
13611bc80d KVM: arm64: Don't print warning when trapping SPE registers
KVM sets up MDCR_EL2 to trap accesses to the SPE buffer and sampling
control registers and it relies on the fact that KVM injects an undefined
exception for unknown registers. This mechanism of injecting undefined
exceptions also prints a warning message for the host kernel; for example,
when a guest tries to access PMSIDR_EL1:

[    2.691830] kvm [142]: Unsupported guest sys_reg access at: 80009e78 [800003c5]
[    2.691830]  { Op0( 3), Op1( 0), CRn( 9), CRm( 9), Op2( 7), func_read },

This is unnecessary, because KVM has explicitly configured trapping of
those registers and is well aware of their existence. Prevent the warning
by adding the SPE registers to the list of registers that KVM emulates.
The access function will inject the undefined exception.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210409152154.198566-2-alexandru.elisei@arm.com
2021-04-11 09:46:12 +01:00
Suzuki K Poulose
cc427cbb15 KVM: arm64: Handle access to TRFCR_EL1
Rather than falling to an "unhandled access", inject add an explicit
"undefined access" for TRFCR_EL1 access from the guest.

Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210405164307.1720226-6-suzuki.poulose@arm.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
2021-04-06 16:05:12 -06:00
Marc Zyngier
c93199e93e Merge branch 'kvm-arm64/pmu-debug-fixes-5.11' into kvmarm-master/next
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-02-12 14:08:41 +00:00
Alexandru Elisei
8c358b29e0 KVM: arm64: Correct spelling of DBGDIDR register
The aarch32 debug ID register is called DBG*D*IDR (emphasis added), not
DBGIDR, use the correct spelling.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210128132823.35067-1-alexandru.elisei@arm.com
2021-02-03 11:01:19 +00:00
Marc Zyngier
46081078fe KVM: arm64: Upgrade PMU support to ARMv8.4
Upgrading the PMU code from ARMv8.1 to ARMv8.4 turns out to be
pretty easy. All that is required is support for PMMIR_EL1, which
is read-only, and for which returning 0 is a valid option as long
as we don't advertise STALL_SLOT as an implemented event.

Let's just do that and adjust what we return to the guest.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-02-03 11:00:22 +00:00
Marc Zyngier
94893fc9ad KVM: arm64: Limit the debug architecture to ARMv8.0
Let's not pretend we support anything but ARMv8.0 as far as the
debug architecture is concerned.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-02-03 11:00:08 +00:00
Marc Zyngier
c885793558 KVM: arm64: Refactor filtering of ID registers
Our current ID register filtering is starting to be a mess of if()
statements, and isn't going to get any saner.

Let's turn it into a switch(), which has a chance of being more
readable, and introduce a FEATURE() macro that allows easy generation
of feature masks.

No functionnal change intended.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-02-03 11:00:01 +00:00
Marc Zyngier
99b6a4013f KVM: arm64: Add handling of AArch32 PCMEID{2,3} PMUv3 registers
Despite advertising support for AArch32 PMUv3p1, we fail to handle
the PMCEID{2,3} registers, which conveniently alias with the top
bits of PMCEID{0,1}_EL1.

Implement these registers with the usual AA32(HI/LO) aliasing
mechanism.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-02-03 10:59:26 +00:00
Marc Zyngier
cb95914685 KVM: arm64: Fix AArch32 PMUv3 capping
We shouldn't expose *any* PMU capability when no PMU has been
configured for this VM.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-02-03 10:59:16 +00:00
Marc Zyngier
bea7e97fef KVM: arm64: Fix missing RES1 in emulation of DBGBIDR
The AArch32 CP14 DBGDIDR has bit 15 set to RES1, which our current
emulation doesn't set. Just add the missing bit.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-02-03 10:59:06 +00:00
Alexandru Elisei
7ba8b4380a KVM: arm64: Use the reg_to_encoding() macro instead of sys_reg()
The reg_to_encoding() macro is a wrapper over sys_reg() and conveniently
takes a sys_reg_desc or a sys_reg_params argument and returns the 32 bit
register encoding. Use it instead of calling sys_reg() directly.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210106144218.110665-1-alexandru.elisei@arm.com
2021-01-14 11:09:38 +00:00
Marc Zyngier
7ded92e25c KVM: arm64: Simplify handling of absent PMU system registers
Now that all PMU registers are gated behind a .visibility callback,
remove the other checks against an absent PMU.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-01-14 11:02:52 +00:00
Marc Zyngier
11663111cd KVM: arm64: Hide PMU registers from userspace when not available
It appears that while we are now able to properly hide PMU
registers from the guest when a PMU isn't available (either
because none has been configured, the host doesn't have
the PMU support compiled in, or that the HW doesn't have
one at all), we are still exposing more than we should to
userspace.

Introduce a visibility callback gating all the PMU registers,
which covers both usrespace and guest.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-01-14 11:02:51 +00:00
Marc Zyngier
957cbca731 KVM: arm64: Remove spurious semicolon in reg_to_encoding()
Although not a problem right now, it flared up while working
on some other aspects of the code-base. Remove the useless
semicolon.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-12-31 15:05:46 +00:00
Marc Zyngier
2a5f1b67ec KVM: arm64: Don't access PMCR_EL0 when no PMU is available
We reset the guest's view of PMCR_EL0 unconditionally, based on
the host's view of this register. It is however legal for an
implementation not to provide any PMU, resulting in an UNDEF.

The obvious fix is to skip the reset of this shadow register
when no PMU is available, sidestepping the issue entirely.
If no PMU is available, the guest is not able to request
a virtual PMU anyway, so not doing nothing is the right thing
to do!

It is unlikely that this bug can hit any HW implementation
though, as they all provide a PMU. It has been found using nested
virt with the host KVM not implementing the PMU itself.

Fixes: ab9468340d ("arm64: KVM: Add access handler for PMCR register")
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201210083059.1277162-1-maz@kernel.org
2020-12-22 10:47:09 +00:00
Marc Zyngier
f86e54653e Merge remote-tracking branch 'origin/kvm-arm64/csv3' into kvmarm-master/queue
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-12-03 19:12:24 +00:00
Marc Zyngier
4f1df628d4 KVM: arm64: Advertise ID_AA64PFR0_EL1.CSV3=1 if the CPUs are Meltdown-safe
Cores that predate the introduction of ID_AA64PFR0_EL1.CSV3 to
the ARMv8 architecture have this field set to 0, even of some of
them are not affected by the vulnerability.

The kernel maintains a list of unaffected cores (A53, A55 and a few
others) so that it doesn't impose an expensive mitigation uncessarily.

As we do for CSV2, let's expose the CSV3 property to guests that run
on HW that is effectively not vulnerable. This can be reset to zero
by writing to the ID register from userspace, ensuring that VMs can
be migrated despite the new property being set.

Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-30 16:02:53 +00:00
Marc Zyngier
bb528f4f57 Merge branch 'kvm-arm64/cache-demux' into kvmarm-master/next
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-27 19:48:12 +00:00
Andrew Jones
c73a441617 KVM: arm64: CSSELR_EL1 max is 13
Not counting TnD, which KVM doesn't currently consider, CSSELR_EL1
can have a maximum value of 0b1101 (13), which corresponds to an
instruction cache at level 7. With CSSELR_MAX set to 12 we can
only select up to cache level 6. Change it to 14.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201126134641.35231-2-drjones@redhat.com
2020-11-27 19:46:30 +00:00
Marc Zyngier
6e5d8c713d Merge branch 'kvm-arm64/pmu-undef' into kvmarm-master/next
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-27 11:46:47 +00:00
Marc Zyngier
a3da935802 KVM: arm64: Remove dead PMU sysreg decoding code
The handling of traps in access_pmu_evcntr() has a couple of
omminous "else return false;" statements that don't make any sense:
the decoding tree coverse all the registers that trap to this handler,
and returning false implies that we change PC, which we don't.

Get rid of what is evidently dead code.

Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-27 11:41:03 +00:00
Marc Zyngier
f975ccb08d KVM: arm64: Remove PMU RAZ/WI handling
There is no RAZ/WI handling allowed for the PMU registers in the
ARMv8 architecture. Nobody can remember how we cam to the conclusion
that we could do this, but the ARMv8 ARM is pretty clear that we cannot.

Remove the RAZ/WI handling of the PMU system registers when it is
not configured.

Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-27 11:40:53 +00:00
Marc Zyngier
b0737e999e KVM: arm64: Inject UNDEF on PMU access when no PMU configured
The ARMv8 architecture says that in the absence of FEAT_PMUv3,
all the PMU-related register generate an UNDEF. Let's make
sure that all our PMU handers catch this case by hooking into
check_pmu_access_disabled(), and add checks in a couple of
other places.

Note that we still cannot deliver an exception into the guest
as the offending cases are already caught by the RAZ/WI handling.
But this puts us one step away to architectural compliance.

Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-27 11:40:47 +00:00
Marc Zyngier
04355e41a6 KVM: arm64: Set ID_AA64DFR0_EL1.PMUVer to 0 when no PMU support
We always expose the HW view of PMU in ID_AA64FDR0_EL1.PMUver,
even when the PMU feature is disabled, while the architecture
says that FEAT_PMUv3 not being implemented should result in this
field being zero.

Let's follow the architecture's guidance.

Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-27 11:40:32 +00:00
Marc Zyngier
149f120edb Merge branch 'kvm-arm64/copro-no-more' into kvmarm-master/next
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-27 11:33:16 +00:00
Marc Zyngier
37da329ed6 Merge branch 'kvm-arm64/el2-pc' into kvmarm-master/next
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-27 11:33:10 +00:00
Paolo Bonzini
2c38234c42 KVM/arm64 fixes for v5.10, take #3
- Allow userspace to downgrade ID_AA64PFR0_EL1.CSV2
 - Inject UNDEF on SCXTNUM_ELx access
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl+tsAQPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDieIP/06lrDbhKUv1BX5oOlNFKifsaxmrCiP2A9Ql
 1RiT1wI4Ba+QcgtnyUOI/SQgNx4Z+LkUFghkqP3TvtPEj3Y3zhCFiyz3wn/H0YJA
 eZ5kI5XkG+9NOdzpyhNKiN2ZOVz0/RpHnIyHWU1SFD3Ky58xHsI1w5boNcTYJDXE
 IVVAQ05HzNMOnqEnfS3Z2Oe99jiYXS1C80Rf2WvQuQQW6Nwu3J0W5VZztw/E9VG0
 wbivuOaFzk2Zee30oTXxkJfFDS7m3fZ2dXvHSUB9Luv3GMAFp/sK2ZmEg7ZUiAl1
 zBPW35jHv1bahU88IQ7LhvTa+Tg6aEGnCrjHO9JiCx4z0VLnEz86AzejItaGvRu7
 SGf7taj4xRfUVxlJsW1i5Nel7hpmk8ip59hWUq5jTu7bPQvnEFpSfWANgobQrGF4
 pAtYUyaJcU5hRml4NUOy/gGkBzZSDloe1ClDUsdVZrbMKSjnATD8/0Z2oxHthVI1
 vvzovTXOQ7LK81Qm9GZ6Xlj0vXJh2V91wMTxy82lK5PAmKuVWvgqOWbH7e8YX+2T
 VlY5jkIyjwj9vwyMQHmaR5f01eZotYVTM+YKZcjx6O+1MGkrSxZkVptf0g8Bj0X3
 VmCYHyA5LIil8bx58kLfoZhAtjOaAFf+j5XCTjP0zCB4mVHcrCk0rLBPyvPsZB73
 I3WFpQPq
 =eZCZ
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-5.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for v5.10, take #3

- Allow userspace to downgrade ID_AA64PFR0_EL1.CSV2
- Inject UNDEF on SCXTNUM_ELx access
2020-11-13 06:28:23 -05:00
Marc Zyngier
ed4ffaf49b KVM: arm64: Handle SCXTNUM_ELx traps
As the kernel never sets HCR_EL2.EnSCXT, accesses to SCXTNUM_ELx
will trap to EL2. Let's handle that as gracefully as possible
by injecting an UNDEF exception into the guest. This is consistent
with the guest's view of ID_AA64PFR0_EL1.CSV2 being at most 1.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201110141308.451654-4-maz@kernel.org
2020-11-12 21:22:46 +00:00
Marc Zyngier
338b17933a KVM: arm64: Unify trap handlers injecting an UNDEF
A large number of system register trap handlers only inject an
UNDEF exeption, and yet each class of sysreg seems to provide its
own, identical function.

Let's unify them all, saving us introducing yet another one later.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201110141308.451654-3-maz@kernel.org
2020-11-12 21:22:45 +00:00
Marc Zyngier
23711a5e66 KVM: arm64: Allow setting of ID_AA64PFR0_EL1.CSV2 from userspace
We now expose ID_AA64PFR0_EL1.CSV2=1 to guests running on hosts
that are immune to Spectre-v2, but that don't have this field set,
most likely because they predate the specification.

However, this prevents the migration of guests that have started on
a host the doesn't fake this CSV2 setting to one that does, as KVM
rejects the write to ID_AA64PFR0_EL2 on the grounds that it isn't
what is already there.

In order to fix this, allow userspace to set this field as long as
this doesn't result in a promising more than what is already there
(setting CSV2 to 0 is acceptable, but setting it to 1 when it is
already set to 0 isn't).

Fixes: e1026237f9 ("KVM: arm64: Set CSV2 for guests on hardware unaffected by Spectre-v2")
Reported-by: Peng Liang <liangpeng10@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201110141308.451654-2-maz@kernel.org
2020-11-12 21:22:22 +00:00
Marc Zyngier
4f6b838c37 Linux 5.10-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl+V+LMeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGGQoH/1FIf6373lekuQf0
 pSq+2PPeILjL6+BppjNGJdwTKTEFEaz7xBpDwZURW2dt0M5jib2sn/0VJ/lh0Ln3
 880hXPjVyziU7/p1vTiPFYwKxav/ZE5cHrEW+nKimucyYPgkDxikFRuvrPQ1M0Sc
 vLZMmwjQlBD1kTsh9WR5lQ9Z8KqUtOazW47AbWE5QTTCQPmIXIdqByqLXlqS46Ok
 gW8tqaCI+FpBLP3fJn0EX5UTYH1Tsj9TmIFE8jqm5lGa/+VDM5KNyczEosKv86Xk
 0hBEUbAAZWdwieySJwBH7Njqu9g1o7bRUIJJsbXm0Fcnu+Ft619r3mJkkkXaaWKN
 mk7M/Uk=
 =1dE8
 -----END PGP SIGNATURE-----

Merge tag 'v5.10-rc1' into kvmarm-master/next

Linux 5.10-rc1

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-12 21:20:43 +00:00
Marc Zyngier
6ac4a5ac50 KVM: arm64: Drop kvm_coproc.h
kvm_coproc.h used to serve as a compatibility layer for the files
shared between the 32 and 64 bit ports.

Another one bites the dust...

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10 11:22:52 +00:00
Marc Zyngier
50f3045327 KVM: arm64: Drop is_aarch32 trap attribute
is_aarch32 is only used once, and can be trivially replaced by
testing Op0 instead. Drop it.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10 11:22:52 +00:00
Marc Zyngier
2d27fd7848 KVM: arm64: Drop is_32bit trap attribute
The is_32bit attribute is now completely unused, drop it.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10 11:22:52 +00:00
Marc Zyngier
1da42c34d7 KVM: arm64: Map AArch32 cp14 register to AArch64 sysregs
Similarly to what has been done on the cp15 front, repaint the
debug registers to use their AArch64 counterparts. This results
in some simplification as we can remove the 32bit-specific
accessors.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10 11:22:51 +00:00
Marc Zyngier
b1ea1d760d KVM: arm64: Map AArch32 cp15 register to AArch64 sysregs
Move all the cp15 registers over to their AArch64 counterpart.
This requires the annotation of a few of them (such as the usual
DFAR/IFAR vs FAR_EL1), and a new helper that generates mask/shift
pairs for the various configurations.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10 11:22:51 +00:00
Marc Zyngier
ca4e514774 KVM: arm64: Introduce handling of AArch32 TTBCR2 traps
ARMv8.2 introduced TTBCR2, which shares TCR_EL1 with TTBCR.
Gracefully handle traps to this register when HCR_EL2.TVM is set.

Cc: stable@vger.kernel.org
Reported-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10 11:22:50 +00:00
Marc Zyngier
21c810017c KVM: arm64: Move VHE direct sysreg accessors into kvm_host.h
As we are about to need to access system registers from the HYP
code based on their internal encoding, move the direct sysreg
accessors to a common include file, with a VHE-specific guard.

No functionnal change.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10 08:34:25 +00:00
Marc Zyngier
cdb5e02ed1 KVM: arm64: Make kvm_skip_instr() and co private to HYP
In an effort to remove the vcpu PC manipulations from EL1 on nVHE
systems, move kvm_skip_instr() to be HYP-specific. EL1's intent
to increment PC post emulation is now signalled via a flag in the
vcpu structure.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10 08:34:24 +00:00
Marc Zyngier
6ddbc281e2 KVM: arm64: Move kvm_vcpu_trap_il_is32bit into kvm_skip_instr32()
There is no need to feed the result of kvm_vcpu_trap_il_is32bit()
to kvm_skip_instr(), as only AArch32 has a variable length ISA, and
this helper can equally be called from kvm_skip_instr32(), reducing
the complexity at all the call sites.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10 08:34:24 +00:00
Linus Torvalds
407ab57963 ARM:
- Fix compilation error when PMD and PUD are folded
 - Fix regression in reads-as-zero behaviour of ID_AA64ZFR0_EL1
 - Add aarch64 get-reg-list test
 
 x86:
 - fix semantic conflict between two series merged for 5.10
 - fix (and test) enforcement of paravirtual cpuid features
 
 Generic:
 - various cleanups to memory management selftests
 - new selftests testcase for performance of dirty logging
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl+pVjkUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroO3fAf/ZniW/7FC4pD/M0txXUst3mKNcC16
 AbMfN36dvzdWBnAuTVsP2d+XM/sbPNacomcJGfJ5II9TKrb00FUNxU37In7vdbbm
 WjpyDEpRDXnCY+OXs7dwY66dEXzv9GTzlQaGuah67AeGpzSuu3zrXlu07di446Gv
 ZtHvbzFEvos7cByp3LoPfvbnvv9kkD5mQkOW7wG42hUPrxMNxtHC+qyP92DIpV8d
 etDNC95rhdhhZM3LAlvO6Bp4I1uFXpYHEHtIOOT05IB9clNhfdgsuD8wiqWfEo0l
 sVhg3yXWbbfGaP3vEZp5QY9qko8I0XjwIWc5hWsIHST7uPqgi8a/wIbbEA==
 =jBcA
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM:
   - fix compilation error when PMD and PUD are folded
   - fix regression in reads-as-zero behaviour of ID_AA64ZFR0_EL1
   - add aarch64 get-reg-list test

  x86:
   - fix semantic conflict between two series merged for 5.10
   - fix (and test) enforcement of paravirtual cpuid features

  selftests:
   - various cleanups to memory management selftests
   - new selftests testcase for performance of dirty logging"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (30 commits)
  KVM: selftests: allow two iterations of dirty_log_perf_test
  KVM: selftests: Introduce the dirty log perf test
  KVM: selftests: Make the number of vcpus global
  KVM: selftests: Make the per vcpu memory size global
  KVM: selftests: Drop pointless vm_create wrapper
  KVM: selftests: Add wrfract to common guest code
  KVM: selftests: Simplify demand_paging_test with timespec_diff_now
  KVM: selftests: Remove address rounding in guest code
  KVM: selftests: Factor code out of demand_paging_test
  KVM: selftests: Use a single binary for dirty/clear log test
  KVM: selftests: Always clear dirty bitmap after iteration
  KVM: selftests: Add blessed SVE registers to get-reg-list
  KVM: selftests: Add aarch64 get-reg-list test
  selftests: kvm: test enforcement of paravirtual cpuid features
  selftests: kvm: Add exception handling to selftests
  selftests: kvm: Clear uc so UCALL_NONE is being properly reported
  selftests: kvm: Fix the segment descriptor layout to match the actual layout
  KVM: x86: handle MSR_IA32_DEBUGCTLMSR with report_ignored_msrs
  kvm: x86: request masterclock update any time guest uses different msr
  kvm: x86: ensure pv_cpuid.features is initialized when enabling cap
  ...
2020-11-09 13:58:10 -08:00
Paolo Bonzini
ff2bb93f53 KVM/arm64 fixes for v5.10, take #2
- Fix compilation error when PMD and PUD are folded
 - Fix regresssion of the RAZ behaviour of ID_AA64ZFR0_EL1
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl+lep8PHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDwNEP/3KJfcJtA//6JlQkRXRkYjXVZ2/2Crr0IHdu
 TqzQZ7Mg8w281HuZjrpvYwzHlXbQ89RJT+G/avG5EmfQcmJU5eQna9T1w1Vq2d2q
 3K35HdQskfFJYJ5MMvxSZ1WsE+EMWOXJGwL3jss/ThS+qzD+Ag7Fdg3Eg6kTv0Ic
 eMFtnBzI7UddNwZcrPM43dZTh9JEls9mySF6kjsIleUm3Xnk+6NKP6nDnJMukBOF
 b+9DaGx9cdXI7bqm3elvaWIeSpJQIBhLvYQqyD0OyF2qAqWrGHEULx2qZbMHRG2x
 lhQDcMyKjtv0hzKxmotVjhDaz/Af+yDJ57IRHLfEq/v5ytIqg76vxDwIjyLUHXyk
 3lPoycCDtcgRKlkoz1jQ35oDCo0LUG2sUgWIn6D3Pim1aZfppnlDjCuuOMGyf5Db
 RS0jIBm5u1YDchnL43HsQRBiVUD1In+QHtR/ZECMMUqjtnCojSZp30BGlqoVSlb8
 aSpzecBaA+C1lRFqLTHCldONloE/85vpsYEIfxB1SqVwrpDkrXaEwZPU6im5a8om
 Q9aJ+TqIUGHLLWsI4SGNrS/kjSpt/GAP4Kkfg+wBqPUgYL+lDTSekFWY6DbekwDP
 +CdopqFCSa1Jfby/6rTYzGK1152NH03O63Ky1Z3uGKpCNK9lAptdUIYMkYCj69C2
 m8zS3zyz
 =LmMm
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for v5.10, take #2

- Fix compilation error when PMD and PUD are folded
- Fix regresssion of the RAZ behaviour of ID_AA64ZFR0_EL1
2020-11-08 04:15:53 -05:00
Andrew Jones
c512298eed KVM: arm64: Remove AA64ZFR0_EL1 accessors
The AA64ZFR0_EL1 accessors are just the general accessors with
its visibility function open-coded. It also skips the if-else
chain in read_id_reg, but there's no reason not to go there.
Indeed consolidating ID register accessors and removing lines
of code make it worthwhile.

Remove the AA64ZFR0_EL1 accessors, replacing them with the
general accessors for sanitized ID registers.

No functional change intended.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201105091022.15373-5-drjones@redhat.com
2020-11-06 16:00:29 +00:00
Andrew Jones
912dee5726 KVM: arm64: Check RAZ visibility in ID register accessors
The instruction encodings of ID registers are preallocated. Until an
encoding is assigned a purpose the register is RAZ. KVM's general ID
register accessor functions already support both paths, RAZ or not.
If for each ID register we can determine if it's RAZ or not, then all
ID registers can build on the general functions. The register visibility
function allows us to check whether a register should be completely
hidden or not, extending it to also report when the register should
be RAZ or not allows us to use it for ID registers as well.

Check for RAZ visibility in the ID register accessor functions,
allowing the RAZ case to be handled in a generic way for all system
registers.

The new REG_RAZ flag will be used in a later patch. This patch has
no intended functional change.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201105091022.15373-4-drjones@redhat.com
2020-11-06 16:00:29 +00:00
Andrew Jones
01fe5ace92 KVM: arm64: Consolidate REG_HIDDEN_GUEST/USER
REG_HIDDEN_GUEST and REG_HIDDEN_USER are always used together.
Consolidate them into a single REG_HIDDEN flag. We can always
add another flag later if some register needs to expose itself
differently to the guest than it does to userspace.

No functional change intended.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201105091022.15373-3-drjones@redhat.com
2020-11-06 16:00:29 +00:00
Andrew Jones
f81cb2c3ad KVM: arm64: Don't hide ID registers from userspace
ID registers are RAZ until they've been allocated a purpose, but
that doesn't mean they should be removed from the KVM_GET_REG_LIST
list. So far we only have one register, SYS_ID_AA64ZFR0_EL1, that
is hidden from userspace when its function, SVE, is not present.

Expose SYS_ID_AA64ZFR0_EL1 to userspace as RAZ when SVE is not
implemented. Removing the userspace visibility checks is enough
to reexpose it, as it will already return zero to userspace when
SVE is not present. The register already behaves as RAZ for the
guest when SVE is not present.

Fixes: 73433762fc ("KVM: arm64/sve: System register context switch and access support")
Reported-by: 张东旭 <xu910121@sina.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org#v5.2+
Link: https://lore.kernel.org/r/20201105091022.15373-2-drjones@redhat.com
2020-11-06 16:00:29 +00:00
Linus Torvalds
2d38c80d5b ARM:
* selftest fix
 * Force PTE mapping on device pages provided via VFIO
 * Fix detection of cacheable mapping at S2
 * Fallback to PMD/PTE mappings for composite huge pages
 * Fix accounting of Stage-2 PGD allocation
 * Fix AArch32 handling of some of the debug registers
 * Simplify host HYP entry
 * Fix stray pointer conversion on nVHE TLB invalidation
 * Fix initialization of the nVHE code
 * Simplify handling of capabilities exposed to HYP
 * Nuke VCPUs caught using a forbidden AArch32 EL0
 
 x86:
 * new nested virtualization selftest
 * Miscellaneous fixes
 * make W=1 fixes
 * Reserve new CPUID bit in the KVM leaves
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl+dhRAUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPWCgf/U997UW/11IdNtkehQO/DFdx7lHev
 +IahN1Pnbt92ZoR5nGhK9pgvDahIVhqTmUvgV+3fD24OnqXTpYTu1fliBvL6ynbN
 J9Ycf0zFAgwfgTTD5UexTlEovnhX4xz7NDmd6rpxGDZdMaBHQFPkCXBFK45pf4nd
 O349aHV0X1AA7Tt/sLhpXpi74Vake1xErLHKhIVLHKyo/zDm+Q0UZry068NNBzTr
 St3+QSGlFXhuekVrZLh+DShh6rZGLyY9tcySt6o0Jk7fSs1lmEnPbBgeeqYmyHMd
 Yn+ybhthmNkkpI8so70TA9roiVar4UmjnMBOiav62bo7ue26pKE5cWQyXw==
 =mvBr
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM:
   - selftest fix
   - force PTE mapping on device pages provided via VFIO
   - fix detection of cacheable mapping at S2
   - fallback to PMD/PTE mappings for composite huge pages
   - fix accounting of Stage-2 PGD allocation
   - fix AArch32 handling of some of the debug registers
   - simplify host HYP entry
   - fix stray pointer conversion on nVHE TLB invalidation
   - fix initialization of the nVHE code
   - simplify handling of capabilities exposed to HYP
   - nuke VCPUs caught using a forbidden AArch32 EL0

  x86:
   - new nested virtualization selftest
   - miscellaneous fixes
   - make W=1 fixes
   - reserve new CPUID bit in the KVM leaves"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: vmx: remove unused variable
  KVM: selftests: Don't require THP to run tests
  KVM: VMX: eVMCS: make evmcs_sanitize_exec_ctrls() work again
  KVM: selftests: test behavior of unmapped L2 APIC-access address
  KVM: x86: Fix NULL dereference at kvm_msr_ignored_check()
  KVM: x86: replace static const variables with macros
  KVM: arm64: Handle Asymmetric AArch32 systems
  arm64: cpufeature: upgrade hyp caps to final
  arm64: cpufeature: reorder cpus_have_{const, final}_cap()
  KVM: arm64: Factor out is_{vhe,nvhe}_hyp_code()
  KVM: arm64: Force PTE mapping on fault resulting in a device mapping
  KVM: arm64: Use fallback mapping sizes for contiguous huge page sizes
  KVM: arm64: Fix masks in stage2_pte_cacheable()
  KVM: arm64: Fix AArch32 handling of DBGD{CCINT,SCRext} and DBGVCR
  KVM: arm64: Allocate stage-2 pgd pages with GFP_KERNEL_ACCOUNT
  KVM: arm64: Drop useless PAN setting on host EL1 to EL2 transition
  KVM: arm64: Remove leftover kern_hyp_va() in nVHE TLB invalidation
  KVM: arm64: Don't corrupt tpidr_el2 on failed HVC call
  x86/kvm: Reserve KVM_FEATURE_MSI_EXT_DEST_ID
2020-11-01 09:43:32 -08:00
Paolo Bonzini
699116c45e KVM/arm64 fixes for 5.10, take #1
- Force PTE mapping on device pages provided via VFIO
 - Fix detection of cacheable mapping at S2
 - Fallback to PMD/PTE mappings for composite huge pages
 - Fix accounting of Stage-2 PGD allocation
 - Fix AArch32 handling of some of the debug registers
 - Simplify host HYP entry
 - Fix stray pointer conversion on nVHE TLB invalidation
 - Fix initialization of the nVHE code
 - Simplify handling of capabilities exposed to HYP
 - Nuke VCPUs caught using a forbidden AArch32 EL0
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl+cO3oPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDJdoP/jiKYR8iVkq/RmIsQl383KwQiJGTMi0iL2Zw
 /tHnf8bKowAPyG8bqyXMJqlWOb7tcp6U3m+WhENAZHWH02r2M921q0DGVW5p48ou
 Ek4zJnFF1iL5ryOBgROKK1nymUZOi3W1a1SsD6ZPImQsKsjNGbqKgWsGs8i9ft0P
 vkNZwlqebzJp+OR3agJemc8dkXcGlcRHk7fffdMcU8jsF5RJ9zC0XU0+scKryxhV
 o8PzKSlwCeisyL+Vz+s7POzoD3Rt+P+qjblz5NWqy/NHuLh+V9hzUSDOjWbZb70f
 Er29vGv7Yjb4nKK2KUzNqirSfXsRylfsjGr+YibP6uKEUMuUm/V41DqzT7nMalIm
 cOBGtPk6W9wOL8JNDmlyVGCfATI+5RrErQ8nFClrPu3qw4Hv4pb1Ad5OgAhNE0u1
 PfUyBBtQKNAjTdVCRfSuFL4d2yegy1rrpCmYWrvdQjLlXemwgYgKnSQN98cZHgjA
 foCAP5gJpAWGualyhKJx2CkY/5deeWKS39ISiNgHo5eRvKsGEnMN7j9UX77VbhRr
 PkwCmeUJ3kjzaAfmtcBN/iLjwQbWypidjX2Vbfl5WoVdLuiYXFZIvsdaqRHGl56F
 5zhYxM8DKODNEJKMl7a89oEFGKy8x1PQ0kqer9a6GBWkNDrQMOSL4+FkxCyM2m9g
 RoHtmdy0
 =gVaX
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 5.10, take #1

- Force PTE mapping on device pages provided via VFIO
- Fix detection of cacheable mapping at S2
- Fallback to PMD/PTE mappings for composite huge pages
- Fix accounting of Stage-2 PGD allocation
- Fix AArch32 handling of some of the debug registers
- Simplify host HYP entry
- Fix stray pointer conversion on nVHE TLB invalidation
- Fix initialization of the nVHE code
- Simplify handling of capabilities exposed to HYP
- Nuke VCPUs caught using a forbidden AArch32 EL0
2020-10-30 13:25:09 -04:00
Marc Zyngier
4a1c2c7f63 KVM: arm64: Fix AArch32 handling of DBGD{CCINT,SCRext} and DBGVCR
The DBGD{CCINT,SCRext} and DBGVCR register entries in the cp14 array
are missing their target register, resulting in all accesses being
targetted at the guard sysreg (indexed by __INVALID_SYSREG__).

Point the emulation code at the actual register entries.

Fixes: bdfb4b389c ("arm64: KVM: add trap handlers for AArch32 debug registers")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201029172409.2768336-1-maz@kernel.org
2020-10-29 19:49:03 +00:00
Rob Herring
96d389ca10 arm64: Add workaround for Arm Cortex-A77 erratum 1508412
On Cortex-A77 r0p0 and r1p0, a sequence of a non-cacheable or device load
and a store exclusive or PAR_EL1 read can cause a deadlock.

The workaround requires a DMB SY before and after a PAR_EL1 register
read. In addition, it's possible an interrupt (doing a device read) or
KVM guest exit could be taken between the DMB and PAR read, so we
also need a DMB before returning from interrupt and before returning to
a guest.

A deadlock is still possible with the workaround as KVM guests must also
have the workaround. IOW, a malicious guest can deadlock an affected
systems.

This workaround also depends on a firmware counterpart to enable the h/w
to insert DMB SY after load and store exclusive instructions. See the
errata document SDEN-1152370 v10 [1] for more information.

[1] https://static.docs.arm.com/101992/0010/Arm_Cortex_A77_MP074_Software_Developer_Errata_Notice_v10.pdf

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: kvmarm@lists.cs.columbia.edu
Link: https://lore.kernel.org/r/20201028182839.166037-2-robh@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2020-10-29 12:56:01 +00:00
Linus Torvalds
f9a705ad1c ARM:
- New page table code for both hypervisor and guest stage-2
 - Introduction of a new EL2-private host context
 - Allow EL2 to have its own private per-CPU variables
 - Support of PMU event filtering
 - Complete rework of the Spectre mitigation
 
 PPC:
 - Fix for running nested guests with in-kernel IRQ chip
 - Fix race condition causing occasional host hard lockup
 - Minor cleanups and bugfixes
 
 x86:
 - allow trapping unknown MSRs to userspace
 - allow userspace to force #GP on specific MSRs
 - INVPCID support on AMD
 - nested AMD cleanup, on demand allocation of nested SVM state
 - hide PV MSRs and hypercalls for features not enabled in CPUID
 - new test for MSR_IA32_TSC writes from host and guest
 - cleanups: MMU, CPUID, shared MSRs
 - LAPIC latency optimizations ad bugfixes
 
 For x86, also included in this pull request is a new alternative and
 (in the future) more scalable implementation of extended page tables
 that does not need a reverse map from guest physical addresses to
 host physical addresses.  For now it is disabled by default because
 it is still lacking a few of the existing MMU's bells and whistles.
 However it is a very solid piece of work and it is already available
 for people to hammer on it.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl+S8dsUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroM40Af+M46NJmuS5rcwFfybvK/c42KT6svX
 Co1NrZDwzSQ2mMy3WQzH9qeLvb+nbY4sT3n5BPNPNsT+aIDPOTDt//qJ2/Ip9UUs
 tRNea0MAR96JWLE7MSeeRxnTaQIrw/AAZC0RXFzZvxcgytXwdqBExugw4im+b+dn
 Dcz8QxX1EkwT+4lTm5HC0hKZAuo4apnK1QkqCq4SdD2QVJ1YE6+z7pgj4wX7xitr
 STKD6q/Yt/0ndwqS0GSGbyg0jy6mE620SN6isFRkJYwqfwLJci6KnqvEK67EcNMu
 qeE017K+d93yIVC46/6TfVHzLR/D1FpQ8LZ16Yl6S13OuGIfAWBkQZtPRg==
 =AD6a
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "For x86, there is a new alternative and (in the future) more scalable
  implementation of extended page tables that does not need a reverse
  map from guest physical addresses to host physical addresses.

  For now it is disabled by default because it is still lacking a few of
  the existing MMU's bells and whistles. However it is a very solid
  piece of work and it is already available for people to hammer on it.

  Other updates:

  ARM:
   - New page table code for both hypervisor and guest stage-2
   - Introduction of a new EL2-private host context
   - Allow EL2 to have its own private per-CPU variables
   - Support of PMU event filtering
   - Complete rework of the Spectre mitigation

  PPC:
   - Fix for running nested guests with in-kernel IRQ chip
   - Fix race condition causing occasional host hard lockup
   - Minor cleanups and bugfixes

  x86:
   - allow trapping unknown MSRs to userspace
   - allow userspace to force #GP on specific MSRs
   - INVPCID support on AMD
   - nested AMD cleanup, on demand allocation of nested SVM state
   - hide PV MSRs and hypercalls for features not enabled in CPUID
   - new test for MSR_IA32_TSC writes from host and guest
   - cleanups: MMU, CPUID, shared MSRs
   - LAPIC latency optimizations ad bugfixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (232 commits)
  kvm: x86/mmu: NX largepage recovery for TDP MMU
  kvm: x86/mmu: Don't clear write flooding count for direct roots
  kvm: x86/mmu: Support MMIO in the TDP MMU
  kvm: x86/mmu: Support write protection for nesting in tdp MMU
  kvm: x86/mmu: Support disabling dirty logging for the tdp MMU
  kvm: x86/mmu: Support dirty logging for the TDP MMU
  kvm: x86/mmu: Support changed pte notifier in tdp MMU
  kvm: x86/mmu: Add access tracking for tdp_mmu
  kvm: x86/mmu: Support invalidate range MMU notifier for TDP MMU
  kvm: x86/mmu: Allocate struct kvm_mmu_pages for all pages in TDP MMU
  kvm: x86/mmu: Add TDP MMU PF handler
  kvm: x86/mmu: Remove disallowed_hugepage_adjust shadow_walk_iterator arg
  kvm: x86/mmu: Support zapping SPTEs in the TDP MMU
  KVM: Cache as_id in kvm_memory_slot
  kvm: x86/mmu: Add functions to handle changed TDP SPTEs
  kvm: x86/mmu: Allocate and free TDP MMU roots
  kvm: x86/mmu: Init / Uninit the TDP MMU
  kvm: x86/mmu: Introduce tdp_iter
  KVM: mmu: extract spte.h and spte.c
  KVM: mmu: Separate updating a PTE from kvm_set_pte_rmapp
  ...
2020-10-23 11:17:56 -07:00
Will Deacon
baab853229 Merge branch 'for-next/mte' into for-next/core
Add userspace support for the Memory Tagging Extension introduced by
Armv8.5.

(Catalin Marinas and others)
* for-next/mte: (30 commits)
  arm64: mte: Fix typo in memory tagging ABI documentation
  arm64: mte: Add Memory Tagging Extension documentation
  arm64: mte: Kconfig entry
  arm64: mte: Save tags when hibernating
  arm64: mte: Enable swap of tagged pages
  mm: Add arch hooks for saving/restoring tags
  fs: Handle intra-page faults in copy_mount_options()
  arm64: mte: ptrace: Add NT_ARM_TAGGED_ADDR_CTRL regset
  arm64: mte: ptrace: Add PTRACE_{PEEK,POKE}MTETAGS support
  arm64: mte: Allow {set,get}_tagged_addr_ctrl() on non-current tasks
  arm64: mte: Restore the GCR_EL1 register after a suspend
  arm64: mte: Allow user control of the generated random tags via prctl()
  arm64: mte: Allow user control of the tag check mode via prctl()
  mm: Allow arm64 mmap(PROT_MTE) on RAM-based files
  arm64: mte: Validate the PROT_MTE request via arch_validate_flags()
  mm: Introduce arch_validate_flags()
  arm64: mte: Add PROT_MTE support to mmap() and mprotect()
  mm: Introduce arch_calc_vm_flag_bits()
  arm64: mte: Tags-aware aware memcmp_pages() implementation
  arm64: Avoid unnecessary clear_user_page() indirection
  ...
2020-10-02 12:16:11 +01:00
Marc Zyngier
14ef9d0492 Merge branch 'kvm-arm64/hyp-pcpu' into kvmarm-master/next
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-09-30 14:05:35 +01:00
Marc Zyngier
e1026237f9 KVM: arm64: Set CSV2 for guests on hardware unaffected by Spectre-v2
If the system is not affected by Spectre-v2, then advertise to the KVM
guest that it is not affected, without the need for a safelist in the
guest.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29 16:08:16 +01:00
Marc Zyngier
88865beca9 KVM: arm64: Mask out filtered events in PCMEID{0,1}_EL1
As we can now hide events from the guest, let's also adjust its view of
PCMEID{0,1}_EL1 so that it can figure out why some common events are not
counting as they should.

The astute user can still look into the TRM for their CPU and find out
they've been cheated, though. Nobody's perfect.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-09-29 14:19:39 +01:00
Catalin Marinas
2ac638fc57 arm64: kvm: mte: Hide the MTE CPUID information from the guests
KVM does not support MTE in guests yet, so clear the corresponding field
in the ID_AA64PFR1_EL1 register. In addition, inject an undefined
exception in the guest if it accesses one of the GCR_EL1, RGSR_EL1,
TFSR_EL1 or TFSRE0_EL1 registers. While the emulate_sys_reg() function
already injects an undefined exception, this patch prevents the
unnecessary printk.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Steven Price <steven.price@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
2020-09-04 12:45:44 +01:00
Paolo Bonzini
0378daef0c KVM/arm64 updates for Linux 5.9:
- Split the VHE and nVHE hypervisor code bases, build the EL2 code
   separately, allowing for the VHE code to now be built with instrumentation
 
 - Level-based TLB invalidation support
 
 - Restructure of the vcpu register storage to accomodate the NV code
 
 - Pointer Authentication available for guests on nVHE hosts
 
 - Simplification of the system register table parsing
 
 - MMU cleanups and fixes
 
 - A number of post-32bit cleanups and other fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl8q5DEPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDQFAP/jtscnC5OxEOoGNW1gvg/1QI/BuU4zLvqQL1
 OEW72fUQlil7tmF/CbLLKnsBpxKmzO02C3wDdg3oaRi884bRtTXdok0nsFuCvrZD
 u/wrlMnP0zTjjk1uwIFfZJTx+nnUiT0jC6ffvGxB/jnTJk/8atvOUFL7ODFEfixz
 mS5g1jwwJkRmWKESFg7KGSghKuwXTvo4HVWCfME+t1rQwAa03stXFV8H5tkU6+cG
 BRIssxo7BkAV2AozwL7hgl/M6wd6QvbOrYJqgb67+sQ8qts0YNne96NN3InMedb1
 RENyDssXlA+VI0HoYyEbYnPtFy1Hoj1lOGDZLEZAEH1qcmWrV+hApnoSXSmuofvn
 QlfOWCyd92CZySu21MALRUVXbrKkA3zT2b9R93A5z7iEBPY+Wk0ryJCO6IxdZzF8
 48LNjtzb/Kd0SMU/issJlw+u6fJvLbpnSzXNsYYhiiTMUE9cbu2SEkq0SkonH0a4
 d3V8UifZyeffXsOfOAG0DJZOu/fWZp1/I3tfzujtG9rCb+jTQueJ4E1cFYrwSO6b
 sFNyiI1AzlwcCippG08zSUX61nGfKXBuMXuhIlMRk7GeiF95DmSXuxEgYndZX9I+
 E6zJr1iQk/1lrip41svDIIOBHuMbIeD/w1bsOKi7Zoa270MxB4r2Z3IqRMgosoE5
 l4YO9pl1
 =Ukr4
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-next-5.6

KVM/arm64 updates for Linux 5.9:

- Split the VHE and nVHE hypervisor code bases, build the EL2 code
  separately, allowing for the VHE code to now be built with instrumentation

- Level-based TLB invalidation support

- Restructure of the vcpu register storage to accomodate the NV code

- Pointer Authentication available for guests on nVHE hosts

- Simplification of the system register table parsing

- MMU cleanups and fixes

- A number of post-32bit cleanups and other fixes
2020-08-09 12:58:23 -04:00
Linus Torvalds
921d2597ab s390: implement diag318
x86:
 * Report last CPU for debugging
 * Emulate smaller MAXPHYADDR in the guest than in the host
 * .noinstr and tracing fixes from Thomas
 * nested SVM page table switching optimization and fixes
 
 Generic:
 * Unify shadow MMU cache data structures across architectures
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl8pC+oUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNcOwgAjomqtEqQNlp7DdZT7VyyklzbxX1/
 ud7v+oOJ8K4sFlf64lSthjPo3N9rzZCcw+yOXmuyuITngXOGc3tzIwXpCzpLtuQ1
 WO1Ql3B/2dCi3lP5OMmsO1UAZqy9pKLg1dfeYUPk48P5+p7d/NPmk+Em5kIYzKm5
 JsaHfCp2EEXomwmljNJ8PQ1vTjIQSSzlgYUBZxmCkaaX7zbEUMtxAQCStHmt8B84
 33LczwXBm3viSWrzsoBV37I70+tseugiSGsCfUyupXOvq55d6D9FCqtCb45Hn4Vh
 Ik8ggKdalsk/reiGEwNw1/3nr6mRMkHSbl+Mhc4waOIFf9dn0urgQgOaDg==
 =YVx0
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "s390:
   - implement diag318

  x86:
   - Report last CPU for debugging
   - Emulate smaller MAXPHYADDR in the guest than in the host
   - .noinstr and tracing fixes from Thomas
   - nested SVM page table switching optimization and fixes

  Generic:
   - Unify shadow MMU cache data structures across architectures"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (127 commits)
  KVM: SVM: Fix sev_pin_memory() error handling
  KVM: LAPIC: Set the TDCR settable bits
  KVM: x86: Specify max TDP level via kvm_configure_mmu()
  KVM: x86/mmu: Rename max_page_level to max_huge_page_level
  KVM: x86: Dynamically calculate TDP level from max level and MAXPHYADDR
  KVM: VXM: Remove temporary WARN on expected vs. actual EPTP level mismatch
  KVM: x86: Pull the PGD's level from the MMU instead of recalculating it
  KVM: VMX: Make vmx_load_mmu_pgd() static
  KVM: x86/mmu: Add separate helper for shadow NPT root page role calc
  KVM: VMX: Drop a duplicate declaration of construct_eptp()
  KVM: nSVM: Correctly set the shadow NPT root level in its MMU role
  KVM: Using macros instead of magic values
  MIPS: KVM: Fix build error caused by 'kvm_run' cleanup
  KVM: nSVM: remove nonsensical EXITINFO1 adjustment on nested NPF
  KVM: x86: Add a capability for GUEST_MAXPHYADDR < HOST_MAXPHYADDR support
  KVM: VMX: optimize #PF injection when MAXPHYADDR does not match
  KVM: VMX: Add guest physical address check in EPT violation and misconfig
  KVM: VMX: introduce vmx_need_pf_intercept
  KVM: x86: update exception bitmap on CPUID changes
  KVM: x86: rename update_bp_intercept to update_exception_bitmap
  ...
2020-08-06 12:59:31 -07:00
Marc Zyngier
a394cf6e85 Merge branch 'kvm-arm64/misc-5.9' into kvmarm-master/next-WIP
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-28 16:26:16 +01:00
Marc Zyngier
c9dc95005a Merge branch 'kvm-arm64/target-table-no-more' into kvmarm-master/next-WIP
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-28 16:10:32 +01:00
Vladimir Murzin
493cf9b723 arm64: s/AMEVTYPE/AMEVTYPER
Activity Monitor Event Type Registers are named as AMEVTYPER{0,1}<n>

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20200721091259.102756-1-vladimir.murzin@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-22 13:59:38 +01:00
Tianjia Zhang
74cc7e0c35 KVM: arm64: clean up redundant 'kvm_run' parameters
In the current kvm version, 'kvm_run' has been included in the 'kvm_vcpu'
structure. For historical reasons, many kvm-related function parameters
retain the 'kvm_run' and 'kvm_vcpu' parameters at the same time. This
patch does a unified cleanup of these remaining redundant parameters.

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200623131418.31473-3-tianjia.zhang@linux.alibaba.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 04:26:40 -04:00
Marc Zyngier
98909e6d1c KVM: arm64: Move ELR_EL1 to the system register array
As ELR-EL1 is a VNCR-capable register with ARMv8.4-NV, let's move it to
the sys_regs array and repaint the accessors. While we're at it, let's
kill the now useless accessors used only on the fault injection path.

Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-07 09:28:38 +01:00
Gavin Shan
3a949f4c93 KVM: arm64: Rename HSR to ESR
kvm/arm32 isn't supported since commit 541ad0150c ("arm: Remove
32bit KVM host support"). So HSR isn't meaningful since then. This
renames HSR to ESR accordingly. This shouldn't cause any functional
changes:

   * Rename kvm_vcpu_get_hsr() to kvm_vcpu_get_esr() to make the
     function names self-explanatory.
   * Rename variables from @hsr to @esr to make them self-explanatory.

Note that the renaming on uapi and tracepoint will cause ABI changes,
which we should avoid. Specificly, there are 4 related source files
in this regard:

   * arch/arm64/include/uapi/asm/kvm.h  (struct kvm_debug_exit_arch::hsr)
   * arch/arm64/kvm/handle_exit.c       (struct kvm_debug_exit_arch::hsr)
   * arch/arm64/kvm/trace_arm.h         (tracepoints)
   * arch/arm64/kvm/trace_handle_exit.h (tracepoints)

Signed-off-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Andrew Scull <ascull@google.com>
Link: https://lore.kernel.org/r/20200630015705.103366-1-gshan@redhat.com
2020-07-05 21:57:59 +01:00
James Morse
750ed56693 KVM: arm64: Remove the target table
Finally, remove the target table. Merge the code that checks the
tables into kvm_reset_sys_regs() as there is now only one table.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200622113317.20477-6-james.morse@arm.com
2020-07-05 18:20:45 +01:00
James Morse
dcaffa7bf9 KVM: arm64: Remove target_table from exit handlers
Whenever KVM searches for a register (e.g. due to a guest exit), it
works with two tables, as the target table overrides the sys_regs array.

Now that everything is in the sys_regs array, and the target table is
empty, stop doing that.

Remove the second table and its size from all the functions that take
it.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200622113317.20477-5-james.morse@arm.com
2020-07-05 18:20:45 +01:00
James Morse
af4738290d KVM: arm64: Move ACTLR_EL1 emulation to the sys_reg_descs array
The only entry in the genericv8_sys_regs arrays is for emulation of
ACTLR_EL1. As all targets emulate this in the same way, move it to
sys_reg_descs[].

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200622113317.20477-4-james.morse@arm.com
2020-07-05 18:20:45 +01:00
James Morse
04343ae312 KVM: arm64: Tolerate an empty target_table list
Before emptying the target_table lists, and then removing their
infrastructure, add some tolerance to an empty list.

Instead of bugging-out on an empty list, pretend we already
reached the end in the two-list-walk.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200622113317.20477-3-james.morse@arm.com
2020-07-05 18:20:45 +01:00
James Morse
6b33e0d64f KVM: arm64: Drop the target_table[] indirection
KVM for 32bit arm had a get/set target mechanism to allow for
micro-architecture differences that are visible in system registers
to be described.

KVM's user-space can query the supported targets for a CPU, and
create vCPUs for that target. The target can override the handling
of system registers to provide different reset or RES0 behaviour.
On 32bit arm this was used to provide different ACTLR reset values
for A7 and A15.

On 64bit arm, the first few CPUs out of the gate used this mechanism,
before it was deemed redundant in commit bca556ac46 ("arm64/kvm:
Add generic v8 KVM target"). All future CPUs use the
KVM_ARM_TARGET_GENERIC_V8 target.

The 64bit target_table[] stuff exists to preserve the ABI to
user-space. As all targets registers genericv8_target_table, there
is no reason to look the target up.

Until we can merge genericv8_target_table with the main sys_regs
array, kvm_register_target_sys_reg_table() becomes
kvm_check_target_sys_reg_table(), which uses BUG_ON() in keeping
with the other callers in this file.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200622113317.20477-2-james.morse@arm.com
2020-07-05 18:20:45 +01:00
Paolo Bonzini
49b3deaad3 KVM/arm64 fixes for Linux 5.8, take #1
* 32bit VM fixes:
   - Fix embarassing mapping issue between AArch32 CSSELR and AArch64
     ACTLR
   - Add ACTLR2 support for AArch32
   - Get rid of the useless ACTLR_EL1 save/restore
   - Fix CP14/15 accesses for AArch32 guests on BE hosts
   - Ensure that we don't loose any state when injecting a 32bit
     exception when running on a VHE host
 
 * 64bit VM fixes:
   - Fix PtrAuth host saving happening in preemptible contexts
   - Optimize PtrAuth lazy enable
   - Drop vcpu to cpu context pointer
   - Fix sparse warnings for HYP per-CPU accesses
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl7h6r8PHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDE3gP/iogqGjZasUIwk4gdIc4IaxxNsfTYJFIh5uw
 sedAqwCQg3OftX0jptp6GhI3ZIG5UPuGDM7f3aio6i02pjx6bfBxGJ9AXqNcp6gN
 WcECHsAfzHUScznRhBbVflKkOF4dzfzyiutnMdknihePOyO9drwdvzXuJa37cs52
 tsCneP9xQ/vQWdqu42uPS7HtSepSa/Lf/qeKGaTDWQIvNYGI3PctQvRAxx4FNHc/
 SMUpS5zdTFceVoya/2+azTJ24R1lbwlPwaw2WoaghB+QmREKN8uMKy5kjrO5YUnH
 8BtjESiNBI2CZYSwcxFt+QNA6EmymwDwfrmOE+7iBCZelOLWLVYbJ7icKX3kT731
 gts5PBD8JlZWAnbH/Mbo4qngXJwHaijA38Bt8rvSphI0aK6iOU6DP5BuOurzNRde
 XczDYq3lqdCC2ynROjRpH4paVo7s0sBjjgZ7OsWqsw9uRAogwTkVE2sEi4HdqNAH
 JHhIHEKj7t/bRtzneXVk6ngoezIs6sIdcqrUZ+rAMnmMHbrzBoEqnlrlQ7e2/UXY
 yvY5Yc3/H2pKRCK/KznOi1nVG+xUZp4RZp552pwULF+JVbmMHIOxn3IxiejfMZVx
 czD5cxMcgMWa14ZZRN0DynT9wCg+s+MGaKGR6STyudVYHFBTr7hrsuM1zq/neMQf
 JcUBVUot
 =I2Li
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for Linux 5.8, take #1

* 32bit VM fixes:
  - Fix embarassing mapping issue between AArch32 CSSELR and AArch64
    ACTLR
  - Add ACTLR2 support for AArch32
  - Get rid of the useless ACTLR_EL1 save/restore
  - Fix CP14/15 accesses for AArch32 guests on BE hosts
  - Ensure that we don't loose any state when injecting a 32bit
    exception when running on a VHE host

* 64bit VM fixes:
  - Fix PtrAuth host saving happening in preemptible contexts
  - Optimize PtrAuth lazy enable
  - Drop vcpu to cpu context pointer
  - Fix sparse warnings for HYP per-CPU accesses
2020-06-11 14:02:32 -04:00
Marc Zyngier
15c99816ed Merge branch 'kvm-arm64/ptrauth-fixes' into kvmarm-master/next
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-06-10 19:10:40 +01:00
Marc Zyngier
29eb5a3c57 KVM: arm64: Handle PtrAuth traps early
The current way we deal with PtrAuth is a bit heavy handed:

- We forcefully save the host's keys on each vcpu_load()
- Handling the PtrAuth trap forces us to go all the way back
  to the exit handling code to just set the HCR bits

Overall, this is pretty cumbersome. A better approach would be
to handle it the same way we deal with the FPSIMD registers:

- On vcpu_load() disable PtrAuth for the guest
- On first use, save the host's keys, enable PtrAuth in the
  guest

Crucially, this can happen as a fixup, which is done very early
on exit. We can then reenter the guest immediately without
leaving the hypervisor role.

Another thing is that it simplify the rest of the host handling:
exiting all the way to the host means that the only possible
outcome for this trap is to inject an UNDEF.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-06-09 10:59:52 +01:00
James Morse
e8679fedd0 KVM: arm64: Stop save/restoring ACTLR_EL1
KVM sets HCR_EL2.TACR via HCR_GUEST_FLAGS. This means ACTLR* accesses
from the guest are always trapped, and always return the value in the
sys_regs array.

The guest can't change the value of these registers, so we are
save restoring the reset value, which came from the host.

Stop save/restoring this register. Keep the storage for this register
in sys_regs[] as this is how the value is exposed to user-space,
removing it would break migration.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200529150656.7339-4-james.morse@arm.com
2020-06-09 09:07:58 +01:00
James Morse
7c582bf4ed KVM: arm64: Stop writing aarch32's CSSELR into ACTLR
aarch32 has pairs of registers to access the high and low parts of 64bit
registers. KVM has a union of 64bit sys_regs[] and 32bit copro[]. The
32bit accessors read the high or low part of the 64bit sys_reg[] value
through the union.

Both sys_reg_descs[] and cp15_regs[] list access_csselr() as the accessor
for CSSELR{,_EL1}. access_csselr() is only aware of the 64bit sys_regs[],
and expects r->reg to be 'CSSELR_EL1' in the enum, index 2 of the 64bit
array.

cp15_regs[] uses the 32bit copro[] alias of sys_regs[]. Here CSSELR is
c0_CSSELR which is the same location in sys_reg[]. r->reg is 'c0_CSSELR',
index 4 in the 32bit array.

access_csselr() uses the 32bit r->reg value to access the 64bit array,
so reads and write the wrong value. sys_regs[4], is ACTLR_EL1, which
is subsequently save/restored when we enter the guest.

ACTLR_EL1 is supposed to be read-only for the guest. This register
only affects execution at EL1, and the host's value is restored before
we return to host EL1.

Convert the 32bit register index back to the 64bit version.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200529150656.7339-2-james.morse@arm.com
2020-06-09 09:04:42 +01:00
Linus Torvalds
039aeb9deb ARM:
- Move the arch-specific code into arch/arm64/kvm
 - Start the post-32bit cleanup
 - Cherry-pick a few non-invasive pre-NV patches
 
 x86:
 - Rework of TLB flushing
 - Rework of event injection, especially with respect to nested virtualization
 - Nested AMD event injection facelift, building on the rework of generic code
 and fixing a lot of corner cases
 - Nested AMD live migration support
 - Optimization for TSC deadline MSR writes and IPIs
 - Various cleanups
 - Asynchronous page fault cleanups (from tglx, common topic branch with tip tree)
 - Interrupt-based delivery of asynchronous "page ready" events (host side)
 - Hyper-V MSRs and hypercalls for guest debugging
 - VMX preemption timer fixes
 
 s390:
 - Cleanups
 
 Generic:
 - switch vCPU thread wakeup from swait to rcuwait
 
 The other architectures, and the guest side of the asynchronous page fault
 work, will come next week.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl7VJcYUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPf6QgAq4wU5wdd1lTGz/i3DIhNVJNJgJlp
 ozLzRdMaJbdbn5RpAK6PEBd9+pt3+UlojpFB3gpJh2Nazv2OzV4yLQgXXXyyMEx1
 5Hg7b4UCJYDrbkCiegNRv7f/4FWDkQ9dx++RZITIbxeskBBCEI+I7GnmZhGWzuC4
 7kj4ytuKAySF2OEJu0VQF6u0CvrNYfYbQIRKBXjtOwuRK4Q6L63FGMJpYo159MBQ
 asg3B1jB5TcuGZ9zrjL5LkuzaP4qZZHIRs+4kZsH9I6MODHGUxKonrkablfKxyKy
 CFK+iaHCuEXXty5K0VmWM3nrTfvpEjVjbMc7e1QGBQ5oXsDM0pqn84syRg==
 =v7Wn
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM:
   - Move the arch-specific code into arch/arm64/kvm

   - Start the post-32bit cleanup

   - Cherry-pick a few non-invasive pre-NV patches

  x86:
   - Rework of TLB flushing

   - Rework of event injection, especially with respect to nested
     virtualization

   - Nested AMD event injection facelift, building on the rework of
     generic code and fixing a lot of corner cases

   - Nested AMD live migration support

   - Optimization for TSC deadline MSR writes and IPIs

   - Various cleanups

   - Asynchronous page fault cleanups (from tglx, common topic branch
     with tip tree)

   - Interrupt-based delivery of asynchronous "page ready" events (host
     side)

   - Hyper-V MSRs and hypercalls for guest debugging

   - VMX preemption timer fixes

  s390:
   - Cleanups

  Generic:
   - switch vCPU thread wakeup from swait to rcuwait

  The other architectures, and the guest side of the asynchronous page
  fault work, will come next week"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (256 commits)
  KVM: selftests: fix rdtsc() for vmx_tsc_adjust_test
  KVM: check userspace_addr for all memslots
  KVM: selftests: update hyperv_cpuid with SynDBG tests
  x86/kvm/hyper-v: Add support for synthetic debugger via hypercalls
  x86/kvm/hyper-v: enable hypercalls regardless of hypercall page
  x86/kvm/hyper-v: Add support for synthetic debugger interface
  x86/hyper-v: Add synthetic debugger definitions
  KVM: selftests: VMX preemption timer migration test
  KVM: nVMX: Fix VMX preemption timer migration
  x86/kvm/hyper-v: Explicitly align hcall param for kvm_hyperv_exit
  KVM: x86/pmu: Support full width counting
  KVM: x86/pmu: Tweak kvm_pmu_get_msr to pass 'struct msr_data' in
  KVM: x86: announce KVM_FEATURE_ASYNC_PF_INT
  KVM: x86: acknowledgment mechanism for async pf page ready notifications
  KVM: x86: interrupt based APF 'page ready' event delivery
  KVM: introduce kvm_read_guest_offset_cached()
  KVM: rename kvm_arch_can_inject_async_page_present() to kvm_arch_can_dequeue_async_page_present()
  KVM: x86: extend struct kvm_vcpu_pv_apf_data with token info
  Revert "KVM: async_pf: Fix #DF due to inject "Page not Present" and "Page Ready" exceptions simultaneously"
  KVM: VMX: Replace zero-length array with flexible-array
  ...
2020-06-03 15:13:47 -07:00
Marc Zyngier
bb44a8dbea KVM: arm64: Move sysreg reset check to boot time
Our sysreg reset check has become a bit silly, as it only checks whether
a reset callback actually exists for a given sysreg entry, and apply the
method if available. Doing the check at each vcpu reset is pretty dumb,
as the tables never change. It is thus perfectly possible to do the same
checks at boot time.

This also allows us to introduce a sparse sys_regs[] array, something
that will be required with ARMv8.4-NV.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-05-28 11:57:10 +01:00
Marc Zyngier
7ccadf23b8 KVM: arm64: Add missing reset handlers for PMU emulation
As we're about to become a bit more harsh when it comes to the lack of
reset callbacks, let's add the missing PMU reset handlers. Note that
these only cover *CLR registers that were always covered by their *SET
counterpart, so there is no semantic change here.

Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-05-28 11:57:10 +01:00
Marc Zyngier
7ea90bdd70 KVM: arm64: Refactor vcpu_{read,write}_sys_reg
Extract the direct HW accessors for later reuse.

Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-05-28 11:57:10 +01:00
Anshuman Khandual
152accf847 arm64/cpufeature: Introduce ID_MMFR5 CPU register
This adds basic building blocks required for ID_MMFR5 CPU register which
provides information about the implemented memory model and memory
management support in AArch32 state. This is added per ARM DDI 0487F.a
specification.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org

Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/1589881254-10082-7-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-05-21 15:47:11 +01:00
Anshuman Khandual
dd35ec0704 arm64/cpufeature: Introduce ID_DFR1 CPU register
This adds basic building blocks required for ID_DFR1 CPU register which
provides top level information about the debug system in AArch32 state.
We hide the register from KVM guests, as we don't emulate the 'MTPMU'
feature.

This is added per ARM DDI 0487F.a specification.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org

Suggested-by: Will Deacon <will@kernel.org>
Reviewed-by : Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/1589881254-10082-6-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-05-21 15:47:11 +01:00
Anshuman Khandual
16824085a7 arm64/cpufeature: Introduce ID_PFR2 CPU register
This adds basic building blocks required for ID_PFR2 CPU register which
provides information about the AArch32 programmers model which must be
interpreted along with ID_PFR0 and ID_PFR1 CPU registers. This is added
per ARM DDI 0487F.a specification.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/1589881254-10082-5-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-05-21 15:47:11 +01:00
Fuad Tabba
656012c731 KVM: Fix spelling in code comments
Fix spelling and typos (e.g., repeated words) in comments.

Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200401140310.29701-1-tabba@google.com
2020-05-16 15:05:01 +01:00
Linus Torvalds
8c1b724ddb ARM:
* GICv4.1 support
 * 32bit host removal
 
 PPC:
 * secure (encrypted) using under the Protected Execution Framework
 ultravisor
 
 s390:
 * allow disabling GISA (hardware interrupt injection) and protected
 VMs/ultravisor support.
 
 x86:
 * New dirty bitmap flag that sets all bits in the bitmap when dirty
 page logging is enabled; this is faster because it doesn't require bulk
 modification of the page tables.
 * Initial work on making nested SVM event injection more similar to VMX,
 and less buggy.
 * Various cleanups to MMU code (though the big ones and related
 optimizations were delayed to 5.8).  Instead of using cr3 in function
 names which occasionally means eptp, KVM too has standardized on "pgd".
 * A large refactoring of CPUID features, which now use an array that
 parallels the core x86_features.
 * Some removal of pointer chasing from kvm_x86_ops, which will also be
 switched to static calls as soon as they are available.
 * New Tigerlake CPUID features.
 * More bugfixes, optimizations and cleanups.
 
 Generic:
 * selftests: cleanups, new MMU notifier stress test, steal-time test
 * CSV output for kvm_stat.
 
 KVM/MIPS has been broken since 5.5, it does not compile due to a patch committed
 by MIPS maintainers.  I had already prepared a fix, but the MIPS maintainers
 prefer to fix it in generic code rather than KVM so they are taking care of it.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl6GOnIUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMfxwf/ZKLZiRoaovXCOG71M/eHtQb8ZIqU
 3MPy+On3eC5Sk/aBxWUL9EFZsbYG6kYdbZ1VOvG9XPBoLlnkDSm/IR0kaELHtnjj
 oGVda/tvGn46Ne39y8xBptmb91WDcWH0vFthT/CwlMxAw3xjr+gG7Qyo+8F2CW6m
 SSSuLiHSBnyO1cQKruBTHZ8qnR8LlnfXEqtd6Y4LFLic0LbLIoIdRcT3wjQrcZrm
 Djd7wbTEYZjUfoqZ72ekwEDUsONcDLDSKcguDO9pSMSCGhpxCVT5Vy68KRpoIMs2
 nzNWDKjvqQo5zb2+GWxJgkd12Hv+n7PCXZMbVrWBu1pQsewUns9m4mkpGw==
 =6fGt
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM:
   - GICv4.1 support

   - 32bit host removal

  PPC:
   - secure (encrypted) using under the Protected Execution Framework
     ultravisor

  s390:
   - allow disabling GISA (hardware interrupt injection) and protected
     VMs/ultravisor support.

  x86:
   - New dirty bitmap flag that sets all bits in the bitmap when dirty
     page logging is enabled; this is faster because it doesn't require
     bulk modification of the page tables.

   - Initial work on making nested SVM event injection more similar to
     VMX, and less buggy.

   - Various cleanups to MMU code (though the big ones and related
     optimizations were delayed to 5.8). Instead of using cr3 in
     function names which occasionally means eptp, KVM too has
     standardized on "pgd".

   - A large refactoring of CPUID features, which now use an array that
     parallels the core x86_features.

   - Some removal of pointer chasing from kvm_x86_ops, which will also
     be switched to static calls as soon as they are available.

   - New Tigerlake CPUID features.

   - More bugfixes, optimizations and cleanups.

  Generic:
   - selftests: cleanups, new MMU notifier stress test, steal-time test

   - CSV output for kvm_stat"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (277 commits)
  x86/kvm: fix a missing-prototypes "vmread_error"
  KVM: x86: Fix BUILD_BUG() in __cpuid_entry_get_reg() w/ CONFIG_UBSAN=y
  KVM: VMX: Add a trampoline to fix VMREAD error handling
  KVM: SVM: Annotate svm_x86_ops as __initdata
  KVM: VMX: Annotate vmx_x86_ops as __initdata
  KVM: x86: Drop __exit from kvm_x86_ops' hardware_unsetup()
  KVM: x86: Copy kvm_x86_ops by value to eliminate layer of indirection
  KVM: x86: Set kvm_x86_ops only after ->hardware_setup() completes
  KVM: VMX: Configure runtime hooks using vmx_x86_ops
  KVM: VMX: Move hardware_setup() definition below vmx_x86_ops
  KVM: x86: Move init-only kvm_x86_ops to separate struct
  KVM: Pass kvm_init()'s opaque param to additional arch funcs
  s390/gmap: return proper error code on ksm unsharing
  KVM: selftests: Fix cosmetic copy-paste error in vm_mem_region_move()
  KVM: Fix out of range accesses to memslots
  KVM: X86: Micro-optimize IPI fastpath delay
  KVM: X86: Delay read msr data iff writes ICR MSR
  KVM: PPC: Book3S HV: Add a capability for enabling secure guests
  KVM: arm64: GICv4.1: Expose HW-based SGIs in debugfs
  KVM: arm64: GICv4.1: Allow non-trapping WFI when using HW SGIs
  ...
2020-04-02 15:13:15 -07:00
Catalin Marinas
da12d2739f Merge branches 'for-next/memory-hotremove', 'for-next/arm_sdei', 'for-next/amu', 'for-next/final-cap-helper', 'for-next/cpu_ops-cleanup', 'for-next/misc' and 'for-next/perf' into for-next/core
* for-next/memory-hotremove:
  : Memory hot-remove support for arm64
  arm64/mm: Enable memory hot remove
  arm64/mm: Hold memory hotplug lock while walking for kernel page table dump

* for-next/arm_sdei:
  : SDEI: fix double locking on return from hibernate and clean-up
  firmware: arm_sdei: clean up sdei_event_create()
  firmware: arm_sdei: Use cpus_read_lock() to avoid races with cpuhp
  firmware: arm_sdei: fix possible double-lock on hibernate error path
  firmware: arm_sdei: fix double-lock on hibernate with shared events

* for-next/amu:
  : ARMv8.4 Activity Monitors support
  clocksource/drivers/arm_arch_timer: validate arch_timer_rate
  arm64: use activity monitors for frequency invariance
  cpufreq: add function to get the hardware max frequency
  Documentation: arm64: document support for the AMU extension
  arm64/kvm: disable access to AMU registers from kvm guests
  arm64: trap to EL1 accesses to AMU counters from EL0
  arm64: add support for the AMU extension v1

* for-next/final-cap-helper:
  : Introduce cpus_have_final_cap_helper(), migrate arm64 KVM to it
  arm64: kvm: hyp: use cpus_have_final_cap()
  arm64: cpufeature: add cpus_have_final_cap()

* for-next/cpu_ops-cleanup:
  : cpu_ops[] access code clean-up
  arm64: Introduce get_cpu_ops() helper function
  arm64: Rename cpu_read_ops() to init_cpu_ops()
  arm64: Declare ACPI parking protocol CPU operation if needed

* for-next/misc:
  : Various fixes and clean-ups
  arm64: define __alloc_zeroed_user_highpage
  arm64/kernel: Simplify __cpu_up() by bailing out early
  arm64: remove redundant blank for '=' operator
  arm64: kexec_file: Fixed code style.
  arm64: add blank after 'if'
  arm64: fix spelling mistake "ca not" -> "cannot"
  arm64: entry: unmask IRQ in el0_sp()
  arm64: efi: add efi-entry.o to targets instead of extra-$(CONFIG_EFI)
  arm64: csum: Optimise IPv6 header checksum
  arch/arm64: fix typo in a comment
  arm64: remove gratuitious/stray .ltorg stanzas
  arm64: Update comment for ASID() macro
  arm64: mm: convert cpu_do_switch_mm() to C
  arm64: fix NUMA Kconfig typos

* for-next/perf:
  : arm64 perf updates
  arm64: perf: Add support for ARMv8.5-PMU 64-bit counters
  KVM: arm64: limit PMU version to PMUv3 for ARMv8.1
  arm64: cpufeature: Extract capped perfmon fields
  arm64: perf: Clean up enable/disable calls
  perf: arm-ccn: Use scnprintf() for robustness
  arm64: perf: Support new DT compatibles
  arm64: perf: Refactor PMU init callbacks
  perf: arm_spe: Remove unnecessary zero check on 'nr_pages'
2020-03-25 11:10:32 +00:00
Andrew Murray
c854188ea0 KVM: arm64: limit PMU version to PMUv3 for ARMv8.1
We currently expose the PMU version of the host to the guest via
emulation of the DFR0_EL1 and AA64DFR0_EL1 debug feature registers.
However many of the features offered beyond PMUv3 for 8.1 are not
supported in KVM. Examples of this include support for the PMMIR
registers (added in PMUv3 for ARMv8.4) and 64-bit event counters
added in (PMUv3 for ARMv8.5).

Let's trap the Debug Feature Registers in order to limit
PMUVer/PerfMon in the Debug Feature Registers to PMUv3 for ARMv8.1
to avoid unexpected behaviour.

Both ID_AA64DFR0.PMUVer and ID_DFR0.PerfMon follow the "Alternative ID
scheme used for the Performance Monitors Extension version" where 0xF
means an IMPLEMENTATION DEFINED PMU is implemented, and values 0x0-0xE
are treated as with an unsigned field (with 0x0 meaning no PMU is
present). As we don't expect to expose an IMPLEMENTATION DEFINED PMU,
and our cap is below 0xF, we can treat these fields as unsigned when
applying the cap.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
[Mark: make field names consistent, use perfmon cap]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-03-17 22:46:14 +00:00
Peter Xu
4d39576259 KVM: Remove unnecessary asm/kvm_host.h includes
Remove includes of asm/kvm_host.h from files that already include
linux/kvm_host.h to make it more obvious that there is no ordering issue
between the two headers.  linux/kvm_host.h includes asm/kvm_host.h to
pick up architecture specific settings, and this will never change, i.e.
including asm/kvm_host.h after linux/kvm_host.h may seem problematic,
but in practice is simply redundant.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16 17:57:34 +01:00
Ionela Voinescu
4fcdf106a4 arm64/kvm: disable access to AMU registers from kvm guests
Access to the AMU counters should be disabled by default in kvm guests,
as information from the counters might reveal activity in other guests
or activity on the host.

Therefore, disable access to AMU registers from EL0 and EL1 in kvm
guests by:
 - Hiding the presence of the extension in the feature register
   (SYS_ID_AA64PFR0_EL1) on the VCPU.
 - Disabling access to the AMU registers before switching to the guest.
 - Trapping accesses and injecting an undefined instruction into the
   guest.

Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-03-06 16:02:50 +00:00