Commit Graph

1914 Commits

Author SHA1 Message Date
Linus Torvalds
63eb28bb14 ARM:
- Host driver for GICv5, the next generation interrupt controller for
   arm64, including support for interrupt routing, MSIs, interrupt
   translation and wired interrupts.
 
 - Use FEAT_GCIE_LEGACY on GICv5 systems to virtualize GICv3 VMs on
   GICv5 hardware, leveraging the legacy VGIC interface.
 
 - Userspace control of the 'nASSGIcap' GICv3 feature, allowing
   userspace to disable support for SGIs w/o an active state on hardware
   that previously advertised it unconditionally.
 
 - Map supporting endpoints with cacheable memory attributes on systems
   with FEAT_S2FWB and DIC where KVM no longer needs to perform cache
   maintenance on the address range.
 
 - Nested support for FEAT_RAS and FEAT_DoubleFault2, allowing the guest
   hypervisor to inject external aborts into an L2 VM and take traps of
   masked external aborts to the hypervisor.
 
 - Convert more system register sanitization to the config-driven
   implementation.
 
 - Fixes to the visibility of EL2 registers, namely making VGICv3 system
   registers accessible through the VGIC device instead of the ONE_REG
   vCPU ioctls.
 
 - Various cleanups and minor fixes.
 
 LoongArch:
 
 - Add stat information for in-kernel irqchip
 
 - Add tracepoints for CPUCFG and CSR emulation exits
 
 - Enhance in-kernel irqchip emulation
 
 - Various cleanups.
 
 RISC-V:
 
 - Enable ring-based dirty memory tracking
 
 - Improve perf kvm stat to report interrupt events
 
 - Delegate illegal instruction trap to VS-mode
 
 - MMU improvements related to upcoming nested virtualization
 
 s390x
 
 - Fixes
 
 x86:
 
 - Add CONFIG_KVM_IOAPIC for x86 to allow disabling support for I/O APIC,
   PIC, and PIT emulation at compile time.
 
 - Share device posted IRQ code between SVM and VMX and
   harden it against bugs and runtime errors.
 
 - Use vcpu_idx, not vcpu_id, for GA log tag/metadata, to make lookups O(1)
   instead of O(n).
 
 - For MMIO stale data mitigation, track whether or not a vCPU has access to
   (host) MMIO based on whether the page tables have MMIO pfns mapped; using
   VFIO is prone to false negatives
 
 - Rework the MSR interception code so that the SVM and VMX APIs are more or
   less identical.
 
 - Recalculate all MSR intercepts from scratch on MSR filter changes,
   instead of maintaining shadow bitmaps.
 
 - Advertise support for LKGS (Load Kernel GS base), a new instruction
   that's loosely related to FRED, but is supported and enumerated
   independently.
 
 - Fix a user-triggerable WARN that syzkaller found by setting the vCPU
   in INIT_RECEIVED state (aka wait-for-SIPI), and then putting the vCPU
   into VMX Root Mode (post-VMXON).  Trying to detect every possible path
   leading to architecturally forbidden states is hard and even risks
   breaking userspace (if it goes from valid to valid state but passes
   through invalid states), so just wait until KVM_RUN to detect that
   the vCPU state isn't allowed.
 
 - Add KVM_X86_DISABLE_EXITS_APERFMPERF to allow disabling interception of
   APERF/MPERF reads, so that a "properly" configured VM can access
   APERF/MPERF.  This has many caveats (APERF/MPERF cannot be zeroed
   on vCPU creation or saved/restored on suspend and resume, or preserved
   over thread migration let alone VM migration) but can be useful whenever
   you're interested in letting Linux guests see the effective physical CPU
   frequency in /proc/cpuinfo.
 
 - Reject KVM_SET_TSC_KHZ for vm file descriptors if vCPUs have been
   created, as there's no known use case for changing the default
   frequency for other VM types and it goes counter to the very reason
   why the ioctl was added to the vm file descriptor.  And also, there
   would be no way to make it work for confidential VMs with a "secure"
   TSC, so kill two birds with one stone.
 
 - Dynamically allocation the shadow MMU's hashed page list, and defer
   allocating the hashed list until it's actually needed (the TDP MMU
   doesn't use the list).
 
 - Extract many of KVM's helpers for accessing architectural local APIC
   state to common x86 so that they can be shared by guest-side code for
   Secure AVIC.
 
 - Various cleanups and fixes.
 
 x86 (Intel):
 
 - Preserve the host's DEBUGCTL.FREEZE_IN_SMM when running the guest.
   Failure to honor FREEZE_IN_SMM can leak host state into guests.
 
 - Explicitly check vmcs12.GUEST_DEBUGCTL on nested VM-Enter to prevent
   L1 from running L2 with features that KVM doesn't support, e.g. BTF.
 
 x86 (AMD):
 
 - WARN and reject loading kvm-amd.ko instead of panicking the kernel if the
   nested SVM MSRPM offsets tracker can't handle an MSR (which is pretty
   much a static condition and therefore should never happen, but still).
 
 - Fix a variety of flaws and bugs in the AVIC device posted IRQ code.
 
 - Inhibit AVIC if a vCPU's ID is too big (relative to what hardware
   supports) instead of rejecting vCPU creation.
 
 - Extend enable_ipiv module param support to SVM, by simply leaving
   IsRunning clear in the vCPU's physical ID table entry.
 
 - Disable IPI virtualization, via enable_ipiv, if the CPU is affected by
   erratum #1235, to allow (safely) enabling AVIC on such CPUs.
 
 - Request GA Log interrupts if and only if the target vCPU is blocking,
   i.e. only if KVM needs a notification in order to wake the vCPU.
 
 - Intercept SPEC_CTRL on AMD if the MSR shouldn't exist according to the
   vCPU's CPUID model.
 
 - Accept any SNP policy that is accepted by the firmware with respect to
   SMT and single-socket restrictions.  An incompatible policy doesn't put
   the kernel at risk in any way, so there's no reason for KVM to care.
 
 - Drop a superfluous WBINVD (on all CPUs!) when destroying a VM and
   use WBNOINVD instead of WBINVD when possible for SEV cache maintenance.
 
 - When reclaiming memory from an SEV guest, only do cache flushes on CPUs
   that have ever run a vCPU for the guest, i.e. don't flush the caches for
   CPUs that can't possibly have cache lines with dirty, encrypted data.
 
 Generic:
 
 - Rework irqbypass to track/match producers and consumers via an xarray
   instead of a linked list.  Using a linked list leads to O(n^2) insertion
   times, which is hugely problematic for use cases that create large
   numbers of VMs.  Such use cases typically don't actually use irqbypass,
   but eliminating the pointless registration is a future problem to
   solve as it likely requires new uAPI.
 
 - Track irqbypass's "token" as "struct eventfd_ctx *" instead of a "void *",
   to avoid making a simple concept unnecessarily difficult to understand.
 
 - Decouple device posted IRQs from VFIO device assignment, as binding a VM
   to a VFIO group is not a requirement for enabling device posted IRQs.
 
 - Clean up and document/comment the irqfd assignment code.
 
 - Disallow binding multiple irqfds to an eventfd with a priority waiter,
   i.e.  ensure an eventfd is bound to at most one irqfd through the entire
   host, and add a selftest to verify eventfd:irqfd bindings are globally
   unique.
 
 - Add a tracepoint for KVM_SET_MEMORY_ATTRIBUTES to help debug issues
   related to private <=> shared memory conversions.
 
 - Drop guest_memfd's .getattr() implementation as the VFS layer will call
   generic_fillattr() if inode_operations.getattr is NULL.
 
 - Fix issues with dirty ring harvesting where KVM doesn't bound the
   processing of entries in any way, which allows userspace to keep KVM
   in a tight loop indefinitely.
 
 - Kill off kvm_arch_{start,end}_assignment() and x86's associated tracking,
   now that KVM no longer uses assigned_device_count as a heuristic for
   either irqbypass usage or MDS mitigation.
 
 Selftests:
 
 - Fix a comment typo.
 
 - Verify KVM is loaded when getting any KVM module param so that attempting
   to run a selftest without kvm.ko loaded results in a SKIP message about
   KVM not being loaded/enabled (versus some random parameter not existing).
 
 - Skip tests that hit EACCES when attempting to access a file, and rpint
   a "Root required?" help message.  In most cases, the test just needs to
   be run with elevated permissions.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmiKXMgUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMhMQf/QDhC/CP1aGXph2whuyeD2NMqPKiU
 9KdnDNST+ftPwjg9QxZ9mTaa8zeVz/wly6XlxD9OQHy+opM1wcys3k0GZAFFEEQm
 YrThgURdzEZ3nwJZgb+m0t4wjJQtpiFIBwAf7qq6z1VrqQBEmHXJ/8QxGuqO+BNC
 j5q/X+q6KZwehKI6lgFBrrOKWFaxqhnRAYfW6rGBxRXxzTJuna37fvDpodQnNceN
 zOiq+avfriUMArTXTqOteJNKU0229HjiPSnjILLnFQ+B3akBlwNG0jk7TMaAKR6q
 IZWG1EIS9q1BAkGXaw6DE1y6d/YwtXCR5qgAIkiGwaPt5yj9Oj6kRN2Ytw==
 =j2At
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM:

   - Host driver for GICv5, the next generation interrupt controller for
     arm64, including support for interrupt routing, MSIs, interrupt
     translation and wired interrupts

   - Use FEAT_GCIE_LEGACY on GICv5 systems to virtualize GICv3 VMs on
     GICv5 hardware, leveraging the legacy VGIC interface

   - Userspace control of the 'nASSGIcap' GICv3 feature, allowing
     userspace to disable support for SGIs w/o an active state on
     hardware that previously advertised it unconditionally

   - Map supporting endpoints with cacheable memory attributes on
     systems with FEAT_S2FWB and DIC where KVM no longer needs to
     perform cache maintenance on the address range

   - Nested support for FEAT_RAS and FEAT_DoubleFault2, allowing the
     guest hypervisor to inject external aborts into an L2 VM and take
     traps of masked external aborts to the hypervisor

   - Convert more system register sanitization to the config-driven
     implementation

   - Fixes to the visibility of EL2 registers, namely making VGICv3
     system registers accessible through the VGIC device instead of the
     ONE_REG vCPU ioctls

   - Various cleanups and minor fixes

  LoongArch:

   - Add stat information for in-kernel irqchip

   - Add tracepoints for CPUCFG and CSR emulation exits

   - Enhance in-kernel irqchip emulation

   - Various cleanups

  RISC-V:

   - Enable ring-based dirty memory tracking

   - Improve perf kvm stat to report interrupt events

   - Delegate illegal instruction trap to VS-mode

   - MMU improvements related to upcoming nested virtualization

  s390x

   - Fixes

  x86:

   - Add CONFIG_KVM_IOAPIC for x86 to allow disabling support for I/O
     APIC, PIC, and PIT emulation at compile time

   - Share device posted IRQ code between SVM and VMX and harden it
     against bugs and runtime errors

   - Use vcpu_idx, not vcpu_id, for GA log tag/metadata, to make lookups
     O(1) instead of O(n)

   - For MMIO stale data mitigation, track whether or not a vCPU has
     access to (host) MMIO based on whether the page tables have MMIO
     pfns mapped; using VFIO is prone to false negatives

   - Rework the MSR interception code so that the SVM and VMX APIs are
     more or less identical

   - Recalculate all MSR intercepts from scratch on MSR filter changes,
     instead of maintaining shadow bitmaps

   - Advertise support for LKGS (Load Kernel GS base), a new instruction
     that's loosely related to FRED, but is supported and enumerated
     independently

   - Fix a user-triggerable WARN that syzkaller found by setting the
     vCPU in INIT_RECEIVED state (aka wait-for-SIPI), and then putting
     the vCPU into VMX Root Mode (post-VMXON). Trying to detect every
     possible path leading to architecturally forbidden states is hard
     and even risks breaking userspace (if it goes from valid to valid
     state but passes through invalid states), so just wait until
     KVM_RUN to detect that the vCPU state isn't allowed

   - Add KVM_X86_DISABLE_EXITS_APERFMPERF to allow disabling
     interception of APERF/MPERF reads, so that a "properly" configured
     VM can access APERF/MPERF. This has many caveats (APERF/MPERF
     cannot be zeroed on vCPU creation or saved/restored on suspend and
     resume, or preserved over thread migration let alone VM migration)
     but can be useful whenever you're interested in letting Linux
     guests see the effective physical CPU frequency in /proc/cpuinfo

   - Reject KVM_SET_TSC_KHZ for vm file descriptors if vCPUs have been
     created, as there's no known use case for changing the default
     frequency for other VM types and it goes counter to the very reason
     why the ioctl was added to the vm file descriptor. And also, there
     would be no way to make it work for confidential VMs with a
     "secure" TSC, so kill two birds with one stone

   - Dynamically allocation the shadow MMU's hashed page list, and defer
     allocating the hashed list until it's actually needed (the TDP MMU
     doesn't use the list)

   - Extract many of KVM's helpers for accessing architectural local
     APIC state to common x86 so that they can be shared by guest-side
     code for Secure AVIC

   - Various cleanups and fixes

  x86 (Intel):

   - Preserve the host's DEBUGCTL.FREEZE_IN_SMM when running the guest.
     Failure to honor FREEZE_IN_SMM can leak host state into guests

   - Explicitly check vmcs12.GUEST_DEBUGCTL on nested VM-Enter to
     prevent L1 from running L2 with features that KVM doesn't support,
     e.g. BTF

  x86 (AMD):

   - WARN and reject loading kvm-amd.ko instead of panicking the kernel
     if the nested SVM MSRPM offsets tracker can't handle an MSR (which
     is pretty much a static condition and therefore should never
     happen, but still)

   - Fix a variety of flaws and bugs in the AVIC device posted IRQ code

   - Inhibit AVIC if a vCPU's ID is too big (relative to what hardware
     supports) instead of rejecting vCPU creation

   - Extend enable_ipiv module param support to SVM, by simply leaving
     IsRunning clear in the vCPU's physical ID table entry

   - Disable IPI virtualization, via enable_ipiv, if the CPU is affected
     by erratum #1235, to allow (safely) enabling AVIC on such CPUs

   - Request GA Log interrupts if and only if the target vCPU is
     blocking, i.e. only if KVM needs a notification in order to wake
     the vCPU

   - Intercept SPEC_CTRL on AMD if the MSR shouldn't exist according to
     the vCPU's CPUID model

   - Accept any SNP policy that is accepted by the firmware with respect
     to SMT and single-socket restrictions. An incompatible policy
     doesn't put the kernel at risk in any way, so there's no reason for
     KVM to care

   - Drop a superfluous WBINVD (on all CPUs!) when destroying a VM and
     use WBNOINVD instead of WBINVD when possible for SEV cache
     maintenance

   - When reclaiming memory from an SEV guest, only do cache flushes on
     CPUs that have ever run a vCPU for the guest, i.e. don't flush the
     caches for CPUs that can't possibly have cache lines with dirty,
     encrypted data

  Generic:

   - Rework irqbypass to track/match producers and consumers via an
     xarray instead of a linked list. Using a linked list leads to
     O(n^2) insertion times, which is hugely problematic for use cases
     that create large numbers of VMs. Such use cases typically don't
     actually use irqbypass, but eliminating the pointless registration
     is a future problem to solve as it likely requires new uAPI

   - Track irqbypass's "token" as "struct eventfd_ctx *" instead of a
     "void *", to avoid making a simple concept unnecessarily difficult
     to understand

   - Decouple device posted IRQs from VFIO device assignment, as binding
     a VM to a VFIO group is not a requirement for enabling device
     posted IRQs

   - Clean up and document/comment the irqfd assignment code

   - Disallow binding multiple irqfds to an eventfd with a priority
     waiter, i.e. ensure an eventfd is bound to at most one irqfd
     through the entire host, and add a selftest to verify eventfd:irqfd
     bindings are globally unique

   - Add a tracepoint for KVM_SET_MEMORY_ATTRIBUTES to help debug issues
     related to private <=> shared memory conversions

   - Drop guest_memfd's .getattr() implementation as the VFS layer will
     call generic_fillattr() if inode_operations.getattr is NULL

   - Fix issues with dirty ring harvesting where KVM doesn't bound the
     processing of entries in any way, which allows userspace to keep
     KVM in a tight loop indefinitely

   - Kill off kvm_arch_{start,end}_assignment() and x86's associated
     tracking, now that KVM no longer uses assigned_device_count as a
     heuristic for either irqbypass usage or MDS mitigation

  Selftests:

   - Fix a comment typo

   - Verify KVM is loaded when getting any KVM module param so that
     attempting to run a selftest without kvm.ko loaded results in a
     SKIP message about KVM not being loaded/enabled (versus some random
     parameter not existing)

   - Skip tests that hit EACCES when attempting to access a file, and
     print a "Root required?" help message. In most cases, the test just
     needs to be run with elevated permissions"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (340 commits)
  Documentation: KVM: Use unordered list for pre-init VGIC registers
  RISC-V: KVM: Avoid re-acquiring memslot in kvm_riscv_gstage_map()
  RISC-V: KVM: Use find_vma_intersection() to search for intersecting VMAs
  RISC-V: perf/kvm: Add reporting of interrupt events
  RISC-V: KVM: Enable ring-based dirty memory tracking
  RISC-V: KVM: Fix inclusion of Smnpm in the guest ISA bitmap
  RISC-V: KVM: Delegate illegal instruction fault to VS mode
  RISC-V: KVM: Pass VMID as parameter to kvm_riscv_hfence_xyz() APIs
  RISC-V: KVM: Factor-out g-stage page table management
  RISC-V: KVM: Add vmid field to struct kvm_riscv_hfence
  RISC-V: KVM: Introduce struct kvm_gstage_mapping
  RISC-V: KVM: Factor-out MMU related declarations into separate headers
  RISC-V: KVM: Use ncsr_xyz() in kvm_riscv_vcpu_trap_redirect()
  RISC-V: KVM: Implement kvm_arch_flush_remote_tlbs_range()
  RISC-V: KVM: Don't flush TLB when PTE is unchanged
  RISC-V: KVM: Replace KVM_REQ_HFENCE_GVMA_VMID_ALL with KVM_REQ_TLB_FLUSH
  RISC-V: KVM: Rename and move kvm_riscv_local_tlb_sanitize()
  RISC-V: KVM: Drop the return value of kvm_riscv_vcpu_aia_init()
  RISC-V: KVM: Check kvm_riscv_vcpu_alloc_vector_context() return value
  KVM: arm64: selftests: Add FEAT_RAS EL2 registers to get-reg-list
  ...
2025-07-30 17:14:01 -07:00
Linus Torvalds
6fb44438a5 arm64 updates for 6.17:
Perf and PMU updates:
 
  - Add support for new (v3) Hisilicon SLLC and DDRC PMUs
 
  - Add support for Arm-NI PMU integrations that share interrupts between
    clock domains within a given instance
 
  - Allow SPE to be configured with a lower sample period than the
    minimum recommendation advertised by PMSIDR_EL1.Interval
 
  - Add suppport for Arm's "Branch Record Buffer Extension" (BRBE)
 
  - Adjust the perf watchdog period according to cpu frequency changes
 
  - Minor driver fixes and cleanups
 
 Hardware features:
 
  - Support for MTE store-only checking (FEAT_MTE_STORE_ONLY)
 
  - Support for reporting the non-address bits during a synchronous MTE
    tag check fault (FEAT_MTE_TAGGED_FAR)
 
  - Optimise the TLBI when folding/unfolding contiguous PTEs on hardware
    with FEAT_BBM (break-before-make) level 2 and no TLB conflict aborts
 
 Software features:
 
  - Enable HAVE_LIVEPATCH after implementing arch_stack_walk_reliable()
    and using the text-poke API for late module relocations
 
  - Force VMAP_STACK always on and change arm64_efi_rt_init() to use
    arch_alloc_vmap_stack() in order to avoid KASAN false positives
 
 ACPI:
 
  - Improve SPCR handling and messaging on systems lacking an SPCR table
 
 Debug:
 
  - Simplify the debug exception entry path
 
  - Drop redundant DBG_MDSCR_* macros
 
 Kselftests:
 
  - Cleanups and improvements for SME, SVE and FPSIMD tests
 
 Miscellaneous:
 
  - Optimise loop to reduce redundant operations in contpte_ptep_get()
 
  - Remove ISB when resetting POR_EL0 during signal handling
 
  - Mark the kernel as tainted on SEA and SError panic
 
  - Remove redundant gcs_free() call
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmiDkgoACgkQa9axLQDI
 XvFucQ//bYugRP5/Sdlrq5eDKWBGi1HufYzwfDEBLc4S75Eu8mGL/tuThfu9yFn+
 qCowtt4U84HdWsZDTSVo6lym6v2vJUpGOMgXzepvJaFBRnqGv9X9NxH6RQO1LTnu
 Pm7rO+7I9tNpfuc7Zu9pHDggsJEw+WzVfmEF6WPSFlT9mUNv6NbSx4rbLQKU86Dm
 ouTqXaePEQZ5oiRXVasxyT0otGtiACD20WpgOtNjYGzsfUVwCf/C83V/2DLwwbhr
 9cW9lCtFxA/yFdQcA9ThRzWZ9Eo5LAHqjGIq00+zOjuzgDbBtcTT79gpChkhovIR
 FBIsWHd9j9i3nYxzf4V4eRKQnyqS3NQWv7g7uKFwNgARif1Zk0VJ77QIlAYk5xLI
 ENTRjLKz5WNGGnhdkeCvDlVyxX+OktgcVTp3vqRxAKCRahMMUqBrwxiM8RzVF37e
 yzkEQayL8F7uZqy9H7Sjn48UpHZux6frJ1bBQw1oEvR9QmAoAdqavPMSAYIOT3Zr
 ze4WIljq/cFr3kBPIFP5pK1e0qYMHXZpSKIm8MAv6y/7KmQuVbMjZthpuPbLSIw0
 Q7C0KalB8lToPIbO7qMni/he0dCN4K2+E1YHFTR+pzfcoLuW4rjSg7i8tqMLKMJ8
 H+SeGLyPtM5A6bdAPTTpqefcgUUe7064ENUqrGUpDEynGXA7boE=
 =5h1C
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
 "A quick summary: perf support for Branch Record Buffer Extensions
  (BRBE), typical PMU hardware updates, small additions to MTE for
  store-only tag checking and exposing non-address bits to signal
  handlers, HAVE_LIVEPATCH enabled on arm64, VMAP_STACK forced on.

  There is also a TLBI optimisation on hardware that does not require
  break-before-make when changing the user PTEs between contiguous and
  non-contiguous.

  More details:

  Perf and PMU updates:

   - Add support for new (v3) Hisilicon SLLC and DDRC PMUs

   - Add support for Arm-NI PMU integrations that share interrupts
     between clock domains within a given instance

   - Allow SPE to be configured with a lower sample period than the
     minimum recommendation advertised by PMSIDR_EL1.Interval

   - Add suppport for Arm's "Branch Record Buffer Extension" (BRBE)

   - Adjust the perf watchdog period according to cpu frequency changes

   - Minor driver fixes and cleanups

  Hardware features:

   - Support for MTE store-only checking (FEAT_MTE_STORE_ONLY)

   - Support for reporting the non-address bits during a synchronous MTE
     tag check fault (FEAT_MTE_TAGGED_FAR)

   - Optimise the TLBI when folding/unfolding contiguous PTEs on
     hardware with FEAT_BBM (break-before-make) level 2 and no TLB
     conflict aborts

  Software features:

   - Enable HAVE_LIVEPATCH after implementing arch_stack_walk_reliable()
     and using the text-poke API for late module relocations

   - Force VMAP_STACK always on and change arm64_efi_rt_init() to use
     arch_alloc_vmap_stack() in order to avoid KASAN false positives

  ACPI:

   - Improve SPCR handling and messaging on systems lacking an SPCR
     table

  Debug:

   - Simplify the debug exception entry path

   - Drop redundant DBG_MDSCR_* macros

  Kselftests:

   - Cleanups and improvements for SME, SVE and FPSIMD tests

  Miscellaneous:

   - Optimise loop to reduce redundant operations in contpte_ptep_get()

   - Remove ISB when resetting POR_EL0 during signal handling

   - Mark the kernel as tainted on SEA and SError panic

   - Remove redundant gcs_free() call"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (93 commits)
  arm64/gcs: task_gcs_el0_enable() should use passed task
  arm64: Kconfig: Keep selects somewhat alphabetically ordered
  arm64: signal: Remove ISB when resetting POR_EL0
  kselftest/arm64: Handle attempts to disable SM on SME only systems
  kselftest/arm64: Fix SVE write data generation for SME only systems
  kselftest/arm64: Test SME on SME only systems in fp-ptrace
  kselftest/arm64: Test FPSIMD format data writes via NT_ARM_SVE in fp-ptrace
  kselftest/arm64: Allow sve-ptrace to run on SME only systems
  arm64/mm: Drop redundant addr increment in set_huge_pte_at()
  kselftest/arm4: Provide local defines for AT_HWCAP3
  arm64: Mark kernel as tainted on SAE and SError panic
  arm64/gcs: Don't call gcs_free() when releasing task_struct
  drivers/perf: hisi: Support PMUs with no interrupt
  drivers/perf: hisi: Relax the event number check of v2 PMUs
  drivers/perf: hisi: Add support for HiSilicon SLLC v3 PMU driver
  drivers/perf: hisi: Use ACPI driver_data to retrieve SLLC PMU information
  drivers/perf: hisi: Add support for HiSilicon DDRC v3 PMU driver
  drivers/perf: hisi: Simplify the probe process for each DDRC version
  perf/arm-ni: Support sharing IRQs within an NI instance
  perf/arm-ni: Consolidate CPU affinity handling
  ...
2025-07-29 20:21:54 -07:00
Paolo Bonzini
314b40b3b6 KVM/arm64 changes for 6.17, round #1
- Host driver for GICv5, the next generation interrupt controller for
    arm64, including support for interrupt routing, MSIs, interrupt
    translation and wired interrupts.
 
  - Use FEAT_GCIE_LEGACY on GICv5 systems to virtualize GICv3 VMs on
    GICv5 hardware, leveraging the legacy VGIC interface.
 
  - Userspace control of the 'nASSGIcap' GICv3 feature, allowing
    userspace to disable support for SGIs w/o an active state on hardware
    that previously advertised it unconditionally.
 
  - Map supporting endpoints with cacheable memory attributes on systems
    with FEAT_S2FWB and DIC where KVM no longer needs to perform cache
    maintenance on the address range.
 
  - Nested support for FEAT_RAS and FEAT_DoubleFault2, allowing the guest
    hypervisor to inject external aborts into an L2 VM and take traps of
    masked external aborts to the hypervisor.
 
  - Convert more system register sanitization to the config-driven
    implementation.
 
  - Fixes to the visibility of EL2 registers, namely making VGICv3 system
    registers accessible through the VGIC device instead of the ONE_REG
    vCPU ioctls.
 
  - Various cleanups and minor fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iI0EABYIADUWIQSNXHjWXuzMZutrKNKivnWIJHzdFgUCaIezbRccb2xpdmVyLnVw
 dG9uQGxpbnV4LmRldgAKCRCivnWIJHzdFr/eAQDY5NIG5cR6ZcAWnPQLmGWpz2ou
 pq4Jhn9E/mGR3n5L1AEAsJpfLLpOsmnLBdwfbjmW59gGsa8k3i5tjWEOJ6yzAwk=
 =r+sp
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-6.17' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 changes for 6.17, round #1

 - Host driver for GICv5, the next generation interrupt controller for
   arm64, including support for interrupt routing, MSIs, interrupt
   translation and wired interrupts.

 - Use FEAT_GCIE_LEGACY on GICv5 systems to virtualize GICv3 VMs on
   GICv5 hardware, leveraging the legacy VGIC interface.

 - Userspace control of the 'nASSGIcap' GICv3 feature, allowing
   userspace to disable support for SGIs w/o an active state on hardware
   that previously advertised it unconditionally.

 - Map supporting endpoints with cacheable memory attributes on systems
   with FEAT_S2FWB and DIC where KVM no longer needs to perform cache
   maintenance on the address range.

 - Nested support for FEAT_RAS and FEAT_DoubleFault2, allowing the guest
   hypervisor to inject external aborts into an L2 VM and take traps of
   masked external aborts to the hypervisor.

 - Convert more system register sanitization to the config-driven
   implementation.

 - Fixes to the visibility of EL2 registers, namely making VGICv3 system
   registers accessible through the VGIC device instead of the ONE_REG
   vCPU ioctls.

 - Various cleanups and minor fixes.
2025-07-29 12:27:40 -04:00
Paolo Bonzini
b4733cd5be Merge tag 'kvm-x86-selftests-6.17' of https://github.com/kvm-x86/linux into HEAD
KVM selftests changes for 6.17

 - Fix a comment typo.

 - Verify KVM is loaded when getting any KVM module param so that attempting to
   run a selftest without kvm.ko loaded results in a SKIP message about KVM not
   being loaded/enabled, versus some random parameter not existing.

 - SKIP tests that hit EACCES when attempting to access a file, with a "Root
   required?" help message.  In most cases, the test just needs to be run with
   elevated permissions.
2025-07-29 08:36:44 -04:00
Paolo Bonzini
1a14928e2e Merge tag 'kvm-x86-misc-6.17' of https://github.com/kvm-x86/linux into HEAD
KVM x86 misc changes for 6.17

 - Prevert the host's DEBUGCTL.FREEZE_IN_SMM (Intel only) when running the
   guest.  Failure to honor FREEZE_IN_SMM can bleed host state into the guest.

 - Explicitly check vmcs12.GUEST_DEBUGCTL on nested VM-Enter (Intel only) to
   prevent L1 from running L2 with features that KVM doesn't support, e.g. BTF.

 - Intercept SPEC_CTRL on AMD if the MSR shouldn't exist according to the
   vCPU's CPUID model.

 - Rework the MSR interception code so that the SVM and VMX APIs are more or
   less identical.

 - Recalculate all MSR intercepts from the "source" on MSR filter changes, and
   drop the dedicated "shadow" bitmaps (and their awful "max" size defines).

 - WARN and reject loading kvm-amd.ko instead of panicking the kernel if the
   nested SVM MSRPM offsets tracker can't handle an MSR.

 - Advertise support for LKGS (Load Kernel GS base), a new instruction that's
   loosely related to FRED, but is supported and enumerated independently.

 - Fix a user-triggerable WARN that syzkaller found by stuffing INIT_RECEIVED,
   a.k.a. WFS, and then putting the vCPU into VMX Root Mode (post-VMXON).  Use
   the same approach KVM uses for dealing with "impossible" emulation when
   running a !URG guest, and simply wait until KVM_RUN to detect that the vCPU
   has architecturally impossible state.

 - Add KVM_X86_DISABLE_EXITS_APERFMPERF to allow disabling interception of
   APERF/MPERF reads, so that a "properly" configured VM can "virtualize"
   APERF/MPERF (with many caveats).

 - Reject KVM_SET_TSC_KHZ if vCPUs have been created, as changing the "default"
   frequency is unsupported for VMs with a "secure" TSC, and there's no known
   use case for changing the default frequency for other VM types.
2025-07-29 08:36:43 -04:00
Paolo Bonzini
f02b1bcc73 Merge tag 'kvm-x86-irqs-6.17' of https://github.com/kvm-x86/linux into HEAD
KVM IRQ changes for 6.17

 - Rework irqbypass to track/match producers and consumers via an xarray
   instead of a linked list.  Using a linked list leads to O(n^2) insertion
   times, which is hugely problematic for use cases that create large numbers
   of VMs.  Such use cases typically don't actually use irqbypass, but
   eliminating the pointless registration is a future problem to solve as it
   likely requires new uAPI.

 - Track irqbypass's "token" as "struct eventfd_ctx *" instead of a "void *",
   to avoid making a simple concept unnecessarily difficult to understand.

 - Add CONFIG_KVM_IOAPIC for x86 to allow disabling support for I/O APIC, PIC,
   and PIT emulation at compile time.

 - Drop x86's irq_comm.c, and move a pile of IRQ related code into irq.c.

 - Fix a variety of flaws and bugs in the AVIC device posted IRQ code.

 - Inhibited AVIC if a vCPU's ID is too big (relative to what hardware
   supports) instead of rejecting vCPU creation.

 - Extend enable_ipiv module param support to SVM, by simply leaving IsRunning
   clear in the vCPU's physical ID table entry.

 - Disable IPI virtualization, via enable_ipiv, if the CPU is affected by
   erratum #1235, to allow (safely) enabling AVIC on such CPUs.

 - Dedup x86's device posted IRQ code, as the vast majority of functionality
   can be shared verbatime between SVM and VMX.

 - Harden the device posted IRQ code against bugs and runtime errors.

 - Use vcpu_idx, not vcpu_id, for GA log tag/metadata, to make lookups O(1)
   instead of O(n).

 - Generate GA Log interrupts if and only if the target vCPU is blocking, i.e.
   only if KVM needs a notification in order to wake the vCPU.

 - Decouple device posted IRQs from VFIO device assignment, as binding a VM to
   a VFIO group is not a requirement for enabling device posted IRQs.

 - Clean up and document/comment the irqfd assignment code.

 - Disallow binding multiple irqfds to an eventfd with a priority waiter, i.e.
   ensure an eventfd is bound to at most one irqfd through the entire host,
   and add a selftest to verify eventfd:irqfd bindings are globally unique.
2025-07-29 08:35:46 -04:00
Oliver Upton
18ec25dd0e KVM: arm64: selftests: Add FEAT_RAS EL2 registers to get-reg-list
VDISR_EL2 and VSESR_EL2 are now visible to userspace for nested VMs. Add
them to get-reg-list.

Link: https://lore.kernel.org/r/20250728152603.2823699-1-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-07-28 08:28:05 -07:00
Oliver Upton
0d46e324c0 Merge branch 'kvm-arm64/vgic-v4-ctl' into kvmarm/next
* kvm-arm64/vgic-v4-ctl:
  : Userspace control of nASSGIcap, courtesy of Raghavendra Rao Ananta
  :
  : Allow userspace to decide if support for SGIs without an active state is
  : advertised to the guest, allowing VMs from GICv3-only hardware to be
  : migrated to to GICv4.1 capable machines.
  Documentation: KVM: arm64: Describe VGICv3 registers writable pre-init
  KVM: arm64: selftests: Add test for nASSGIcap attribute
  KVM: arm64: vgic-v3: Allow userspace to write GICD_TYPER2.nASSGIcap
  KVM: arm64: vgic-v3: Allow access to GICD_IIDR prior to initialization
  KVM: arm64: vgic-v3: Consolidate MAINT_IRQ handling
  KVM: arm64: Disambiguate support for vSGIs v. vLPIs

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-07-28 08:11:38 -07:00
Oliver Upton
a7f49a9bf4 Merge branch 'kvm-arm64/el2-reg-visibility' into kvmarm/next
* kvm-arm64/el2-reg-visibility:
  : Fixes to EL2 register visibility, courtesy of Marc Zyngier
  :
  :  - Expose EL2 VGICv3 registers via the VGIC attributes accessor, not the
  :    KVM_{GET,SET}_ONE_REG ioctls
  :
  :  - Condition visibility of FGT registers on the presence of FEAT_FGT in
  :    the VM
  KVM: arm64: selftest: vgic-v3: Add basic GICv3 sysreg userspace access test
  KVM: arm64: Enforce the sorting of the GICv3 system register table
  KVM: arm64: Clarify the check for reset callback in check_sysreg_table()
  KVM: arm64: vgic-v3: Fix ordering of ICH_HCR_EL2
  KVM: arm64: Document registers exposed via KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS
  KVM: arm64: selftests: get-reg-list: Add base EL2 registers
  KVM: arm64: selftests: get-reg-list: Simplify feature dependency
  KVM: arm64: Advertise FGT2 registers to userspace
  KVM: arm64: Condition FGT registers on feature availability
  KVM: arm64: Expose GICv3 EL2 registers via KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS
  KVM: arm64: Let GICv3 save/restore honor visibility attribute
  KVM: arm64: Define helper for ICH_VTR_EL2
  KVM: arm64: Define constant value for ICC_SRE_EL2
  KVM: arm64: Don't advertise ICH_*_EL2 registers through GET_ONE_REG
  KVM: arm64: Make RVBAR_EL2 accesses UNDEF

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-07-28 08:06:38 -07:00
Raghavendra Rao Ananta
15b5964a41 KVM: arm64: selftests: Add test for nASSGIcap attribute
Extend vgic_init to test the nASSGIcap attribute, asserting that it is
configurable (within reason) prior to initializing the VGIC.
Additionally, check that userspace cannot set the attribute after the
VGIC has been initialized.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250724062805.2658919-6-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-07-26 08:45:52 -07:00
Marc Zyngier
3435bd79ec KVM: arm64: selftest: vgic-v3: Add basic GICv3 sysreg userspace access test
We have a lot of more or less useful vgic tests, but none of them
tracks the availability of GICv3 system registers, which is a bit
annoying.

Add one such test, which covers both EL1 and EL2 registers.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Tested-by: Itaru Kitayama <itaru.kitayama@fujitsu.com>
Reviewed-by: Sebastian Ott <sebott@redhat.com>
Link: https://lore.kernel.org/r/20250718111154.104029-5-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-07-26 08:36:58 -07:00
Marc Zyngier
3a90b6f279 KVM: arm64: selftests: get-reg-list: Add base EL2 registers
Add the EL2 registers and the eventual dependencies, effectively
doubling the number of test vectors. Oh well.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250714122634.3334816-11-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-07-15 20:24:29 -07:00
Marc Zyngier
9a40718079 KVM: arm64: selftests: get-reg-list: Simplify feature dependency
Describing the dependencies between registers and features is on
the masochistic side of things, with hard-coded values that would
be better taken from the existing description.

Add a couple of helpers to that effect, and repaint the dependency
array. More could be done to improve this test, but my interest is
wearing  thin...

Signed-off-by: Marc Zyngier <maz@kernel.org>
Tested-by: Itaru Kitayama <itaru.kitayama@fujitsu.com>
Link: https://lore.kernel.org/r/20250714122634.3334816-10-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-07-15 20:24:29 -07:00
Mark Brown
81bf24f1ac KVM: selftests: Add CONFIG_EVENTFD for irqfd selftest
In 7e9b231c40 ("KVM: selftests: Add a KVM_IRQFD test to verify
uniqueness requirements") we added a test for the newly added irqfd
support but since this feature works with eventfds it won't work unless
the kernel has been built wth eventfd support.  Add CONFIG_EVENTFD to
the list of required options for the KVM selftests.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20250710-kvm-selftests-eventfd-config-v1-1-78c276e4b80f@kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-07-10 06:20:20 -07:00
Oliver Upton
f9e4e0a663 KVM: arm64: selftests: Test ESR propagation for vSError injection
Ensure that vSErrors taken in the guest have an appropriate ESR_ELx
value for the expected exception. Additionally, switch the EASE test to
install the SEA handler at the SError offset, as the ESR is still
expected to match an SEA in that case.

Link: https://lore.kernel.org/r/20250708230632.1954240-3-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-07-09 09:55:05 -07:00
Sean Christopherson
95826e1ed3 KVM: selftests: Convert arch_timer tests to common helpers to pin task
Convert the arch timer tests to use __pin_task_to_cpu() and
pin_self_to_cpu().

No functional change intended.

Link: https://lore.kernel.org/r/20250626001225.744268-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-07-09 09:33:42 -07:00
Jim Mattson
df98ce784a KVM: selftests: Test behavior of KVM_X86_DISABLE_EXITS_APERFMPERF
For a VCPU thread pinned to a single LPU, verify that interleaved host
and guest reads of IA32_[AM]PERF return strictly increasing values when
APERFMPERF exiting is disabled.

Run the test in both L1 and L2 to verify that KVM passes through the
APERF and MPERF MSRs when L1 doesn't want to intercept them (or any MSRs).

Signed-off-by: Jim Mattson <jmattson@google.com>
Link: https://lore.kernel.org/r/20250530185239.2335185-4-jmattson@google.com
Co-developed-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20250626001225.744268-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-07-09 09:33:41 -07:00
Sean Christopherson
e83ee6f76c KVM: selftests: Expand set of APIs for pinning tasks to a single CPU
Expand kvm_pin_this_task_to_pcpu() into a set of APIs to allow pinning a
task (or self) to a CPU (any or specific).  This will allow deduplicating
code throughout a variety of selftests.

Opportunistically use "self" instead of "this_task" as it is both more
concise and less ambiguous.

Link: https://lore.kernel.org/r/20250626001225.744268-4-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-07-09 09:33:40 -07:00
Oliver Upton
0b593ef12a KVM: arm64: selftests: Catch up set_id_regs with the kernel
Add test coverage for ID_AA64MMFR3_EL1 and the recently added
FEAT_DoubleFault2.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250708172532.1699409-28-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-07-08 11:36:36 -07:00
Oliver Upton
55ea75f5b2 KVM: arm64: selftests: Add SCTLR2_EL1 to get-reg-list
Handle SCTLR2_EL1 specially as it is only visible to userspace when
FEAT_SCTLR2 is implemented for the VM.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250708172532.1699409-27-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-07-08 11:36:36 -07:00
Oliver Upton
a90aac5532 KVM: arm64: selftests: Test SEAs are taken to SError vector when EASE=1
Ensure KVM routes SEAs to the correct vector depending on
SCTLR2_EL1.EASE.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250708172532.1699409-26-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-07-08 11:36:36 -07:00
Oliver Upton
2858ea3083 KVM: arm64: selftests: Add basic SError injection test
Add tests for SError injection considering KVM is more directly involved
in delivery:

 - Pending SErrors are taken at the first CSE after SErrors are unmasked

 - Pending SErrors aren't taken and remain pending if SErrors are masked

 - Unmasked SErrors are taken immediately when injected (implementation
   detail)

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250708172532.1699409-25-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-07-08 11:36:36 -07:00
Anshuman Khandual
30ff3c981e KVM: selftests: Change MDSCR_EL1 register holding variables as uint64_t
Change MDSCR_EL1 register holding local variables as uint64_t that reflects
its true register width as well.

Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: kvm@vger.kernel.org
Cc: kvmarm@lists.linux.dev
Cc: linux-kernel@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20250613023646.1215700-3-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-07-03 20:00:37 +01:00
Chenyi Qiang
0048ca5e99 KVM: selftests: Add back the missing check of MONITOR/MWAIT availability
The revamp of monitor/mwait test missed the original check of feature
availability [*]. If MONITOR/MWAIT is not supported or is disabled by
IA32_MISC_ENABLE on the host, executing MONITOR or MWAIT instruction
from guest doesn't cause monitor/mwait VM exits, but a #UD.

[*] https://lore.kernel.org/all/20240411210237.34646-1-zide.chen@intel.com/

Reported-by: Xuelian Guo <xuelian.guo@intel.com>
Fixes: 80fd663590 ("selftests: kvm: revamp MONITOR/MWAIT tests")
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Link: https://lore.kernel.org/r/20250620062219.342930-1-chenyi.qiang@intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-24 12:58:19 -07:00
Sean Christopherson
7e9b231c40 KVM: selftests: Add a KVM_IRQFD test to verify uniqueness requirements
Add a selftest to verify that eventfd+irqfd bindings are globally unique,
i.e. that KVM doesn't allow multiple irqfds to bind to a single eventfd,
even across VMs.

Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250522235223.3178519-14-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-23 09:51:01 -07:00
Sean Christopherson
74e5e3fb0d KVM: selftests: Add utilities to create eventfds and do KVM_IRQFD
Add helpers to create eventfds and to (de)assign eventfds via KVM_IRQFD.

Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250522235223.3178519-13-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-23 09:51:01 -07:00
Sean Christopherson
033b76bc7f KVM: selftests: Assert that eventfd() succeeds in Xen shinfo test
Assert that eventfd() succeeds in the Xen shinfo test instead of skipping
the associated testcase.  While eventfd() is outside the scope of KVM, KVM
unconditionally selects EVENTFD, i.e. the syscall should always succeed.

Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250522235223.3178519-12-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-23 09:51:00 -07:00
Sean Christopherson
8fd2a6d43a KVM: selftests: Fall back to split IRQ chip if full in-kernel chip is unsupported
Now that KVM x86 allows compiling out support for in-kernel I/O APIC (and
PIC and PIT) emulation, i.e. allows disabling KVM_CREATE_IRQCHIP for all
intents and purposes, fall back to a split IRQ chip for x86 if creating
the full in-kernel version fails with ENOTTY.

Acked-by: Kai Huang <kai.huang@intel.com>
Link: https://lore.kernel.org/r/20250611213557.294358-17-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-20 13:52:51 -07:00
Sean Christopherson
71443210e2 KVM: selftests: Print a more helpful message for EACCESS in access tracking test
Use open_path_or_exit() helper to probe /sys/kernel/mm/page_idle/bitmap in
the access tracking perf test so that a helpful/pertinent SKIP message is
printed if the file exists but is inaccessible, e.g. because the file has
the kernel's default 0600 permissions.

Cc: James Houghton <jthoughton@google.com>
Link: https://lore.kernel.org/r/20250516215909.2551628-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-20 13:39:11 -07:00
Sean Christopherson
ba300a728f KVM: selftests: Play nice with EACCES errors in open_path_or_exit()
Expand the SKIP conditions of the open_path_or_exit() helper to skip on
EACCES as well as ENOENT.  Most often, lack of permissions to a file
needed by a KVM selftests is due to a file being root-only by default,
not because of any bug/misconfiguration that warrants failing a test.

Link: https://lore.kernel.org/r/20250516215909.2551628-4-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-20 13:39:10 -07:00
Sean Christopherson
6e1cce7cda KVM: selftests: Add __open_path_or_exit() variant to provide extra help info
Add an inner __open_path_or_exit() API to let the caller provide additional
information on ENOENT to try and help the user figure out why the test is
being skipped, e.g. for files like the page_idle bitmap needed by the
access tracking perf, which is dependent on a Kconfig.

Immediately convert /dev/kvm to the new API, both as an example and because
users might not know that some architectures/setups require loading KVM.

Link: https://lore.kernel.org/r/20250516215909.2551628-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-20 13:39:10 -07:00
Sean Christopherson
fcab107abe KVM: selftests: Verify KVM is loaded when getting a KVM module param
Probe /dev/kvm when getting a KVM module param so that attempting to load
a module param super early in a selftest generates a SKIP message about
KVM not being loaded/enabled, versus some random parameter not existing.

E.g. KVM x86's unconditional retrieval of force_emulation_prefix during
kvm_selftest_arch_init() generates a rather confusing error message that
takes far too much triage to understand.

Link: https://lore.kernel.org/r/20250516215909.2551628-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-20 13:39:10 -07:00
Rahul Kumar
30142a93b1 KVM: selftests: Fix spelling of 'occurrences' in sparsebit.c comments
Correct two instances of the misspelled word 'occurences' to
'occurrences' in comments explaining node invariants in sparsebit.c.

Signed-off-by: Rahul Kumar <rk0006818@gmail.com>
Link: https://lore.kernel.org/r/20250523181606.568320-1-rk0006818@gmail.com
[sean: massage changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-20 13:37:32 -07:00
Sean Christopherson
0792c71c1c KVM: selftests: Verify KVM disable interception (for userspace) on filter change
Re-read MSR_{FS,GS}_BASE after restoring the "allow everything" userspace
MSR filter to verify that KVM stops forwarding exits to userspace.  This
can also be used in conjunction with manual verification (e.g. printk) to
ensure KVM is correctly updating the MSR bitmaps consumed by hardware.

Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: Manali Shukla <Manali.Shukla@amd.com>
Link: https://lore.kernel.org/r/20250610225737.156318-33-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-06-20 13:07:38 -07:00
Zenghui Yu
56a1498450 KVM: arm64: selftests: Close the GIC FD in arch_timer_edge_cases
Close the GIC FD to free the reference it holds to the VM so that we can
correctly clean up the VM. This also gets rid of the

	"KVM: debugfs: duplicate directory 395722-4"

warning when running arch_timer_edge_cases.

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Sebastian Ott <sebott@redhat.com>
Link: https://lore.kernel.org/r/20250608095402.1131-1-yuzenghui@huawei.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-19 09:58:21 +01:00
Paolo Bonzini
ce360c2bfd KVM/arm64 fixes for 6.16, take #2
- Rework of system register accessors for system registers that are
   directly writen to memory, so that sanitisation of the in-memory
   value happens at the correct time (after the read, or before the
   write). For convenience, RMW-style accessors are also provided.
 
 - Multiple fixes for the so-called "arch-timer-edge-cases' selftest,
   which was always broken.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmhCs3IACgkQI9DQutE9
 ekMxlBAApd03crgHQy8V7I997D9TA/Ph4PkUOZOg091JAABkOZBCLd3H8hbe7Va6
 2XPD7IeTQUEP/8Xwc0+sWF3X4bIqU3PlxZ/TI4IgNDxazz2l+1LTHCrWrP47VXMr
 j5czEzWkSX/59LFc0jL3T0VxKhN9fI+aSE9UZCCXc0BGyLIlRNclO4ho87xkgbxM
 AuhM0VslXtAZBF9DBrtOQ1EodI5Cc7vH38id/8SCL9f74rKln4UViSuPhRQxgzgy
 7T523OERyAINJ8e6UNd0Tg5GFYdj2bMeivnTleaFFxmCH+tAKYtSTV8d6n0fzsOF
 1D+6uU93v4ky3DWwCvmEXLzijH6pRrLjMLsC4Sx1kFCPe05Zaui/g65n4REflZm6
 0xZ2bnTsZP1/MYrZya/XpXipF0EGITqsOuKpHgEO495TIgmAZKev+GIp3NDooSYk
 dZWN0U0ctePV2+WFoxNyN+r9nrg/xSujnyU0k3kMmRcfRHcATzZG6jYOj8CrLdNO
 jWZ56XhghiJj01B1IjVskuSyTwcoRMH4h//C7oAAFQoOuZtEgduGeZUQxz7EoBxX
 /I4Cg4+9P/m310gjdEVMGPdvrFQgweJc8K3+mT3WGRA8AT4Nhi6pxZxnzWeABuUD
 4HpVruNxygMwODilk3YruJ/yat7FqTBTdRZt4w+cwpBTi8VPPqs=
 =OMHL
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-6.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 6.16, take #2

- Rework of system register accessors for system registers that are
  directly writen to memory, so that sanitisation of the in-memory
  value happens at the correct time (after the read, or before the
  write). For convenience, RMW-style accessors are also provided.

- Multiple fixes for the so-called "arch-timer-edge-cases' selftest,
  which was always broken.
2025-06-11 14:25:22 -04:00
Sebastian Ott
fad4cf9448 KVM: arm64: selftests: Determine effective counter width in arch_timer_edge_cases
arch_timer_edge_cases uses ~0 as the maximum counter value, however there's
no architectural guarantee that this is valid.

Figure out the effective counter width based on the effective frequency
like it's done by the kernel.

This also serves as a workaround for AC03_CPU_14 that led to the
following assertion failure on ampere-one machines:

==== Test Assertion Failure ====
  arm64/arch_timer_edge_cases.c:169: timer_condition == istatus
  pid=11236 tid=11236 errno=4 - Interrupted system call
     1  0x0000000000404ce7: test_run at arch_timer_edge_cases.c:938
     2  0x0000000000401ebb: main at arch_timer_edge_cases.c:1053
     3  0x0000ffff9fa8625b: ?? ??:0
     4  0x0000ffff9fa8633b: ?? ??:0
     5  0x0000000000401fef: _start at ??:?
  0x1 != 0x0 (timer_condition != istatus)

Note that the following subtest only worked since the counter initialized
with CVAL_MAX would instantly overflow (which is no longer the case):

	test_set_cnt_after_cval_no_irq(timer, 0, DEF_CNT, CVAL_MAX, sm);

To fix this we could swap CVAL_MAX for 0 here but since that is already
done by test_move_counters_behind_timers() let's remove that subtest.

Link: https://lore.kernel.org/kvmarm/ac1de1d2-ef2b-d439-dc48-8615e121b07b@redhat.com
Link: https://amperecomputing.com/assets/AmpereOne_Developer_ER_v0_80_20240823_28945022f4.pdf
Signed-off-by: Sebastian Ott <sebott@redhat.com>
Link: https://lore.kernel.org/r/20250605103613.14544-5-sebott@redhat.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-05 14:28:44 +01:00
Sebastian Ott
05ce38d489 KVM: arm64: selftests: Fix xVAL init in arch_timer_edge_cases
arch_timer_edge_cases hits the following assertion in < 10% of the test runs:

==== Test Assertion Failure ====
  arm64/arch_timer_edge_cases.c:490: timer_get_cntct(timer) >= DEF_CNT + (timer_get_cntfrq() * (uint64_t)(delta_2_ms) / 1000)
  pid=17110 tid=17110 errno=4 - Interrupted system call
     1  0x0000000000404ec7: test_run at arch_timer_edge_cases.c:945
     2  0x0000000000401fa3: main at arch_timer_edge_cases.c:1074
     3  0x0000ffffa774b587: ?? ??:0
     4  0x0000ffffa774b65f: ?? ??:0
     5  0x000000000040206f: _start at ??:?
  timer_get_cntct(timer) >= DEF_CNT + msec_to_cycles(delta_2_ms)

Enabling the timer without proper xval initialization in set_tval_irq()
resulted in an early interrupt during timer reprogramming. Make sure
to set the xval before setting the enable bit.

Signed-off-by: Sebastian Ott <sebott@redhat.com>
Link: https://lore.kernel.org/r/20250605103613.14544-4-sebott@redhat.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-05 14:28:01 +01:00
Sebastian Ott
050632ae65 KVM: arm64: selftests: Fix thread migration in arch_timer_edge_cases
arch_timer_edge_cases tries to migrate itself across host cpus. Before
the first test, it migrates to cpu 0 by setting up an affinity mask with
only bit 0 set. After that it looks for the next possible cpu in the
current affinity mask which still has only bit 0 set. So there is no
migration at all.

Fix this by reading the default mask at start and use this to find
the next cpu in each iteration.

Signed-off-by: Sebastian Ott <sebott@redhat.com>
Link: https://lore.kernel.org/r/20250605103613.14544-3-sebott@redhat.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-05 14:27:21 +01:00
Sebastian Ott
9a9864fd09 KVM: arm64: selftests: Fix help text for arch_timer_edge_cases
Fix the help text for arch_timer_edge_cases to show the correct
option for setting the wait time.

Signed-off-by: Sebastian Ott <sebott@redhat.com>
Link: https://lore.kernel.org/r/20250605103613.14544-2-sebott@redhat.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-05 14:27:00 +01:00
Paolo Bonzini
4e02d4f973 KVM SVM changes for 6.16:
- Wait for target vCPU to acknowledge KVM_REQ_UPDATE_PROTECTED_GUEST_STATE to
    fix a race between AP destroy and VMRUN.
 
  - Decrypt and dump the VMSA in dump_vmcb() if debugging enabled for the VM.
 
  - Add support for ALLOWED_SEV_FEATURES.
 
  - Add #VMGEXIT to the set of handlers special cased for CONFIG_RETPOLINE=y.
 
  - Treat DEBUGCTL[5:2] as reserved to pave the way for virtualizing features
    that utilize those bits.
 
  - Don't account temporary allocations in sev_send_update_data().
 
  - Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM, via Bus Lock Threshold.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmgwmwAACgkQOlYIJqCj
 N/1pHw//edW/x838POMeeCN8j39NBKErW9yZoQLhMbzogttRvfoba+xYY9zXyRFx
 8AXB8+2iLtb7pXUohc0eYN0mNqgD0SnoMLqGfn7nrkJafJSUAJHAoZn1Mdom1M1y
 jHvBPbHCMMsgdLV8wpDRqCNWTH+d5W0kcN5WjKwOswVLj1rybVfK7bSLMhvkk1e5
 RrOR4Ewf95/Ag2b36L4SvS1yG9fTClmKeGArMXhEXjy2INVSpBYyZMjVtjHiNzU9
 TjtB2RSM45O+Zl0T2fZdVW8LFhA6kVeL1v+Oo433CjOQE0LQff3Vl14GCANIlPJU
 PiWN/RIKdWkuxStIP3vw02eHzONCcg2GnNHzEyKQ1xW8lmrwzVRdXZzVsc2Dmowb
 7qGykBQ+wzoE0sMeZPA0k/QOSqg2vGxUQHjR7720loLV9m9Tu/mJnS9e179GJKgI
 e1ArSLwKmHpjwKZqU44IQVTZaxSC4Sg2kI670i21ChPgx8+oVkA6I0LFQXymx7uS
 2lbH+ovTlJSlP9fbaJhMwAU2wpSHAyXif/HPjdw2LTH3NdgXzfEnZfTlAWiP65LQ
 hnz5HvmUalW3x9kmzRmeDIAkDnAXhyt3ZQMvbNzqlO5AfS+Tqh4Ed5EFP3IrQAzK
 HQ+Gi0ip+B84t9Tbi6rfQwzTZEbSSOfYksC7TXqRGhNo/DvHumE=
 =k6rK
 -----END PGP SIGNATURE-----

Merge tag 'kvm-x86-svm-6.16' of https://github.com/kvm-x86/linux into HEAD

KVM SVM changes for 6.16:

 - Wait for target vCPU to acknowledge KVM_REQ_UPDATE_PROTECTED_GUEST_STATE to
   fix a race between AP destroy and VMRUN.

 - Decrypt and dump the VMSA in dump_vmcb() if debugging enabled for the VM.

 - Add support for ALLOWED_SEV_FEATURES.

 - Add #VMGEXIT to the set of handlers special cased for CONFIG_RETPOLINE=y.

 - Treat DEBUGCTL[5:2] as reserved to pave the way for virtualizing features
   that utilize those bits.

 - Don't account temporary allocations in sev_send_update_data().

 - Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM, via Bus Lock Threshold.
2025-05-27 12:15:49 -04:00
Paolo Bonzini
3e0797f6dd KVM selftests changes for 6.16:
- Add support for SNP to the various SEV selftests.
 
  - Add a selftest to verify fastops instructions via forced emulation.
 
  - Add MGLRU support to the access tracking perf test.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmgwmd0ACgkQOlYIJqCj
 N/3WCBAAsQ5zS8e+B1+7xopSz41eCou8L7KBDZZSe4B9TAuT+hMslXBEculyOJqh
 tIlBFlvrQA/hC2tNYla58jIeA6/f08Jq4/sV1URMNvORFKMcIvgnKpxmMJfKujve
 L2iHvJigJs4hoBCXYHCZHTkd5VAtB6j++7y9rqZS+RznM6z6/NI9SalX7pHr7Sri
 DQeaMc71UYJfllvLyLmI+MbQccdLfQ1v4dmkt6pz29K5s0pX9PQYp54+Hu1Z73Te
 aFdrG+CuDchra1jxLFoell5P9bD6nq9SsNBfdf+6VjYk/1MMHP4yX/dAFEtEqMbm
 RJNX95bewY4mms3fj6e9j8jVXDLBiXR2an8yJI8k6CP6VPsIXQn+RG2pUQMcOUj0
 zcWikbfXvfn+ReIoaeReWPyZ7tPMW33mhnHDPy/saWHdZ9sycI4w2DstKgc2pe9E
 e6jI9H5JiH49CoMnue38kwnACNUIIvolJDpWeU6K0vQz4p5k6eUNTMSTEEVZbwiV
 Y8MVqMIf+Cu+y6UY1co5OhH387kFuLgYMC/LIFz/4nOrlopRCAzMvYcFEqo9gIOO
 0+Ls/lkPc/hU5D2f3/20UjAGKVY/GfTwKJDRFptzaYMfmiMWW0pl2zlHagYp1huM
 7k8p0vVh5rFOLUJxiftXC8+jBVyJKXLGgwPxBdLVapFMs9DU9gI=
 =v9bp
 -----END PGP SIGNATURE-----

Merge tag 'kvm-x86-selftests-6.16' of https://github.com/kvm-x86/linux into HEAD

KVM selftests changes for 6.16:

 - Add support for SNP to the various SEV selftests.

 - Add a selftest to verify fastops instructions via forced emulation.

 - Add MGLRU support to the access tracking perf test.
2025-05-27 12:15:26 -04:00
Paolo Bonzini
1f7c9d52b1 KVM/riscv changes for 6.16
- Add vector registers to get-reg-list selftest
 - VCPU reset related improvements
 - Remove scounteren initialization from VCPU reset
 - Support VCPU reset from userspace using set_mpstate() ioctl
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmgx8nkACgkQrUjsVaLH
 LAdmYw//aoxlXA8OKFMWFzEC4yZHfq502X1p95Z9h1i8IYKoRcqNxX8O5MJt0y8Z
 dyhgqKLT5q5bpUf/yxJjuJdZQJtCTC0o1bx89sSiM83M2J+AZymcIS5yFVIcxetv
 JRsaWh8GDc/XYUFBnjYCj5zmA5IObbd/QCVnkATjVRcRS56LnRv98P9YPxN7zhHF
 OhDRWELtUowk6FMgiHo75R9vYhOO1ywzzjLFnK5ZHLSYeQ7bhRX/jY5n1BShsWjp
 k+zVRTpJobw9AU76EPQaEKuqiynQz35ZPkxAiuWYac6SDKwmztvGJ1fCnSNAkJLk
 0kN/eAxv61F5nCRJxOkxK0Uy3U94zA6zX53VdnLoRN4rYpA8CYrE4mNAddN31IBC
 NHlBS59w2EsUxIRL1FiCUrEKKgeSWqJY1NuqsmB4ogeo4MKm8n1OhiYSU0l7NnQ6
 3h77ccnHN95cah2C9XDX3GeZ+on6z1t6FjZ4Enki1w3CCfGdKMsfphwUckTNzSvw
 hTeXhYHcP4VKsYkCcitdLR/VFwO4a3HlnjAtHdJLh0qfJ5SergifZ5/eQXhHu7f1
 1uyfk/6nKculr/8yzUVkOR7kerMjBuBx8jir89ceKD8qeA+4MUI1rdwEvzbzaEY9
 9aDwEVbQ1qMjpeONfKUvbyLJ7uN5Lae1/X4Kafmo2TWNAvtLPLo=
 =+S8x
 -----END PGP SIGNATURE-----

Merge tag 'kvm-riscv-6.16-1' of https://github.com/kvm-riscv/linux into HEAD

KVM/riscv changes for 6.16

- Add vector registers to get-reg-list selftest
- VCPU reset related improvements
- Remove scounteren initialization from VCPU reset
- Support VCPU reset from userspace using set_mpstate() ioctl
2025-05-26 16:27:00 -04:00
Paolo Bonzini
4d526b02df KVM/arm64 updates for 6.16
* New features:
 
   - Add large stage-2 mapping support for non-protected pKVM guests,
     clawing back some performance.
 
   - Add UBSAN support to the standalone EL2 object used in nVHE/hVHE and
     protected modes.
 
   - Enable nested virtualisation support on systems that support it
     (yes, it has been a long time coming), though it is disabled by
     default.
 
 * Improvements, fixes and cleanups:
 
   - Large rework of the way KVM tracks architecture features and links
     them with the effects of control bits. This ensures correctness of
     emulation (the data is automatically extracted from the published
     JSON files), and helps dealing with the evolution of the
     architecture.
 
   - Significant changes to the way pKVM tracks ownership of pages,
     avoiding page table walks by storing the state in the hypervisor's
     vmemmap. This in turn enables the THP support described above.
 
   - New selftest checking the pKVM ownership transition rules
 
   - Fixes for FEAT_MTE_ASYNC being accidentally advertised to guests
     even if the host didn't have it.
 
   - Fixes for the address translation emulation, which happened to be
     rather buggy in some specific contexts.
 
   - Fixes for the PMU emulation in NV contexts, decoupling PMCR_EL0.N
     from the number of counters exposed to a guest and addressing a
     number of issues in the process.
 
   - Add a new selftest for the SVE host state being corrupted by a
     guest.
 
   - Keep HCR_EL2.xMO set at all times for systems running with the
     kernel at EL2, ensuring that the window for interrupts is slightly
     bigger, and avoiding a pretty bad erratum on the AmpereOne HW.
 
   - Add workaround for AmpereOne's erratum AC04_CPU_23, which suffers
     from a pretty bad case of TLB corruption unless accesses to HCR_EL2
     are heavily synchronised.
 
   - Add a per-VM, per-ITS debugfs entry to dump the state of the ITS
     tables in a human-friendly fashion.
 
   - and the usual random cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmgwU7UACgkQI9DQutE9
 ekN93g//fNnejxf01dBFIbuylzYEyHZSEH0iTGLeM+ES9zvntCzciTYVzb27oqNG
 RDLShlQYp3w4rAe6ORzyePyHptOmKXCxfj/VXUFp3A7H9QYOxt1nacD3WxI9fCOo
 LzaSLquvgwFBaeTdDE0KdeTUKQHluId+w1Azh0lnHGeUP+lOHNZ8FqoP1/la0q04
 GvVL+l3wz/IhPP8r1YA0Q1bzJ5SLfSpjIw/0F5H/xgI4lyYdHzgFL8sKuSyFeCyM
 2STQi+ZnTCsAs4bkXkw2Pp9CFYrfQgZi+sf7Om+noAKhbJo3vb7/RHpgjv+QCjJy
 Kx4g9CbxHfaM03cH6uSLBoFzsACR1iAuUz8BCSRvvVNH4RVT6H+34nzjLZXLncrP
 gm1uYs9aMTLr91caeAx0aYIMWGYa1uqV0rum3WxyIHezN9Q/NuQoZyfprUufr8oX
 wCYE+ot4VT3DwG0UFZKKwj0BiCbYcbph9nBLVyZJsg8OKxpvspkCtPriFp1kb6BP
 dTTGSXd9JJqwSgP9qJLxijcv6Nfgp2gT42TWwh/dJRZXhnTCvr9IyclFIhoIIq3G
 Q2BkFCXOoEoNQhBA1tiWzJ9nDHf52P72Z2K1gPyyMZwF49HGa2BZBCJGkqX06wSs
 Riolf1/cjFhDno1ThiHKsHT0sG1D4oc9k/1NLq5dyNAEGcgATIA=
 =Jju3
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 updates for 6.16

* New features:

  - Add large stage-2 mapping support for non-protected pKVM guests,
    clawing back some performance.

  - Add UBSAN support to the standalone EL2 object used in nVHE/hVHE and
    protected modes.

  - Enable nested virtualisation support on systems that support it
    (yes, it has been a long time coming), though it is disabled by
    default.

* Improvements, fixes and cleanups:

  - Large rework of the way KVM tracks architecture features and links
    them with the effects of control bits. This ensures correctness of
    emulation (the data is automatically extracted from the published
    JSON files), and helps dealing with the evolution of the
    architecture.

  - Significant changes to the way pKVM tracks ownership of pages,
    avoiding page table walks by storing the state in the hypervisor's
    vmemmap. This in turn enables the THP support described above.

  - New selftest checking the pKVM ownership transition rules

  - Fixes for FEAT_MTE_ASYNC being accidentally advertised to guests
    even if the host didn't have it.

  - Fixes for the address translation emulation, which happened to be
    rather buggy in some specific contexts.

  - Fixes for the PMU emulation in NV contexts, decoupling PMCR_EL0.N
    from the number of counters exposed to a guest and addressing a
    number of issues in the process.

  - Add a new selftest for the SVE host state being corrupted by a
    guest.

  - Keep HCR_EL2.xMO set at all times for systems running with the
    kernel at EL2, ensuring that the window for interrupts is slightly
    bigger, and avoiding a pretty bad erratum on the AmpereOne HW.

  - Add workaround for AmpereOne's erratum AC04_CPU_23, which suffers
    from a pretty bad case of TLB corruption unless accesses to HCR_EL2
    are heavily synchronised.

  - Add a per-VM, per-ITS debugfs entry to dump the state of the ITS
    tables in a human-friendly fashion.

  - and the usual random cleanups.
2025-05-26 16:19:46 -04:00
Marc Zyngier
1b85d923ba Merge branch kvm-arm64/misc-6.16 into kvmarm-master/next
* kvm-arm64/misc-6.16:
  : .
  : Misc changes and improvements for 6.16:
  :
  : - Add a new selftest for the SVE host state being corrupted by a guest
  :
  : - Keep HCR_EL2.xMO set at all times for systems running with the kernel at EL2,
  :   ensuring that the window for interrupts is slightly bigger, and avoiding
  :   a pretty bad erratum on the AmpereOne HW
  :
  : - Replace a couple of open-coded on/off strings with str_on_off()
  :
  : - Get rid of the pKVM memblock sorting, which now appears to be superflous
  :
  : - Drop superflous clearing of ICH_LR_EOI in the LR when nesting
  :
  : - Add workaround for AmpereOne's erratum AC04_CPU_23, which suffers from
  :   a pretty bad case of TLB corruption unless accesses to HCR_EL2 are
  :   heavily synchronised
  :
  : - Add a per-VM, per-ITS debugfs entry to dump the state of the ITS tables
  :   in a human-friendly fashion
  : .
  KVM: arm64: Fix documentation for vgic_its_iter_next()
  KVM: arm64: vgic-its: Add debugfs interface to expose ITS tables
  arm64: errata: Work around AmpereOne's erratum AC04_CPU_23
  KVM: arm64: nv: Remove clearing of ICH_LR<n>.EOI if ICH_LR<n>.HW == 1
  KVM: arm64: Drop sort_memblock_regions()
  KVM: arm64: selftests: Add test for SVE host corruption
  KVM: arm64: Force HCR_EL2.xMO to 1 at all times in VHE mode
  KVM: arm64: Replace ternary flags with str_on_off() helper

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-23 10:59:43 +01:00
Atish Patra
f80e9cc5c6 KVM: riscv: selftests: Add vector extension tests
Add vector related tests with the ISA extension standard template.
However, the vector registers are bit tricky as the register length is
variable based on vlenb value of the system. That's why the macros are
defined with a default and overidden with actual value at runtime.

Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250430-kvm_selftest_improve-v3-3-eea270ff080b@rivosinc.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2025-05-21 09:34:40 +05:30
Atish Patra
3608b174ec KVM: riscv: selftests: Decode stval to identify exact exception type
Currently, the sbi_pmu_test continues if the exception type is illegal
instruction because access to hpmcounter will generate that. However
illegal instruction exception may occur due to the other reasons
which should result in test assertion.

Use the stval to decode the exact type of instructions and which csrs are
being accessed if it is csr access instructions. Assert in all cases
except if it is a csr access instructions that access valid PMU related
registers.

Take this opportunity to remove the CSR_CYCLEH reference as the test is
compiled for RV64 only.

Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250430-kvm_selftest_improve-v3-2-eea270ff080b@rivosinc.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2025-05-21 09:34:37 +05:30
Atish Patra
e23bb06b17 KVM: riscv: selftests: Align the trap information wiht pt_regs
The current exeception register structure in selftests are missing
few registers (e.g stval). Instead of adding it manually, change
the ex_regs to align with pt_regs to make it future proof.

Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250430-kvm_selftest_improve-v3-1-eea270ff080b@rivosinc.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2025-05-21 09:34:35 +05:30
Bibo Mao
a867688c8c KVM: selftests: Add supported test cases for LoongArch
Some common KVM test cases are supported on LoongArch now as following:
  coalesced_io_test
  demand_paging_test
  dirty_log_perf_test
  dirty_log_test
  guest_print_test
  hardware_disable_test
  kvm_binary_stats_test
  kvm_create_max_vcpus
  kvm_page_table_test
  memslot_modification_stress_test
  memslot_perf_test
  set_memory_region_test

And other test cases are not supported by LoongArch such as rseq_test,
since it is not supported on LoongArch physical machine either.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-05-20 20:20:26 +08:00
Bibo Mao
304b93b1a0 KVM: selftests: Add ucall test support for LoongArch
Add ucall test support for LoongArch, ucall method on LoongArch uses
undefined mmio area. It will cause vCPU exiting to hypervisor so that
hypervisor can communicate with vCPU.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-05-20 20:20:26 +08:00