* arm64/for-next/perf:
perf: hisi: Fix use-after-free when register pmu fails
drivers/perf: hisi_pcie: Initialize event->cpu only on success
drivers/perf: hisi_pcie: Check the type first in pmu::event_init()
perf/arm-cmn: Enable per-DTC counter allocation
perf/arm-cmn: Rework DTC counters (again)
perf/arm-cmn: Fix DTC domain detection
drivers: perf: arm_pmuv3: Drop some unused arguments from armv8_pmu_init()
drivers: perf: arm_pmuv3: Read PMMIR_EL1 unconditionally
drivers/perf: hisi: use cpuhp_state_remove_instance_nocalls() for hisi_hns3_pmu uninit process
drivers/perf: xgene: Use device_get_match_data()
perf/amlogic: add missing MODULE_DEVICE_TABLE
docs/perf: Add ampere_cspmu to toctree to fix a build warning
perf: arm_cspmu: ampere_cspmu: Add support for Ampere SoC PMU
perf: arm_cspmu: Support implementation specific validation
perf: arm_cspmu: Support implementation specific filters
perf: arm_cspmu: Split 64-bit write to 32-bit writes
perf: arm_cspmu: Separate Arm and vendor module
* for-next/sve-remove-pseudo-regs:
: arm64/fpsimd: Remove the vector length pseudo registers
arm64/sve: Remove SMCR pseudo register from cpufeature code
arm64/sve: Remove ZCR pseudo register from cpufeature code
* for-next/backtrace-ipi:
: Add IPI for backtraces/kgdb, use NMI
arm64: smp: Don't directly call arch_smp_send_reschedule() for wakeup
arm64: smp: avoid NMI IPIs with broken MediaTek FW
arm64: smp: Mark IPI globals as __ro_after_init
arm64: kgdb: Implement kgdb_roundup_cpus() to enable pseudo-NMI roundup
arm64: smp: IPI_CPU_STOP and IPI_CPU_CRASH_STOP should try for NMI
arm64: smp: Add arch support for backtrace using pseudo-NMI
arm64: smp: Remove dedicated wakeup IPI
arm64: idle: Tag the arm64 idle functions as __cpuidle
irqchip/gic-v3: Enable support for SGIs to act as NMIs
* for-next/kselftest:
: Various arm64 kselftest updates
kselftest/arm64: Validate SVCR in streaming SVE stress test
* for-next/misc:
: Miscellaneous patches
arm64: Restrict CPU_BIG_ENDIAN to GNU as or LLVM IAS 15.x or newer
arm64: module: Fix PLT counting when CONFIG_RANDOMIZE_BASE=n
arm64, irqchip/gic-v3, ACPI: Move MADT GICC enabled check into a helper
clocksource/drivers/arm_arch_timer: limit XGene-1 workaround
arm64: Remove system_uses_lse_atomics()
arm64: Mark the 'addr' argument to set_ptes() and __set_pte_at() as unused
arm64/mm: Hoist synchronization out of set_ptes() loop
arm64: swiotlb: Reduce the default size if no ZONE_DMA bouncing needed
* for-next/cpufeat-display-cores:
: arm64 cpufeature display enabled cores
arm64: cpufeature: Change DBM to display enabled cores
arm64: cpufeature: Display the set of cores with a feature
Now that we have the ability to display the list of cores
with a feature when its selectivly enabled, lets convert
DBM to use that as well.
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Link: https://lore.kernel.org/r/20231017052322.1211099-3-jeremy.linton@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The AMU feature can be enabled on a subset of the cores in a system.
Because of that, it prints a message for each core as it is detected.
This becomes tedious when there are hundreds of cores. Instead, for
CPU features which can be enabled on a subset of the present cores,
lets wait until update_cpu_capabilities() and print the subset of cores
the feature was enabled on.
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Punit Agrawal <punit.agrawal@bytedance.com>
Tested-by: Punit Agrawal <punit.agrawal@bytedance.com>
Link: https://lore.kernel.org/r/20231017052322.1211099-2-jeremy.linton@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
There are no longer any users of cpus_have_const_cap(), and therefore it
can be removed.
Remove cpus_have_const_cap(). At the same time, remove
__cpus_have_const_cap(), as this is a trivial wrapper of
alternative_has_cap_unlikely(), which can be used directly instead.
The comment for __system_matches_cap() is updated to no longer refer to
cpus_have_const_cap(). As we have a number of ways to check the cpucaps,
the specific suggestions are removed.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Kristina Martsenko <kristina.martsenko@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In has_useable_cnp() we use cpus_have_const_cap() to check for
ARM64_WORKAROUND_NVIDIA_CARMEL_CNP, but this is not necessary and
cpus_have_cap() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on when their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
We use has_useable_cnp() to determine whether we have the system-wide
ARM64_HAS_CNP cpucap. Due to the structure of the cpufeature code, we
call has_useable_cnp() in two distinct cases:
1) When finalizing system capabilities, setup_system_capabilities() will
call has_useable_cnp() with SCOPE_SYSTEM to determine whether all
CPUs have the feature. This is called after we've detected any local
cpucaps including ARM64_WORKAROUND_NVIDIA_CARMEL_CNP, but prior to
patching alternatives.
If the ARM64_WORKAROUND_NVIDIA_CARMEL_CNP was detected, we will not
detect ARM64_HAS_CNP.
2) After finalizing system capabilties, verify_local_cpu_capabilities()
will call has_useable_cnp() with SCOPE_LOCAL_CPU to verify that CPUs
have CNP if we previously detected it.
Note that if ARM64_WORKAROUND_NVIDIA_CARMEL_CNP was detected, we will
not have detected ARM64_HAS_CNP.
For case 1 we must check the system_cpucaps bitmap as this occurs prior
to patching the alternatives. For case 2 we'll only call
has_useable_cnp() once per subsequent onlining of a CPU, and as this
isn't a fast path it's not necessary to optimize for this case.
This patch replaces the use of cpus_have_const_cap() with
cpus_have_cap(), which will only generate the bitmap test and avoid
generating an alternative sequence, resulting in slightly simpler annd
smaller code being generated. The ARM64_WORKAROUND_NVIDIA_CARMEL_CNP
cpucap is added to cpucap_is_possible() so that code can be elided
entirely when this is not possible.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In elf_hwcap_fixup() we use cpus_have_const_cap() to check for
ARM64_WORKAROUND_1742098, but this is not necessary and cpus_have_cap()
would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on when their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
The ARM64_WORKAROUND_1742098 cpucap is detected and patched before
elf_hwcap_fixup() can run, and hence it is not necessary to use
cpus_have_const_cap(). We run cpus_have_const_cap() at most twice: once
after finalizing system cpucaps, and potentially once more after
detecting mismatched CPUs which support AArch32 at EL0. Due to this,
it's not necessary to optimize for many calls to elf_hwcap_fixup(), and
it's fine to use cpus_have_cap().
This patch replaces the use of cpus_have_const_cap() with
cpus_have_cap(), which will only generate the bitmap test and avoid
generating an alternative sequence, resulting in slightly simpler annd
smaller code being generated. For consistenct with other cpucaps, the
ARM64_WORKAROUND_1742098 cpucap is added to cpucap_is_possible() so that
code can be elided when this is not possible. However, as we only define
compat_elf_hwcap2 when CONFIG_COMPAT=y, some ifdeffery is still required
within user_feature_fixup() to avoid build errors when CONFIG_COMPAT=n.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In system_uses_hw_pan() we use cpus_have_const_cap() to check for
ARM64_HAS_PAN, but this is only necessary so that the
system_uses_ttbr0_pan() check in setup_cpu_features() can run prior to
alternatives being patched, and otherwise this is not necessary and
alternative_has_cap_*() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on when their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
The ARM64_HAS_PAN cpucap is used by system_uses_hw_pan() and
system_uses_ttbr0_pan() depending on whether CONFIG_ARM64_SW_TTBR0_PAN
is selected, and:
* We only use system_uses_hw_pan() directly in __sdei_handler(), which
isn't reachable until after alternatives have been patched, and for
this it is safe to use alternative_has_cap_*().
* We use system_uses_ttbr0_pan() in a few places:
- In check_and_switch_context() and cpu_uninstall_idmap(), which will
defer installing a translation table into TTBR0 when the
ARM64_HAS_PAN cpucap is not detected.
Prior to patching alternatives, all CPUs will be using init_mm with
the reserved ttbr0 translation tables install in TTBR0, so these can
safely use alternative_has_cap_*().
- In update_saved_ttbr0(), which will only save the active TTBR0 into
a per-thread variable when the ARM64_HAS_PAN cpucap is not detected.
Prior to patching alternatives, all CPUs will be using init_mm with
the reserved ttbr0 translation tables install in TTBR0, so these can
safely use alternative_has_cap_*().
- In efi_set_pgd(), which will handle check_and_switch_context()
deferring the installation of TTBR0 when TTBR0 PAN is detected.
The EFI runtime services are not initialized until after
alternatives have been patched, and so this can safely use
alternative_has_cap_*() or cpus_have_final_cap().
- In uaccess_ttbr0_disable() and uaccess_ttbr0_enable(), where we'll
avoid installing/uninstalling a translation table in TTBR0 when
ARM64_HAS_PAN is detected.
Prior to patching alternatives we will not perform any uaccess and
will not call uaccess_ttbr0_disable() or uaccess_ttbr0_enable(), and
so these can safely use alternative_has_cap_*() or
cpus_have_final_cap().
- In is_el1_permission_fault() where we will consider a translation
fault on a TTBR0 address to be a permission fault when ARM64_HAS_PAN
is not detected *and* we have set the PAN bit in the SPSR (which
tells us that in the interrupted context, TTBR0 pointed at the
reserved zero ttbr).
In the window between detecting system cpucaps and patching
alternatives we should not perform any accesses to TTBR0 addresses,
and no userspace translation tables exist until after patching
alternatives. Thus it is safe for this to use alternative_has_cap*().
This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which will avoid generating code to test
the system_cpucaps bitmap and should be better for all subsequent calls
at runtime.
So that the check for TTBR0 PAN in setup_cpu_features() can run prior to
alternatives being patched, the call to system_uses_ttbr0_pan() is
replaced with an explicit check of the ARM64_HAS_PAN bit in the
system_cpucaps bitmap.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In system_supports_cnp() we use cpus_have_const_cap() to check for
ARM64_HAS_CNP, but this is only necessary so that the cpu_enable_cnp()
callback can run prior to alternatives being patched, and otherwise this
is not necessary and alternative_has_cap_*() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on when their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
The cpu_enable_cnp() callback is run immediately after the ARM64_HAS_CNP
cpucap is detected system-wide under setup_system_capabilities(), prior
to alternatives being patched. During this window cpu_enable_cnp() uses
cpu_replace_ttbr1() to set the CNP bit for the swapper_pg_dir in TTBR1.
No other users of the ARM64_HAS_CNP cpucap need the up-to-date value
during this window:
* As KVM isn't initialized yet, kvm_get_vttbr() isn't reachable.
* As cpuidle isn't initialized yet, __cpu_suspend_exit() isn't
reachable.
* At this point all CPUs are using the swapper_pg_dir with a reserved
ASID in TTBR1, and the idmap_pg_dir in TTBR0, so neither
check_and_switch_context() nor cpu_do_switch_mm() need to do anything
special.
This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which will avoid generating code to test
the system_cpucaps bitmap and should be better for all subsequent calls
at runtime. To allow cpu_enable_cnp() to function prior to alternatives
being patched, cpu_replace_ttbr1() is split into cpu_replace_ttbr1() and
cpu_enable_swapper_cnp(), with the former only used for early TTBR1
replacement, and the latter used by both cpu_enable_cnp() and
__cpu_suspend_exit().
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Vladimir Murzin <vladimir.murzin@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Currently we have a negative cpucap which describes the *absence* of
FP/SIMD rather than *presence* of FP/SIMD. This largely works, but is
somewhat awkward relative to other cpucaps that describe the presence of
a feature, and it would be nicer to have a cpucap which describes the
presence of FP/SIMD:
* This will allow the cpucap to be treated as a standard
ARM64_CPUCAP_SYSTEM_FEATURE, which can be detected with the standard
has_cpuid_feature() function and ARM64_CPUID_FIELDS() description.
* This ensures that the cpucap will only transition from not-present to
present, reducing the risk of unintentional and/or unsafe usage of
FP/SIMD before cpucaps are finalized.
* This will allow using arm64_cpu_capabilities::cpu_enable() to enable
the use of FP/SIMD later, with FP/SIMD being disabled at boot time
otherwise. This will ensure that any unintentional and/or unsafe usage
of FP/SIMD prior to this is trapped, and will ensure that FP/SIMD is
never unintentionally enabled for userspace in mismatched big.LITTLE
systems.
This patch replaces the negative ARM64_HAS_NO_FPSIMD cpucap with a
positive ARM64_HAS_FPSIMD cpucap, making changes as described above.
Note that as FP/SIMD will now be trapped when not supported system-wide,
do_fpsimd_acc() must handle these traps in the same way as for SVE and
SME. The commentary in fpsimd_restore_current_state() is updated to
describe the new scheme.
No users of system_supports_fpsimd() need to know that FP/SIMD is
available prior to alternatives being patched, so this is updated to
use alternative_has_cap_likely() to check for the ARM64_HAS_FPSIMD
cpucap, without generating code to test the system_cpucaps bitmap.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The arm64_cpu_capabilities::cpu_enable() callbacks for SVE, SME, SME2,
and FA64 are named with an unusual "${feature}_kernel_enable" pattern
rather than the much more common "cpu_enable_${feature}". Now that we
only use these as cpu_enable() callbacks, it would be nice to have them
match the usual scheme.
This patch renames the cpu_enable() callbacks to match this scheme. At
the same time, the comment above cpu_enable_sve() is removed for
consistency with the other cpu_enable() callbacks.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
When a CPUs onlined we first probe for supported features and
propetites, and then we subsequently enable features that have been
detected. This is a little problematic for SVE and SME, as some
properties (e.g. vector lengths) cannot be probed while they are
disabled. Due to this, the code probing for SVE properties has to enable
SVE for EL1 prior to proving, and the code probing for SME properties
has to enable SME for EL1 prior to probing. We never disable SVE or SME
for EL1 after probing.
It would be a little nicer to transiently enable SVE and SME during
probing, leaving them both disabled unless explicitly enabled, as this
would make it much easier to catch unintentional usage (e.g. when they
are not present system-wide).
This patch reworks the SVE and SME feature probing code to only
transiently enable support at EL1, disabling after probing is complete.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The arm64_cpu_capabilities::cpu_enable callbacks are intended for
cpu-local feature enablement (e.g. poking system registers). These get
called for each online CPU when boot/system cpucaps get finalized and
enabled, and get called whenever a CPU is subsequently onlined.
For KPTI with the ARM64_UNMAP_KERNEL_AT_EL0 cpucap, we use the
kpti_install_ng_mappings() function as the cpu_enable callback. This
does a mixture of cpu-local configuration (setting VBAR_EL1 to the
appropriate trampoline vectors) and some global configuration (rewriting
the swapper page tables to sue non-glboal mappings) that must happen at
most once.
This patch splits kpti_install_ng_mappings() into a cpu-local
cpu_enable_kpti() initialization function and a system-wide
kpti_install_ng_mappings() function. The cpu_enable_kpti() function is
responsible for selecting the necessary cpu-local vectors each time a
CPU is onlined, and the kpti_install_ng_mappings() function performs the
one-time rewrite of the translation tables too use non-global mappings.
Splitting the two makes the code a bit easier to follow and also allows
the page table rewriting code to be marked as __init such that it can be
freed after use.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
For ARM64_WORKAROUND_2658417, we use a cpu_enable() callback to hide the
ID_AA64ISAR1_EL1.BF16 ID register field. This is a little awkward as
CPUs may attempt to apply the workaround concurrently, requiring that we
protect the bulk of the callback with a raw_spinlock, and requiring some
pointless work every time a CPU is subsequently hotplugged in.
This patch makes this a little simpler by handling the masking once at
boot time. A new user_feature_fixup() function is called at the start of
setup_user_features() to mask the feature, matching the style of
elf_hwcap_fixup(). The ARM64_WORKAROUND_2658417 cpucap is added to
cpucap_is_possible() so that code can be elided entirely when this is
not possible.
Note that the ARM64_WORKAROUND_2658417 capability is matched with
ERRATA_MIDR_RANGE(), which implicitly gives the capability a
ARM64_CPUCAP_LOCAL_CPU_ERRATUM type, which forbids the late onlining of
a CPU with the erratum if the erratum was not present at boot time.
Therefore this patch doesn't change the behaviour for late onlining.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Currently setup_cpu_features() handles a mixture of one-time kernel
feature setup (e.g. cpucaps) and one-time user feature setup (e.g. ELF
hwcaps). Subsequent patches will rework other one-time setup and expand
the logic currently in setup_cpu_features(), and in preparation for this
it would be helpful to split the kernel and user setup into separate
functions.
This patch splits setup_user_features() out of setup_cpu_features(),
with a few additional cleanups of note:
* setup_cpu_features() is renamed to setup_system_features() to make it
clear that it handles system-wide feature setup rather than cpu-local
feature setup.
* setup_system_capabilities() is folded into setup_system_features().
* Presence of TTBR0 pan is logged immediately after
update_cpu_capabilities(), so that this is guaranteed to appear
alongside all the other detected system cpucaps.
* The 'cwg' variable is removed as its value is only consumed once and
it's simpler to use cache_type_cwg() directly without assigning its
return value to a variable.
* The call to setup_user_features() is moved after alternatives are
patched, which will allow user feature setup code to depend on
alternative branches and allow for simplifications in subsequent
patches.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
FEAT_LRCPC3 adds more instructions to support the Release Consistency model.
Add a HWCAP so that userspace can make decisions about instructions it can use.
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230919162757.2707023-2-joey.gouly@arm.com
[catalin.marinas@arm.com: change the HWCAP number]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
SVE 2.1 introduced a new feature FEAT_SVE_B16B16 which adds instructions
supporting the BFloat16 floating point format. Report this to userspace
through the ID registers and hwcap.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230915-arm64-zfr-b16b16-el0-v1-1-f9aba807bdb5@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
For reasons that are not currently apparent during cpufeature enumeration
we maintain a pseudo register for SMCR which records the maximum supported
vector length using the value that would be written to SMCR_EL1.LEN to
configure it. This is not exposed to userspace and is not sufficient for
detecting unsupportable configurations, we need the more detailed checks in
vec_update_vq_map() for that since we can't cope with missing vector
lengths on late CPUs and KVM requires an exactly matching set of supported
vector lengths as EL1 can enumerate VLs directly with the hardware.
Remove the code, replacing the usage in sme_setup() with a query of the
vq_map.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230913-arm64-vec-len-cpufeature-v1-2-cc69b0600a8a@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
For reasons that are not currently apparent during cpufeature enumeration
we maintain a pseudo register for ZCR which records the maximum supported
vector length using the value that would be written to ZCR_EL1.LEN to
configure it. This is not exposed to userspace and is not sufficient for
detecting unsupportable configurations, we need the more detailed checks in
vec_update_vq_map() for that since we can't cope with missing vector
lengths on late CPUs and KVM requires an exactly matching set of supported
vector lengths as EL1 can enumerate VLs directly with the hardware.
Remove the code, replacing the usage in sve_setup() with a query of the
vq_map.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230913-arm64-vec-len-cpufeature-v1-1-cc69b0600a8a@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
ClearBHB support is indicated by the CLRBHB field in ID_AA64ISAR2_EL1.
Following some refactoring the kernel incorrectly checks the BC field
instead. Fix the detection to use the right field.
(Note: The original ClearBHB support had it as FTR_HIGHER_SAFE, but this
patch uses FTR_LOWER_SAFE, which seems more correct.)
Also fix the detection of BC (hinted conditional branches) to use
FTR_LOWER_SAFE, so that it is not reported on mismatched systems.
Fixes: 356137e68a ("arm64/sysreg: Make BHB clear feature defines match the architecture")
Fixes: 8fcc8285c0 ("arm64/sysreg: Convert ID_AA64ISAR2_EL1 to automatic generation")
Cc: stable@vger.kernel.org
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230912133429.2606875-1-kristina.martsenko@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
* Clean up vCPU targets, always returning generic v8 as the preferred target
* Trap forwarding infrastructure for nested virtualization (used for traps
that are taken from an L2 guest and are needed by the L1 hypervisor)
* FEAT_TLBIRANGE support to only invalidate specific ranges of addresses
when collapsing a table PTE to a block PTE. This avoids that the guest
refills the TLBs again for addresses that aren't covered by the table PTE.
* Fix vPMU issues related to handling of PMUver.
* Don't unnecessary align non-stack allocations in the EL2 VA space
* Drop HCR_VIRT_EXCP_MASK, which was never used...
* Don't use smp_processor_id() in kvm_arch_vcpu_load(),
but the cpu parameter instead
* Drop redundant call to kvm_set_pfn_accessed() in user_mem_abort()
* Remove prototypes without implementations
RISC-V:
* Zba, Zbs, Zicntr, Zicsr, Zifencei, and Zihpm support for guest
* Added ONE_REG interface for SATP mode
* Added ONE_REG interface to enable/disable multiple ISA extensions
* Improved error codes returned by ONE_REG interfaces
* Added KVM_GET_REG_LIST ioctl() implementation for KVM RISC-V
* Added get-reg-list selftest for KVM RISC-V
s390:
* PV crypto passthrough enablement (Tony, Steffen, Viktor, Janosch)
Allows a PV guest to use crypto cards. Card access is governed by
the firmware and once a crypto queue is "bound" to a PV VM every
other entity (PV or not) looses access until it is not bound
anymore. Enablement is done via flags when creating the PV VM.
* Guest debug fixes (Ilya)
x86:
* Clean up KVM's handling of Intel architectural events
* Intel bugfixes
* Add support for SEV-ES DebugSwap, allowing SEV-ES guests to use debug
registers and generate/handle #DBs
* Clean up LBR virtualization code
* Fix a bug where KVM fails to set the target pCPU during an IRTE update
* Fix fatal bugs in SEV-ES intrahost migration
* Fix a bug where the recent (architecturally correct) change to reinject
#BP and skip INT3 broke SEV guests (can't decode INT3 to skip it)
* Retry APIC map recalculation if a vCPU is added/enabled
* Overhaul emergency reboot code to bring SVM up to par with VMX, tie the
"emergency disabling" behavior to KVM actually being loaded, and move all of
the logic within KVM
* Fix user triggerable WARNs in SVM where KVM incorrectly assumes the TSC
ratio MSR cannot diverge from the default when TSC scaling is disabled
up related code
* Add a framework to allow "caching" feature flags so that KVM can check if
the guest can use a feature without needing to search guest CPUID
* Rip out the ancient MMU_DEBUG crud and replace the useful bits with
CONFIG_KVM_PROVE_MMU
* Fix KVM's handling of !visible guest roots to avoid premature triple fault
injection
* Overhaul KVM's page-track APIs, and KVMGT's usage, to reduce the API surface
that is needed by external users (currently only KVMGT), and fix a variety
of issues in the process
This last item had a silly one-character bug in the topic branch that
was sent to me. Because it caused pretty bad selftest failures in
some configurations, I decided to squash in the fix. So, while the
exact commit ids haven't been in linux-next, the code has (from the
kvm-x86 tree).
Generic:
* Wrap kvm_{gfn,hva}_range.pte in a union to allow mmu_notifier events to pass
action specific data without needing to constantly update the main handlers.
* Drop unused function declarations
Selftests:
* Add testcases to x86's sync_regs_test for detecting KVM TOCTOU bugs
* Add support for printf() in guest code and covert all guest asserts to use
printf-based reporting
* Clean up the PMU event filter test and add new testcases
* Include x86 selftests in the KVM x86 MAINTAINERS entry
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmT1m0kUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroMNgggAiN7nz6UC423FznuI+yO3TLm8tkx1
CpKh5onqQogVtchH+vrngi97cfOzZb1/AtifY90OWQi31KEWhehkeofcx7G6ERhj
5a9NFADY1xGBsX4exca/VHDxhnzsbDWaWYPXw5vWFWI6erft9Mvy3tp1LwTvOzqM
v8X4aWz+g5bmo/DWJf4Wu32tEU6mnxzkrjKU14JmyqQTBawVmJ3RYvHVJ/Agpw+n
hRtPAy7FU6XTdkmq/uCT+KUHuJEIK0E/l1js47HFAqSzwdW70UDg14GGo1o4ETxu
RjZQmVNvL57yVgi6QU38/A0FWIsWQm5IlaX1Ug6x8pjZPnUKNbo9BY4T1g==
=W+4p
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"ARM:
- Clean up vCPU targets, always returning generic v8 as the preferred
target
- Trap forwarding infrastructure for nested virtualization (used for
traps that are taken from an L2 guest and are needed by the L1
hypervisor)
- FEAT_TLBIRANGE support to only invalidate specific ranges of
addresses when collapsing a table PTE to a block PTE. This avoids
that the guest refills the TLBs again for addresses that aren't
covered by the table PTE.
- Fix vPMU issues related to handling of PMUver.
- Don't unnecessary align non-stack allocations in the EL2 VA space
- Drop HCR_VIRT_EXCP_MASK, which was never used...
- Don't use smp_processor_id() in kvm_arch_vcpu_load(), but the cpu
parameter instead
- Drop redundant call to kvm_set_pfn_accessed() in user_mem_abort()
- Remove prototypes without implementations
RISC-V:
- Zba, Zbs, Zicntr, Zicsr, Zifencei, and Zihpm support for guest
- Added ONE_REG interface for SATP mode
- Added ONE_REG interface to enable/disable multiple ISA extensions
- Improved error codes returned by ONE_REG interfaces
- Added KVM_GET_REG_LIST ioctl() implementation for KVM RISC-V
- Added get-reg-list selftest for KVM RISC-V
s390:
- PV crypto passthrough enablement (Tony, Steffen, Viktor, Janosch)
Allows a PV guest to use crypto cards. Card access is governed by
the firmware and once a crypto queue is "bound" to a PV VM every
other entity (PV or not) looses access until it is not bound
anymore. Enablement is done via flags when creating the PV VM.
- Guest debug fixes (Ilya)
x86:
- Clean up KVM's handling of Intel architectural events
- Intel bugfixes
- Add support for SEV-ES DebugSwap, allowing SEV-ES guests to use
debug registers and generate/handle #DBs
- Clean up LBR virtualization code
- Fix a bug where KVM fails to set the target pCPU during an IRTE
update
- Fix fatal bugs in SEV-ES intrahost migration
- Fix a bug where the recent (architecturally correct) change to
reinject #BP and skip INT3 broke SEV guests (can't decode INT3 to
skip it)
- Retry APIC map recalculation if a vCPU is added/enabled
- Overhaul emergency reboot code to bring SVM up to par with VMX, tie
the "emergency disabling" behavior to KVM actually being loaded,
and move all of the logic within KVM
- Fix user triggerable WARNs in SVM where KVM incorrectly assumes the
TSC ratio MSR cannot diverge from the default when TSC scaling is
disabled up related code
- Add a framework to allow "caching" feature flags so that KVM can
check if the guest can use a feature without needing to search
guest CPUID
- Rip out the ancient MMU_DEBUG crud and replace the useful bits with
CONFIG_KVM_PROVE_MMU
- Fix KVM's handling of !visible guest roots to avoid premature
triple fault injection
- Overhaul KVM's page-track APIs, and KVMGT's usage, to reduce the
API surface that is needed by external users (currently only
KVMGT), and fix a variety of issues in the process
Generic:
- Wrap kvm_{gfn,hva}_range.pte in a union to allow mmu_notifier
events to pass action specific data without needing to constantly
update the main handlers.
- Drop unused function declarations
Selftests:
- Add testcases to x86's sync_regs_test for detecting KVM TOCTOU bugs
- Add support for printf() in guest code and covert all guest asserts
to use printf-based reporting
- Clean up the PMU event filter test and add new testcases
- Include x86 selftests in the KVM x86 MAINTAINERS entry"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (279 commits)
KVM: x86/mmu: Include mmu.h in spte.h
KVM: x86/mmu: Use dummy root, backed by zero page, for !visible guest roots
KVM: x86/mmu: Disallow guest from using !visible slots for page tables
KVM: x86/mmu: Harden TDP MMU iteration against root w/o shadow page
KVM: x86/mmu: Harden new PGD against roots without shadow pages
KVM: x86/mmu: Add helper to convert root hpa to shadow page
drm/i915/gvt: Drop final dependencies on KVM internal details
KVM: x86/mmu: Handle KVM bookkeeping in page-track APIs, not callers
KVM: x86/mmu: Drop @slot param from exported/external page-track APIs
KVM: x86/mmu: Bug the VM if write-tracking is used but not enabled
KVM: x86/mmu: Assert that correct locks are held for page write-tracking
KVM: x86/mmu: Rename page-track APIs to reflect the new reality
KVM: x86/mmu: Drop infrastructure for multiple page-track modes
KVM: x86/mmu: Use page-track notifiers iff there are external users
KVM: x86/mmu: Move KVM-only page-track declarations to internal header
KVM: x86: Remove the unused page-track hook track_flush_slot()
drm/i915/gvt: switch from ->track_flush_slot() to ->track_remove_region()
KVM: x86: Add a new page-track hook to handle memslot deletion
drm/i915/gvt: Don't bother removing write-protection on to-be-deleted slot
KVM: x86: Reject memslot MOVE operations if KVMGT is attached
...
In order to allow us to have shared code for managing fine grained traps
for KVM guests add it as a detected feature rather than relying on it
being a dependency of other features.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
[maz: converted to ARM64_CPUID_FIELDS()]
Link: https://lore.kernel.org/r/20230301-kvm-arm64-fgt-v4-1-1bf8d235ac1f@kernel.org
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230815183903.2735724-10-maz@kernel.org
Add a HWCAP for FEAT_HBC, so that userspace can make a decision on using
this feature.
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230804143746.3900803-2-joey.gouly@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
The recently added Enhanced Virtualization Traps cpufeature does not use
the ARM64_CPUID_FIELDS() helper, convert it to do so. No functional
change.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Zenghui Yu <zenghui.yu@linux.dev>
Link: https://lore.kernel.org/r/20230718-arm64-evt-cpuid-helper-v1-1-68375d1e6b92@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
* Eager page splitting optimization for dirty logging, optionally
allowing for a VM to avoid the cost of hugepage splitting in the stage-2
fault path.
* Arm FF-A proxy for pKVM, allowing a pKVM host to safely interact with
services that live in the Secure world. pKVM intervenes on FF-A calls
to guarantee the host doesn't misuse memory donated to the hyp or a
pKVM guest.
* Support for running the split hypervisor with VHE enabled, known as
'hVHE' mode. This is extremely useful for testing the split
hypervisor on VHE-only systems, and paves the way for new use cases
that depend on having two TTBRs available at EL2.
* Generalized framework for configurable ID registers from userspace.
KVM/arm64 currently prevents arbitrary CPU feature set configuration
from userspace, but the intent is to relax this limitation and allow
userspace to select a feature set consistent with the CPU.
* Enable the use of Branch Target Identification (FEAT_BTI) in the
hypervisor.
* Use a separate set of pointer authentication keys for the hypervisor
when running in protected mode, as the host is untrusted at runtime.
* Ensure timer IRQs are consistently released in the init failure
paths.
* Avoid trapping CTR_EL0 on systems with Enhanced Virtualization Traps
(FEAT_EVT), as it is a register commonly read from userspace.
* Erratum workaround for the upcoming AmpereOne part, which has broken
hardware A/D state management.
RISC-V:
* Redirect AMO load/store misaligned traps to KVM guest
* Trap-n-emulate AIA in-kernel irqchip for KVM guest
* Svnapot support for KVM Guest
s390:
* New uvdevice secret API
* CMM selftest and fixes
* fix racy access to target CPU for diag 9c
x86:
* Fix missing/incorrect #GP checks on ENCLS
* Use standard mmu_notifier hooks for handling APIC access page
* Drop now unnecessary TR/TSS load after VM-Exit on AMD
* Print more descriptive information about the status of SEV and SEV-ES during
module load
* Add a test for splitting and reconstituting hugepages during and after
dirty logging
* Add support for CPU pinning in demand paging test
* Add support for AMD PerfMonV2, with a variety of cleanups and minor fixes
included along the way
* Add a "nx_huge_pages=never" option to effectively avoid creating NX hugepage
recovery threads (because nx_huge_pages=off can be toggled at runtime)
* Move handling of PAT out of MTRR code and dedup SVM+VMX code
* Fix output of PIC poll command emulation when there's an interrupt
* Add a maintainer's handbook to document KVM x86 processes, preferred coding
style, testing expectations, etc.
* Misc cleanups, fixes and comments
Generic:
* Miscellaneous bugfixes and cleanups
Selftests:
* Generate dependency files so that partial rebuilds work as expected
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmSgHrIUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroORcAf+KkBlXwQMf+Q0Hy6Mfe0OtkKmh0Ae
6HJ6dsuMfOHhWv5kgukh+qvuGUGzHq+gpVKmZg2yP3h3cLHOLUAYMCDm+rjXyjsk
F4DbnJLfxq43Pe9PHRKFxxSecRcRYCNox0GD5UYL4PLKcH0FyfQrV+HVBK+GI8L3
FDzUcyJkR12Lcj1qf++7fsbzfOshL0AJPmidQCoc6wkLJpUEr/nYUqlI1Kx3YNuQ
LKmxFHS4l4/O/px3GKNDrLWDbrVlwciGIa3GZLS52PZdW3mAqT+cqcPcYK6SW71P
m1vE80VbNELX5q3YSRoOXtedoZ3Pk97LEmz/xQAsJ/jri0Z5Syk0Ok0m/Q==
=AMXp
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"ARM64:
- Eager page splitting optimization for dirty logging, optionally
allowing for a VM to avoid the cost of hugepage splitting in the
stage-2 fault path.
- Arm FF-A proxy for pKVM, allowing a pKVM host to safely interact
with services that live in the Secure world. pKVM intervenes on
FF-A calls to guarantee the host doesn't misuse memory donated to
the hyp or a pKVM guest.
- Support for running the split hypervisor with VHE enabled, known as
'hVHE' mode. This is extremely useful for testing the split
hypervisor on VHE-only systems, and paves the way for new use cases
that depend on having two TTBRs available at EL2.
- Generalized framework for configurable ID registers from userspace.
KVM/arm64 currently prevents arbitrary CPU feature set
configuration from userspace, but the intent is to relax this
limitation and allow userspace to select a feature set consistent
with the CPU.
- Enable the use of Branch Target Identification (FEAT_BTI) in the
hypervisor.
- Use a separate set of pointer authentication keys for the
hypervisor when running in protected mode, as the host is untrusted
at runtime.
- Ensure timer IRQs are consistently released in the init failure
paths.
- Avoid trapping CTR_EL0 on systems with Enhanced Virtualization
Traps (FEAT_EVT), as it is a register commonly read from userspace.
- Erratum workaround for the upcoming AmpereOne part, which has
broken hardware A/D state management.
RISC-V:
- Redirect AMO load/store misaligned traps to KVM guest
- Trap-n-emulate AIA in-kernel irqchip for KVM guest
- Svnapot support for KVM Guest
s390:
- New uvdevice secret API
- CMM selftest and fixes
- fix racy access to target CPU for diag 9c
x86:
- Fix missing/incorrect #GP checks on ENCLS
- Use standard mmu_notifier hooks for handling APIC access page
- Drop now unnecessary TR/TSS load after VM-Exit on AMD
- Print more descriptive information about the status of SEV and
SEV-ES during module load
- Add a test for splitting and reconstituting hugepages during and
after dirty logging
- Add support for CPU pinning in demand paging test
- Add support for AMD PerfMonV2, with a variety of cleanups and minor
fixes included along the way
- Add a "nx_huge_pages=never" option to effectively avoid creating NX
hugepage recovery threads (because nx_huge_pages=off can be toggled
at runtime)
- Move handling of PAT out of MTRR code and dedup SVM+VMX code
- Fix output of PIC poll command emulation when there's an interrupt
- Add a maintainer's handbook to document KVM x86 processes,
preferred coding style, testing expectations, etc.
- Misc cleanups, fixes and comments
Generic:
- Miscellaneous bugfixes and cleanups
Selftests:
- Generate dependency files so that partial rebuilds work as
expected"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (153 commits)
Documentation/process: Add a maintainer handbook for KVM x86
Documentation/process: Add a label for the tip tree handbook's coding style
KVM: arm64: Fix misuse of KVM_ARM_VCPU_POWER_OFF bit index
RISC-V: KVM: Remove unneeded semicolon
RISC-V: KVM: Allow Svnapot extension for Guest/VM
riscv: kvm: define vcpu_sbi_ext_pmu in header
RISC-V: KVM: Expose IMSIC registers as attributes of AIA irqchip
RISC-V: KVM: Add in-kernel virtualization of AIA IMSIC
RISC-V: KVM: Expose APLIC registers as attributes of AIA irqchip
RISC-V: KVM: Add in-kernel emulation of AIA APLIC
RISC-V: KVM: Implement device interface for AIA irqchip
RISC-V: KVM: Skeletal in-kernel AIA irqchip support
RISC-V: KVM: Set kvm_riscv_aia_nr_hgei to zero
RISC-V: KVM: Add APLIC related defines
RISC-V: KVM: Add IMSIC related defines
RISC-V: KVM: Implement guest external interrupt line management
KVM: x86: Remove PRIx* definitions as they are solely for user space
s390/uv: Update query for secret-UVCs
s390/uv: replace scnprintf with sysfs_emit
s390/uvdevice: Add 'Lock Secret Store' UVC
...
* arm64/for-next/perf:
docs: perf: Fix warning from 'make htmldocs' in hisi-pmu.rst
docs: perf: Add new description for HiSilicon UC PMU
drivers/perf: hisi: Add support for HiSilicon UC PMU driver
drivers/perf: hisi: Add support for HiSilicon H60PA and PAv3 PMU driver
perf: arm_cspmu: Add missing MODULE_DEVICE_TABLE
perf/arm-cmn: Add sysfs identifier
perf/arm-cmn: Revamp model detection
perf/arm_dmc620: Add cpumask
dt-bindings: perf: fsl-imx-ddr: Add i.MX93 compatible
drivers/perf: imx_ddr: Add support for NXP i.MX9 SoC DDRC PMU driver
perf/arm_cspmu: Decouple APMT dependency
perf/arm_cspmu: Clean up ACPI dependency
ACPI/APMT: Don't register invalid resource
perf/arm_cspmu: Fix event attribute type
perf: arm_cspmu: Set irq affinitiy only if overflow interrupt is used
drivers/perf: hisi: Don't migrate perf to the CPU going to teardown
drivers/perf: apple_m1: Force 63bit counters for M2 CPUs
perf/arm-cmn: Fix DTC reset
perf: qcom_l2_pmu: Make l2_cache_pmu_probe_cluster() more robust
perf/arm-cci: Slightly optimize cci_pmu_sync_counters()
* for-next/kpti:
: Simplify KPTI trampoline exit code
arm64: entry: Simplify tramp_alias macro and tramp_exit routine
arm64: entry: Preserve/restore X29 even for compat tasks
* for-next/missing-proto-warn:
: Address -Wmissing-prototype warnings
arm64: add alt_cb_patch_nops prototype
arm64: move early_brk64 prototype to header
arm64: signal: include asm/exception.h
arm64: kaslr: add kaslr_early_init() declaration
arm64: flush: include linux/libnvdimm.h
arm64: module-plts: inline linux/moduleloader.h
arm64: hide unused is_valid_bugaddr()
arm64: efi: add efi_handle_corrupted_x18 prototype
arm64: cpuidle: fix #ifdef for acpi functions
arm64: kvm: add prototypes for functions called in asm
arm64: spectre: provide prototypes for internal functions
arm64: move cpu_suspend_set_dbg_restorer() prototype to header
arm64: avoid prototype warnings for syscalls
arm64: add scs_patch_vmlinux prototype
arm64: xor-neon: mark xor_arm64_neon_*() static
* for-next/iss2-decode:
: Add decode of ISS2 to data abort reports
arm64/esr: Add decode of ISS2 to data abort reporting
arm64/esr: Use GENMASK() for the ISS mask
* for-next/kselftest:
: Various arm64 kselftest improvements
kselftest/arm64: Log signal code and address for unexpected signals
kselftest/arm64: Add a smoke test for ptracing hardware break/watch points
* for-next/misc:
: Miscellaneous patches
arm64: alternatives: make clean_dcache_range_nopatch() noinstr-safe
arm64: hibernate: remove WARN_ON in save_processor_state
arm64/fpsimd: Exit streaming mode when flushing tasks
arm64: mm: fix VA-range sanity check
arm64/mm: remove now-superfluous ISBs from TTBR writes
arm64: consolidate rox page protection logic
arm64: set __exception_irq_entry with __irq_entry as a default
arm64: syscall: unmask DAIF for tracing status
arm64: lockdep: enable checks for held locks when returning to userspace
arm64/cpucaps: increase string width to properly format cpucaps.h
arm64/cpufeature: Use helper for ECV CNTPOFF cpufeature
* for-next/feat_mops:
: Support for ARMv8.8 memcpy instructions in userspace
kselftest/arm64: add MOPS to hwcap test
arm64: mops: allow disabling MOPS from the kernel command line
arm64: mops: detect and enable FEAT_MOPS
arm64: mops: handle single stepping after MOPS exception
arm64: mops: handle MOPS exceptions
KVM: arm64: hide MOPS from guests
arm64: mops: don't disable host MOPS instructions from EL2
arm64: mops: document boot requirements for MOPS
KVM: arm64: switch HCRX_EL2 between host and guest
arm64: cpufeature: detect FEAT_HCX
KVM: arm64: initialize HCRX_EL2
* for-next/module-alloc:
: Make the arm64 module allocation code more robust (clean-up, VA range expansion)
arm64: module: rework module VA range selection
arm64: module: mandate MODULE_PLTS
arm64: module: move module randomization to module.c
arm64: kaslr: split kaslr/module initialization
arm64: kasan: remove !KASAN_VMALLOC remnants
arm64: module: remove old !KASAN_VMALLOC logic
* for-next/sysreg: (21 commits)
: More sysreg conversions to automatic generation
arm64/sysreg: Convert TRBIDR_EL1 register to automatic generation
arm64/sysreg: Convert TRBTRG_EL1 register to automatic generation
arm64/sysreg: Convert TRBMAR_EL1 register to automatic generation
arm64/sysreg: Convert TRBSR_EL1 register to automatic generation
arm64/sysreg: Convert TRBBASER_EL1 register to automatic generation
arm64/sysreg: Convert TRBPTR_EL1 register to automatic generation
arm64/sysreg: Convert TRBLIMITR_EL1 register to automatic generation
arm64/sysreg: Rename TRBIDR_EL1 fields per auto-gen tools format
arm64/sysreg: Rename TRBTRG_EL1 fields per auto-gen tools format
arm64/sysreg: Rename TRBMAR_EL1 fields per auto-gen tools format
arm64/sysreg: Rename TRBSR_EL1 fields per auto-gen tools format
arm64/sysreg: Rename TRBBASER_EL1 fields per auto-gen tools format
arm64/sysreg: Rename TRBPTR_EL1 fields per auto-gen tools format
arm64/sysreg: Rename TRBLIMITR_EL1 fields per auto-gen tools format
arm64/sysreg: Convert OSECCR_EL1 to automatic generation
arm64/sysreg: Convert OSDTRTX_EL1 to automatic generation
arm64/sysreg: Convert OSDTRRX_EL1 to automatic generation
arm64/sysreg: Convert OSLAR_EL1 to automatic generation
arm64/sysreg: Standardise naming of bitfield constants in OSL[AS]R_EL1
arm64/sysreg: Convert MDSCR_EL1 to automatic register generation
...
* for-next/cpucap:
: arm64 cpucap clean-up
arm64: cpufeature: fold cpus_set_cap() into update_cpu_capabilities()
arm64: cpufeature: use cpucap naming
arm64: alternatives: use cpucap naming
arm64: standardise cpucap bitmap names
* for-next/acpi:
: Various arm64-related ACPI patches
ACPI: bus: Consolidate all arm specific initialisation into acpi_arm_init()
* for-next/kdump:
: Simplify the crashkernel reservation behaviour of crashkernel=X,high on arm64
arm64: add kdump.rst into index.rst
Documentation: add kdump.rst to present crashkernel reservation on arm64
arm64: kdump: simplify the reservation behaviour of crashkernel=,high
* for-next/acpi-doc:
: Update ACPI documentation for Arm systems
Documentation/arm64: Update ACPI tables from BBR
Documentation/arm64: Update references in arm-acpi
Documentation/arm64: Update ARM and arch reference
* for-next/doc:
: arm64 documentation updates
Documentation/arm64: Add ptdump documentation
* for-next/tpidr2-fix:
: Fix the TPIDR2_EL0 register restoring on sigreturn
kselftest/arm64: Add a test case for TPIDR2 restore
arm64/signal: Restore TPIDR2 register rather than memory state
* kvm-arm64/misc:
: Miscellaneous updates
:
: - Avoid trapping CTR_EL0 on systems with FEAT_EVT, as the register is
: commonly read by userspace
:
: - Make use of FEAT_BTI at hyp stage-1, setting the Guard Page bit to 1
: for executable mappings
:
: - Use a separate set of pointer authentication keys for the hypervisor
: when running in protected mode (i.e. pKVM)
:
: - Plug a few holes in timer initialization where KVM fails to free the
: timer IRQ(s)
KVM: arm64: Use different pointer authentication keys for pKVM
KVM: arm64: timers: Fix resource leaks in kvm_timer_hyp_init()
KVM: arm64: Use BTI for nvhe
KVM: arm64: Relax trapping of CTR_EL0 when FEAT_EVT is available
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
* kvm-arm64/configurable-id-regs:
: Configurable ID register infrastructure, courtesy of Jing Zhang
:
: Create generalized infrastructure for allowing userspace to select the
: supported feature set for a VM, so long as the feature set is a subset
: of what hardware + KVM allows. This does not add any new features that
: are user-configurable, and instead focuses on the necessary refactoring
: to enable future work.
:
: As a consequence of the series, feature asymmetry is now deliberately
: disallowed for KVM. It is unlikely that VMMs ever configured VMs with
: asymmetry, nor does it align with the kernel's overall stance that
: features must be uniform across all cores in the system.
:
: Furthermore, KVM incorrectly advertised an IMP_DEF PMU to guests for
: some time. Migrations from affected kernels was supported by explicitly
: allowing such an ID register value from userspace, and forwarding that
: along to the guest. KVM now allows an IMP_DEF PMU version to be restored
: through the ID register interface, but reinterprets the user value as
: not implemented (0).
KVM: arm64: Rip out the vestiges of the 'old' ID register scheme
KVM: arm64: Handle ID register reads using the VM-wide values
KVM: arm64: Use generic sanitisation for ID_AA64PFR0_EL1
KVM: arm64: Use generic sanitisation for ID_(AA64)DFR0_EL1
KVM: arm64: Use arm64_ftr_bits to sanitise ID register writes
KVM: arm64: Save ID registers' sanitized value per guest
KVM: arm64: Reuse fields of sys_reg_desc for idreg
KVM: arm64: Rewrite IMPDEF PMU version as NI
KVM: arm64: Make vCPU feature flags consistent VM-wide
KVM: arm64: Relax invariance of KVM_ARM_VCPU_POWER_OFF
KVM: arm64: Separate out feature sanitisation and initialisation
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Rather than reinventing the wheel in KVM to do ID register sanitisation
we can rely on the work already done in the core kernel. Implement a
generalized sanitisation of ID registers based on the combination of the
arm64_ftr_bits definitions from the core kernel and (optionally) a set
of KVM-specific overrides.
This all amounts to absolutely nothing for now, but will be used in
subsequent changes to realize user-configurable ID registers.
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230609190054.1542113-8-oliver.upton@linux.dev
[Oliver: split off from monster patch, rewrote commit description,
reworked RAZ handling, return EINVAL to userspace]
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Expose a capability keying the hVHE feature as well as a new
predicate testing it. Nothing is so far using it, and nothing
is enabling it yet.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20230609162200.2024064-5-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Disabling KASLR from the command line is implemented as a feature
override. Repaint it slightly so that it can further be used as
more generic infrastructure for SW override purposes.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20230609162200.2024064-4-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
We only use cpus_set_cap() in update_cpu_capabilities(), where we
open-code an analgous update to boot_cpucaps.
Due to the way the cpucap_ptrs[] array is initialized, we know that the
capability number cannot be greater than or equal to ARM64_NCAPS, so the
warning is superfluous.
Fold cpus_set_cap() into update_cpu_capabilities(), matching what we do
for the boot_cpucaps, and making the relationship between the two a bit
clearer.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230607164846.3967305-5-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
To more clearly align the various users of the cpucap enumeration, this patch
changes the cpufeature code to use the term `cpucap` in favour of `cpu_hwcap`.
This more clearly aligns with other users of the cpucaps, and avoids confusion
with the ELF hwcaps.
There should be no functional change as a result of this patch; this is
purely a renaming exercise.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230607164846.3967305-4-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The 'cpu_hwcaps' and 'boot_capabilities' bitmaps are bitmaps have the
same enumerated bits, but are named wildly differently for no good
reason. The terms 'hwcaps' and 'capabilities' have become ambiguous over
time (e.g. due to clashes with ELF hwcaps and the structures used to
manage feature detection), and it would be nicer to use 'cpucaps',
matching the <asm/cpucaps.h> header the enumerated bit indices are
defined in.
While this isn't a functional problem, it makes the code harder than
necessary to understand, and hard to extend with related functionality
(e.g. per-cpu cpucap bitmaps).
To that end, this patch renames `boot_capabilities` to `boot_cpucaps`
and `cpu_hwcaps` to `system_cpucaps`. This more clearly indicates the
relationship between the two and aligns with terminology used elsewhere
in our feature management code.
This change was scripted with:
| find . -type f -name '*.[chS]' -print0 | \
| xargs -0 sed -i 's/\<boot_capabilities\>/boot_cpucaps/'
| find . -type f -name '*.[chS]' -print0 | \
| xargs -0 sed -i 's/\<cpu_hwcaps\>/system_cpucaps/'
... and the instance of "cpu_hwcap" (without a trailing "s") in
<asm/mmu_context.h> corrected manually to "system_cpucaps".
Subsequent patches will adjust the naming of related functions to better
align with the `cpucap` naming.
There should be no functional change as a result of this patch; this is
purely a renaming exercise.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230607164846.3967305-2-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This indicates if the system supports PIE. This is a CPUCAP_BOOT_CPU_FEATURE
as the boot CPU will enable PIE if it has it, so secondary CPUs must also
have this feature.
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230606145859.697944-8-joey.gouly@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This capability indicates if the system supports the TCR2_ELx system register.
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230606145859.697944-7-joey.gouly@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The Arm v8.8/9.3 FEAT_MOPS feature provides new instructions that
perform a memory copy or set. Wire up the cpufeature code to detect the
presence of FEAT_MOPS and enable it.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20230509142235.3284028-10-kristina.martsenko@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Detect if the system has the new HCRX_EL2 register added in ARMv8.7/9.2,
so that subsequent patches can check for its presence.
KVM currently relies on the register being present on all CPUs (or
none), so the kernel will panic if that is not the case. Fortunately no
such systems currently exist, but this can be revisited if they appear.
Note that the kernel will not panic if CONFIG_KVM is disabled.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20230509142235.3284028-3-kristina.martsenko@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The newly added support for ECV CNTPOFF open codes the recently added
helper ARM64_CPUID_FIELDS(), make use of the helper. No functional
change.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20230523-arm64-ecv-helper-v1-1-506dfb5fb199@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
CTR_EL0 can often be used in userspace, and it would be nice if
KVM didn't have to emulate it unnecessarily.
While it isn't possible to trap the cache configuration registers
independently from CTR_EL0 in the base ARMv8.0 architecture, FEAT_EVT
allows these cache configuration registers (CCSIDR_EL1, CCSIDR2_EL1,
CLIDR_EL1 and CSSELR_EL1) to be trapped independently by setting
HCR_EL2.TID4.
Switch to using TID4 instead of TID2 in the cases where FEAT_EVT
is available *and* that KVM doesn't need to sanitise CTR_EL0 to
paper over mismatched cache configurations.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230515170016.965378-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
- Fix regression in CPU erratum workaround when disabling the MMU
- Fix detection of pointer authentication hwcaps
- Avoid writeable, executable ELF sections in vmlinux
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmRTe+UQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNJXqB/9C9DrrbHUg9ZPqAIUpXkyaxem4gpIS+kyU
+ard53uweuQHchuR/x2s2K9Sp/ano5jGnQXEjikNy29Opu2UYI/wmsqdJEn3km8q
kohTRsiFgQ40Y85/3iJ8ug6+llxCxK6AXdZCskdWTP56Jur0WpNiQd0a/ShYQLdX
wBHdInT3QpDVzd5bEWDtUEj4H//tTCy4rESQyGsLhrHgb/x8uZKgZMtPJp6+Q3Eq
ofs+PQc0qHr/Ri3ahQOCMbxTbNaLIgUzkyXbZN+y2JtxgE+l8E3Gsir2+Pv7mcSx
1gCSLCmpwE7rVJpTykN+jA6OSsoSUSJHXs6565nF4n8+ugdL7aqR
=Tba+
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"A few arm64 fixes that came in during the merge window for -rc1.
The main thing is restoring the pointer authentication hwcaps, which
disappeared during some recent refactoring
- Fix regression in CPU erratum workaround when disabling the MMU
- Fix detection of pointer authentication hwcaps
- Avoid writeable, executable ELF sections in vmlinux"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: lds: move .got section out of .text
arm64: kernel: remove SHF_WRITE|SHF_EXECINSTR from .idmap.text
arm64: cpufeature: Fix pointer auth hwcaps
arm64: Fix label placement in record_mmu_state()
The pointer auth hwcaps are not getting reported to userspace, as they
are missing the .matches field. Add the field back.
Fixes: 876e3c8efe ("arm64/cpufeature: Pull out helper for CPUID register definitions")
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230428132546.2513834-1-kristina.martsenko@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
* More phys_to_virt conversions
* Improvement of AP management for VSIE (nested virtualization)
ARM64:
* Numerous fixes for the pathological lock inversion issue that
plagued KVM/arm64 since... forever.
* New framework allowing SMCCC-compliant hypercalls to be forwarded
to userspace, hopefully paving the way for some more features
being moved to VMMs rather than be implemented in the kernel.
* Large rework of the timer code to allow a VM-wide offset to be
applied to both virtual and physical counters as well as a
per-timer, per-vcpu offset that complements the global one.
This last part allows the NV timer code to be implemented on
top.
* A small set of fixes to make sure that we don't change anything
affecting the EL1&0 translation regime just after having having
taken an exception to EL2 until we have executed a DSB. This
ensures that speculative walks started in EL1&0 have completed.
* The usual selftest fixes and improvements.
KVM x86 changes for 6.4:
* Optimize CR0.WP toggling by avoiding an MMU reload when TDP is enabled,
and by giving the guest control of CR0.WP when EPT is enabled on VMX
(VMX-only because SVM doesn't support per-bit controls)
* Add CR0/CR4 helpers to query single bits, and clean up related code
where KVM was interpreting kvm_read_cr4_bits()'s "unsigned long" return
as a bool
* Move AMD_PSFD to cpufeatures.h and purge KVM's definition
* Avoid unnecessary writes+flushes when the guest is only adding new PTEs
* Overhaul .sync_page() and .invlpg() to utilize .sync_page()'s optimizations
when emulating invalidations
* Clean up the range-based flushing APIs
* Revamp the TDP MMU's reaping of Accessed/Dirty bits to clear a single
A/D bit using a LOCK AND instead of XCHG, and skip all of the "handle
changed SPTE" overhead associated with writing the entire entry
* Track the number of "tail" entries in a pte_list_desc to avoid having
to walk (potentially) all descriptors during insertion and deletion,
which gets quite expensive if the guest is spamming fork()
* Disallow virtualizing legacy LBRs if architectural LBRs are available,
the two are mutually exclusive in hardware
* Disallow writes to immutable feature MSRs (notably PERF_CAPABILITIES)
after KVM_RUN, similar to CPUID features
* Overhaul the vmx_pmu_caps selftest to better validate PERF_CAPABILITIES
* Apply PMU filters to emulated events and add test coverage to the
pmu_event_filter selftest
x86 AMD:
* Add support for virtual NMIs
* Fixes for edge cases related to virtual interrupts
x86 Intel:
* Don't advertise XTILE_CFG in KVM_GET_SUPPORTED_CPUID if XTILE_DATA is
not being reported due to userspace not opting in via prctl()
* Fix a bug in emulation of ENCLS in compatibility mode
* Allow emulation of NOP and PAUSE for L2
* AMX selftests improvements
* Misc cleanups
MIPS:
* Constify MIPS's internal callbacks (a leftover from the hardware enabling
rework that landed in 6.3)
Generic:
* Drop unnecessary casts from "void *" throughout kvm_main.c
* Tweak the layout of "struct kvm_mmu_memory_cache" to shrink the struct
size by 8 bytes on 64-bit kernels by utilizing a padding hole
Documentation:
* Fix goof introduced by the conversion to rST
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmRNExkUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroNyjwf+MkzDael9y9AsOZoqhEZ5OsfQYJ32
Im5ZVYsPRU2K5TuoWql6meIihgclCj1iIU32qYHa2F1WYt2rZ72rJp+HoY8b+TaI
WvF0pvNtqQyg3iEKUBKPA4xQ6mj7RpQBw86qqiCHmlfNt0zxluEGEPxH8xrWcfhC
huDQ+NUOdU7fmJ3rqGitCvkUbCuZNkw3aNPR8dhU8RAWrwRzP2hBOmdxIeo81WWY
XMEpJSijbGpXL9CvM0Jz9nOuMJwZwCCBGxg1vSQq0xTfLySNMxzvWZC2GFaBjucb
j0UOQ7yE0drIZDVhd3sdNslubXXU6FcSEzacGQb9aigMUon3Tem9SHi7Kw==
=S2Hq
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"s390:
- More phys_to_virt conversions
- Improvement of AP management for VSIE (nested virtualization)
ARM64:
- Numerous fixes for the pathological lock inversion issue that
plagued KVM/arm64 since... forever.
- New framework allowing SMCCC-compliant hypercalls to be forwarded
to userspace, hopefully paving the way for some more features being
moved to VMMs rather than be implemented in the kernel.
- Large rework of the timer code to allow a VM-wide offset to be
applied to both virtual and physical counters as well as a
per-timer, per-vcpu offset that complements the global one. This
last part allows the NV timer code to be implemented on top.
- A small set of fixes to make sure that we don't change anything
affecting the EL1&0 translation regime just after having having
taken an exception to EL2 until we have executed a DSB. This
ensures that speculative walks started in EL1&0 have completed.
- The usual selftest fixes and improvements.
x86:
- Optimize CR0.WP toggling by avoiding an MMU reload when TDP is
enabled, and by giving the guest control of CR0.WP when EPT is
enabled on VMX (VMX-only because SVM doesn't support per-bit
controls)
- Add CR0/CR4 helpers to query single bits, and clean up related code
where KVM was interpreting kvm_read_cr4_bits()'s "unsigned long"
return as a bool
- Move AMD_PSFD to cpufeatures.h and purge KVM's definition
- Avoid unnecessary writes+flushes when the guest is only adding new
PTEs
- Overhaul .sync_page() and .invlpg() to utilize .sync_page()'s
optimizations when emulating invalidations
- Clean up the range-based flushing APIs
- Revamp the TDP MMU's reaping of Accessed/Dirty bits to clear a
single A/D bit using a LOCK AND instead of XCHG, and skip all of
the "handle changed SPTE" overhead associated with writing the
entire entry
- Track the number of "tail" entries in a pte_list_desc to avoid
having to walk (potentially) all descriptors during insertion and
deletion, which gets quite expensive if the guest is spamming
fork()
- Disallow virtualizing legacy LBRs if architectural LBRs are
available, the two are mutually exclusive in hardware
- Disallow writes to immutable feature MSRs (notably
PERF_CAPABILITIES) after KVM_RUN, similar to CPUID features
- Overhaul the vmx_pmu_caps selftest to better validate
PERF_CAPABILITIES
- Apply PMU filters to emulated events and add test coverage to the
pmu_event_filter selftest
- AMD SVM:
- Add support for virtual NMIs
- Fixes for edge cases related to virtual interrupts
- Intel AMX:
- Don't advertise XTILE_CFG in KVM_GET_SUPPORTED_CPUID if
XTILE_DATA is not being reported due to userspace not opting in
via prctl()
- Fix a bug in emulation of ENCLS in compatibility mode
- Allow emulation of NOP and PAUSE for L2
- AMX selftests improvements
- Misc cleanups
MIPS:
- Constify MIPS's internal callbacks (a leftover from the hardware
enabling rework that landed in 6.3)
Generic:
- Drop unnecessary casts from "void *" throughout kvm_main.c
- Tweak the layout of "struct kvm_mmu_memory_cache" to shrink the
struct size by 8 bytes on 64-bit kernels by utilizing a padding
hole
Documentation:
- Fix goof introduced by the conversion to rST"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (211 commits)
KVM: s390: pci: fix virtual-physical confusion on module unload/load
KVM: s390: vsie: clarifications on setting the APCB
KVM: s390: interrupt: fix virtual-physical confusion for next alert GISA
KVM: arm64: Have kvm_psci_vcpu_on() use WRITE_ONCE() to update mp_state
KVM: arm64: Acquire mp_state_lock in kvm_arch_vcpu_ioctl_vcpu_init()
KVM: selftests: Test the PMU event "Instructions retired"
KVM: selftests: Copy full counter values from guest in PMU event filter test
KVM: selftests: Use error codes to signal errors in PMU event filter test
KVM: selftests: Print detailed info in PMU event filter asserts
KVM: selftests: Add helpers for PMC asserts in PMU event filter test
KVM: selftests: Add a common helper for the PMU event filter guest code
KVM: selftests: Fix spelling mistake "perrmited" -> "permitted"
KVM: arm64: vhe: Drop extra isb() on guest exit
KVM: arm64: vhe: Synchronise with page table walker on MMU update
KVM: arm64: pkvm: Document the side effects of kvm_flush_dcache_to_poc()
KVM: arm64: nvhe: Synchronise with page table walker on TLBI
KVM: arm64: Handle 32bit CNTPCTSS traps
KVM: arm64: nvhe: Synchronise with page table walker on vcpu run
KVM: arm64: vgic: Don't acquire its_lock before config_lock
KVM: selftests: Add test to verify KVM's supported XCR0
...
Here is the large set of driver core changes for 6.4-rc1.
Once again, a busy development cycle, with lots of changes happening in
the driver core in the quest to be able to move "struct bus" and "struct
class" into read-only memory, a task now complete with these changes.
This will make the future rust interactions with the driver core more
"provably correct" as well as providing more obvious lifetime rules for
all busses and classes in the kernel.
The changes required for this did touch many individual classes and
busses as many callbacks were changed to take const * parameters
instead. All of these changes have been submitted to the various
subsystem maintainers, giving them plenty of time to review, and most of
them actually did so.
Other than those changes, included in here are a small set of other
things:
- kobject logging improvements
- cacheinfo improvements and updates
- obligatory fw_devlink updates and fixes
- documentation updates
- device property cleanups and const * changes
- firwmare loader dependency fixes.
All of these have been in linux-next for a while with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZEp7Sw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykitQCfamUHpxGcKOAGuLXMotXNakTEsxgAoIquENm5
LEGadNS38k5fs+73UaxV
=7K4B
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the large set of driver core changes for 6.4-rc1.
Once again, a busy development cycle, with lots of changes happening
in the driver core in the quest to be able to move "struct bus" and
"struct class" into read-only memory, a task now complete with these
changes.
This will make the future rust interactions with the driver core more
"provably correct" as well as providing more obvious lifetime rules
for all busses and classes in the kernel.
The changes required for this did touch many individual classes and
busses as many callbacks were changed to take const * parameters
instead. All of these changes have been submitted to the various
subsystem maintainers, giving them plenty of time to review, and most
of them actually did so.
Other than those changes, included in here are a small set of other
things:
- kobject logging improvements
- cacheinfo improvements and updates
- obligatory fw_devlink updates and fixes
- documentation updates
- device property cleanups and const * changes
- firwmare loader dependency fixes.
All of these have been in linux-next for a while with no reported
problems"
* tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (120 commits)
device property: make device_property functions take const device *
driver core: update comments in device_rename()
driver core: Don't require dynamic_debug for initcall_debug probe timing
firmware_loader: rework crypto dependencies
firmware_loader: Strip off \n from customized path
zram: fix up permission for the hot_add sysfs file
cacheinfo: Add use_arch[|_cache]_info field/function
arch_topology: Remove early cacheinfo error message if -ENOENT
cacheinfo: Check cache properties are present in DT
cacheinfo: Check sib_leaf in cache_leaves_are_shared()
cacheinfo: Allow early level detection when DT/ACPI info is missing/broken
cacheinfo: Add arm64 early level initializer implementation
cacheinfo: Add arch specific early level initializer
tty: make tty_class a static const structure
driver core: class: remove struct class_interface * from callbacks
driver core: class: mark the struct class in struct class_interface constant
driver core: class: make class_register() take a const *
driver core: class: mark class_release() as taking a const *
driver core: remove incorrect comment for device_create*
MIPS: vpe-cmp: remove module owner pointer from struct class usage.
...
When defining which value to look for in a system register field we
currently manually specify the register, field shift, width and sign and
the value to look for. This opens the potential for error with for example
the wrong field width or sign being specified, an enumeration value for
a different similarly named field or letting something be initialised to 0.
Since we now generate defines for all the ID registers we now have named
constants for all of these things generated from the system register
description, meaning that we can generate initialisation for all the fields
used in matching from a minimal specification of register, field and match
value. This is both shorter and eliminates or makes build failures several
potential errors.
No change in the generated binary.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230303-arm64-cpufeature-helpers-v2-3-4c8f28a6f203@kernel.org
[will: Drop explicit '.sign' assignment for BTI feature]
Signed-off-by: Will Deacon <will@kernel.org>
A number of the cpufeatures use raw numbers for the minimum field values
specified rather than symbolic constants. In preparation for the use of
helper macros replace all these with the appropriate constants.
No change in the generated binary.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230303-arm64-cpufeature-helpers-v2-2-4c8f28a6f203@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
We use the same structure to match hwcaps and CPU features so we can use
the same helper to generate the fields required. Pull the portion of the
current hwcaps helper that initialises the fields out into a separate
define placed earlier in the file so we can use it for cpufeatures.
No functional change.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230303-arm64-cpufeature-helpers-v2-1-4c8f28a6f203@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Add the probing code for the FEAT_ECV variant that implements CNTPOFF_EL2.
Why it is optional is a mystery, but let's try and detect it.
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Reviewed-by: Colton Lewis <coltonlewis@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230330174800.2677007-4-maz@kernel.org
Direct access to the struct bus_type dev_root pointer is going away soon
so replace that with a call to bus_get_dev_root() instead, which is what
it is there for.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Kristina Martsenko <kristina.martsenko@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Vladimir Murzin <vladimir.murzin@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230313182918.1312597-11-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- In copy_highpage(), only reset the tag of the destination pointer if
KASAN_HW_TAGS is enabled so that user-space MTE does not interfere
with KASAN_SW_TAGS (which relies on top-byte-ignore).
- Remove warning if SME is detected without SVE, the kernel can cope
with such configuration (though none in the field currently).
- In cfi_handler(), pass the ESR_EL1 value to die() for consistency with
other die() callers.
- Disable HUGETLB_PAGE_OPTIMIZE_VMEMMAP on arm64 since the pte
manipulation from the generic vmemmap_remap_pte() does not follow the
required ARM break-before-make sequence (clear the pte, flush the
TLBs, set the new pte). It may be re-enabled once this sequence is
sorted.
- Fix possible memory leak in the arm64 ACPI code if the SMCCC version
and conduit checks fail.
- Forbid CALL_OPS with CC_OPTIMIZE_FOR_SIZE since gcc ignores
-falign-functions=N with -Os.
- Don't pretend KASLR is enabled if offset < MIN_KIMG_ALIGN as no
randomisation would actually take place.
-----BEGIN PGP SIGNATURE-----
iQIyBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmQBHFEACgkQa9axLQDI
XvHf9Q/3Zg8o/8HchnWSvzgV//9ljGrrDfAjbfZHrE2W4PCniSd0op0uXYsVK3IH
Nk6ZDiRe5uIXKgHuSq5caOoL4aRk0hk1TpQ3RKCuh8E3ybhQe9gwYm8xEWXDSSWh
QzcfENsKlZLpuMoSMILJ2NlMPMbMLprXNCUlgENBbRT7KUToHZKTwE6BL2AUI3tg
RdMntccorybxk1hiXV1YKT8482i+x2gAnylYXFsq3eI+G54rdfiks+tft0CQV3ng
1/i1PfbnGC45sBoxXPqYXzBSUDNHpAqb5dwvtlVinGo3J6STxIvbM6Zi5Ma5hl3u
QrhwyduwCTZ6wVOqzd4KAH9gmhJSzRG75OzCek2dTwU9KXVMOPEvp1ZfTwUXDx7J
5j8UkjGgrbtj6IioGqBAO/HiFfoty8EBtmlSZIj0thwxkM73ZBG6efQOJaVWh85m
ioUzMC2Y5yfKLfHEcy9yKIQVizMYoz6fl+QHOEbVSoFhJKNRc4wt5CCJCvsbMHsu
K8rvD/CI9jFMP9GEK7ObTaC7ICjUz/+8wbIrRrm5ObRQ65Tm2zv3OLqGnK8O5O4W
gcDEraTnSPHDUtgG6dAEPFN5Wi9hT3zYC0xAcNhc3aZC5ofS5RD6YXIWJvqjWrvL
5k8G1gfa57C/hfxO6pPw7bg/nY8vvYpxUkZ9erRWD430g7y0Sg==
=jfPa
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- In copy_highpage(), only reset the tag of the destination pointer if
KASAN_HW_TAGS is enabled so that user-space MTE does not interfere
with KASAN_SW_TAGS (which relies on top-byte-ignore).
- Remove warning if SME is detected without SVE, the kernel can cope
with such configuration (though none in the field currently).
- In cfi_handler(), pass the ESR_EL1 value to die() for consistency
with other die() callers.
- Disable HUGETLB_PAGE_OPTIMIZE_VMEMMAP on arm64 since the pte
manipulation from the generic vmemmap_remap_pte() does not follow the
required ARM break-before-make sequence (clear the pte, flush the
TLBs, set the new pte). It may be re-enabled once this sequence is
sorted.
- Fix possible memory leak in the arm64 ACPI code if the SMCCC version
and conduit checks fail.
- Forbid CALL_OPS with CC_OPTIMIZE_FOR_SIZE since gcc ignores
-falign-functions=N with -Os.
- Don't pretend KASLR is enabled if offset < MIN_KIMG_ALIGN as no
randomisation would actually take place.
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: kaslr: don't pretend KASLR is enabled if offset < MIN_KIMG_ALIGN
arm64: ftrace: forbid CALL_OPS with CC_OPTIMIZE_FOR_SIZE
arm64: acpi: Fix possible memory leak of ffh_ctxt
arm64: mm: hugetlb: Disable HUGETLB_PAGE_OPTIMIZE_VMEMMAP
arm64: pass ESR_ELx to die() of cfi_handler
arm64/fpsimd: Remove warning for SME without SVE
arm64: Reset KASAN tag in copy_highpage with HW tags only
Our virtual KASLR displacement is a randomly chosen multiple of
2 MiB plus an offset that is equal to the physical placement modulo 2
MiB. This arrangement ensures that we can always use 2 MiB block
mappings (or contiguous PTE mappings for 16k or 64k pages) to map the
kernel.
This means that a KASLR offset of less than 2 MiB is simply the product
of this physical displacement, and no randomization has actually taken
place. Currently, we use 'kaslr_offset() > 0' to decide whether or not
randomization has occurred, and so we misidentify this case.
If the kernel image placement is not randomized, modules are allocated
from a dedicated region below the kernel mapping, which is only used for
modules and not for other vmalloc() or vmap() calls.
When randomization is enabled, the kernel image is vmap()'ed randomly
inside the vmalloc region, and modules are allocated in the vicinity of
this mapping to ensure that relative references are always in range.
However, unlike the dedicated module region below the vmalloc region,
this region is not reserved exclusively for modules, and so ordinary
vmalloc() calls may end up overlapping with it. This should rarely
happen, given that vmalloc allocates bottom up, although it cannot be
ruled out entirely.
The misidentified case results in a placement of the kernel image within
2 MiB of its default address. However, the logic that randomizes the
module region is still invoked, and this could result in the module
region overlapping with the start of the vmalloc region, instead of
using the dedicated region below it. If this happens, a single large
vmalloc() or vmap() call will use up the entire region, and leave no
space for loading modules after that.
Since commit 82046702e2 ("efi/libstub/arm64: Replace 'preferred'
offset with alignment check"), this is much more likely to occur on
systems that boot via EFI but lack an implementation of the EFI RNG
protocol, as in that case, the EFI stub will decide to leave the image
where it found it, and the EFI firmware uses 64k alignment only.
Fix this, by correctly identifying the case where the virtual
displacement is a result of the physical displacement only.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20230223204101.1500373-1-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
- Provide a virtual cache topology to the guest to avoid
inconsistencies with migration on heterogenous systems. Non secure
software has no practical need to traverse the caches by set/way in
the first place.
- Add support for taking stage-2 access faults in parallel. This was an
accidental omission in the original parallel faults implementation,
but should provide a marginal improvement to machines w/o FEAT_HAFDBS
(such as hardware from the fruit company).
- A preamble to adding support for nested virtualization to KVM,
including vEL2 register state, rudimentary nested exception handling
and masking unsupported features for nested guests.
- Fixes to the PSCI relay that avoid an unexpected host SVE trap when
resuming a CPU when running pKVM.
- VGIC maintenance interrupt support for the AIC
- Improvements to the arch timer emulation, primarily aimed at reducing
the trap overhead of running nested.
- Add CONFIG_USERFAULTFD to the KVM selftests config fragment in the
interest of CI systems.
- Avoid VM-wide stop-the-world operations when a vCPU accesses its own
redistributor.
- Serialize when toggling CPACR_EL1.SMEN to avoid unexpected exceptions
in the host.
- Aesthetic and comment/kerneldoc fixes
- Drop the vestiges of the old Columbia mailing list and add [Oliver]
as co-maintainer
This also drags in arm64's 'for-next/sme2' branch, because both it and
the PSCI relay changes touch the EL2 initialization code.
RISC-V:
- Fix wrong usage of PGDIR_SIZE instead of PUD_SIZE
- Correctly place the guest in S-mode after redirecting a trap to the guest
- Redirect illegal instruction traps to guest
- SBI PMU support for guest
s390:
- Two patches sorting out confusion between virtual and physical
addresses, which currently are the same on s390.
- A new ioctl that performs cmpxchg on guest memory
- A few fixes
x86:
- Change tdp_mmu to a read-only parameter
- Separate TDP and shadow MMU page fault paths
- Enable Hyper-V invariant TSC control
- Fix a variety of APICv and AVIC bugs, some of them real-world,
some of them affecting architecurally legal but unlikely to
happen in practice
- Mark APIC timer as expired if its in one-shot mode and the count
underflows while the vCPU task was being migrated
- Advertise support for Intel's new fast REP string features
- Fix a double-shootdown issue in the emergency reboot code
- Ensure GIF=1 and disable SVM during an emergency reboot, i.e. give SVM
similar treatment to VMX
- Update Xen's TSC info CPUID sub-leaves as appropriate
- Add support for Hyper-V's extended hypercalls, where "support" at this
point is just forwarding the hypercalls to userspace
- Clean up the kvm->lock vs. kvm->srcu sequences when updating the PMU and
MSR filters
- One-off fixes and cleanups
- Fix and cleanup the range-based TLB flushing code, used when KVM is
running on Hyper-V
- Add support for filtering PMU events using a mask. If userspace
wants to restrict heavily what events the guest can use, it can now
do so without needing an absurd number of filter entries
- Clean up KVM's handling of "PMU MSRs to save", especially when vPMU
support is disabled
- Add PEBS support for Intel Sapphire Rapids
- Fix a mostly benign overflow bug in SEV's send|receive_update_data()
- Move several SVM-specific flags into vcpu_svm
x86 Intel:
- Handle NMI VM-Exits before leaving the noinstr region
- A few trivial cleanups in the VM-Enter flows
- Stop enabling VMFUNC for L1 purely to document that KVM doesn't support
EPTP switching (or any other VM function) for L1
- Fix a crash when using eVMCS's enlighted MSR bitmaps
Generic:
- Clean up the hardware enable and initialization flow, which was
scattered around multiple arch-specific hooks. Instead, just
let the arch code call into generic code. Both x86 and ARM should
benefit from not having to fight common KVM code's notion of how
to do initialization.
- Account allocations in generic kvm_arch_alloc_vm()
- Fix a memory leak if coalesced MMIO unregistration fails
selftests:
- On x86, cache the CPU vendor (AMD vs. Intel) and use the info to emit
the correct hypercall instruction instead of relying on KVM to patch
in VMMCALL
- Use TAP interface for kvm_binary_stats_test and tsc_msrs_test
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmP2YA0UHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroPg/Qf+J6nT+TkIa+8Ei+fN1oMTDp4YuIOx
mXvJ9mRK9sQ+tAUVwvDz3qN/fK5mjsYbRHIDlVc5p2Q3bCrVGDDqXPFfCcLx1u+O
9U9xjkO4JxD2LS9pc70FYOyzVNeJ8VMGOBbC2b0lkdYZ4KnUc6e/WWFKJs96bK+H
duo+RIVyaMthnvbTwSv1K3qQb61n6lSJXplywS8KWFK6NZAmBiEFDAWGRYQE9lLs
VcVcG0iDJNL/BQJ5InKCcvXVGskcCm9erDszPo7w4Bypa4S9AMS42DHUaRZrBJwV
/WqdH7ckIz7+OSV0W1j+bKTHAFVTCjXYOM7wQykgjawjICzMSnnG9Gpskw==
=goe1
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"ARM:
- Provide a virtual cache topology to the guest to avoid
inconsistencies with migration on heterogenous systems. Non secure
software has no practical need to traverse the caches by set/way in
the first place
- Add support for taking stage-2 access faults in parallel. This was
an accidental omission in the original parallel faults
implementation, but should provide a marginal improvement to
machines w/o FEAT_HAFDBS (such as hardware from the fruit company)
- A preamble to adding support for nested virtualization to KVM,
including vEL2 register state, rudimentary nested exception
handling and masking unsupported features for nested guests
- Fixes to the PSCI relay that avoid an unexpected host SVE trap when
resuming a CPU when running pKVM
- VGIC maintenance interrupt support for the AIC
- Improvements to the arch timer emulation, primarily aimed at
reducing the trap overhead of running nested
- Add CONFIG_USERFAULTFD to the KVM selftests config fragment in the
interest of CI systems
- Avoid VM-wide stop-the-world operations when a vCPU accesses its
own redistributor
- Serialize when toggling CPACR_EL1.SMEN to avoid unexpected
exceptions in the host
- Aesthetic and comment/kerneldoc fixes
- Drop the vestiges of the old Columbia mailing list and add [Oliver]
as co-maintainer
RISC-V:
- Fix wrong usage of PGDIR_SIZE instead of PUD_SIZE
- Correctly place the guest in S-mode after redirecting a trap to the
guest
- Redirect illegal instruction traps to guest
- SBI PMU support for guest
s390:
- Sort out confusion between virtual and physical addresses, which
currently are the same on s390
- A new ioctl that performs cmpxchg on guest memory
- A few fixes
x86:
- Change tdp_mmu to a read-only parameter
- Separate TDP and shadow MMU page fault paths
- Enable Hyper-V invariant TSC control
- Fix a variety of APICv and AVIC bugs, some of them real-world, some
of them affecting architecurally legal but unlikely to happen in
practice
- Mark APIC timer as expired if its in one-shot mode and the count
underflows while the vCPU task was being migrated
- Advertise support for Intel's new fast REP string features
- Fix a double-shootdown issue in the emergency reboot code
- Ensure GIF=1 and disable SVM during an emergency reboot, i.e. give
SVM similar treatment to VMX
- Update Xen's TSC info CPUID sub-leaves as appropriate
- Add support for Hyper-V's extended hypercalls, where "support" at
this point is just forwarding the hypercalls to userspace
- Clean up the kvm->lock vs. kvm->srcu sequences when updating the
PMU and MSR filters
- One-off fixes and cleanups
- Fix and cleanup the range-based TLB flushing code, used when KVM is
running on Hyper-V
- Add support for filtering PMU events using a mask. If userspace
wants to restrict heavily what events the guest can use, it can now
do so without needing an absurd number of filter entries
- Clean up KVM's handling of "PMU MSRs to save", especially when vPMU
support is disabled
- Add PEBS support for Intel Sapphire Rapids
- Fix a mostly benign overflow bug in SEV's
send|receive_update_data()
- Move several SVM-specific flags into vcpu_svm
x86 Intel:
- Handle NMI VM-Exits before leaving the noinstr region
- A few trivial cleanups in the VM-Enter flows
- Stop enabling VMFUNC for L1 purely to document that KVM doesn't
support EPTP switching (or any other VM function) for L1
- Fix a crash when using eVMCS's enlighted MSR bitmaps
Generic:
- Clean up the hardware enable and initialization flow, which was
scattered around multiple arch-specific hooks. Instead, just let
the arch code call into generic code. Both x86 and ARM should
benefit from not having to fight common KVM code's notion of how to
do initialization
- Account allocations in generic kvm_arch_alloc_vm()
- Fix a memory leak if coalesced MMIO unregistration fails
selftests:
- On x86, cache the CPU vendor (AMD vs. Intel) and use the info to
emit the correct hypercall instruction instead of relying on KVM to
patch in VMMCALL
- Use TAP interface for kvm_binary_stats_test and tsc_msrs_test"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (325 commits)
KVM: SVM: hyper-v: placate modpost section mismatch error
KVM: x86/mmu: Make tdp_mmu_allowed static
KVM: arm64: nv: Use reg_to_encoding() to get sysreg ID
KVM: arm64: nv: Only toggle cache for virtual EL2 when SCTLR_EL2 changes
KVM: arm64: nv: Filter out unsupported features from ID regs
KVM: arm64: nv: Emulate EL12 register accesses from the virtual EL2
KVM: arm64: nv: Allow a sysreg to be hidden from userspace only
KVM: arm64: nv: Emulate PSTATE.M for a guest hypervisor
KVM: arm64: nv: Add accessors for SPSR_EL1, ELR_EL1 and VBAR_EL1 from virtual EL2
KVM: arm64: nv: Handle SMCs taken from virtual EL2
KVM: arm64: nv: Handle trapped ERET from virtual EL2
KVM: arm64: nv: Inject HVC exceptions to the virtual EL2
KVM: arm64: nv: Support virtual EL2 exceptions
KVM: arm64: nv: Handle HCR_EL2.NV system register traps
KVM: arm64: nv: Add nested virt VCPU primitives for vEL2 VCPU state
KVM: arm64: nv: Add EL2 system registers to vcpu context
KVM: arm64: nv: Allow userspace to set PSR_MODE_EL2x
KVM: arm64: nv: Reset VCPU to EL2 registers if VCPU nested virt is set
KVM: arm64: nv: Introduce nested virtualization VCPU feature
KVM: arm64: Use the S2 MMU context to iterate over S2 table
...
* kvm-arm64/nv-prefix:
: Preamble to NV support, courtesy of Marc Zyngier.
:
: This brings in a set of prerequisite patches for supporting nested
: virtualization in KVM/arm64. Of course, there is a long way to go until
: NV is actually enabled in KVM.
:
: - Introduce cpucap / vCPU feature flag to pivot the NV code on
:
: - Add support for EL2 vCPU register state
:
: - Basic nested exception handling
:
: - Hide unsupported features from the ID registers for NV-capable VMs
KVM: arm64: nv: Use reg_to_encoding() to get sysreg ID
KVM: arm64: nv: Only toggle cache for virtual EL2 when SCTLR_EL2 changes
KVM: arm64: nv: Filter out unsupported features from ID regs
KVM: arm64: nv: Emulate EL12 register accesses from the virtual EL2
KVM: arm64: nv: Allow a sysreg to be hidden from userspace only
KVM: arm64: nv: Emulate PSTATE.M for a guest hypervisor
KVM: arm64: nv: Add accessors for SPSR_EL1, ELR_EL1 and VBAR_EL1 from virtual EL2
KVM: arm64: nv: Handle SMCs taken from virtual EL2
KVM: arm64: nv: Handle trapped ERET from virtual EL2
KVM: arm64: nv: Inject HVC exceptions to the virtual EL2
KVM: arm64: nv: Support virtual EL2 exceptions
KVM: arm64: nv: Handle HCR_EL2.NV system register traps
KVM: arm64: nv: Add nested virt VCPU primitives for vEL2 VCPU state
KVM: arm64: nv: Add EL2 system registers to vcpu context
KVM: arm64: nv: Allow userspace to set PSR_MODE_EL2x
KVM: arm64: nv: Reset VCPU to EL2 registers if VCPU nested virt is set
KVM: arm64: nv: Introduce nested virtualization VCPU feature
KVM: arm64: Use the S2 MMU context to iterate over S2 table
arm64: Add ARM64_HAS_NESTED_VIRT cpufeature
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Add a new ARM64_HAS_NESTED_VIRT feature to indicate that the
CPU has the ARMv8.3 nested virtualization capability, together
with the 'kvm-arm.mode=nested' command line option.
This will be used to support nested virtualization in KVM.
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
[maz: moved the command-line option to kvm-arm.mode]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230209175820.1939006-2-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
* for-next/sysreg-hwcaps:
: Make use of sysreg helpers for hwcaps
arm64/cpufeature: Use helper macros to specify hwcaps
arm64/cpufeature: Always use symbolic name for feature value in hwcaps
arm64/sysreg: Initial unsigned annotations for ID registers
arm64/sysreg: Initial annotation of signed ID registers
arm64/sysreg: Allow enumerations to be declared as signed or unsigned
* arm64/for-next/perf:
perf: arm_spe: Print the version of SPE detected
perf: arm_spe: Add support for SPEv1.2 inverted event filtering
perf: Add perf_event_attr::config3
drivers/perf: fsl_imx8_ddr_perf: Remove set-but-not-used variable
perf: arm_spe: Support new SPEv1.2/v8.7 'not taken' event
perf: arm_spe: Use new PMSIDR_EL1 register enums
perf: arm_spe: Drop BIT() and use FIELD_GET/PREP accessors
arm64/sysreg: Convert SPE registers to automatic generation
arm64: Drop SYS_ from SPE register defines
perf: arm_spe: Use feature numbering for PMSEVFR_EL1 defines
perf/marvell: Add ACPI support to TAD uncore driver
perf/marvell: Add ACPI support to DDR uncore driver
perf/arm-cmn: Reset DTM_PMU_CONFIG at probe
drivers/perf: hisi: Extract initialization of "cpa_pmu->pmu"
drivers/perf: hisi: Simplify the parameters of hisi_pmu_init()
drivers/perf: hisi: Advertise the PERF_PMU_CAP_NO_EXCLUDE capability
* for-next/sysreg:
: arm64 sysreg and cpufeature fixes/updates
KVM: arm64: Use symbolic definition for ISR_EL1.A
arm64/sysreg: Add definition of ISR_EL1
arm64/sysreg: Add definition for ICC_NMIAR1_EL1
arm64/cpufeature: Remove 4 bit assumption in ARM64_FEATURE_MASK()
arm64/sysreg: Fix errors in 32 bit enumeration values
arm64/cpufeature: Fix field sign for DIT hwcap detection
* for-next/sme:
: SME-related updates
arm64/sme: Optimise SME exit on syscall entry
arm64/sme: Don't use streaming mode to probe the maximum SME VL
arm64/ptrace: Use system_supports_tpidr2() to check for TPIDR2 support
* for-next/kselftest: (23 commits)
: arm64 kselftest fixes and improvements
kselftest/arm64: Don't require FA64 for streaming SVE+ZA tests
kselftest/arm64: Copy whole EXTRA context
kselftest/arm64: Fix enumeration of systems without 128 bit SME for SSVE+ZA
kselftest/arm64: Fix enumeration of systems without 128 bit SME
kselftest/arm64: Don't require FA64 for streaming SVE tests
kselftest/arm64: Limit the maximum VL we try to set via ptrace
kselftest/arm64: Correct buffer size for SME ZA storage
kselftest/arm64: Remove the local NUM_VL definition
kselftest/arm64: Verify simultaneous SSVE and ZA context generation
kselftest/arm64: Verify that SSVE signal context has SVE_SIG_FLAG_SM set
kselftest/arm64: Remove spurious comment from MTE test Makefile
kselftest/arm64: Support build of MTE tests with clang
kselftest/arm64: Initialise current at build time in signal tests
kselftest/arm64: Don't pass headers to the compiler as source
kselftest/arm64: Remove redundant _start labels from FP tests
kselftest/arm64: Fix .pushsection for strings in FP tests
kselftest/arm64: Run BTI selftests on systems without BTI
kselftest/arm64: Fix test numbering when skipping tests
kselftest/arm64: Skip non-power of 2 SVE vector lengths in fp-stress
kselftest/arm64: Only enumerate power of two VLs in syscall-abi
...
* for-next/misc:
: Miscellaneous arm64 updates
arm64/mm: Intercept pfn changes in set_pte_at()
Documentation: arm64: correct spelling
arm64: traps: attempt to dump all instructions
arm64: Apply dynamic shadow call stack patching in two passes
arm64: el2_setup.h: fix spelling typo in comments
arm64: Kconfig: fix spelling
arm64: cpufeature: Use kstrtobool() instead of strtobool()
arm64: Avoid repeated AA64MMFR1_EL1 register read on pagefault path
arm64: make ARCH_FORCE_MAX_ORDER selectable
* for-next/sme2: (23 commits)
: Support for arm64 SME 2 and 2.1
arm64/sme: Fix __finalise_el2 SMEver check
kselftest/arm64: Remove redundant _start labels from zt-test
kselftest/arm64: Add coverage of SME 2 and 2.1 hwcaps
kselftest/arm64: Add coverage of the ZT ptrace regset
kselftest/arm64: Add SME2 coverage to syscall-abi
kselftest/arm64: Add test coverage for ZT register signal frames
kselftest/arm64: Teach the generic signal context validation about ZT
kselftest/arm64: Enumerate SME2 in the signal test utility code
kselftest/arm64: Cover ZT in the FP stress test
kselftest/arm64: Add a stress test program for ZT0
arm64/sme: Add hwcaps for SME 2 and 2.1 features
arm64/sme: Implement ZT0 ptrace support
arm64/sme: Implement signal handling for ZT
arm64/sme: Implement context switching for ZT0
arm64/sme: Provide storage for ZT0
arm64/sme: Add basic enumeration for SME2
arm64/sme: Enable host kernel to access ZT0
arm64/sme: Manually encode ZT0 load and store instructions
arm64/esr: Document ISS for ZT0 being disabled
arm64/sme: Document SME 2 and SME 2.1 ABI
...
* for-next/tpidr2:
: Include TPIDR2 in the signal context
kselftest/arm64: Add test case for TPIDR2 signal frame records
kselftest/arm64: Add TPIDR2 to the set of known signal context records
arm64/signal: Include TPIDR2 in the signal context
arm64/sme: Document ABI for TPIDR2 signal information
* for-next/scs:
: arm64: harden shadow call stack pointer handling
arm64: Stash shadow stack pointer in the task struct on interrupt
arm64: Always load shadow stack pointer directly from the task struct
* for-next/compat-hwcap:
: arm64: Expose compat ARMv8 AArch32 features (HWCAPs)
arm64: Add compat hwcap SSBS
arm64: Add compat hwcap SB
arm64: Add compat hwcap I8MM
arm64: Add compat hwcap ASIMDBF16
arm64: Add compat hwcap ASIMDFHM
arm64: Add compat hwcap ASIMDDP
arm64: Add compat hwcap FPHP and ASIMDHP
* for-next/ftrace:
: Add arm64 support for DYNAMICE_FTRACE_WITH_CALL_OPS
arm64: avoid executing padding bytes during kexec / hibernation
arm64: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
arm64: ftrace: Update stale comment
arm64: patching: Add aarch64_insn_write_literal_u64()
arm64: insn: Add helpers for BTI
arm64: Extend support for CONFIG_FUNCTION_ALIGNMENT
ACPI: Don't build ACPICA with '-Os'
Compiler attributes: GCC cold function alignment workarounds
ftrace: Add DYNAMIC_FTRACE_WITH_CALL_OPS
* for-next/efi-boot-mmu-on:
: Permit arm64 EFI boot with MMU and caches on
arm64: kprobes: Drop ID map text from kprobes blacklist
arm64: head: Switch endianness before populating the ID map
efi: arm64: enter with MMU and caches enabled
arm64: head: Clean the ID map and the HYP text to the PoC if needed
arm64: head: avoid cache invalidation when entering with the MMU on
arm64: head: record the MMU state at primary entry
arm64: kernel: move identity map out of .text mapping
arm64: head: Move all finalise_el2 calls to after __enable_mmu
* for-next/ptrauth:
: arm64 pointer authentication cleanup
arm64: pauth: don't sign leaf functions
arm64: unify asm-arch manipulation
* for-next/pseudo-nmi:
: Pseudo-NMI code generation optimisations
arm64: irqflags: use alternative branches for pseudo-NMI logic
arm64: add ARM64_HAS_GIC_PRIO_RELAXED_SYNC cpucap
arm64: make ARM64_HAS_GIC_PRIO_MASKING depend on ARM64_HAS_GIC_CPUIF_SYSREGS
arm64: rename ARM64_HAS_IRQ_PRIO_MASKING to ARM64_HAS_GIC_PRIO_MASKING
arm64: rename ARM64_HAS_SYSREG_GIC_CPUIF to ARM64_HAS_GIC_CPUIF_SYSREGS
At present the hwcaps are hard to read and a bit error prone since the
macros used to specify matches require us to write out the register name
multiple times and explicitly specify the width of the field, hopefully
using the correct constant. Now that all the ID registers are generated we
can improve this somewhat by redoing the macros so that we specify the
register, field and minimum value symbolically and use token pasting to
initialise the capability struct with the appropriate values.
We move from specifying like this:
HWCAP_CAP(SYS_ID_AA64PFR1_EL1, ID_AA64PFR1_EL1_BT_SHIFT, 4, FTR_UNSIGNED, ID_AA64PFR1_EL1_BT_IMP, CAP_HWCAP, KERNEL_HWCAP_BTI),
to this:
HWCAP_CAP(ID_AA64PFR1_EL1, BT, IMP, CAP_HWCAP, KERNEL_HWCAP_BTI),
which is shorter due to having less duplicate information and makes it
much harder to make an error like specifying the wrong field width or
an invalid enumeration value since everything must be a constant defined
for the sysreg and names are only typed once.
There should be no functional effect from this change, a check of the
generated .rodata showed no differences.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221207-arm64-sysreg-helpers-v4-5-25b6b3fb9d18@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Our table of hwcaps sometimes uses the defined constant to specify the
enumeration value they are attempting to match but in some cases an
unadorned number is used. In preparation for using helper macros to to
specify the hwcaps less verbosely replace the magic numbers with their
constants, this will hopefully make the conversion to helper macros
easier to review.
There should be no functional effect from this change, a check of the
generate .rodata showed no differences.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221207-arm64-sysreg-helpers-v4-4-25b6b3fb9d18@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
When Priority Mask Hint Enable (PMHE) == 0b1, the GIC may use the PMR
value to determine whether to signal an IRQ to a PE, and consequently
after a change to the PMR value, a DSB SY may be required to ensure that
interrupts are signalled to a CPU in finite time. When PMHE == 0b0,
interrupts are always signalled to the relevant PE, and all masking
occurs locally, without requiring a DSB SY.
Since commit:
f226650494 ("arm64: Relax ICC_PMR_EL1 accesses when ICC_CTLR_EL1.PMHE is clear")
... we handle this dynamically: in most cases a static key is used to
determine whether to issue a DSB SY, but the entry code must read from
ICC_CTLR_EL1 as static keys aren't accessible from plain assembly.
It would be much nicer to use an alternative instruction sequence for
the DSB, as this would avoid the need to read from ICC_CTLR_EL1 in the
entry code, and for most other code this will result in simpler code
generation with fewer instructions and fewer branches.
This patch adds a new ARM64_HAS_GIC_PRIO_RELAXED_SYNC cpucap which is
only set when ICC_CTLR_EL1.PMHE == 0b0 (and GIC priority masking is in
use). This allows us to replace the existing users of the
`gic_pmr_sync` static key with alternative sequences which default to a
DSB SY and are relaxed to a NOP when PMHE is not in use.
The entry assembly management of the PMR is slightly restructured to use
a branch (rather than multiple NOPs) when priority masking is not in
use. This is more in keeping with other alternatives in the entry
assembly, and permits the use of a separate alternatives for the
PMHE-dependent DSB SY (and removal of the conditional branch this
currently requires). For consistency I've adjusted both the save and
restore paths.
According to bloat-o-meter, when building defconfig +
CONFIG_ARM64_PSEUDO_NMI=y this shrinks the kernel text by ~4KiB:
| add/remove: 4/2 grow/shrink: 42/310 up/down: 332/-5032 (-4700)
The resulting vmlinux is ~66KiB smaller, though the resulting Image size
is unchanged due to padding and alignment:
| [mark@lakrids:~/src/linux]% ls -al vmlinux-*
| -rwxr-xr-x 1 mark mark 137508344 Jan 17 14:11 vmlinux-after
| -rwxr-xr-x 1 mark mark 137575440 Jan 17 13:49 vmlinux-before
| [mark@lakrids:~/src/linux]% ls -al Image-*
| -rw-r--r-- 1 mark mark 38777344 Jan 17 14:11 Image-after
| -rw-r--r-- 1 mark mark 38777344 Jan 17 13:49 Image-before
Prior to this patch we did not verify the state of ICC_CTLR_EL1.PMHE on
secondary CPUs. As of this patch this is verified by the cpufeature code
when using GIC priority masking (i.e. when using pseudo-NMIs).
Note that since commit:
7e3a57fa6c ("arm64: Document ICC_CTLR_EL3.PMHE setting requirements")
... Documentation/arm64/booting.rst specifies:
| - ICC_CTLR_EL3.PMHE (bit 6) must be set to the same value across
| all CPUs the kernel is executing on, and must stay constant
| for the lifetime of the kernel.
... so that should not adversely affect any compliant systems, and as
we'll only check for the absense of PMHE when using pseudo-NMIs, this
will only fire when such mismatch will adversely affect the system.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230130145429.903791-5-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Currently the arm64_cpu_capabilities structure for
ARM64_HAS_GIC_PRIO_MASKING open-codes the same CPU field definitions as
the arm64_cpu_capabilities structure for ARM64_HAS_GIC_CPUIF_SYSREGS, so
that can_use_gic_priorities() can use has_useable_gicv3_cpuif().
This duplication isn't ideal for the legibility of the code, and sets a
bad example for any ARM64_HAS_GIC_* definitions added by subsequent
patches.
Instead, have ARM64_HAS_GIC_PRIO_MASKING check for the
ARM64_HAS_GIC_CPUIF_SYSREGS cpucap, and add a comment explaining why
this is safe. Subsequent patches will use the same pattern where one
cpucap depends upon another.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230130145429.903791-4-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Subsequent patches will add more GIC-related cpucaps. When we do so, it
would be nice to give them a consistent HAS_GIC_* prefix.
In preparation for doing so, this patch renames the existing
ARM64_HAS_IRQ_PRIO_MASKING cap to ARM64_HAS_GIC_PRIO_MASKING.
The cpucaps file was hand-modified; all other changes were scripted
with:
find . -type f -name '*.[chS]' -print0 | \
xargs -0 sed -i 's/ARM64_HAS_IRQ_PRIO_MASKING/ARM64_HAS_GIC_PRIO_MASKING/'
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230130145429.903791-3-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Subsequent patches will add more GIC-related cpucaps. When we do so, it
would be nice to give them a consistent HAS_GIC_* prefix.
In preparation for doing so, this patch renames the existing
ARM64_HAS_SYSREG_GIC_CPUIF cap to ARM64_HAS_GIC_CPUIF_SYSREGS.
The 'CPUIF_SYSREGS' suffix is chosen so that this will be ordered ahead
of other ARM64_HAS_GIC_* definitions in subsequent patches.
The cpucaps file was hand-modified; all other changes were scripted
with:
find . -type f -name '*.[chS]' -print0 | \
xargs -0 sed -i
's/ARM64_HAS_SYSREG_GIC_CPUIF/ARM64_HAS_GIC_CPUIF_SYSREGS/'
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230130145429.903791-2-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
strtobool() is the same as kstrtobool().
However, the latter is more used within the kernel.
In order to remove strtobool() and slightly simplify kstrtox.h, switch to
the other function name.
While at it, include the corresponding header file (<linux/kstrtox.h>)
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/5a1b329cda34aec67615c0d2fd326eb0d6634bf7.1667336095.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This hwcap was added for 32-bit native arm kernel by commit fea53546be
("ARM: 9274/1: Add hwcap for Speculative Store Bypassing Safe") and hence
the corresponding changes added in 32-bit compat arm64 for similar user
interfaces.
Speculative Store Bypass Safe is a feature(FEAT_SSBS) present in
AArch32/AArch64 state for Armv8 and can be identified by PFR2.SSBS
identification register. This hwcap is already advertised in native arm64
kernel.
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230111053706.13994-8-amit.kachhap@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This hwcap was added for 32-bit native arm kernel by commit
3bda6d8848 ("ARM: 9273/1: Add hwcap for Speculation Barrier(SB)")
and hence the corresponding changes added in 32-bit compat arm64 kernel.
Speculation Barrier is a feature(FEAT_SB) present in both AArch32 and
AArch64 state. This hwcap is already advertised in native arm64 kernel.
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230111053706.13994-7-amit.kachhap@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This hwcap was added earlier for 32-bit native arm kernel by commit
956ca3a4eb ("ARM: 9272/1: vfp: Add hwcap for FEAT_AA32I8MM") and hence
the corresponding changes added in 32-bit compat arm64 kernel for similar
user interfaces.
Int8 matrix multiplication is a feature (FEAT_AA32I8MM) present in AArch32
state of Armv8 and is identified by ISAR6.I8MM register. Similar
feature(FEAT_I8MM) exist for AArch64 state and is already advertised in
arm64 kernel.
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230111053706.13994-6-amit.kachhap@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This hwcap was added earlier for 32-bit native arm kernel by commit
23b6d4ad6e ("ARM: 9271/1: vfp: Add hwcap for FEAT_AA32BF16") and hence
the corresponding changes added in 32-bit compat arm64 kernel.
Brain 16-bit floating-point storage format is a feature (FEAT_AA32BF16)
present in AArch32 state for Armv8 and is represented by ISAR6.BF16
identification register. Similar feature (FEAT_BF16) exist for AArch64
state and is already advertised in native arm64 kernel.
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230111053706.13994-5-amit.kachhap@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This hwcap was added earlier for 32-bit native arm kernel by commit
ce4835497c ("ARM: 9270/1: vfp: Add hwcap for FEAT_FHM") and hence the
corresponding changes added in 32-bit compat arm64 kernel for similar user
interfaces.
Floating-point half-precision multiplication (FHM) is a feature present
in AArch32/AArch64 state for Armv8. This hwcap is already advertised in
native arm64 kernel.
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230111053706.13994-4-amit.kachhap@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This hwcap was added earlier for 32-bit native arm kernel by commit
62ea0d873a ("ARM: 9269/1: vfp: Add hwcap for FEAT_DotProd") and hence the
corresponding changes added in 32-bit compat arm64 kernel for similar user
interfaces.
Advanced Dot product is a feature (FEAT_DotProd) present in both
AArch32/AArch64 state for Armv8 and is already advertised in native arm64
kernel.
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230111053706.13994-3-amit.kachhap@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
These hwcaps were added earlier for 32-bit native arm kernel by commit
c00a19c8b1 ("ARM: 9268/1: vfp: Add hwcap FPHP and ASIMDHP for FEAT_FP16")
and hence the corresponding changes added in 32-bit compat arm64 kernel for
similar userspace interfaces.
Floating point half-precision (FPHP) and Advanced SIMD half-precision
(ASIMDHP) represents the Armv8 FP16 feature extension and is already
advertised in native arm64 kernel.
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230111053706.13994-2-amit.kachhap@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Since it was added our hwcap for DIT has specified that DIT is a signed
field but this appears to be incorrect, the two values for the enumeration
are:
0b0000 NI
0b0001 IMP
which look like a normal unsigned enumeration and the in-kernel DIT usage
added by 01ab991fc0 ("arm64: Enable data independent timing (DIT) in the
kernel") detects the feature with an unsigned enum. Fix the hwcap to specify
the field as unsigned.
Fixes: 7206dc93a5 ("arm64: Expose Arm v8.4 features")
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221207-arm64-sysreg-helpers-v3-1-0d71a7b174a8@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
* Enable the per-vcpu dirty-ring tracking mechanism, together with an
option to keep the good old dirty log around for pages that are
dirtied by something other than a vcpu.
* Switch to the relaxed parallel fault handling, using RCU to delay
page table reclaim and giving better performance under load.
* Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping option,
which multi-process VMMs such as crosvm rely on (see merge commit 382b5b87a9:
"Fix a number of issues with MTE, such as races on the tags being
initialised vs the PG_mte_tagged flag as well as the lack of support
for VM_SHARED when KVM is involved. Patches from Catalin Marinas and
Peter Collingbourne").
* Merge the pKVM shadow vcpu state tracking that allows the hypervisor
to have its own view of a vcpu, keeping that state private.
* Add support for the PMUv3p5 architecture revision, bringing support
for 64bit counters on systems that support it, and fix the
no-quite-compliant CHAIN-ed counter support for the machines that
actually exist out there.
* Fix a handful of minor issues around 52bit VA/PA support (64kB pages
only) as a prefix of the oncoming support for 4kB and 16kB pages.
* Pick a small set of documentation and spelling fixes, because no
good merge window would be complete without those.
s390:
* Second batch of the lazy destroy patches
* First batch of KVM changes for kernel virtual != physical address support
* Removal of a unused function
x86:
* Allow compiling out SMM support
* Cleanup and documentation of SMM state save area format
* Preserve interrupt shadow in SMM state save area
* Respond to generic signals during slow page faults
* Fixes and optimizations for the non-executable huge page errata fix.
* Reprogram all performance counters on PMU filter change
* Cleanups to Hyper-V emulation and tests
* Process Hyper-V TLB flushes from a nested guest (i.e. from a L2 guest
running on top of a L1 Hyper-V hypervisor)
* Advertise several new Intel features
* x86 Xen-for-KVM:
** Allow the Xen runstate information to cross a page boundary
** Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured
** Add support for 32-bit guests in SCHEDOP_poll
* Notable x86 fixes and cleanups:
** One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0).
** Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few
years back when eliminating unnecessary barriers when switching between
vmcs01 and vmcs02.
** Clean up vmread_error_trampoline() to make it more obvious that params
must be passed on the stack, even for x86-64.
** Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective
of the current guest CPUID.
** Fudge around a race with TSC refinement that results in KVM incorrectly
thinking a guest needs TSC scaling when running on a CPU with a
constant TSC, but no hardware-enumerated TSC frequency.
** Advertise (on AMD) that the SMM_CTL MSR is not supported
** Remove unnecessary exports
Generic:
* Support for responding to signals during page faults; introduces
new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks
Selftests:
* Fix an inverted check in the access tracking perf test, and restore
support for asserting that there aren't too many idle pages when
running on bare metal.
* Fix build errors that occur in certain setups (unsure exactly what is
unique about the problematic setup) due to glibc overriding
static_assert() to a variant that requires a custom message.
* Introduce actual atomics for clear/set_bit() in selftests
* Add support for pinning vCPUs in dirty_log_perf_test.
* Rename the so called "perf_util" framework to "memstress".
* Add a lightweight psuedo RNG for guest use, and use it to randomize
the access pattern and write vs. read percentage in the memstress tests.
* Add a common ucall implementation; code dedup and pre-work for running
SEV (and beyond) guests in selftests.
* Provide a common constructor and arch hook, which will eventually be
used by x86 to automatically select the right hypercall (AMD vs. Intel).
* A bunch of added/enabled/fixed selftests for ARM64, covering memslots,
breakpoints, stage-2 faults and access tracking.
* x86-specific selftest changes:
** Clean up x86's page table management.
** Clean up and enhance the "smaller maxphyaddr" test, and add a related
test to cover generic emulation failure.
** Clean up the nEPT support checks.
** Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values.
** Fix an ordering issue in the AMX test introduced by recent conversions
to use kvm_cpu_has(), and harden the code to guard against similar bugs
in the future. Anything that tiggers caching of KVM's supported CPUID,
kvm_cpu_has() in this case, effectively hides opt-in XSAVE features if
the caching occurs before the test opts in via prctl().
Documentation:
* Remove deleted ioctls from documentation
* Clean up the docs for the x86 MSR filter.
* Various fixes
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmOaFrcUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroPemQgAq49excg2Cc+EsHnZw3vu/QWdA0Rt
KhL3OgKxuHNjCbD2O9n2t5di7eJOTQ7F7T0eDm3xPTr4FS8LQ2327/mQePU/H2CF
mWOpq9RBWLzFsSTeVA2Mz9TUTkYSnDHYuRsBvHyw/n9cL76BWVzjImldFtjYjjex
yAwl8c5itKH6bc7KO+5ydswbvBzODkeYKUSBNdbn6m0JGQST7XppNwIAJvpiHsii
Qgpk0e4Xx9q4PXG/r5DedI6BlufBsLhv0aE9SHPzyKH3JbbUFhJYI8ZD5OhBQuYW
MwxK2KlM5Jm5ud2NZDDlsMmmvd1lnYCFDyqNozaKEWC1Y5rq1AbMa51fXA==
=QAYX
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"ARM64:
- Enable the per-vcpu dirty-ring tracking mechanism, together with an
option to keep the good old dirty log around for pages that are
dirtied by something other than a vcpu.
- Switch to the relaxed parallel fault handling, using RCU to delay
page table reclaim and giving better performance under load.
- Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping
option, which multi-process VMMs such as crosvm rely on (see merge
commit 382b5b87a9: "Fix a number of issues with MTE, such as
races on the tags being initialised vs the PG_mte_tagged flag as
well as the lack of support for VM_SHARED when KVM is involved.
Patches from Catalin Marinas and Peter Collingbourne").
- Merge the pKVM shadow vcpu state tracking that allows the
hypervisor to have its own view of a vcpu, keeping that state
private.
- Add support for the PMUv3p5 architecture revision, bringing support
for 64bit counters on systems that support it, and fix the
no-quite-compliant CHAIN-ed counter support for the machines that
actually exist out there.
- Fix a handful of minor issues around 52bit VA/PA support (64kB
pages only) as a prefix of the oncoming support for 4kB and 16kB
pages.
- Pick a small set of documentation and spelling fixes, because no
good merge window would be complete without those.
s390:
- Second batch of the lazy destroy patches
- First batch of KVM changes for kernel virtual != physical address
support
- Removal of a unused function
x86:
- Allow compiling out SMM support
- Cleanup and documentation of SMM state save area format
- Preserve interrupt shadow in SMM state save area
- Respond to generic signals during slow page faults
- Fixes and optimizations for the non-executable huge page errata
fix.
- Reprogram all performance counters on PMU filter change
- Cleanups to Hyper-V emulation and tests
- Process Hyper-V TLB flushes from a nested guest (i.e. from a L2
guest running on top of a L1 Hyper-V hypervisor)
- Advertise several new Intel features
- x86 Xen-for-KVM:
- Allow the Xen runstate information to cross a page boundary
- Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured
- Add support for 32-bit guests in SCHEDOP_poll
- Notable x86 fixes and cleanups:
- One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0).
- Reinstate IBPB on emulated VM-Exit that was incorrectly dropped
a few years back when eliminating unnecessary barriers when
switching between vmcs01 and vmcs02.
- Clean up vmread_error_trampoline() to make it more obvious that
params must be passed on the stack, even for x86-64.
- Let userspace set all supported bits in MSR_IA32_FEAT_CTL
irrespective of the current guest CPUID.
- Fudge around a race with TSC refinement that results in KVM
incorrectly thinking a guest needs TSC scaling when running on a
CPU with a constant TSC, but no hardware-enumerated TSC
frequency.
- Advertise (on AMD) that the SMM_CTL MSR is not supported
- Remove unnecessary exports
Generic:
- Support for responding to signals during page faults; introduces
new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks
Selftests:
- Fix an inverted check in the access tracking perf test, and restore
support for asserting that there aren't too many idle pages when
running on bare metal.
- Fix build errors that occur in certain setups (unsure exactly what
is unique about the problematic setup) due to glibc overriding
static_assert() to a variant that requires a custom message.
- Introduce actual atomics for clear/set_bit() in selftests
- Add support for pinning vCPUs in dirty_log_perf_test.
- Rename the so called "perf_util" framework to "memstress".
- Add a lightweight psuedo RNG for guest use, and use it to randomize
the access pattern and write vs. read percentage in the memstress
tests.
- Add a common ucall implementation; code dedup and pre-work for
running SEV (and beyond) guests in selftests.
- Provide a common constructor and arch hook, which will eventually
be used by x86 to automatically select the right hypercall (AMD vs.
Intel).
- A bunch of added/enabled/fixed selftests for ARM64, covering
memslots, breakpoints, stage-2 faults and access tracking.
- x86-specific selftest changes:
- Clean up x86's page table management.
- Clean up and enhance the "smaller maxphyaddr" test, and add a
related test to cover generic emulation failure.
- Clean up the nEPT support checks.
- Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values.
- Fix an ordering issue in the AMX test introduced by recent
conversions to use kvm_cpu_has(), and harden the code to guard
against similar bugs in the future. Anything that tiggers
caching of KVM's supported CPUID, kvm_cpu_has() in this case,
effectively hides opt-in XSAVE features if the caching occurs
before the test opts in via prctl().
Documentation:
- Remove deleted ioctls from documentation
- Clean up the docs for the x86 MSR filter.
- Various fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (361 commits)
KVM: x86: Add proper ReST tables for userspace MSR exits/flags
KVM: selftests: Allocate ucall pool from MEM_REGION_DATA
KVM: arm64: selftests: Align VA space allocator with TTBR0
KVM: arm64: Fix benign bug with incorrect use of VA_BITS
KVM: arm64: PMU: Fix period computation for 64bit counters with 32bit overflow
KVM: x86: Advertise that the SMM_CTL MSR is not supported
KVM: x86: remove unnecessary exports
KVM: selftests: Fix spelling mistake "probabalistic" -> "probabilistic"
tools: KVM: selftests: Convert clear/set_bit() to actual atomics
tools: Drop "atomic_" prefix from atomic test_and_set_bit()
tools: Drop conflicting non-atomic test_and_{clear,set}_bit() helpers
KVM: selftests: Use non-atomic clear/set bit helpers in KVM tests
perf tools: Use dedicated non-atomic clear/set bit helpers
tools: Take @bit as an "unsigned long" in {clear,set}_bit() helpers
KVM: arm64: selftests: Enable single-step without a "full" ucall()
KVM: x86: fix APICv/x2AVIC disabled when vm reboot by itself
KVM: Remove stale comment about KVM_REQ_UNHALT
KVM: Add missing arch for KVM_CREATE_DEVICE and KVM_{SET,GET}_DEVICE_ATTR
KVM: Reference to kvm_userspace_memory_region in doc and comments
KVM: Delete all references to removed KVM_SET_MEMORY_ALIAS ioctl
...
* kvm-arm64/mte-map-shared:
: .
: Update the MTE support to allow the VMM to use shared mappings
: to back the memslots exposed to MTE-enabled guests.
:
: Patches courtesy of Catalin Marinas and Peter Collingbourne.
: .
: Fix a number of issues with MTE, such as races on the tags
: being initialised vs the PG_mte_tagged flag as well as the
: lack of support for VM_SHARED when KVM is involved.
:
: Patches from Catalin Marinas and Peter Collingbourne.
: .
Documentation: document the ABI changes for KVM_CAP_ARM_MTE
KVM: arm64: permit all VM_MTE_ALLOWED mappings with MTE enabled
KVM: arm64: unify the tests for VMAs in memslots when MTE is enabled
arm64: mte: Lock a page for MTE tag initialisation
mm: Add PG_arch_3 page flag
KVM: arm64: Simplify the sanitise_mte_tags() logic
arm64: mte: Fix/clarify the PG_mte_tagged semantics
mm: Do not enable PG_arch_2 for all 64-bit architectures
Signed-off-by: Marc Zyngier <maz@kernel.org>
To convert the 32bit id registers to use the sysreg generation, they
must first have a regular pattern, to match the symbols the script
generates.
Ensure symbols for the MVFR2_EL1 register use lower-case for feature
names where the arm-arm does the same.
No functional change.
Signed-off-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20221130171637.718182-16-james.morse@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
To convert the 32bit id registers to use the sysreg generation, they
must first have a regular pattern, to match the symbols the script
generates.
Ensure symbols for the MVFR1_EL1 register use lower-case for feature
names where the arm-arm does the same.
No functional change.
Signed-off-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20221130171637.718182-15-james.morse@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
To convert the 32bit id registers to use the sysreg generation, they
must first have a regular pattern, to match the symbols the script
generates.
Ensure symbols for the MVFR0_EL1 register use lower-case for feature
names where the arm-arm does the same.
No functional change.
Signed-off-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20221130171637.718182-14-james.morse@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
To convert the 32bit id registers to use the sysreg generation, they
must first have a regular pattern, to match the symbols the script
generates.
Ensure symbols for the ID_DFR1_EL1 register have an _EL1 suffix.
No functional change.
Signed-off-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20221130171637.718182-13-james.morse@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
To convert the 32bit id registers to use the sysreg generation, they
must first have a regular pattern, to match the symbols the script
generates.
Ensure symbols for the ID_DFR0_EL1 register have an _EL1 suffix,
and use lower-case for feature names where the arm-arm does the same.
The arm-arm has feature names for some of the ID_DFR0_EL1.PerMon encodings.
Use these feature names in preference to the '8_4' indication of the
architecture version they were introduced in.
No functional change.
Signed-off-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20221130171637.718182-12-james.morse@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
To convert the 32bit id registers to use the sysreg generation, they
must first have a regular pattern, to match the symbols the script
generates.
Ensure symbols for the ID_PFR2_EL1 register have an _EL1 suffix.
No functional change.
Signed-off-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20221130171637.718182-11-james.morse@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
To convert the 32bit id registers to use the sysreg generation, they
must first have a regular pattern, to match the symbols the script
generates.
Ensure symbols for the ID_PFR1_EL1 register have an _EL1 suffix,
and use lower case in feature names where the arm-arm does the same.
No functional change.
Signed-off-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20221130171637.718182-10-james.morse@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
To convert the 32bit id registers to use the sysreg generation, they
must first have a regular pattern, to match the symbols the script
generates.
Ensure symbols for the ID_PFR0_EL1 register have an _EL1 suffix,
and use lower case in feature names where the arm-arm does the same.
No functional change.
Signed-off-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20221130171637.718182-9-james.morse@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
To convert the 32bit id registers to use the sysreg generation, they
must first have a regular pattern, to match the symbols the script
generates.
Ensure symbols for the ID_ISAR6_EL1 register have an _EL1 suffix.
No functional change.
Signed-off-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20221130171637.718182-8-james.morse@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
To convert the 32bit id registers to use the sysreg generation, they
must first have a regular pattern, to match the symbols the script
generates.
Ensure symbols for the ID_ISAR5_EL1 register have an _EL1 suffix.
No functional change.
Signed-off-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20221130171637.718182-7-james.morse@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
To convert the 32bit id registers to use the sysreg generation, they
must first have a regular pattern, to match the symbols the script
generates.
Ensure symbols for the ID_ISAR4_EL1 register have an _EL1 suffix,
and use lower-case for feature names where the arm-arm does the same.
No functional change.
Signed-off-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20221130171637.718182-6-james.morse@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
To convert the 32bit id registers to use the sysreg generation, they
must first have a regular pattern, to match the symbols the script
generates.
Ensure symbols for the ID_ISAR0_EL1 register have an _EL1 suffix,
and use lower-case for feature names where the arm-arm does the same.
To functional change.
Signed-off-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20221130171637.718182-5-james.morse@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
To convert the 32bit id registers to use the sysreg generation, they
must first have a regular pattern, to match the symbols the script
generates.
Ensure symbols for the ID_MMFR5_EL1 register have an _EL1 suffix.
No functional change.
Signed-off-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20221130171637.718182-4-james.morse@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
To convert the 32bit id registers to use the sysreg generation, they
must first have a regular pattern, to match the symbols the script
generates.
Ensure symbols for the ID_MMFR4_EL1 register have an _EL1 suffix,
and use lower case in feature names where the arm-arm does the same.
No functional change.
Signed-off-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20221130171637.718182-3-james.morse@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
To convert the 32bit id registers to use the sysreg generation, they
must first have a regular pattern, to match the symbols the script
generates. The scripts would like to follow exactly what is in the
arm-arm, which uses lower case for some of these feature names.
Ensure symbols for the ID_MMFR0_EL1 register have an _EL1 suffix,
and use lower case in feature names where the arm-arm does the same.
No functional change.
Signed-off-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20221130171637.718182-2-james.morse@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Initialising the tags and setting PG_mte_tagged flag for a page can race
between multiple set_pte_at() on shared pages or setting the stage 2 pte
via user_mem_abort(). Introduce a new PG_mte_lock flag as PG_arch_3 and
set it before attempting page initialisation. Given that PG_mte_tagged
is never cleared for a page, consider setting this flag to mean page
unlocked and wait on this bit with acquire semantics if the page is
locked:
- try_page_mte_tagging() - lock the page for tagging, return true if it
can be tagged, false if already tagged. No acquire semantics if it
returns true (PG_mte_tagged not set) as there is no serialisation with
a previous set_page_mte_tagged().
- set_page_mte_tagged() - set PG_mte_tagged with release semantics.
The two-bit locking is based on Peter Collingbourne's idea.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Peter Collingbourne <pcc@google.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221104011041.290951-6-pcc@google.com
Currently the PG_mte_tagged page flag mostly means the page contains
valid tags and it should be set after the tags have been cleared or
restored. However, in mte_sync_tags() it is set before setting the tags
to avoid, in theory, a race with concurrent mprotect(PROT_MTE) for
shared pages. However, a concurrent mprotect(PROT_MTE) with a copy on
write in another thread can cause the new page to have stale tags.
Similarly, tag reading via ptrace() can read stale tags if the
PG_mte_tagged flag is set before actually clearing/restoring the tags.
Fix the PG_mte_tagged semantics so that it is only set after the tags
have been cleared or restored. This is safe for swap restoring into a
MAP_SHARED or CoW page since the core code takes the page lock. Add two
functions to test and set the PG_mte_tagged flag with acquire and
release semantics. The downside is that concurrent mprotect(PROT_MTE) on
a MAP_SHARED page may cause tag loss. This is already the case for KVM
guests if a VMM changes the page protection while the guest triggers a
user_mem_abort().
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[pcc@google.com: fix build with CONFIG_ARM64_MTE disabled]
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221104011041.290951-3-pcc@google.com
On CPUs without FEAT_IDST, ID register emulation is slower than it needs
to be, as all threads contend for the same lock to perform the
emulation. This patch reworks the emulation to avoid this unnecessary
contention.
On CPUs with FEAT_IDST (which is mandatory from ARMv8.4 onwards), EL0
accesses to ID registers result in a SYS trap, and emulation of these is
handled with a sys64_hook. These hooks are statically allocated, and no
locking is required to iterate through the hooks and perform the
emulation, allowing emulation to occur in parallel with no contention.
On CPUs without FEAT_IDST, EL0 accesses to ID registers result in an
UNDEFINED exception, and emulation of these accesses is handled with an
undef_hook. When an EL0 MRS instruction is trapped to EL1, the kernel
finds the relevant handler by iterating through all of the undef_hooks,
requiring undef_lock to be held during this lookup.
This locking is only required to safely traverse the list of undef_hooks
(as it can be concurrently modified), and the actual emulation of the
MRS does not require any mutual exclusion. This locking is an
unfortunate bottleneck, especially given that MRS emulation is enabled
unconditionally and is never disabled.
This patch reworks the non-FEAT_IDST MRS emulation logic so that it can
be invoked directly from do_el0_undef(). This removes the bottleneck,
allowing MRS traps to be handled entirely in parallel, and is a stepping
stone to making all of the undef_hooks lock-free.
I've tested this in a 64-vCPU VM on a 64-CPU ThunderX2 host, with a
benchmark which spawns a number of threads which each try to read
ID_AA64ISAR0_EL1 1000000 times. This is vastly more contention than will
ever be seen in realistic usage, but clearly demonstrates the removal of
the bottleneck:
| Threads || Time (seconds) |
| || Before || After |
| || Real | System || Real | System |
|---------++--------+---------++--------+---------|
| 1 || 0.29 | 0.20 || 0.24 | 0.12 |
| 2 || 0.35 | 0.51 || 0.23 | 0.27 |
| 4 || 1.08 | 3.87 || 0.24 | 0.56 |
| 8 || 4.31 | 33.60 || 0.24 | 1.11 |
| 16 || 9.47 | 149.39 || 0.23 | 2.15 |
| 32 || 19.07 | 605.27 || 0.24 | 4.38 |
| 64 || 65.40 | 3609.09 || 0.33 | 11.27 |
Aside from the speedup, there should be no functional change as a result
of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221019144123.612388-6-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
FEAT_SVE2p1 introduces a number of new SVE instructions. Since there is no
new architectural state added kernel support is simply a new hwcap which
lets userspace know that the feature is supported.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20221017152520.1039165-6-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
FEAT_RPRFM adds a new range prefetch hint within the existing PRFM space
for range prefetch hinting. Add a new hwcap to allow userspace to discover
support for the new instruction.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20221017152520.1039165-4-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
FEAT_CSSC adds a number of new instructions usable to optimise common short
sequences of instructions, add a hwcap indicating that the feature is
available and can be used by userspace.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20221017152520.1039165-2-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
The ARM architecture revision v8.4 introduces a data independent timing
control (DIT) which can be set at any exception level, and instructs the
CPU to avoid optimizations that may result in a correlation between the
execution time of certain instructions and the value of the data they
operate on.
The DIT bit is part of PSTATE, and is therefore context switched as
usual, given that it becomes part of the saved program state (SPSR) when
taking an exception. We have also defined a hwcap for DIT, and so user
space can discover already whether or nor DIT is available. This means
that, as far as user space is concerned, DIT is wired up and fully
functional.
In the kernel, however, we never bothered with DIT: we disable at it
boot (i.e., INIT_PSTATE_EL1 has DIT cleared) and ignore the fact that we
might run with DIT enabled if user space happened to set it.
Currently, we have no idea whether or not running privileged code with
DIT disabled on a CPU that implements support for it may result in a
side channel that exposes privileged data to unprivileged user space
processes, so let's be cautious and just enable DIT while running in the
kernel if supported by all CPUs.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Adam Langley <agl@google.com>
Link: https://lore.kernel.org/all/YwgCrqutxmX0W72r@gmail.com/
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20221107172400.1851434-1-ardb@kernel.org
[will: Removed cpu_has_dit() as per Mark's suggestion on the list]
Signed-off-by: Will Deacon <will@kernel.org>
Commit 237405ebef ("arm64: cpufeature: Force HWCAP to be based on the
sysreg visible to user-space") forced the hwcaps to use sanitised
user-space view of the id registers. However, the ID register structures
used to select few compat cpufeatures (vfp, crc32, ...) are masked and
hence such hwcaps do not appear in /proc/cpuinfo anymore for PER_LINUX32
personality.
Add the ID register structures explicitly and set the relevant entry as
visible. As these ID registers are now of type visible so make them
available in 64-bit userspace by making necessary changes in register
emulation logic and documentation.
While at it, update the comment for structure ftr_generic_32bits[] which
lists the ID register that use it.
Fixes: 237405ebef ("arm64: cpufeature: Force HWCAP to be based on the sysreg visible to user-space")
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Link: https://lore.kernel.org/r/20221103082232.19189-1-amit.kachhap@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
- arm64 perf: DDR PMU driver for Alibaba's T-Head Yitian 710 SoC, SVE
vector granule register added to the user regs together with SVE perf
extensions documentation.
- SVE updates: add HWCAP for SVE EBF16, update the SVE ABI documentation
to match the actual kernel behaviour (zeroing the registers on syscall
rather than "zeroed or preserved" previously).
- More conversions to automatic system registers generation.
- vDSO: use self-synchronising virtual counter access in gettimeofday()
if the architecture supports it.
- arm64 stacktrace cleanups and improvements.
- arm64 atomics improvements: always inline assembly, remove LL/SC
trampolines.
- Improve the reporting of EL1 exceptions: rework BTI and FPAC exception
handling, better EL1 undefs reporting.
- Cortex-A510 erratum 2658417: remove BF16 support due to incorrect
result.
- arm64 defconfig updates: build CoreSight as a module, enable options
necessary for docker, memory hotplug/hotremove, enable all PMUs
provided by Arm.
- arm64 ptrace() support for TPIDR2_EL0 (register provided with the SME
extensions).
- arm64 ftraces updates/fixes: fix module PLTs with mcount, remove
unused function.
- kselftest updates for arm64: simple HWCAP validation, FP stress test
improvements, validation of ZA regs in signal handlers, include larger
SVE and SME vector lengths in signal tests, various cleanups.
- arm64 alternatives (code patching) improvements to robustness and
consistency: replace cpucap static branches with equivalent
alternatives, associate callback alternatives with a cpucap.
- Miscellaneous updates: optimise kprobe performance of patching
single-step slots, simplify uaccess_mask_ptr(), move MTE registers
initialisation to C, support huge vmalloc() mappings, run softirqs on
the per-CPU IRQ stack, compat (arm32) misalignment fixups for
multiword accesses.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmM9W4cACgkQa9axLQDI
XvEy3w/+LJ3KCFowWiz5gTAWikjv+UVssHjLMJixn47V7hsEFQ26Xnam/438rTMI
kE95u6DHUpw2SMIxKzFRO7oI5cQtP+cWGwTtOUnjVO+U1oN+HqDOIbO9DbylWDcU
eeeqMMmawMfTPuZrYklpOhXscsorbrKIvYBg7wHYOcwBYV3EPhWr89lwMvTVRuyJ
qpX628KlkGMaBcONNhv3nS3qZcAOs0oHQCAVS4C8czLDL+vtJlumXUS3xr1Mqm72
xtFe7sje8Djr2kZ8mzh0GbFiZEBoBD3F/l7ayq8gVRaVpToUt8sk36Stjs4LojF1
6imuAfji/5TItkScq5KhGqj6MIugwp/eUVbRN74OLNTYx7msF1ZADNFQ+Q0UuY0H
SYK13KvmOji0xjS8qAfhqrwNB79sk3fb+zF9LjETbdz4ZJCgg9gcFbSUTY0DvMfS
MXZk/jVeB07olA8xYbjh0BRt4UV9xU628FPQzK5k7e4Nzl4jSvgtJZCZanfuVtjy
/ZS1vbN8o7tQLBAlVnw+Exi/VedkKxkkMgm8tPKsMgERTFDx0Pc4Gs72hRpDnPWT
MRbeCCGleAf3JQ5vF0coBDNOCEVvweQgShHOyHTz0GyhWXLCFx3RJICo5I4EIpps
LLUk4JK0fO3LVrf1AEpu5ZP4+Sact0zfsH3gB7qyLPYFDmjDXD8=
=jl3Z
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
- arm64 perf: DDR PMU driver for Alibaba's T-Head Yitian 710 SoC, SVE
vector granule register added to the user regs together with SVE perf
extensions documentation.
- SVE updates: add HWCAP for SVE EBF16, update the SVE ABI
documentation to match the actual kernel behaviour (zeroing the
registers on syscall rather than "zeroed or preserved" previously).
- More conversions to automatic system registers generation.
- vDSO: use self-synchronising virtual counter access in gettimeofday()
if the architecture supports it.
- arm64 stacktrace cleanups and improvements.
- arm64 atomics improvements: always inline assembly, remove LL/SC
trampolines.
- Improve the reporting of EL1 exceptions: rework BTI and FPAC
exception handling, better EL1 undefs reporting.
- Cortex-A510 erratum 2658417: remove BF16 support due to incorrect
result.
- arm64 defconfig updates: build CoreSight as a module, enable options
necessary for docker, memory hotplug/hotremove, enable all PMUs
provided by Arm.
- arm64 ptrace() support for TPIDR2_EL0 (register provided with the SME
extensions).
- arm64 ftraces updates/fixes: fix module PLTs with mcount, remove
unused function.
- kselftest updates for arm64: simple HWCAP validation, FP stress test
improvements, validation of ZA regs in signal handlers, include
larger SVE and SME vector lengths in signal tests, various cleanups.
- arm64 alternatives (code patching) improvements to robustness and
consistency: replace cpucap static branches with equivalent
alternatives, associate callback alternatives with a cpucap.
- Miscellaneous updates: optimise kprobe performance of patching
single-step slots, simplify uaccess_mask_ptr(), move MTE registers
initialisation to C, support huge vmalloc() mappings, run softirqs on
the per-CPU IRQ stack, compat (arm32) misalignment fixups for
multiword accesses.
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (126 commits)
arm64: alternatives: Use vdso/bits.h instead of linux/bits.h
arm64/kprobe: Optimize the performance of patching single-step slot
arm64: defconfig: Add Coresight as module
kselftest/arm64: Handle EINTR while reading data from children
kselftest/arm64: Flag fp-stress as exiting when we begin finishing up
kselftest/arm64: Don't repeat termination handler for fp-stress
ARM64: reloc_test: add __init/__exit annotations to module init/exit funcs
arm64/mm: fold check for KFENCE into can_set_direct_map()
arm64: ftrace: fix module PLTs with mcount
arm64: module: Remove unused plt_entry_is_initialized()
arm64: module: Make plt_equals_entry() static
arm64: fix the build with binutils 2.27
kselftest/arm64: Don't enable v8.5 for MTE selftest builds
arm64: uaccess: simplify uaccess_mask_ptr()
arm64: asm/perf_regs.h: Avoid C++-style comment in UAPI header
kselftest/arm64: Fix typo in hwcap check
arm64: mte: move register initialization to C
arm64: mm: handle ARM64_KERNEL_USES_PMD_MAPS in vmemmap_populate()
arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()
arm64/sve: Add Perf extensions documentation
...
* for-next/misc:
: Miscellaneous patches
arm64/kprobe: Optimize the performance of patching single-step slot
ARM64: reloc_test: add __init/__exit annotations to module init/exit funcs
arm64/mm: fold check for KFENCE into can_set_direct_map()
arm64: uaccess: simplify uaccess_mask_ptr()
arm64: mte: move register initialization to C
arm64: mm: handle ARM64_KERNEL_USES_PMD_MAPS in vmemmap_populate()
arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()
arm64: support huge vmalloc mappings
arm64: spectre: increase parameters that can be used to turn off bhb mitigation individually
arm64: run softirqs on the per-CPU IRQ stack
arm64: compat: Implement misalignment fixups for multiword loads
* for-next/alternatives:
: Alternatives (code patching) improvements
arm64: fix the build with binutils 2.27
arm64: avoid BUILD_BUG_ON() in alternative-macros
arm64: alternatives: add shared NOP callback
arm64: alternatives: add alternative_has_feature_*()
arm64: alternatives: have callbacks take a cap
arm64: alternatives: make alt_region const
arm64: alternatives: hoist print out of __apply_alternatives()
arm64: alternatives: proton-pack: prepare for cap changes
arm64: alternatives: kvm: prepare for cap changes
arm64: cpufeature: make cpus_have_cap() noinstr-safe
With -fsanitize=kcfi, we no longer need function_nocfi() as
the compiler won't change function references to point to a
jump table. Remove all implementations and uses of the macro.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220908215504.3686827-14-samitolvanen@google.com
With -fsanitize=kcfi, CONFIG_CFI_CLANG no longer has issues
with address space confusion in functions that switch to linear
mapping. Now that the indirectly called assembly functions have
type annotations, drop the __nocfi attributes.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220908215504.3686827-12-samitolvanen@google.com
If FEAT_MTE2 is disabled via the arm64.nomte command line argument on a
CPU that claims to support FEAT_MTE2, the kernel will use Tagged Normal
in the MAIR. If we interpret arm64.nomte to mean that the CPU does not
in fact implement FEAT_MTE2, setting the system register like this may
lead to UNSPECIFIED behavior. Fix it by arranging for MAIR to be set
in the C function cpu_enable_mte which is called based on the sanitized
version of the system register.
There is no need for the rest of the MTE-related system register
initialization to happen from assembly, with the exception of TCR_EL1,
which must be set to include at least TBI1 because the secondary CPUs
access KASan-allocated data structures early. Therefore, make the TCR_EL1
initialization unconditional and move the rest of the initialization to
cpu_enable_mte so that we no longer have a dependency on the unsanitized
ID register value.
Co-developed-by: Evgenii Stepanov <eugenis@google.com>
Signed-off-by: Peter Collingbourne <pcc@google.com>
Signed-off-by: Evgenii Stepanov <eugenis@google.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 3b714d24ef ("arm64: mte: CPU feature detection and initial sysreg configuration")
Cc: <stable@vger.kernel.org> # 5.10.x
Link: https://lore.kernel.org/r/20220915222053.3484231-1-eugenis@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Currrently we use a mixture of alternative sequences and static branches
to handle features detected at boot time. For ease of maintenance we
generally prefer to use static branches in C code, but this has a few
downsides:
* Each static branch has metadata in the __jump_table section, which is
not discarded after features are finalized. This wastes some space,
and slows down the patching of other static branches.
* The static branches are patched at a different point in time from the
alternatives, so changes are not atomic. This leaves a transient
period where there could be a mismatch between the behaviour of
alternatives and static branches, which could be problematic for some
features (e.g. pseudo-NMI).
* More (instrumentable) kernel code is executed to patch each static
branch, which can be risky when patching certain features (e.g.
irqflags management for pseudo-NMI).
* When CONFIG_JUMP_LABEL=n, static branches are turned into a load of a
flag and a conditional branch. This means it isn't safe to use such
static branches in an alternative address space (e.g. the NVHE/PKVM
hyp code), where the generated address isn't safe to acccess.
To deal with these issues, this patch introduces new
alternative_has_feature_*() helpers, which work like static branches but
are patched using alternatives. This ensures the patching is performed
at the same time as other alternative patching, allows the metadata to
be freed after patching, and is safe for use in alternative address
spaces.
Note that all supported toolchains have asm goto support, and since
commit:
a0a12c3ed0 ("asm goto: eradicate CC_HAS_ASM_GOTO)"
... the CC_HAS_ASM_GOTO Kconfig symbol has been removed, so no feature
check is necessary, and we can always make use of asm goto.
Additionally, note that:
* This has no impact on cpus_have_cap(), which is a dynamic check.
* This has no functional impact on cpus_have_const_cap(). The branches
are patched slightly later than before this patch, but these branches
are not reachable until caps have been finalised.
* It is now invalid to use cpus_have_final_cap() in the window between
feature detection and patching. All existing uses are only expected
after patching anyway, so this should not be a problem.
* The LSE atomics will now be enabled during alternatives patching
rather than immediately before. As the LL/SC an LSE atomics are
functionally equivalent this should not be problematic.
When building defconfig with GCC 12.1.0, the resulting Image is 64KiB
smaller:
| % ls -al Image-*
| -rw-r--r-- 1 mark mark 37108224 Aug 23 09:56 Image-after
| -rw-r--r-- 1 mark mark 37173760 Aug 23 09:54 Image-before
According to bloat-o-meter.pl:
| add/remove: 44/34 grow/shrink: 602/1294 up/down: 39692/-61108 (-21416)
| Function old new delta
| [...]
| Total: Before=16618336, After=16596920, chg -0.13%
| add/remove: 0/2 grow/shrink: 0/0 up/down: 0/-1296 (-1296)
| Data old new delta
| arm64_const_caps_ready 16 - -16
| cpu_hwcap_keys 1280 - -1280
| Total: Before=8987120, After=8985824, chg -0.01%
| add/remove: 0/0 grow/shrink: 0/0 up/down: 0/0 (0)
| RO Data old new delta
| Total: Before=18408, After=18408, chg +0.00%
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220912162210.3626215-8-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Today, callback alternatives are special-cased within
__apply_alternatives(), and are applied alongside patching for system
capabilities as ARM64_NCAPS is not part of the boot_capabilities feature
mask.
This special-casing is less than ideal. Giving special meaning to
ARM64_NCAPS for this requires some structures and loops to use
ARM64_NCAPS + 1 (AKA ARM64_NPATCHABLE), while others use ARM64_NCAPS.
It's also not immediately clear callback alternatives are only applied
when applying alternatives for system-wide features.
To make this a bit clearer, changes the way that callback alternatives
are identified to remove the special-casing of ARM64_NCAPS, and to allow
callback alternatives to be associated with a cpucap as with all other
alternatives.
New cpucaps, ARM64_ALWAYS_BOOT and ARM64_ALWAYS_SYSTEM are added which
are always detected alongside boot cpu capabilities and system
capabilities respectively. All existing callback alternatives are made
to use ARM64_ALWAYS_SYSTEM, and so will be patched at the same point
during the boot flow as before.
Subsequent patches will make more use of these new cpucaps.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220912162210.3626215-7-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
get_arm64_ftr_reg() returns the properties of a system register based
on its instruction encoding.
This is needed by erratum workaround in cpu_errata.c to modify the
user-space visible view of id registers.
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20220909165938.3931307-3-james.morse@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
arm64 advertises hardware features to user-space via HWCAPs, and by
emulating access to the CPUs id registers. The cpufeature code has a
sanitised system-wide view of an id register, and a sanitised user-space
view of an id register, where some features use their 'safe' value
instead of the hardware value.
It is currently possible for a HWCAP to be advertised where the user-space
view of the id register does not show the feature as supported.
Erratum workaround need to remove both the HWCAP, and the feature from
the user-space view of the id register. This involves duplicating the
code, and spreading it over cpufeature.c and cpu_errata.c.
Make the HWCAP code use the user-space view of id registers. This ensures
the values never diverge, and allows erratum workaround to remove HWCAP
by modifying the user-space view of the id register.
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20220909165938.3931307-2-james.morse@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Normally we include the full register name in the defines for fields within
registers but this has not been followed for ID registers. In preparation
for automatic generation of defines add the _EL1s into the defines for
ID_AA64DFR0_EL1 to follow the convention. No functional changes.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220910163354.860255-3-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The naming scheme the architecture uses for the fields in ID_AA64DFR0_EL1
does not align well with kernel conventions, using as it does a lot of
MixedCase in various arrangements. In preparation for automatically
generating the defines for this register rename the defines used to match
what is in the architecture.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220910163354.860255-2-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In preparation for automatic generation of constants update the define for
SME being implemented to the convention we are using, no functional change.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-20-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In preparation for automatic generation of constants update the define for
BTI being implemented to the convention we are using, no functional change.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-19-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The naming for fractional versions fields in ID_AA64PFR1_EL1 does not align
with that in the architecture, lacking underscores and using upper case
where the architecture uses lower case. In preparation for automatic
generation of defines bring the code in sync with the architecture, no
functional change.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-18-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In preparation for conversion to automatic generation refresh the names
given to the items in the MTE feture enumeration to reflect our standard
pattern for naming, corresponding to the architecture feature names they
reflect. No functional change.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-17-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In preparation for conversion to automatic generation refresh the names
given to the items in the SSBS feature enumeration to reflect our standard
pattern for naming, corresponding to the architecture feature names they
reflect. No functional change.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-16-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The architecture refers to the register field identifying advanced SIMD as
AdvSIMD but the kernel refers to it as ASIMD. Use the architecture's
naming. No functional changes.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-15-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We generally refer to the baseline feature implemented as _IMP so in
preparation for automatic generation of register defines update those for
ID_AA64PFR0_EL1 to reflect this.
In the case of ASIMD we don't actually use the define so just remove it.
No functional changes.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-14-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The kernel refers to ID_AA64MMFR2_EL1.CnP as CNP. In preparation for
automatic generation of defines for the system registers bring the naming
used by the kernel in sync with that of DDI0487H.a. No functional change.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-13-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The kernel refers to ID_AA64MMFR2_EL1.VARange as LVA. In preparation for
automatic generation of defines for the system registers bring the naming
used by the kernel in sync with that of DDI0487H.a. No functional change.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-12-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In preparation for converting the ID_AA64MMFR1_EL1 system register
defines to automatic generation, rename them to follow the conventions
used by other automatically generated registers:
* Add _EL1 in the register name.
* Rename fields to match the names in the ARM ARM:
* LOR -> LO
* HPD -> HPDS
* VHE -> VH
* HADBS -> HAFDBS
* SPECSEI -> SpecSEI
* VMIDBITS -> VMIDBits
There should be no functional change as a result of this patch.
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-11-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
For some reason we refer to ID_AA64MMFR0_EL1.ASIDBits as ASID. Add BITS
into the name, bringing the naming into sync with DDI0487H.a. Due to the
large amount of MixedCase in this register which isn't really consistent
with either the kernel style or the majority of the architecture the use of
upper case is preserved. No functional changes.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-10-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
For some reason we refer to ID_AA64MMFR0_EL1.BigEnd as BIGENDEL. Remove the
EL from the name, bringing the naming into sync with DDI0487H.a. Due to the
large amount of MixedCase in this register which isn't really consistent
with either the kernel style or the majority of the architecture the use of
upper case is preserved. No functional changes.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-9-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Our standard is to include the _EL1 in the constant names for registers but
we did not do that for ID_AA64PFR1_EL1, update to do so in preparation for
conversion to automatic generation. No functional change.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-8-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Normally we include the full register name in the defines for fields within
registers but this has not been followed for ID registers. In preparation
for automatic generation of defines add the _EL1s into the defines for
ID_AA64PFR0_EL1 to follow the convention. No functional changes.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-7-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Normally we include the full register name in the defines for fields within
registers but this has not been followed for ID registers. In preparation
for automatic generation of defines add the _EL1s into the defines for
ID_AA64MMFR2_EL1 to follow the convention. No functional changes.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-6-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Normally we include the full register name in the defines for fields within
registers but this has not been followed for ID registers. In preparation
for automatic generation of defines add the _EL1s into the defines for
ID_AA64MMFR0_EL1 to follow the convention. No functional changes.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-5-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
SVE has a separate identification register indicating support for BFloat16
operations. Add a hwcap identifying support for EBF16 in this register,
mirroring what we did for the non-SVE case.
While there is currently an architectural requirement for BF16 support to
be the same in SVE and non-SVE contexts there are separate identification
registers this separate hwcap helps avoid issues if that requirement were
to be relaxed in the future, we have already chosen to have a separate
capability for base BF16 support.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220829154815.832347-1-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The AMU counter AMEVCNTR01 (constant counter) should increment at the same
rate as the system counter. On affected Cortex-A510 cores, AMEVCNTR01
increments incorrectly giving a significantly higher output value. This
results in inaccurate task scheduler utilization tracking and incorrect
feedback on CPU frequency.
Work around this problem by returning 0 when reading the affected counter
in key locations that results in disabling all users of this counter from
using it either for frequency invariance or as FFH reference counter. This
effect is the same to firmware disabling affected counters.
Details on how the two features are affected by this erratum:
- AMU counters will not be used for frequency invariance for affected
CPUs and CPUs in the same cpufreq policy. AMUs can still be used for
frequency invariance for unaffected CPUs in the system. Although
unlikely, if no alternative method can be found to support frequency
invariance for affected CPUs (cpufreq based or solution based on
platform counters) frequency invariance will be disabled. Please check
the chapter on frequency invariance at
Documentation/scheduler/sched-capacity.rst for details of its effect.
- Given that FFH can be used to fetch either the core or constant counter
values, restrictions are lifted regarding any of these counters
returning a valid (!0) value. Therefore FFH is considered supported
if there is a least one CPU that support AMUs, independent of any
counters being disabled or affected by this erratum. Clarifying
comments are now added to the cpc_ffh_supported(), cpu_read_constcnt()
and cpu_read_corecnt() functions.
The above is achieved through adding a new erratum: ARM64_ERRATUM_2457168.
Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20220819103050.24211-1-ionela.voinescu@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmLnDOwACgkQSfxwEqXe
A65Fiw//Z0YaPejSslQIGitQ1b0XzdWBhyJArYDieaaiQRXMqlaSKlIUqHz38xb7
+FykUY51/SJLjHV2riPxq1OK3/MPmk6VlTd0HHihcHVmg77oZcFcv2tPnDpZoqND
TsBOujLbXKwxP8tNFedRY/4+K7w+ue9BTfDjuH7aCtz7uWd+4cNJmPg3x9FCfkMA
+hbcRluwE9W3Pg4OCKwv+qxL0JF3qQtNKEOp1wpnjGAZZW/I9gFNgFBEkykvcAsj
TkIRDc3agPFj6QgDeRIgLdnf9KCsLubKAg5oJneeCvQztJJUCSkn8nQXxpx+4sLo
GsRgvCdfL/GyJqfSAzQJVYDHKtKMkJiCiWCC/oOALR8dzHJfSlULDAjbY1m/DAr9
at+vi4678Or7TNx2ZSaUlCXXKZ+UT7yWMlQWax9JuxGk1hGYP5/eT1AH5SGjqUwF
w1q8oyzxt1vUcnOzEddFXPFirnqqhAk4dQFtu83+xKM4ZssMVyeB4NZdEhAdW0ng
MX+RjrVj4l5gWWuoS0Cx3LUxDCgV6WT0dN+Vl9axAZkoJJbcXLEmXwQ6NbzTLPWg
1/MT7qFTxNcTCeAArMdZvvFbeh7pOBXO42pafrK/7vDRnTMUIw9tqXNLQUfvdFQp
F5flPgiVRHDU2vSzKIFtnPTyXU0RBBGvNb4n0ss2ehH2DSsCxYE=
=Zy3d
-----END PGP SIGNATURE-----
Merge tag 'random-6.0-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull random number generator updates from Jason Donenfeld:
"Though there's been a decent amount of RNG-related development during
this last cycle, not all of it is coming through this tree, as this
cycle saw a shift toward tackling early boot time seeding issues,
which took place in other trees as well.
Here's a summary of the various patches:
- The CONFIG_ARCH_RANDOM .config option and the "nordrand" boot
option have been removed, as they overlapped with the more widely
supported and more sensible options, CONFIG_RANDOM_TRUST_CPU and
"random.trust_cpu". This change allowed simplifying a bit of arch
code.
- x86's RDRAND boot time test has been made a bit more robust, with
RDRAND disabled if it's clearly producing bogus results. This would
be a tip.git commit, technically, but I took it through random.git
to avoid a large merge conflict.
- The RNG has long since mixed in a timestamp very early in boot, on
the premise that a computer that does the same things, but does so
starting at different points in wall time, could be made to still
produce a different RNG state. Unfortunately, the clock isn't set
early in boot on all systems, so now we mix in that timestamp when
the time is actually set.
- User Mode Linux now uses the host OS's getrandom() syscall to
generate a bootloader RNG seed and later on treats getrandom() as
the platform's RDRAND-like faculty.
- The arch_get_random_{seed_,}_long() family of functions is now
arch_get_random_{seed_,}_longs(), which enables certain platforms,
such as s390, to exploit considerable performance advantages from
requesting multiple CPU random numbers at once, while at the same
time compiling down to the same code as before on platforms like
x86.
- A small cleanup changing a cmpxchg() into a try_cmpxchg(), from
Uros.
- A comment spelling fix"
More info about other random number changes that come in through various
architecture trees in the full commentary in the pull request:
https://lore.kernel.org/all/20220731232428.2219258-1-Jason@zx2c4.com/
* tag 'random-6.0-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
random: correct spelling of "overwrites"
random: handle archrandom with multiple longs
um: seed rng using host OS rng
random: use try_cmpxchg in _credit_init_bits
timekeeping: contribute wall clock to rng on time change
x86/rdrand: Remove "nordrand" flag in favor of "random.trust_cpu"
random: remove CONFIG_ARCH_RANDOM
Even if we are now able to tell the kernel to avoid exposing SVE/SME
from the command line, we still have a couple of places where we
unconditionally access the ZCR_EL1 (resp. SMCR_EL1) registers.
On systems with broken firmwares, this results in a crash even if
arm64.nosve (resp. arm64.nosme) was passed on the command-line.
To avoid this, only update cpuinfo_arm64::reg_{zcr,smcr} once
we have computed the sanitised version for the corresponding
feature registers (ID_AA64PFR0 for SVE, and ID_AA64PFR1 for
SME). This results in some minor refactoring.
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Peter Collingbourne <pcc@google.com>
Tested-by: Peter Collingbourne <pcc@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220720105219.1755096-1-maz@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
* for-next/boot: (34 commits)
arm64: fix KASAN_INLINE
arm64: Add an override for ID_AA64SMFR0_EL1.FA64
arm64: Add the arm64.nosve command line option
arm64: Add the arm64.nosme command line option
arm64: Expose a __check_override primitive for oddball features
arm64: Allow the idreg override to deal with variable field width
arm64: Factor out checking of a feature against the override into a macro
arm64: Allow sticky E2H when entering EL1
arm64: Save state of HCR_EL2.E2H before switch to EL1
arm64: Rename the VHE switch to "finalise_el2"
arm64: mm: fix booting with 52-bit address space
arm64: head: remove __PHYS_OFFSET
arm64: lds: use PROVIDE instead of conditional definitions
arm64: setup: drop early FDT pointer helpers
arm64: head: avoid relocating the kernel twice for KASLR
arm64: kaslr: defer initialization to initcall where permitted
arm64: head: record CPU boot mode after enabling the MMU
arm64: head: populate kernel page tables with MMU and caches on
arm64: head: factor out TTBR1 assignment into a macro
arm64: idreg-override: use early FDT mapping in ID map
...
* for-next/cpufeature:
arm64/hwcap: Support FEAT_EBF16
arm64/cpufeature: Store elf_hwcaps as a bitmap rather than unsigned long
arm64/hwcap: Document allocation of upper bits of AT_HWCAP
arm64: trap implementation defined functionality in userspace
* for-next/perf:
drivers/perf: arm_spe: Fix consistency of SYS_PMSCR_EL1.CX
perf: RISC-V: Add of_node_put() when breaking out of for_each_of_cpu_node()
docs: perf: Include hns3-pmu.rst in toctree to fix 'htmldocs' WARNING
drivers/perf: hisi: add driver for HNS3 PMU
drivers/perf: hisi: Add description for HNS3 PMU driver
drivers/perf: riscv_pmu_sbi: perf format
perf/arm-cci: Use the bitmap API to allocate bitmaps
drivers/perf: riscv_pmu: Add riscv pmu pm notifier
perf: hisi: Extract hisi_pmu_init
perf/marvell_cn10k: Fix TAD PMU register offset
perf/marvell_cn10k: Remove useless license text when SPDX-License-Identifier is already used
arm64: cpufeature: Allow different PMU versions in ID_DFR0_EL1
perf/arm-cci: fix typo in comment
drivers/perf:Directly use ida_alloc()/free()
drivers/perf: Directly use ida_alloc()/free()
* for-next/kpti:
arm64: correct the effect of mitigations off on kpti
arm64: entry: simplify trampoline data page
arm64: mm: install KPTI nG mappings with MMU enabled
arm64: kpti-ng: simplify page table traversal logic
The v9.2 feature FEAT_EBF16 provides support for an extended BFloat16 mode.
Allow userspace to discover system support for this feature by adding a
hwcap for it.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220707103632.12745-4-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
When we added support for AT_HWCAP2 we took advantage of the fact that we
have limited hwcaps to the low 32 bits and stored it along with AT_HWCAP
in a single unsigned integer. Thanks to the ever expanding capabilities of
the architecture we have now allocated all 64 of the bits in an unsigned
long so in preparation for adding more hwcaps convert elf_hwcap to be a
bitmap instead, with 64 bits allocated to each AT_HWCAP.
There should be no functional change from this patch.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220707103632.12745-3-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Cortex-A57 and Cortex-A72 have an erratum where an interrupt that
occurs between a pair of AES instructions in aarch32 mode may corrupt
the ELR. The task will subsequently produce the wrong AES result.
The AES instructions are part of the cryptographic extensions, which are
optional. User-space software will detect the support for these
instructions from the hwcaps. If the platform doesn't support these
instructions a software implementation should be used.
Remove the hwcap bits on affected parts to indicate user-space should
not use the AES instructions.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20220714161523.279570-3-james.morse@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
When RDRAND was introduced, there was much discussion on whether it
should be trusted and how the kernel should handle that. Initially, two
mechanisms cropped up, CONFIG_ARCH_RANDOM, a compile time switch, and
"nordrand", a boot-time switch.
Later the thinking evolved. With a properly designed RNG, using RDRAND
values alone won't harm anything, even if the outputs are malicious.
Rather, the issue is whether those values are being *trusted* to be good
or not. And so a new set of options were introduced as the real
ones that people use -- CONFIG_RANDOM_TRUST_CPU and "random.trust_cpu".
With these options, RDRAND is used, but it's not always credited. So in
the worst case, it does nothing, and in the best case, maybe it helps.
Along the way, CONFIG_ARCH_RANDOM's meaning got sort of pulled into the
center and became something certain platforms force-select.
The old options don't really help with much, and it's a bit odd to have
special handling for these instructions when the kernel can deal fine
with the existence or untrusted existence or broken existence or
non-existence of that CPU capability.
Simplify the situation by removing CONFIG_ARCH_RANDOM and using the
ordinary asm-generic fallback pattern instead, keeping the two options
that are actually used. For now it leaves "nordrand" for now, as the
removal of that will take a different route.
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Borislav Petkov <bp@suse.de>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Normally we include the full register name in the defines for fields within
registers but this has not been followed for ID registers. In preparation
for automatic generation of defines add the _EL1s into the defines for
ID_AA64ISAR2_EL1 to follow the convention. No functional changes.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220704170302.2609529-17-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Normally we include the full register name in the defines for fields within
registers but this has not been followed for ID registers. In preparation
for automatic generation of defines add the _EL1s into the defines for
ID_AA64ISAR1_EL1 to follow the convention. No functional changes.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220704170302.2609529-16-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
The various defines for bitfields in ID_AA64ZFR0_EL1 do not follow our
conventions for register field names, they omit the _EL1, they don't use
specific defines for enumeration values and they don't follow the naming
in the architecture in some cases. In preparation for automatic generation
bring them into line with convention. No functional changes.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220704170302.2609529-14-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>