Commit Graph

834 Commits

Author SHA1 Message Date
Rob Herring (Arm)
04bd15c4cb perf: arm_pmuv3: Call kvm_vcpu_pmu_resync_el0() before enabling counters
Counting events related to setup of the PMU is not desired, but
kvm_vcpu_pmu_resync_el0() is called just after the PMU counters have
been enabled. Move the call to before enabling the counters.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250218-arm-brbe-v19-v20-1-4e9922fc2e8e@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2025-03-01 05:08:10 +00:00
Vincenzo Frascino
0424b1a81a perf: arm_pmuv3: Add support for ARM Rainier PMU
Add support for the ARM Rainier CPU core PMU.

Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20250221180349.1413089-7-vincenzo.frascino@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-02-28 01:31:21 +00:00
Luo Gengkun
9ec84f79c5 perf: Remove unnecessary parameter of security check
It seems that the attr parameter was never been used in security
checks since it was first introduced by:

commit da97e18458 ("perf_event: Add support for LSM and SELinux checks")

so remove it.

Signed-off-by: Luo Gengkun <luogengkun@huaweicloud.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-02-26 14:13:58 -05:00
Nam Cao
5f8401cf7b drivers: perf: Switch to use hrtimer_setup()
hrtimer_setup() takes the callback function pointer as argument and
initializes the timer completely.

Replace hrtimer_init() and the open coded initialization of
hrtimer::function with the new setup mechanism.

Patch was created by using Coccinelle.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Zack Rusin <zack.rusin@broadcom.com>
Link: https://lore.kernel.org/all/471ea3b829d14a4b4c3c7814dbe1ed13b15d47b8.1738746904.git.namcao@linutronix.de
2025-02-18 11:19:04 +01:00
Joel Granados
1751f872cc treewide: const qualify ctl_tables where applicable
Add the const qualifier to all the ctl_tables in the tree except for
watchdog_hardlockup_sysctl, memory_allocation_profiling_sysctls,
loadpin_sysctl_table and the ones calling register_net_sysctl (./net,
drivers/inifiniband dirs). These are special cases as they use a
registration function with a non-const qualified ctl_table argument or
modify the arrays before passing them on to the registration function.

Constifying ctl_table structs will prevent the modification of
proc_handler function pointers as the arrays would reside in .rodata.
This is made possible after commit 78eb4ea25c ("sysctl: treewide:
constify the ctl_table argument of proc_handlers") constified all the
proc_handlers.

Created this by running an spatch followed by a sed command:
Spatch:
    virtual patch

    @
    depends on !(file in "net")
    disable optional_qualifier
    @

    identifier table_name != {
      watchdog_hardlockup_sysctl,
      iwcm_ctl_table,
      ucma_ctl_table,
      memory_allocation_profiling_sysctls,
      loadpin_sysctl_table
    };
    @@

    + const
    struct ctl_table table_name [] = { ... };

sed:
    sed --in-place \
      -e "s/struct ctl_table .table = &uts_kern/const struct ctl_table *table = \&uts_kern/" \
      kernel/utsname_sysctl.c

Reviewed-by: Song Liu <song@kernel.org>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> # for kernel/trace/
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> # SCSI
Reviewed-by: Darrick J. Wong <djwong@kernel.org> # xfs
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Bill O'Donnell <bodonnel@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Acked-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Acked-by: Anna Schumaker <anna.schumaker@oracle.com>
Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-01-28 13:48:37 +01:00
Linus Torvalds
9ad09c4f28 arm64 updates for 6.14
Confidential Computing:
 * Register a platform device when running in CCA realm mode to enable
   automatic loading of dependent modules.
 
 CPU Features:
 * Update a bunch of system register definitions to pick up new field
   encodings from the architectural documentation.
 
 * Add hwcaps and selftests for the new (2024) dpISA extensions.
 
 Documentation:
 * Update EL3 (firmware) requirements for booting Linux on modern arm64
   designs.
 
 * Remove stale information about the kernel virtual memory map.
 
 Miscellaneous:
 * Minor cleanups and typo fixes.
 
 Memory management:
 * Fix vmemmap_check_pmd() to look at the PMD type bits
 
 * LPA2 (52-bit physical addressing) cleanups and minor fixes.
 
 * Adjust physical address space depending upon whether or not LPA2 is
   enabled.
 
 Perf and PMUs:
 * Add port filtering support for NVIDIA's NVLINK-C2C Coresight PMU
 
 * Extend AXI filtering support for the DDR PMU on NXP IMX SoCs
 
 * Fix Designware PCIe PMU event numbering.
 
 * Add generic branch events for the Apple M1 CPU PMU.
 
 * Add support for Marvell Odyssey DDR and LLC-TAD PMUs.
 
 * Cleanups to the Hisilicon DDRC and Uncore PMU code.
 
 * Advertise discard mode for the SPE PMU.
 
 * Add the perf users mailing list to our MAINTAINERS entry.
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmeKZLcQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNEQzB/0X2U89ZiqxIkTPQvfFrjN/uUGybkq59rEL
 DfeoGukTgJIwc3GHWXXtQ//wuuYKdTeCXaIz5NFK3+7/wmKSLvjkexmue8pta6EY
 5rx9bAPr/D8lAUvhKIN2l3pF/ygoRwDz+nT2yVQ1xlZxYJWX7ZIsMj7W7ceb5kdx
 HRrTSQuhEEPREAWWO4oCMWl5SQZSrIflSE3Be/PsP0OhW6k//ZmWbcJTgUcHbKam
 o2WtNjITyGzxMpRCcrGEZKoe9YcwSxiut/PoD7JuoB4C/rbsf1cdJ6uLmtvGJcZj
 qsdRHhVfBzP1+ahONrDbiT3C2+s1UZySKdCDIxiYy6lB39wpP0dd
 =E7Mf
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "We've got a little less than normal thanks to the holidays in
  December, but there's the usual summary below. The highlight is
  probably the 52-bit physical addressing (LPA2) clean-up from Ard.

  Confidential Computing:

   - Register a platform device when running in CCA realm mode to enable
     automatic loading of dependent modules

  CPU Features:

   - Update a bunch of system register definitions to pick up new field
     encodings from the architectural documentation

   - Add hwcaps and selftests for the new (2024) dpISA extensions

  Documentation:

   - Update EL3 (firmware) requirements for booting Linux on modern
     arm64 designs

   - Remove stale information about the kernel virtual memory map

  Miscellaneous:

   - Minor cleanups and typo fixes

  Memory management:

   - Fix vmemmap_check_pmd() to look at the PMD type bits

   - LPA2 (52-bit physical addressing) cleanups and minor fixes

   - Adjust physical address space depending upon whether or not LPA2 is
     enabled

  Perf and PMUs:

   - Add port filtering support for NVIDIA's NVLINK-C2C Coresight PMU

   - Extend AXI filtering support for the DDR PMU on NXP IMX SoCs

   - Fix Designware PCIe PMU event numbering

   - Add generic branch events for the Apple M1 CPU PMU

   - Add support for Marvell Odyssey DDR and LLC-TAD PMUs

   - Cleanups to the Hisilicon DDRC and Uncore PMU code

   - Advertise discard mode for the SPE PMU

   - Add the perf users mailing list to our MAINTAINERS entry"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (64 commits)
  Documentation: arm64: Remove stale and redundant virtual memory diagrams
  perf docs: arm_spe: Document new discard mode
  perf: arm_spe: Add format option for discard mode
  MAINTAINERS: Add perf list for drivers/perf/
  arm64: Remove duplicate included header
  drivers/perf: apple_m1: Map generic branch events
  arm64: rsi: Add automatic arm-cca-guest module loading
  kselftest/arm64: Add 2024 dpISA extensions to hwcap test
  KVM: arm64: Allow control of dpISA extensions in ID_AA64ISAR3_EL1
  arm64/hwcap: Describe 2024 dpISA extensions to userspace
  arm64/sysreg: Update ID_AA64SMFR0_EL1 to DDI0601 2024-12
  arm64: Filter out SVE hwcaps when FEAT_SVE isn't implemented
  drivers/perf: hisi: Set correct IRQ affinity for PMUs with no association
  arm64/sme: Move storage of reg_smidr to __cpuinfo_store_cpu()
  arm64: mm: Test for pmd_sect() in vmemmap_check_pmd()
  arm64/mm: Replace open encodings with PXD_TABLE_BIT
  arm64/mm: Rename pte_mkpresent() as pte_mkvalid()
  arm64/sysreg: Update ID_AA64ISAR2_EL1 to DDI0601 2024-09
  arm64/sysreg: Update ID_AA64ZFR0_EL1 to DDI0601 2024-09
  arm64/sysreg: Update ID_AA64FPFR0_EL1 to DDI0601 2024-09
  ...
2025-01-20 21:21:49 -08:00
James Clark
d28d95bc63 perf: arm_spe: Add format option for discard mode
FEAT_SPEv1p2 (optional from Armv8.6) adds a discard mode that allows all
SPE data to be discarded rather than written to memory. Add a format
bit for this mode.

If the mode isn't supported, the format bit isn't published and attempts
to use it will result in -EOPNOTSUPP. Allocating an aux buffer is still
allowed even though it won't be written to so that old tools continue to
work, but updated tools can choose to skip this step.

Signed-off-by: James Clark <james.clark@linaro.org>
Reviewd-by: Yeoreum Yun <yeoreum.yun@arm.com>
Link: https://lore.kernel.org/r/20250108142904.401139-2-james.clark@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
2025-01-10 14:50:55 +00:00
Oliver Upton
4575353d82 drivers/perf: apple_m1: Map generic branch events
Map the generic perf events for branch prediction stats to the
corresponding hardware events.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Tested-by: Janne Grunau <j@jannau.net>
Link: https://lore.kernel.org/r/20241217212048.3709204-4-oliver.upton@linux.dev
Signed-off-by: Will Deacon <will@kernel.org>
2025-01-10 13:32:56 +00:00
Palmer Dabbelt
6f6ecce59d
Merge patch series "SBI PMU event related fixes"
Atish Patra <atishp@rivosinc.com> says:

Here are two minor improvement/fixes in the PMU event path. The first patch
was part of the series[1]. The 2nd patch was suggested during the series
review.

While the series can only be merged once SBI v3.0 is frozen, these two
patches can be independent of SBI v3.0 and can be merged sooner. Hence, these
two patches are sent as a separate series.

* b4-shazam-merge:
  drivers/perf: riscv: Do not allow invalid raw event config
  drivers/perf: riscv: Return error for default case
  drivers/perf: riscv: Fix Platform firmware event data

Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-0-813e8a4f5962@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-09 09:37:12 -08:00
Atish Patra
3aff4cdbe5
drivers/perf: riscv: Do not allow invalid raw event config
The SBI specification allows only lower 48bits of hpmeventX to be
configured via SBI PMU. Currently, the driver masks of the higher
bits but doesn't return an error. This will lead to an additional
SBI call for config matching which should return for an invalid
event error in most of the cases.

However, if a platform(i.e Rocket and sifive cores) implements a
bitmap of all bits in the event encoding this will lead to an
incorrect event being programmed leading to user confusion.

Report the error to the user if higher bits are set during the
event mapping itself to avoid the confusion and save an additional
SBI call.

Suggested-by: Samuel Holland <samuel.holland@sifive.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-3-813e8a4f5962@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-09 09:37:09 -08:00
Atish Patra
2c206cdede
drivers/perf: riscv: Return error for default case
If the upper two bits has an invalid valid (0x1), the event mapping
is not reliable as it returns an uninitialized variable.

Return appropriate value for the default case.

Fixes: f0c9363db2 ("perf/riscv-sbi: Add platform specific firmware event handling")

Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-2-813e8a4f5962@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-09 09:37:09 -08:00
Atish Patra
fc58db9aeb
drivers/perf: riscv: Fix Platform firmware event data
Platform firmware event data field is allowed to be 62 bits for
Linux as uppper most two bits are reserved to indicate SBI fw or
platform specific firmware events.
However, the event data field is masked as per the hardware raw
event mask which is not correct.

Fix the platform firmware event data field with proper mask.

Fixes: f0c9363db2 ("perf/riscv-sbi: Add platform specific firmware event handling")

Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-1-813e8a4f5962@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-09 09:37:08 -08:00
Yicong Yang
555c6e9b03 drivers/perf: hisi: Set correct IRQ affinity for PMUs with no association
For PMUs with no association, the hisi_pmu->on_cpu is initialized
according to the NUMA locality but use a wrong CPU for the interrupt
affinity. The CPU selected from cpumask_local_spread() can be different
from the CPU by the cpuhp callback. Fix this by setting the IRQ affinity
to hisi_pmu->on_cpu.

Fixes: 6cd137088f ("drivers/perf: hisi: Refactor the detection of associated CPUs")
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20241223125134.57885-1-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-01-07 17:17:32 +00:00
Xu Yang
f3edf03a4c perf: imx9_perf: Introduce AXI filter version to refactor the driver and better extension
The imx93 is the first supported DDR PMU that supports read transaction,
write transaction and read beats events which corresponding respecitively
to counter 2, 3 and 4.

However, transaction-based AXI match has low accuracy when get total bits
compared to beats-based. And imx93 doesn't assign AXI_ID to each master.
So axi filter is not used widely on imx93. This could be regards as AXI
filter version 1.

To improve the AXI filter capability, imx95 supports 1 read beats and 3
write beats event which corresponding respecitively to counter 2-5. imx95
also detailed AXI_ID allocation so that most of the master could be count
individually. This could be regards as AXI filter version 2.

This will introduce AXI filter version to refactor the driver and support
better extension, such as coming imx943. This is also a potential fix on
imx91 when configure axi filter.

Fixes: 44798fe136 ("perf: imx_perf: add support for i.MX91 platform")
Cc: stable@vger.kernel.org
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20241212065708.1353513-1-xu.yang_2@nxp.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-12-19 15:50:32 +00:00
Robin Murphy
e49ecdf79a perf/arm-cmn: Permit more exhaustive groups
The group validation logic still somewhat assumes the original CMN-600
case of events counting globally, such that if one tries to group 9
events where the first 8 target a single DTC domain, the 9th will be
rejected because *a* DTC domain is full, even though it might only
target other non-overlapping domains and thus still be schedulable.
Improve matters by only counting the DTCs that the new event actually
needs (as arm_cmn_val_add_event() was already clever enough to do).

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-and-tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/bdfd1e58dac449e407c5cacfd6bf8577dc0a5899.1733943898.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-12-19 15:33:42 +00:00
Bjorn Helgaas
b34d605d12 perf/dwc_pcie: Qualify RAS DES VSEC Capability by Vendor, Revision
PCI Vendor-Specific (VSEC) Capabilities are defined by each vendor.
Devices from different vendors may advertise a VSEC Capability with the DWC
RAS DES functionality, but the vendors may assign different VSEC IDs.

Search for the DWC RAS DES Capability using the VSEC ID and VSEC Rev
chosen by the vendor.

This does not fix a current problem because Alibaba, Ampere, and Qualcomm
all assigned the same VSEC ID and VSEC Rev for the DWC RAS DES Capability.

The potential issue is that we may add support for a device from another
vendor, where the vendor has already assigned DWC_PCIE_VSEC_RAS_DES_ID
(0x02) for an unrelated VSEC.  In that event, dwc_pcie_des_cap() would find
the unrelated VSEC and mistakenly assume it was a DWC RAS DES Capability.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-and-tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Reviewed-and-tested-by: Shuai Xue <xueshuai@linux.alibaba.com>
Link: https://lore.kernel.org/r/20241209222938.3219364-1-helgaas@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2024-12-11 21:46:36 +00:00
Junhao He
f03241fbeb drivers/perf: hisi: Delete redundant blank line of DDRC PMU
Do not add blank line at the end of a code block defined by braces.

Signed-off-by: Junhao He <hejunhao3@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20241210141525.37788-11-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10 15:57:25 +00:00
Junhao He
4e15bcffa1 drivers/perf: hisi: Fix incorrect variable name "hha_pmu" in DDRC PMU driver
In the callback function write_evtype(), the variable name of struct
hisi_pmu should be "ddrc_pmu" instead of "hha_pmu".

Signed-off-by: Junhao He <hejunhao3@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20241210141525.37788-10-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10 15:57:25 +00:00
Yicong Yang
3b051bb7cb drivers/perf: hisi: Export associated CPUs of each PMU through sysfs
Although the event of the uncore PMU can only be opened on a single
CPU, some PMU does have the affinity on a range of CPUs. For example
the L3C PMU is associated to the CPUs sharing the L3T it monitors.
Users may infer this affinity by the PMU name which may have SCCL ID
and CCL ID encoded (for L3C etc), but it's not that straightforward.
So export this information by adding an "associated_cpus" sysfs
attribute then user can get this directly.

Reviewed-by: Jonathan Cameron <Joanthan.Cameron@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20241210141525.37788-9-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10 15:57:24 +00:00
Yicong Yang
8688c01e31 drivers/perf: hisi: Provide a generic implementation of cpumask/identifier
Each type of HiSilicon Uncore PMU has the following sysfs attributes:

- format: bitmask in perf_event_attr::config[012] of corresponding
  attribute
- event: events name and corresponding event code
- cpumask: range of CPUs the events can be opened on
- identifier: the version of this PMU

Different types of PMU have different implementations of the "format"
and "event" but all share the same implementation of the "cpumask"
and "identifier". Thus we can move cpumask and identifier to the
hisi_uncore_pmu framework and drivers can use the generic
implementation.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20241210141525.37788-8-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10 15:57:24 +00:00
Yicong Yang
32528b165e drivers/perf: hisi: Add a common function to retrieve topology from firmware
Currently each type of uncore PMU driver uses almost the same routine
and the same firmware interface (or properties) to retrieve the topology
information, then reset the unused IDs to -1. Extract the common parts
to the framework in hisi_uncore_pmu_init_topology().

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20241210141525.37788-7-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10 15:57:24 +00:00
Yicong Yang
c192026cee drivers/perf: hisi: Extract topology information to a separate structure
HiSilicon Uncore PMUs are identified by the IDs of the topology element
on which the PMUs are located. Add a new separate struct hisi_pmu_toplogy
to encapsulate this information. Add additional documentation on the
meaning of each ID.

- make sccl_id and sicl_id into a union since they're exclusive. It can
  also be accessed by scl_id if the SICL/SCCL distinction is not
  relevant
- make index_id and sub_id signed so -1 may be used to indicate the PMU
  doesn't have this topology element or it could not be retrieved.

This patch should have no functional changes.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20241210141525.37788-6-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10 15:57:24 +00:00
Yicong Yang
6cd137088f drivers/perf: hisi: Refactor the detection of associated CPUs
There are two type of PMUs supported currently:
1) PMUs locate on SCCL (Super CPU Cluster [1]), associated with certain
   CCL (CPU cluster [1])(e.g. L3C PMU) or not (e.g. DDRC PMU)
2) PMUs locate on the SICL (Super IO Cluster [1]), which has no
   association with certain CPU topology (e.g. CPA PMU)

Currently the associated CPUs of the PMU is detected in the cpuhp online
callback as below:
- for type 1) the CPUs match PMU's sccl_id and ccl_id
- for type 2) PMU's sccl_id is -1 and all online CPUs will be associated

Since uncore PMUs are not bound to certain CPU context and event could be
counting started by any online CPU, the associated CPUs are just a
preference. Below disadvantages are observed in current implementation:
- the PMU cannot be used if its associated CPUs are offline
- SICL PMUs are associated to all the online CPUs implicitly without
  the consideration of locality

So refactor the way we detect the associated CPUs in below aspects:
- add a clear definition of hisi_pmu::associated_cpus
- initialize hisi_pmu::on_cpu based on locality if no associated CPU
  found, otherwise update it from associated CPUs
- drop the detection with a sccl_id of -1 for SICL PMUs

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/perf/hisi-pmu.rst?h=v6.12-rc1

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20241210141525.37788-5-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10 15:57:24 +00:00
Yicong Yang
83037a47d3 drivers/perf: hisi: Migrate to one online CPU if no associated one online
If the selected CPU hisi_pmu::on_cpu goes offline, driver will select
a new online CPU from hisi_pmu::associated_cpus, or if no online CPU
found the PMU context won't be migrated. However for uncore PMUs the
associated CPUs are just a peference and it also works to schedule
the events on any online CPUs. So add a fallback to choose an online
CPU if no associated CPUs found.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20241210141525.37788-4-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10 15:57:24 +00:00
Yicong Yang
f2368a209a drivers/perf: hisi: Don't update the associated_cpus on CPU offline
Event will be scheduled on CPU of hisi_pmu::on_cpu which is selected
from the intersection of hisi_pmu::associated_cpus and online CPUs.
So the associated_cpus don't need to be maintained with online CPUs.
This will save one update operation if one associated CPU is offlined.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20241210141525.37788-3-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10 15:57:24 +00:00
Yicong Yang
41729809ac drivers/perf: hisi: Define a symbol namespace for HiSilicon Uncore PMUs
The HiSilicon Uncore PMU framework implements some common functions
and exported them to the drivers. Use a specific HISI_PMU namespace
for the exported symbols to avoid pollute the generic ones.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20241210141525.37788-2-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10 15:57:24 +00:00
Gowthami Thiagarajan
5fcccba118 perf/marvell: Odyssey LLC-TAD performance monitor support
Each TAD provides eight 64-bit counters for monitoring
cache behavior.The driver always configures the same counter for
all the TADs. The user would end up effectively reserving one of
eight counters in every TAD to look across all TADs.
The occurrences of events are aggregated and presented to the user
at the end of running the workload. The driver does not provide a
way for the user to partition TADs so that different TADs are used for
different applications.

The performance events reflect various internal or interface activities.
By combining the values from multiple performance counters, cache
performance can be measured in terms such as: cache miss rate, cache
allocations, interface retry rate, internal resource occupancy, etc.

Each supported counter's event and formatting information is exposed
to sysfs at /sys/devices/tad/. Use perf tool stat command to measure
the pmu events. For instance:

perf stat -e tad_hit_ltg,tad_hit_dtg <workload>

Signed-off-by: Gowthami Thiagarajan <gthiagarajan@marvell.com>
Link: https://lore.kernel.org/r/20241108040619.753343-6-gthiagarajan@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-12-09 15:57:49 +00:00
Gowthami Thiagarajan
59731e231c perf/marvell: Refactor to extract platform data
Refactor the Marvell TAD PMU driver to add versioning to the
existing driver.

Make no functional changes, the behavior and performance
of the driver remain unchanged.

Signed-off-by: Gowthami Thiagarajan <gthiagarajan@marvell.com>
Link: https://lore.kernel.org/r/20241108040619.753343-5-gthiagarajan@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-12-09 15:57:44 +00:00
Gowthami Thiagarajan
d950c381dc perf/marvell: Odyssey DDR Performance monitor support
Odyssey DRAM Subsystem supports eight counters for monitoring performance
and software can program those counters to monitor any of the defined
performance events. Supported performance events include those counted
at the interface between the DDR controller and the PHY, interface between
the DDR Controller and the CHI interconnect, or within the DDR Controller.

Additionally DSS also supports two fixed performance event counters, one
for ddr reads and the other for ddr writes.

Signed-off-by: Gowthami Thiagarajan <gthiagarajan@marvell.com>
Link: https://lore.kernel.org/r/20241108040619.753343-4-gthiagarajan@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-12-09 15:57:39 +00:00
Gowthami Thiagarajan
0045de7e87 perf/marvell: Refactor to extract PMU operations
Introduce a refactor to the Marvell DDR PMU driver to extract
PMU operations ("pmu ops") from the existing driver.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Signed-off-by: Gowthami Thiagarajan <gthiagarajan@marvell.com>
Link: https://lore.kernel.org/r/20241108040619.753343-3-gthiagarajan@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-12-09 15:57:34 +00:00
Gowthami Thiagarajan
349f77e109 perf/marvell: Refactor to extract platform data
Introduce a refactor to the Marvell DDR pmu driver to extract platform
data ("pdata") from the existing driver. Prepare for the upcoming
support of the next version of the Performance Monitoring Unit (PMU) in
this driver.

Make no functional changes, this refactor solely improves code
organization and prepares for future enhancements.

While at it, fix a typo.

Signed-off-by: Gowthami Thiagarajan <gthiagarajan@marvell.com>
Link: https://lore.kernel.org/r/20241108040619.753343-2-gthiagarajan@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-12-09 15:57:17 +00:00
Ilkka Koskinen
e64c22cc2e perf/dwc_pcie: Fix the event numbers
According to Databook, L1 aux is event number 0x08 and
TX L0s and RX L0S is 0x09. Fix the event numbers for the
two events.

Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com>
Link: https://lore.kernel.org/r/20241205061914.5568-2-ilkka@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-12-09 15:45:21 +00:00
Besar Wicaksono
bce61d5c57 perf: arm_cspmu: nvidia: monitor all ports by default
Some NVIDIA PMUs like the NVLINK-C2C, CNVLINK, and PCIE PMU provide
port filtering. If the port filter is set to zero, the counter of
these PMUs will not capture any event. To avoid meaningless
experiment, the driver sets the port filter value to a default
non-zero value.

Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com>
Link: https://lore.kernel.org/r/20241031142118.1865965-5-bwicaksono@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-12-09 15:07:49 +00:00
Besar Wicaksono
ca26df4b10 perf: arm_cspmu: nvidia: enable NVLINK-C2C port filtering
Enable NVLINK-C2C port filtering to distinguish traffic from
different GPUs connected to NVLINK-C2C.

Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com>
Link: https://lore.kernel.org/r/20241031142118.1865965-4-bwicaksono@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-12-09 15:07:49 +00:00
Besar Wicaksono
ac4c52956f perf: arm_cspmu: nvidia: remove unsupported SCF events
Remove unsupported events under SCF PMU.

Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com>
Link: https://lore.kernel.org/r/20241031142118.1865965-2-bwicaksono@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-12-09 15:07:49 +00:00
Peter Zijlstra
cdd30ebb1b module: Convert symbol namespace to string literal
Clean up the existing export namespace code along the same lines of
commit 33def8498f ("treewide: Convert macro and uses of __section(foo)
to __section("foo")") and for the same reason, it is not desired for the
namespace argument to be a macro expansion itself.

Scripted using

  git grep -l -e MODULE_IMPORT_NS -e EXPORT_SYMBOL_NS | while read file;
  do
    awk -i inplace '
      /^#define EXPORT_SYMBOL_NS/ {
        gsub(/__stringify\(ns\)/, "ns");
        print;
        next;
      }
      /^#define MODULE_IMPORT_NS/ {
        gsub(/__stringify\(ns\)/, "ns");
        print;
        next;
      }
      /MODULE_IMPORT_NS/ {
        $0 = gensub(/MODULE_IMPORT_NS\(([^)]*)\)/, "MODULE_IMPORT_NS(\"\\1\")", "g");
      }
      /EXPORT_SYMBOL_NS/ {
        if ($0 ~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+),/) {
  	if ($0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/ &&
  	    $0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(\)/ &&
  	    $0 !~ /^my/) {
  	  getline line;
  	  gsub(/[[:space:]]*\\$/, "");
  	  gsub(/[[:space:]]/, "", line);
  	  $0 = $0 " " line;
  	}

  	$0 = gensub(/(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/,
  		    "\\1(\\2, \"\\3\")", "g");
        }
      }
      { print }' $file;
  done

Requested-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://mail.google.com/mail/u/2/#inbox/FMfcgzQXKWgMmjdFwwdsfgxzKpVHWPlc
Acked-by: Greg KH <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-02 11:34:44 -08:00
Linus Torvalds
50ee4a6fe3 arm64 fixes for 6.13-rc1:
- Deselect ARCH_CORRECT_STACKTRACE_ON_KRETPROBE so that tests depending
   on it don't run (and fail) on arm64
 
 - Fix lockdep assert in the Arm SMMUv3 PMU driver
 
 - Fix the port and device ID bits setting in the Arm CMN perf driver
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmdKK6QACgkQa9axLQDI
 XvFvxg//SIcw9gQL+2POTH4taZ5Xk4lsUyCKygVINhO54RE1iKMAhsr5qar7YQtE
 0/sxOdHACqO0BATNGQMbwhHYvn7RkfcqnoXihvLxOwPp8Nqdf4W55eOIwQCPfVaZ
 p2N/Iy2SYEO044gIH6N8KzoFLjH7hwvXDtaWD6o+Fq9NEAwLMRn55QyIGfyM//V3
 6igBD1vjPKwxX7qIIlNqsn1l80jk8weF5uWeU1g06lSFSSRNGQ9PGT7oTLmgiKrl
 8daE8E+1++rKmelANNe4CZ2Ig9tFNonaE8jRFuEJMEu/mZ5AAMksGh1M7eaz/ODU
 ov6AukUVFRr+EsfSYlARhEgBKEssuRSDRiYuwTYPGHhDmyf+wNSu0u9K3p99DdJv
 Fu55u4weVShg/ooOY3HtPWm3WnSoaxO7L4OOVd6yxeF2McUyRI9qFrADkpr4D3Sv
 9aCGX9UdIuuTbEuXf3jacMhPq97QP9KAshWNEmzo9OZze7vLcbDz3IVeTanHIUOg
 PieUgcnLW0dOxPz+tXYzOs3Ebf0pU8nQY3Eyj3YP7GsJkJMQh3usNC2t/oYhsnyj
 OsohGT97CyBAgGtdlGsiNknIKisD3LIwa4gkJ+4sFIvFr5jfdrfCTvl23DieI8bf
 9YETifEaSrJ+R/wzP0cMUUAauBC0gxS4HMPXGETe02U+ziGt3LU=
 =ZNzJ
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - Deselect ARCH_CORRECT_STACKTRACE_ON_KRETPROBE so that tests depending
   on it don't run (and fail) on arm64

 - Fix lockdep assert in the Arm SMMUv3 PMU driver

 - Fix the port and device ID bits setting in the Arm CMN perf driver

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  perf/arm-cmn: Ensure port and device id bits are set properly
  perf/arm-smmuv3: Fix lockdep assert in ->event_init()
  arm64: disable ARCH_CORRECT_STACKTRACE_ON_KRETPROBE tests
2024-11-30 14:33:44 -08:00
Linus Torvalds
55cb93fd24 Driver core changes for 6.13-rc1
Here is a small set of driver core changes for 6.13-rc1.
 
 Nothing major for this merge cycle, except for the 2 simple merge
 conflicts are here just to make life interesting.
 
 Included in here are:
   - sysfs core changes and preparations for more sysfs api cleanups that
     can come through all driver trees after -rc1 is out
   - fw_devlink fixes based on many reports and debugging sessions
   - list_for_each_reverse() removal, no one was using it!
   - last-minute seq_printf() format string bug found and fixed in many
     drivers all at once.
   - minor bugfixes and changes full details in the shortlog
 
 As mentioned above, there is 2 merge conflicts with your tree, one is
 where the file is removed (easy enough to resolve), the second is a
 build time error, that has been found in linux-next and the fix can be
 seen here:
 	https://lore.kernel.org/r/20241107212645.41252436@canb.auug.org.au
 
 Other than that, the changes here have been in linux-next with no other
 reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZ0lEog8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ym+0ACgw6wN+LkLVIHWhxTq5DYHQ0QCxY8AoJrRIcKe
 78h0+OU3OXhOy8JGz62W
 =oI5S
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is a small set of driver core changes for 6.13-rc1.

  Nothing major for this merge cycle, except for the two simple merge
  conflicts are here just to make life interesting.

  Included in here are:

   - sysfs core changes and preparations for more sysfs api cleanups
     that can come through all driver trees after -rc1 is out

   - fw_devlink fixes based on many reports and debugging sessions

   - list_for_each_reverse() removal, no one was using it!

   - last-minute seq_printf() format string bug found and fixed in many
     drivers all at once.

   - minor bugfixes and changes full details in the shortlog"

* tag 'driver-core-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (35 commits)
  Fix a potential abuse of seq_printf() format string in drivers
  cpu: Remove spurious NULL in attribute_group definition
  s390/con3215: Remove spurious NULL in attribute_group definition
  perf: arm-ni: Remove spurious NULL in attribute_group definition
  driver core: Constify bin_attribute definitions
  sysfs: attribute_group: allow registration of const bin_attribute
  firmware_loader: Fix possible resource leak in fw_log_firmware_info()
  drivers: core: fw_devlink: Fix excess parameter description in docstring
  driver core: class: Correct WARN() message in APIs class_(for_each|find)_device()
  cacheinfo: Use of_property_present() for non-boolean properties
  cdx: Fix cdx_mmap_resource() after constifying attr in ->mmap()
  drivers: core: fw_devlink: Make the error message a bit more useful
  phy: tegra: xusb: Set fwnode for xusb port devices
  drm: display: Set fwnode for aux bus devices
  driver core: fw_devlink: Stop trying to optimize cycle detection logic
  driver core: Constify attribute arguments of binary attributes
  sysfs: bin_attribute: add const read/write callback variants
  sysfs: implement all BIN_ATTR_* macros in terms of __BIN_ATTR()
  sysfs: treewide: constify attribute callback of bin_attribute::llseek()
  sysfs: treewide: constify attribute callback of bin_attribute::mmap()
  ...
2024-11-29 11:43:29 -08:00
Namhyung Kim
dfdf714fed perf/arm-cmn: Ensure port and device id bits are set properly
The portid_bits and deviceid_bits were set only for XP type nodes in
the arm_cmn_discover() and it confused other nodes to find XP nodes.
Copy the both bits from the XP nodes directly when it sets up a new
node.

Fixes: e79634b53e ("perf/arm-cmn: Refactor node ID handling. Again.")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20241121001334.331334-1-namhyung@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-11-25 18:53:05 +00:00
Chun-Tse Shao
02a55f2743 perf/arm-smmuv3: Fix lockdep assert in ->event_init()
Same as
https://lore.kernel.org/all/20240514180050.182454-1-namhyung@kernel.org/,
we should skip `for_each_sibling_event()` for group leader since it
doesn't have the ctx yet.

Fixes: f3c0eba287 ("perf: Add a few assertions")
Reported-by: Greg Thelen <gthelen@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Tuan Phan <tuanphan@os.amperecomputing.com>
Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20241108050806.3730811-1-ctshao@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-11-25 18:52:46 +00:00
Linus Torvalds
ba1f9c8fe3 arm64 updates for 6.13:
* Support for running Linux in a protected VM under the Arm Confidential
   Compute Architecture (CCA)
 
 * Guarded Control Stack user-space support. Current patches follow the
   x86 ABI of implicitly creating a shadow stack on clone(). Subsequent
   patches (already on the list) will add support for clone3() allowing
   finer-grained control of the shadow stack size and placement from libc
 
 * AT_HWCAP3 support (not running out of HWCAP2 bits yet but we are
   getting close with the upcoming dpISA support)
 
 * Other arch features:
 
   - In-kernel use of the memcpy instructions, FEAT_MOPS (previously only
     exposed to user; uaccess support not merged yet)
 
   - MTE: hugetlbfs support and the corresponding kselftests
 
   - Optimise CRC32 using the PMULL instructions
 
   - Support for FEAT_HAFT enabling ARCH_HAS_NONLEAF_PMD_YOUNG
 
   - Optimise the kernel TLB flushing to use the range operations
 
   - POE/pkey (permission overlays): further cleanups after bringing the
     signal handler in line with the x86 behaviour for 6.12
 
 * arm64 perf updates:
 
   - Support for the NXP i.MX91 PMU in the existing IMX driver
 
   - Support for Ampere SoCs in the Designware PCIe PMU driver
 
   - Support for Marvell's 'PEM' PCIe PMU present in the 'Odyssey' SoC
 
   - Support for Samsung's 'Mongoose' CPU PMU
 
   - Support for PMUv3.9 finer-grained userspace counter access control
 
   - Switch back to platform_driver::remove() now that it returns 'void'
 
   - Add some missing events for the CXL PMU driver
 
 * Miscellaneous arm64 fixes/cleanups:
 
   - Page table accessors cleanup: type updates, drop unused macros,
     reorganise arch_make_huge_pte() and clean up pte_mkcont(), sanity
     check addresses before runtime P4D/PUD folding
 
   - Command line override for ID_AA64MMFR0_EL1.ECV (advertising the
     FEAT_ECV for the generic timers) allowing Linux to boot with
     firmware deployments that don't set SCTLR_EL3.ECVEn
 
   - ACPI/arm64: tighten the check for the array of platform timer
     structures and adjust the error handling procedure in
     gtdt_parse_timer_block()
 
   - Optimise the cache flush for the uprobes xol slot (skip if no
     change) and other uprobes/kprobes cleanups
 
   - Fix the context switching of tpidrro_el0 when kpti is enabled
 
   - Dynamic shadow call stack fixes
 
   - Sysreg updates
 
   - Various arm64 kselftest improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmc5POIACgkQa9axLQDI
 XvEDYA//a3eeNkgMuGdnSCVcLz+zy+oNwAwboG/4X1DqL8jiCbI4npwugPx95RIA
 YZOUvo9T2aL3OyefpUHll4gFHqx9OwoZIig2F70TEUmlPsGUbh0KBkdfQF3xZPdl
 EwV0kHSGEqMWMBwsGJGwgCYrUaf1MUQzh1GBl7VJ2ts5XsJBaBeOyKkysij26wtZ
 V+aHq2IUx7qQS7+HC/4P6IoHxKziFcsCMovaKaynP4cw9xXBQbDMcNlHEwndOMyk
 pu2zrv7GG0j3KQuVP/2Alf5FKhmI0GVGP/6Nc/zsOmw96w8Kf7HfzEtkHawr2aRq
 rqg/c9ivzDn1p+fUBo4ZYtrRk4IAY+yKu6hdzdLTP5+bQrBTWTO9rjQVBm9FAGYT
 sCdEj1NqzvExvNHD7X6ut/GJ05lmce3K+qeSXSEysN9gqiT3eomYWMXrD2V2lxzb
 rIDDcb/icfaqjt14Mksh19r/rzNeq7noj9CGSmcqw0BHZfHzl38Lai6pdfYzCNyn
 vCM/c4c1D/WWX8/lifO1JZVbhDk1jy82Iphg2KEhL8iKPxDsKBBZLmYuU1oa7tMo
 WryGAz9+GQwd+W9chFuaOEtMnzvW2scEJ5Eb2fEf0Qj0aEurkL+C9dZR6o1GN77V
 DBUxtU628Ef4PJJGfbNCwZzdd8UPYG3a/mKfQQ3dz0oz2LySlW4=
 =wDot
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - Support for running Linux in a protected VM under the Arm
   Confidential Compute Architecture (CCA)

 - Guarded Control Stack user-space support. Current patches follow the
   x86 ABI of implicitly creating a shadow stack on clone(). Subsequent
   patches (already on the list) will add support for clone3() allowing
   finer-grained control of the shadow stack size and placement from
   libc

 - AT_HWCAP3 support (not running out of HWCAP2 bits yet but we are
   getting close with the upcoming dpISA support)

 - Other arch features:

     - In-kernel use of the memcpy instructions, FEAT_MOPS (previously
       only exposed to user; uaccess support not merged yet)

     - MTE: hugetlbfs support and the corresponding kselftests

     - Optimise CRC32 using the PMULL instructions

     - Support for FEAT_HAFT enabling ARCH_HAS_NONLEAF_PMD_YOUNG

     - Optimise the kernel TLB flushing to use the range operations

     - POE/pkey (permission overlays): further cleanups after bringing
       the signal handler in line with the x86 behaviour for 6.12

 - arm64 perf updates:

     - Support for the NXP i.MX91 PMU in the existing IMX driver

     - Support for Ampere SoCs in the Designware PCIe PMU driver

     - Support for Marvell's 'PEM' PCIe PMU present in the 'Odyssey' SoC

     - Support for Samsung's 'Mongoose' CPU PMU

     - Support for PMUv3.9 finer-grained userspace counter access
       control

     - Switch back to platform_driver::remove() now that it returns
       'void'

     - Add some missing events for the CXL PMU driver

 - Miscellaneous arm64 fixes/cleanups:

     - Page table accessors cleanup: type updates, drop unused macros,
       reorganise arch_make_huge_pte() and clean up pte_mkcont(), sanity
       check addresses before runtime P4D/PUD folding

     - Command line override for ID_AA64MMFR0_EL1.ECV (advertising the
       FEAT_ECV for the generic timers) allowing Linux to boot with
       firmware deployments that don't set SCTLR_EL3.ECVEn

     - ACPI/arm64: tighten the check for the array of platform timer
       structures and adjust the error handling procedure in
       gtdt_parse_timer_block()

     - Optimise the cache flush for the uprobes xol slot (skip if no
       change) and other uprobes/kprobes cleanups

     - Fix the context switching of tpidrro_el0 when kpti is enabled

     - Dynamic shadow call stack fixes

     - Sysreg updates

     - Various arm64 kselftest improvements

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (168 commits)
  arm64: tls: Fix context-switching of tpidrro_el0 when kpti is enabled
  kselftest/arm64: Try harder to generate different keys during PAC tests
  kselftest/arm64: Don't leak pipe fds in pac.exec_sign_all()
  arm64/ptrace: Clarify documentation of VL configuration via ptrace
  kselftest/arm64: Corrupt P0 in the irritator when testing SSVE
  acpi/arm64: remove unnecessary cast
  arm64/mm: Change protval as 'pteval_t' in map_range()
  kselftest/arm64: Fix missing printf() argument in gcs/gcs-stress.c
  kselftest/arm64: Add FPMR coverage to fp-ptrace
  kselftest/arm64: Expand the set of ZA writes fp-ptrace does
  kselftets/arm64: Use flag bits for features in fp-ptrace assembler code
  kselftest/arm64: Enable build of PAC tests with LLVM=1
  kselftest/arm64: Check that SVCR is 0 in signal handlers
  selftests/mm: Fix unused function warning for aarch64_write_signal_pkey()
  kselftest/arm64: Fix printf() compiler warnings in the arm64 syscall-abi.c tests
  kselftest/arm64: Fix printf() warning in the arm64 MTE prctl() test
  kselftest/arm64: Fix printf() compiler warnings in the arm64 fp tests
  kselftest/arm64: Fix build with stricter assemblers
  arm64/scs: Drop unused prototype __pi_scs_patch_vmlinux()
  arm64/scs: Deal with 64-bit relative offsets in FDE frames
  ...
2024-11-18 18:10:37 -08:00
Thomas Weißschuh
573bcbe17e perf: arm-ni: Remove spurious NULL in attribute_group definition
This NULL value is most-likely a copy-paste error from an array
definition. So far the NULL didn't have any effect.
As there will be a union in struct attribute_group at this location,
it will trigger a compiler warning.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20241118-sysfs-const-attribute_group-fixes-v1-1-48e0b0ad8cba@weissschuh.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-18 16:20:46 +01:00
Alexandre Ghiti
57f7c7dc78
drivers: perf: Fix wrong put_cpu() placement
Unfortunately, the wrong patch version was merged which places the
put_cpu() after enabling a static key, which is not safe as pointed by
Will [1], so move put_cpu() before to avoid this.

Fixes: 2840dadf0d ("drivers: perf: Fix smp_processor_id() use in preemptible code")
Reported-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/all/20240827125335.GD4772@willie-the-truck/ [1]
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20241112113422.617954-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-11-12 07:34:27 -08:00
Uwe Kleine-König
845fd2cbed perf: Switch back to struct platform_driver::remove()
After commit 0edb555a65 ("platform: Make platform_driver::remove()
return void") .remove() is (again) the right callback to implement for
platform drivers.

Convert all platform drivers below drivers/perf to use .remove(), with
the eventual goal to drop struct platform_driver::remove_new(). As
.remove() and .remove_new() have the same prototypes, conversion is done
by just changing the structure member name in the driver initializer.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/20241027180313.410964-2-u.kleine-koenig@baylibre.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-11-06 14:16:20 +00:00
Markuss Broks
9643aaa194 perf: arm_pmuv3: Add support for Samsung Mongoose PMU
Add support for the Samsung Mongoose CPU core PMU.

This just adds the names and links to DT compatible strings.

Co-developed-by: Maksym Holovach <nergzd@nergzd723.xyz>
Signed-off-by: Maksym Holovach <nergzd@nergzd723.xyz>
Signed-off-by: Markuss Broks <markuss.broks@gmail.com>
Link: https://lore.kernel.org/r/20241026-mongoose-pmu-v1-2-f1a7448054be@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-29 13:23:25 +00:00
Ilkka Koskinen
94b3ad10c2 perf/dwc_pcie: Fix typos in event names
Fix a few typos in event names

Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Reviewed-by: Jing Zhang <renyu.zj@linux.alibaba.com>
Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com>
Link: https://lore.kernel.org/r/20241008231824.5102-4-ilkka@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-29 13:01:14 +00:00
Ilkka Koskinen
83d511c3ca perf/dwc_pcie: Add support for Ampere SoCs
Add support for Ampere SoCs by adding Ampere's vendor ID to the
vendor list.

Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20241008231824.5102-2-ilkka@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-29 13:01:14 +00:00
Gowthami Thiagarajan
e1dce56443 perf/marvell: Marvell PEM performance monitor support
PCI Express Interface PMU includes various performance counters
to monitor the data that is transmitted over the PCIe link. The
counters track various inbound and outbound transactions which
includes separate counters for posted/non-posted/completion TLPs.
Also, inbound and outbound memory read requests along with their
latencies can also be monitored. Address Translation Services(ATS)events
such as ATS Translation, ATS Page Request, ATS Invalidation along with
their corresponding latencies are also supported.

The performance counters are 64 bits wide.

For instance,
perf stat -e ib_tlp_pr <workload>
tracks the inbound posted TLPs for the workload.

Co-developed-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: Gowthami Thiagarajan <gthiagarajan@marvell.com>
Link: https://lore.kernel.org/r/20241028055309.17893-1-gthiagarajan@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-28 17:35:35 +00:00
Rob Herring (Arm)
0bbff9ed81 perf/arm_pmuv3: Add PMUv3.9 per counter EL0 access control
Armv8.9/9.4 PMUv3.9 adds per counter EL0 access controls. Per counter
access is enabled with the UEN bit in PMUSERENR_EL1 register. Individual
counters are enabled/disabled in the PMUACR_EL1 register. When UEN is
set, the CR/ER bits control EL0 write access and must be set to disable
write access.

With the access controls, the clearing of unused counters can be
skipped.

KVM also configures PMUSERENR_EL1 in order to trap to EL2. UEN does not
need to be set for it since only PMUv3.5 is exposed to guests.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://lore.kernel.org/r/20241002184326.1105499-1-robh@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-28 17:27:15 +00:00
Ilkka Koskinen
759b5fc6cc perf/dwc_pcie: Convert the events with mixed case to lowercase
Group #1 events had both upper and lower case characters in their names.
Trying to count such events with perf tool results in an error:

$ perf stat -e dwc_rootport_10008/Tx_PCIe_TLP_Data_Payload/ sleep 1
event syntax error: 'dwc_rootport_10008/Tx_PCIe_TLP_Data_Payload/'
                     \___ Bad event or PMU

Unable to find PMU or event on a PMU of 'dwc_rootport_10008'

event syntax error: '..port_10008/Tx_PCIe_TLP_Data_Payload/'
                                  \___ unknown term 'Tx_PCIe_TLP_Data_Payload' for pmu 'dwc_rootport_10008'

valid terms: eventid,type,lane,config,config1,config2,config3,name,period,percore,metric-id

Run 'perf list' for a list of valid events

 Usage: perf stat [<options>] [<command>]

    -e, --event <event>   event selector. use 'perf list' to list available events

Perf tool assumes the event names are either in lower or upper case. This
is also mentioned in
Documentation/ABI/testing/sysfs-bus-event_source-devices-events

  "As performance monitoring event names are case
   insensitive in the perf tool, the perf tool only looks
   for lower or upper case event names in sysfs to avoid
   scanning the directory. It is therefore required the
   name of the event here is either lower or upper case."

Change the Group #1 events names to lower case.

Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20241016210136.65452-1-ilkka@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-24 12:37:31 +01:00
Davidlohr Bueso
48545b3eff perf/cxlpmu: Support missing events in 3.1 spec
Update the CXL PMU driver to support the new events introduced
in the latest revision. These are:

- read/write accesses with TEE constraints.
- S2M indicating Modified state.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/r/20241010025208.180458-1-dave@stgolabs.net
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-24 12:35:55 +01:00
Xu Yang
44798fe136 perf: imx_perf: add support for i.MX91 platform
This will add compatible and identifier for i.MX91 platform.

Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20240924061251.3387850-2-xu.yang_2@nxp.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-24 12:33:46 +01:00
Yunhui Cui
cc84767899 drivers perf: remove unused field pmu_node
The driver does not use the pmu_node field, so remove it.

Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240919034601.2453-1-cuiyunhui@bytedance.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-14 14:40:13 +01:00
Pu Lehui
c625154993
drivers/perf: riscv: Align errno for unsupported perf event
RISC-V perf driver does not yet support PERF_TYPE_BREAKPOINT. It would
be more appropriate to return -EOPNOTSUPP or -ENOENT for this type in
pmu_sbi_event_map. Considering that other implementations return -ENOENT
for unsupported perf types, let's synchronize this behavior. Due to this
reason, a riscv bpf testcases perf_skip fail. Meanwhile, align that
behavior to the rest of proper place.

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Fixes: 9b3e150e31 ("RISC-V: Add a simple platform driver for RISC-V legacy perf")
Fixes: 16d3b1af09 ("perf: RISC-V: Check standard event availability")
Fixes: e999143459 ("RISC-V: Add perf platform driver based on SBI PMU extension")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240831071520.1630360-1-pulehui@huaweicloud.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-01 02:47:39 -07:00
Linus Torvalds
97d8894b6f RISC-V Patches for the 6.12 Merge Window, Part 1
* Support for using Zkr to seed KASLR.
 * Support for IPI-triggered CPU backtracing.
 * Support for generic CPU vulnerabilities reporting to userspace.
 * A few cleanups for missing licenses.
 * The size limit on the XIP kernel has been removed.
 * Support for tracing userspace stacks.
 * Support for the Svvptc extension.
 * Various cleanups and fixes throughout the tree.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmbykZITHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiTDLEACABTPlMX+Ke1lMYWj6MLBTXMSlQJ6w
 TKLVk3slp4POPsd+ViWy4J6EJDpCkNp6iK30Bv4AGainA3RJnbS7neKYy+MTw0ZV
 pJWTn73sxHeF+APnZ+geFYcmyteL/SZplgHgwLipFaojs7i/+tOvLFRRD3LueCz6
 Q6Muvke9R5uN6LL3tUrmIhJeyZjOwJtKEYoRz6Fo5Tq3RRK0VHYZkdpqRQ86rr9U
 yNbRNlBLlANstq8xjtiwm22ImPGIsvvhs0AvFft0H72zhf3Tfy0VcTLlDJYwkdKN
 Bz59v4N4jg1ajXuAsj4/BQdj5uRkzJNZ3Yq1yvMmGI+2UV1tInHEMeZi63+4gs8F
 0FYCziA+j6/08u8v8CrwdDOpJ9Iq/NMV6359bt5ySu3rM6QnPRGnHGkv4Ptk9Oyh
 sSPiuHS8YxpRBOB8xXNp5xFipTZ4+65VqCz253pAsbbp5aZ/o9Jw/bbBFFWcSsRZ
 tV2xhELVzPiC7dowfYkQhcNZOLlCaJ/Mvo2nMOtjbUwzaXkcjIRcYEy7+dTlfXyr
 3h5sStjWAKEmtLEvCYSI+lAT544tbV1x+lCMJGuvztapsTMtAP4GpQKblXl5qqnV
 L+VWIPJV1elI26H/p/Max9Z1s48NtfwoJSRL7Rnaya6uJ3iWG/BtajHlFbvgOkJf
 ObPZ//Yr9e91gg==
 =zDwL
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.12-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support using Zkr to seed KASLR

 - Support IPI-triggered CPU backtracing

 - Support for generic CPU vulnerabilities reporting to userspace

 - A few cleanups for missing licenses

 - The size limit on the XIP kernel has been removed

 - Support for tracing userspace stacks

 - Support for the Svvptc extension

 - Various cleanups and fixes throughout the tree

* tag 'riscv-for-linus-6.12-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (47 commits)
  crash: Fix riscv64 crash memory reserve dead loop
  perf/riscv-sbi: Add platform specific firmware event handling
  tools: Optimize ring buffer for riscv
  tools: Add riscv barrier implementation
  RISC-V: Don't have MAX_PHYSMEM_BITS exceed phys_addr_t
  ACPI: NUMA: initialize all values of acpi_early_node_map to NUMA_NO_NODE
  riscv: Enable bitops instrumentation
  riscv: Omit optimized string routines when using KASAN
  ACPI: RISCV: Make acpi_numa_get_nid() to be static
  riscv: Randomize lower bits of stack address
  selftests: riscv: Allow mmap test to compile on 32-bit
  riscv: Make riscv_isa_vendor_ext_andes array static
  riscv: Use LIST_HEAD() to simplify code
  riscv: defconfig: Disable RZ/Five peripheral support
  RISC-V: Implement kgdb_roundup_cpus() to enable future NMI Roundup
  riscv: avoid Imbalance in RAS
  riscv: cacheinfo: Add back init_cache_level() function
  riscv: Remove unused _TIF_WORK_MASK
  drivers/perf: riscv: Remove redundant macro check
  riscv: define ILLEGAL_POINTER_VALUE for 64bit
  ...
2024-09-24 10:59:17 -07:00
Mayuresh Chitale
f0c9363db2
perf/riscv-sbi: Add platform specific firmware event handling
The SBI v2.0 specification pointed to by the link below reserves the
event code 0xffff for platform specific firmware events. Update the driver
to be able to parse and program such events. The platform specific
firmware events must now be specified in the perf command as below:
perf stat -e rCxxx ...
where bits[63:62] = 0x3 of the event config indicate a platform specific
firmware event and xxx indicate the actual event code which is passed
as the event data.

Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Link: https://github.com/riscv-non-isa/riscv-sbi-doc/releases/download/v2.0/riscv-sbi.pdf
Link: https://lore.kernel.org/r/20240812051109.6496-1-mchitale@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-20 05:58:11 -07:00
Linus Torvalds
114143a595 arm64 updates for 6.12
ACPI:
 * Enable PMCG erratum workaround for HiSilicon HIP10 and 11 platforms.
 * Ensure arm64-specific IORT header is covered by MAINTAINERS.
 
 CPU Errata:
 * Enable workaround for hardware access/dirty issue on Ampere-1A cores.
 
 Memory management:
 * Define PHYSMEM_END to fix a crash in the amdgpu driver.
 * Avoid tripping over invalid kernel mappings on the kexec() path.
 * Userspace support for the Permission Overlay Extension (POE) using
   protection keys.
 
 Perf and PMUs:
 * Add support for the "fixed instruction counter" extension in the CPU
   PMU architecture.
 * Extend and fix the event encodings for Apple's M1 CPU PMU.
 * Allow LSM hooks to decide on SPE permissions for physical profiling.
 * Add support for the CMN S3 and NI-700 PMUs.
 
 Confidential Computing:
 * Add support for booting an arm64 kernel as a protected guest under
   Android's "Protected KVM" (pKVM) hypervisor.
 
 Selftests:
 * Fix vector length issues in the SVE/SME sigreturn tests
 * Fix build warning in the ptrace tests.
 
 Timers:
 * Add support for PR_{G,S}ET_TSC so that 'rr' can deal with
   non-determinism arising from the architected counter.
 
 Miscellaneous:
 * Rework our IPI-based CPU stopping code to try NMIs if regular IPIs
   don't succeed.
 * Minor fixes and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmbkVNEQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNKeIB/9YtbN7JMgsXktM94GP03r3tlFF36Y1S51S
 +zdDZclAVZCTCZN+PaFeAZ/+ah2EQYrY6rtDoHUSEMQdF9kH+ycuIPDTwaJ4Qkam
 QKXMpAgtY/4yf2rX4lhDF8rEvkhLDsu7oGDhqUZQsA33GrMBHfgA3oqpYwlVjvGq
 gkm7olTo9LdWAxkPpnjGrjB6Mv5Dq8dJRhW+0Q5AntI5zx3RdYGJZA9GUSzyYCCt
 FIYOtMmWPkQ0kKxIVxOxAOm/ubhfyCs2sjSfkaa3vtvtt+Yjye1Xd81rFciIbPgP
 QlK/Mes2kBZmjhkeus8guLI5Vi7tx3DQMkNqLXkHAAzOoC4oConE
 =6osL
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "The highlights are support for Arm's "Permission Overlay Extension"
  using memory protection keys, support for running as a protected guest
  on Android as well as perf support for a bunch of new interconnect
  PMUs.

  Summary:

  ACPI:
   - Enable PMCG erratum workaround for HiSilicon HIP10 and 11
     platforms.
   - Ensure arm64-specific IORT header is covered by MAINTAINERS.

  CPU Errata:
   - Enable workaround for hardware access/dirty issue on Ampere-1A
     cores.

  Memory management:
   - Define PHYSMEM_END to fix a crash in the amdgpu driver.
   - Avoid tripping over invalid kernel mappings on the kexec() path.
   - Userspace support for the Permission Overlay Extension (POE) using
     protection keys.

  Perf and PMUs:
   - Add support for the "fixed instruction counter" extension in the
     CPU PMU architecture.
   - Extend and fix the event encodings for Apple's M1 CPU PMU.
   - Allow LSM hooks to decide on SPE permissions for physical
     profiling.
   - Add support for the CMN S3 and NI-700 PMUs.

  Confidential Computing:
   - Add support for booting an arm64 kernel as a protected guest under
     Android's "Protected KVM" (pKVM) hypervisor.

  Selftests:
   - Fix vector length issues in the SVE/SME sigreturn tests
   - Fix build warning in the ptrace tests.

  Timers:
   - Add support for PR_{G,S}ET_TSC so that 'rr' can deal with
     non-determinism arising from the architected counter.

  Miscellaneous:
   - Rework our IPI-based CPU stopping code to try NMIs if regular IPIs
     don't succeed.
   - Minor fixes and cleanups"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (94 commits)
  perf: arm-ni: Fix an NULL vs IS_ERR() bug
  arm64: hibernate: Fix warning for cast from restricted gfp_t
  arm64: esr: Define ESR_ELx_EC_* constants as UL
  arm64: pkeys: remove redundant WARN
  perf: arm_pmuv3: Use BR_RETIRED for HW branch event if enabled
  MAINTAINERS: List Arm interconnect PMUs as supported
  perf: Add driver for Arm NI-700 interconnect PMU
  dt-bindings/perf: Add Arm NI-700 PMU
  perf/arm-cmn: Improve format attr printing
  perf/arm-cmn: Clean up unnecessary NUMA_NO_NODE check
  arm64/mm: use lm_alias() with addresses passed to memblock_free()
  mm: arm64: document why pte is not advanced in contpte_ptep_set_access_flags()
  arm64: Expose the end of the linear map in PHYSMEM_END
  arm64: trans_pgd: mark PTEs entries as valid to avoid dead kexec()
  arm64/mm: Delete __init region from memblock.reserved
  perf/arm-cmn: Support CMN S3
  dt-bindings: perf: arm-cmn: Add CMN S3
  perf/arm-cmn: Refactor DTC PMU register access
  perf/arm-cmn: Make cycle counts less surprising
  perf/arm-cmn: Improve build-time assertion
  ...
2024-09-16 06:55:07 +02:00
Xiao Wang
1e206fad76
drivers/perf: riscv: Remove redundant macro check
The macro CONFIG_RISCV_PMU must have been defined when riscv_pmu.c gets
compiled, so this patch removes the redundant check.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20240708121224.1148154-1-xiao.w.wang@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 20:15:48 -07:00
Dan Carpenter
2e091a805f perf: arm-ni: Fix an NULL vs IS_ERR() bug
The devm_ioremap() function never returns error pointers, it returns a
NULL pointer if there is an error.

Fixes: 4d5a7680f2 ("perf: Add driver for Arm NI-700 interconnect PMU")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/04d6ccc3-6d31-4f0f-ab0f-7a88342cc09a@stanley.mountain
Signed-off-by: Will Deacon <will@kernel.org>
2024-09-12 12:49:33 +01:00
Alexandre Ghiti
2840dadf0d
drivers: perf: Fix smp_processor_id() use in preemptible code
As reported in [1], the use of smp_processor_id() in
pmu_sbi_device_probe() must be protected by disabling the preemption, so
simple use get_cpu()/put_cpu() instead.

Reported-by: Nam Cao <namcao@linutronix.de>
Closes: https://lore.kernel.org/linux-riscv/20240820074925.ReMKUPP3@linutronix.de/ [1]
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Tested-by: Nam Cao <namcao@linutronix.de>
Fixes: a8625217a0 ("drivers/perf: riscv: Implement SBI PMU snapshot function")
Reported-by: Andrea Parri <parri.andrea@gmail.com>
Tested-by: Andrea Parri <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20240826165210.124696-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-10 20:28:02 -07:00
Ilkka Koskinen
5967a19f1c perf: arm_pmuv3: Use BR_RETIRED for HW branch event if enabled
The PMU driver attempts to use PC_WRITE_RETIRED for the HW branch event,
if enabled. However, PC_WRITE_RETIRED counts only taken branches,
whereas BR_RETIRED counts also non-taken ones.

Furthermore, perf uses HW branch event to calculate branch misses ratio,
implying BR_RETIRED is the correct event to count.

We keep PC_WRITE_RETIRED still as an option in case BR_RETIRED isn't
implemented.

Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20240906191539.4847-1-ilkka@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-09-09 15:28:02 +01:00
Robin Murphy
4d5a7680f2 perf: Add driver for Arm NI-700 interconnect PMU
The Arm NI-700 Network-on-Chip Interconnect has a relatively
straightforward design with a hierarchy of voltage, power, and clock
domains, where each clock domain then contains a number of interface
units and a PMU which can monitor events thereon. As such, it begets a
relatively straightforward driver to interface those PMUs with perf.

Even more so than with arm-cmn, users will require detailed knowledge of
the wider system topology in order to meaningfully analyse anything,
since the interconnect itself cannot know what lies beyond the boundary
of each inscrutably-numbered interface. Given that, for now they are
also expected to refer to the NI-700 documentation for the relevant
event IDs to provide as well. An identifier is implemented so we can
come back and add jevents if anyone really wants to.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/9933058d0ab8138c78a61cd6852ea5d5ff48e393.1725470837.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-09-06 12:58:28 +01:00
Robin Murphy
f32efa3e4b perf/arm-cmn: Improve format attr printing
Take full advantage of our formats being stored in bitfield form, and
make the printing even more robust and simple by letting printk do all
the hard work of formatting bitlists.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/50459f2d48fc62310a566863dbf8a7c14361d363.1725474584.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-09-06 12:58:06 +01:00
Robin Murphy
f04b611e66 perf/arm-cmn: Clean up unnecessary NUMA_NO_NODE check
Checking for NUMA_NO_NODE is a misleading and, on reflection, entirely
unnecessary micro-optimisation. If it ever did happen that an incoming
CPU has no NUMA affinity while the current CPU does, a questionably-
useful PMU migration isn't the biggest thing wrong with that picture...

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/00634da33c21269a00844140afc7cc3a2ac1eb4d.1725474584.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-09-06 12:58:06 +01:00
Robin Murphy
0dc2f4963f perf/arm-cmn: Support CMN S3
CMN S3 is the latest and greatest evolution for 2024, although most of
the new features don't impact the PMU, so from our point of view it ends
up looking a lot like CMN-700 r3 still. We have some new device types to
ignore, a mildly irritating rearrangement of the register layouts, and a
scary new configuration option that makes it potentially unsafe to even
walk the full discovery tree, let alone attempt to use the PMU.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/2ec9eec5b6bf215a9886f3b69e3b00e4cd85095c.1725296395.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-09-04 16:04:08 +01:00
Robin Murphy
67acca3504 perf/arm-cmn: Refactor DTC PMU register access
Annoyingly, we're soon going to have to cope with PMU registers moving
about. This will mostly be straightforward, except for the hard-coding
of CMN_PMU_OFFSET for the DTC PMU registers. As a first step, refactor
those accessors to allow for encapsulating a variable offset without
making a big mess all over. As a bonus, we can repack the arm_cmn_dtc
structure to accommodate the new pointer without growing any larger,
since irq_friend only encodes a range of +/-3.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/fc677576fae7b5b55780e5b245a4ef6ea1b30daf.1725296395.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-09-04 16:04:08 +01:00
Robin Murphy
c5b15ddf11 perf/arm-cmn: Make cycle counts less surprising
By default, CMN has automatic clock-gating with the implication that
a DTC's cycle counter may not increment while the DTC is sufficiently
idle. Given that we may have up to 4 DTCs to choose from when scheduling
a cycles event, this may potentially lead to surprising results if
trying to measure metrics based on activity in a different DTC domain
from where cycles end up being counted. Furthermore, since the details
of internal clock gating are not documented, we can't even reason about
what "active" cycles for a DTC actually mean relative to the activity of
other nodes within the same nominal DTC domain.

Make the reasonable assumption that if the user wants to count cycles,
they almost certainly want to count all of the cycles, and disable clock
gating while a DTC's cycle counter is in use.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/c47cfdc09e907b1d7753d142a7e659982cceb246.1725296395.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-09-04 16:04:08 +01:00
Robin Murphy
ff436cee69 perf/arm-cmn: Improve build-time assertion
These days we can use static_assert() in the logical place rather than
jamming a BUILD_BUG_ON() into the nearest function scope.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/224ee8286f299100f1c768edb254edc898539f50.1725296395.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-09-04 16:04:08 +01:00
Robin Murphy
359414b33e perf/arm-cmn: Ensure dtm_idx is big enough
While CMN_MAX_DIMENSION was bumped to 12 for CMN-650, that only supports
up to a 10x10 mesh, so bumping dtm_idx to 256 bits at the time worked
out OK in practice. However CMN-700 did finally support up to 144 XPs,
and thus needs a worst-case 288 bits of dtm_idx for an aggregated XP
event on a maxed-out config. Oops.

Fixes: 23760a0144 ("perf/arm-cmn: Add CMN-700 support")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/e771b358526a0d7fc06efee2c3a2fdc0c9f51d44.1725296395.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-09-04 16:04:08 +01:00
Robin Murphy
88b63a82c8 perf/arm-cmn: Fix CCLA register offset
Apparently pmu_event_sel is offset by 8 for all CCLA nodes, not just
the CCLA_RNI combination type.

Fixes: 23760a0144 ("perf/arm-cmn: Add CMN-700 support")
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/6e7bb06fef6046f83e7647aad0e5be544139763f.1725296395.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-09-04 16:04:07 +01:00
Robin Murphy
e79634b53e perf/arm-cmn: Refactor node ID handling. Again.
The scope of the "extra device ports" configuration is not made clear by
the CMN documentation - so far we've assumed it applies globally, based
on the sole example which suggests as much. However it transpires that
this is incorrect, and the format does in fact vary based on each
individual XP's port configuration. As a consequence, we're currenly
liable to decode the port/device indices from a node ID incorrectly,
thus program the wrong event source in the DTM leading to bogus event
counts, and also show device topology on the wrong ports in debugfs.

To put this right, rework node IDs yet again to carry around the
additional data necessary to decode them properly per-XP. At this point
the notion of fully decomposing an ID becomes more impractical than it's
worth, so unabstracting the XY mesh coordinates (where 2/3 users were
just debug anyway) ends up leaving things a bit simpler overall.

Fixes: 60d1504070 ("perf/arm-cmn: Support new IP features")
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/5195f990152fc37adba5fbf5929a6b11063d9f09.1725296395.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-09-04 16:04:07 +01:00
Yicong Yang
d1c93d5c67 drivers/perf: hisi_pcie: Export supported Root Ports [bdf_min, bdf_max]
Currently users can get the Root Ports supported by the PCIe PMU by
"bus" sysfs attributes which indicates the PCIe bus number where
Root Ports are located. This maybe insufficient since Root Ports
supported by different PCIe PMUs may be located on the same PCIe bus.
So export the BDF range the Root Ports additionally.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240829090332.28756-4-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-08-30 11:43:10 +01:00
Yicong Yang
17bf68aeb3 drivers/perf: hisi_pcie: Fix TLP headers bandwidth counting
We make the initial value of event ctrl register as HISI_PCIE_INIT_SET
and modify according to the user options. This will make TLP headers
bandwidth only counting never take effect since HISI_PCIE_INIT_SET
configures to count the TLP payloads bandwidth. Fix this by making
the initial value of event ctrl register as 0.

Fixes: 17d573984d ("drivers/perf: hisi: Add TLP filter support")
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240829090332.28756-3-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-08-30 11:43:10 +01:00
Yicong Yang
daecd3373a drivers/perf: hisi_pcie: Record hardware counts correctly
Currently we set the period and record it as the initial value of the
counter without checking it's set to the hardware successfully or not.
However the counter maybe unwritable if the target event is unsupported
by the device. In such case we will pass user a wrong count:

[start counts when setting the period]
hwc->prev_count = 0x8000000000000000
device.counter_value = 0 // the counter is not set as the period
[when user reads the counter]
event->count = device.counter_value - hwc->prev_count
             = 0x8000000000000000 // wrong. should be 0.

Fix this by record the hardware counter counts correctly when setting
the period.

Fixes: 8404b0fbc7 ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU")
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240829090332.28756-2-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-08-30 11:43:09 +01:00
James Clark
5e9629d0ae drivers/perf: arm_spe: Use perf_allow_kernel() for permissions
Use perf_allow_kernel() for 'pa_enable' (physical addresses),
'pct_enable' (physical timestamps) and context IDs. This means that
perf_event_paranoid is now taken into account and LSM hooks can be used,
which is more consistent with other perf_event_open calls. For example
PERF_SAMPLE_PHYS_ADDR uses perf_allow_kernel() rather than just
perfmon_capable().

This also indirectly fixes the following error message which is
misleading because perf_event_paranoid is not taken into account by
perfmon_capable():

  $ perf record -e arm_spe/pa_enable/

  Error:
  Access to performance monitoring and observability operations is
  limited. Consider adjusting /proc/sys/kernel/perf_event_paranoid
  setting ...

Suggested-by: Al Grant <al.grant@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20240827145113.1224604-1-james.clark@linaro.org
Link: https://lore.kernel.org/all/20240807120039.GD37996@noisy.programming.kicks-ass.net/
Signed-off-by: Will Deacon <will@kernel.org>
2024-08-30 11:42:24 +01:00
Krishna chaitanya chundru
db9e7a83d3 perf/dwc_pcie: Add support for QCOM vendor devices
Update the vendor table with QCOM PCIe vendorid.

Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240816-dwc_pmu_fix-v2-4-198b8ab1077c@quicinc.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-08-23 16:07:26 +01:00
Krishna chaitanya chundru
b94b05478f perf/dwc_pcie: Always register for PCIe bus notifier
When the PCIe devices are discovered late, the driver can't find
the PCIe devices and returns in the init without registering with
the bus notifier. Due to that the devices which are discovered late
the driver can't register for this.

Register for bus notifier & driver even if the device is not found
as part of init.

Fixes: af9597adc2 ("drivers/perf: add DesignWare PCIe PMU driver")
Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240816-dwc_pmu_fix-v2-3-198b8ab1077c@quicinc.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-08-23 16:07:26 +01:00
Krishna chaitanya chundru
e669388537 perf/dwc_pcie: Fix registration issue in multi PCIe controller instances
When there are multiple of instances of PCIe controllers, registration
to perf driver fails with this error.
sysfs: cannot create duplicate filename '/devices/platform/dwc_pcie_pmu.0'
CPU: 0 PID: 166 Comm: modprobe Not tainted 6.10.0-rc2-next-20240607-dirty
Hardware name: Qualcomm SA8775P Ride (DT)
Call trace:
 dump_backtrace.part.8+0x98/0xf0
 show_stack+0x14/0x1c
 dump_stack_lvl+0x74/0x88
 dump_stack+0x14/0x1c
 sysfs_warn_dup+0x60/0x78
 sysfs_create_dir_ns+0xe8/0x100
 kobject_add_internal+0x94/0x224
 kobject_add+0xa8/0x118
 device_add+0x298/0x7b4
 platform_device_add+0x1a0/0x228
 platform_device_register_full+0x11c/0x148
 dwc_pcie_register_dev+0x74/0xf0 [dwc_pcie_pmu]
 dwc_pcie_pmu_init+0x7c/0x1000 [dwc_pcie_pmu]
 do_one_initcall+0x58/0x1c0
 do_init_module+0x58/0x208
 load_module+0x1804/0x188c
 __do_sys_init_module+0x18c/0x1f0
 __arm64_sys_init_module+0x14/0x1c
 invoke_syscall+0x40/0xf8
 el0_svc_common.constprop.1+0x70/0xf4
 do_el0_svc+0x18/0x20
 el0_svc+0x28/0xb0
 el0t_64_sync_handler+0x9c/0xc0
 el0t_64_sync+0x160/0x164
kobject: kobject_add_internal failed for dwc_pcie_pmu.0 with -EEXIST,
don't try to register things with the same name in the same directory.

This is because of having same bdf value for devices under two different
controllers.

Update the logic to use sbdf which is a unique number in case of
multi instance also.

Fixes: af9597adc2 ("drivers/perf: add DesignWare PCIe PMU driver")
Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240816-dwc_pmu_fix-v2-1-198b8ab1077c@quicinc.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-08-23 16:07:25 +01:00
Jing Zhang
a3dd920977 drivers/perf: Fix ali_drw_pmu driver interrupt status clearing
The alibaba_uncore_pmu driver forgot to clear all interrupt status
in the interrupt processing function. After the PMU counter overflow
interrupt occurred, an interrupt storm occurred, causing the system
to hang.

Therefore, clear the correct interrupt status in the interrupt handling
function to fix it.

Fixes: cf7b61073e ("drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC")
Signed-off-by: Jing Zhang <renyu.zj@linux.alibaba.com>
Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/1724297611-20686-1-git-send-email-renyu.zj@linux.alibaba.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-08-23 16:03:56 +01:00
Yangyu Chen
3cce331ee2 drivers/perf: apple_m1: add known PMU events
This patch adds known PMU events that can be found on /usr/share/kpep in
macOS. The m1_pmu_events and m1_pmu_event_affinity are generated from
the script [1], which consumes the plist file from Apple. And then added
these events to m1_pmu_perf_map and m1_pmu_event_attrs with Apple's
documentation [2].

Link: https://github.com/cyyself/m1-pmu-gen [1]
Link: https://developer.apple.com/download/apple-silicon-cpu-optimization-guide/ [2]
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
Acked-by: Hector Martin <marcan@marcan.st>
Link: https://lore.kernel.org/r/tencent_C5DA658E64B8D13125210C8D707CD8823F08@qq.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-08-23 15:52:48 +01:00
Rob Herring (Arm)
d8226d8cfb perf: arm_pmuv3: Add support for Armv9.4 PMU instruction counter
Armv9.4/8.9 PMU adds optional support for a fixed instruction counter
similar to the fixed cycle counter. Support for the feature is indicated
in the ID_AA64DFR1_EL1 register PMICNTR field. The counter is not
accessible in AArch32.

Existing userspace using direct counter access won't know how to handle
the fixed instruction counter, so we have to avoid using the counter
when user access is requested.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20240731-arm-pmu-3-9-icntr-v3-7-280a8d7ff465@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2024-08-16 13:09:12 +01:00
Rob Herring (Arm)
126d7d7cce arm64: perf/kvm: Use a common PMU cycle counter define
The PMUv3 and KVM code each have a define for the PMU cycle counter
index. Move KVM's define to a shared location and use it for PMUv3
driver.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20240731-arm-pmu-3-9-icntr-v3-5-280a8d7ff465@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2024-08-16 13:09:12 +01:00
Rob Herring (Arm)
a4a6e2078d perf: arm_pmuv3: Prepare for more than 32 counters
Various PMUv3 registers which are a mask of counters are 64-bit
registers, but the accessor functions take a u32. This has been fine as
the upper 32-bits have been RES0 as there has been a maximum of 32
counters prior to Armv9.4/8.9. With Armv9.4/8.9, a 33rd counter is
added. Update the accessor functions to use a u64 instead.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20240731-arm-pmu-3-9-icntr-v3-2-280a8d7ff465@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2024-08-16 13:09:12 +01:00
Rob Herring (Arm)
bf5ffc8c80 perf: arm_pmu: Remove event index to counter remapping
Xscale and Armv6 PMUs defined the cycle counter at 0 and event counters
starting at 1 and had 1:1 event index to counter numbering. On Armv7 and
later, this changed the cycle counter to 31 and event counters start at
0. The drivers for Armv7 and PMUv3 kept the old event index numbering
and introduced an event index to counter conversion. The conversion uses
masking to convert from event index to a counter number. This operation
relies on having at most 32 counters so that the cycle counter index 0
can be transformed to counter number 31.

Armv9.4 adds support for an additional fixed function counter
(instructions) which increases possible counters to more than 32, and
the conversion won't work anymore as a simple subtract and mask. The
primary reason for the translation (other than history) seems to be to
have a contiguous mask of counters 0-N. Keeping that would result in
more complicated index to counter conversions. Instead, store a mask of
available counters rather than just number of events. That provides more
information in addition to the number of events.

No (intended) functional changes.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20240731-arm-pmu-3-9-icntr-v3-1-280a8d7ff465@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2024-08-16 13:09:11 +01:00
Rob Herring (Arm)
48b035121a perf: arm_pmu: Use of_property_present()
Use of_property_present() to test for property presence rather than
of_find_property(). This is part of a larger effort to remove callers
of of_find_property() and similar functions. of_find_property() leaks
the DT struct property and data pointers which is a problem for
dynamically allocated nodes which may be freed.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20240731191312.1710417-15-robh@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2024-08-16 13:08:21 +01:00
Shifrin Dmitry
941a8e9b7a
perf: riscv: Fix selecting counters in legacy mode
It is required to check event type before checking event config.
Events with the different types can have the same config.
This check is missed for legacy mode code

For such perf usage:
    sysctl -w kernel.perf_user_access=2
    perf stat -e cycles,L1-dcache-loads --
driver will try to force both events to CYCLE counter.

This commit implements event type check before forcing
events on the special counters.

Signed-off-by: Shifrin Dmitry <dmitry.shifrin@syntacore.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Fixes: cc4c07c89a ("drivers: perf: Implement perf event mmap support in the SBI backend")
Link: https://lore.kernel.org/r/20240729125858.630653-1-dmitry.shifrin@syntacore.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-01 07:15:13 -07:00
Linus Torvalds
c9f33436d8 RISC-V Patches for the 6.11 Merge Window, Part 2
* Support for NUMA (via SRAT and SLIT), console output (via SPCR), and
   cache info (via PPTT) on ACPI-based systems.
 * The trap entry/exit code no longer breaks the return address stack
   predictor on many systems, which results in an improvement to trap
   latency.
 * Support for HAVE_ARCH_STACKLEAK.
 * The sv39 linear map has been extended to support 128GiB mappings.
 * The frequency of the mtime CSR is now visible via hwprobe.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmaj2EYTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiVG3D/9kNHTI09iPDJd6fTChE3cpMxy7xXXE
 URX3Avu+gYsJmIbYyg4RnQ8FGFN7icKBCrQqs7JmLliU0NU+YMcCcjsJA2QaivbD
 VAlaex1qNcvNGteHrpbqhr3Zs4zw8GlBkB3KFTLyPAp61bybGo0a/A5ONJ7ScQIW
 RWHewAPgb86cQ0Q34JpO87TqvMM0KMvhQP5dip+olaFjLRBzhXmGFZfHqA80kTWl
 0ytYclVCHZMtO/5mnQpuIOVs1IKw9L4wa0sivOQF0iLTqfKDFALa6yZsThHA/w3e
 JVuBAdQhcPZ3fgO2fUfJPlW16GmRC2/tdiFg5NFw8k4vo7DYBwX55ztPKXqDrJDM
 8ah85IeLiPar/A/uHdn6bPjK+aGMuzklKF50r62XXAc2fL8mza1sdvKCVOy2EOLn
 JyGI9c/10KpvN/DW8g7hPefhvbx4+tCKkFcPqf++VQha6W8cQdCKi+Li0Pm8TTnp
 XPQjIvSlDDG1Pl4ofgBSFoyB8pkBXNzvv8NZp+YYtnqSOLAKaZuP+KwA8TwHdvGM
 pdCXcL3KHiLy4/pJWEoNTutD0mbJ7PUIb2P/KkjqYDgp4F1n0Hg+/aeSIp+7a4Pv
 yTBctIGxrlriQMIdtWCR8tyhcPP4pDpGYkW0K15EE16G0NK0fjD89LEXYqT6ae2R
 C0QgiwnVe/eopg==
 =zeUn
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.11-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull more RISC-V updates from Palmer Dabbelt:

 - Support for NUMA (via SRAT and SLIT), console output (via SPCR), and
   cache info (via PPTT) on ACPI-based systems.

 - The trap entry/exit code no longer breaks the return address stack
   predictor on many systems, which results in an improvement to trap
   latency.

 - Support for HAVE_ARCH_STACKLEAK.

 - The sv39 linear map has been extended to support 128GiB mappings.

 - The frequency of the mtime CSR is now visible via hwprobe.

* tag 'riscv-for-linus-6.11-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (21 commits)
  RISC-V: Provide the frequency of time CSR via hwprobe
  riscv: Extend sv39 linear mapping max size to 128G
  riscv: enable HAVE_ARCH_STACKLEAK
  riscv: signal: Remove unlikely() from WARN_ON() condition
  riscv: Improve exception and system call latency
  RISC-V: Select ACPI PPTT drivers
  riscv: cacheinfo: initialize cacheinfo's level and type from ACPI PPTT
  riscv: cacheinfo: remove the useless input parameter (node) of ci_leaf_init()
  RISC-V: ACPI: Enable SPCR table for console output on RISC-V
  riscv: boot: remove duplicated targets line
  trace: riscv: Remove deprecated kprobe on ftrace support
  riscv: cpufeature: Extract common elements from extension checking
  riscv: Introduce vendor variants of extension helpers
  riscv: Add vendor extensions to /proc/cpuinfo
  riscv: Extend cpufeature.c to detect vendor extensions
  RISC-V: run savedefconfig for defconfig
  RISC-V: hwprobe: sort EXT_KEY()s in hwprobe_isa_ext0() alphabetically
  ACPI: NUMA: replace pr_info with pr_debug in arch_acpi_numa_init
  ACPI: NUMA: change the ACPI_NUMA to a hidden option
  ACPI: NUMA: Add handler for SRAT RINTC affinity structure
  ...
2024-07-27 10:14:34 -07:00
Joel Granados
78eb4ea25c sysctl: treewide: constify the ctl_table argument of proc_handlers
const qualify the struct ctl_table argument in the proc_handler function
signatures. This is a prerequisite to moving the static ctl_table
structs into .rodata data which will ensure that proc_handler function
pointers cannot be modified.

This patch has been generated by the following coccinelle script:

```
  virtual patch

  @r1@
  identifier ctl, write, buffer, lenp, ppos;
  identifier func !~ "appldata_(timer|interval)_handler|sched_(rt|rr)_handler|rds_tcp_skbuf_handler|proc_sctp_do_(hmac_alg|rto_min|rto_max|udp_port|alpha_beta|auth|probe_interval)";
  @@

  int func(
  - struct ctl_table *ctl
  + const struct ctl_table *ctl
    ,int write, void *buffer, size_t *lenp, loff_t *ppos);

  @r2@
  identifier func, ctl, write, buffer, lenp, ppos;
  @@

  int func(
  - struct ctl_table *ctl
  + const struct ctl_table *ctl
    ,int write, void *buffer, size_t *lenp, loff_t *ppos)
  { ... }

  @r3@
  identifier func;
  @@

  int func(
  - struct ctl_table *
  + const struct ctl_table *
    ,int , void *, size_t *, loff_t *);

  @r4@
  identifier func, ctl;
  @@

  int func(
  - struct ctl_table *ctl
  + const struct ctl_table *ctl
    ,int , void *, size_t *, loff_t *);

  @r5@
  identifier func, write, buffer, lenp, ppos;
  @@

  int func(
  - struct ctl_table *
  + const struct ctl_table *
    ,int write, void *buffer, size_t *lenp, loff_t *ppos);

```

* Code formatting was adjusted in xfs_sysctl.c to comply with code
  conventions. The xfs_stats_clear_proc_handler,
  xfs_panic_mask_proc_handler and xfs_deprecated_dointvec_minmax where
  adjusted.

* The ctl_table argument in proc_watchdog_common was const qualified.
  This is called from a proc_handler itself and is calling back into
  another proc_handler, making it necessary to change it as part of the
  proc_handler migration.

Co-developed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Co-developed-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: Joel Granados <j.granados@samsung.com>
2024-07-24 20:59:29 +02:00
Palmer Dabbelt
b9a603da42
Merge patch series "riscv: Separate vendor extensions from standard extensions"
Charlie Jenkins <charlie@rivosinc.com> says:

All extensions, both standard and vendor, live in one struct
"riscv_isa_ext". There is currently one vendor extension, xandespmu, but
it is likely that more vendor extensions will be added to the kernel in
the future. As more vendor extensions (and standard extensions) are
added, riscv_isa_ext will become more bloated with a mix of vendor and
standard extensions.

This also allows each vendor to be conditionally enabled through
Kconfig.

* b4-shazam-merge:
  riscv: cpufeature: Extract common elements from extension checking
  riscv: Introduce vendor variants of extension helpers
  riscv: Add vendor extensions to /proc/cpuinfo
  riscv: Extend cpufeature.c to detect vendor extensions

Link: https://lore.kernel.org/r/20240719-support_vendor_extensions-v3-0-0af7587bbec0@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-22 15:37:01 -07:00
Charlie Jenkins
0f24254111
riscv: Introduce vendor variants of extension helpers
Vendor extensions are maintained in per-vendor structs (separate from
standard extensions which live in riscv_isa). Create vendor variants for
the existing extension helpers to interface with the riscv_isa_vendor
bitmaps.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andy Chiu <andy.chiu@sifive.com>
Link: https://lore.kernel.org/r/20240719-support_vendor_extensions-v3-3-0af7587bbec0@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-22 15:36:56 -07:00
Charlie Jenkins
23c996fc2b
riscv: Extend cpufeature.c to detect vendor extensions
Instead of grouping all vendor extensions into the same riscv_isa_ext
that standard instructions use, create a struct
"riscv_isa_vendor_ext_data_list" that allows each vendor to maintain
their vendor extensions independently of the standard extensions.
xandespmu is currently the only vendor extension so that is the only
extension that is affected by this change.

An additional benefit of this is that the extensions of each vendor can
be conditionally enabled. A config RISCV_ISA_VENDOR_EXT_ANDES has been
added to allow for that.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andy Chiu <andy.chiu@sifive.com>
Tested-by: Yu Chien Peter Lin <peterlin@andestech.com>
Reviewed-by: Yu Chien Peter Lin <peterlin@andestech.com>
Link: https://lore.kernel.org/r/20240719-support_vendor_extensions-v3-1-0af7587bbec0@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-22 15:36:54 -07:00
Linus Torvalds
c89d780cc1 arm64 updates for 6.11:
* Virtual CPU hotplug support for arm64 ACPI systems
 
 * cpufeature infrastructure cleanups and making the FEAT_ECBHB ID bits
   visible to guests
 
 * CPU errata: expand the speculative SSBS workaround to more CPUs
 
 * arm64 ACPI:
 
   - acpi=nospcr option to disable SPCR as default console for arm64
 
   - Move some ACPI code (cpuidle, FFH) to drivers/acpi/arm64/
 
 * GICv3, use compile-time PMR values: optimise the way regular IRQs are
   masked/unmasked when GICv3 pseudo-NMIs are used, removing the need for
   a static key in fast paths by using a priority value chosen
   dynamically at boot time
 
 * arm64 perf updates:
 
   - Rework of the IMX PMU driver to enable support for I.MX95
 
   - Enable support for tertiary match groups in the CMN PMU driver
 
   - Initial refactoring of the CPU PMU code to prepare for the fixed
     instruction counter introduced by Arm v9.4
 
   - Add missing PMU driver MODULE_DESCRIPTION() strings
 
   - Hook up DT compatibles for recent CPU PMUs
 
 * arm64 kselftest updates:
 
   - Kernel mode NEON fp-stress
 
   - Cleanups, spelling mistakes
 
 * arm64 Documentation update with a minor clarification on TBI
 
 * Miscellaneous:
 
   - Fix missing IPI statistics
 
   - Implement raw_smp_processor_id() using thread_info rather than a
     per-CPU variable (better code generation)
 
   - Make MTE checking of in-kernel asynchronous tag faults conditional
     on KASAN being enabled
 
   - Minor cleanups, typos
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmaQKN4ACgkQa9axLQDI
 XvE0Nw/+JZ6OEQ+DMUHXZfbWanvn1p0nVOoEV3MYVpOeQK1ILYCoDapatLNIlet0
 wcja7tohKbL1ifc7GOqlkitu824LMlotncrdOBycRqb/4C5KuJ+XhygFv5hGfX0T
 Uh2zbo4w52FPPEUMICfEAHrKT3QB9tv7f66xeUNbWWFqUn3rY02/ZVQVVdw6Zc0e
 fVYWGUUoQDR7+9hRkk6tnYw3+9YFVAUAbLWk+DGrW7WsANi6HuJ/rBMibwFI6RkG
 SZDZHum6vnwx0Dj9H7WrYaQCvUMm7AlckhQGfPbIFhUk6pWysfJtP5Qk49yiMl7p
 oRk/GrSXpiKumuetgTeOHbokiE1Nb8beXx0OcsjCu4RrIaNipAEpH1AkYy5oiKoT
 9vKZErMDtQgd96JHFVaXc+A3D2kxVfkc1u7K3TEfVRnZFV7CN+YL+61iyZ+uLxVi
 d9xrAmwRsWYFVQzlZG3NWvSeQBKisUA1L8JROlzWc/NFDwTqDGIt/zS4pZNL3+OM
 EXW0LyKt7Ijl6vPXKCXqrODRrPlcLc66VMZxofZOl0/dEqyJ+qLL4GUkWZu8lTqO
 BqydYnbTSjiDg/ntWjTrD0uJ8c40Qy7KTPEdaPqEIQvkDEsUGlOnhAQjHrnGNb9M
 psZtpDW2xm7GykEOcd6rgSz4Xeky2iLsaR4Wc7FTyDS0YRmeG44=
 =ob2k
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
 "The biggest part is the virtual CPU hotplug that touches ACPI,
  irqchip. We also have some GICv3 optimisation for pseudo-NMIs that has
  been queued via the arm64 tree. Otherwise the usual perf updates,
  kselftest, various small cleanups.

  Core:

   - Virtual CPU hotplug support for arm64 ACPI systems

   - cpufeature infrastructure cleanups and making the FEAT_ECBHB ID
     bits visible to guests

   - CPU errata: expand the speculative SSBS workaround to more CPUs

   - GICv3, use compile-time PMR values: optimise the way regular IRQs
     are masked/unmasked when GICv3 pseudo-NMIs are used, removing the
     need for a static key in fast paths by using a priority value
     chosen dynamically at boot time

  ACPI:

   - 'acpi=nospcr' option to disable SPCR as default console for arm64

   - Move some ACPI code (cpuidle, FFH) to drivers/acpi/arm64/

  Perf updates:

   - Rework of the IMX PMU driver to enable support for I.MX95

   - Enable support for tertiary match groups in the CMN PMU driver

   - Initial refactoring of the CPU PMU code to prepare for the fixed
     instruction counter introduced by Arm v9.4

   - Add missing PMU driver MODULE_DESCRIPTION() strings

   - Hook up DT compatibles for recent CPU PMUs

  Kselftest updates:

   - Kernel mode NEON fp-stress

   - Cleanups, spelling mistakes

  Miscellaneous:

   - arm64 Documentation update with a minor clarification on TBI

   - Fix missing IPI statistics

   - Implement raw_smp_processor_id() using thread_info rather than a
     per-CPU variable (better code generation)

   - Make MTE checking of in-kernel asynchronous tag faults conditional
     on KASAN being enabled

   - Minor cleanups, typos"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (69 commits)
  selftests: arm64: tags: remove the result script
  selftests: arm64: tags_test: conform test to TAP output
  perf: add missing MODULE_DESCRIPTION() macros
  arm64: smp: Fix missing IPI statistics
  irqchip/gic-v3: Fix 'broken_rdists' unused warning when !SMP and !ACPI
  ACPI: Add acpi=nospcr to disable ACPI SPCR as default console on ARM64
  Documentation: arm64: Update memory.rst for TBI
  arm64/cpufeature: Replace custom macros with fields from ID_AA64PFR0_EL1
  KVM: arm64: Replace custom macros with fields from ID_AA64PFR0_EL1
  perf: arm_pmuv3: Include asm/arm_pmuv3.h from linux/perf/arm_pmuv3.h
  perf: arm_v6/7_pmu: Drop non-DT probe support
  perf/arm: Move 32-bit PMU drivers to drivers/perf/
  perf: arm_pmuv3: Drop unnecessary IS_ENABLED(CONFIG_ARM64) check
  perf: arm_pmuv3: Avoid assigning fixed cycle counter with threshold
  arm64: Kconfig: Fix dependencies to enable ACPI_HOTPLUG_CPU
  perf: imx_perf: add support for i.MX95 platform
  perf: imx_perf: fix counter start and config sequence
  perf: imx_perf: refactor driver for imx93
  perf: imx_perf: let the driver manage the counter usage rather the user
  perf: imx_perf: add macro definitions for parsing config attr
  ...
2024-07-15 17:06:19 -07:00
Jeff Johnson
42bebc7cca perf: add missing MODULE_DESCRIPTION() macros
With ARCH=x86, make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm-ccn.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/fsl_imx8_ddr_perf.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/marvell_cn10k_ddr_pmu.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm_cspmu/arm_cspmu_module.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm_cspmu/nvidia_cspmu.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm_cspmu/ampere_cspmu.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/cxl_pmu.o

Add the missing invocation of the MODULE_DESCRIPTION() macro to all
files which have a MODULE_LICENSE().

This includes drivers/perf/hisilicon/hisi_uncore_pmu.c which, although
it did not produce a warning with the x86 allmodconfig configuration,
may cause this warning with arm64 configurations.

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20240709-md-drivers-perf-v3-1-513275b75ed0@quicinc.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-07-10 13:04:38 +01:00
Samuel Holland
16d3b1af09
perf: RISC-V: Check standard event availability
The RISC-V SBI PMU specification defines several standard hardware and
cache events. Currently, all of these events are exposed to userspace,
even when not actually implemented. They appear in the `perf list`
output, and commands like `perf stat` try to use them.

This is more than just a cosmetic issue, because the PMU driver's .add
function fails for these events, which causes pmu_groups_sched_in() to
prematurely stop scheduling in other (possibly valid) hardware events.

Add logic to check which events are supported by the hardware (i.e. can
be mapped to some counter), so only usable events are reported to
userspace. Since the kernel does not know the mapping between events and
possible counters, this check must happen during boot, when no counters
are in use. Make the check asynchronous to minimize impact on boot time.

Fixes: e999143459 ("RISC-V: Add perf platform driver based on SBI PMU extension")

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20240628-misc_perf_fixes-v4-3-e01cfddcf035@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-03 12:56:22 -07:00
Samuel Holland
7dd646cf74
drivers/perf: riscv: Reset the counter to hpmevent mapping while starting cpus
Currently, we stop all the counters while a new cpu is brought online.
However, the hpmevent to counter mappings are not reset. The firmware may
have some stale encoding in their mapping structure which may lead to
undesirable results. We have not encountered such scenario though.

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20240628-misc_perf_fixes-v4-2-e01cfddcf035@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-03 12:56:21 -07:00
Atish Patra
a3f24e83d1
drivers/perf: riscv: Do not update the event data if uptodate
In case of an counter overflow, the event data may get corrupted
if called from an external overflow handler. This happens because
we can't update the counter without starting it when SBI PMU
extension is in use. However, the prev_count has been already
updated at the first pass while the counter value is still the
old one.

The solution is simple where we don't need to update it again
if it is already updated which can be detected using hwc state.
The event state in the overflow handler is updated in the following
patch. Thus, this fix can't be backported to kernel version where
overflow support was added.

Fixes: a8625217a0 ("drivers/perf: riscv: Implement SBI PMU snapshot function")

Closes:https://lore.kernel.org/all/CC51D53B-846C-4D81-86FC-FBF969D0A0D6@pku.edu.cn/

Reported-by: garthlei@pku.edu.cn
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20240628-misc_perf_fixes-v4-1-e01cfddcf035@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-03 12:56:15 -07:00
Rob Herring (Arm)
d688ffa269 perf: arm_pmuv3: Include asm/arm_pmuv3.h from linux/perf/arm_pmuv3.h
The arm64 asm/arm_pmuv3.h depends on defines from
linux/perf/arm_pmuv3.h. Rather than depend on include order, follow the
usual pattern of "linux" headers including "asm" headers of the same
name.

With this change, the include of linux/kvm_host.h is problematic due to
circular includes:

In file included from ../arch/arm64/include/asm/arm_pmuv3.h:9,
                 from ../include/linux/perf/arm_pmuv3.h:312,
                 from ../include/kvm/arm_pmu.h:11,
                 from ../arch/arm64/include/asm/kvm_host.h:38,
                 from ../arch/arm64/mm/init.c:41:
../include/linux/kvm_host.h:383:30: error: field 'arch' has incomplete type

Switching to asm/kvm_host.h solves the issue.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://lore.kernel.org/r/20240626-arm-pmu-3-9-icntr-v2-5-c9784b4f4065@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2024-07-03 14:07:14 +01:00
Rob Herring (Arm)
12f051c987 perf: arm_v6/7_pmu: Drop non-DT probe support
There are no non-DT based PMU users for v6 or v7, so drop the custom
non-DT probe table. Unfortunately XScale still needs non-DT probing.

Note that this drops support for arm1156 PMU, but there are no arm1156
based systems supported in the kernel.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://lore.kernel.org/r/20240626-arm-pmu-3-9-icntr-v2-4-c9784b4f4065@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2024-07-03 14:07:14 +01:00
Rob Herring (Arm)
8d75537beb perf/arm: Move 32-bit PMU drivers to drivers/perf/
It is preferred to put drivers under drivers/ rather than under arch/.
The PMU drivers also depend on arm_pmu.c, so it's better to place them
all together.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://lore.kernel.org/r/20240626-arm-pmu-3-9-icntr-v2-3-c9784b4f4065@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2024-07-03 14:07:14 +01:00
Rob Herring (Arm)
598c1a2d9f perf: arm_pmuv3: Drop unnecessary IS_ENABLED(CONFIG_ARM64) check
The IS_ENABLED(CONFIG_ARM64) check for threshold support is unnecessary.
The purpose is to not enable thresholds on arm32, but if threshold is
non-zero, the check against threshold_max() just above here will have
errored out because threshold_max() is always 0 on arm32.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Acked-by: Mark rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20240626-arm-pmu-3-9-icntr-v2-2-c9784b4f4065@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2024-07-03 14:07:14 +01:00