Commit Graph

532 Commits

Author SHA1 Message Date
Linus Torvalds
5c0b0c676a powerpc updates for 5.16
- Enable STRICT_KERNEL_RWX for Freescale 85xx platforms.
 
  - Activate CONFIG_STRICT_KERNEL_RWX by default, while still allowing it to be disabled.
 
  - Add support for out-of-line static calls on 32-bit.
 
  - Fix oopses doing bpf-to-bpf calls when STRICT_KERNEL_RWX is enabled.
 
  - Fix boot hangs on e5500 due to stale value in ESR passed to do_page_fault().
 
  - Fix several bugs on pseries in handling of device tree cache information for hotplugged
    CPUs, and/or during partition migration.
 
  - Various other small features and fixes.
 
 Thanks to: Alexey Kardashevskiy, Alistair Popple, Anatolij Gustschin, Andrew Donnellan,
 Athira Rajeev, Bixuan Cui, Bjorn Helgaas, Cédric Le Goater, Christophe Leroy, Daniel
 Axtens, Daniel Henrique Barboza, Denis Kirjanov, Fabiano Rosas, Frederic Barrat, Gustavo
 A. R. Silva, Hari Bathini, Jacques de Laval, Joel Stanley, Kai Song, Kajol Jain, Laurent
 Vivier, Leonardo Bras, Madhavan Srinivasan, Nathan Chancellor, Nathan Lynch, Naveen N.
 Rao, Nicholas Piggin, Nick Desaulniers, Niklas Schnelle, Oliver O'Halloran, Rob Herring,
 Russell Currey, Srikar Dronamraju, Stan Johnson, Tyrel Datwyler, Uwe Kleine-König, Vasant
 Hegde, Wan Jiabing, Xiaoming Ni,
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmGFDPoTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgNbsEACVczMVwMBEny5a7W1tqq1bnY9RFVw3
 K+/rE7/FpSLX+RrwgoMkmqfPvfyc9WISVLlIQDKz4XhkBjaXv0+Y4OMsymuDCbxL
 Qk7F1ff22UfLmPjjJk39gHJ8QZQqZk3wmFu2QzTO4yBZbz2SqqXFLxwyoLpZ0LJ8
 pdGl51+bIsTkDJzrdkhX9X4AKS/fYyjbQxq5u7FS89ZCCs+KvzjLcDRo0GZYaOK/
 hgDBa60DCCszL/9yjbh0ANZxmM2Z3+6AXkvAAXrtXzIGk4JzenZfiV+VEzmq8Tt0
 UpWSsUEe7VgykMR3MTrL9G8op70PpKX6OMUPegJq4iRQ24h4mpFCK4oV9OMKJqpF
 ifN9NO2ZZKOz1ke4l7Xe8SEHLX7rq5U/P7INh3AsKYNYwo6HkJhSPxiCBWUTlnZ3
 OYoZ7czyO4gMPHWP92z4CoSiTYVBFuyhYexRcnQskg60TIwbr+lMXziRyPRGI8b6
 taf2rD8eAiyUJnvkFUsyAHtYHpkSkuMeiVqY2CDQdh2SdtIFgwKzB2RjFL0gzaBZ
 XP9RWD+HernGQAJSlIk7cVthont3JHklcKk+ohhDbsWzPeUJcz6t4ChtgRq0x4q4
 Hpes1lsVfXpyxj5ouBK/E/t+diwPvBLM0dCcarQJE6ExgMzBC/C7Br7jCSgyVJA2
 VhtcZaCYh+vRlQ==
 =f7HE
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Enable STRICT_KERNEL_RWX for Freescale 85xx platforms.

 - Activate CONFIG_STRICT_KERNEL_RWX by default, while still allowing it
   to be disabled.

 - Add support for out-of-line static calls on 32-bit.

 - Fix oopses doing bpf-to-bpf calls when STRICT_KERNEL_RWX is enabled.

 - Fix boot hangs on e5500 due to stale value in ESR passed to
   do_page_fault().

 - Fix several bugs on pseries in handling of device tree cache
   information for hotplugged CPUs, and/or during partition migration.

 - Various other small features and fixes.

Thanks to Alexey Kardashevskiy, Alistair Popple, Anatolij Gustschin,
Andrew Donnellan, Athira Rajeev, Bixuan Cui, Bjorn Helgaas, Cédric Le
Goater, Christophe Leroy, Daniel Axtens, Daniel Henrique Barboza, Denis
Kirjanov, Fabiano Rosas, Frederic Barrat, Gustavo A.  R.  Silva, Hari
Bathini, Jacques de Laval, Joel Stanley, Kai Song, Kajol Jain, Laurent
Vivier, Leonardo Bras, Madhavan Srinivasan, Nathan Chancellor, Nathan
Lynch, Naveen N.  Rao, Nicholas Piggin, Nick Desaulniers, Niklas
Schnelle, Oliver O'Halloran, Rob Herring, Russell Currey, Srikar
Dronamraju, Stan Johnson, Tyrel Datwyler, Uwe Kleine-König, Vasant
Hegde, Wan Jiabing, and Xiaoming Ni,

* tag 'powerpc-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (73 commits)
  powerpc/8xx: Fix Oops with STRICT_KERNEL_RWX without DEBUG_RODATA_TEST
  powerpc/32e: Ignore ESR in instruction storage interrupt handler
  powerpc/powernv/prd: Unregister OPAL_MSG_PRD2 notifier during module unload
  powerpc: Don't provide __kernel_map_pages() without ARCH_SUPPORTS_DEBUG_PAGEALLOC
  MAINTAINERS: Update powerpc KVM entry
  powerpc/xmon: fix task state output
  powerpc/44x/fsp2: add missing of_node_put
  powerpc/dcr: Use cmplwi instead of 3-argument cmpli
  KVM: PPC: Tick accounting should defer vtime accounting 'til after IRQ handling
  powerpc/security: Use a mutex for interrupt exit code patching
  powerpc/83xx/mpc8349emitx: Make mcu_gpiochip_remove() return void
  powerpc/fsl_booke: Fix setting of exec flag when setting TLBCAMs
  powerpc/book3e: Fix set_memory_x() and set_memory_nx()
  powerpc/nohash: Fix __ptep_set_access_flags() and ptep_set_wrprotect()
  powerpc/bpf: Fix write protecting JIT code
  selftests/powerpc: Use date instead of EPOCHSECONDS in mitigation-patching.sh
  powerpc/64s/interrupt: Fix check_return_regs_valid() false positive
  powerpc/boot: Set LC_ALL=C in wrapper script
  powerpc/64s: Default to 64K pages for 64 bit book3s
  Revert "powerpc/audit: Convert powerpc to AUDIT_ARCH_COMPAT_GENERIC"
  ...
2021-11-05 08:15:46 -07:00
Linus Torvalds
91e1c99e17 perf updates:
core:
 
   - Allow ftrace to instrument parts of the perf core code
 
   - Add a new mem_hops field to perf_mem_data_src which allows to represent
     intra-node/package or inter-node/off-package details to prepare for
     next generation systems which have more hieararchy within the
     node/pacakge level.
 
  tools:
 
   - Update for the new mem_hops field in perf_mem_data_src
 
  arch:
 
   - A set of constraints fixes for the Intel uncore PMU
 
   - The usual set of small fixes and improvements for x86 and PPC
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/GkQTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoaD8D/wLhXR8RxtF4W9HJmHA+5XFsPtg+isp
 ZNU2kOs4gZskFx75NQaRv5ikA8y68TKdIx+NuQvRLYItaMveTToLSsJ55bfGMxIQ
 JHqDvANUNxBmAACnbYQlqf9WgB0i/3fCUHY5lpmN0waKjaswz7WNpycv4ccShVZr
 PKbgEjkeFBhplCqqOF0X5H3V+4q85+nZONm1iSNd4S7/3B6OCxOf1u78usL1bbtW
 yJAMSuTeOVUZCJm7oVywKW/ZlCscT135aKr6xe5QTrjlPuRWzuLaXNezdMnMyoVN
 HVv8a0ClACb8U5KiGfhvaipaIlIAliWJp2qoiNjrspDruhH6Yc+eNh1gUhLbtNpR
 4YZR5jxv4/mS13kzMMQg00cCWQl7N4whPT+ZE9pkpshGt+EwT+Iy3U+v13wDfnnp
 MnDggpWYGEkAck13t/T6DwC3qBIsVujtpiG+tt/ERbTxiuxi1ccQTGY3PDjtHV3k
 tIMH5n7l4jEpfl8VmoSUgz/2h1MLZnQUWp41GXkjkaOt7uunQZen+nAwqpTm28KV
 7U6U0h1q6r7HxOZRxkPPe4HSV+aBNH3H1LeNBfEd3hDCFGf6MY6vLow+2BE9ybk7
 Y6LPbRqq0SN3sd5MND0ZvQEt5Zgol8CMlX+UKoLEEv7RognGbIxkgpK7exv5pC9w
 nWj7TaMfpRzPgw==
 =Oj0G
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf updates from Thomas Gleixner:
 "Core:

   - Allow ftrace to instrument parts of the perf core code

   - Add a new mem_hops field to perf_mem_data_src which allows to
     represent intra-node/package or inter-node/off-package details to
     prepare for next generation systems which have more hieararchy
     within the node/pacakge level.

  Tools:

   - Update for the new mem_hops field in perf_mem_data_src

  Arch:

   - A set of constraints fixes for the Intel uncore PMU

   - The usual set of small fixes and improvements for x86 and PPC"

* tag 'perf-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Fix ICL/SPR INST_RETIRED.PREC_DIST encodings
  powerpc/perf: Fix data source encodings for L2.1 and L3.1 accesses
  tools/perf: Add mem_hops field in perf_mem_data_src structure
  perf: Add mem_hops field in perf_mem_data_src structure
  perf: Add comment about current state of PERF_MEM_LVL_* namespace and remove an extra line
  perf/core: Allow ftrace for functions in kernel/event/core.c
  perf/x86: Add new event for AUX output counter index
  perf/x86: Add compiler barrier after updating BTS
  perf/x86/intel/uncore: Fix Intel SPR M3UPI event constraints
  perf/x86/intel/uncore: Fix Intel SPR M2PCIE event constraints
  perf/x86/intel/uncore: Fix Intel SPR IIO event constraints
  perf/x86/intel/uncore: Fix Intel SPR CHA event constraints
  perf/x86/intel/uncore: Fix Intel ICX IIO event constraints
  perf/x86/intel/uncore: Fix invalid unit check
  perf/x86/intel/uncore: Support extra IMC channel on Ice Lake server
2021-11-01 13:12:15 -07:00
Kajol Jain
26da4abfb3 powerpc/perf: Fix data source encodings for L2.1 and L3.1 accesses
Fix the data source encodings to represent L2.1/L3.1(another core's
L2/L3 on the same node) accesses properly for power10 and older
plaforms.

Add new macros(LEVEL/REM) which can be used to add mem_lvl_num and remote
field data inside perf_mem_data_src structure.

Result in power9 system with patch changes:

localhost:~/linux/tools/perf # ./perf mem report | grep Remote
     0.01%             1  252           Remote core, same node L3 or L3 hit  [.] 0x0000000000002dd0                producer_consumer   [.] 0x00007fff7f25eb90
anon               HitM          N/A                     No       N/A        0              0
     0.01%             1  220           Remote core, same node L3 or L3 hit  [.] 0x0000000000002dd0                producer_consumer   [.] 0x00007fff77776d90
anon               HitM          N/A                     No       N/A        0              0
     0.01%             1  220           Remote core, same node L3 or L3 hit  [.] 0x0000000000002dd0                producer_consumer   [.] 0x00007fff817d9410
anon               HitM          N/A                     No       N/A        0              0

Fixes: 79e96f8f93 ("powerpc/perf: Export memory hierarchy info to user space")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211006140654.298352-5-kjain@linux.ibm.com
2021-10-19 17:27:01 +02:00
Athira Rajeev
8f6aca0e0f powerpc/perf: Fix cycles/instructions as PM_CYC/PM_INST_CMPL in power10
On power9 and earlier platforms, the default event used for cyles and
instructions is PM_CYC (0x0001e) and PM_INST_CMPL (0x00002)
respectively. These events use two programmable PMCs and by default will
count irrespective of the run latch state (idle state). But since they
use programmable PMCs, these events can lead to multiplexing with other
events, because there are only 4 programmable PMCs. Hence in power10,
performance monitoring unit (PMU) driver uses performance monitor
counter 5 (PMC5) and performance monitor counter6 (PMC6) for counting
instructions and cycles.

Currently on power10, the event used for cycles is PM_RUN_CYC (0x600F4)
and instructions uses PM_RUN_INST_CMPL (0x500fa). But counting of these
events in idle state is controlled by the CC56RUN bit setting in Monitor
Mode Control Register0 (MMCR0). If the CC56RUN bit is zero, PMC5/6 will
not count when CTRL[RUN] (run latch) is zero. This could lead to missing
some counts if a thread is in idle state during system wide profiling.

To fix it, set the CC56RUN bit in MMCR0 for power10, which makes PMC5
and PMC6 count instructions and cycles regardless of the run latch
state. Since this change make PMC5/6 count as PM_INST_CMPL/PM_CYC,
rename the event code 0x600f4 as PM_CYC instead of PM_RUN_CYC and event
code 0x500fa as PM_INST_CMPL instead of PM_RUN_INST_CMPL. The changes
are only for PMC5/6 event codes and will not affect the behaviour of
PM_RUN_CYC/PM_RUN_INST_CMPL if progammed in other PMC's.

Fixes: a64e697cef ("powerpc/perf: power10 Performance Monitoring support")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.cm>
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
[mpe: Tweak change log wording for style and consistency]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211007075121.28497-1-atrajeev@linux.vnet.ibm.com
2021-10-14 21:46:45 +11:00
Athira Rajeev
29908bbf7b powerpc/perf: Expose instruction and data address registers as part of extended regs
Patch adds support to include Sampled Instruction Address Register
(SIAR) and Sampled Data Address Register (SDAR) SPRs as part of extended
registers. Update the definition of PERF_REG_PMU_MASK_300/31 and
PERF_REG_EXTENDED_MAX to include these SPR's.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Kajol Jain<kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211007065505.27809-4-atrajeev@linux.vnet.ibm.com
2021-10-12 18:39:02 +11:00
Kajol Jain
3c69a5f222 powerpc/perf: Fix the check for SIAR value
Incase of random sampling, there can be scenarios where
Sample Instruction Address Register(SIAR) may not latch
to the sampled instruction and could result in
the value of 0. In these scenarios it is preferred to
return regs->nip. These corner cases are seen in the
previous generation (p9) also.

Patch adds the check for SIAR value along with regs_use_siar
and siar_valid checks so that the function will return
regs->nip incase SIAR is zero.

Patch drops the code under PPMU_P10_DD1 flag check
which handles SIAR 0 case only for Power10 DD1.

Fixes: 2ca13a4cc5 ("powerpc/perf: Use regs->nip when SIAR is zero")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210818171556.36912-3-kjain@linux.ibm.com
2021-08-25 22:38:19 +10:00
Kajol Jain
cc90c6742e powerpc/perf: Drop the case of returning 0 as instruction pointer
Drop the case of returning 0 as instruction pointer since kernel
never executes at 0 and userspace almost never does either.

Fixes: e6878835ac ("powerpc/perf: Sample only if SIAR-Valid bit is set in P7+")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210818171556.36912-2-kjain@linux.ibm.com
2021-08-25 22:38:19 +10:00
Kajol Jain
b1643084d1 powerpc/perf: Use stack siar instead of mfspr
Minor optimization in the 'perf_instruction_pointer' function code by
making use of stack siar instead of mfspr.

Fixes: 75382aa72f ("powerpc/perf: Move code to select SIAR or pt_regs into perf_read_regs")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210818171556.36912-1-kjain@linux.ibm.com
2021-08-25 22:38:18 +10:00
Kajol Jain
f9addd85fb powerpc/perf/hv-gpci: Fix counter value parsing
H_GetPerformanceCounterInfo (0xF080) hcall returns the counter data in
the result buffer. Result buffer has specific format defined in the PAPR
specification. One of the fields is counter offset and width of the
counter data returned.

Counter data are returned in a unsigned char array in big endian byte
order. To get the final counter data, the values must be left shifted
byte at a time. But commit 220a0c609a ("powerpc/perf: Add support for
the hv gpci (get performance counter info) interface") made the shifting
bitwise and also assumed little endian order. Because of that, hcall
counters values are reported incorrectly.

In particular this can lead to counters go backwards which messes up the
counter prev vs now calculation and leads to huge counter value
reporting:

  #: perf stat -e hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
           -C 0 -I 1000
        time             counts unit events
     1.000078854 18,446,744,073,709,535,232      hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
     2.000213293                  0      hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
     3.000320107                  0      hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
     4.000428392                  0      hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
     5.000537864                  0      hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
     6.000649087                  0      hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
     7.000760312                  0      hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
     8.000865218             16,448      hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
     9.000978985 18,446,744,073,709,535,232      hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
    10.001088891             16,384      hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
    11.001201435                  0      hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
    12.001307937 18,446,744,073,709,535,232      hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/

Fix the shifting logic to correct match the format, ie. read bytes in
big endian order.

Fixes: e4f226b158 ("powerpc/perf/hv-gpci: Increase request buffer size")
Cc: stable@vger.kernel.org # v4.6+
Reported-by: Nageswara R Sastry<rnsastry@linux.ibm.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Tested-by: Nageswara R Sastry<rnsastry@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210813082158.429023-1-kjain@linux.ibm.com
2021-08-20 17:00:54 +10:00
Nicholas Piggin
cf9c615cde powerpc/64s/perf: Always use SIAR for kernel interrupts
If an interrupt is taken in kernel mode, always use SIAR for it rather than
looking at regs_sipr. This prevents samples piling up around interrupt
enable (hard enable or interrupt replay via soft enable) in PMUs / modes
where the PR sample indication is not in synch with SIAR.

This results in better sampling of interrupt entry and exit in particular.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210720141504.420110-1-npiggin@gmail.com
2021-08-04 10:53:39 +10:00
Linus Torvalds
019b3fd94b powerpc updates for 5.14
- A big series refactoring parts of our KVM code, and converting some to C.
 
  - Support for ARCH_HAS_SET_MEMORY, and ARCH_HAS_STRICT_MODULE_RWX on some CPUs.
 
  - Support for the Microwatt soft-core.
 
  - Optimisations to our interrupt return path on 64-bit.
 
  - Support for userspace access to the NX GZIP accelerator on PowerVM on Power10.
 
  - Enable KUAP and KUEP by default on 32-bit Book3S CPUs.
 
  - Other smaller features, fixes & cleanups.
 
 Thanks to: Andy Shevchenko, Aneesh Kumar K.V, Arnd Bergmann, Athira Rajeev, Baokun Li,
 Benjamin Herrenschmidt, Bharata B Rao, Christophe Leroy, Daniel Axtens, Daniel Henrique
 Barboza, Finn Thain, Geoff Levand, Haren Myneni, Jason Wang, Jiapeng Chong, Joel Stanley,
 Jordan Niethe, Kajol Jain, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas
 Piggin, Nick Desaulniers, Paul Mackerras, Russell Currey, Sathvika Vasireddy, Shaokun
 Zhang, Stephen Rothwell, Sudeep Holla, Suraj Jitindar Singh, Tom Rix, Vaibhav Jain,
 YueHaibing, Zhang Jianhua, Zhen Lei.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmDfFS4THG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgFxHEAC88NJ+Gz87LiTQFt6QjhziBaJUd+sY
 uqADPRROr4P50O8PjYZbMi2qbXzOlLkZO4wJWX7jpZ1F9KmbPNqY2shD8h4ahyge
 F/uqzBW1FXBJfnDEKdU2MzalkeTP+dwxLZyouUamjDCGNLFjOV4x/Fft5otOdXjO
 k9uO6yoGyOkWYzjC+Y/irNPlIDDByB/+bD92Cb52Y2mXMDDEnx4JzbtkeJW+8udT
 Sjn3bWzeL+dz5GehjMKwK4+SptNiyQGOgM8FwtnKUMvgzxv04DqCGjr9YC12L2Z7
 VoFZc4GzVgtf8DZg4fJ3KG5aG2nH3Tui7jc9lUckdrxixDAZw5wSG7CQ39gFb/5+
 7A4fEJk4Z3h5llibwxAZrC7wV8ZDDXn8oRFzRcOJjfxYaD+ohOOyWHIebwkdiXYx
 nfYI7sBcScDLXeBvHtDra2GJpbFSpVL3S/QNhhi1vKVNrFSyAgbAybcVL2xPLZ6+
 8Mh7A8xt+hf2bo9AXuYJDo9mwXWfg1093d0kT+AslcRhZioBk18c2AiZLIz0FzuL
 Ua/e5FPb99x9LSdcZHvaAXBoHT2iTgDyCyDa3gkIesyuRX6ggHoFcVQuvdDcbJ9d
 H8LK+Tahy1Y+E5b6KdtU8mDEGE+QG+CWLnwQ6YSCaL/MYgaFzNa32Jdj1fmztSBC
 cttP43kHZ7ljTw==
 =zo4d
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - A big series refactoring parts of our KVM code, and converting some
   to C.

 - Support for ARCH_HAS_SET_MEMORY, and ARCH_HAS_STRICT_MODULE_RWX on
   some CPUs.

 - Support for the Microwatt soft-core.

 - Optimisations to our interrupt return path on 64-bit.

 - Support for userspace access to the NX GZIP accelerator on PowerVM on
   Power10.

 - Enable KUAP and KUEP by default on 32-bit Book3S CPUs.

 - Other smaller features, fixes & cleanups.

Thanks to: Andy Shevchenko, Aneesh Kumar K.V, Arnd Bergmann, Athira
Rajeev, Baokun Li, Benjamin Herrenschmidt, Bharata B Rao, Christophe
Leroy, Daniel Axtens, Daniel Henrique Barboza, Finn Thain, Geoff Levand,
Haren Myneni, Jason Wang, Jiapeng Chong, Joel Stanley, Jordan Niethe,
Kajol Jain, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas
Piggin, Nick Desaulniers, Paul Mackerras, Russell Currey, Sathvika
Vasireddy, Shaokun Zhang, Stephen Rothwell, Sudeep Holla, Suraj Jitindar
Singh, Tom Rix, Vaibhav Jain, YueHaibing, Zhang Jianhua, and Zhen Lei.

* tag 'powerpc-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (218 commits)
  powerpc: Only build restart_table.c for 64s
  powerpc/64s: move ret_from_fork etc above __end_soft_masked
  powerpc/64s/interrupt: clean up interrupt return labels
  powerpc/64/interrupt: add missing kprobe annotations on interrupt exit symbols
  powerpc/64: enable MSR[EE] in irq replay pt_regs
  powerpc/64s/interrupt: preserve regs->softe for NMI interrupts
  powerpc/64s: add a table of implicit soft-masked addresses
  powerpc/64e: remove implicit soft-masking and interrupt exit restart logic
  powerpc/64e: fix CONFIG_RELOCATABLE build warnings
  powerpc/64s: fix hash page fault interrupt handler
  powerpc/4xx: Fix setup_kuep() on SMP
  powerpc/32s: Fix setup_{kuap/kuep}() on SMP
  powerpc/interrupt: Use names in check_return_regs_valid()
  powerpc/interrupt: Also use exit_must_hard_disable() on PPC32
  powerpc/sysfs: Replace sizeof(arr)/sizeof(arr[0]) with ARRAY_SIZE
  powerpc/ptrace: Refactor regs_set_return_{msr/ip}
  powerpc/ptrace: Move set_return_regs_changed() before regs_set_return_{msr/ip}
  powerpc/stacktrace: Fix spurious "stale" traces in raise_backtrace_ipi()
  powerpc/pseries/vas: Include irqdomain.h
  powerpc: mark local variables around longjmp as volatile
  ...
2021-07-02 12:54:34 -07:00
Paul Mackerras
d40a82be2f powerpc/pmu: Make the generic compat PMU use the architected events
This changes generic-compat-pmu.c so that it only uses architected
events defined in Power ISA v3.0B, rather than event encodings which,
while common to all the IBM Power Systems implementations, are
nevertheless implementation-specific rather than architected.  The
intention is that any CPU implementation designed to conform to Power
ISA v3.0B or later can use generic-compat-pmu.c.

In addition to the existing events for cycles and instructions, this
adds several other architected events, including alternative encodings
for some events.  In order to make it possible to measure cycles and
instructions at the same time as each other, we set the CC5-6RUN bit
in MMCR0, which makes PMC5 and PMC6 count instructions and cycles
regardless of the run bit, so their events are now PM_CYC and
PM_INST_CMPL rather than PM_RUN_CYC and PM_RUN_INST_CMPL (the latter
are still available via other event codes).

Note that POWER9 has an erratum where one architected event
(PM_FLOP_CMPL, floating-point operations completed, code 0x100f4) does
not work correctly.  Given that there is a specific PMU driver for P9
which will be used in preference to generic-compat-pmu.c, that is not
a real problem.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YJD7L9yeoxvxqeYi@thinks.paulus.ozlabs.org
2021-06-25 14:47:20 +10:00
Athira Rajeev
60b7ed54a4 powerpc/perf: Fix crash in perf_instruction_pointer() when ppmu is not set
On systems without any specific PMU driver support registered, running
perf record causes Oops.

The relevant portion from call trace:

  BUG: Kernel NULL pointer dereference on read at 0x00000040
  Faulting instruction address: 0xc0021f0c
  Oops: Kernel access of bad area, sig: 11 [#1]
  BE PAGE_SIZE=4K PREEMPT CMPCPRO
  SAF3000 DIE NOTIFICATION
  CPU: 0 PID: 442 Comm: null_syscall Not tainted 5.13.0-rc6-s3k-dev-01645-g7649ee3d2957 #5164
  NIP:  c0021f0c LR: c00e8ad8 CTR: c00d8a5c
  NIP perf_instruction_pointer+0x10/0x60
  LR  perf_prepare_sample+0x344/0x674
  Call Trace:
    perf_prepare_sample+0x7c/0x674 (unreliable)
    perf_event_output_forward+0x3c/0x94
    __perf_event_overflow+0x74/0x14c
    perf_swevent_hrtimer+0xf8/0x170
    __hrtimer_run_queues.constprop.0+0x160/0x318
    hrtimer_interrupt+0x148/0x3b0
    timer_interrupt+0xc4/0x22c
    Decrementer_virt+0xb8/0xbc

During perf record session, perf_instruction_pointer() is called to
capture the sample IP. This function in core-book3s accesses
ppmu->flags. If a platform specific PMU driver is not registered, ppmu
is set to NULL and accessing its members results in a crash. Fix this
crash by checking if ppmu is set.

Fixes: 2ca13a4cc5 ("powerpc/perf: Use regs->nip when SIAR is zero")
Cc: stable@vger.kernel.org # v5.11+
Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1623952506-1431-1-git-send-email-atrajeev@linux.vnet.ibm.com
2021-06-18 16:30:36 +10:00
Daniel Axtens
b112fb913b powerpc: make stack walking KASAN-safe
Make our stack-walking code KASAN-safe by using __no_sanitize_address.
Generic code, arm64, s390 and x86 all make accesses unchecked for similar
sorts of reasons: when unwinding a stack, we might touch memory that KASAN
has marked as being out-of-bounds. In ppc64 KASAN development, I hit this
sometimes when checking for an exception frame - because we're checking
an arbitrary offset into the stack frame.

See commit 2095574632 ("s390/kasan: avoid false positives during stack
unwind"), commit bcaf669b4b ("arm64: disable kasan when accessing
frame->fp in unwind_frame"), commit 91e08ab0c8 ("x86/dumpstack:
Prevent KASAN false positive warnings") and commit 6e22c83664
("tracing, kasan: Silence Kasan warning in check_stack of stack_tracer").

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210614120907.1952321-1-dja@axtens.net
2021-06-17 00:09:11 +10:00
Christophe Leroy
69d4d6e5fd powerpc: Don't use 'struct ppc_inst' to reference instruction location
'struct ppc_inst' is an internal representation of an instruction, but
in-memory instructions are and will remain a table of 'u32' forever.

Replace all 'struct ppc_inst *' used for locating an instruction in
memory by 'u32 *'. This removes a lot of undue casts to 'struct
ppc_inst *'.

It also helps locating ab-use of 'struct ppc_inst' dereference.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Fix ppc_inst_next(), use u32 instead of unsigned int]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/7062722b087228e42cbd896e39bfdf526d6a340a.1621516826.git.christophe.leroy@csgroup.eu
2021-06-17 00:09:00 +10:00
Christophe Leroy
87f19ea101 powerpc/perf: Simplify Makefile
arch/powerpc/Kbuild decend into arch/powerpc/perf/ only when
CONFIG_PERF_EVENTS is selected, so there is not need to take
CONFIG_PERF_EVENTS into account in arch/powerpc/perf/Makefile.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Michal Suchánek <msuchanek@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/d37f61afca55b5b33787b643890e061ae1c18f5f.1620396045.git.christophe.leroy@csgroup.eu
2021-06-15 17:12:27 +10:00
Athira Rajeev
66d9b74928 powerpc/perf: Fix the threshold event selection for memory events in power10
Memory events (mem-loads and mem-stores) currently use the threshold
event selection as issue to finish. Power10 supports issue to complete
as part of thresholding which is more appropriate for mem-loads and
mem-stores. Hence fix the event code for memory events to use issue
to complete.

Fixes: a64e697cef ("powerpc/perf: power10 Performance Monitoring support")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1614840015-1535-1-git-send-email-atrajeev@linux.vnet.ibm.com
2021-04-23 01:38:02 +10:00
Athira Rajeev
b4ded42268 powerpc/perf: Fix sampled instruction type for larx/stcx
Sampled Instruction Event Register (SIER) field [46:48] identifies the
sampled instruction type. ISA v3.1 says value of 0b111 for this field as
reserved, but in POWER10 it denotes LARX/STCX type which will hopefully
be fixed in ISA v3.1 update.

Patch fixes the functions to handle type value 7 for CPU_FTR_ARCH_31.

Fixes: a64e697cef ("powerpc/perf: power10 Performance Monitoring support")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
[mpe: Avoid reading mmcra until necessary, use early return to deindent if block]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1614858937-1485-1-git-send-email-atrajeev@linux.vnet.ibm.com
2021-04-23 01:38:02 +10:00
Athira Rajeev
af31fd0c91 powerpc/perf: Expose processor pipeline stage cycles using PERF_SAMPLE_WEIGHT_STRUCT
Performance Monitoring Unit (PMU) registers in powerpc provides
information on cycles elapsed between different stages in the
pipeline. This can be used for application tuning. On ISA v3.1
platform, this information is exposed by sampling registers.
Patch adds kernel support to capture two of the cycle counters
as part of perf sample using the sample type:
PERF_SAMPLE_WEIGHT_STRUCT.

The power PMU function 'get_mem_weight' currently uses 64 bit weight
field of perf_sample_data to capture memory latency. But following the
introduction of PERF_SAMPLE_WEIGHT_TYPE, weight field could contain
64-bit or 32-bit value depending on the architexture support for
PERF_SAMPLE_WEIGHT_STRUCT. Patches uses WEIGHT_STRUCT to expose the
pipeline stage cycles info. Hence update the ppmu functions to work for
64-bit and 32-bit weight values.

If the sample type is PERF_SAMPLE_WEIGHT, use the 64-bit weight field.
if the sample type is PERF_SAMPLE_WEIGHT_STRUCT, memory subsystem
latency is stored in the low 32bits of perf_sample_weight structure.
Also for CPU_FTR_ARCH_31, capture the two cycle counter information in
two 16 bit fields of perf_sample_weight structure.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1616425047-1666-2-git-send-email-atrajeev@linux.vnet.ibm.com
2021-04-20 14:22:23 +10:00
Madhavan Srinivasan
d8a1d6c589 powerpc/perf: Add platform specific check_attr_config
Add platform specific attr.config value checks. Patch
includes checks for both power9 and power10.

Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210408074504.248211-2-maddy@linux.ibm.com
2021-04-19 12:22:09 +10:00
Xiongwei Song
7153d4bf0b powerpc/traps: Enhance readability for trap types
Define macros to list ppc interrupt types in interttupt.h, replace the
reference of the trap hex values with these macros.

Referred the hex numbers in arch/powerpc/kernel/exceptions-64e.S,
arch/powerpc/kernel/exceptions-64s.S, arch/powerpc/kernel/head_*.S,
arch/powerpc/kernel/head_booke.h and arch/powerpc/include/asm/kvm_asm.h.

Signed-off-by: Xiongwei Song <sxwjean@gmail.com>
[mpe: Resolve conflicts in nmi_disables_ftrace(), fix 40x build]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1618398033-13025-1-git-send-email-sxwjean@me.com
2021-04-17 22:20:19 +10:00
Madhavan Srinivasan
2e2a441d2c powerpc/perf: Infrastructure to support checking of attr.config*
Introduce code to support the checking of attr.config* for
values which are reserved for a given platform.
Performance Monitoring Unit (PMU) configuration registers
have fields that are reserved and some specific values for
bit fields are reserved. For ex., MMCRA[61:62] is
Random Sampling Mode (SM) and value of 0b11 for this field
is reserved.

Writing non-zero or invalid values in these fields will
have unknown behaviours.

Patch adds a generic call-back function "check_attr_config"
in "struct power_pmu", to be called in event_init to
check for attr.config* values for a given platform.

Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210408074504.248211-1-maddy@linux.ibm.com
2021-04-14 23:04:19 +10:00
Bixuan Cui
cc331eee03 powerpc/perf/hv-24x7: Make some symbols static
The sparse tool complains as follows:

arch/powerpc/perf/hv-24x7.c:229:1: warning:
 symbol '__pcpu_scope_hv_24x7_txn_flags' was not declared. Should it be static?
arch/powerpc/perf/hv-24x7.c:230:1: warning:
 symbol '__pcpu_scope_hv_24x7_txn_err' was not declared. Should it be static?
arch/powerpc/perf/hv-24x7.c:236:1: warning:
 symbol '__pcpu_scope_hv_24x7_hw' was not declared. Should it be static?
arch/powerpc/perf/hv-24x7.c:244:1: warning:
 symbol '__pcpu_scope_hv_24x7_reqb' was not declared. Should it be static?
arch/powerpc/perf/hv-24x7.c:245:1: warning:
 symbol '__pcpu_scope_hv_24x7_resb' was not declared. Should it be static?

This symbol is not used outside of hv-24x7.c, so this
commit marks it static.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Bixuan Cui <cuibixuan@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210409090124.59492-1-cuibixuan@huawei.com
2021-04-14 23:04:17 +10:00
Bixuan Cui
107dadb046 powerpc/perf: Make symbol 'isa207_pmu_format_attr' static
The sparse tool complains as follows:

arch/powerpc/perf/isa207-common.c:24:18: warning:
 symbol 'isa207_pmu_format_attr' was not declared. Should it be static?

This symbol is not used outside of isa207-common.c, so this
commit marks it static.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Bixuan Cui <cuibixuan@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210409090119.59444-1-cuibixuan@huawei.com
2021-04-14 23:04:17 +10:00
Athira Rajeev
10f8f96179 powerpc/perf: Fix PMU constraint check for EBB events
The power PMU group constraints includes check for EBB events to make
sure all events in a group must agree on EBB. This will prevent
scheduling EBB and non-EBB events together. But in the existing check,
settings for constraint mask and value is interchanged. Patch fixes the
same.

Before the patch, PMU selftest "cpu_event_pinned_vs_ebb_test" fails with
below in dmesg logs. This happens because EBB event gets enabled along
with a non-EBB cpu event.

  [35600.453346] cpu_event_pinne[41326]: illegal instruction (4)
  at 10004a18 nip 10004a18 lr 100049f8 code 1 in
  cpu_event_pinned_vs_ebb_test[10000000+10000]

Test results after the patch:

  $ ./pmu/ebb/cpu_event_pinned_vs_ebb_test
  test: cpu_event_pinned_vs_ebb
  tags: git_version:v5.12-rc5-93-gf28c3125acd3-dirty
  Binding to cpu 8
  EBB Handler is at 0x100050c8
  read error on event 0x7fffe6bd4040!
  PM_RUN_INST_CMPL: result 9872 running/enabled 37930432
  success: cpu_event_pinned_vs_ebb

This bug was hidden by other logic until commit 1908dc9117 (perf:
Tweak perf_event_attr::exclusive semantics).

Fixes: 4df4899911 ("powerpc/perf: Add power8 EBB support")
Reported-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
[mpe: Mention commit 1908dc9117]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1617725761-1464-1-git-send-email-atrajeev@linux.vnet.ibm.com
2021-04-08 21:17:44 +10:00
Athira Rajeev
5ae5fbd210 powerpc/perf: Fix handling of privilege level checks in perf interrupt context
Running "perf mem record" in powerpc platforms with selinux enabled
resulted in soft lockup's. Below call-trace was seen in the logs:

  CPU: 58 PID: 3751 Comm: sssd_nss Not tainted 5.11.0-rc7+ #2
  NIP:  c000000000dff3d4 LR: c000000000dff3d0 CTR: 0000000000000000
  REGS: c000007fffab7d60 TRAP: 0100   Not tainted  (5.11.0-rc7+)
  ...
  NIP _raw_spin_lock_irqsave+0x94/0x120
  LR  _raw_spin_lock_irqsave+0x90/0x120
  Call Trace:
    0xc00000000fd47260 (unreliable)
    skb_queue_tail+0x3c/0x90
    audit_log_end+0x6c/0x180
    common_lsm_audit+0xb0/0xe0
    slow_avc_audit+0xa4/0x110
    avc_has_perm+0x1c4/0x260
    selinux_perf_event_open+0x74/0xd0
    security_perf_event_open+0x68/0xc0
    record_and_restart+0x6e8/0x7f0
    perf_event_interrupt+0x22c/0x560
    performance_monitor_exception0x4c/0x60
    performance_monitor_common_virt+0x1c8/0x1d0
  interrupt: f00 at _raw_spin_lock_irqsave+0x38/0x120
  NIP:  c000000000dff378 LR: c000000000b5fbbc CTR: c0000000007d47f0
  REGS: c00000000fd47860 TRAP: 0f00   Not tainted  (5.11.0-rc7+)
  ...
  NIP _raw_spin_lock_irqsave+0x38/0x120
  LR  skb_queue_tail+0x3c/0x90
  interrupt: f00
    0x38 (unreliable)
    0xc00000000aae6200
    audit_log_end+0x6c/0x180
    audit_log_exit+0x344/0xf80
    __audit_syscall_exit+0x2c0/0x320
    do_syscall_trace_leave+0x148/0x200
    syscall_exit_prepare+0x324/0x390
    system_call_common+0xfc/0x27c

The above trace shows that while the CPU was handling a performance
monitor exception, there was a call to security_perf_event_open()
function. In powerpc core-book3s, this function is called from
perf_allow_kernel() check during recording of data address in the
sample via perf_get_data_addr().

Commit da97e18458 ("perf_event: Add support for LSM and SELinux
checks") introduced security enhancements to perf. As part of this
commit, the new security hook for perf_event_open() was added in all
places where perf paranoid check was previously used. In powerpc
core-book3s code, originally had paranoid checks in
perf_get_data_addr() and power_pmu_bhrb_read(). So
perf_paranoid_kernel() checks were replaced with perf_allow_kernel()
in these PMU helper functions as well.

The intention of paranoid checks in core-book3s was to verify
privilege access before capturing some of the sample data. Along with
paranoid checks, perf_allow_kernel() also does a
security_perf_event_open(). Since these functions are accessed while
recording a sample, we end up calling selinux_perf_event_open() in PMI
context. Some of the security functions use spinlock like
sidtab_sid2str_put(). If a perf interrupt hits under a spin lock and
if we end up in calling selinux hook functions in PMI handler, this
could cause a dead lock.

Since the purpose of this security hook is to control access to
perf_event_open(), it is not right to call this in interrupt context.

The paranoid checks in powerpc core-book3s were done at interrupt time
which is also not correct.

Reference commits:
  Commit cd1231d703 ("powerpc/perf: Prevent kernel address leak via perf_get_data_addr()")
  Commit bb19af8160 ("powerpc/perf: Prevent kernel address leak to userspace via BHRB buffer")

We only allow creation of events that have already passed the
privilege checks in perf_event_open(). So these paranoid checks are
not needed at event time. As a fix, patch uses
'event->attr.exclude_kernel' check to prevent exposing kernel address
for userspace only sampling.

Fixes: cd1231d703 ("powerpc/perf: Prevent kernel address leak via perf_get_data_addr()")
Cc: stable@vger.kernel.org # v4.17+
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1614247839-1428-1-git-send-email-atrajeev@linux.vnet.ibm.com
2021-03-02 22:41:51 +11:00
Linus Torvalds
b12b472496 powerpc updates for 5.12
A large series adding wrappers for our interrupt handlers, so that irq/nmi/user
 tracking can be isolated in the wrappers rather than spread in each handler.
 
 Conversion of the 32-bit syscall handling into C.
 
 A series from Nick to streamline our TLB flushing when using the Radix MMU.
 
 Switch to using queued spinlocks by default for 64-bit server CPUs.
 
 A rework of our PCI probing so that it happens later in boot, when more generic
 infrastructure is available.
 
 Two small fixes to allow 32-bit little-endian processes to run on 64-bit
 kernels.
 
 Other smaller features, fixes & cleanups.
 
 Thanks to:
   Alexey Kardashevskiy, Ananth N Mavinakayanahalli, Aneesh Kumar K.V, Athira
   Rajeev, Bhaskar Chowdhury, Cédric Le Goater, Chengyang Fan, Christophe Leroy,
   Christopher M. Riedl, Fabiano Rosas, Florian Fainelli, Frederic Barrat, Ganesh
   Goudar, Hari Bathini, Jiapeng Chong, Joseph J Allen, Kajol Jain, Markus
   Elfring, Michal Suchanek, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Oliver
   O'Halloran, Pingfan Liu, Po-Hsu Lin, Qian Cai, Ram Pai, Randy Dunlap, Sandipan
   Das, Stephen Rothwell, Tyrel Datwyler, Will Springer, Yury Norov, Zheng
   Yongjun.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmAzMagTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgAbBD/wMS2g1Q9oAGZPsx2NGd2RoeAauGxUs
 Yj6cZVmR+oa6sJyFYgEG7dT7tcwJITQxLBD3HpsHSnJ/rLrMloE33+cZNA9c4STz
 0mlzm3R7M5pOgcEqZglsgLP0RQeUuHSSF01g0kf1N3r+HYtmbmPjuUIl8CnAjlbT
 iMD2ZN2p8/r3kDDht0iBO534HUpsqhc00duSZgQhsV/PR7ZWVxoPk7PEJeo4vXlJ
 77986F7J5NLUTjMiLv5lTx49FcPbRd7a1jubsBtahJrwXj2GVvuy2i86G7HY+a+B
 eSxN7zJQgaFeLo0YPo7fZLBI0MAsIQt3nnZhKX0TMglbv/K8Aq64xiJqsVQdJ883
 CeEt0HvSJhsSC0C4O595NEINfDhDd+5IeSF9MvsujYXiUKRXtRkm1EPuAzTcZIzW
 NwkCLRo33NMXa+khMKaiqF/g7INayPUXoWESx75NXFsuNfcORvstkeUuEoi5GwJo
 TSlmosFqwRjghQ8eTLZuWBzmh3EpPGdtC4gm6D+lbzhzjah5c/1whyuLqra275kK
 E3Qt0/V0ixKyvlG7MI5yYh3L7+R/hrsflH7xIJJxZp2DW6mwBJzQYmkxDbSS8PzK
 nWien2XgpIQhSFat3QqreEFSfNkzdN2MClVi2Y1hpAgi+2Zm9rPdPNGcQI+DSOsB
 kpJkjOjWNJU/PQ==
 =dB2S
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - A large series adding wrappers for our interrupt handlers, so that
   irq/nmi/user tracking can be isolated in the wrappers rather than
   spread in each handler.

 - Conversion of the 32-bit syscall handling into C.

 - A series from Nick to streamline our TLB flushing when using the
   Radix MMU.

 - Switch to using queued spinlocks by default for 64-bit server CPUs.

 - A rework of our PCI probing so that it happens later in boot, when
   more generic infrastructure is available.

 - Two small fixes to allow 32-bit little-endian processes to run on
   64-bit kernels.

 - Other smaller features, fixes & cleanups.

Thanks to: Alexey Kardashevskiy, Ananth N Mavinakayanahalli, Aneesh
Kumar K.V, Athira Rajeev, Bhaskar Chowdhury, Cédric Le Goater, Chengyang
Fan, Christophe Leroy, Christopher M. Riedl, Fabiano Rosas, Florian
Fainelli, Frederic Barrat, Ganesh Goudar, Hari Bathini, Jiapeng Chong,
Joseph J Allen, Kajol Jain, Markus Elfring, Michal Suchanek, Nathan
Lynch, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Pingfan Liu,
Po-Hsu Lin, Qian Cai, Ram Pai, Randy Dunlap, Sandipan Das, Stephen
Rothwell, Tyrel Datwyler, Will Springer, Yury Norov, and Zheng Yongjun.

* tag 'powerpc-5.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (188 commits)
  powerpc/perf: Adds support for programming of Thresholding in P10
  powerpc/pci: Remove unimplemented prototypes
  powerpc/uaccess: Merge raw_copy_to_user_allowed() into raw_copy_to_user()
  powerpc/uaccess: Merge __put_user_size_allowed() into __put_user_size()
  powerpc/uaccess: get rid of small constant size cases in raw_copy_{to,from}_user()
  powerpc/64: Fix stack trace not displaying final frame
  powerpc/time: Remove get_tbl()
  powerpc/time: Avoid using get_tbl()
  spi: mpc52xx: Avoid using get_tbl()
  powerpc/syscall: Avoid storing 'current' in another pointer
  powerpc/32: Handle bookE debugging in C in syscall entry/exit
  powerpc/syscall: Do not check unsupported scv vector on PPC32
  powerpc/32: Remove the counter in global_dbcr0
  powerpc/32: Remove verification of MSR_PR on syscall in the ASM entry
  powerpc/syscall: implement system call entry/exit logic in C for PPC32
  powerpc/32: Always save non volatile GPRs at syscall entry
  powerpc/syscall: Change condition to check MSR_RI
  powerpc/syscall: Save r3 in regs->orig_r3
  powerpc/syscall: Use is_compat_task()
  powerpc/syscall: Make interrupt.c buildable on PPC32
  ...
2021-02-22 14:34:00 -08:00
Kajol Jain
82d2c16b35 powerpc/perf: Adds support for programming of Thresholding in P10
Thresholding, a performance monitoring unit feature, can be
used to identify marked instructions which take more than
expected cycles between start event and end event.
Threshold compare (thresh_cmp) bits are programmed in MMCRA
register. In Power9, thresh_cmp bits were part of the
event code. But in case of P10, thresh_cmp are not part of
event code due to inclusion of MMCR3 bits.

Patch here adds an option to use attr.config1 variable
to be used to pass thresh_cmp value to be programmed in
MMCRA register. A new ppmu flag called PPMU_HAS_ATTR_CONFIG1
has been added and this flag is used to notify the use of
attr.config1 variable.

Patch has extended the parameter list of 'compute_mmcr',
to include power_pmu's 'flags' element and parameter list of
get_constraint to include attr.config1 value. It also extend
parameter list of power_check_constraints inorder to pass
perf_event list.

As stated by commit ef0e3b650f ("powerpc/perf: Fix Threshold
Event Counter Multiplier width for P10"), constraint bits for
thresh_cmp is also needed to be increased to 11 bits, which is
handled as part of this patch. We added bit number 53 as part
of constraint bits of thresh_cmp for power10 to make it an
11 bit field.

Updated layout for p10:

/*
 * Layout of constraint bits:
 *
 *        60        56        52        48        44        40        36        32
 * | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - |
 *   [   fab_match   ]         [       thresh_cmp      ] [   thresh_ctl    ] [   ]
 *                                          |                                  |
 *                           [  thresh_cmp bits for p10]           thresh_sel -*
 *
 *        28        24        20        16        12         8         4         0
 * | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - |
 *               [ ] |   [ ] |  [  sample ]   [     ]   [6] [5]   [4] [3]   [2] [1]
 *                |  |    |  |                  |
 *      BHRB IFM -*  |    |  |*radix_scope      |      Count of events for each PMC.
 *              EBB -*    |                     |        p1, p2, p3, p4, p5, p6.
 *      L1 I/D qualifier -*                     |
 *                     nc - number of counters -*
 *
 * The PMC fields P1..P6, and NC, are adder fields. As we accumulate constraints
 * we want the low bit of each field to be added to any existing value.
 *
 * Everything else is a value field.
 */

Result:
command#: cat /sys/devices/cpu/format/thresh_cmp
config1:0-17

ex. usage:

command#: perf record -I --weight -d  -e
	 cpu/event=0x67340101EC,thresh_cmp=500/ ./ebizzy -S 2 -t 1 -s 4096
1826636 records/s
real  2.00 s
user  2.00 s
sys   0.00 s
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.038 MB perf.data (61 samples) ]

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210209095234.837356-1-kjain@linux.ibm.com
2021-02-11 23:35:36 +11:00
Athira Rajeev
d137845c97 powerpc/perf: Record counter overflow always if SAMPLE_IP is unset
While sampling for marked events, currently we record the sample only
if the SIAR valid bit of Sampled Instruction Event Register (SIER) is
set. SIAR_VALID bit is used for fetching the instruction address from
Sampled Instruction Address Register(SIAR). But there are some
usecases, where the user is interested only in the PMU stats at each
counter overflow and the exact IP of the overflow event is not
required. Dropping SIAR invalid samples will fail to record some of
the counter overflows in such cases.

Example of such usecase is dumping the PMU stats (event counts) after
some regular amount of instructions/events from the userspace (ex: via
ptrace). Here counter overflow is indicated to userspace via signal
handler, and captured by monitoring and enabling I/O signaling on the
event file descriptor. In these cases, we expect to get
sample/overflow indication after each specified sample_period.

Perf event attribute will not have PERF_SAMPLE_IP set in the
sample_type if exact IP of the overflow event is not requested. So
while profiling if SAMPLE_IP is not set, just record the counter
overflow irrespective of SIAR_VALID check.

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
[mpe: Reflow comment and if formatting]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1612516492-1428-1-git-send-email-atrajeev@linux.vnet.ibm.com
2021-02-09 01:09:46 +11:00
Athira Rajeev
e79b76e03b powerpc/perf: Expose Performance Monitor Counter SPR's as part of extended regs
Currently Monitor Mode Control Registers and Sampling registers are
part of extended regs. Patch adds support to include Performance Monitor
Counter Registers (PMC1 to PMC6 ) as part of extended registers.

PMCs are saved in the perf interrupt handler as part of
per-cpu array 'pmcs' in struct cpu_hw_events. While capturing
the register values for extended regs, fetch these saved PMC values.

Simplified the PERF_REG_PMU_MASK_300/31 definition to include PMU
SPRs MMCR0 to PMC6. Exclude the unsupported SPRs (MMCR3, SIER2, SIER3)
from extended mask value for CPU_FTR_ARCH_300 in the new definition.

PERF_REG_EXTENDED_MAX is used to check if any index beyond the extended
registers is requested in the sample. Have one PERF_REG_EXTENDED_MAX
for CPU_FTR_ARCH_300/CPU_FTR_ARCH_31 since perf_reg_validate function
already checks the extended mask for the presence of any unsupported
register.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1612335337-1888-3-git-send-email-atrajeev@linux.vnet.ibm.com
2021-02-09 01:09:44 +11:00
Athira Rajeev
91f3469a43 powerpc/perf: Include PMCs as part of per-cpu cpuhw_events struct
To support capturing of PMC's as part of extended registers, the
value of SPR's PMC1 to PMC6 has to be saved in the starting of PMI
interrupt handler. This is needed since we are resetting the
overflown PMC before creating sample and hence directly reading
SPRN_PMCx in 'perf_reg_value' will be capturing the modified value.

To solve this, add a per-cpu array as part of structure cpu_hw_events
and use this array to capture PMC values in the perf interrupt handler.
Patch also re-factor's the interrupt handler code to use this per-cpu
array instead of current local array.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1612335337-1888-2-git-send-email-atrajeev@linux.vnet.ibm.com
2021-02-09 01:09:44 +11:00
Nicholas Piggin
156b5371a9 powerpc/perf: move perf irq/nmi handling details into traps.c
This is required in order to allow more significant differences between
NMI type interrupt handlers and regular asynchronous handlers.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210130130852.2952424-20-npiggin@gmail.com
2021-02-09 00:02:10 +11:00
Kan Liang
2a6c6b7d7a perf/core: Add PERF_SAMPLE_WEIGHT_STRUCT
Current PERF_SAMPLE_WEIGHT sample type is very useful to expresses the
cost of an action represented by the sample. This allows the profiler
to scale the samples to be more informative to the programmer. It could
also help to locate a hotspot, e.g., when profiling by memory latencies,
the expensive load appear higher up in the histograms. But current
PERF_SAMPLE_WEIGHT sample type is solely determined by one factor. This
could be a problem, if users want two or more factors to contribute to
the weight. For example, Golden Cove core PMU can provide both the
instruction latency and the cache Latency information as factors for the
memory profiling.

For current X86 platforms, although meminfo::latency is defined as a
u64, only the lower 32 bits include the valid data in practice (No
memory access could last than 4G cycles). The higher 32 bits can be used
to store new factors.

Add a new sample type, PERF_SAMPLE_WEIGHT_STRUCT, to indicate the new
sample weight structure. It shares the same space as the
PERF_SAMPLE_WEIGHT sample type.

Users can apply either the PERF_SAMPLE_WEIGHT sample type or the
PERF_SAMPLE_WEIGHT_STRUCT sample type to retrieve the sample weight, but
they cannot apply both sample types simultaneously.

Currently, only X86 and PowerPC use the PERF_SAMPLE_WEIGHT sample type.
- For PowerPC, there is nothing changed for the PERF_SAMPLE_WEIGHT
  sample type. There is no effect for the new PERF_SAMPLE_WEIGHT_STRUCT
  sample type. PowerPC can re-struct the weight field similarly later.
- For X86, the same value will be dumped for the PERF_SAMPLE_WEIGHT
  sample type or the PERF_SAMPLE_WEIGHT_STRUCT sample type for now.
  The following patches will apply the new factors for the
  PERF_SAMPLE_WEIGHT_STRUCT sample type.

The field in the union perf_sample_weight should be shared among
different architectures. A generic name is required, but it's hard to
abstract a name that applies to all architectures. For example, on X86,
the fields are to store all kinds of latency. While on PowerPC, it
stores MMCRA[TECX/TECM], which should not be latency. So a general name
prefix 'var$NUM' is used here.

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1611873611-156687-2-git-send-email-kan.liang@linux.intel.com
2021-02-01 15:31:36 +01:00
Kajol Jain
e5f9d88586 powerpc/perf/hv-24x7: Dont create sysfs event files for dummy events
The hv_24x7 performance monitoring unit creates a list of supported
events from the event catalog obtained via HCALL. The hv_24x7 catalog
could also contain invalid or dummy events with names like RESERVED*.
These events do not have any hardware counters backing them. Add a
check to string compare the event names to filter such events out.

Result on power9 machine:

Before this patch:
  .....
  hv_24x7/PM_XLINK2_OUT_ODD_CYC,chip=?/              [Kernel PMU event]
  hv_24x7/PM_XLINK2_OUT_ODD_DATA_COUNT,chip=?/       [Kernel PMU event]
  hv_24x7/PM_XLINK2_OUT_ODD_TOTAL_UTIL,chip=?/       [Kernel PMU event]
  hv_24x7/PM_XTS_ATR_DEMAND_CHECKOUT,chip=?/         [Kernel PMU event]
  hv_24x7/PM_XTS_ATR_DEMAND_CHECKOUT_MISS,chip=?/    [Kernel PMU event]
  hv_24x7/PM_XTS_ATSD_SENT,chip=?/                   [Kernel PMU event]
  hv_24x7/PM_XTS_ATSD_TLBI_RCV,chip=?/               [Kernel PMU event]
  hv_24x7/RESERVED_NEST1,chip=?/                     [Kernel PMU event]
  hv_24x7/RESERVED_NEST10,chip=?/                    [Kernel PMU event]
  hv_24x7/RESERVED_NEST11,chip=?/                    [Kernel PMU event]
  hv_24x7/RESERVED_NEST12,chip=?/                    [Kernel PMU event]
  hv_24x7/RESERVED_NEST13,chip=?/                    [Kernel PMU event]
  ......

dmesg:
  [    0.000362] printk: console [hvc0] enabled
  [    0.815452] hv-24x7: read 1530 catalog entries, created 537 event attrs (0 failures), 275 descs

After this patch:
  ......
  hv_24x7/PM_XLINK2_OUT_ODD_AVLBL_CYC,chip=?/        [Kernel PMU event]
  hv_24x7/PM_XLINK2_OUT_ODD_CYC,chip=?/              [Kernel PMU event]
  hv_24x7/PM_XLINK2_OUT_ODD_DATA_COUNT,chip=?/       [Kernel PMU event]
  hv_24x7/PM_XLINK2_OUT_ODD_TOTAL_UTIL,chip=?/       [Kernel PMU event]
  hv_24x7/PM_XTS_ATR_DEMAND_CHECKOUT,chip=?/         [Kernel PMU event]
  hv_24x7/PM_XTS_ATR_DEMAND_CHECKOUT_MISS,chip=?/    [Kernel PMU event]
  hv_24x7/PM_XTS_ATSD_SENT,chip=?/                   [Kernel PMU event]
  hv_24x7/PM_XTS_ATSD_TLBI_RCV,chip=?/               [Kernel PMU event]
  hv_24x7/TOD,chip=?/                                [Kernel PMU event]
  ......

dmesg:
  [    0.000357] printk: console [hvc0] enabled
  [    0.808592] hv-24x7: read 1530 catalog entries, created 509 event attrs (0 failures), 275 descs

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
[mpe: Simplify ignore_event(), minor change log formatting]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201228085204.18026-1-kjain@linux.ibm.com
2021-01-30 11:39:25 +11:00
Linus Torvalds
8a5be36b93 powerpc updates for 5.11
- Switch to the generic C VDSO, as well as some cleanups of our VDSO
    setup/handling code.
 
  - Support for KUAP (Kernel User Access Prevention) on systems using the hashed
    page table MMU, using memory protection keys.
 
  - Better handling of PowerVM SMT8 systems where all threads of a core do not
    share an L2, allowing the scheduler to make better scheduling decisions.
 
  - Further improvements to our machine check handling.
 
  - Show registers when unwinding interrupt frames during stack traces.
 
  - Improvements to our pseries (PowerVM) partition migration code.
 
  - Several series from Christophe refactoring and cleaning up various parts of
    the 32-bit code.
 
  - Other smaller features, fixes & cleanups.
 
 Thanks to:
   Alan Modra, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Ard
   Biesheuvel, Athira Rajeev, Balamuruhan S, Bill Wendling, Cédric Le Goater,
   Christophe Leroy, Christophe Lombard, Colin Ian King, Daniel Axtens, David
   Hildenbrand, Frederic Barrat, Ganesh Goudar, Gautham R. Shenoy, Geert
   Uytterhoeven, Giuseppe Sacco, Greg Kurz, Harish, Jan Kratochvil, Jordan
   Niethe, Kaixu Xia, Laurent Dufour, Leonardo Bras, Madhavan Srinivasan, Mahesh
   Salgaonkar, Mathieu Desnoyers, Nathan Lynch, Nicholas Piggin, Oleg Nesterov,
   Oliver O'Halloran, Oscar Salvador, Po-Hsu Lin, Qian Cai, Qinglang Miao, Randy
   Dunlap, Ravi Bangoria, Sachin Sant, Sandipan Das, Sebastian Andrzej Siewior ,
   Segher Boessenkool, Srikar Dronamraju, Tyrel Datwyler, Uwe Kleine-König,
   Vincent Stehlé, Youling Tang, Zhang Xiaoxu.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl/bURITHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgEzBEAC1Vwibcog2P9rkJPb0q3UGWSYSx25V
 h/LwwxtM9Tm14j/LZsSgkOgIsfMaWEBIw/8D4efQ7AX9aFo+R0c2DdQMx1MG5MXz
 gZk58+l3LwId6h9+OrwurpEW+ZmURLAtGMSyFdkeiZ3/XTnkbf1XnewC0QWQe56a
 EGLmjx1MFl45jspoy7UIUXsXoNJIfflEKhrgUzSUh8X2eLmvB9ws6A4BXxbVzyZl
 lZv3+uWimU2pFgdkB9jOCxoG4zFEr2o5ovLHG7zCCVo5JoXmTPQ5cMVBraH206ms
 +5vCmu4qI8uP5UlZW/mZfhrtDiMdHdQqqFOaQwVlOmoUbU6L6E6rxm3iVnov2Bbi
 iUgxoeJDxAb2cM2EWFK6oWVgr7+NkwvXM1IG8xtprhHrCdnC9r+psQr/dswb3LSg
 MJ7u/RCq3uixy2kWP8E0NEHw7ngQZ/ZKPqzfnmIWOC7tYUxgaL02I8Ff9/ZXAI2J
 CnmqFYOjrimHkcwXGOtKkXNvfU0DiL97qpK2AQNWElE8+bUUmpw+ltUrsdSycYmv
 Afc4WIcVrTA+a9laSLgjdZbolbNSa3p+cMIYdPrVx9g+xqygbxIKv+EDGNv1WHfD
 GU1gmohMY+ZkMOvFRMi8LAsEm0DH/etWE0py/8uyxDYKnGyD1Ur6452DStkmgGb2
 azmcaOyLdb+HXA==
 =Ga3K
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Switch to the generic C VDSO, as well as some cleanups of our VDSO
   setup/handling code.

 - Support for KUAP (Kernel User Access Prevention) on systems using the
   hashed page table MMU, using memory protection keys.

 - Better handling of PowerVM SMT8 systems where all threads of a core
   do not share an L2, allowing the scheduler to make better scheduling
   decisions.

 - Further improvements to our machine check handling.

 - Show registers when unwinding interrupt frames during stack traces.

 - Improvements to our pseries (PowerVM) partition migration code.

 - Several series from Christophe refactoring and cleaning up various
   parts of the 32-bit code.

 - Other smaller features, fixes & cleanups.

Thanks to: Alan Modra, Alexey Kardashevskiy, Andrew Donnellan, Aneesh
Kumar K.V, Ard Biesheuvel, Athira Rajeev, Balamuruhan S, Bill Wendling,
Cédric Le Goater, Christophe Leroy, Christophe Lombard, Colin Ian King,
Daniel Axtens, David Hildenbrand, Frederic Barrat, Ganesh Goudar,
Gautham R. Shenoy, Geert Uytterhoeven, Giuseppe Sacco, Greg Kurz,
Harish, Jan Kratochvil, Jordan Niethe, Kaixu Xia, Laurent Dufour,
Leonardo Bras, Madhavan Srinivasan, Mahesh Salgaonkar, Mathieu
Desnoyers, Nathan Lynch, Nicholas Piggin, Oleg Nesterov, Oliver
O'Halloran, Oscar Salvador, Po-Hsu Lin, Qian Cai, Qinglang Miao, Randy
Dunlap, Ravi Bangoria, Sachin Sant, Sandipan Das, Sebastian Andrzej
Siewior , Segher Boessenkool, Srikar Dronamraju, Tyrel Datwyler, Uwe
Kleine-König, Vincent Stehlé, Youling Tang, and Zhang Xiaoxu.

* tag 'powerpc-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (304 commits)
  powerpc/32s: Fix cleanup_cpu_mmu_context() compile bug
  powerpc: Add config fragment for disabling -Werror
  powerpc/configs: Add ppc64le_allnoconfig target
  powerpc/powernv: Rate limit opal-elog read failure message
  powerpc/pseries/memhotplug: Quieten some DLPAR operations
  powerpc/ps3: use dma_mapping_error()
  powerpc: force inlining of csum_partial() to avoid multiple csum_partial() with GCC10
  powerpc/perf: Fix Threshold Event Counter Multiplier width for P10
  powerpc/mm: Fix hugetlb_free_pmd_range() and hugetlb_free_pud_range()
  KVM: PPC: Book3S HV: Fix mask size for emulated msgsndp
  KVM: PPC: fix comparison to bool warning
  KVM: PPC: Book3S: Assign boolean values to a bool variable
  powerpc: Inline setup_kup()
  powerpc/64s: Mark the kuap/kuep functions non __init
  KVM: PPC: Book3S HV: XIVE: Add a comment regarding VP numbering
  powerpc/xive: Improve error reporting of OPAL calls
  powerpc/xive: Simplify xive_do_source_eoi()
  powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_EOI_FW
  powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_MASK_FW
  powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_SHIFT_BUG
  ...
2020-12-17 13:34:25 -08:00
Linus Torvalds
5e60366d56 fallthrough fixes for Clang for 5.11-rc1
Hi Linus,
 
 Please, pull the following patches that fix many fall-through warnings
 when building with Clang 12.0.0 and this[1] change reverted. Notice
 that in order to enable -Wimplicit-fallthrough for Clang, such change[1]
 is meant to be reverted at some point. So, these patches help to move
 in that direction.
 
 - powerpc: boot: include compiler_attributes.h (Nick Desaulniers)
 - Revert "lib: Revert use of fallthrough pseudo-keyword in lib/" (Nick Desaulniers)
 - powerpc: fix -Wimplicit-fallthrough (Nick Desaulniers)
 - lib: Fix fall-through warnings for Clang (Gustavo A. R. Silva)
 
 Thanks!
 
 [1] commit e2079e93f5 ("kbuild: Do not enable -Wimplicit-fallthrough for clang for now")
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEkmRahXBSurMIg1YvRwW0y0cG2zEFAl/Xc1sACgkQRwW0y0cG
 2zHa7w/+In8K5ZCTLQoFC82uLLcXtlTb74FaoHeBaxTkJDsCbznV1+1Ivyg3ivjX
 rHfEFvpF3OANkVeyJ3tGOlKf5vK7Mvmc1KEuTghJtLqeDgmxtWJsJMeCGnJxsmuo
 jc9+0/QJ1XoLQMvG6Dwe3rhZn5UtS7k6tn3vHSIGv0obvjeJSDWTvh2oiDke5aTn
 u2nUYeYi/TTCiJn2Rnew8bdbNqhJWQEMYZWMxkPld+8FY9OMw7Hy6/NHi0cljpEj
 yAbKByzQKRaR94scNAjIDyW2A6jZH5VeebBPxxjYHvRtB95gXmTQgOe+qgR7ncGy
 +ilaA54aJVKjJB9W4RJH8YzApGInVT8NtyZ+GV3YhXoHTW5O2JMNAjHMgcpFcOus
 wx5xmyvM8qWUMUs9AYuUbdYXMca/NV2X9bvDj8B88N9OTw8VLF7Eb/SLyEgGzplQ
 QjBsIfTPrBo36YMHDb9R+MiprCGIQvGr8o2DqGSMHA/xFT1+5dyMgCoDKLxepQ8Q
 bvSNmcuXiQKQUlFxlXmt4k9tUdtVKPhj7VaQqiTTLH+6XUg44LlWfn8gMixqRMNh
 M3NWERb+I32HmodPJnbExA1YgxgblcXLdLxlvjhMhsH6cyc+2q0BYbfAUji4hrPh
 Ao4GuUDA/7xkPK7pO5LeOuRTZA8fY/FJaXjEZuelWEYdRdVtapo=
 =HLCX
 -----END PGP SIGNATURE-----

Merge tag 'fallthrough-fixes-clang-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux

Pull fallthrough fixes from Gustavo A. R. Silva:
 "Fix many fall-through warnings when building with Clang 12.0.0
  using -Wimplicit-fallthrough.

   - powerpc: boot: include compiler_attributes.h (Nick Desaulniers)

   - Revert "lib: Revert use of fallthrough pseudo-keyword in lib/"
     (Nick Desaulniers)

   - powerpc: fix -Wimplicit-fallthrough (Nick Desaulniers)

   - lib: Fix fall-through warnings for Clang (Gustavo A. R. Silva)"

* tag 'fallthrough-fixes-clang-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
  lib: Fix fall-through warnings for Clang
  powerpc: fix -Wimplicit-fallthrough
  Revert "lib: Revert use of fallthrough pseudo-keyword in lib/"
  powerpc: boot: include compiler_attributes.h
2020-12-16 00:24:16 -08:00
Madhavan Srinivasan
ef0e3b650f powerpc/perf: Fix Threshold Event Counter Multiplier width for P10
Threshold Event Counter Multiplier (TECM) is part of Monitor Mode
Control Register A (MMCRA). This field along with Threshold Event
Counter Exponent (TECE) is used to get threshould counter value.
In Power10, this is a 8bit field, so patch fixes the
current code to modify the MMCRA[TECM] extraction macro to
handle this change. ISA v3.1 says this is a 7 bit field but
POWER10 it's actually 8 bits which will hopefully be fixed
in ISA v3.1 update.

Fixes: 170a315f41 ("powerpc/perf: Support to export MMCRA[TEC*] field to userspace")
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1608022578-1532-1-git-send-email-atrajeev@linux.vnet.ibm.com
2020-12-15 22:50:04 +11:00
Athira Rajeev
aa8e21c053 powerpc/perf: Exclude kernel samples while counting events in user space.
Perf event attritube supports exclude_kernel flag to avoid
sampling/profiling in supervisor state (kernel). Based on this event
attr flag, Monitor Mode Control Register bit is set to freeze on
supervisor state. But sometimes (due to hardware limitation), Sampled
Instruction Address Register (SIAR) locks on to kernel address even
when freeze on supervisor is set. Patch here adds a check to drop
those samples.

Cc: stable@vger.kernel.org
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1606289215-1433-1-git-send-email-atrajeev@linux.vnet.ibm.com
2020-12-10 23:04:35 +11:00
Christophe Leroy
70b588a068 powerpc/ppc-opcode: Add PPC_RAW_MFSPR()
Add PPC_RAW_MFSPR() to replace open coding done in 8xx-pmu.c

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e281e3a611eead8817c49cf06a60072a021af823.1606231483.git.christophe.leroy@csgroup.eu
2020-12-09 23:48:13 +11:00
Christophe Leroy
89eecd938c powerpc/8xx: Use SPRN_SPRG_SCRATCH2 in DTLB miss exception
Use SPRN_SPRG_SCRATCH2 in DTLB miss exception instead of DAR
in order to be similar to ITLB miss exception.

This also simplifies mpc8xx_pmu_del()

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e3cc8f023ef40e1e8ae144e4dd1330a5ff022528.1606231483.git.christophe.leroy@csgroup.eu
2020-12-09 23:48:13 +11:00
Christophe Leroy
a314ea5abf powerpc/8xx: Use SPRN_SPRG_SCRATCH2 in ITLB miss exception
In order to re-enable MMU earlier, ensure ITLB miss exception
cannot clobber SPRN_SPRG_SCRATCH0 and SPRN_SPRG_SCRATCH1.
Do so by using SPRN_SPRG_SCRATCH2 and SPRN_M_TW instead, like
the DTLB miss exception.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/abc78e8e9577d473691ebb9996c6413b37bfd9ca.1606231483.git.christophe.leroy@csgroup.eu
2020-12-09 23:48:12 +11:00
Athira Rajeev
91668ab7db powerpc/perf: MMCR0 control for PMU registers under PMCC=00
PowerISA v3.1 introduces new control bit (PMCCEXT) for restricting
access to group B PMU registers in problem state when
MMCR0 PMCC=0b00. In problem state and when MMCR0 PMCC=0b00,
setting the Monitor Mode Control Register bit 54 (MMCR0 PMCCEXT),
will restrict read permission on Group B Performance Monitor
Registers (SIER, SIAR, SDAR and MMCR1). When this bit is set to zero,
group B registers will be readable. In other platforms (like power9),
the older behaviour is retained where group B PMU SPRs are readable.

Patch adds support for MMCR0 PMCCEXT bit in power10 by enabling
this bit during boot and during the PMU event enable/disable callback
functions.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1606409684-1589-8-git-send-email-atrajeev@linux.vnet.ibm.com
2020-12-04 01:01:29 +11:00
Athira Rajeev
9a8ee52634 powerpc/perf: Fix to update cache events with l2l3 events in power10
Export l2l3 events (PM_L2_ST_MISS and PM_L2_ST) and LLC-prefetches
(PM_L3_PF_MISS_L3) via sysfs, and also add these to list of
cache_events.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1606409684-1589-7-git-send-email-atrajeev@linux.vnet.ibm.com
2020-12-04 01:01:29 +11:00
Athira Rajeev
1f12316394 powerpc/perf: Fix to update generic event codes for power10
Fix the event code for events: branch-instructions (to PM_BR_FIN),
branch-misses (to PM_MPRED_BR_FIN) and cache-misses (to
PM_LD_DEMAND_MISS_L1_FIN) for power10 PMU. Update the
list of generic events with this modified event code.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1606409684-1589-6-git-send-email-atrajeev@linux.vnet.ibm.com
2020-12-04 01:01:29 +11:00
Athira Rajeev
c0e3985790 powerpc/perf: Add generic and cache event list for power10 DD1
There are event code updates for some of the generic events
and cache events for power10. Inorder to maintain the current
event codes work with DD1 also, create a new array of generic_events,
cache_events and pmu_attr_groups with suffix _dd1, example,
power10_events_attr_dd1. So that further updates to event codes
can be made in the original list, ie, power10_events_attr. Update the
power10 pmu init code to pick the dd1 list while registering
the power PMU, based on the pvr (Processor Version Register) value.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1606409684-1589-5-git-send-email-atrajeev@linux.vnet.ibm.com
2020-12-04 01:01:29 +11:00
Athira Rajeev
0263bbb377 powerpc/perf: Fix the PMU group constraints for threshold events in power10
The PMU group constraints mask for threshold events covers
all thresholding bits which includes threshold control value
(start/stop), select value as well as thresh_cmp value (MMCRA[9:18].
In power9, thresh_cmp bits were part of the event code. But in case
of power10, thresh_cmp bits are not part of event code due to
inclusion of MMCR3 bits. Hence thresh_cmp is not valid for
group constraints for power10.

Fix the PMU group constraints checking for threshold events in
power10 by using constraint mask and value for only threshold control
and select bits.

Fixes: a64e697cef ("powerpc/perf: power10 Performance Monitoring support")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1606409684-1589-4-git-send-email-atrajeev@linux.vnet.ibm.com
2020-12-04 01:01:28 +11:00
Athira Rajeev
e924be7b0b powerpc/perf: Update the PMU group constraints for l2l3 events in power10
In Power9, L2/L3 bus events are always available as a
"bank" of 4 events. To obtain the counts for any of the
l2/l3 bus events in a given bank, the user will have to
program PMC4 with corresponding l2/l3 bus event for that
bank.

Commit 59029136d7 ("powerpc/perf: Add constraints for power9 l2/l3 bus events")
enforced this rule in Power9. But this is not valid for
Power10, since in Power10 Monitor Mode Control Register2
(MMCR2) has bits to configure l2/l3 event bits. Hence remove
this PMC4 constraint check from power10.

Since the l2/l3 bits in MMCR2 are not per-pmc, patch handles
group constrints checks for l2/l3 bits in MMCR2.

Fixes: a64e697cef ("powerpc/perf: power10 Performance Monitoring support")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1606409684-1589-3-git-send-email-atrajeev@linux.vnet.ibm.com
2020-12-04 01:01:28 +11:00
Athira Rajeev
d3afd28cd2 powerpc/perf: Fix to update radix_scope_qual in power10
power10 uses bit 9 of the raw event code as RADIX_SCOPE_QUAL.
This bit is used for enabling the radix process events.
Patch fixes the PMU counter support functions to program bit
18 of MMCR1 ( Monitor Mode Control Register1 ) with the
RADIX_SCOPE_QUAL bit value. Since this field is not per-pmc,
add this to PMU group constraints to make sure events in a
group will have same bit value for this field. Use bit 21 as
constraint bit field for radix_scope_qual. Patch also updates
the power10 raw event encoding layout information, format field
and constraints bit layout to include the radix_scope_qual bit.

Fixes: a64e697cef ("powerpc/perf: power10 Performance Monitoring support")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1606409684-1589-2-git-send-email-atrajeev@linux.vnet.ibm.com
2020-12-04 01:01:28 +11:00
Athira Rajeev
f66de7ac48 powerpc/perf: Invoke per-CPU variable access with disabled interrupts
The power_pmu_event_init() callback access per-cpu variable
(cpu_hw_events) to check for event constraints and Branch Stack
(BHRB). Current usage is to disable preemption when accessing the
per-cpu variable, but this does not prevent timer callback from
interrupting event_init. Fix this by using 'local_irq_save/restore'
to make sure the code path is invoked with disabled interrupts.

This change is tested in mambo simulator to ensure that, if a timer
interrupt comes in during the per-cpu access in event_init, it will be
soft masked and replayed later. For testing purpose, introduced a
udelay() in power_pmu_event_init() to make sure a timer interrupt arrives
while in per-cpu variable access code between local_irq_save/resore.
As expected the timer interrupt was replayed later during local_irq_restore
called from power_pmu_event_init. This was confirmed by adding
breakpoint in mambo and checking the backtrace when timer_interrupt
was hit.

Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1606814880-1720-1-git-send-email-atrajeev@linux.vnet.ibm.com
2020-12-04 01:01:21 +11:00
Christophe Leroy
91bf695596 powerpc/vdso: Retrieve sigtramp offsets at buildtime
This is copied from arm64.

Instead of using runtime generated signal trampoline offsets,
get offsets at buildtime.

If the said trampoline doesn't exist, build will fail. So no
need to check whether the trampoline exists or not in the VDSO.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f8bfd6812c3e3678b1cdb4d55a52f9eb022b40d3.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:17 +11:00
Christophe Leroy
c102f07667 powerpc/vdso: Replace vdso_base by vdso
All other architectures but s390 use a void pointer named 'vdso'
to reference the VDSO mapping.

In a following patch, the VDSO data page will be put in front of
text, vdso_base will then not anymore point to VDSO text.

To avoid confusion between vdso_base and VDSO text, rename vdso_base
into vdso and make it a void __user *.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/8e6cefe474aa4ceba028abb729485cd46c140990.1601197618.git.christophe.leroy@csgroup.eu
2020-12-04 01:01:16 +11:00
Athira Rajeev
f75e7d73bd powerpc/perf: Fix crash with is_sier_available when pmu is not set
On systems without any specific PMU driver support registered, running
'perf record' with —intr-regs  will crash ( perf record -I <workload> ).

The relevant portion from crash logs and Call Trace:

Unable to handle kernel paging request for data at address 0x00000068
Faulting instruction address: 0xc00000000013eb18
Oops: Kernel access of bad area, sig: 11 [#1]
CPU: 2 PID: 13435 Comm: kill Kdump: loaded Not tainted 4.18.0-193.el8.ppc64le #1
NIP:  c00000000013eb18 LR: c000000000139f2c CTR: c000000000393d80
REGS: c0000004a07ab4f0 TRAP: 0300   Not tainted  (4.18.0-193.el8.ppc64le)
NIP [c00000000013eb18] is_sier_available+0x18/0x30
LR [c000000000139f2c] perf_reg_value+0x6c/0xb0
Call Trace:
[c0000004a07ab770] [c0000004a07ab7c8] 0xc0000004a07ab7c8 (unreliable)
[c0000004a07ab7a0] [c0000000003aa77c] perf_output_sample+0x60c/0xac0
[c0000004a07ab840] [c0000000003ab3f0] perf_event_output_forward+0x70/0xb0
[c0000004a07ab8c0] [c00000000039e208] __perf_event_overflow+0x88/0x1a0
[c0000004a07ab910] [c00000000039e42c] perf_swevent_hrtimer+0x10c/0x1d0
[c0000004a07abc50] [c000000000228b9c] __hrtimer_run_queues+0x17c/0x480
[c0000004a07abcf0] [c00000000022aaf4] hrtimer_interrupt+0x144/0x520
[c0000004a07abdd0] [c00000000002a864] timer_interrupt+0x104/0x2f0
[c0000004a07abe30] [c0000000000091c4] decrementer_common+0x114/0x120

When perf record session is started with "-I" option, capturing registers
on each sample calls is_sier_available() to check for the
SIER (Sample Instruction Event Register) availability in the platform.
This function in core-book3s accesses 'ppmu->flags'. If a platform specific
PMU driver is not registered, ppmu is set to NULL and accessing its
members results in a crash. Fix the crash by returning false in
is_sier_available() if ppmu is not set.

Fixes: 333804dc3b ("powerpc/perf: Update perf_regs structure to include SIER")
Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1606185640-1720-1-git-send-email-atrajeev@linux.vnet.ibm.com
2020-12-04 01:01:09 +11:00
Peter Zijlstra
20c7775aec Merge remote-tracking branch 'origin/master' into perf/core
Further perf/core patches will depend on:

  d3f7b1bb20 ("mm/gup: fix gup_fast with dynamic page table folding")

which is already in Linus' tree.
2020-11-26 13:16:55 +01:00
Madhavan Srinivasan
2ca13a4cc5 powerpc/perf: Use regs->nip when SIAR is zero
In power10 DD1, there is an issue where the SIAR (Sampled Instruction
Address Register) is not latching to the sampled address during random
sampling. This results in value of 0s in the SIAR. Add a check to use
regs->nip when SIAR is zero.

Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201021085329.384535-5-maddy@linux.ibm.com
2020-11-19 16:56:56 +11:00
Athira Rajeev
d9f7088dd6 powerpc/perf: Use the address from SIAR register to set cpumode flags
While setting the processor mode for any sample, perf_get_misc_flags()
expects the privilege level to differentiate the userspace and kernel
address. On power10 DD1, there is an issue that causes MSR_HV MSR_PR
bits of Sampled Instruction Event Register (SIER) not to be set for
marked events. Hence add a check to use the address in SIAR (Sampled
Instruction Address Register) to identify the privilege level.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201021085329.384535-3-maddy@linux.ibm.com
2020-11-19 16:56:56 +11:00
Athira Rajeev
fdf13a6575 powerpc/perf: Drop the check for SIAR_VALID
In power10 DD1, there is an issue that causes the SIAR_VALID bit of
the SIER (Sampled Instruction Event Register) to not be set. But the
SIAR_VALID bit is used for fetching the instruction address from the
SIAR (Sampled Instruction Address Register), and marked events are
sampled only if the SIAR_VALID bit is set. So drop the check for
SIAR_VALID and return true always incase of power10 DD1.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201021085329.384535-2-maddy@linux.ibm.com
2020-11-19 16:56:56 +11:00
Athira Rajeev
9e8d13697c powerpc/perf: Add new power PMU flag "PPMU_P10_DD1" for power10 DD1
Add a new power PMU flag "PPMU_P10_DD1" which can be used to
conditionally add any code path for power10 DD1 processor version.
Also modify power10 PMU driver code to set this flag only for DD1,
based on the Processor Version Register (PVR) value.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201021085329.384535-1-maddy@linux.ibm.com
2020-11-19 16:56:55 +11:00
Nicholas Piggin
987c426320 powerpc/64s/perf: perf interrupt does not have to get_user_pages to access user memory
read_user_stack_slow that walks user address translation by hand is
only required on hash, because a hash fault can not be serviced from
"NMI" context (to avoid re-entering the hash code) so the user stack
can be mapped into Linux page tables but not accessible by the CPU.

Radix MMU mode does not have this restriction. A page fault failure
would indicate the page is not accessible via get_user_pages either,
so avoid this on radix.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201111120151.3150658-1-npiggin@gmail.com
2020-11-19 16:56:52 +11:00
Nick Desaulniers
49a4136505 powerpc: fix -Wimplicit-fallthrough
The "fallthrough" pseudo-keyword was added as a portable way to denote
intentional fallthrough. Clang will still warn on cases where there is a
fallthrough to an immediate break. Add explicit breaks for those cases.

Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://github.com/ClangBuiltLinux/linux/issues/236
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-11-18 14:18:09 -06:00
Peter Zijlstra
76a4efa809 perf/arch: Remove perf_sample_data::regs_user_copy
struct perf_sample_data lives on-stack, we should be careful about it's
size. Furthermore, the pt_regs copy in there is only because x86_64 is a
trainwreck, solve it differently.

Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Steven Rostedt <rostedt@goodmis.org>
Link: https://lkml.kernel.org/r/20201030151955.258178461@infradead.org
2020-11-09 18:12:34 +01:00
Peter Zijlstra
267fb27352 perf: Reduce stack usage of perf_output_begin()
__perf_output_begin() has an on-stack struct perf_sample_data in the
unlikely case it needs to generate a LOST record. However, every call
to perf_output_begin() must already have a perf_sample_data on-stack.

Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20201030151954.985416146@infradead.org
2020-11-09 18:12:33 +01:00
Kan Liang
4cb6a42e4c powerpc/perf: Support PERF_SAMPLE_DATA_PAGE_SIZE
The new sample type, PERF_SAMPLE_DATA_PAGE_SIZE, requires the virtual
address. Update the data->addr if the sample type is set.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20201001135749.2804-4-kan.liang@linux.intel.com
2020-10-29 11:00:39 +01:00
Kajol Jain
09b791d955 powerpc/hv-gpci: Add sysfs files inside hv-gpci device to show cpumask
Patch here adds a cpumask attr to hv_gpci pmu along with ABI documentation.

Primary use to expose the cpumask is for the perf tool which has the
capability to parse the driver sysfs folder and understand the
cpumask file. Having cpumask file will reduce the number of perf command
line parameters (will avoid "-C" option in the perf tool
command line). It can also notify the user which is
the current cpu used to retrieve the counter data.

command:# cat /sys/devices/hv_gpci/cpumask
0

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201003074943.338618-5-kjain@linux.ibm.com
2020-10-07 22:34:49 +11:00
Kajol Jain
dcb5cdf60a powerpc/perf/hv-gpci: Add cpu hotplug support
Patch here adds cpu hotplug functions to hv_gpci pmu.
A new cpuhp_state "CPUHP_AP_PERF_POWERPC_HV_GPCI_ONLINE" enum
is added.

The online callback function updates the cpumask only if its
empty. As the primary intention of adding hotplug support
is to designate a CPU to make HCALL to collect the
counter data.

The offline function test and clear corresponding cpu in a cpumask
and update cpumask to any other active cpu.

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201003074943.338618-4-kjain@linux.ibm.com
2020-10-07 22:34:48 +11:00
Kajol Jain
0f9866f7e8 powerpc/perf/hv-gpci: Fix starting index value
Commit 9e9f601084 ("powerpc/perf/{hv-gpci, hv-common}: generate
requests with counters annotated") adds a framework for defining
gpci counters.
In this patch, they adds starting_index value as '0xffffffffffffffff'.
which is wrong as starting_index is of size 32 bits.

Because of this, incase we try to run hv-gpci event we get error.

In power9 machine:

command#: perf stat -e hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
          -C 0 -I 1000
event syntax error: '..bie_count_and_time_tlbie_instructions_issued/'
                                  \___ value too big for format, maximum is 4294967295

This patch fix this issue and changes starting_index value to '0xffffffff'

After this patch:

command#: perf stat -e hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ -C 0 -I 1000
     1.000085786              1,024      hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
     2.000287818              1,024      hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
     2.439113909             17,408      hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/

Fixes: 9e9f601084 ("powerpc/perf/{hv-gpci, hv-common}: generate requests with counters annotated")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201003074943.338618-1-kjain@linux.ibm.com
2020-10-07 22:34:48 +11:00
Athira Rajeev
3b6c3adbb2 powerpc/perf: Exclude pmc5/6 from the irrelevant PMU group constraints
PMU counter support functions enforces event constraints for group of
events to check if all events in a group can be monitored. Incase of
event codes using PMC5 and PMC6 ( 500fa and 600f4 respectively ), not
all constraints are applicable, say the threshold or sample bits. But
current code includes pmc5 and pmc6 in some group constraints (like
IC_DC Qualifier bits) which is actually not applicable and hence
results in those events not getting counted when scheduled along with
group of other events. Patch fixes this by excluding PMC5/6 from
constraints which are not relevant for it.

Fixes: 7ffd948 ("powerpc/perf: factor out power8 pmu functions")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1600672204-1610-1-git-send-email-atrajeev@linux.vnet.ibm.com
2020-10-06 23:22:27 +11:00
Michael Ellerman
d10ebe79df powerpc/perf: Add declarations to fix sparse warnings
Sparse warns about all the init functions:
  symbol init_ppc970_pmu was not declared. Should it be static?
  symbol init_power5p_pmu was not declared. Should it be static?
  symbol init_power5_pmu was not declared. Should it be static?
  symbol init_power6_pmu was not declared. Should it be static?
  symbol init_power7_pmu was not declared. Should it be static?
  symbol init_power9_pmu was not declared. Should it be static?
  symbol init_power8_pmu was not declared. Should it be static?
  symbol init_generic_compat_pmu was not declared. Should it be static?

They're already declared in internal.h, so just make sure all the C
files include that directly or indirectly.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://lore.kernel.org/r/20200916115637.3100484-2-mpe@ellerman.id.au
2020-09-18 19:59:43 +10:00
Michael Ellerman
960e370813 Merge branch 'fixes' into next
Bring in our fixes branch for this cycle which avoids some small
conflicts with upcoming commits.
2020-09-14 22:57:18 +10:00
Scott Cheloha
59562b5c33 powerpc/perf: consolidate GPCI hcall structs into asm/hvcall.h
The H_GetPerformanceCounterInfo (GPCI) hypercall input/output structs are
useful to modules outside of perf/, so move them into asm/hvcall.h to live
alongside the other powerpc hypercall structs.

Leave the perf-specific GPCI stuff in perf/hv-gpci.h.

Signed-off-by: Scott Cheloha <cheloha@linux.ibm.com>
Acked-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200727184605.2945095-1-cheloha@linux.ibm.com
2020-09-02 11:00:20 +10:00
Athira Rajeev
82715a0f33 powerpc/perf: Fix reading of MSR[HV/PR] bits in trace-imc
IMC trace-mode uses MSR[HV/PR] bits to set the cpumode for the
instruction pointer captured in each sample. The bits are fetched from
the third double word of the trace record. Reading third double word
from IMC trace record should use be64_to_cpu() along with READ_ONCE
inorder to fetch correct MSR[HV/PR] bits. Patch addresses this change.

Currently we are using PERF_RECORD_MISC_HYPERVISOR as cpumode if MSR
HV is 1 and PR is 0 which means the address is from host counter. But
using PERF_RECORD_MISC_HYPERVISOR for host counter data will fail to
resolve the address -> symbol during "perf report" because perf tools
side uses PERF_RECORD_MISC_KERNEL to represent the host counter data.
Therefore, fix the trace imc sample data to use
PERF_RECORD_MISC_KERNEL as cpumode for host kernel information.

Fixes: 77ca3951cc ("powerpc/perf: Add kernel support for new MSR[HV PR] bits in trace-imc")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1598424029-1662-1-git-send-email-atrajeev@linux.vnet.ibm.com
2020-08-27 17:41:45 +10:00
Alexey Kardashevskiy
b460b51241 powerpc/perf: Fix crashes with generic_compat_pmu & BHRB
The bhrb_filter_map ("The Branch History Rolling Buffer") callback is
only defined in raw CPUs' power_pmu structs. The "architected" CPUs
use generic_compat_pmu, which does not have this callback, and crashes
occur if a user tries to enable branch stack for an event.

This add a NULL pointer check for bhrb_filter_map() which behaves as
if the callback returned an error.

This does not add the same check for config_bhrb() as the only caller
checks for cpuhw->bhrb_users which remains zero if bhrb_filter_map==0.

Fixes: be80e758d0 ("powerpc/perf: Add generic compat mode pmu driver")
Cc: stable@vger.kernel.org # v5.2+
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200602025612.62707-1-aik@ozlabs.ru
2020-08-27 17:41:44 +10:00
zhengbin
ef23cf9a89 powerpc/perf: Remove set but not used variable 'target'
Fix gcc '-Wunused-but-set-variable' warning:

arch/powerpc/perf/imc-pmu.c: In function trace_imc_event_init:
arch/powerpc/perf/imc-pmu.c:1292:22: warning: variable target set but not used [-Wunused-but-set-variable]

It is introduced by commit 012ae24484 ("powerpc/perf:
Trace imc PMU functions"), but never used, so remove it.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1574144074-142032-3-git-send-email-zhengbin13@huawei.com
2020-08-25 01:31:32 +10:00
Kajol Jain
64ef8f2c47 powerpc/perf/hv-24x7: Move cpumask file to top folder of hv-24x7 driver
Commit 792f73f747 ("powerpc/hv-24x7: Add sysfs files inside hv-24x7
device to show cpumask") added cpumask file as part of hv-24x7 driver
inside the interface folder. The cpumask file is supposed to be in the
top folder of the PMU driver in order to make hotplug work.

This patch fixes that issue and creates new group 'cpumask_attr_group'
to add cpumask file and make sure it added in top folder.

  command:# cat /sys/devices/hv_24x7/cpumask
  0

Fixes: 792f73f747 ("powerpc/hv-24x7: Add sysfs files inside hv-24x7 device to show cpumask")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200821080610.123997-1-kjain@linux.ibm.com
2020-08-21 23:35:27 +10:00
Athira Rajeev
17899eaf88 powerpc/perf: Fix soft lockups due to missed interrupt accounting
Performance monitor interrupt handler checks if any counter has
overflown and calls record_and_restart() in core-book3s which invokes
perf_event_overflow() to record the sample information. Apart from
creating sample, perf_event_overflow() also does the interrupt and
period checks via perf_event_account_interrupt().

Currently we record information only if the SIAR (Sampled Instruction
Address Register) valid bit is set (using siar_valid() check) and
hence the interrupt check.

But it is possible that we do sampling for some events that are not
generating valid SIAR, and hence there is no chance to disable the
event if interrupts are more than max_samples_per_tick. This leads to
soft lockup.

Fix this by adding perf_event_account_interrupt() in the invalid SIAR
code path for a sampling event. ie if SIAR is invalid, just do
interrupt check and don't record the sample information.

Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1596717992-7321-1-git-send-email-atrajeev@linux.vnet.ibm.com
2020-08-20 20:29:09 +10:00
Athira Rajeev
d735599a06 powerpc/perf: Add extended regs support for power10 platform
Include capability flag PERF_PMU_CAP_EXTENDED_REGS for power10 and
expose MMCR3, SIER2, SIER3 registers as part of extended regs. Also
introduce PERF_REG_PMU_MASK_31 to define extended mask value at
runtime for power10.

Suggested-by: Ryan Grimm <grimm@linux.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Nageswara R Sastry <nasastry@in.ibm.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-and-tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1596794701-23530-3-git-send-email-atrajeev@linux.vnet.ibm.com
2020-08-17 13:11:22 +10:00
Anju T Sudhakar
781fa4811d powerpc/perf: Add support for outputting extended regs in perf intr_regs
Add support for perf extended register capability in powerpc. The
capability flag PERF_PMU_CAP_EXTENDED_REGS, is used to indicate the
PMU which support extended registers. The generic code define the mask
of extended registers as 0 for non supported architectures.

Patch adds extended regs support for power9 platform by exposing
MMCR0, MMCR1 and MMCR2 registers.

REG_RESERVED mask needs update to include extended regs.
PERF_REG_EXTENDED_MASK, contains mask value of the supported
registers, is defined at runtime in the kernel based on platform since
the supported registers may differ from one processor version to
another and hence the MASK value.

With the patch:

  available registers: r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 r10 r11
  r12 r13 r14 r15 r16 r17 r18 r19 r20 r21 r22 r23 r24 r25 r26
  r27 r28 r29 r30 r31 nip msr orig_r3 ctr link xer ccr softe
  trap dar dsisr sier mmcra mmcr0 mmcr1 mmcr2

  PERF_RECORD_SAMPLE(IP, 0x1): 4784/4784: 0 period: 1 addr: 0
  ... intr regs: mask 0xffffffffffff ABI 64-bit
  .... r0    0xc00000000012b77c
  .... r1    0xc000003fe5e03930
  .... r2    0xc000000001b0e000
  .... r3    0xc000003fdcddf800
  .... r4    0xc000003fc7880000
  .... r5    0x9c422724be
  .... r6    0xc000003fe5e03908
  .... r7    0xffffff63bddc8706
  .... r8    0x9e4
  .... r9    0x0
  .... r10   0x1
  .... r11   0x0
  .... r12   0xc0000000001299c0
  .... r13   0xc000003ffffc4800
  .... r14   0x0
  .... r15   0x7fffdd8b8b00
  .... r16   0x0
  .... r17   0x7fffdd8be6b8
  .... r18   0x7e7076607730
  .... r19   0x2f
  .... r20   0xc00000001fc26c68
  .... r21   0xc0002041e4227e00
  .... r22   0xc00000002018fb60
  .... r23   0x1
  .... r24   0xc000003ffec4d900
  .... r25   0x80000000
  .... r26   0x0
  .... r27   0x1
  .... r28   0x1
  .... r29   0xc000000001be1260
  .... r30   0x6008010
  .... r31   0xc000003ffebb7218
  .... nip   0xc00000000012b910
  .... msr   0x9000000000009033
  .... orig_r3 0xc00000000012b86c
  .... ctr   0xc0000000001299c0
  .... link  0xc00000000012b77c
  .... xer   0x0
  .... ccr   0x28002222
  .... softe 0x1
  .... trap  0xf00
  .... dar   0x0
  .... dsisr 0x80000000000
  .... sier  0x0
  .... mmcra 0x80000000000
  .... mmcr0 0x82008090
  .... mmcr1 0x1e000000
  .... mmcr2 0x0
   ... thread: perf:4784

Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Nageswara R Sastry <nasastry@in.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-and-tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1596794701-23530-2-git-send-email-atrajeev@linux.vnet.ibm.com
2020-08-17 13:11:22 +10:00
Linus Torvalds
25d8d4eeca powerpc updates for 5.9
- Add support for (optionally) using queued spinlocks & rwlocks.
 
  - Support for a new faster system call ABI using the scv instruction on Power9
    or later.
 
  - Drop support for the PROT_SAO mmap/mprotect flag as it will be unsupported on
    Power10 and future processors, leaving us with no way to implement the
    functionality it requests. This risks breaking userspace, though we believe
    it is unused in practice.
 
  - A bug fix for, and then the removal of, our custom stack expansion checking.
    We now allow stack expansion up to the rlimit, like other architectures.
 
  - Remove the remnants of our (previously disabled) topology update code, which
    tried to react to NUMA layout changes on virtualised systems, but was prone
    to crashes and other problems.
 
  - Add PMU support for Power10 CPUs.
 
  - A change to our signal trampoline so that we don't unbalance the link stack
    (branch return predictor) in the signal delivery path.
 
  - Lots of other cleanups, refactorings, smaller features and so on as usual.
 
 Thanks to:
   Abhishek Goel, Alastair D'Silva, Alexander A. Klimov, Alexey Kardashevskiy,
   Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Anton
   Blanchard, Arnd Bergmann, Athira Rajeev, Balamuruhan S, Bharata B Rao, Bill
   Wendling, Bin Meng, Cédric Le Goater, Chris Packham, Christophe Leroy,
   Christoph Hellwig, Daniel Axtens, Dan Williams, David Lamparter, Desnes A.
   Nunes do Rosario, Erhard F., Finn Thain, Frederic Barrat, Ganesh Goudar,
   Gautham R. Shenoy, Geoff Levand, Greg Kurz, Gustavo A. R. Silva, Hari Bathini,
   Harish, Imre Kaloz, Joel Stanley, Joe Perches, John Crispin, Jordan Niethe,
   Kajol Jain, Kamalesh Babulal, Kees Cook, Laurent Dufour, Leonardo Bras, Li
   RongQing, Madhavan Srinivasan, Mahesh Salgaonkar, Mark Cave-Ayland, Michal
   Suchanek, Milton Miller, Mimi Zohar, Murilo Opsfelder Araujo, Nathan
   Chancellor, Nathan Lynch, Naveen N. Rao, Nayna Jain, Nicholas Piggin, Oliver
   O'Halloran, Palmer Dabbelt, Pedro Miraglia Franco de Carvalho, Philippe
   Bergheaud, Pingfan Liu, Pratik Rajesh Sampat, Qian Cai, Qinglang Miao, Randy
   Dunlap, Ravi Bangoria, Sachin Sant, Sam Bobroff, Sandipan Das, Santosh
   Sivaraj, Satheesh Rajendran, Shirisha Ganta, Sourabh Jain, Srikar Dronamraju,
   Stan Johnson, Stephen Rothwell, Thadeu Lima de Souza Cascardo, Thiago Jung
   Bauermann, Tom Lane, Vaibhav Jain, Vladis Dronov, Wei Yongjun, Wen Xiong,
   YueHaibing.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl8tOxATHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgDQfEAClXHWf6hnxB84bEu39D51NkVotL1IG
 BRWFvyix+xHuUkHIouBPAAMl6ngY5X6wkYd+Z+CY9zHNtdSDoVlJE30YXdMQA/dE
 L/rYxR1884yGR/uU/3wusboO68ReXwcKQPmKOymUfh0zH7ujyJsSWLpXFK1YDC5d
 2TVVTi0Q+P5ucMHDh0L+AHirIxZvtZSp43+J7xLtywsj+XAxJWCTGo5WCJbdgbCA
 Qbv3aOkVyUa3EgsbdM/STPpv82ebqT+PHxeSIO4Jw6ZODtKRH0R5YsWCApuY9eZ+
 ebY9RLmgv9ZAhJqB2fv9A5NDcMoGpZNmjM7HrWpXwULKQpkBGHCzJ9FcSdHVMOx8
 nbVMFjt4uzLwV1w8lFYslQ2tNH/uH2o9BlryV1RLpiiKokDAJO/NOsWN9y0u/I4J
 EmAM5DSX2LgVvvas96IlGK8KX4xkOkf8FLX/H5UDvvAfloH8J4CZXk/CWCab/nqY
 KEHPnMmYvQZ1w9SzyZg9sO/1p6Bl1Gmm75Jv2F1lBiRW/42VcGBI/qLsJ4lC59Fc
 KbwufYNYYG38wbxDLW1HAPJhRonxIcaZj3EEqk7aTiLZ55nNbu8e2k32CpNXTGqt
 npOhzJHimcq7L6+878ZW+xpbZwogIEUdRSsmwb6aT8za3ShnYwSA2Q3LYxh9xyGH
 j3GifvPq6Efp3Q==
 =QMY1
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Add support for (optionally) using queued spinlocks & rwlocks.

 - Support for a new faster system call ABI using the scv instruction on
   Power9 or later.

 - Drop support for the PROT_SAO mmap/mprotect flag as it will be
   unsupported on Power10 and future processors, leaving us with no way
   to implement the functionality it requests. This risks breaking
   userspace, though we believe it is unused in practice.

 - A bug fix for, and then the removal of, our custom stack expansion
   checking. We now allow stack expansion up to the rlimit, like other
   architectures.

 - Remove the remnants of our (previously disabled) topology update
   code, which tried to react to NUMA layout changes on virtualised
   systems, but was prone to crashes and other problems.

 - Add PMU support for Power10 CPUs.

 - A change to our signal trampoline so that we don't unbalance the link
   stack (branch return predictor) in the signal delivery path.

 - Lots of other cleanups, refactorings, smaller features and so on as
   usual.

Thanks to: Abhishek Goel, Alastair D'Silva, Alexander A. Klimov, Alexey
Kardashevskiy, Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju
T Sudhakar, Anton Blanchard, Arnd Bergmann, Athira Rajeev, Balamuruhan
S, Bharata B Rao, Bill Wendling, Bin Meng, Cédric Le Goater, Chris
Packham, Christophe Leroy, Christoph Hellwig, Daniel Axtens, Dan
Williams, David Lamparter, Desnes A. Nunes do Rosario, Erhard F., Finn
Thain, Frederic Barrat, Ganesh Goudar, Gautham R. Shenoy, Geoff Levand,
Greg Kurz, Gustavo A. R. Silva, Hari Bathini, Harish, Imre Kaloz, Joel
Stanley, Joe Perches, John Crispin, Jordan Niethe, Kajol Jain, Kamalesh
Babulal, Kees Cook, Laurent Dufour, Leonardo Bras, Li RongQing, Madhavan
Srinivasan, Mahesh Salgaonkar, Mark Cave-Ayland, Michal Suchanek, Milton
Miller, Mimi Zohar, Murilo Opsfelder Araujo, Nathan Chancellor, Nathan
Lynch, Naveen N. Rao, Nayna Jain, Nicholas Piggin, Oliver O'Halloran,
Palmer Dabbelt, Pedro Miraglia Franco de Carvalho, Philippe Bergheaud,
Pingfan Liu, Pratik Rajesh Sampat, Qian Cai, Qinglang Miao, Randy
Dunlap, Ravi Bangoria, Sachin Sant, Sam Bobroff, Sandipan Das, Santosh
Sivaraj, Satheesh Rajendran, Shirisha Ganta, Sourabh Jain, Srikar
Dronamraju, Stan Johnson, Stephen Rothwell, Thadeu Lima de Souza
Cascardo, Thiago Jung Bauermann, Tom Lane, Vaibhav Jain, Vladis Dronov,
Wei Yongjun, Wen Xiong, YueHaibing.

* tag 'powerpc-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (337 commits)
  selftests/powerpc: Fix pkey syscall redefinitions
  powerpc: Fix circular dependency between percpu.h and mmu.h
  powerpc/powernv/sriov: Fix use of uninitialised variable
  selftests/powerpc: Skip vmx/vsx/tar/etc tests on older CPUs
  powerpc/40x: Fix assembler warning about r0
  powerpc/papr_scm: Add support for fetching nvdimm 'fuel-gauge' metric
  powerpc/papr_scm: Fetch nvdimm performance stats from PHYP
  cpuidle: pseries: Fixup exit latency for CEDE(0)
  cpuidle: pseries: Add function to parse extended CEDE records
  cpuidle: pseries: Set the latency-hint before entering CEDE
  selftests/powerpc: Fix online CPU selection
  powerpc/perf: Consolidate perf_callchain_user_[64|32]()
  powerpc/pseries/hotplug-cpu: Remove double free in error path
  powerpc/pseries/mobility: Add pr_debug() for device tree changes
  powerpc/pseries/mobility: Set pr_fmt()
  powerpc/cacheinfo: Warn if cache object chain becomes unordered
  powerpc/cacheinfo: Improve diagnostics about malformed cache lists
  powerpc/cacheinfo: Use name@unit instead of full DT path in debug messages
  powerpc/cacheinfo: Set pr_fmt()
  powerpc: fix function annotations to avoid section mismatch warnings with gcc-10
  ...
2020-08-07 10:33:50 -07:00
Michal Suchanek
d3a133aa0e powerpc/perf: Consolidate perf_callchain_user_[64|32]()
perf_callchain_user_64() and perf_callchain_user_32() are nearly
identical. Consolidate into one function with thin wrappers.

Suggested-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
[mpe: Adapt to copy_from_user_nofault(), minor formatting]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200406210022.32265-1-msuchanek@suse.de
2020-07-30 22:53:49 +10:00
Nicholas Piggin
909adfc66b powerpc/64s/hash: Fix hash_preload running with interrupts enabled
Commit 2f92447f9f ("powerpc/book3s64/hash: Use the pte_t address from the
caller") removed the local_irq_disable from hash_preload, but it was
required for more than just the page table walk: the hash pte busy bit is
effectively a lock which may be taken in interrupt context, and the local
update flag test must not be preempted before it's used.

This solves apparent lockups with perf interrupting __hash_page_64K. If
get_perf_callchain then also takes a hash fault on the same page while it
is already locked, it will loop forever taking hash faults, which looks like
this:

  cpu 0x49e: Vector: 100 (System Reset) at [c00000001a4f7d70]
      pc: c000000000072dc8: hash_page_mm+0x8/0x800
      lr: c00000000000c5a4: do_hash_page+0x24/0x38
      sp: c0002ac1cc69ac70
     msr: 8000000000081033
    current = 0xc0002ac1cc602e00
    paca    = 0xc00000001de1f280   irqmask: 0x03   irq_happened: 0x01
      pid   = 20118, comm = pread2_processe
  Linux version 5.8.0-rc6-00345-g1fad14f18bc6
  49e:mon> t
  [c0002ac1cc69ac70] c00000000000c5a4 do_hash_page+0x24/0x38 (unreliable)
  --- Exception: 300 (Data Access) at c00000000008fa60 __copy_tofrom_user_power7+0x20c/0x7ac
  [link register   ] c000000000335d10 copy_from_user_nofault+0xf0/0x150
  [c0002ac1cc69af70] c00032bf9fa3c880 (unreliable)
  [c0002ac1cc69afa0] c000000000109df0 read_user_stack_64+0x70/0xf0
  [c0002ac1cc69afd0] c000000000109fcc perf_callchain_user_64+0x15c/0x410
  [c0002ac1cc69b060] c000000000109c00 perf_callchain_user+0x20/0x40
  [c0002ac1cc69b080] c00000000031c6cc get_perf_callchain+0x25c/0x360
  [c0002ac1cc69b120] c000000000316b50 perf_callchain+0x70/0xa0
  [c0002ac1cc69b140] c000000000316ddc perf_prepare_sample+0x25c/0x790
  [c0002ac1cc69b1a0] c000000000317350 perf_event_output_forward+0x40/0xb0
  [c0002ac1cc69b220] c000000000306138 __perf_event_overflow+0x88/0x1a0
  [c0002ac1cc69b270] c00000000010cf70 record_and_restart+0x230/0x750
  [c0002ac1cc69b620] c00000000010d69c perf_event_interrupt+0x20c/0x510
  [c0002ac1cc69b730] c000000000027d9c performance_monitor_exception+0x4c/0x60
  [c0002ac1cc69b750] c00000000000b2f8 performance_monitor_common_virt+0x1b8/0x1c0
  --- Exception: f00 (Performance Monitor) at c0000000000cb5b0 pSeries_lpar_hpte_insert+0x0/0x160
  [link register   ] c0000000000846f0 __hash_page_64K+0x210/0x540
  [c0002ac1cc69ba50] 0000000000000000 (unreliable)
  [c0002ac1cc69bb00] c000000000073ae0 update_mmu_cache+0x390/0x3a0
  [c0002ac1cc69bb70] c00000000037f024 wp_page_copy+0x364/0xce0
  [c0002ac1cc69bc20] c00000000038272c do_wp_page+0xdc/0xa60
  [c0002ac1cc69bc70] c0000000003857bc handle_mm_fault+0xb9c/0x1b60
  [c0002ac1cc69bd50] c00000000006c434 __do_page_fault+0x314/0xc90
  [c0002ac1cc69be20] c00000000000c5c8 handle_page_fault+0x10/0x2c
  --- Exception: 300 (Data Access) at 00007fff8c861fe8
  SP (7ffff6b19660) is in userspace

Fixes: 2f92447f9f ("powerpc/book3s64/hash: Use the pte_t address from the caller")
Reported-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reported-by: Anton Blanchard <anton@ozlabs.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200727060947.10060-1-npiggin@gmail.com
2020-07-27 17:02:09 +10:00
Athira Rajeev
1cade527f6 powerpc/perf: BHRB control to disable BHRB logic when not used
PowerISA v3.1 has few updates for the Branch History Rolling
Buffer(BHRB).

BHRB disable is controlled via Monitor Mode Control Register A (MMCRA)
bit, namely "BHRB Recording Disable (BHRBRD)". This field controls
whether BHRB entries are written when BHRB recording is enabled by
other bits. This patch implements support for this BHRB disable bit.
By setting 0b1 to this bit will disable the BHRB and by setting 0b0 to
this bit will have BHRB enabled. This addresses backward
compatibility (for older OS), since this bit will be cleared and
hardware will be writing to BHRB by default.

This patch addresses changes to set MMCRA (BHRBRD) at boot for
power10 (there by the core will run faster) and enable this feature
only on runtime ie, on explicit need from user. Also save/restore
MMCRA in the restore path of state-loss idle state to make sure we
keep BHRB disabled if it was not enabled on request at runtime.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1594996707-3727-12-git-send-email-atrajeev@linux.vnet.ibm.com
2020-07-22 21:56:42 +10:00
Athira Rajeev
80350a4bac powerpc/perf: Add Power10 BHRB filter support for PERF_SAMPLE_BRANCH_IND_CALL/COND
PowerISA v3.1 introduce filtering support for
PERF_SAMPLE_BRANCH_IND_CALL/COND. The patch adds BHRB filter
support for "ind_call" and "cond" in power10_bhrb_filter_map().

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1594996707-3727-11-git-send-email-atrajeev@linux.vnet.ibm.com
2020-07-22 21:56:42 +10:00
Athira Rajeev
bfe3b1945d powerpc/perf: Ignore the BHRB kernel address filtering for P10
Commit bb19af8160 ("powerpc/perf: Prevent kernel address leak to
userspace via BHRB buffer") added a check in bhrb_read() to filter
the kernel address from BHRB buffer. This patch modified it to avoid
that check for PowerISA v3.1 based processors, since PowerISA v3.1
allows only MSR[PR]=1 address to be written to BHRB buffer.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1594996707-3727-10-git-send-email-atrajeev@linux.vnet.ibm.com
2020-07-22 21:56:41 +10:00
Athira Rajeev
a64e697cef powerpc/perf: power10 Performance Monitoring support
Base enablement patch to register performance monitoring hardware
support for power10. Patch introduce the raw event encoding format,
defines the supported list of events, config fields for the event
attributes and their corresponding bit values which are exported via
sysfs.

Patch also enhances the support function in isa207_common.c to include
power10 pmu hardware.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1594996707-3727-9-git-send-email-atrajeev@linux.vnet.ibm.com
2020-07-22 21:56:41 +10:00
Madhavan Srinivasan
9908c826d5 powerpc/perf: Add Power10 PMU feature to DT CPU features
Add Power10 feature function to DT CPU features, along with a Power10
specific init() to initialize PMU SPRs, sets the oprofile_cpu_type and
cpu_features. This will enable performance monitoring unit (PMU) for
Power10 in CPU features with "performance-monitor-power10".

For Power ISA v3.1, BHRB disable is controlled via Monitor Mode
Control Register A (MMCRA) bit, namely "BHRB Recording
Disable (BHRBRD)". This patch initializes MMCRA BHRBRD to disable BHRB
feature at boot for Power10.

Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
[mpe: Move MMCRA_BHRB_DISABLE as noted by jpn, drop CPU setup changes]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1594996707-3727-8-git-send-email-atrajeev@linux.vnet.ibm.com
2020-07-22 21:56:41 +10:00
Madhavan Srinivasan
c718547e4a powerpc/perf: Add support for ISA3.1 PMU SPRs
PowerISA v3.1 includes new performance monitoring unit(PMU)
special purpose registers (SPRs). They are

Monitor Mode Control Register 3 (MMCR3)
Sampled Instruction Event Register 2 (SIER2)
Sampled Instruction Event Register 3 (SIER3)

MMCR3 is added for further sampling related configuration
control. SIER2/SIER3 are added to provide additional
information about the sampled instruction.

Patch adds new PPMU flag called "PPMU_ARCH_31" to support handling of
these new SPRs, updates the struct thread_struct to include these new
SPRs, include MMCR3 in struct mmcr_regs. This is needed to support
programming of MMCR3 SPR during event_enable/disable. Patch also adds
the sysfs support for the MMCR3 SPR along with SPRN_ macros for these
new pmu SPRs.

Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
[mpe: Rename to PPMU_ARCH_31 as noted by jpn]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1594996707-3727-5-git-send-email-atrajeev@linux.vnet.ibm.com
2020-07-22 21:56:41 +10:00
Athira Rajeev
9d4fc86dcd powerpc/perf: Update Power PMU cache_events to u64 type
Events of type PERF_TYPE_HW_CACHE was described for Power PMU
as: int (*cache_events)[type][op][result];

where type, op, result values unpacked from the event attribute config
value is used to generate the raw event code at runtime.

So far the event code values which used to create these cache-related
events were within 32 bit and `int` type worked. In power10,
some of the event codes are of 64-bit value and hence update the
Power PMU cache_events to `u64` type in `power_pmu` struct.
Also propagate this change to existing all PMU driver code paths
which are using ppmu->cache_events.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1594996707-3727-4-git-send-email-atrajeev@linux.vnet.ibm.com
2020-07-22 21:56:40 +10:00
Athira Rajeev
78d76819e6 powerpc/perf: Update cpu_hw_event to use struct for storing MMCR registers
core-book3s currently uses array to store the MMCR registers as part
of per-cpu `cpu_hw_events`. This patch does a clean up to use `struct`
to store mmcr regs instead of array. This will make code easier to read
and reduces chance of any subtle bug that may come in the future, say
when new registers are added. Patch updates all relevant code that was
using MMCR array ( cpuhw->mmcr[x]) to use newly introduced `struct`.
This includes the PMU driver code for supported platforms (power5
to power9) and ISA macros for counter support functions.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1594996707-3727-2-git-send-email-atrajeev@linux.vnet.ibm.com
2020-07-22 00:03:17 +10:00
Anju T Sudhakar
77ca3951cc powerpc/perf: Add kernel support for new MSR[HV PR] bits in trace-imc
IMC trace-mode record has MSR[HV PR] bits added in the third DW.
These bits can be used to set the cpumode for the instruction pointer
captured in each sample.

Add support in kernel to use these bits to set the cpumode for
each sample.

Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200713144623.508695-1-maddy@linux.ibm.com
2020-07-16 13:12:46 +10:00
Kajol Jain
792f73f747 powerpc/hv-24x7: Add sysfs files inside hv-24x7 device to show cpumask
Patch here adds a cpumask attr to hv_24x7 pmu along with ABI documentation.

Primary use to expose the cpumask is for the perf tool which has the
capability to parse the driver sysfs folder and understand the
cpumask file. Having cpumask file will reduce the number of perf command
line parameters (will avoid "-C" option in the perf tool
command line). It can also notify the user which is
the current cpu used to retrieve the counter data.

command:# cat /sys/devices/hv_24x7/interface/cpumask
0

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200709051836.723765-3-kjain@linux.ibm.com
2020-07-16 13:12:41 +10:00
Kajol Jain
1a8f0886a6 powerpc/perf/hv-24x7: Add cpu hotplug support
Patch here adds cpu hotplug functions to hv_24x7 pmu.
A new cpuhp_state "CPUHP_AP_PERF_POWERPC_HV_24x7_ONLINE" enum
is added.

The online callback function updates the cpumask only if its
empty. As the primary intention of adding hotplug support
is to designate a CPU to make HCALL to collect the
counter data.

The offline function test and clear corresponding cpu in a cpumask
and update cpumask to any other active cpu.

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200709051836.723765-2-kjain@linux.ibm.com
2020-07-16 13:12:41 +10:00
Christoph Hellwig
c0ee37e85e maccess: rename probe_user_{read,write} to copy_{from,to}_user_nofault
Better describe what these functions do.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-17 10:57:41 -07:00
Christoph Hellwig
fe557319aa maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofault
Better describe what these functions do.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-17 10:57:41 -07:00
Mike Rapoport
e31cf2f4ca mm: don't include asm/pgtable.h if linux/mm.h is already included
Patch series "mm: consolidate definitions of page table accessors", v2.

The low level page table accessors (pXY_index(), pXY_offset()) are
duplicated across all architectures and sometimes more than once.  For
instance, we have 31 definition of pgd_offset() for 25 supported
architectures.

Most of these definitions are actually identical and typically it boils
down to, e.g.

static inline unsigned long pmd_index(unsigned long address)
{
        return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
}

static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
{
        return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(address);
}

These definitions can be shared among 90% of the arches provided
XYZ_SHIFT, PTRS_PER_XYZ and xyz_page_vaddr() are defined.

For architectures that really need a custom version there is always
possibility to override the generic version with the usual ifdefs magic.

These patches introduce include/linux/pgtable.h that replaces
include/asm-generic/pgtable.h and add the definitions of the page table
accessors to the new header.

This patch (of 12):

The linux/mm.h header includes <asm/pgtable.h> to allow inlining of the
functions involving page table manipulations, e.g.  pte_alloc() and
pmd_alloc().  So, there is no point to explicitly include <asm/pgtable.h>
in the files that include <linux/mm.h>.

The include statements in such cases are remove with a simple loop:

	for f in $(git grep -l "include <linux/mm.h>") ; do
		sed -i -e '/include <asm\/pgtable.h>/ d' $f
	done

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200514170327.31389-1-rppt@kernel.org
Link: http://lkml.kernel.org/r/20200514170327.31389-2-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09 09:39:13 -07:00
Souptick Joarder
dadbb612f6 mm/gup.c: convert to use get_user_{page|pages}_fast_only()
API __get_user_pages_fast() renamed to get_user_pages_fast_only() to
align with pin_user_pages_fast_only().

As part of this we will get rid of write parameter.  Instead caller will
pass FOLL_WRITE to get_user_pages_fast_only().  This will not change any
existing functionality of the API.

All the callers are changed to pass FOLL_WRITE.

Also introduce get_user_page_fast_only(), and use it in a few places
that hard-code nr_pages to 1.

Updated the documentation of the API.

Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Paul Mackerras <paulus@ozlabs.org>		[arch/powerpc/kvm]
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Michal Suchanek <msuchanek@suse.de>
Link: http://lkml.kernel.org/r/1590396812-31277-1-git-send-email-jrdr.linux@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08 11:05:56 -07:00
Linus Torvalds
7ae77150d9 powerpc updates for 5.8
- Support for userspace to send requests directly to the on-chip GZIP
    accelerator on Power9.
 
  - Rework of our lockless page table walking (__find_linux_pte()) to make it
    safe against parallel page table manipulations without relying on an IPI for
    serialisation.
 
  - A series of fixes & enhancements to make our machine check handling more
    robust.
 
  - Lots of plumbing to add support for "prefixed" (64-bit) instructions on
    Power10.
 
  - Support for using huge pages for the linear mapping on 8xx (32-bit).
 
  - Remove obsolete Xilinx PPC405/PPC440 support, and an associated sound driver.
 
  - Removal of some obsolete 40x platforms and associated cruft.
 
  - Initial support for booting on Power10.
 
  - Lots of other small features, cleanups & fixes.
 
 Thanks to:
   Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan, Andrey Abramov,
   Aneesh Kumar K.V, Balamuruhan S, Bharata B Rao, Bulent Abali, Cédric Le
   Goater, Chen Zhou, Christian Zigotzky, Christophe JAILLET, Christophe Leroy,
   Dmitry Torokhov, Emmanuel Nicolet, Erhard F., Gautham R. Shenoy, Geoff Levand,
   George Spelvin, Greg Kurz, Gustavo A. R. Silva, Gustavo Walbon, Haren Myneni,
   Hari Bathini, Joel Stanley, Jordan Niethe, Kajol Jain, Kees Cook, Leonardo
   Bras, Madhavan Srinivasan., Mahesh Salgaonkar, Markus Elfring, Michael
   Neuling, Michal Simek, Nathan Chancellor, Nathan Lynch, Naveen N. Rao,
   Nicholas Piggin, Oliver O'Halloran, Paul Mackerras, Pingfan Liu, Qian Cai, Ram
   Pai, Raphael Moreira Zinsly, Ravi Bangoria, Sam Bobroff, Sandipan Das, Segher
   Boessenkool, Stephen Rothwell, Sukadev Bhattiprolu, Tyrel Datwyler, Wolfram
   Sang, Xiongfeng Wang.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl7aYZ8THG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgPiKD/9zNCuZLFMAFrIdbm0HlYA2RGYZFT75
 GUHsqYyei1pxA7PgM3KwJiXELVODsBv0eQbgNh1tbecKrxPRegN/cywd1KLjPZ7I
 v5/qweQP8MvR0RhzjbhvUcO0jq/f8u2LbJr5mUfVzjU6tAvrvcWo3oZqDElsekCS
 kgyOH3r1vZ2PLTMiGFhb0gWi2iqc+6BHU1AFCGPCMjB1Vu5d5+54VvZ/6lllGsOF
 yg9CBXmmVvQ+Bn6tH4zdEB78FYxnAIwBqlbmL79i5ca+HQJ0Sw6HuPRy9XYq35p6
 2EiXS4Wrgp7i7+1TN3HO362u5Onb8TSyQU7NS6yCFPoJ6JQxcJMBIw6mHhnXOPuZ
 CrjgcdwUMjx8uDoKmX1Epbfuex2w+AysW+4yBHPFiSgl3klKC3D0wi95mR485w2F
 rN8uzJtrDeFKcYZJG7IoB/cgFCCPKGf9HaXr8q0S/jBKMffx91ul3cfzlfdIXOCw
 FDNw/+ZX7UD6ddFEG12ZTO+vdL8yf1uCRT/DIZwUiDMIA0+M6F4nc7j3lfyZfoO1
 65f9UlhoLxScq7VH2fKH4UtZatO9cPID2z1CmiY4UbUIPtFDepSuYClgLF+Duf4b
 rkfxhKU0+Ja1zNH5XNc+L+Bc5/W4lFiJXz02dYIjtHoUpWkc1aToOETVwzggYFNM
 G3PXIBOI0jRgRw==
 =o0WU
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Support for userspace to send requests directly to the on-chip GZIP
   accelerator on Power9.

 - Rework of our lockless page table walking (__find_linux_pte()) to
   make it safe against parallel page table manipulations without
   relying on an IPI for serialisation.

 - A series of fixes & enhancements to make our machine check handling
   more robust.

 - Lots of plumbing to add support for "prefixed" (64-bit) instructions
   on Power10.

 - Support for using huge pages for the linear mapping on 8xx (32-bit).

 - Remove obsolete Xilinx PPC405/PPC440 support, and an associated sound
   driver.

 - Removal of some obsolete 40x platforms and associated cruft.

 - Initial support for booting on Power10.

 - Lots of other small features, cleanups & fixes.

Thanks to: Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan,
Andrey Abramov, Aneesh Kumar K.V, Balamuruhan S, Bharata B Rao, Bulent
Abali, Cédric Le Goater, Chen Zhou, Christian Zigotzky, Christophe
JAILLET, Christophe Leroy, Dmitry Torokhov, Emmanuel Nicolet, Erhard F.,
Gautham R. Shenoy, Geoff Levand, George Spelvin, Greg Kurz, Gustavo A.
R. Silva, Gustavo Walbon, Haren Myneni, Hari Bathini, Joel Stanley,
Jordan Niethe, Kajol Jain, Kees Cook, Leonardo Bras, Madhavan
Srinivasan., Mahesh Salgaonkar, Markus Elfring, Michael Neuling, Michal
Simek, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas Piggin,
Oliver O'Halloran, Paul Mackerras, Pingfan Liu, Qian Cai, Ram Pai,
Raphael Moreira Zinsly, Ravi Bangoria, Sam Bobroff, Sandipan Das, Segher
Boessenkool, Stephen Rothwell, Sukadev Bhattiprolu, Tyrel Datwyler,
Wolfram Sang, Xiongfeng Wang.

* tag 'powerpc-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (299 commits)
  powerpc/pseries: Make vio and ibmebus initcalls pseries specific
  cxl: Remove dead Kconfig options
  powerpc: Add POWER10 architected mode
  powerpc/dt_cpu_ftrs: Add MMA feature
  powerpc/dt_cpu_ftrs: Enable Prefixed Instructions
  powerpc/dt_cpu_ftrs: Advertise support for ISA v3.1 if selected
  powerpc: Add support for ISA v3.1
  powerpc: Add new HWCAP bits
  powerpc/64s: Don't set FSCR bits in INIT_THREAD
  powerpc/64s: Save FSCR to init_task.thread.fscr after feature init
  powerpc/64s: Don't let DT CPU features set FSCR_DSCR
  powerpc/64s: Don't init FSCR_DSCR in __init_FSCR()
  powerpc/32s: Fix another build failure with CONFIG_PPC_KUAP_DEBUG
  powerpc/module_64: Use special stub for _mcount() with -mprofile-kernel
  powerpc/module_64: Simplify check for -mprofile-kernel ftrace relocations
  powerpc/module_64: Consolidate ftrace code
  powerpc/32: Disable KASAN with pages bigger than 16k
  powerpc/uaccess: Don't set KUEP by default on book3s/32
  powerpc/uaccess: Don't set KUAP by default on book3s/32
  powerpc/8xx: Reduce time spent in allow_user_access() and friends
  ...
2020-06-05 12:39:30 -07:00
Kajol Jain
60beb65da1 powerpc/hv-24x7: Add sysfs files inside hv-24x7 device to show processor details
To expose the system dependent parameter like total number of
sockets and numbers of chips per socket, patch adds two sysfs files.
"sockets" and "chips" are added to /sys/devices/hv_24x7/interface/
of the "hv_24x7" pmu.

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200525104308.9814-4-kjain@linux.ibm.com
2020-05-28 23:24:39 +10:00
Kajol Jain
8ba2142673 powerpc/hv-24x7: Add rtas call in hv-24x7 driver to get processor details
For hv_24x7 socket/chip level events, specific chip-id to which
the data requested should be added as part of pmu events.
But number of chips/socket in the system details are not exposed.

Patch implements read_24x7_sys_info() to get system parameter values
like number of sockets, cores per chip and chips per socket. Rtas_call
with token "PROCESSOR_MODULE_INFO" is used to get these values.

Subsequent patch exports these values via sysfs.

Patch also make these parameters default to 1.

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200525104308.9814-3-kjain@linux.ibm.com
2020-05-28 23:24:39 +10:00
Kajol Jain
b4ac18eead powerpc/perf/hv-24x7: Fix inconsistent output values incase multiple hv-24x7 events run
Commit 2b206ee6b0 ("powerpc/perf/hv-24x7: Display change in counter
values")' added to print _change_ in the counter value rather then raw
value for 24x7 counters. Incase of transactions, the event count
is set to 0 at the beginning of the transaction. It also sets
the event's prev_count to the raw value at the time of initialization.
Because of setting event count to 0, we are seeing some weird behaviour,
whenever we run multiple 24x7 events at a time.

For example:

command#: ./perf stat -e "{hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/,
			   hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/}"
	  		   -C 0 -I 1000 sleep 100

     1.000121704                120 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
     1.000121704                  5 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
     2.000357733                  8 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
     2.000357733                 10 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
     3.000495215 18,446,744,073,709,551,616 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
     3.000495215 18,446,744,073,709,551,616 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
     4.000641884                 56 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
     4.000641884 18,446,744,073,709,551,616 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
     5.000791887 18,446,744,073,709,551,616 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/

Getting these large values in case we do -I.

As we are setting event_count to 0, for interval case, overall event_count is not
coming in incremental order. As we may can get new delta lesser then previous count.
Because of which when we print intervals, we are getting negative value which create
these large values.

This patch removes part where we set event_count to 0 in function
'h_24x7_event_read'. There won't be much impact as we do set event->hw.prev_count
to the raw value at the time of initialization to print change value.

With this patch
In power9 platform

command#: ./perf stat -e "{hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/,
		           hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/}"
			   -C 0 -I 1000 sleep 100

     1.000117685                 93 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
     1.000117685                  1 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
     2.000349331                 98 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
     2.000349331                  2 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
     3.000495900                131 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
     3.000495900                  4 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
     4.000645920                204 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
     4.000645920                 61 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
     4.284169997                 22 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/

Suggested-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Tested-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200525104308.9814-2-kjain@linux.ibm.com
2020-05-28 23:24:39 +10:00
Christophe Leroy
1251288e64 powerpc/8xx: Remove now unused TLB miss functions
The code to setup linear and IMMR mapping via huge TLB entries is
not called anymore. Remove it.

Also remove the handling of removed code exits in the perf driver.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/75750d25849cb8e73ca519866bb892d7eb9649c0.1589866984.git.christophe.leroy@csgroup.eu
2020-05-26 22:22:22 +10:00
Jordan Niethe
94afd069d9 powerpc: Use a datatype for instructions
Currently unsigned ints are used to represent instructions on powerpc.
This has worked well as instructions have always been 4 byte words.

However, ISA v3.1 introduces some changes to instructions that mean
this scheme will no longer work as well. This change is Prefixed
Instructions. A prefixed instruction is made up of a word prefix
followed by a word suffix to make an 8 byte double word instruction.
No matter the endianness of the system the prefix always comes first.
Prefixed instructions are only planned for powerpc64.

Introduce a ppc_inst type to represent both prefixed and word
instructions on powerpc64 while keeping it possible to exclusively
have word instructions on powerpc32.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
[mpe: Fix compile error in emulate_spe()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200506034050.24806-12-jniethe5@gmail.com
2020-05-19 00:10:37 +10:00
Jordan Niethe
7534625128 powerpc: Use a macro for creating instructions from u32s
In preparation for instructions having a more complex data type start
using a macro, ppc_inst(), for making an instruction out of a u32.  A
macro is used so that instructions can be used as initializer elements.
Currently this does nothing, but it will allow for creating a data type
that can represent prefixed instructions.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
[mpe: Change include guard to _ASM_POWERPC_INST_H]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Alistair Popple <alistair@popple.id.au>
Link: https://lore.kernel.org/r/20200506034050.24806-7-jniethe5@gmail.com
2020-05-19 00:10:36 +10:00
Aneesh Kumar K.V
15759cb054 powerpc/perf/callchain: Use __get_user_pages_fast in read_user_stack_slow
read_user_stack_slow is called with interrupts soft disabled and it copies contents
from the page which we find mapped to a specific address. To convert
userspace address to pfn, the kernel now uses lockless page table walk.

The kernel needs to make sure the pfn value read remains stable and is not released
and reused for another process while the contents are read from the page. This
can only be achieved by holding a page reference.

One of the first approaches I tried was to check the pte value after the kernel
copies the contents from the page. But as shown below we can still get it wrong

CPU0                           CPU1
pte = READ_ONCE(*ptep);
                               pte_clear(pte);
                               put_page(page);
                               page = alloc_page();
                               memcpy(page_address(page), "secret password", nr);
memcpy(buf, kaddr + offset, nb);
                               put_page(page);
                               handle_mm_fault()
                               page = alloc_page();
                               set_pte(pte, page);
if (pte_val(pte) != pte_val(*ptep))

Hence switch to __get_user_pages_fast.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200505071729.54912-8-aneesh.kumar@linux.ibm.com
2020-05-05 21:20:14 +10:00
Alexey Budankov
ff46758313 powerpc/perf: open access for CAP_PERFMON privileged process
Open access to monitoring for CAP_PERFMON privileged process.  Providing
the access under CAP_PERFMON capability singly, without the rest of
CAP_SYS_ADMIN credentials, excludes chances to misuse the credentials
and makes operation more secure.

CAP_PERFMON implements the principle of least privilege for performance
monitoring and observability operations (POSIX IEEE 1003.1e 2.2.2.39
principle of least privilege: A security design principle that states
that a process or program be granted only those privileges (e.g.,
capabilities) necessary to accomplish its legitimate function, and only
for the time that such privileges are actually required)

For backward compatibility reasons access to the monitoring remains open
for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN usage for
secure monitoring is discouraged with respect to CAP_PERFMON capability.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Acked-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Igor Lubashev <ilubashe@akamai.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: intel-gfx@lists.freedesktop.org
Cc: linux-doc@vger.kernel.org
Cc: linux-man@vger.kernel.org
Cc: linux-security-module@vger.kernel.org
Cc: selinux@vger.kernel.org
Link: http://lore.kernel.org/lkml/ac98cd9f-b59e-673c-c70d-180b3e7695d2@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-16 12:19:08 -03:00
Linus Torvalds
e4da01d833 powerpc updates for 5.7 #2
- A fix for a crash in machine check handling on pseries (ie. guests)
 
  - A small series to make it possible to disable CONFIG_COMPAT, and turn it off
    by default for ppc64le where it's not used.
 
  - A few other miscellaneous fixes and small improvements.
 
 Thanks to:
   Alexey Kardashevskiy, Anju T Sudhakar, Arnd Bergmann, Christophe Leroy, Dan
   Carpenter, Ganesh Goudar, Geert Uytterhoeven, Geoff Levand, Mahesh Salgaonkar,
   Markus Elfring, Michal Suchanek, Nicholas Piggin, Stephen Boyd, Wen Xiong.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl6O8LgTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgMcYEACbGf+Z9brLSasYajoqU6QdqGPacHEN
 1a9TEmUnN+HWgtfkkoEBFbyuYHnhyhYuf7hvNccDjDA91ESVylO+Wq7Q+v/xoz29
 LVyb3V6uuVMLHnoqwP5jpr0lS0aOpuu3Nc2SpfBuolDtJqeMxpVEEK3Ln3uATlq6
 SnEAxQEKb2x4Y4Tfuq5A3txupj0s/UYrmeR6GAdkN3Oapbb9uQl8Ql2smqrZo0cq
 6TLxtoFzbfXkV6NmY2V0OKMTeXt0fhrvdDhFEBckCUpRZLv4Fd7CwPWNER2bUVs6
 04kg87BAO8qRyfr3G93oP0mWgi65kpXI8yN6Vt5Lig+5PnxNh7swvEqDpQR1s9se
 uVoeN7RBHOKQppZRkpzf6yZbelCcMuwTeIBQ9XOx/jYFAHJ2nUkJm9TYgKckVvNY
 E4shiM1eoOVMrEmODFBCUmUkJLyn5jU1+r5mn708v2Nb5E1XgoTejitB6bHyL+Aa
 zo/0DdsZO86iNE7th94oHQRgVbx1vtP9kV6vK6BLB5M95RSGEdVAEMo5CTL70wr+
 hz7suOaijPi+TeCW9YPrcOHcHOQE9pzTFQoVZHuv8egbvd9Rni3aBAMMqHx/lQdE
 Hk4zN7IytOLqQTfINNYgoo0kUKI1ipJQOBfR5jyeLhgg6yp36CnO/m+0QGvrILbi
 18tlf2qkD75Z3g==
 =72sb
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull more powerpc updates from Michael Ellerman:
 "The bulk of this is the series to make CONFIG_COMPAT user-selectable,
  it's been around for a long time but was blocked behind the
  syscall-in-C series.

  Plus there's also a few fixes and other minor things.

  Summary:

   - A fix for a crash in machine check handling on pseries (ie. guests)

   - A small series to make it possible to disable CONFIG_COMPAT, and
     turn it off by default for ppc64le where it's not used.

   - A few other miscellaneous fixes and small improvements.

  Thanks to: Alexey Kardashevskiy, Anju T Sudhakar, Arnd Bergmann,
  Christophe Leroy, Dan Carpenter, Ganesh Goudar, Geert Uytterhoeven,
  Geoff Levand, Mahesh Salgaonkar, Markus Elfring, Michal Suchanek,
  Nicholas Piggin, Stephen Boyd, Wen Xiong"

* tag 'powerpc-5.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  selftests/powerpc: Always build the tm-poison test 64-bit
  powerpc: Improve ppc_save_regs()
  Revert "powerpc/64: irq_work avoid interrupt when called with hardware irqs enabled"
  powerpc/time: Replace <linux/clk-provider.h> by <linux/of_clk.h>
  powerpc/pseries/ddw: Extend upper limit for huge DMA window for persistent memory
  powerpc/perf: split callchain.c by bitness
  powerpc/64: Make COMPAT user-selectable disabled on littleendian by default.
  powerpc/64: make buildable without CONFIG_COMPAT
  powerpc/perf: consolidate valid_user_sp -> invalid_user_sp
  powerpc/perf: consolidate read_user_stack_32
  powerpc: move common register copy functions from signal_32.c to signal.c
  powerpc: Add back __ARCH_WANT_SYS_LLSEEK macro
  powerpc/ps3: Set CONFIG_UEVENT_HELPER=y in ps3_defconfig
  powerpc/ps3: Remove an unneeded NULL check
  powerpc/ps3: Remove duplicate error message
  powerpc/powernv: Re-enable imc trace-mode in kernel
  powerpc/perf: Implement a global lock to avoid races between trace, core and thread imc events.
  powerpc/pseries: Fix MCE handling on pseries
  selftests/eeh: Skip ahci adapters
  powerpc/64s: Fix doorbell wakeup msgclr optimisation
2020-04-09 11:01:42 -07:00
Michal Suchanek
7c0eda1a04 powerpc/perf: split callchain.c by bitness
Building callchain.c with !COMPAT proved quite ugly with all the
defines. Splitting out the 32bit and 64bit parts looks better.

No code change intended.

Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/a20027bf1074935a7934ee2a6757c99ea047e70d.1584699455.git.msuchanek@suse.de
2020-04-03 00:10:00 +11:00
Michal Suchanek
0a7601b6ff powerpc/64: make buildable without CONFIG_COMPAT
There are numerous references to 32bit functions in generic and 64bit
code so ifdef them out.

Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e5619617020ef3a1f54f0c076e7d74cb9ec9f3bf.1584699455.git.msuchanek@suse.de
2020-04-03 00:10:00 +11:00
Michal Suchanek
2910428106 powerpc/perf: consolidate valid_user_sp -> invalid_user_sp
Merge the 32bit and 64bit version.

Halve the check constants on 32bit.

Use STACK_TOP since it is defined.

Passing is_64 is now redundant since is_32bit_task() is used to
determine which callchain variant should be used. Use STACK_TOP and
is_32bit_task() directly.

This removes a page from the valid 32bit area on 64bit:
 #define TASK_SIZE_USER32 (0x0000000100000000UL - (1 * PAGE_SIZE))
 #define STACK_TOP_USER32 TASK_SIZE_USER32

Change return value to bool. It is inverted by users anyway.

Change to invalid_user_sp to avoid inverting the return value twice.

Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/be8e40fc0737fb28ad08b198552dee7cac1c5ce2.1584699455.git.msuchanek@suse.de
2020-04-03 00:10:00 +11:00
Michal Suchanek
d6c19bdee2 powerpc/perf: consolidate read_user_stack_32
There are two almost identical copies for 32bit and 64bit.

The function is used only in 32bit code which will be split out in next
patch so consolidate to one function.

Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/0c21c919ed1296420199c78f7c3cfd29d3c7e909.1584699455.git.msuchanek@suse.de
2020-04-03 00:09:59 +11:00
Anju T Sudhakar
a36e8ba60b powerpc/perf: Implement a global lock to avoid races between trace, core and thread imc events.
IMC(In-memory Collection Counters) does performance monitoring in
two different modes, i.e accumulation mode(core-imc and thread-imc events),
and trace mode(trace-imc events). A cpu thread can either be in
accumulation-mode or trace-mode at a time and this is done via the LDBAR
register in POWER architecture. The current design does not address the
races between thread-imc and trace-imc events.

Patch implements a global id and lock to avoid the races between
core, trace and thread imc events. With this global id-lock
implementation, the system can either run core, thread or trace imc
events at a time. i.e. to run any core-imc events, thread/trace imc events
should not be enabled/monitored.

Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200313055238.8656-1-anju@linux.vnet.ibm.com
2020-04-03 00:09:58 +11:00
Kan Liang
bbfd5e4fab perf/core: Add new branch sample type for HW index of raw branch records
The low level index is the index in the underlying hardware buffer of
the most recently captured taken branch which is always saved in
branch_entries[0]. It is very useful for reconstructing the call stack.
For example, in Intel LBR call stack mode, the depth of reconstructed
LBR call stack limits to the number of LBR registers. With the low level
index information, perf tool may stitch the stacks of two samples. The
reconstructed LBR call stack can break the HW limitation.

Add a new branch sample type to retrieve low level index of raw branch
records. The low level index is between -1 (unknown) and max depth which
can be retrieved in /sys/devices/cpu/caps/branches.

Only when the new branch sample type is set, the low level index
information is dumped into the PERF_SAMPLE_BRANCH_STACK output.
Perf tool should check the attr.branch_sample_type, and apply the
corresponding format for PERF_SAMPLE_BRANCH_STACK samples.
Otherwise, some user case may be broken. For example, users may parse a
perf.data, which include the new branch sample type, with an old version
perf tool (without the check). Users probably get incorrect information
without any warning.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200127165355.27495-2-kan.liang@linux.intel.com
2020-02-11 13:23:49 +01:00
Christophe Leroy
6edc318585 powerpc/8xx: Use alternative scratch registers in DTLB miss handler
In preparation of handling CONFIG_VMAP_STACK, DTLB miss handler need
to use different scratch registers than other exception handlers in
order to not jeopardise exception entry on stack DTLB misses.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c5287ea59ae9630f505019b309bf94029241635f.1576916812.git.christophe.leroy@c-s.fr
2020-01-27 22:36:16 +11:00
Christophe Leroy
def0bfdbd6 powerpc: use probe_user_read() and probe_user_write()
Instead of opencoding, use probe_user_read() to failessly read
a user location and probe_user_write() for writing to user.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e041f5eedb23f09ab553be8a91c3de2087147320.1579800517.git.christophe.leroy@c-s.fr
2020-01-26 00:11:35 +11:00
Linus Torvalds
7794b1d418 powerpc updates for 5.5
Highlights:
 
  - Infrastructure for secure boot on some bare metal Power9 machines. The
    firmware support is still in development, so the code here won't actually
    activate secure boot on any existing systems.
 
  - A change to xmon (our crash handler / pseudo-debugger) to restrict it to
    read-only mode when the kernel is lockdown'ed, otherwise it's trivial to drop
    into xmon and modify kernel data, such as the lockdown state.
 
  - Support for KASLR on 32-bit BookE machines (Freescale / NXP).
 
  - Fixes for our flush_icache_range() and __kernel_sync_dicache() (VDSO) to work
    with memory ranges >4GB.
 
  - Some reworks of the pseries CMM (Cooperative Memory Management) driver to
    make it behave more like other balloon drivers and enable some cleanups of
    generic mm code.
 
  - A series of fixes to our hardware breakpoint support to properly handle
    unaligned watchpoint addresses.
 
 Plus a bunch of other smaller improvements, fixes and cleanups.
 
 Thanks to:
   Alastair D'Silva, Andrew Donnellan, Aneesh Kumar K.V, Anthony Steinhauser,
   Cédric Le Goater, Chris Packham, Chris Smart, Christophe Leroy, Christopher M.
   Riedl, Christoph Hellwig, Claudio Carvalho, Daniel Axtens, David Hildenbrand,
   Deb McLemore, Diana Craciun, Eric Richter, Geert Uytterhoeven, Greg
   Kroah-Hartman, Greg Kurz, Gustavo L. F. Walbon, Hari Bathini, Harish, Jason
   Yan, Krzysztof Kozlowski, Leonardo Bras, Mathieu Malaterre, Mauro S. M.
   Rodrigues, Michal Suchanek, Mimi Zohar, Nathan Chancellor, Nathan Lynch, Nayna
   Jain, Nick Desaulniers, Oliver O'Halloran, Qian Cai, Rasmus Villemoes, Ravi
   Bangoria, Sam Bobroff, Santosh Sivaraj, Scott Wood, Thomas Huth, Tyrel
   Datwyler, Vaibhav Jain, Valentin Longchamp, YueHaibing.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl3hBycTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgApBEACk91MEQDYJ9MF9I6uN+85qb5p4pMsp
 rGzqnpt+XFidbDAc3eP63pYfIDSo3jtkQ2YL7shAnDOTvkO0md+Vqkl9Aq/G6FIf
 lDBlwbgkXMSxS/O2Lpvfn4NZAoK6dKmiV55LSgfliM62X3e2Saeg6TR55wBTgJ6/
 SlYPDwZfcVHOAiFS3UmfB+hkiIZk+AI5Zr5VAZvT2ZmeH36yAWkq4JgJI1uAk6m1
 /7iCnlfUjx/nl/BhnA3kjjmAgGCJ5s/WuVgwFMz47XpMBWGBhLWpMh/NqDTFb8ca
 kpiVQoVPQe2xyO3pL/kOwBy6sii26ftfHDhLKMy1hJdEhVQzS5LerPIMeh1qsU8Q
 hV/Cj+jfsrS/vBDOehj3jwx93t+861PmTOqgLnpYQ6Ltrt+2B/74+fufGMHE1kI3
 Ffo7xvNw4sw6bSziDxDFqUx2P1dFN5D5EJsJsYM98ekkVAAkzNqCDRvfD2QI8Pif
 VXWPYXqtNJTrVPJA0D7Yfo9FDNwhANd0f1zi7r/U5mVXBFUyKOlGqTQSkXgMrVeK
 3I7wHPOVGgdA5UUkfcd3pcuqsY081U9E//o5PUfj8ybO5JCwly8NoatbG+xHmKia
 a72uJT8MjCo9mGCHKDrwi9l/kqms6ZSv8RP+yMhGuB52YoiGc6PpVyab5jXIUd1N
 yTtBlC0YGW1JYw==
 =JHzg
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Highlights:

   - Infrastructure for secure boot on some bare metal Power9 machines.
     The firmware support is still in development, so the code here
     won't actually activate secure boot on any existing systems.

   - A change to xmon (our crash handler / pseudo-debugger) to restrict
     it to read-only mode when the kernel is lockdown'ed, otherwise it's
     trivial to drop into xmon and modify kernel data, such as the
     lockdown state.

   - Support for KASLR on 32-bit BookE machines (Freescale / NXP).

   - Fixes for our flush_icache_range() and __kernel_sync_dicache()
     (VDSO) to work with memory ranges >4GB.

   - Some reworks of the pseries CMM (Cooperative Memory Management)
     driver to make it behave more like other balloon drivers and enable
     some cleanups of generic mm code.

   - A series of fixes to our hardware breakpoint support to properly
     handle unaligned watchpoint addresses.

  Plus a bunch of other smaller improvements, fixes and cleanups.

  Thanks to: Alastair D'Silva, Andrew Donnellan, Aneesh Kumar K.V,
  Anthony Steinhauser, Cédric Le Goater, Chris Packham, Chris Smart,
  Christophe Leroy, Christopher M. Riedl, Christoph Hellwig, Claudio
  Carvalho, Daniel Axtens, David Hildenbrand, Deb McLemore, Diana
  Craciun, Eric Richter, Geert Uytterhoeven, Greg Kroah-Hartman, Greg
  Kurz, Gustavo L. F. Walbon, Hari Bathini, Harish, Jason Yan, Krzysztof
  Kozlowski, Leonardo Bras, Mathieu Malaterre, Mauro S. M. Rodrigues,
  Michal Suchanek, Mimi Zohar, Nathan Chancellor, Nathan Lynch, Nayna
  Jain, Nick Desaulniers, Oliver O'Halloran, Qian Cai, Rasmus Villemoes,
  Ravi Bangoria, Sam Bobroff, Santosh Sivaraj, Scott Wood, Thomas Huth,
  Tyrel Datwyler, Vaibhav Jain, Valentin Longchamp, YueHaibing"

* tag 'powerpc-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (144 commits)
  powerpc/fixmap: fix crash with HIGHMEM
  x86/efi: remove unused variables
  powerpc: Define arch_is_kernel_initmem_freed() for lockdep
  powerpc/prom_init: Use -ffreestanding to avoid a reference to bcmp
  powerpc: Avoid clang warnings around setjmp and longjmp
  powerpc: Don't add -mabi= flags when building with Clang
  powerpc: Fix Kconfig indentation
  powerpc/fixmap: don't clear fixmap area in paging_init()
  selftests/powerpc: spectre_v2 test must be built 64-bit
  powerpc/powernv: Disable native PCIe port management
  powerpc/kexec: Move kexec files into a dedicated subdir.
  powerpc/32: Split kexec low level code out of misc_32.S
  powerpc/sysdev: drop simple gpio
  powerpc/83xx: map IMMR with a BAT.
  powerpc/32s: automatically allocate BAT in setbat()
  powerpc/ioremap: warn on early use of ioremap()
  powerpc: Add support for GENERIC_EARLY_IOREMAP
  powerpc/fixmap: Use __fix_to_virt() instead of fix_to_virt()
  powerpc/8xx: use the fixmapped IMMR in cpm_reset()
  powerpc/8xx: add __init to cpm1 init functions
  ...
2019-11-30 14:35:43 -08:00
Michal Suchanek
42484d2c0f powerpc/perf: remove current_is_64bit()
Since commit ed1cd6deb0 ("powerpc: Activate CONFIG_THREAD_INFO_IN_TASK")
current_is_64bit() is quivalent to !is_32bit_task().
Remove the redundant function.

Suggested-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190912194633.12045-1-msuchanek@suse.de
2019-11-13 16:58:10 +11:00
Joel Fernandes (Google)
da97e18458 perf_event: Add support for LSM and SELinux checks
In current mainline, the degree of access to perf_event_open(2) system
call depends on the perf_event_paranoid sysctl.  This has a number of
limitations:

1. The sysctl is only a single value. Many types of accesses are controlled
   based on the single value thus making the control very limited and
   coarse grained.
2. The sysctl is global, so if the sysctl is changed, then that means
   all processes get access to perf_event_open(2) opening the door to
   security issues.

This patch adds LSM and SELinux access checking which will be used in
Android to access perf_event_open(2) for the purposes of attaching BPF
programs to tracepoints, perf profiling and other operations from
userspace. These operations are intended for production systems.

5 new LSM hooks are added:
1. perf_event_open: This controls access during the perf_event_open(2)
   syscall itself. The hook is called from all the places that the
   perf_event_paranoid sysctl is checked to keep it consistent with the
   systctl. The hook gets passed a 'type' argument which controls CPU,
   kernel and tracepoint accesses (in this context, CPU, kernel and
   tracepoint have the same semantics as the perf_event_paranoid sysctl).
   Additionally, I added an 'open' type which is similar to
   perf_event_paranoid sysctl == 3 patch carried in Android and several other
   distros but was rejected in mainline [1] in 2016.

2. perf_event_alloc: This allocates a new security object for the event
   which stores the current SID within the event. It will be useful when
   the perf event's FD is passed through IPC to another process which may
   try to read the FD. Appropriate security checks will limit access.

3. perf_event_free: Called when the event is closed.

4. perf_event_read: Called from the read(2) and mmap(2) syscalls for the event.

5. perf_event_write: Called from the ioctl(2) syscalls for the event.

[1] https://lwn.net/Articles/696240/

Since Peter had suggest LSM hooks in 2016 [1], I am adding his
Suggested-by tag below.

To use this patch, we set the perf_event_paranoid sysctl to -1 and then
apply selinux checking as appropriate (default deny everything, and then
add policy rules to give access to domains that need it). In the future
we can remove the perf_event_paranoid sysctl altogether.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Co-developed-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: James Morris <jmorris@namei.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: rostedt@goodmis.org
Cc: Yonghong Song <yhs@fb.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: jeffv@google.com
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: primiano@google.com
Cc: Song Liu <songliubraving@fb.com>
Cc: rsavitski@google.com
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Matthew Garrett <matthewgarrett@google.com>
Link: https://lkml.kernel.org/r/20191014170308.70668-1-joel@joelfernandes.org
2019-10-17 21:31:55 +02:00
Nicholas Piggin
10c4bd7cd2 powerpc/perf: fix imc allocation failure handling
The alloc_pages_node return value should be tested for failure
before being passed to page_address.

Tested-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190724084638.24982-3-npiggin@gmail.com
2019-08-20 21:22:20 +10:00
Linus Torvalds
192f0f8e9d powerpc updates for 5.3
Notable changes:
 
  - Removal of the NPU DMA code, used by the out-of-tree Nvidia driver, as well
    as some other functions only used by drivers that haven't (yet?) made it
    upstream.
 
  - A fix for a bug in our handling of hardware watchpoints (eg. perf record -e
    mem: ...) which could lead to register corruption and kernel crashes.
 
  - Enable HAVE_ARCH_HUGE_VMAP, which allows us to use large pages for vmalloc
    when using the Radix MMU.
 
  - A large but incremental rewrite of our exception handling code to use gas
    macros rather than multiple levels of nested CPP macros.
 
 And the usual small fixes, cleanups and improvements.
 
 Thanks to:
   Alastair D'Silva, Alexey Kardashevskiy, Andreas Schwab, Aneesh Kumar K.V, Anju
   T Sudhakar, Anton Blanchard, Arnd Bergmann, Athira Rajeev, Cédric Le Goater,
   Christian Lamparter, Christophe Leroy, Christophe Lombard, Christoph Hellwig,
   Daniel Axtens, Denis Efremov, Enrico Weigelt, Frederic Barrat, Gautham R.
   Shenoy, Geert Uytterhoeven, Geliang Tang, Gen Zhang, Greg Kroah-Hartman, Greg
   Kurz, Gustavo Romero, Krzysztof Kozlowski, Madhavan Srinivasan, Masahiro
   Yamada, Mathieu Malaterre, Michael Neuling, Nathan Lynch, Naveen N. Rao,
   Nicholas Piggin, Nishad Kamdar, Oliver O'Halloran, Qian Cai, Ravi Bangoria,
   Sachin Sant, Sam Bobroff, Satheesh Rajendran, Segher Boessenkool, Shaokun
   Zhang, Shawn Anastasio, Stewart Smith, Suraj Jitindar Singh, Thiago Jung
   Bauermann, YueHaibing.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJdKVoLAAoJEFHr6jzI4aWA0kIP/A6shIbbE7H5W2hFrqt/PPPK
 3+VrvPKbOFF+W6hcE/RgSZmEnUo0svdNjHUd/eMfFS1vb/uRt2QDdrsHUNNwURQL
 M2mcLXFwYpnjSjb/XMgDbHpAQxjeGfTdYLonUIejN7Rk8KQUeLyKQ3SBn6kfMc46
 DnUUcPcjuRGaETUmVuZZ4e40ZWbJp8PKDrSJOuUrTPXMaK5ciNbZk5mCWXGbYl6G
 BMQAyv4ld/417rNTjBEP/T2foMJtioAt4W6mtlgdkOTdIEZnFU67nNxDBthNSu2c
 95+I+/sML4KOp1R4yhqLSLIDDbc3bg3c99hLGij0d948z3bkSZ8bwnPaUuy70C4v
 U8rvl/+N6C6H3DgSsPE/Gnkd8DnudqWY8nULc+8p3fXljGwww6/Qgt+6yCUn8BdW
 WgixkSjKgjDmzTw8trIUNEqORrTVle7cM2hIyIK2Q5T4kWzNQxrLZ/x/3wgoYjUa
 1KwIzaRo5JKZ9D3pJnJ5U+knE2/90rJIyfcp0W6ygyJsWKi2GNmq1eN3sKOw0IxH
 Tg86RENIA/rEMErNOfP45sLteMuTR7of7peCG3yumIOZqsDVYAzerpvtSgip2cvK
 aG+9HcYlBFOOOF9Dabi8GXsTBLXLfwiyjjLSpA9eXPwW8KObgiNfTZa7ujjTPvis
 4mk9oukFTFUpfhsMmI3T
 =3dBZ
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Notable changes:

   - Removal of the NPU DMA code, used by the out-of-tree Nvidia driver,
     as well as some other functions only used by drivers that haven't
     (yet?) made it upstream.

   - A fix for a bug in our handling of hardware watchpoints (eg. perf
     record -e mem: ...) which could lead to register corruption and
     kernel crashes.

   - Enable HAVE_ARCH_HUGE_VMAP, which allows us to use large pages for
     vmalloc when using the Radix MMU.

   - A large but incremental rewrite of our exception handling code to
     use gas macros rather than multiple levels of nested CPP macros.

  And the usual small fixes, cleanups and improvements.

  Thanks to: Alastair D'Silva, Alexey Kardashevskiy, Andreas Schwab,
  Aneesh Kumar K.V, Anju T Sudhakar, Anton Blanchard, Arnd Bergmann,
  Athira Rajeev, Cédric Le Goater, Christian Lamparter, Christophe
  Leroy, Christophe Lombard, Christoph Hellwig, Daniel Axtens, Denis
  Efremov, Enrico Weigelt, Frederic Barrat, Gautham R. Shenoy, Geert
  Uytterhoeven, Geliang Tang, Gen Zhang, Greg Kroah-Hartman, Greg Kurz,
  Gustavo Romero, Krzysztof Kozlowski, Madhavan Srinivasan, Masahiro
  Yamada, Mathieu Malaterre, Michael Neuling, Nathan Lynch, Naveen N.
  Rao, Nicholas Piggin, Nishad Kamdar, Oliver O'Halloran, Qian Cai, Ravi
  Bangoria, Sachin Sant, Sam Bobroff, Satheesh Rajendran, Segher
  Boessenkool, Shaokun Zhang, Shawn Anastasio, Stewart Smith, Suraj
  Jitindar Singh, Thiago Jung Bauermann, YueHaibing"

* tag 'powerpc-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (163 commits)
  powerpc/powernv/idle: Fix restore of SPRN_LDBAR for POWER9 stop state.
  powerpc/eeh: Handle hugepages in ioremap space
  ocxl: Update for AFU descriptor template version 1.1
  powerpc/boot: pass CONFIG options in a simpler and more robust way
  powerpc/boot: add {get, put}_unaligned_be32 to xz_config.h
  powerpc/irq: Don't WARN continuously in arch_local_irq_restore()
  powerpc/module64: Use symbolic instructions names.
  powerpc/module32: Use symbolic instructions names.
  powerpc: Move PPC_HA() PPC_HI() and PPC_LO() to ppc-opcode.h
  powerpc/module64: Fix comment in R_PPC64_ENTRY handling
  powerpc/boot: Add lzo support for uImage
  powerpc/boot: Add lzma support for uImage
  powerpc/boot: don't force gzipped uImage
  powerpc/8xx: Add microcode patch to move SMC parameter RAM.
  powerpc/8xx: Use IO accessors in microcode programming.
  powerpc/8xx: replace #ifdefs by IS_ENABLED() in microcode.c
  powerpc/8xx: refactor programming of microcode CPM params.
  powerpc/8xx: refactor printing of microcode patch name.
  powerpc/8xx: Refactor microcode write
  powerpc/8xx: refactor writing of CPM microcode arrays
  ...
2019-07-13 16:08:36 -07:00
Geliang Tang
c197922f0a powerpc/perf/24x7: use rb_entry
To make the code clearer, use rb_entry() instead of container_of() to
deal with rbtree.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-04 17:09:37 +10:00
Anju T Sudhakar
9c9f8fb71f powerpc/perf: Use cpumask_last() to determine the designated cpu for nest/core units.
Nest and core IMC (In-Memory Collection counters) assigns a particular
cpu as the designated target for counter data collection. During
system boot, the first online cpu in a chip gets assigned as the
designated cpu for that chip(for nest-imc) and the first online cpu in
a core gets assigned as the designated cpu for that core(for
core-imc).

If the designated cpu goes offline, the next online cpu from the same
chip(for nest-imc)/core(for core-imc) is assigned as the next target,
and the event context is migrated to the target cpu. Currently,
cpumask_any_but() function is used to find the target cpu. Though this
function is expected to return a `random` cpu, this always returns the
next online cpu.

If all cpus in a chip/core is offlined in a sequential manner,
starting from the first cpu, the event migration has to happen for all
the cpus which goes offline. Since the migration process involves a
grace period, the total time taken to offline all the cpus will be
significantly high.

Example:
  In a system which has 2 sockets, with
  NUMA node0 CPU(s):     0-87
  NUMA node8 CPU(s):     88-175

  Time taken to offline cpu 88-175:
  real    2m56.099s
  user    0m0.191s
  sys     0m0.000s

Use cpumask_last() to choose the target cpu, when the designated cpu
goes online, so the migration will happen only when the last_cpu in
the mask goes offline. This way the time taken to offline all cpus in
a chip/core can be reduced.

With the patch:

  Time taken  to offline cpu 88-175:
  real    0m12.207s
  user    0m0.171s
  sys     0m0.000s

Offlining all cpus in reverse order is also taken care because,
cpumask_any_but() is used to find the designated cpu if the last cpu
in the mask goes offline. Since cpumask_any_but() always return the
first cpu in the mask, that becomes the designated cpu and migration
will happen only when the first_cpu in the mask goes offline.

Example: With the patch,

  Time taken to offline cpu from 175-88:
  real    0m9.330s
  user    0m0.110s
  sys     0m0.000s

Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-06-19 20:05:09 +10:00
Linus Torvalds
460b48a0fe powerpc fixes for 5.2 #3
A minor fix to our IMC PMU code to print a less confusing error message when the
 driver can't initialise properly.
 
 A fix for a bug where a user requesting an unsupported branch sampling filter
 can corrupt PMU state, preventing the PMU from counting properly.
 
 And finally a fix for a bug in our support for kexec_file_load(), which
 prevented loading a kernel and initramfs. Most versions of kexec don't yet use
 kexec_file_load().
 
 Thanks to:
   Anju T Sudhakar, Dave Young, Madhavan Srinivasan, Ravi Bangoria, Thiago Jung
   Bauermann.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJc86afAAoJEFHr6jzI4aWAIC8P/R/1Y8ro94tmu58MvqNmG80F
 Rc/+/1dY023X/itMHikSzPzs1UnQ1sjSKOOk3WOJcgVBE6Ks9Q7peeWD6wnSwYUq
 PwvXAZNmsD0Bgh9Oazj7M4dMSZHq7zJzQ8WzopltVueKgx941BLwYPRZG5zFXfcV
 k2Zaln87wfOuyfl/gQyWuG2ikziHJl9EvS3IIFejAnnogkFMti/FAe5bYXP3IT4W
 c4whLQJMNLeaeA7zeFBUvotWCZVfCq+Li9QTKWYJIPIZqN3kSBP3wHuUC8wEFuXa
 32+qu7RW5NfbuJERtCJYXqjO9C9A/QgOAm2n0LF2vBzo5CwBIqiR69PRuVzzUekU
 NhYsHbbwviRl1By7pxUCNm48TW+3utm903etEp+PPV3FYdG+E2hlhs8NoUEdiN5X
 PWzEMqUqneN6+HnwpEeeiDQef19XTaGr3PUd8CtJYJQ4rSeyvnBASjGKfVuBW6FL
 85aCwsmAYdapJOyhK5OSaDEWwMnpLBClJKqNrdH7y9fFm2lFHSJSNxHERBtSQcfx
 Ra+F7M0tyPRJFmVfXiX8u2ZYvYBLUIYUV1+mJHokEFOR9tW+Oy1lV3UMlXzWZXqb
 CY3qpCk0ix1rLjv8VpDoVQE2LkMhhUGqsgymSFueIwsrQhfhb/ymiVmaEH202tTM
 DyvBWmVuI9vyqR3Vt9GQ
 =ilhV
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "A minor fix to our IMC PMU code to print a less confusing error
  message when the driver can't initialise properly.

  A fix for a bug where a user requesting an unsupported branch sampling
  filter can corrupt PMU state, preventing the PMU from counting
  properly.

  And finally a fix for a bug in our support for kexec_file_load(),
  which prevented loading a kernel and initramfs. Most versions of kexec
  don't yet use kexec_file_load().

  Thanks to: Anju T Sudhakar, Dave Young, Madhavan Srinivasan, Ravi
  Bangoria, Thiago Jung Bauermann"

* tag 'powerpc-5.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/kexec: Fix loading of kernel + initramfs with kexec_file_load()
  powerpc/perf: Fix MMCRA corruption by bhrb_filter
  powerpc/powernv: Return for invalid IMC domain
2019-06-02 10:21:04 -07:00
Thomas Gleixner
2874c5fd28 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 3029 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-30 11:26:32 -07:00
Thomas Gleixner
7cb22cc3ec treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 138
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or any
  later version

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 1 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190524100844.067492367@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-30 11:25:16 -07:00
Thomas Gleixner
f4344b19fa treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 112
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or
  later version

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 4 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190523091650.480557885@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24 17:39:01 +02:00
Ravi Bangoria
3202e35ec1 powerpc/perf: Fix MMCRA corruption by bhrb_filter
Consider a scenario where user creates two events:

  1st event:
    attr.sample_type |= PERF_SAMPLE_BRANCH_STACK;
    attr.branch_sample_type = PERF_SAMPLE_BRANCH_ANY;
    fd = perf_event_open(attr, 0, 1, -1, 0);

  This sets cpuhw->bhrb_filter to 0 and returns valid fd.

  2nd event:
    attr.sample_type |= PERF_SAMPLE_BRANCH_STACK;
    attr.branch_sample_type = PERF_SAMPLE_BRANCH_CALL;
    fd = perf_event_open(attr, 0, 1, -1, 0);

  It overrides cpuhw->bhrb_filter to -1 and returns with error.

Now if power_pmu_enable() gets called by any path other than
power_pmu_add(), ppmu->config_bhrb(-1) will set MMCRA to -1.

Fixes: 3925f46bb5 ("powerpc/perf: Enable branch stack sampling framework")
Cc: stable@vger.kernel.org # v3.10+
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-22 21:38:20 +10:00
Anju T Sudhakar
012ae24484 powerpc/perf: Trace imc PMU functions
Add PMU functions to support trace-imc.

Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 02:55:00 +10:00
Anju T Sudhakar
72c69dcddc powerpc/perf: Trace imc events detection and cpuhotplug
Patch detects trace-imc events, does memory initilizations for each online
cpu, and registers cpuhotplug call-backs.

Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 02:55:00 +10:00
Madhavan Srinivasan
216c3087a3 powerpc/perf: Add privileged access check for thread_imc
Add code to restrict user access to thread_imc pmu since
some event report privilege level information.

Fixes: f74c89bd80 ("powerpc/perf: Add thread IMC PMU support")
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 02:54:59 +10:00
Anju T Sudhakar
dd50cf7cbc powerpc/perf: Rearrange setting of ldbar for thread-imc
LDBAR holds the memory address allocated for each cpu. For thread-imc
the mode bit (i.e bit 1) of LDBAR is set to accumulation.
Currently, ldbar is loaded with per cpu memory address and mode set to
accumulation at boot time.

To enable trace-imc, the mode bit of ldbar should be set to 'trace'. So to
accommodate trace-mode of IMC, reposition setting of ldbar for thread-imc
to thread_imc_event_add(). Also reset ldbar at thread_imc_event_del().

Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 02:54:59 +10:00
Anju T Sudhakar
860b7d2286 powerpc/perf: Fix loop exit condition in nest_imc_event_init
The data structure (i.e struct imc_mem_info) to hold the memory address
information for nest imc units is allocated based on the number of nodes
in the system.

nest_imc_event_init() traverse this struct array to calculate the memory
base address for the event-cpu. If we fail to find a match for the event
cpu's chip-id in imc_mem_info struct array, then the do-while loop will
iterate until we crash.

Fix this by changing the loop exit condition based on the number of
non zero vbase elements in the array, since the allocation is done for
nr_chips + 1.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 885dcd709b ("powerpc/perf: Add nest IMC PMU support")
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 02:54:59 +10:00
Anju T Sudhakar
a913e5e8b4 powerpc/perf: Return accordingly on invalid chip-id in
Nest hardware counter memory resides in a per-chip reserve-memory.
During nest_imc_event_init(), chip-id of the event-cpu is considered to
calculate the base memory addresss for that cpu. Return, proper error
condition if the chip_id calculated is invalid.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 885dcd709b ("powerpc/perf: Add nest IMC PMU support")
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 02:54:59 +10:00
Madhavan Srinivasan
659a6e38db powerpc/perf: Remove PM_BR_CMPL_ALT from power9 event list
PM_BR_CMPL_ALT event is not supported, remove it from the power9 event
list.

Fixes: 24bedcb7c8 ("powerpc/perf: Fix branch event code for power9")
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 02:54:59 +10:00
Madhavan Srinivasan
be80e758d0 powerpc/perf: Add generic compat mode pmu driver
Most of the power processor generation performance monitoring
unit (PMU) driver code is bundled in the kernel and one of those
is enabled/registered based on the oprofile_cpu_type check at
the boot.

But things get little tricky incase of "compat" mode boot.
IBM POWER System Server based processors has a compactibility
mode feature, which simpily put is, Nth generation processor
(lets say POWER8) will act and appear in a mode consistent
with an earlier generation (N-1) processor (that is POWER7).
And in this "compat" mode boot, kernel modify the
"oprofile_cpu_type" to be Nth generation (POWER8). If Nth
generation pmu driver is bundled (POWER8), it gets registered.

Key dependency here is to have distro support for latest
processor performance monitoring support. Patch here adds
a generic "compat-mode" performance monitoring driver to
be register in absence of powernv platform specific pmu driver.

Driver supports only "cycles" and "instruction" events.
"0x0001e" used as event code for "cycles" and "0x00002"
used as event code for "instruction" events. New file
called "generic-compat-pmu.c" is created to contain the driver
specific code. And base raw event code format modeled
on PPMU_ARCH_207S.

Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
[mpe: Use SPDX tag for license]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 02:54:58 +10:00
Madhavan Srinivasan
708597daf2 powerpc/perf: init pmu from core-book3s
Currenty pmu driver file for each ppc64 generation processor
has a __init call in itself. Refactor the code by moving the
__init call to core-books.c. This also clean's up compat mode
pmu driver registration.

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
[mpe: Use SPDX tag for license]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 02:54:58 +10:00
Linus Torvalds
6c3ac11343 powerpc updates for 5.1
Notable changes:
 
  - Enable THREAD_INFO_IN_TASK to move thread_info off the stack.
 
  - A big series from Christoph reworking our DMA code to use more of the generic
    infrastructure, as he said:
    "This series switches the powerpc port to use the generic swiotlb and
     noncoherent dma ops, and to use more generic code for the coherent direct
     mapping, as well as removing a lot of dead code."
 
  - Increase our vmalloc space to 512T with the Hash MMU on modern CPUs, allowing
    us to support machines with larger amounts of total RAM or distance between
    nodes.
 
  - Two series from Christophe, one to optimise TLB miss handlers on 6xx, and
    another to optimise the way STRICT_KERNEL_RWX is implemented on some 32-bit
    CPUs.
 
  - Support for KCOV coverage instrumentation which means we can run syzkaller
    and discover even more bugs in our code.
 
 And as always many clean-ups, reworks and minor fixes etc.
 
 Thanks to:
  Alan Modra, Alexey Kardashevskiy, Alistair Popple, Andrea Arcangeli, Andrew
  Donnellan, Aneesh Kumar K.V, Aravinda Prasad, Balbir Singh, Brajeswar Ghosh,
  Breno Leitao, Christian Lamparter, Christian Zigotzky, Christophe Leroy,
  Christoph Hellwig, Corentin Labbe, Daniel Axtens, David Gibson, Diana Craciun,
  Firoz Khan, Gustavo A. R. Silva, Igor Stoppa, Joe Lawrence, Joel Stanley,
  Jonathan Neuschäfer, Jordan Niethe, Laurent Dufour, Madhavan Srinivasan, Mahesh
  Salgaonkar, Mark Cave-Ayland, Masahiro Yamada, Mathieu Malaterre, Matteo Croce,
  Meelis Roos, Michael W. Bringmann, Nathan Chancellor, Nathan Fontenot, Nicholas
  Piggin, Nick Desaulniers, Nicolai Stange, Oliver O'Halloran, Paul Mackerras,
  Peter Xu, PrasannaKumar Muralidharan, Qian Cai, Rashmica Gupta, Reza Arbab,
  Robert P. J. Day, Russell Currey, Sabyasachi Gupta, Sam Bobroff, Sandipan Das,
  Sergey Senozhatsky, Souptick Joarder, Stewart Smith, Tyrel Datwyler, Vaibhav
  Jain, YueHaibing.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJcgRJlAAoJEFHr6jzI4aWAL9oP+gPlrZgyaAg/51lmubLtlbtk
 QuGU8EiuJZoJD1OHrMPtppBOY7rQZOxJe58AoPig8wTvs+j/TxJ25fmiZncnf5U2
 PC8QAjbj0UmQHgy+K30sUeOnDg9tdkHKHJ5/ecjJcvykkqsjyMnV7biFQ1cOA0HT
 LflXHEEtiG9P9u7jZoAhtnfpgn1/l9mhTYMe26J1fqvC0164qMDFaXDTQXyDfyvG
 gmuqccGMawSk7IdagmQxwXtwyfwOnarmGn+n31XKRejApGZ/pjiEA23JOJOaJcia
 m76Jy3roao6sEtCUNpBFXEtwOy9POy3OiGy6yg/9896tDMvG84OuO6ltV1nFGawL
 PmwE+ug63L4g/HWxZyAeb26T2oTTp/YIaKQPtsq4d286pvg/qr2KPNzFoAEhmJqU
 yLrebv276pVeiLpLmCLPvcPj9t76vWKZaUm0FoE+zUDg7Rl7Alow8A/c4tdjOI6y
 QwpbCiYseyiJ32lCZZdbN7Cy6+iM6vb3i1oNKc8MVqhBGTwLJnTU0ruPBSvCaRvD
 NoQWO1RWpNu/BuivuLEKS9q3AoxenGwiqowxGhdVmI3Oc9jGWcEYlduR00VDYPVp
 /RCfwtTY5NyC++h5cnbz8aLJ1hBXG5m79CXfprV+zPWeiLPCaMT6w9Y5QUS2wqA+
 EZ734NknDJOjaHc4cGdZ
 =Z9bb
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Notable changes:

   - Enable THREAD_INFO_IN_TASK to move thread_info off the stack.

   - A big series from Christoph reworking our DMA code to use more of
     the generic infrastructure, as he said:
       "This series switches the powerpc port to use the generic swiotlb
        and noncoherent dma ops, and to use more generic code for the
        coherent direct mapping, as well as removing a lot of dead
        code."

   - Increase our vmalloc space to 512T with the Hash MMU on modern
     CPUs, allowing us to support machines with larger amounts of total
     RAM or distance between nodes.

   - Two series from Christophe, one to optimise TLB miss handlers on
     6xx, and another to optimise the way STRICT_KERNEL_RWX is
     implemented on some 32-bit CPUs.

   - Support for KCOV coverage instrumentation which means we can run
     syzkaller and discover even more bugs in our code.

  And as always many clean-ups, reworks and minor fixes etc.

  Thanks to: Alan Modra, Alexey Kardashevskiy, Alistair Popple, Andrea
  Arcangeli, Andrew Donnellan, Aneesh Kumar K.V, Aravinda Prasad, Balbir
  Singh, Brajeswar Ghosh, Breno Leitao, Christian Lamparter, Christian
  Zigotzky, Christophe Leroy, Christoph Hellwig, Corentin Labbe, Daniel
  Axtens, David Gibson, Diana Craciun, Firoz Khan, Gustavo A. R. Silva,
  Igor Stoppa, Joe Lawrence, Joel Stanley, Jonathan Neuschäfer, Jordan
  Niethe, Laurent Dufour, Madhavan Srinivasan, Mahesh Salgaonkar, Mark
  Cave-Ayland, Masahiro Yamada, Mathieu Malaterre, Matteo Croce, Meelis
  Roos, Michael W. Bringmann, Nathan Chancellor, Nathan Fontenot,
  Nicholas Piggin, Nick Desaulniers, Nicolai Stange, Oliver O'Halloran,
  Paul Mackerras, Peter Xu, PrasannaKumar Muralidharan, Qian Cai,
  Rashmica Gupta, Reza Arbab, Robert P. J. Day, Russell Currey,
  Sabyasachi Gupta, Sam Bobroff, Sandipan Das, Sergey Senozhatsky,
  Souptick Joarder, Stewart Smith, Tyrel Datwyler, Vaibhav Jain,
  YueHaibing"

* tag 'powerpc-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (200 commits)
  powerpc/32: Clear on-stack exception marker upon exception return
  powerpc: Remove export of save_stack_trace_tsk_reliable()
  powerpc/mm: fix "section_base" set but not used
  powerpc/mm: Fix "sz" set but not used warning
  powerpc/mm: Check secondary hash page table
  powerpc: remove nargs from __SYSCALL
  powerpc/64s: Fix unrelocated interrupt trampoline address test
  powerpc/powernv/ioda: Fix locked_vm counting for memory used by IOMMU tables
  powerpc/fsl: Fix the flush of branch predictor.
  powerpc/powernv: Make opal log only readable by root
  powerpc/xmon: Fix opcode being uninitialized in print_insn_powerpc
  powerpc/powernv: move OPAL call wrapper tracing and interrupt handling to C
  powerpc/64s: Fix data interrupts vs d-side MCE reentrancy
  powerpc/64s: Prepare to handle data interrupts vs d-side MCE reentrancy
  powerpc/64s: system reset interrupt preserve HSRRs
  powerpc/64s: Fix HV NMI vs HV interrupt recoverability test
  powerpc/mm/hash: Handle mmap_min_addr correctly in get_unmapped_area topdown search
  powerpc/hugetlb: Handle mmap_min_addr correctly in get_unmapped_area callback
  selftests/powerpc: Remove duplicate header
  powerpc sstep: Add support for modsd, modud instructions
  ...
2019-03-07 12:56:26 -08:00
Michael Ellerman
637cfeb9f9 Merge branch 'fixes' into next
There's a few important fixes in our fixes branch, in particular the
pgd/pud_present() one, so merge it now.
2019-02-19 19:56:26 +11:00
Ingo Molnar
98cb621081 Merge branch 'perf/urgent' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-04 08:45:42 +01:00
Madhavan Srinivasan
ab4510e9ac powerpc/perf: Add mem access events to sysfs
Add mem-loads/mem-stores events to sysfs.
The event is formed based on raw event encoding.
Primary PMU event used here is PM_MRK_INST_CMPL
along with MMCRA[SM] modes and Thresholding bit

Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-31 10:38:27 +11:00
Andrew Murray
c2c9091d9e perf/core, arch/powerpc: use PERF_PMU_CAP_NO_EXCLUDE for exclusion incapable PMUs
For PowerPC PMUs that do not support context exclusion let's
advertise the PERF_PMU_CAP_NO_EXCLUDE capability. This ensures that
perf will prevent us from handling events where any exclusion flags
are set. Let's also remove the now unnecessary check for exclusion
flags.

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: robin.murphy@arm.com
Cc: suzuki.poulose@arm.com
Link: https://lkml.kernel.org/r/1547128414-50693-10-git-send-email-andrew.murray@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-01-21 11:01:26 +01:00
Madhavan Srinivasan
6529870cb0 powerpc/perf: Update perf_regs structure to include MMCRA
On each sample, Monitor Mode Control Register A (MMCRA) content is
saved in pt_regs. MMCRA does not have a entry as-is in the pt_regs but
instead, MMCRA content is saved in the "dsisr" register of pt_regs.

Patch adds another entry to the perf_regs structure to include the
"MMCRA" printing which internally maps to the "dsisr" of pt_regs.

It also check for the MMCRA availability in the platform and present
value accordingly

mpe: This was the 2nd patch in a series with commit 333804dc3b
("powerpc/perf: Update perf_regs structure to include SIER") but I
accidentally only merged the 1st patch, so merge this one now.

Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-08 19:22:47 +11:00
Ravi Bangoria
0c9108b083 Powerpc/perf: Wire up PMI throttling
Commit 14c63f17b1 ("perf: Drop sample rate when sampling is too
slow") introduced a way to throttle PMU interrupts if we're spending
too much time just processing those. Wire up powerpc PMI handler to
use this infrastructure.

We have throttling of the *rate* of interrupts, but this adds
throttling based on the *time taken* to process the interrupts.

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-21 11:32:49 +11:00
Madhavan Srinivasan
3757cba80a powerpc/perf: Remove l2 bus events from HW cache event array
Remove PM_L2_ST_MISS and PM_L2_ST from HW cache event array since
these are bus events. And these needs to be programmed in groups.
Hence remove them.

Fixes: f1fb60bfde ('powerpc/perf: Export Power9 generic and cache events to sysfs')
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-20 20:53:11 +11:00
Madhavan Srinivasan
59029136d7 powerpc/perf: Add constraints for power9 l2/l3 bus events
In previous generation processors, both bus events and direct
events of performance monitoring unit can be individually
programmabled and monitored in PMCs.

But in Power9, L2/L3 bus events are always available as a
"bank" of 4 events. To obtain the counts for any of the
l2/l3 bus events in a given bank, the user will have to
program PMC4 with corresponding l2/l3 bus event for that
bank.

Patch enforce two contraints incase of L2/L3 bus events.

1)Any L2/L3 event when programmed is also expected to program corresponding
PMC4 event from that group.
2)PMC4 event should always been programmed first due to group constraint
logic limitation

For ex. consider these L3 bus events

PM_L3_PF_ON_CHIP_MEM (0x460A0),
PM_L3_PF_MISS_L3 (0x160A0),
PM_L3_CO_MEM (0x260A0),
PM_L3_PF_ON_CHIP_CACHE (0x360A0),

1) This is an INVALID group for L3 Bus event monitoring,
since it is missing PMC4 event.
	perf stat -e "{r160A0,r260A0,r360A0}" < >

And this is a VALID group for L3 Bus events:
	perf stat -e "{r460A0,r160A0,r260A0,r360A0}" < >

2) This is an INVALID group for L3 Bus event monitoring,
since it is missing PMC4 event.
	perf stat -e "{r260A0,r360A0}" < >

And this is a VALID group for L3 Bus events:
	perf stat -e "{r460A0,r260A0,r360A0}" < >

3) This is an INVALID group for L3 Bus event monitoring,
since it is missing PMC4 event.
	perf stat -e "{r360A0}" < >

And this is a VALID group for L3 Bus events:
	perf stat -e "{r460A0,r360A0}" < >

Patch here implements group constraint logic suggested by Michael Ellerman.

Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-20 20:53:11 +11:00
Madhavan Srinivasan
2d46d4877b powerpc/perf: Fix unit_sel/cache_sel checks
Raw event code has couple of fields "unit" and "cache" in it, to capture
the "unit" to monitor for a given pmcxsel and cache reload qualifier to
program in MMCR1.

isa207_get_constraint() refers "unit" field to update the MMCRC (L2/L3)
Event bus control fields with "cache" bits of the raw event code.
These are power8 specific and not supported by PowerISA v3.0 pmu. So wrap
the checks to be power8 specific. Also, "cache" bit field is referred to
update MMCR1[16:17] and this check can be power8 specific.

Fixes: 7ffd948fae ('powerpc/perf: factor out power8 pmu functions')
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-20 20:53:11 +11:00
Madhavan Srinivasan
8c31459d61 powerpc/perf: Cleanup cache_sel bits comment
Update the raw event code comment in power9-pmu.c with respect to
"cache" bits, since power9 MMCRC does not support these.

Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-20 20:53:11 +11:00
Madhavan Srinivasan
333804dc3b powerpc/perf: Update perf_regs structure to include SIER
On each sample, Sample Instruction Event Register (SIER) content
is saved in pt_regs. SIER does not have a entry as-is in the pt_regs
but instead, SIER content is saved in the "dar" register of pt_regs.

Patch adds another entry to the perf_regs structure to include the "SIER"
printing which internally maps to the "dar" of pt_regs.

It also check for the SIER availability in the platform and present
value accordingly

Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-20 20:53:11 +11:00
Madhavan Srinivasan
17cfccc915 powerpc/perf: Fix thresholding counter data for unknown type
MMCRA[34:36] and MMCRA[38:44] expose the thresholding counter value.
Thresholding counter can be used to count latency cycles such as
load miss to reload. But threshold counter value is not relevant
when the sampled instruction type is unknown or reserved. Patch to
fix the thresholding counter value to zero when sampled instruction
type is unknown or reserved.

Fixes: 170a315f41c6('powerpc/perf: Support to export MMCRA[TEC*] field to userspace')
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-20 20:53:11 +11:00
Breno Leitao
4851f75098 powerpc/perf: Declare static identifier a such
There are three symbols (two variables and a function) that are being used
solely in the same file (imc-pmu.c), thus, these symbols should be static,
but they are not. This was detected by sparse:

	arch/powerpc/perf/imc-pmu.c:31:20: warning: symbol 'nest_imc_refc' was not declared. Should it be static?
	arch/powerpc/perf/imc-pmu.c:37:20: warning: symbol 'core_imc_refc' was not declared. Should it be static?
	arch/powerpc/perf/imc-pmu.c:46:16: warning: symbol 'imc_event_to_pmu' was not declared. Should it be static?

This patch simply adds the 'static' storage-class definition to these
symbols, thus, restricting their usage only in the imc-pmu.c file.

Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-11-25 17:11:21 +11:00
Christophe Leroy
709cf19c57 powerpc/8xx: Use patch_site for perf counters setup
The 8xx TLB miss routines are patched when (de)activating
perf counters.

This patch uses the new patch_site functionality in order
to get a better code readability and avoid a label mess when
dumping the code with 'objdump -d'

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-26 21:58:58 +11:00
Michael Ellerman
23ad1a2700 powerpc: Add -Werror at arch/powerpc level
Back when I added -Werror in commit ba55bd7436 ("powerpc: Add
configurable -Werror for arch/powerpc") I did it by adding it to most
of the arch Makefiles.

At the time we excluded math-emu, because apparently it didn't build
cleanly. But that seems to have been fixed somewhere in the interim.

So move the -Werror addition to the top-level of the arch, this saves
us from repeating it in every Makefile and means we won't forget to
add it to any new sub-dirs.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19 00:56:17 +11:00
Joel Stanley
6233b6da0c powerpc/perf: Quiet IMC PMU registration message
On a Power9 box we get a few screens full of these on boot. Drop
them to pr_debug.

  [    5.993645] nest_centaur6_imc performance monitor hardware support registered
  [    5.993728] nest_centaur7_imc performance monitor hardware support registered
  [    5.996510] core_imc performance monitor hardware support registered
  [    5.996569] nest_mba0_imc performance monitor hardware support registered
  [    5.996631] nest_mba1_imc performance monitor hardware support registered
  [    5.996685] nest_mba2_imc performance monitor hardware support registered

Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Reviewed-by: Stewart Smith <stewart@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-13 22:21:25 +11:00