Commit Graph

692 Commits

Author SHA1 Message Date
Nathan Chancellor
35f20786c4 powerpc: xor_vmx: Add '-mhard-float' to CFLAGS
arch/powerpc/lib/xor_vmx.o is built with '-msoft-float' (from the main
powerpc Makefile) and '-maltivec' (from its CFLAGS), which causes an
error when building with clang after a recent change in main:

  error: option '-msoft-float' cannot be specified with '-maltivec'
  make[6]: *** [scripts/Makefile.build:243: arch/powerpc/lib/xor_vmx.o] Error 1

Explicitly add '-mhard-float' before '-maltivec' in xor_vmx.o's CFLAGS
to override the previous inclusion of '-msoft-float' (as the last option
wins), which matches how other areas of the kernel use '-maltivec', such
as AMDGPU.

Cc: stable@vger.kernel.org
Closes: https://github.com/ClangBuiltLinux/linux/issues/1986
Link: 4792f912b2
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240127-ppc-xor_vmx-drop-msoft-float-v1-1-f24140e81376@kernel.org
2024-03-07 00:13:28 +11:00
Michael Ellerman
8488cdcb00 powerpc/64s: Move dcbt/dcbtst sequence into a macro
There's an almost identical code sequence to specify load/store access
hints in __copy_tofrom_user_power7(), copypage_power7() and
memcpy_power7().

Move the sequence into a common macro, which is passed the registers to
use as they differ slightly.

There also needs to be a copy in the selftests, it could be shared in
future if the headers are cleaned up / refactored.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240229122521.762431-1-mpe@ellerman.id.au
2024-03-03 23:05:21 +11:00
Christophe Leroy
d5835fb60b powerpc: Use user_mode() macro when possible
There is a nice macro to check user mode.

Use it instead of open coding anding with MSR_PR to increase
readability and avoid having to comment what that anding is for.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/fbf74887dcf1f1ba9e1680fc3247cbb581b00662.1708078228.git.christophe.leroy@csgroup.eu
2024-02-22 21:55:33 +11:00
Naveen N Rao
8f9abaa6d7 powerpc/lib: Validate size for vector operations
Some of the fp/vmx code in sstep.c assume a certain maximum size for the
instructions being emulated. The size of those operations however is
determined separately in analyse_instr().

Add a check to validate the assumption on the maximum size of the
operations, so as to prevent any unintended kernel stack corruption.

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Build-tested-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231123071705.397625-1-naveen@kernel.org
2023-11-27 22:06:19 +11:00
Michael Ellerman
df99da19c6 powerpc/lib: Avoid array bounds warnings in vec ops
Building with GCC with -Warray-bounds enabled there are several warnings
in sstep.c along the lines of:

  In function ‘do_byte_reverse’,
      inlined from ‘do_vec_load’ at arch/powerpc/lib/sstep.c:691:3,
      inlined from ‘emulate_loadstore’ at arch/powerpc/lib/sstep.c:3439:9:
  arch/powerpc/lib/sstep.c:289:23: error: array subscript 2 is outside array bounds of ‘u8[16]’ {aka ‘unsigned char[16]’} [-Werror=array-bounds=]
    289 |                 up[2] = byterev_8(up[1]);
        |                 ~~~~~~^~~~~~~~~~~~~~~~~~
  arch/powerpc/lib/sstep.c: In function ‘emulate_loadstore’:
  arch/powerpc/lib/sstep.c:681:11: note: at offset 16 into object ‘u’ of size 16
    681 |         } u = {};
        |           ^

do_byte_reverse() supports a size up to 32 bytes, but in these cases the
caller is only passing a 16 byte buffer. In practice there is no bug,
do_vec_load() is only called from the LOAD_VMX case in emulate_loadstore().
That in turn is only reached when analyse_instr() recognises VMX ops,
and in all cases the size is no greater than 16:

  $ git grep -w LOAD_VMX arch/powerpc/lib/sstep.c
  arch/powerpc/lib/sstep.c:                        op->type = MKOP(LOAD_VMX, 0, 1);
  arch/powerpc/lib/sstep.c:                        op->type = MKOP(LOAD_VMX, 0, 2);
  arch/powerpc/lib/sstep.c:                        op->type = MKOP(LOAD_VMX, 0, 4);
  arch/powerpc/lib/sstep.c:                        op->type = MKOP(LOAD_VMX, 0, 16);

Similarly for do_vec_store().

Although the warning is incorrect, the code would be safer if it clamped
the size from the caller to the known size of the buffer. Do that using
min_t().

Reported-by: Bagas Sanjaya <bagasdotme@gmail.com>
Closes: https://lore.kernel.org/linuxppc-dev/YpbUcPrm61RLIiZF@debian.me/
Reported-by: Jan-Benedict Glaw <jbglaw@lug-owl.de>
Closes: https://lore.kernel.org/linuxppc-dev/20221212215117.aa7255t7qd6yefk4@lug-owl.de/
Reported-by: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Closes: https://lore.kernel.org/linuxppc-dev/6a8bf78c-aedb-4d5a-b0aa-82a51a17b884@embeddedor.com/
Reviewed-by: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Build-tested-by: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231120235436.1569255-1-mpe@ellerman.id.au
2023-11-27 22:05:14 +11:00
Masahiro Yamada
1b1e380026 powerpc: add crtsavres.o to always-y instead of extra-y
crtsavres.o is linked to modules. However, as explained in commit
d0e628cd81 ("kbuild: doc: clarify the difference between extra-y
and always-y"), 'make modules' does not build extra-y.

For example, the following command fails:

  $ make ARCH=powerpc LLVM=1 KBUILD_MODPOST_WARN=1 mrproper ps3_defconfig modules
    [snip]
    LD [M]  arch/powerpc/platforms/cell/spufs/spufs.ko
  ld.lld: error: cannot open arch/powerpc/lib/crtsavres.o: No such file or directory
  make[3]: *** [scripts/Makefile.modfinal:56: arch/powerpc/platforms/cell/spufs/spufs.ko] Error 1
  make[2]: *** [Makefile:1844: modules] Error 2
  make[1]: *** [/home/masahiro/workspace/linux-kbuild/Makefile:350: __build_one_by_one] Error 2
  make: *** [Makefile:234: __sub-make] Error 2

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Fixes: baa25b571a ("powerpc/64: Do not link crtsavres.o in vmlinux")
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231120232332.4100288-1-masahiroy@kernel.org
2023-11-27 22:01:15 +11:00
Hari Bathini
465cabc97b powerpc/code-patching: introduce patch_instructions()
patch_instruction() entails setting up pte, patching the instruction,
clearing the pte and flushing the tlb. If multiple instructions need
to be patched, every instruction would have to go through the above
drill unnecessarily. Instead, introduce patch_instructions() function
that sets up the pte, clears the pte and flushes the tlb only once
per page range of instructions to be patched. Duplicate most of the
patch_instruction() code instead of merging with it, to avoid the
performance degradation observed on ppc32, for patch_instruction(),
with the code path merged. Also, setup poking_init() always as BPF
expects poking_init() to be setup even when STRICT_KERNEL_RWX is off.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231020141358.643575-2-hbathini@linux.ibm.com
2023-10-23 20:33:19 +11:00
Christophe Leroy
74726fda9f powerpc/code-patching: Perform hwsync in __patch_instruction() in case of failure
Commit c28c15b6d2 ("powerpc/code-patching: Use temporary mm for
Radix MMU") added a hwsync for when __patch_instruction() fails,
we results in a quite odd unbalanced logic.

Instead of calling mb() when __patch_instruction() returns an error,
call mb() in the __patch_instruction()'s error path directly.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/e88b154eaf2efd9ff177d472d3411dcdec8ff4f5.1696675567.git.christophe.leroy@csgroup.eu
2023-10-20 23:19:13 +11:00
Nicholas Piggin
b629b54170 powerpc/qspinlock: Rename yield_propagate_owner tunable
Rename yield_propagate_owner to yield_sleepy_owner, which better
describes what it does (what, not how).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Shrikanth Hegde <sshegde@linux.vnet.ibm.com>
Reviewed-by: "Nysal Jan K.A" <nysal@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231016124305.139923-7-npiggin@gmail.com
2023-10-20 22:43:34 +11:00
Nicholas Piggin
1e6d5f7257 powerpc/qspinlock: Propagate sleepy if previous waiter is preempted
The sleepy (aka lock-owner-is-preempted) condition is propagated down
the queue by each waiter. If a waiter becomes preempted, it can no
longer propagate sleepy. To allow subsequent waiters to yield to the
lock owner, also check the lock owner in this case.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Shrikanth Hegde <sshegde@linux.vnet.ibm.com>
Reviewed-by: "Nysal Jan K.A" <nysal@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231016124305.139923-6-npiggin@gmail.com
2023-10-20 22:43:34 +11:00
Nicholas Piggin
fcf77d4427 powerpc/qspinlock: don't propagate the not-sleepy state
To simplify things, don't propagate the not-sleepy condition back down
the queue. Instead, have the waiters clear their own node->sleepy when
finding the lock owner is not preempted.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Shrikanth Hegde <sshegde@linux.vnet.ibm.com>
Reviewed-by: "Nysal Jan K.A" <nysal@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231016124305.139923-5-npiggin@gmail.com
2023-10-20 22:43:34 +11:00
Nicholas Piggin
fd8fae50c9 powerpc/qspinlock: propagate owner preemptedness rather than CPU number
Rather than propagating the CPU number of the preempted lock owner,
just propagate whether the owner was preempted. Waiters must read the
lock value when yielding to it to prevent races anyway, so might as
well always load the owner CPU from the lock.

To further simplify the code, also don't propagate the -1 (or
sleepy=false in the new scheme) down the queue. Instead, have the
waiters clear it themselves when finding the lock owner is not
preempted.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Shrikanth Hegde <sshegde@linux.vnet.ibm.com>
Reviewed-by: "Nysal Jan K.A" <nysal@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231016124305.139923-4-npiggin@gmail.com
2023-10-20 22:43:34 +11:00
Nicholas Piggin
f656864738 powerpc/qspinlock: stop queued waiters trying to set lock sleepy
If a queued waiter notices the lock owner or the previous waiter has
been preempted, it attempts to mark the lock sleepy, but it does this
as a try-set operation using the original lock value it got when
queueing, which will become stale as the queue progresses, and the
try-set will fail. Drop this and just set the sleepy seen clock.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Shrikanth Hegde <sshegde@linux.vnet.ibm.com>
Reviewed-by: "Nysal Jan K.A" <nysal@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231016124305.139923-3-npiggin@gmail.com
2023-10-20 22:43:34 +11:00
Nicholas Piggin
f9bc9bbe8a powerpc/qspinlock: Fix stale propagated yield_cpu
yield_cpu is a sample of a preempted lock holder that gets propagated
back through the queue. Queued waiters use this to yield to the
preempted lock holder without continually sampling the lock word (which
would defeat the purpose of MCS queueing by bouncing the cache line).

The problem is that yield_cpu can become stale. It can take some time to
be passed down the chain, and if any queued waiter gets preempted then
it will cease to propagate the yield_cpu to later waiters.

This can result in yielding to a CPU that no longer holds the lock,
which is bad, but particularly if it is currently in H_CEDE (idle),
then it appears to be preempted and some hypervisors (PowerVM) can
cause very long H_CONFER latencies waiting for H_CEDE wakeup. This
results in latency spikes and hard lockups on oversubscribed
partitions with lock contention.

This is a minimal fix. Before yielding to yield_cpu, sample the lock
word to confirm yield_cpu is still the owner, and bail out of it is not.

Thanks to a bunch of people who reported this and tracked down the
exact problem using tracepoints and dispatch trace logs.

Fixes: 28db61e207 ("powerpc/qspinlock: allow propagation of yield CPU down the queue")
Cc: stable@vger.kernel.org # v6.2+
Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Reported-by: Laurent Dufour <ldufour@linux.ibm.com>
Reported-by: Shrikanth Hegde <sshegde@linux.vnet.ibm.com>
Debugged-by: "Nysal Jan K.A" <nysal@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Shrikanth Hegde <sshegde@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231016124305.139923-2-npiggin@gmail.com
2023-10-18 21:07:21 +11:00
Michael Ellerman
fabdb27da7 powerpc: Drop zalloc_maybe_bootmem()
The only callers of zalloc_maybe_bootmem() are PCI setup routines. These
used to be called early during boot before slab setup, and also during
runtime due to hotplug.

But commit 5537fcb319 ("powerpc/pci: Add ppc_md.discover_phbs()")
moved the boot-time calls later, after slab setup, meaning there's no
longer any need for zalloc_maybe_bootmem(), kzalloc() can be used in all
cases.

Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230823055430.752550-1-mpe@ellerman.id.au
2023-08-24 22:33:16 +10:00
Masahiro Yamada
3932618287 powerpc: replace #include <asm/export.h> with #include <linux/export.h>
Commit ddb5cdbafa ("kbuild: generate KSYMTAB entries by modpost")
deprecated <asm/export.h>, which is now a wrapper of <linux/export.h>.

Replace #include <asm/export.h> with #include <linux/export.h>.

After all the <asm/export.h> lines are converted, <asm/export.h> and
<asm-generic/export.h> will be removed.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
[mpe: Fixup selftests that stub asm/export.h]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230806150954.394189-2-masahiroy@kernel.org
2023-08-16 23:54:48 +10:00
Christophe Leroy
0d5769f950 powerpc/step: Mark __copy_mem_out() and __emulate_dcbz() __always_inline
objtool reports two folliwng warnings:
  arch/powerpc/lib/sstep.o: warning: objtool: copy_mem_out+0x3c
    (.text+0x30c): call to __copy_mem_out() with UACCESS enabled
  arch/powerpc/lib/sstep.o: warning: objtool: emulate_dcbz+0x70
    (.text+0x4dc): call to __emulate_dcbz() with UACCESS enabled

Mark __copy_mem_out() and __emulate_dcbz() __always_inline

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/f1d4a15da70190f8c2fcddb377bbc1e09827242c.1687343857.git.christophe.leroy@csgroup.eu
2023-08-16 23:54:48 +10:00
Christophe Leroy
6b289911c8 powerpc/features: Add capability to update mmu features later
On powerpc32, features fixup is performed very early and that's too
early to read the cmdline and take into account 'nosmap' parameter.

On the other hand, no userspace access is performed that early and
KUAP feature fixup can be performed later.

Add a function to update mmu features. The function is passed a
mask with the features that can be updated.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/31b27ee2c9d338f4f82cd8cd69d6bff979495290.1689091022.git.christophe.leroy@csgroup.eu
2023-08-02 22:22:17 +10:00
Masahiro Yamada
54a11654de powerpc: remove checks for binutils older than 2.25
Commit e441273947 ("Documentation: raise minimum supported version of
binutils to 2.25") allows us to remove the checks for old binutils.

There is no more user for ld-ifversion. Remove it as well.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230119082250.151485-1-masahiroy@kernel.org
2023-06-27 16:59:29 +10:00
Rohan McLure
6f3136326e powerpc: qspinlock: Enforce qnode writes prior to publishing to queue
Annotate the release barrier and memory clobber (in effect, producing a
compiler barrier) in the publish_tail_cpu call. These barriers have the
effect of ensuring that qnode attributes are all written to prior to
publish the node to the waitqueue.

Even while the initial write to the 'locked' attribute is guaranteed to
terminate prior to the node being visible, KCSAN still complains that
the write is reorderable by the compiler. Issue a kcsan_release() to
inform KCSAN of the release barrier contained in publish_tail_cpu().

Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230510033117.1395895-3-rmclure@linux.ibm.com
2023-06-21 15:13:57 +10:00
Rohan McLure
03d44ee80e powerpc: qspinlock: Mark accesses to qnode lock checks
The powerpc implementation of qspinlocks will both poll and spin on the
bitlock guarding a qnode. Mark these accesses with READ_ONCE to convey
to KCSAN that polling is intentional here.

Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230510033117.1395895-2-rmclure@linux.ibm.com
2023-06-21 15:13:57 +10:00
Nicholas Piggin
7e3a68be42 powerpc/64: vmlinux support building with PCREL addresing
PC-Relative or PCREL addressing is an extension to the ELF ABI which
uses Power ISA v3.1 PC-relative instructions to calculate addresses,
rather than the traditional TOC scheme.

Add an option to build vmlinux using pcrel addressing. Modules continue
to use TOC addressing.

- TOC address helpers and r2 are poisoned with -1 when running vmlinux.
  r2 could be used for something useful once things are ironed out.

- Assembly must call C functions with @notoc annotation, or the linker
  complains aobut a missing nop after the call. This is done with the
  CFUNC macro introduced earlier.

- Boot: with the exception of prom_init, the execution branches to the
  kernel virtual address early in boot, before any addresses are
  generated, which ensures 34-bit pcrel addressing does not miss the
  high PAGE_OFFSET bits. TOC relative addressing has a similar
  requirement. prom_init does not go to the virtual address and its
  addresses should not carry over to the post-prom kernel.

- Ftrace trampolines are converted from TOC addressing to pcrel
  addressing, including module ftrace trampolines that currently use the
  kernel TOC to find ftrace target functions.

- BPF function prologue and function calling generation are converted
  from TOC to pcrel.

- copypage_64.S has an interesting problem, prefixed instructions have
  alignment restrictions so the linker can add padding, which makes the
  assembler treat the difference between two local labels as
  non-constant even if alignment is arranged so padding is not required.
  This may need toolchain help to solve nicely, for now move the prefix
  instruction out of the alternate patch section to work around it.

This reduces kernel text size by about 6%.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230408021752.862660-6-npiggin@gmail.com
2023-04-20 12:59:21 +10:00
Nicholas Piggin
4e991e3c16 powerpc: add CFUNC assembly label annotation
This macro is to be used in assembly where C functions are called.
pcrel addressing mode requires branches to functions with a
localentry value of 1 to have either a trailing nop or @notoc.
This macro permits the latter without changing callers.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Add dummy definitions to fix selftests build]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230408021752.862660-5-npiggin@gmail.com
2023-04-20 12:54:24 +10:00
Ira Weiny
0398abca61 powerpc: Remove memcpy_page_flushcache()
Commit 21b56c8477 ("iov_iter: get rid of separate bvec and xarray
callbacks") removed the calls to memcpy_page_flushcache().

Remove the unnecessary memcpy_page_flushcache() call.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20221230-kmap-x86-v1-2-15f1ecccab50@intel.com
2023-03-29 23:53:02 +11:00
Rohan McLure
2fb857bc9f powerpc/kcsan: Add exclusions from instrumentation
Exclude various incompatible compilation units from KCSAN
instrumentation.

Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230206021801.105268-2-rmclure@linux.ibm.com
2023-02-10 22:19:56 +11:00
Michael Ellerman
980411a4d1 powerpc/code-patching: Fix oops with DEBUG_VM enabled
Nathan reported that the new per-cpu mm patching oopses if DEBUG_VM is
enabled:

  ------------[ cut here ]------------
  kernel BUG at arch/powerpc/mm/pgtable.c:333!
  Oops: Exception in kernel mode, sig: 5 [#1]
  LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
  Modules linked in:
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.1.0-rc2+ #1
  Hardware name: IBM PowerNV (emulated by qemu) POWER9 0x4e1200 opal:v7.0 PowerNV
  ...
  NIP assert_pte_locked+0x180/0x1a0
  LR  assert_pte_locked+0x170/0x1a0
  Call Trace:
    0x60000000 (unreliable)
    patch_instruction+0x618/0x6d0
    arch_prepare_kprobe+0xfc/0x2d0
    register_kprobe+0x520/0x7c0
    arch_init_kprobes+0x28/0x3c
    init_kprobes+0x108/0x184
    do_one_initcall+0x60/0x2e0
    kernel_init_freeable+0x1f0/0x3e0
    kernel_init+0x34/0x1d0
    ret_from_kernel_thread+0x5c/0x64

It's caused by the assert_spin_locked() failing in assert_pte_locked().
The assert fails because the PTE was unlocked in text_area_cpu_up_mm(),
and never relocked.

The PTE page shouldn't be freed, the patching_mm is only used for
patching on this CPU, only that single PTE is ever mapped, and it's only
unmapped at CPU offline.

In fact assert_pte_locked() has a special case to ignore init_mm
entirely, and the patching_mm is more-or-less like init_mm, so possibly
the check could be skipped for patching_mm too.

But for now be conservative, and use the proper PTE accessors at
patching time, so that the PTE lock is held while the PTE is used. That
also avoids the warning in assert_pte_locked().

With that it's no longer necessary to save the PTE in
cpu_patching_context for the mm_patch_enabled() case.

Fixes: c28c15b6d2 ("powerpc/code-patching: Use temporary mm for Radix MMU")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221216125913.990972-1-mpe@ellerman.id.au
2022-12-16 23:59:43 +11:00
Nicholas Piggin
13959373e9 powerpc/qspinlock: Fix 32-bit build
Some 32-bit configurations don't pull in the spin_begin/end/relax
definitions. Fix is to restore a lost include.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 84990b1695 ("powerpc/qspinlock: add mcs queueing for contended waiters")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/oe-kbuild-all/202212050224.i7uh9fOh-lkp@intel.com
Link: https://lore.kernel.org/r/20221208123225.1566113-1-npiggin@gmail.com
2022-12-12 12:34:52 +11:00
Christophe Leroy
6f3a81b600 powerpc/code-patching: Remove protection against patching init addresses after init
Once init section is freed, attempting to patch init code
ends up in the weed.

Commit 51c3c62b58 ("powerpc: Avoid code patching freed init sections")
protected patch_instruction() against that, but it is the responsibility
of the caller to ensure that the patched memory is valid.

All callers have now been verified and fixed so the check
can be removed.

This improves ftrace activation by about 2% on 8xx.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/504310828f473d424e2ed229eff57bf075f52796.1669969781.git.christophe.leroy@csgroup.eu
2022-12-02 21:59:57 +11:00
Christophe Leroy
b988e7797d powerpc/feature-fixups: Do not patch init section after init
Once init section is freed, attempting to patch init code
ends up in the weed.

Commit 51c3c62b58 ("powerpc: Avoid code patching freed init sections")
protected patch_instruction() against that, but it is the responsibility
of the caller to ensure that the patched memory is valid.

In the same spirit as jump_label with its jump_label_can_update()
function, add is_fixup_addr_valid() function to skip patching on
freed init section.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/8e9311fc1b057e4e6a2a3a0701ebcc74b787affe.1669969781.git.christophe.leroy@csgroup.eu
2022-12-02 21:59:57 +11:00
Christophe Leroy
3d1dbbca33 powerpc/feature-fixups: Refactor other fixups patching
Several fonctions have the same loop for patching instructions.

Introduce function do_patch_fixups() to refactor those loops.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/58ab36949c18f94d466fc98d6c085783b0cd474f.1669969781.git.christophe.leroy@csgroup.eu
2022-12-02 21:59:56 +11:00
Christophe Leroy
6076dc349b powerpc/feature-fixups: Refactor entry fixups patching
Several fonctions have the same loop for patching instructions.

Introduce function do_patch_entry_fixups() to refactor those loops.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/79eeff7b20a98f7136da5f79b1f7c436928f27f3.1669969781.git.christophe.leroy@csgroup.eu
2022-12-02 21:59:56 +11:00
Christophe Leroy
84ecfe6f38 powerpc/code-patching: Remove #ifdef CONFIG_STRICT_KERNEL_RWX
No need to have one implementation of patch_instruction() for
CONFIG_STRICT_KERNEL_RWX and one for !CONFIG_STRICT_KERNEL_RWX.

In patch_instruction(), call raw_patch_instruction() when
!CONFIG_STRICT_KERNEL_RWX.

In poking_init(), bail out immediately, it will be equivalent
to the weak default implementation.

Everything else is declared static and will be discarded by
GCC when !CONFIG_STRICT_KERNEL_RWX.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f67d2a109404d03e8fdf1ea15388c8778337a76b.1669969781.git.christophe.leroy@csgroup.eu
2022-12-02 21:59:56 +11:00
Michael Ellerman
22db71bcba Merge branch 'topic/qspinlock' into next
Merge Nick's powerpc qspinlock implementation. From his cover letter:

This replaces the generic queued spinlock code (like s390 does) with our
own implementation.

Generic PV qspinlock code is causing latency / starvation regressions on
large systems that are resulting in hard lockups reported (mostly in
pathoogical cases). The generic qspinlock code has a number of issues
important for powerpc hardware and hypervisors that aren't easily solved
without changing code that would impact other architectures. Follow
s390's lead and implement our own for now.

Issues for powerpc using generic qspinlocks:
  - The previous lock value should not be loaded with simple loads, and
    need not be passed around from previous loads or cmpxchg results,
    because powerpc uses ll/sc-style atomics which can perform more
    complex operations that do not require this. powerpc implementations
    tend to prefer loads use larx for improved coherency performance.
  - The queueing process should absolutely minimise the number of stores
    to the lock word to reduce exclusive coherency probes, important for
    large system scalability. The pending logic is counter productive
    here.
  - Non-atomic unlock for paravirt locks is important (atomic
    instructions tend to still be more expensive than x86 CPUs).
  - Yielding to the lock owner is important in the oversubscribed
    paravirt case, which requires storing the owner CPU in the lock
    word.
  - More control of lock stealing for the paravirt case is important to
    keep latency down on large systems.
  - The lock acquisition operation should always be made with a special
    variant of atomic instructions with the lock hint bit set,
    including (especially) in the queueing paths. This is more a matter
    of adding more arch lock helpers so not an insurmountable problem
    for generic code.
2022-12-02 18:04:56 +11:00
Nicholas Piggin
c03be0a3f3 powerpc: add definition for pt_regs offset within an interrupt frame
This is a common offset that currently uses the overloaded
STACK_FRAME_OVERHEAD constant. It's easier to read and more
flexible to use a specific regs offset for this.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221127124942.1665522-8-npiggin@gmail.com
2022-12-02 17:54:08 +11:00
Benjamin Gray
2f228ee1ad powerpc/code-patching: Consolidate and cache per-cpu patching context
With the temp mm context support, there are CPU local variables to hold
the patch address and pte. Use these in the non-temp mm path as well
instead of adding a level of indirection through the text_poke_area
vm_struct and pointer chasing the pte.

As both paths use these fields now, there is no need to let unreferenced
variables be dropped by the compiler, so it is cleaner to merge them
into a single context struct. This has the additional benefit of
removing a redundant CPU local pointer, as only one of cpu_patching_mm /
text_poke_area is ever used, while remaining well-typed. It also groups
each CPU's data into a single cacheline.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
[mpe: Shorten name to 'area' as suggested by Christophe]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221109045112.187069-10-bgray@linux.ibm.com
2022-12-02 17:54:06 +11:00
Christopher M. Riedl
c28c15b6d2 powerpc/code-patching: Use temporary mm for Radix MMU
x86 supports the notion of a temporary mm which restricts access to
temporary PTEs to a single CPU. A temporary mm is useful for situations
where a CPU needs to perform sensitive operations (such as patching a
STRICT_KERNEL_RWX kernel) requiring temporary mappings without exposing
said mappings to other CPUs. Another benefit is that other CPU TLBs do
not need to be flushed when the temporary mm is torn down.

Mappings in the temporary mm can be set in the userspace portion of the
address-space.

Interrupts must be disabled while the temporary mm is in use. HW
breakpoints, which may have been set by userspace as watchpoints on
addresses now within the temporary mm, are saved and disabled when
loading the temporary mm. The HW breakpoints are restored when unloading
the temporary mm. All HW breakpoints are indiscriminately disabled while
the temporary mm is in use - this may include breakpoints set by perf.

Use the `poking_init` init hook to prepare a temporary mm and patching
address. Initialize the temporary mm using mm_alloc(). Choose a
randomized patching address inside the temporary mm userspace address
space. The patching address is randomized between PAGE_SIZE and
DEFAULT_MAP_WINDOW-PAGE_SIZE.

Bits of entropy with 64K page size on BOOK3S_64:

	bits of entropy = log2(DEFAULT_MAP_WINDOW_USER64 / PAGE_SIZE)

	PAGE_SIZE=64K, DEFAULT_MAP_WINDOW_USER64=128TB
	bits of entropy = log2(128TB / 64K)
	bits of entropy = 31

The upper limit is DEFAULT_MAP_WINDOW due to how the Book3s64 Hash MMU
operates - by default the space above DEFAULT_MAP_WINDOW is not
available. Currently the Hash MMU does not use a temporary mm so
technically this upper limit isn't necessary; however, a larger
randomization range does not further "harden" this overall approach and
future work may introduce patching with a temporary mm on Hash as well.

Randomization occurs only once during initialization for each CPU as it
comes online.

The patching page is mapped with PAGE_KERNEL to set EAA[0] for the PTE
which ignores the AMR (so no need to unlock/lock KUAP) according to
PowerISA v3.0b Figure 35 on Radix.

Based on x86 implementation:

commit 4fc19708b1
("x86/alternatives: Initialize temporary mm for patching")

and:

commit b3fd8e83ad
("x86/alternatives: Use temporary mm for text poking")

From: Benjamin Gray <bgray@linux.ibm.com>

Synchronisation is done according to ISA 3.1B Book 3 Chapter 13
"Synchronization Requirements for Context Alterations". Switching the mm
is a change to the PID, which requires a CSI before and after the change,
and a hwsync between the last instruction that performs address
translation for an associated storage access.

Instruction fetch is an associated storage access, but the instruction
address mappings are not being changed, so it should not matter which
context they use. We must still perform a hwsync to guard arbitrary
prior code that may have accessed a userspace address.

TLB invalidation is local and VA specific. Local because only this core
used the patching mm, and VA specific because we only care that the
writable mapping is purged. Leaving the other mappings intact is more
efficient, especially when performing many code patches in a row (e.g.,
as ftrace would).

Signed-off-by: Christopher M. Riedl <cmr@bluescreens.de>
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
[mpe: Use mm_alloc() per 107b6828a7cd ("x86/mm: Use mm_alloc() in poking_init()")]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221109045112.187069-9-bgray@linux.ibm.com
2022-12-02 17:52:56 +11:00
Nicholas Piggin
0b2199841a powerpc/qspinlock: add compile-time tuning adjustments
This adds compile-time options that allow the EH lock hint bit to be
enabled or disabled, and adds some new options that may or may not
help matters. To help with experimentation and tuning.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221126095932.1234527-18-npiggin@gmail.com
2022-12-02 17:48:50 +11:00
Nicholas Piggin
12b459a5eb powerpc/qspinlock: provide accounting and options for sleepy locks
Finding the owner or a queued waiter on a lock with a preempted vcpu is
indicative of an oversubscribed guest causing the lock to get into
trouble. Provide some options to detect this situation and have new CPUs
avoid queueing for a longer time (more steal iterations) to minimise the
problems caused by vcpu preemption on the queue.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221126095932.1234527-17-npiggin@gmail.com
2022-12-02 17:48:50 +11:00
Nicholas Piggin
39dfc73596 powerpc/qspinlock: allow indefinite spinning on a preempted owner
Provide an option that holds off queueing indefinitely while the lock
owner is preempted. This could reduce queueing latencies for very
overcommitted vcpu situations.

This is disabled by default.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221126095932.1234527-16-npiggin@gmail.com
2022-12-02 17:48:50 +11:00
Nicholas Piggin
cc79701114 powerpc/qspinlock: reduce remote node steal spins
Allow for a reduction in the number of times a CPU from a different
node than the owner can attempt to steal the lock before queueing.
This could bias the transfer behaviour of the lock across the
machine and reduce NUMA crossings.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221126095932.1234527-15-npiggin@gmail.com
2022-12-02 17:48:50 +11:00
Nicholas Piggin
71c235027c powerpc/qspinlock: use spin_begin/end API
Use the spin_begin/spin_cpu_relax/spin_end APIs in qspinlock, which helps
to prevent threads issuing a lot of expensive priority nops which may not
have much effect due to immediately executing low then medium priority.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221126095932.1234527-14-npiggin@gmail.com
2022-12-02 17:48:50 +11:00
Nicholas Piggin
f61ab43cc1 powerpc/qspinlock: allow lock stealing in trylock and lock fastpath
This change allows trylock to steal the lock. It also allows the initial
lock attempt to steal the lock rather than bailing out and going to the
slow path.

This gives trylock more strength: without this a continually-contended
lock will never permit a trylock to succeed. With this change, the
trylock has a small but non-zero chance.

It also gives the lock fastpath most of the benefit of passing the
reservation back through to the steal loop in the slow path without the
complexity.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221126095932.1234527-13-npiggin@gmail.com
2022-12-02 17:48:50 +11:00
Nicholas Piggin
be742c573f powerpc/qspinlock: add ability to prod new queue head CPU
After the head of the queue acquires the lock, it releases the
next waiter in the queue to become the new head. Add an option
to prod the new head if its vCPU was preempted. This may only
have an effect if queue waiters are yielding.

Disable this option by default for now, i.e., no logical change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221126095932.1234527-12-npiggin@gmail.com
2022-12-02 17:48:50 +11:00
Nicholas Piggin
28db61e207 powerpc/qspinlock: allow propagation of yield CPU down the queue
Having all CPUs poll the lock word for the owner CPU that should be
yielded to defeats most of the purpose of using MCS queueing for
scalability. Yet it may be desirable for queued waiters to yield to a
preempted owner.

With this change, queue waiters never sample the owner CPU directly from
the lock word. The queue head (which is spinning on the lock) propagates
the owner CPU back to the next waiter if it finds the owner has been
preempted. That waiter then propagates the owner CPU back to the next
waiter, and so on.

s390 addresses this problem differenty, by having queued waiters sample
the lock word to find the owner at a low frequency. That has the
advantage of being simpler, the advantage of propagation is that the
lock word never has to be accesed by queued waiters, and the transfer of
cache lines to transmit the owner data is only required when lock holder
vCPU preemption occurs.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221126095932.1234527-11-npiggin@gmail.com
2022-12-02 17:48:50 +11:00
Nicholas Piggin
b4c3cdc1a6 powerpc/qspinlock: allow stealing when head of queue yields
If the head of queue is preventing stealing but it finds the owner vCPU
is preempted, it will yield its cycles to the owner which could cause it
to become preempted. Add an option to re-allow stealers before yielding,
and disallow them again after returning from the yield.

Disable this option by default for now, i.e., no logical change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221126095932.1234527-10-npiggin@gmail.com
2022-12-02 17:48:50 +11:00
Nicholas Piggin
bd48287b2c powerpc/qspinlock: implement option to yield to previous node
Queued waiters which are not at the head of the queue don't spin on
the lock word but their qnode lock word, waiting for the previous queued
CPU to release them. Add an option which allows these waiters to yield
to the previous CPU if its vCPU is preempted.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221126095932.1234527-9-npiggin@gmail.com
2022-12-02 17:48:49 +11:00
Nicholas Piggin
085f03311b powerpc/qspinlock: paravirt yield to lock owner
Waiters spinning on the lock word should yield to the lock owner if the
vCPU is preempted. This improves performance when the hypervisor has
oversubscribed physical CPUs.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221126095932.1234527-8-npiggin@gmail.com
2022-12-02 17:48:49 +11:00
Nicholas Piggin
e1a31e7fd7 powerpc/qspinlock: store owner CPU in lock word
Store the owner CPU number in the lock word so it may be yielded to,
as powerpc's paravirtualised simple spinlocks do.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221126095932.1234527-7-npiggin@gmail.com
2022-12-02 17:48:49 +11:00
Nicholas Piggin
0944534ef4 powerpc/qspinlock: theft prevention to control latency
Give the queue head the ability to stop stealers. After a number of
spins without successfully acquiring the lock, the queue head sets
this, which halts stealing and will assure it is the next owner.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221126095932.1234527-6-npiggin@gmail.com
2022-12-02 17:48:49 +11:00
Nicholas Piggin
6aa42f883c powerpc/qspinlock: allow new waiters to steal the lock before queueing
Allow new waiters to "steal" the lock before queueing. That is, to
acquire it while other CPUs have queued.

This particularly helps paravirt performance when physical CPUs are
oversubscribed, by keeping the lock from becoming a strict FIFO and
vCPU preemption causing queue train wrecks.

The new __queued_spin_trylock_steal() function is put in qspinlock.h
to save having to move it, because it will be used there by a later
change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221126095932.1234527-5-npiggin@gmail.com
2022-12-02 17:48:49 +11:00
Nicholas Piggin
b3a73b7db2 powerpc/qspinlock: convert atomic operations to assembly
This uses more optimal ll/sc style access patterns (rather than
cmpxchg), and also sets the EH=1 lock hint on those operations
which acquire ownership of the lock.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221126095932.1234527-4-npiggin@gmail.com
2022-12-02 17:48:49 +11:00
Nicholas Piggin
84990b1695 powerpc/qspinlock: add mcs queueing for contended waiters
This forms the basis of the qspinlock slow path.

Like generic qspinlocks and unlike the vanilla MCS algorithm, the lock
owner does not participate in the queue, only waiters. The first waiter
spins on the lock word, then when the lock is released it takes
ownership and unqueues the next waiter. This is how qspinlocks can be
implemented with the spinlock API -- lock owners don't need a node, only
waiters do.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221126095932.1234527-2-npiggin@gmail.com
2022-12-02 17:48:49 +11:00
Nicholas Piggin
9f61521c7a powerpc/qspinlock: powerpc qspinlock implementation
Add a powerpc specific implementation of queued spinlocks. This is the
build framework with a very simple (non-queued) spinlock implementation
to begin with. Later changes add queueing, and other features and
optimisations one-at-a-time. It is done this way to more easily see how
the queued spinlocks are built, and to make performance and correctness
bisects more useful.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Drop paravirt.h & processor.h changes to fix 32-bit build]
[mpe: Fix 32-bit build of qspinlock.o & disallow GENERIC_LOCKBREAK per Nick]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/CONLLQB6DCJU.2ZPOS7T6S5GRR@bobo
2022-12-02 17:48:02 +11:00
Benjamin Gray
071c95c1ac powerpc/code-patching: Use WARN_ON and fix check in poking_init
BUG_ON() when failing to initialise the code patching window is
unnecessary, and use of BUG_ON is discouraged. We don't set
poking_init_done in this case, so failure to init the boot CPU will
result in a strict RWX error when a following patch_instruction uses
raw_patch_instruction. If it only fails for later CPUs, they won't be
onlined in the first place.

The return value of cpuhp_setup_state() is also >= 0 on success,
so check for < 0.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221109045112.187069-3-bgray@linux.ibm.com
2022-11-30 21:46:48 +11:00
Michael Ellerman
611c020239 Merge branch 'fixes' into next
Merge our fixes branch to bring in some changes that are prerequisites
for work in next.
2022-11-30 21:46:06 +11:00
Nicholas Piggin
b86cf14f24 powerpc: add compile-time support for lbarx, lharx
ISA v2.06 (POWER7 and up) as well as e6500 support lbarx and lharx.
Add a compile option that allows code to use it, and add support in
cmpxchg and xchg 8 and 16 bit values without shifting and masking.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220909052312.63916-1-npiggin@gmail.com
2022-11-24 23:31:47 +11:00
Nicholas Piggin
00ff1eaac1 powerpc: Fix reschedule bug in KUAP-unlocked user copy
schedule must not be explicitly called while KUAP is unlocked, because
the AMR register will not be saved across the context switch on
64s (preemption is allowed because that is driven by interrupts which do
save the AMR).

exit_vmx_usercopy() runs inside an unlocked user access region, and it
calls preempt_enable() which will call schedule() if need_resched() was
set while non-preemptible. This can cause tasks to run unprotected when
the should not, and can cause the user copy to be improperly blocked
when scheduling back to it.

Fix this by avoiding the explicit resched for preempt kernels by
generating an interrupt to reschedule the context if need_resched() got
set.

Reported-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221013151647.1857994-3-npiggin@gmail.com
2022-10-18 22:46:19 +11:00
Nicholas Piggin
dab3b8f4fd powerpc/64: asm use consistent global variable declaration and access
Use helper macros to access global variables, and place them in .data
sections rather than in .toc. Putting addresses in TOC is not required
because the kernel is linked with a single TOC.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220926034057.2360083-3-npiggin@gmail.com
2022-09-28 19:22:12 +10:00
Christophe Leroy
3e7318584d powerpc: Remove CONFIG_PPC_FSL_BOOK3E
CONFIG_PPC_FSL_BOOK3E is redundant with CONFIG_PPC_E500.

Remove it.

And rename five files accordingly.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Rename include guards to match new file names]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/795cb93b88c9a0279289712e674f39e3b108a1b4.1663606876.git.christophe.leroy@csgroup.eu
2022-09-26 23:00:13 +10:00
Christophe Leroy
8b4bb0ad00 powerpc/code-patching: Speed up page mapping/unmapping
Since commit 591b4b2684 ("powerpc/code-patching: Pre-map patch area")
the patch area is premapped so intermediate page tables are already
allocated.

Use __set_pte_at() directly instead of the heavy map_kernel_page(),
at for unmapping just do a pte_clear() followed by a flush.

__set_pte_at() can be used directly without the filters in
set_pte_at() because we are mapping a normal page non executable.

Make sure gcc knows text_poke_area is page aligned in order to
optimise the flush.

This change reduces by 66% the time needed to activate ftrace on
an 8xx (588000 tb ticks instead of 1744000).

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Add ptesync needed on radix to avoid spurious fault]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220815114840.1468656-1-mpe@ellerman.id.au
2022-09-01 13:56:01 +10:00
Christophe Leroy
de40303b54 powerpc/ppc-opcode: Define and use PPC_RAW_SETB()
We have PPC_INST_SETB then build the 'setb' instruction in the
user.

Instead, define PPC_RAW_SETB() and use it.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b08a4f26919a8f8cdcf7544ab552d9c1c63418b5.1657205708.git.christophe.leroy@csgroup.eu
2022-07-27 21:36:05 +10:00
Michael Ellerman
2a83afe72a powerpc/64: Drop ppc_inst_as_str()
The ppc_inst_as_str() macro tries to make printing variable length,
aka "prefixed", instructions convenient. It mostly succeeds, but it does
hide an on-stack buffer, which triggers stack protector.

More problematically it doesn't compile at all with GCC 12,
with -Wdangling-pointer, due to the fact that it returns the char buffer
declared inside the macro:

  arch/powerpc/kernel/trace/ftrace.c: In function '__ftrace_modify_call':
  ./include/linux/printk.h:475:44: error: using a dangling pointer to '__str' [-Werror=dangling-pointer=]
    475 | #define printk(fmt, ...) printk_index_wrap(_printk, fmt, ##__VA_ARGS__)
    ...
  arch/powerpc/kernel/trace/ftrace.c:567:17: note: in expansion of macro 'pr_err'
    567 |                 pr_err("Not expected bl: opcode is %s\n", ppc_inst_as_str(op));
        |                 ^~~~~~
  ./arch/powerpc/include/asm/inst.h:156:14: note: '__str' declared here
    156 |         char __str[PPC_INST_STR_LEN];   \
        |              ^~~~~

This could be fixed by having the caller declare the buffer, but in some
places there'd need to be two buffers. In all cases where
ppc_inst_as_str() is used the output is not really meant for user
consumption, it's almost always indicative of a kernel bug.

A simpler solution is to just print the value as an unsigned long. For
normal instructions the output is identical. For prefixed instructions
the value is printed as a single 64-bit quantity, whereas previously the
low half was printed first. But that is good enough for debug output,
especially as prefixed instructions will be rare in kernel code in
practice.

Old:
  c000000000111170  60420000      ori     r2,r2,0
  c000000000111174  04100001 e580fb00     .long 0xe580fb0004100001

New:
  c00000000010f90c  60420000      ori     r2,r2,0
  c00000000010f910  e580fb0004100001      .long 0xe580fb0004100001

Reported-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reported-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://lore.kernel.org/r/20220531065936.3674348-1-mpe@ellerman.id.au
2022-06-29 19:37:07 +10:00
Linus Torvalds
6112bd00e8 powerpc updates for 5.19
- Convert to the generic mmap support (ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT).
 
  - Add support for outline-only KASAN with 64-bit Radix MMU (P9 or later).
 
  - Increase SIGSTKSZ and MINSIGSTKSZ and add support for AT_MINSIGSTKSZ.
 
  - Enable the DAWR (Data Address Watchpoint) on POWER9 DD2.3 or later.
 
  - Drop support for system call instruction emulation.
 
  - Many other small features and fixes.
 
 Thanks to: Alexey Kardashevskiy, Alistair Popple, Andy Shevchenko, Bagas Sanjaya, Bjorn
 Helgaas, Bo Liu, Chen Huang, Christophe Leroy, Colin Ian King, Daniel Axtens, Dwaipayan
 Ray, Fabiano Rosas, Finn Thain, Frank Rowand, Fuqian Huang, Guilherme G. Piccoli, Hangyu
 Hua, Haowen Bai, Haren Myneni, Hari Bathini, He Ying, Jason Wang, Jiapeng Chong, Jing
 Yangyang, Joel Stanley, Julia Lawall, Kajol Jain, Kevin Hao, Krzysztof Kozlowski, Laurent
 Dufour, Lv Ruyi, Madhavan Srinivasan, Magali Lemes, Miaoqian Lin, Minghao Chi, Nathan
 Chancellor, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Oscar Salvador, Pali Rohár,
 Paul Mackerras, Peng Wu, Qing Wang, Randy Dunlap, Reza Arbab, Russell Currey, Sohaib
 Mohamed, Vaibhav Jain, Vasant Hegde, Wang Qing, Wang Wensheng, Xiang wangx, Xiaomeng Tong,
 Xu Wang, Yang Guang, Yang Li, Ye Bin, YueHaibing, Yu Kuai, Zheng Bin, Zou Wei, Zucheng
 Zheng.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmKSEgETHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgJpLEACee7mu2I00Z7VWtW5ckT4RFbAXYZcM
 Hv5DbTnVB2ItoQMRHvG52DNbR73j9HnYrz8kpwfTBVk90udxVP14L/swXDs3xbT4
 riXEYtJ1DRVc/bLiOK637RLPWNrmmZStWZme7k0Y9Ki5Aif8i1Erjjq7EIy47m9j
 j1MTcwp3ND7IsBON2nZ3PkttEHhevKvOwCPb/BWtPMDV0OhyQUFKB2SNegrlCrkT
 wshDgdQcYqbIix98PoGa2ZfUVgFQD3JVLzXa4sLpqouzGD+HvEFStOFa2Gq/ZEvV
 zunaeXDdZUCjlib6KvA8+aumBbIQ1s/urrDbxd+3BuYxZ094vNP1B428NT1AWVtl
 3bEZQIN8GSx0v9aHxZ8HePsAMXgG9d2o0xC9EMQ430+cqroN+6UHP7lkekwkprb7
 U9EpZCG9U8jV6SDcaMigW3tooEjn657we0R8nZG2NgUNssdSHVh/JYxGDALPXIAk
 awL3NQrR0tYF3Y3LJm5AxdQrK1hJH8E+hZFCZvIpUXGsr/uf9Gemy/62pD1rhrr/
 niULpxIneRGkJiXB5qdGy8pRu27ED53k7Ky6+8MWSEFQl1mUsHSryYACWz939D8c
 DydhBwQqDTl6Ozs41a5TkVjIRLOCrZADUd/VZM6A4kEOqPJ5t2Gz22Bn8ya1z6Ks
 5Sx6vrGH7GnDjA==
 =15oQ
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Convert to the generic mmap support (ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT)

 - Add support for outline-only KASAN with 64-bit Radix MMU (P9 or later)

 - Increase SIGSTKSZ and MINSIGSTKSZ and add support for AT_MINSIGSTKSZ

 - Enable the DAWR (Data Address Watchpoint) on POWER9 DD2.3 or later

 - Drop support for system call instruction emulation

 - Many other small features and fixes

Thanks to Alexey Kardashevskiy, Alistair Popple, Andy Shevchenko, Bagas
Sanjaya, Bjorn Helgaas, Bo Liu, Chen Huang, Christophe Leroy, Colin Ian
King, Daniel Axtens, Dwaipayan Ray, Fabiano Rosas, Finn Thain, Frank
Rowand, Fuqian Huang, Guilherme G. Piccoli, Hangyu Hua, Haowen Bai,
Haren Myneni, Hari Bathini, He Ying, Jason Wang, Jiapeng Chong, Jing
Yangyang, Joel Stanley, Julia Lawall, Kajol Jain, Kevin Hao, Krzysztof
Kozlowski, Laurent Dufour, Lv Ruyi, Madhavan Srinivasan, Magali Lemes,
Miaoqian Lin, Minghao Chi, Nathan Chancellor, Naveen N. Rao, Nicholas
Piggin, Oliver O'Halloran, Oscar Salvador, Pali Rohár, Paul Mackerras,
Peng Wu, Qing Wang, Randy Dunlap, Reza Arbab, Russell Currey, Sohaib
Mohamed, Vaibhav Jain, Vasant Hegde, Wang Qing, Wang Wensheng, Xiang
wangx, Xiaomeng Tong, Xu Wang, Yang Guang, Yang Li, Ye Bin, YueHaibing,
Yu Kuai, Zheng Bin, Zou Wei, and Zucheng Zheng.

* tag 'powerpc-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (200 commits)
  powerpc/64: Include cache.h directly in paca.h
  powerpc/64s: Only set HAVE_ARCH_UNMAPPED_AREA when CONFIG_PPC_64S_HASH_MMU is set
  powerpc/xics: Include missing header
  powerpc/powernv/pci: Drop VF MPS fixup
  powerpc/fsl_book3e: Don't set rodata RO too early
  powerpc/microwatt: Add mmu bits to device tree
  powerpc/powernv/flash: Check OPAL flash calls exist before using
  powerpc/powermac: constify device_node in of_irq_parse_oldworld()
  powerpc/powermac: add missing g5_phy_disable_cpu1() declaration
  selftests/powerpc/pmu: fix spelling mistake "mis-match" -> "mismatch"
  powerpc: Enable the DAWR on POWER9 DD2.3 and above
  powerpc/64s: Add CPU_FTRS_POWER10 to ALWAYS mask
  powerpc/64s: Add CPU_FTRS_POWER9_DD2_2 to CPU_FTRS_ALWAYS mask
  powerpc: Fix all occurences of "the the"
  selftests/powerpc/pmu/ebb: remove fixed_instruction.S
  powerpc/platforms/83xx: Use of_device_get_match_data()
  powerpc/eeh: Drop redundant spinlock initialization
  powerpc/iommu: Add missing of_node_put in iommu_init_early_dart
  powerpc/pseries/vas: Call misc_deregister if sysfs init fails
  powerpc/papr_scm: Fix leaking nvdimm_events_map elements
  ...
2022-05-28 11:27:17 -07:00
Daniel Axtens
5352090a99 powerpc/kasan: Don't instrument non-maskable or raw interrupts
Disable address sanitization for raw and non-maskable interrupt
handlers, because they can run in real mode, where we cannot access
the shadow memory.  (Note that kasan_arch_is_ready() doesn't test for
real mode, since it is a static branch for speed, and in any case not
all the entry points to the generic KASAN code are protected by
kasan_arch_is_ready guards.)

The changes to interrupt_nmi_enter/exit_prepare() look larger than
they actually are.  The changes are equivalent to adding
!IS_ENABLED(CONFIG_KASAN) to the conditions for calling nmi_enter() or
nmi_exit() in real mode.  That is, the code is equivalent to using the
following condition for calling nmi_enter/exit:

	if (((!IS_ENABLED(CONFIG_PPC_BOOK3S_64) ||
			!firmware_has_feature(FW_FEATURE_LPAR) ||
			radix_enabled()) &&
		    !IS_ENABLED(CONFIG_KASAN) ||
		(mfmsr() & MSR_DR))

That unwieldy condition has been split into several statements with
comments, for easier reading.

The nmi_ipi_lock functions that call atomic functions (i.e.,
nmi_ipi_lock_start(), nmi_ipi_lock() and nmi_ipi_unlock()), besides
being marked noinstr, now call arch_atomic_* functions instead of
atomic_* functions because with KASAN enabled, the atomic_* functions
are wrappers which explicitly do address sanitization on their
arguments.  Since we are trying to avoid address sanitization, we have
to use the lower-level arch_atomic_* versions.

In hv_nmi_check_nonrecoverable(), the regs_set_unrecoverable() call
has been open-coded so as to avoid having to either trust the inlining
or mark regs_set_unrecoverable() as noinstr.

[paulus@ozlabs.org: combined a few work-in-progress commits of
 Daniel's and wrote the commit message.]

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YoTFGaKM8Pd46PIK@cleo
2022-05-22 15:58:29 +10:00
Christophe Leroy
4390a58ee1 powerpc/inst: Remove PPC_INST_BRANCH
Convert last users of PPC_INST_BRANCH to PPC_RAW_BRANCH()

And remove PPC_INST_BRANCH.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/fa8807108a2ef2287a2c9651d6e1ff7c051923d9.1652074503.git.christophe.leroy@csgroup.eu
2022-05-22 15:58:27 +10:00
Christophe Leroy
bbffdd2fc7 powerpc/ftrace: Use patch_instruction() return directly
Instead of returning -EPERM when patch_instruction() fails,
just return what patch_instruction returns.

That simplifies ftrace_modify_code():

	   0:	94 21 ff c0 	stwu    r1,-64(r1)
	   4:	93 e1 00 3c 	stw     r31,60(r1)
	   8:	7c 7f 1b 79 	mr.     r31,r3
	   c:	40 80 00 30 	bge     3c <ftrace_modify_code+0x3c>
	  10:	93 c1 00 38 	stw     r30,56(r1)
	  14:	7c 9e 23 78 	mr      r30,r4
	  18:	7c a4 2b 78 	mr      r4,r5
	  1c:	80 bf 00 00 	lwz     r5,0(r31)
	  20:	7c 1e 28 40 	cmplw   r30,r5
	  24:	40 82 00 34 	bne     58 <ftrace_modify_code+0x58>
	  28:	83 c1 00 38 	lwz     r30,56(r1)
	  2c:	7f e3 fb 78 	mr      r3,r31
	  30:	83 e1 00 3c 	lwz     r31,60(r1)
	  34:	38 21 00 40 	addi    r1,r1,64
	  38:	48 00 00 00 	b       38 <ftrace_modify_code+0x38>
				38: R_PPC_REL24	patch_instruction

Before:

	   0:	94 21 ff c0 	stwu    r1,-64(r1)
	   4:	93 e1 00 3c 	stw     r31,60(r1)
	   8:	7c 7f 1b 79 	mr.     r31,r3
	   c:	40 80 00 4c 	bge     58 <ftrace_modify_code+0x58>
	  10:	93 c1 00 38 	stw     r30,56(r1)
	  14:	7c 9e 23 78 	mr      r30,r4
	  18:	7c a4 2b 78 	mr      r4,r5
	  1c:	80 bf 00 00 	lwz     r5,0(r31)
	  20:	7c 08 02 a6 	mflr    r0
	  24:	90 01 00 44 	stw     r0,68(r1)
	  28:	7c 1e 28 40 	cmplw   r30,r5
	  2c:	40 82 00 48 	bne     74 <ftrace_modify_code+0x74>
	  30:	7f e3 fb 78 	mr      r3,r31
	  34:	48 00 00 01 	bl      34 <ftrace_modify_code+0x34>
				34: R_PPC_REL24	patch_instruction
	  38:	80 01 00 44 	lwz     r0,68(r1)
	  3c:	20 63 00 00 	subfic  r3,r3,0
	  40:	83 c1 00 38 	lwz     r30,56(r1)
	  44:	7c 63 19 10 	subfe   r3,r3,r3
	  48:	7c 08 03 a6 	mtlr    r0
	  4c:	83 e1 00 3c 	lwz     r31,60(r1)
	  50:	38 21 00 40 	addi    r1,r1,64
	  54:	4e 80 00 20 	blr

It improves ftrace activation/deactivation duration by about 3%.

Modify patch_instruction() return on failure to -EPERM in order to
match with ftrace expectations. Other users of patch_instruction()
do not care about the exact error value returned.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/49a8597230713e2633e7d9d7b56140787c4a7e20.1652074503.git.christophe.leroy@csgroup.eu
2022-05-19 23:11:28 +10:00
Christophe Leroy
d2f47dabf1 powerpc/code-patching: Inline create_branch()
create_branch() is a good candidate for inlining because:
- Flags can be folded in.
- Range tests are likely to be already done.

Hence reducing the create_branch() to only a set of instructions.

So inline it.

It improves ftrace activation by 10%.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/69851cc9a7bf8f03d025e6d29e165f2d0bd3bb6e.1652074503.git.christophe.leroy@csgroup.eu
2022-05-19 23:11:28 +10:00
Christophe Leroy
1acbf27e8a powerpc/code-patching: Inline is_offset_in_{cond}_branch_range()
Test in is_offset_in_branch_range() and is_offset_in_cond_branch_range()
are simple tests that are worth inlining.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/a05be0ccb7373e6a9789a1988fcd0c810f5f9269.1652074503.git.christophe.leroy@csgroup.eu
2022-05-19 23:11:28 +10:00
Christophe Leroy
1751289268 powerpc/code-patching: Use jump_label to check if poking_init() is done
It's only during early startup that poking_init() is not done yet,
for instance when calling ftrace_init().

Once poking_init() has been called there must be a poking area, no
need to check it everytime patch_instruction() is called.

ftrace activation time is reduced by 7% with the change on an 8xx.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/8d6088aca7b63247377b6d9e4897d08d935fbe93.1647962456.git.christophe.leroy@csgroup.eu
2022-05-11 23:06:39 +10:00
Christophe Leroy
b033767848 powerpc/code-patching: Use jump_label for testing freed initmem
Once init is done, initmem is freed forever so no need to
test system_state at every call to patch_instruction().

Use jump_label.

This reduces by 2% the time needed to activate ftrace on an 8xx.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/0aee964721cab7316cffde21a2ca223cee14d373.1647962456.git.christophe.leroy@csgroup.eu
2022-05-11 23:06:39 +10:00
Christophe Leroy
cb3ac45214 powerpc/code-patching: Don't call is_vmalloc_or_module_addr() without CONFIG_MODULES
If CONFIG_MODULES is not set, there is no point in checking
whether text is in module area.

This reduced the time needed to activate/deactivate ftrace
by more than 10% on an 8xx.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f3c701cce00a38620788c0fc43ff0b611a268c54.1647962456.git.christophe.leroy@csgroup.eu
2022-05-08 22:15:41 +10:00
Nicholas Piggin
a553476c44 powerpc/64: remove system call instruction emulation
emulate_step() instruction emulation including sc instruction emulation
initially appeared in xmon. It was then moved into sstep.c where kprobes
could use it too, and later hw_breakpoint and uprobes started to use it.

Until uprobes, the only instruction emulation users were for kernel
mode instructions.

- xmon only steps / breaks on kernel addresses.
- kprobes is kernel only.
- hw_breakpoint only emulates kernel instructions, single steps user.

At one point, there was support for the kernel to execute sc
instructions, although that is long removed and it's not clear whether
there were any in-tree users. So system call emulation is not required
by the above users.

uprobes uses emulate_step and it appears possible to emulate sc
instruction in userspace. Userspace system call emulation is broken and
it's not clear it ever worked well.

The big complication is that userspace takes an interrupt to the kernel
to emulate the instruction. The user->kernel interrupt sets up registers
and interrupt stack frame expecting to return to userspace, then system
call instruction emulation re-directs that stack frame to the kernel,
early in the system call interrupt handler. This means the interrupt
return code takes the kernel->kernel restore path, which does not
restore everything as the system call interrupt handler would expect
coming from userspace. regs->iamr appears to get lost for example,
because the kernel->kernel return does not restore the user iamr.
Accounting such as irqflags tracing and CPU accounting does not get
flipped back to user mode as the system call handler expects, so those
appear to enter the kernel twice without returning to userspace.

These things may be individually fixable with various complication, but
it is a big complexity for unclear real benefit.

Furthermore, it is not possible to single step a system call instruction
since it causes an interrupt. As such, a separate patch disables probing
on system call instructions.

This patch removes system call emulation and disables stepping system
calls.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[minor commit log edit, and also get rid of '#ifdef CONFIG_PPC64']
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/a412e3b3791ed83de18704c8d90f492e7a0049c0.1648648712.git.naveen.n.rao@linux.vnet.ibm.com
2022-05-06 00:00:20 +10:00
Yang Li
9923a6dace powerpc/sstep: Use bitwise instead of arithmetic operator for flags
Fix the following coccinelle warnings:
./arch/powerpc/lib/sstep.c:1090:20-21: WARNING: sum of probable
bitmasks, consider |
./arch/powerpc/lib/sstep.c:1115:20-21: WARNING: sum of probable
bitmasks, consider |
./arch/powerpc/lib/sstep.c:1134:20-21: WARNING: sum of probable
bitmasks, consider |

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1613811455-2457-1-git-send-email-yang.lee@linux.alibaba.com
2022-05-02 23:02:15 +10:00
Christoph Hellwig
6308499b5e net: unexport csum_and_copy_{from,to}_user
csum_and_copy_from_user and csum_and_copy_to_user are exported by a few
architectures, but not actually used in modular code.  Drop the exports.

Link: https://lkml.kernel.org/r/20220421070440.1282704-1-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-29 14:37:59 -07:00
Linus Torvalds
1f1c153e40 powerpc updates for 5.18
- Enforce kernel RO, and implement STRICT_MODULE_RWX for 603.
 
  - Add support for livepatch to 32-bit.
 
  - Implement CONFIG_DYNAMIC_FTRACE_WITH_ARGS.
 
  - Merge vdso64 and vdso32 into a single directory.
 
  - Fix build errors with newer binutils.
 
  - Add support for UADDR64 relocations, which are emitted by some toolchains. This allows
    powerpc to build with the latest lld.
 
  - Fix (another) potential userspace r13 corruption in transactional memory handling.
 
  - Cleanups of function descriptor handling & related fixes to LKDTM.
 
 Thanks to: Abdul Haleem, Alexey Kardashevskiy, Anders Roxell, Aneesh Kumar K.V, Anton
 Blanchard, Arnd Bergmann, Athira Rajeev, Bhaskar Chowdhury, Cédric Le Goater, Chen
 Jingwen, Christophe JAILLET, Christophe Leroy, Corentin Labbe, Daniel Axtens, Daniel
 Henrique Barboza, David Dai, Fabiano Rosas, Ganesh Goudar, Guo Zhengkui, Hangyu Hua, Haren
 Myneni, Hari Bathini, Igor Zhbanov, Jakob Koschel, Jason Wang, Jeremy Kerr, Joachim
 Wiberg, Jordan Niethe, Julia Lawall, Kajol Jain, Kees Cook, Laurent Dufour, Madhavan
 Srinivasan, Mamatha Inamdar, Maxime Bizon, Maxim Kiselev, Maxim Kochetkov, Michal
 Suchanek, Nageswara R Sastry, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Nour-eddine
 Taleb, Paul Menzel, Ping Fang, Pratik R. Sampat, Randy Dunlap, Ritesh Harjani, Rohan
 McLure, Russell Currey, Sachin Sant, Segher Boessenkool, Shivaprasad G Bhat, Sourabh Jain,
 Thierry Reding, Tobias Waldekranz, Tyrel Datwyler, Vaibhav Jain, Vladimir Oltean, Wedson
 Almeida Filho, YueHaibing.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmI9TtQTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgLp2D/0dwoliEJubRCfoawYUGhxTRZuo6ZYw
 EQzprOiFA/MtrZyPfbrX/FwxeeetzQWysaw2r5JAuQwx5Jb7Od9dNIrVmueFEktC
 hD4fkO8YT+QuOD3Xhp/rDQTImdw4fkeofIjnWIqEAtz0XGInmiRQKOnojVe/Po7f
 72Yi1u80LxYBAnkN/Hhpmi/BsVmu0Nh3wELu+JZopQXjINj4RyD49ayCBSLbmiNc
 uo7oYzJ0/WsZHNTpX9kAzzCq+XmI3dKZPyf2AOCvoRxJTmUPCRZF9QCwsmQFikiI
 vZOdz4fI5e+C0aYJj8ODmWMrXiS+JUQdEShjGg9t9K6EN8idC8joKWpAuXjTA9KN
 kRjzXX7AvjxaMEGbLe8gjU0PmEjY3eSzMOy15Oc/C0DRRswXRzrXdx2AF+/J6bQb
 MWMM4aCKfcYs5/TENkEnV0xpbOCOy4ikHM1KZbxvVrShvjSlNIL9XTOnl/pNK5BJ
 XSSI2mfnjKkbI1+l0KQ4NBXIRTo6HLpu5jwY3Xh97Tq7kaEfqDbO5p2P2HoOCiLa
 ZjdzmpP99zM6wnqUSj+lyvjob7btyhoq6TKmPtxfKbR6OaSfRJ760BCJ5y15Y9Hc
 rHey4Y/NL7LqsVYFZxi4/T6Ncq1hNeYr2Fiis4gH+/1zjr6Cd4othnvw3Slaxhst
 AaHpN3pyx1QI6g==
 =8r2c
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Livepatch support for 32-bit is probably the standout new feature,
  otherwise mostly just lots of bits and pieces all over the board.

  There's a series of commits cleaning up function descriptor handling,
  which touches a few other arches as well as LKDTM. It has acks from
  Arnd, Kees and Helge.

  Summary:

   - Enforce kernel RO, and implement STRICT_MODULE_RWX for 603.

   - Add support for livepatch to 32-bit.

   - Implement CONFIG_DYNAMIC_FTRACE_WITH_ARGS.

   - Merge vdso64 and vdso32 into a single directory.

   - Fix build errors with newer binutils.

   - Add support for UADDR64 relocations, which are emitted by some
     toolchains. This allows powerpc to build with the latest lld.

   - Fix (another) potential userspace r13 corruption in transactional
     memory handling.

   - Cleanups of function descriptor handling & related fixes to LKDTM.

  Thanks to Abdul Haleem, Alexey Kardashevskiy, Anders Roxell, Aneesh
  Kumar K.V, Anton Blanchard, Arnd Bergmann, Athira Rajeev, Bhaskar
  Chowdhury, Cédric Le Goater, Chen Jingwen, Christophe JAILLET,
  Christophe Leroy, Corentin Labbe, Daniel Axtens, Daniel Henrique
  Barboza, David Dai, Fabiano Rosas, Ganesh Goudar, Guo Zhengkui, Hangyu
  Hua, Haren Myneni, Hari Bathini, Igor Zhbanov, Jakob Koschel, Jason
  Wang, Jeremy Kerr, Joachim Wiberg, Jordan Niethe, Julia Lawall, Kajol
  Jain, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Mamatha Inamdar,
  Maxime Bizon, Maxim Kiselev, Maxim Kochetkov, Michal Suchanek,
  Nageswara R Sastry, Nathan Lynch, Naveen N. Rao, Nicholas Piggin,
  Nour-eddine Taleb, Paul Menzel, Ping Fang, Pratik R. Sampat, Randy
  Dunlap, Ritesh Harjani, Rohan McLure, Russell Currey, Sachin Sant,
  Segher Boessenkool, Shivaprasad G Bhat, Sourabh Jain, Thierry Reding,
  Tobias Waldekranz, Tyrel Datwyler, Vaibhav Jain, Vladimir Oltean,
  Wedson Almeida Filho, and YueHaibing"

* tag 'powerpc-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (179 commits)
  powerpc/pseries: Fix use after free in remove_phb_dynamic()
  powerpc/time: improve decrementer clockevent processing
  powerpc/time: Fix KVM host re-arming a timer beyond decrementer range
  powerpc/tm: Fix more userspace r13 corruption
  powerpc/xive: fix return value of __setup handler
  powerpc/64: Add UADDR64 relocation support
  powerpc: 8xx: fix a return value error in mpc8xx_pic_init
  powerpc/ps3: remove unneeded semicolons
  powerpc/64: Force inlining of prevent_user_access() and set_kuap()
  powerpc/bitops: Force inlining of fls()
  powerpc: declare unmodified attribute_group usages const
  powerpc/spufs: Fix build warning when CONFIG_PROC_FS=n
  powerpc/secvar: fix refcount leak in format_show()
  powerpc/64e: Tie PPC_BOOK3E_64 to PPC_FSL_BOOK3E
  powerpc: Move C prototypes out of asm-prototypes.h
  powerpc/kexec: Declare kexec_paca static
  powerpc/smp: Declare current_set static
  powerpc: Cleanup asm-prototypes.c
  powerpc/ftrace: Use STK_GOT in ftrace_mprofile.S
  powerpc/ftrace: Regroup PPC64 specific operations in ftrace_mprofile.S
  ...
2022-03-25 09:39:36 -07:00
Linus Torvalds
194dfe88d6 asm-generic updates for 5.18
There are three sets of updates for 5.18 in the asm-generic tree:
 
  - The set_fs()/get_fs() infrastructure gets removed for good. This
    was already gone from all major architectures, but now we can
    finally remove it everywhere, which loses some particularly
    tricky and error-prone code.
    There is a small merge conflict against a parisc cleanup, the
    solution is to use their new version.
 
  - The nds32 architecture ends its tenure in the Linux kernel. The
    hardware is still used and the code is in reasonable shape, but
    the mainline port is not actively maintained any more, as all
    remaining users are thought to run vendor kernels that would never
    be updated to a future release.
    There are some obvious conflicts against changes to the removed
    files.
 
  - A series from Masahiro Yamada cleans up some of the uapi header
    files to pass the compile-time checks.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmI69BsACgkQmmx57+YA
 GNn/zA//f4d5VTT0ThhRxRWTu9BdThGHoB8TUcY7iOhbsWu0X/913NItRC3UeWNl
 IdmisaXgVtirg1dcC2pWUmrcHdoWOCEGfK4+Zr2NhSWfuZDWvODHK9pGWk4WLnhe
 cQgUNBvIuuAMryGtrOBwHPO4TpfCyy2ioeVP36ZfcsWXdDxTrqfaq/56mk3sxIP6
 sUTk1UEjut9NG4C9xIIvcSU50R3l6LryQE/H9kyTLtaSvfvTOvprcVYCq0GPmSzo
 DtQ1Wwa9zbJ+4EqoMiP5RrgQwWvOTg2iRByLU8ytwlX3e/SEF0uihvMv1FQbL8zG
 G8RhGUOKQSEhaBfc3lIkm8GpOVPh0uHzB6zhn7daVmAWtazRD2Nu59BMjipa+ims
 a8Z58iHH7jRAnKeEkVZqXKb1CEiUxaQx/IeVPzN4QlwMhDtwrI76LY7ZJ1zCqTGY
 ENG0yRLav1XselYBslOYXGtOEWcY5EZPWqLyWbp4P9vz2g0Fe0gZxoIOvPmNQc89
 QnfXpCt7vm/DGkyO255myu08GOLeMkisVqUIzLDB9avlym5mri7T7vk9abBa2YyO
 CRpTL5gl1/qKPWuH1UI5mvhT+sbbBE2SUHSuy84btns39ZKKKynwCtdu+hSQkKLE
 h9pV30Gf1cLTD4JAE0RWlUgOmbBLVp34loTOexQj4MrLM1noOnw=
 =vtCN
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic updates from Arnd Bergmann:
 "There are three sets of updates for 5.18 in the asm-generic tree:

   - The set_fs()/get_fs() infrastructure gets removed for good.

     This was already gone from all major architectures, but now we can
     finally remove it everywhere, which loses some particularly tricky
     and error-prone code. There is a small merge conflict against a
     parisc cleanup, the solution is to use their new version.

   - The nds32 architecture ends its tenure in the Linux kernel.

     The hardware is still used and the code is in reasonable shape, but
     the mainline port is not actively maintained any more, as all
     remaining users are thought to run vendor kernels that would never
     be updated to a future release.

   - A series from Masahiro Yamada cleans up some of the uapi header
     files to pass the compile-time checks"

* tag 'asm-generic-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (27 commits)
  nds32: Remove the architecture
  uaccess: remove CONFIG_SET_FS
  ia64: remove CONFIG_SET_FS support
  sh: remove CONFIG_SET_FS support
  sparc64: remove CONFIG_SET_FS support
  lib/test_lockup: fix kernel pointer check for separate address spaces
  uaccess: generalize access_ok()
  uaccess: fix type mismatch warnings from access_ok()
  arm64: simplify access_ok()
  m68k: fix access_ok for coldfire
  MIPS: use simpler access_ok()
  MIPS: Handle address errors for accesses above CPU max virtual user address
  uaccess: add generic __{get,put}_kernel_nofault
  nios2: drop access_ok() check from __put_user()
  x86: use more conventional access_ok() definition
  x86: remove __range_not_ok()
  sparc64: add __{get,put}_kernel_nofault()
  nds32: fix access_ok() checks in get/put_user
  uaccess: fix nios2 and microblaze get_user_8()
  sparc64: fix building assembly files
  ...
2022-03-23 18:03:08 -07:00
Linus Torvalds
93e220a62d Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "API:
   - hwrng core now credits for low-quality RNG devices.

  Algorithms:
   - Optimisations for neon aes on arm/arm64.
   - Add accelerated crc32_be on arm64.
   - Add ffdheXYZ(dh) templates.
   - Disallow hmac keys < 112 bits in FIPS mode.
   - Add AVX assembly implementation for sm3 on x86.

  Drivers:
   - Add missing local_bh_disable calls for crypto_engine callback.
   - Ensure BH is disabled in crypto_engine callback path.
   - Fix zero length DMA mappings in ccree.
   - Add synchronization between mailbox accesses in octeontx2.
   - Add Xilinx SHA3 driver.
   - Add support for the TDES IP available on sama7g5 SoC in atmel"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (137 commits)
  crypto: xilinx - Turn SHA into a tristate and allow COMPILE_TEST
  MAINTAINERS: update HPRE/SEC2/TRNG driver maintainers list
  crypto: dh - Remove the unused function dh_safe_prime_dh_alg()
  hwrng: nomadik - Change clk_disable to clk_disable_unprepare
  crypto: arm64 - cleanup comments
  crypto: qat - fix initialization of pfvf rts_map_msg structures
  crypto: qat - fix initialization of pfvf cap_msg structures
  crypto: qat - remove unneeded assignment
  crypto: qat - disable registration of algorithms
  crypto: hisilicon/qm - fix memset during queues clearing
  crypto: xilinx: prevent probing on non-xilinx hardware
  crypto: marvell/octeontx - Use swap() instead of open coding it
  crypto: ccree - Fix use after free in cc_cipher_exit()
  crypto: ccp - ccp_dmaengine_unregister release dma channels
  crypto: octeontx2 - fix missing unlock
  hwrng: cavium - fix NULL but dereferenced coccicheck error
  crypto: cavium/nitrox - don't cast parameter in bit operations
  crypto: vmx - add missing dependencies
  MAINTAINERS: Add maintainer for Xilinx ZynqMP SHA3 driver
  crypto: xilinx - Add Xilinx SHA3 driver
  ...
2022-03-21 16:02:36 -07:00
Christophe Leroy
76222808fc powerpc: Move C prototypes out of asm-prototypes.h
We originally added asm-prototypes.h in commit 42f5b4cacd ("powerpc:
Introduce asm-prototypes.h"). It's purpose was for prototypes of C
functions that are only called from asm, in order to fix sparse
warnings about missing prototypes.

A few months later Nick added a different use case in
commit 4efca4ed05 ("kbuild: modversions for EXPORT_SYMBOL() for asm")
for C prototypes for exported asm functions. This is basically the
inverse of our original usage.

Since then we've added various prototypes to asm-prototypes.h for both
reasons, meaning we now need to unstitch it all.

Dispatch prototypes of C functions into relevant headers and keep
only the prototypes for functions defined in assembly.

For the time being, leave prom_init() there because moving it
into asm/prom.h or asm/setup.h conflicts with
drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowrom.o
This will be fixed later by untaggling asm/pci.h and asm/prom.h
or by renaming the function in shadowrom.c

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/62d46904eca74042097acf4cb12c175e3067f3d1.1646413435.git.christophe.leroy@csgroup.eu
2022-03-08 22:06:25 +11:00
Michael Ellerman
591b4b2684 powerpc/code-patching: Pre-map patch area
Paul reported a warning with DEBUG_ATOMIC_SLEEP=y:

  BUG: sleeping function called from invalid context at include/linux/sched/mm.h:256
  in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 1, name: swapper/0
  preempt_count: 0, expected: 0
  ...
  Call Trace:
    dump_stack_lvl+0xa0/0xec (unreliable)
    __might_resched+0x2f4/0x310
    kmem_cache_alloc+0x220/0x4b0
    __pud_alloc+0x74/0x1d0
    hash__map_kernel_page+0x2cc/0x390
    do_patch_instruction+0x134/0x4a0
    arch_jump_label_transform+0x64/0x78
    __jump_label_update+0x148/0x180
    static_key_enable_cpuslocked+0xd0/0x120
    static_key_enable+0x30/0x50
    check_kvm_guest+0x60/0x88
    pSeries_smp_probe+0x54/0xb0
    smp_prepare_cpus+0x3e0/0x430
    kernel_init_freeable+0x20c/0x43c
    kernel_init+0x30/0x1a0
    ret_from_kernel_thread+0x5c/0x64

Peter pointed out that this is because do_patch_instruction() has
disabled interrupts, but then map_patch_area() calls map_kernel_page()
then hash__map_kernel_page() which does a sleeping memory allocation.

We only see the warning in KVM guests with SMT enabled, which is not
particularly common, or on other platforms if CONFIG_KPROBES is
disabled, also not common. The reason we don't see it in most
configurations is that another path that happens to have interrupts
enabled has allocated the required page tables for us, eg. there's a
path in kprobes init that does that. That's just pure luck though.

As Christophe suggested, the simplest solution is to do a dummy
map/unmap when we initialise the patching, so that any required page
table levels are pre-allocated before the first call to
do_patch_instruction(). This works because the unmap doesn't free any
page tables that were allocated by the map, it just clears the PTE,
leaving the page table levels there for the next map.

Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Debugged-by: Peter Zijlstra <peterz@infradead.org>
Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220223015821.473097-1-mpe@ellerman.id.au
2022-03-08 00:04:57 +11:00
Anders Roxell
8219d31eff powerpc/lib/sstep: Fix build errors with newer binutils
Building tinyconfig with gcc (Debian 11.2.0-16) and assembler (Debian
2.37.90.20220207) the following build error shows up:

  {standard input}: Assembler messages:
  {standard input}:10576: Error: unrecognized opcode: `stbcx.'
  {standard input}:10680: Error: unrecognized opcode: `lharx'
  {standard input}:10694: Error: unrecognized opcode: `lbarx'

Rework to add assembler directives [1] around the instruction.  The
problem with this might be that we can trick a power6 into
single-stepping through an stbcx. for instance, and it will execute that
in kernel mode.

[1] https://sourceware.org/binutils/docs/as/PowerPC_002dPseudo.html#PowerPC_002dPseudo

Fixes: 350779a29f ("powerpc: Handle most loads and stores in instruction emulation code")
Cc: stable@vger.kernel.org # v4.14+
Co-developed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220224162215.3406642-3-anders.roxell@linaro.org
2022-03-01 23:51:09 +11:00
Anders Roxell
a633cb1edd powerpc/lib/sstep: Fix 'sthcx' instruction
Looks like there been a copy paste mistake when added the instruction
'stbcx' twice and one was probably meant to be 'sthcx'. Changing to
'sthcx' from 'stbcx'.

Fixes: 350779a29f ("powerpc: Handle most loads and stores in instruction emulation code")
Cc: stable@vger.kernel.org # v4.14+
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220224162215.3406642-1-anders.roxell@linaro.org
2022-03-01 23:51:08 +11:00
Arnd Bergmann
23fc539e81 uaccess: fix type mismatch warnings from access_ok()
On some architectures, access_ok() does not do any argument type
checking, so replacing the definition with a generic one causes
a few warnings for harmless issues that were never caught before.

Fix the ones that I found either through my own test builds or
that were reported by the 0-day bot.

Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-25 09:36:05 +01:00
Anders Roxell
fe663df782 powerpc/lib/sstep: fix 'ptesync' build error
Building tinyconfig with gcc (Debian 11.2.0-16) and assembler (Debian
2.37.90.20220207) the following build error shows up:

  {standard input}: Assembler messages:
  {standard input}:2088: Error: unrecognized opcode: `ptesync'
  make[3]: *** [/builds/linux/scripts/Makefile.build:287: arch/powerpc/lib/sstep.o] Error 1

Add the 'ifdef CONFIG_PPC64' around the 'ptesync' in function
'emulate_update_regs()' to like it is in 'analyse_instr()'. Since it looks like
it got dropped inadvertently by commit 3cdfcbfd32 ("powerpc: Change
analyse_instr so it doesn't modify *regs").

A key detail is that analyse_instr() will never recognise lwsync or
ptesync on 32-bit (because of the existing ifdef), and as a result
emulate_update_regs() should never be called with an op specifying
either of those on 32-bit. So removing them from emulate_update_regs()
should be a nop in terms of runtime behaviour.

Fixes: 3cdfcbfd32 ("powerpc: Change analyse_instr so it doesn't modify *regs")
Cc: stable@vger.kernel.org # v4.14+
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
[mpe: Add last paragraph of change log mentioning analyse_instr() details]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220211005113.1361436-1-anders.roxell@linaro.org
2022-02-15 22:31:35 +11:00
Christophe Leroy
6836f09903 powerpc/lib/sstep: use truncate_if_32bit()
Use truncate_if_32bit() when possible instead of open coding.

truncate_if_32bit() returns an unsigned long, so don't use it when
a signed value is expected.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/7e1c07123f13156d4a27991a2e2694fb584bc068.1642752375.git.christophe.leroy@csgroup.eu
2022-02-12 22:47:43 +11:00
Christophe Leroy
7c3bba9199 powerpc/lib/sstep: Remove unneeded #ifdef __powerpc64__
MSR_64BIT is always defined, no need to hide code using MSR_64BIT
inside an #ifdef __powerpc64__

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/ee61b693bc7e046eed1abb7a34909eb4878a9442.1642752375.git.christophe.leroy@csgroup.eu
2022-02-12 22:47:43 +11:00
Christophe Leroy
67484e0de9 powerpc/lib/sstep: Use l1_dcache_bytes() instead of opencoding
Don't opencode dcache size retrieval based on whether that's ppc32 or ppc64.

Use l1_dcache_bytes()

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/6c608fd4795e2d8ea1a0a449405a0087f76d8bb3.1642752375.git.christophe.leroy@csgroup.eu
2022-02-12 22:47:42 +11:00
Ard Biesheuvel
297565aa22 lib/xor: make xor prototypes more friendly to compiler vectorization
Modern compilers are perfectly capable of extracting parallelism from
the XOR routines, provided that the prototypes reflect the nature of the
input accurately, in particular, the fact that the input vectors are
expected not to overlap. This is not documented explicitly, but is
implied by the interchangeability of the various C routines, some of
which use temporary variables while others don't: this means that these
routines only behave identically for non-overlapping inputs.

So let's decorate these input vectors with the __restrict modifier,
which informs the compiler that there is no overlap. While at it, make
the input-only vectors pointer-to-const as well.

Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/563
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-02-11 20:39:39 +11:00
Christophe Leroy
1231816373 powerpc/32: Remove remaining .stabs annotations
STABS debug format has been superseded long time ago by DWARF.

Remove the few remaining .stabs annotations from old 32 bits code.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/68932ec2ba6b868d35006b96e90f0890f3da3c05.1638273868.git.christophe.leroy@csgroup.eu
2022-02-07 21:03:10 +11:00
Christophe Leroy
bba496656a powerpc/32: Fix boot failure with GCC latent entropy plugin
Boot fails with GCC latent entropy plugin enabled.

This is due to early boot functions trying to access 'latent_entropy'
global data while the kernel is not relocated at its final
destination yet.

As there is no way to tell GCC to use PTRRELOC() to access it,
disable latent entropy plugin in early_32.o and feature-fixups.o and
code-patching.o

Fixes: 38addce8b6 ("gcc-plugins: Add latent_entropy plugin")
Cc: stable@vger.kernel.org # v4.9+
Reported-by: Erhard Furtner <erhard_f@mailbox.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215217
Link: https://lore.kernel.org/r/2bac55483b8daf5b1caa163a45fa5f9cdbe18be4.1640178426.git.christophe.leroy@csgroup.eu
2021-12-23 22:36:58 +11:00
Christophe Leroy
309a0a6018 powerpc/code-patching: Replace patch_instruction() by ppc_inst_write() in selftests
The purpose of selftests is to check that instructions are
properly formed. Not to check that they properly run.

For that test it uses normal memory, not special test
memory.

In preparation of a future patch enforcing patch_instruction()
to be used only on valid text areas, implement a ppc_inst_write()
instruction which is the complement of ppc_inst_read(). This
new function writes the formated instruction in valid kernel
memory and doesn't bother about icache.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/7cf5335cc07ca9b6f8cdaa20ca9887fce4df3bea.1638446239.git.christophe.leroy@csgroup.eu
2021-12-23 22:36:58 +11:00
Christophe Leroy
f30a578d76 powerpc/code-patching: Move code patching selftests in its own file
Code patching selftests are half of code-patching.c.
As they are guarded by CONFIG_CODE_PATCHING_SELFTESTS,
they'd be better in their own file.

Also add a missing __init for instr_is_branch_to_addr()

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c0c30504f04eb546a48ff77127a8bccd12a3d809.1638446239.git.christophe.leroy@csgroup.eu
2021-12-23 22:36:58 +11:00
Christophe Leroy
31acc59956 powerpc/code-patching: Move instr_is_branch_{i/b}form() in code-patching.h
To enable moving selftests in their own C file in following patch,
move instr_is_branch_iform() and instr_is_branch_bform()
to code-patching.h

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/fca0f3b191211b3681020885a611bf73eef20563.1638446239.git.christophe.leroy@csgroup.eu
2021-12-23 22:36:58 +11:00
Christophe Leroy
29562a9da2 powerpc/code-patching: Move patch_exception() outside code-patching.c
patch_exception() is dedicated to book3e/64 is nothing more than
a normal use of patch_branch(), so move it into a place dedicated
to book3e/64.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/0968622b98b1fb51838c35b844c42ad6609de62e.1638446239.git.christophe.leroy@csgroup.eu
2021-12-23 22:36:55 +11:00
Christophe Leroy
ff14a9c09f powerpc/code-patching: Use test_trampoline for prefixed patch test
Use the dedicated test_trampoline function for testing prefixed
patching like other tests and remove the hand coded assembly stuff.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/a450ef3f8653f75e1bd9aaf7a3889d379752f33b.1638446239.git.christophe.leroy@csgroup.eu
2021-12-23 22:35:25 +11:00
Christophe Leroy
d5937db114 powerpc/code-patching: Fix patch_branch() return on out-of-range failure
Do not silentely ignore a failure of create_branch() in
patch_branch(). Return -ERANGE.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/8540cb64b1f06710eaf41e3835c7ba3e21fa2b05.1638446239.git.christophe.leroy@csgroup.eu
2021-12-23 22:35:24 +11:00
Christophe Leroy
6b21af7449 powerpc/code-patching: Reorganise do_patch_instruction() to ease error handling
Split do_patch_instruction() in two functions, the caller doing the
spin locking and the callee doing everything else.

And remove a few unnecessary initialisations and intermediate
variables.

This allows the callee to return from anywhere in the function.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/dbc85980a0d2a935731b272e8907e8bb1d8fc8c5.1638446239.git.christophe.leroy@csgroup.eu
2021-12-23 22:35:24 +11:00
Christophe Leroy
a3483c3dd1 powerpc/code-patching: Fix unmap_patch_area() error handling
pXd_offset() doesn't return NULL. When the base is NULL, it
still adds the offset.

Use pXd_none() to check validity instead. It also improves
performance by folding out none existing levels as pXd_none()
always returns 0 in that case.

Such an error is unexpected, use WARN_ON() so that the caller
doesn't have to worry about it, and drop the returned value.

And now that unmap_patch_area() doesn't return error, we can
take into account the error returned by __patch_instruction().

While at it, remove the 'inline' property which is useless.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/299804b117fae35c786c827536c91f25352e279b.1638446239.git.christophe.leroy@csgroup.eu
2021-12-23 22:35:24 +11:00
Christophe Leroy
285672f993 powerpc/code-patching: Fix error handling in do_patch_instruction()
Use real errors instead of using -1 as error, so that errors
returned by callees can be used towards callers.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/85259d894069e47f915ea580b169e1adbeec7a61.1638446239.git.christophe.leroy@csgroup.eu
2021-12-23 22:35:24 +11:00
Christophe Leroy
af5304a750 powerpc/code-patching: Remove init_mem_is_free
A new state has been added by commit d2635f2012 ("mm: create a new
system state and fix core_kernel_text()"). That state tells when
initmem is about to be released and is redundant with init_mem_is_free.

Remove init_mem_is_free.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/ad8c3ccb39c8edaa89fd3eda1cc7218baea1cde5.1638446239.git.christophe.leroy@csgroup.eu
2021-12-23 22:35:24 +11:00
Christophe Leroy
edecd2d6d6 powerpc/code-patching: Remove pr_debug()/pr_devel() messages and fix check()
code-patching has been working for years now, time has come to
remove debugging messages.

Change useful message to KERN_INFO and remove other ones.

Also add KERN_ERR to check() macro and change it into a do/while
to make checkpatch happy.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/3ff9823c0a812a8a145d979a9600a6d4591b80ee.1638446239.git.christophe.leroy@csgroup.eu
2021-12-23 22:35:24 +11:00