Commit Graph

8840 Commits

Author SHA1 Message Date
Jann Horn
d6c494e8ee vdso/timens: Refactor copy-pasted find_timens_vvar_page() helper into one copy
find_timens_vvar_page() is not architecture-specific, as can be seen from
how all five per-architecture versions of it are the same.

(arm64, powerpc and riscv are exactly the same; x86 and s390 have two
characters difference inside a comment, less blank lines, and mark the
!CONFIG_TIME_NS version as inline.)

Refactor the five copies into a central copy in kernel/time/namespace.c.

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20221130115320.2918447-1-jannh@google.com
2022-12-01 11:35:40 +01:00
Jordan Niethe
3671f4ebe3 powerpc: Allow clearing and restoring registers independent of saved breakpoint state
For the coming temporary mm used for instruction patching, the
breakpoint registers need to be cleared to prevent them from
accidentally being triggered. As soon as the patching is done, the
breakpoints will be restored.

The breakpoint state is stored in the per-cpu variable current_brk[].
Add a suspend_breakpoints() function which will clear the breakpoint
registers without touching the state in current_brk[]. Add a pair
function restore_breakpoints() which will move the state in
current_brk[] back to the registers.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221109045112.187069-2-bgray@linux.ibm.com
2022-11-30 21:46:48 +11:00
Michael Ellerman
611c020239 Merge branch 'fixes' into next
Merge our fixes branch to bring in some changes that are prerequisites
for work in next.
2022-11-30 21:46:06 +11:00
Russell Currey
f668027521 powerpc/8xx: Fix warning in hw_breakpoint_handler()
In hw_breakpoint_handler(), ea is set by wp_get_instr_detail() except
for 8xx, leading the variable to be passed uninitialised to
wp_check_constraints().  This is safe as wp_check_constraints() returns
early without using ea, so just set it to make the compiler happy.

Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221024041346.103608-1-ruscur@russell.cc
2022-11-24 23:31:49 +11:00
Naveen N. Rao
266b1991a4 powerpc/kprobes: Use preempt_enable() rather than the no_resched variant
preempt_enable_no_resched() is just the same as preempt_enable() when we
are in a irqs disabled context. kprobe_handler() and the post/fault
handlers are all called with irqs disabled. As such, convert those to
just use preempt_enable().

Reported-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/72639f75fe66f931ec8c2165276ffbfb0fe1006f.1666262278.git.naveen.n.rao@linux.vnet.ibm.com
2022-11-24 23:31:49 +11:00
Naveen N. Rao
04ec5d5782 powerpc/kprobes: Have optimized_callback() use preempt_enable()
Similar to x86 commit 2e62024c26 ("kprobes/x86: Use preempt_enable()
in optimized_callback()"), change powerpc optprobes to use
preempt_enable() rather than preempt_enable_no_resched() since powerpc
also removed irq disabling for optprobes in commit f72180cc93
("powerpc/kprobes: Do not disable interrupts for optprobes and
kprobes_on_ftrace").

Reported-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1885bab182626c33d9bf6421f430abf924c521a5.1666262278.git.naveen.n.rao@linux.vnet.ibm.com
2022-11-24 23:31:49 +11:00
Naveen N. Rao
2fa9482334 powerpc/kprobes: Remove preempt disable around call to get_kprobe() in arch_prepare_kprobe()
arch_prepare_kprobe() is called from register_kprobe() via
prepare_kprobe(), or through register_aggr_kprobe(), both with the
kprobe_mutex held. Per the comment for get_kprobe():
  /*
   * This routine is called either:
   *	- under the 'kprobe_mutex' - during kprobe_[un]register().
   *				OR
   *	- with preemption disabled - from architecture specific code.
   */

As such, there is no need to disable preemption around the call to
get_kprobe(). Drop the same.

Reported-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1043d06a0affed83a4a46dd29466e72820ee215d.1666262278.git.naveen.n.rao@linux.vnet.ibm.com
2022-11-24 23:31:49 +11:00
Nicholas Piggin
f985adaf2f powerpc: remove the last remnants of cputime_t
cputime_t was a core kernel type, removed by commits
ed5c8c854f2b..b672592f0221. As explained in commit b672592f02
("sched/cputime: Remove generic asm headers"), the final cleanup
is for the arch to provide cputime_to_nsec[s](). Commit ade7667a98
("powerpc: Add cputime_to_nsecs()") did that, but justdidn't remove
the then-unused cputime_to_usecs(), cputime_t type, and associated
remnants.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221006105653.115829-1-npiggin@gmail.com
2022-11-24 23:31:48 +11:00
Michael Ellerman
d90bb7b4fd powerpc: Print instruction dump on a single line
Although the previous commit made the powerpc instruction dump usable
with scripts/decodecode, there are still some problems.

Because the dump is split across multiple lines, the script doesn't cope
with printk timestamps or caller info.

That can be fixed by printing the entire dump on one line, eg:

  [   12.016307][  T112] --- interrupt: c00
  [   12.016605][  T112] Code: 4b7aae15 60000000 3d22016e 3c62ffec 39291160 38639bc0 e8890000 4b7aadf9 60000000 4bfffee8 7c0802a6 60000000 <0fe00000> 60420000 3c4c008f 384268a0
  [   12.017655][  T112] ---[ end trace 0000000000000000 ]---

That output can then be piped directly into scripts/decodecode and
interpreted correctly.

Printing the dump on a single line does produce a very long line, about
173 characters. That is still shorter than x86, which prints nearly 200
characters even without timestamps etc.

All consoles I'm aware of will wrap the line if it's too long, so the
length should not be a functional problem. If anything it should help on
consoles like VGA by using less vertical space.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221006032019.1128624-2-mpe@ellerman.id.au
2022-11-24 23:31:48 +11:00
Michael Ellerman
3e65412709 powerpc: Make instruction dump work with scripts/decodecode
Matt reported that scripts/decodecode doesn't work for the instruction
dump in the powerpc oops output. Although there are scripts around that
can decode it, it would be preferable if the standard in-tree script
worked.

All other arches prefix the instruction dump with "Code:", and that's
what the script looks for, so use that.

The script then works as expected:

  $ CROSS_COMPILE=powerpc64le-linux-gnu- ./scripts/decodecode
  Code:
  fbc1fff0 f821ffc1 7c7d1b78 7c9c2378 ebc30028 7fdff378 48000018 60000000
  60000000 ebff0008 7c3ef840 41820048 <815f0060> e93f0000 5529077c 7d295378
  ^D

  All code
  ========
     0:   f0 ff c1 fb     std     r30,-16(r1)
     4:   c1 ff 21 f8     stdu    r1,-64(r1)
     8:   78 1b 7d 7c     mr      r29,r3
     ...

Note that the script doesn't cope well with printk timestamps or printk
caller info.

Reported-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221006032019.1128624-1-mpe@ellerman.id.au
2022-11-24 23:31:48 +11:00
Disha Goel
2223552256 powerpc/kvm: Remove unused macros from asm-offset
The kvm code was refactored to convert some of kvm assembly routines to C.
This includes commits which moved code path for the kvm guest entry/exit
for p7/8 from aseembly to C. As part of the code changes, usage of some of
the macros were removed. But definitions still exist in the assembly files.
Commits are listed below:

Commit 2e1ae9cd56 ("KVM: PPC: Book3S HV: Implement radix prefetch workaround by disabling MMU")
Commit 9769a7fd79 ("KVM: PPC: Book3S HV: Remove radix guest support from P7/8 path")
Commit fae5c9f366 ("KVM: PPC: Book3S HV: remove ISA v3.0 and v3.1 support from P7/8 path")
Commit 57dc0eed73 ("KVM: PPC: Book3S HV P9: Implement PMU save/restore in C")

Many of the asm-offset macro definitions were missed to remove. Patch
fixes by removing the unused macros.

Signed-off-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220916105736.268153-2-disgoel@linux.vnet.ibm.com
2022-11-24 23:31:47 +11:00
Sathvika Vasireddy
d0160bd5d3 powerpc/vdso: Skip objtool from running on VDSO files
Do not run objtool on VDSO files, by using OBJECT_FILES_NON_STANDARD.

Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Sathvika Vasireddy <sv@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221114175754.1131267-8-sv@linux.ibm.com
2022-11-18 19:00:06 +11:00
Christophe Leroy
2da3776167 powerpc/32: Fix objtool unannotated intra-function call warnings
Fix several annotations in assembly files on PPC32.

[Sathvika Vasireddy: Changed subject line and removed Kconfig change to
 enable objtool, as it is a part of "objtool/powerpc: Enable objtool to
 be built on ppc" patch in this series.]

Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Sathvika Vasireddy <sv@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221114175754.1131267-7-sv@linux.ibm.com
2022-11-18 19:00:06 +11:00
Jason A. Donenfeld
8032bf1233 treewide: use get_random_u32_below() instead of deprecated function
This is a simple mechanical transformation done by:

@@
expression E;
@@
- prandom_u32_max
+ get_random_u32_below
  (E)

Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Reviewed-by: SeongJae Park <sj@kernel.org> # for damon
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> # for arm
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-11-18 02:15:15 +01:00
Nicholas Piggin
eb761a1760 powerpc: Fix writable sections being moved into the rodata region
.data.rel.ro*  catches .data.rel.root_cpuacct, and the kernel crashes on
a store in css_clear_dir. At least we know read-only data protection is
working...

Fixes: b6adc6d6d3 ("powerpc/build: move .data.rel.ro, .sdata2 to read-only")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221116043954.3307852-1-npiggin@gmail.com
2022-11-16 21:37:14 +11:00
Sergey Shtylyov
18b9fe54d9 powerpc: ptrace: user_regset_copyin_ignore() always returns 0
user_regset_copyin_ignore() always returns 0, so checking its result seems
pointless -- don't do this anymore...

[akpm@linux-foundation.org: fix gpr32_set_common()]
Link: https://lkml.kernel.org/r/20221014212235.10770-11-s.shtylyov@omp.ru
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Cc: Brian Cain <bcain@quicinc.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Helge Deller <deller@gmx.de>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-15 14:30:40 -08:00
Sathvika Vasireddy
8d0c21b506 powerpc: Curb objtool unannotated intra-function call warnings
objtool throws the following unannotated intra-function call warnings:
arch/powerpc/kernel/entry_64.o: warning: objtool: .text+0x4: unannotated intra-function call
arch/powerpc/kvm/book3s_hv_rmhandlers.o: warning: objtool: .text+0xe64: unannotated intra-function call
arch/powerpc/kvm/book3s_hv_rmhandlers.o: warning: objtool: .text+0xee4: unannotated intra-function call

Fix these warnings by annotating intra-function calls, using
ANNOTATE_INTRA_FUNCTION_CALL macro, to indicate that the branch targets
are valid.

Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Sathvika Vasireddy <sv@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221114175754.1131267-5-sv@linux.ibm.com
2022-11-15 20:11:47 +11:00
Sathvika Vasireddy
29a011fc79 powerpc: Fix objtool unannotated intra-function call warnings
Objtool throws unannotated intra-function call warnings in the following
assembly files:

arch/powerpc/kernel/vector.o: warning: objtool: .text+0x53c: unannotated intra-function call

arch/powerpc/kvm/book3s_hv_rmhandlers.o: warning: objtool: .text+0x60: unannotated intra-function call
arch/powerpc/kvm/book3s_hv_rmhandlers.o: warning: objtool: .text+0x124: unannotated intra-function call
arch/powerpc/kvm/book3s_hv_rmhandlers.o: warning: objtool: .text+0x5d4: unannotated intra-function call
arch/powerpc/kvm/book3s_hv_rmhandlers.o: warning: objtool: .text+0x5dc: unannotated intra-function call
arch/powerpc/kvm/book3s_hv_rmhandlers.o: warning: objtool: .text+0xcb8: unannotated intra-function call
arch/powerpc/kvm/book3s_hv_rmhandlers.o: warning: objtool: .text+0xd0c: unannotated intra-function call
arch/powerpc/kvm/book3s_hv_rmhandlers.o: warning: objtool: .text+0x1030: unannotated intra-function call

arch/powerpc/kernel/head_64.o: warning: objtool: .text+0x358: unannotated intra-function call
arch/powerpc/kernel/head_64.o: warning: objtool: .text+0x728: unannotated intra-function call
arch/powerpc/kernel/head_64.o: warning: objtool: .text+0x4d94: unannotated intra-function call
arch/powerpc/kernel/head_64.o: warning: objtool: .text+0x4ec4: unannotated intra-function call

arch/powerpc/kvm/book3s_hv_interrupts.o: warning: objtool: .text+0x6c: unannotated intra-function call
arch/powerpc/kernel/misc_64.o: warning: objtool: .text+0x64: unannotated intra-function call

Objtool does not add STT_NOTYPE symbols with size 0 to the rbtree, which
is why find_call_destination() function is not able to find the
destination symbol for 'bl' instruction. For such symbols, objtool is
throwing unannotated intra-function call warnings in assembly files. Fix
these warnings by annotating those symbols with SYM_FUNC_START_LOCAL and
SYM_FUNC_END macros, inorder to set symbol type to STT_FUNC and symbol
size accordingly.

Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Sathvika Vasireddy <sv@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221114175754.1131267-4-sv@linux.ibm.com
2022-11-15 20:11:47 +11:00
Andreas Schwab
ce883a2ba3 powerpc/32: fix syscall wrappers with 64-bit arguments
With the introduction of syscall wrappers all wrappers for syscalls with
64-bit arguments must be handled specially, not only those that have
unaligned 64-bit arguments. This left out the fallocate() and
sync_file_range2() syscalls.

Fixes: 7e92e01b72 ("powerpc: Provide syscall wrapper")
Fixes: e237506238 ("powerpc/32: fix syscall wrappers with 64-bit arguments of unaligned register-pairs")
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/87mt9cxd6g.fsf_-_@igel.home
2022-11-01 10:24:09 +11:00
Nicholas Piggin
65722736c3 powerpc/64s/interrupt: Fix clear of PACA_IRQS_HARD_DIS when returning to soft-masked context
Commit a4cb3651a1 ("powerpc/64s/interrupt: Fix lost interrupts when
returning to soft-masked context") fixed the problem of pending irqs
being cleared when clearing the HARD_DIS bit, but then it didn't clear
the bit at all. This change clears HARD_DIS without affecting other bits
in the mask.

When an interrupt hits in a soft-masked section that has MSR[EE]=1, it
can hard disable and set PACA_IRQS_HARD_DIS, which must be cleared when
returning to the EE=1 caller (unless it was set due to a MUST_HARD_MASK
interrupt becoming pending). Failure to clear this leaves the
returned-to context running with MSR[EE]=1 and PACA_IRQS_HARD_DIS, which
confuses irq assertions and could be dangerous for code that might test
the flag.

This was observed in a hash MMU kernel where a kernel hash fault hits in
a local_irqs_disabled region that has EE=1. The hash fault also runs
with EE=1, then as it returns, a decrementer hits in the restart section
and the irq restart code hard-masks which sets the PACA_IRQ_HARD_DIS
flag, which is not clear when the original context is returned to.

Reported-by: Sachin Sant <sachinp@linux.ibm.com>
Fixes: a4cb3651a1 ("powerpc/64s/interrupt: Fix lost interrupts when returning to soft-masked context")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221022052207.471328-1-npiggin@gmail.com
2022-10-27 00:38:35 +11:00
Nicholas Piggin
dc398a084d powerpc/64s/interrupt: Perf NMI should not take normal exit path
NMI interrupts should exit with EXCEPTION_RESTORE_REGS not with
interrupt_return_srr, which is what the perf NMI handler currently does.
This breaks if a PMI hits after interrupt_exit_user_prepare_main() has
switched the context tracking to user mode, then the CT_WARN_ON() in
interrupt_exit_kernel_prepare() fires because it returns to kernel with
context set to user.

This could possibly be solved by soft-disabling PMIs in the exit path,
but that reduces our ability to profile that code. The warning could be
removed, but it's potentially useful.

All other NMIs and soft-NMIs return using EXCEPTION_RESTORE_REGS, so
this makes perf interrupts consistent with that and seems like the best
fix.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Squash in fixups from Nick]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221006140413.126443-3-npiggin@gmail.com
2022-10-18 22:46:19 +11:00
Nicholas Piggin
a073672eb0 powerpc/64/interrupt: Prevent NMI PMI causing a dangerous warning
NMI PMIs really should not return using the normal interrupt_return
function. If such a PMI hits in code returning to user with the context
switched to user mode, this warning can fire. This was enough to cause
crashes when reproducing on 64s, because another perf interrupt would
hit while reporting bug, and that would cause another bug, and so on
until smashing the stack.

Work around that particular crash for now by just disabling that context
warning for PMIs. This is a hack and not a complete fix, there could be
other such problems lurking in corners. But it does fix the known crash.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221014030729.2077151-3-npiggin@gmail.com
2022-10-18 22:46:19 +11:00
Linus Torvalds
f1947d7c8a Random number generator fixes for Linux 6.1-rc1.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmNHYD0ACgkQSfxwEqXe
 A655AA//dJK0PdRghqrKQsl18GOCffV5TUw5i1VbJQbI9d8anfxNjVUQiNGZi4et
 qUwZ8OqVXxYx1Z1UDgUE39PjEDSG9/cCvOpMUWqN20/+6955WlNZjwA7Fk6zjvlM
 R30fz5CIJns9RFvGT4SwKqbVLXIMvfg/wDENUN+8sxt36+VD2gGol7J2JJdngEhM
 lW+zqzi0ABqYy5so4TU2kixpKmpC08rqFvQbD1GPid+50+JsOiIqftDErt9Eg1Mg
 MqYivoFCvbAlxxxRh3+UHBd7ZpJLtp1UFEOl2Rf00OXO+ZclLCAQAsTczucIWK9M
 8LCZjb7d4lPJv9RpXFAl3R1xvfc+Uy2ga5KeXvufZtc5G3aMUKPuIU7k28ZyblVS
 XXsXEYhjTSd0tgi3d0JlValrIreSuj0z2QGT5pVcC9utuAqAqRIlosiPmgPlzXjr
 Us4jXaUhOIPKI+Musv/fqrxsTQziT0jgVA3Njlt4cuAGm/EeUbLUkMWwKXjZLTsv
 vDsBhEQFmyZqxWu4pYo534VX2mQWTaKRV1SUVVhQEHm57b00EAiZohoOvweB09SR
 4KiJapikoopmW4oAUFotUXUL1PM6yi+MXguTuc1SEYuLz/tCFtK8DJVwNpfnWZpE
 lZKvXyJnHq2Sgod/hEZq58PMvT6aNzTzSg7YzZy+VabxQGOO5mc=
 =M+mV
 -----END PGP SIGNATURE-----

Merge tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random

Pull more random number generator updates from Jason Donenfeld:
 "This time with some large scale treewide cleanups.

  The intent of this pull is to clean up the way callers fetch random
  integers. The current rules for doing this right are:

   - If you want a secure or an insecure random u64, use get_random_u64()

   - If you want a secure or an insecure random u32, use get_random_u32()

     The old function prandom_u32() has been deprecated for a while
     now and is just a wrapper around get_random_u32(). Same for
     get_random_int().

   - If you want a secure or an insecure random u16, use get_random_u16()

   - If you want a secure or an insecure random u8, use get_random_u8()

   - If you want secure or insecure random bytes, use get_random_bytes().

     The old function prandom_bytes() has been deprecated for a while
     now and has long been a wrapper around get_random_bytes()

   - If you want a non-uniform random u32, u16, or u8 bounded by a
     certain open interval maximum, use prandom_u32_max()

     I say "non-uniform", because it doesn't do any rejection sampling
     or divisions. Hence, it stays within the prandom_*() namespace, not
     the get_random_*() namespace.

     I'm currently investigating a "uniform" function for 6.2. We'll see
     what comes of that.

  By applying these rules uniformly, we get several benefits:

   - By using prandom_u32_max() with an upper-bound that the compiler
     can prove at compile-time is ≤65536 or ≤256, internally
     get_random_u16() or get_random_u8() is used, which wastes fewer
     batched random bytes, and hence has higher throughput.

   - By using prandom_u32_max() instead of %, when the upper-bound is
     not a constant, division is still avoided, because
     prandom_u32_max() uses a faster multiplication-based trick instead.

   - By using get_random_u16() or get_random_u8() in cases where the
     return value is intended to indeed be a u16 or a u8, we waste fewer
     batched random bytes, and hence have higher throughput.

  This series was originally done by hand while I was on an airplane
  without Internet. Later, Kees and I worked on retroactively figuring
  out what could be done with Coccinelle and what had to be done
  manually, and then we split things up based on that.

  So while this touches a lot of files, the actual amount of code that's
  hand fiddled is comfortably small"

* tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  prandom: remove unused functions
  treewide: use get_random_bytes() when possible
  treewide: use get_random_u32() when possible
  treewide: use get_random_{u8,u16}() when possible, part 2
  treewide: use get_random_{u8,u16}() when possible, part 1
  treewide: use prandom_u32_max() when possible, part 2
  treewide: use prandom_u32_max() when possible, part 1
2022-10-16 15:27:07 -07:00
Linus Torvalds
70609c1495 powerpc fixes for 6.1 #2
- Fix 32-bit syscall wrappers with 64-bit arguments of unaligned register-pairs.
    Notably this broke ftruncate64 & pread/write64, which can lead to file corruption.
 
  - Fix lost interrupts when returning to soft-masked context on 64-bit.
 
  - Fix build failure when CONFIG_DTL=n.
 
 Thanks to: Nicholas Piggin, Jason A. Donenfeld, Guenter Roeck, Arnd Bergmann, Sachin Sant.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmNJQp8THG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgPd3D/9+uPapAs6/ZGkrbojseiF/fZNyIfTF
 HcLZ3iWUP9+APuaRxd1m3HbBoWdgFQ9yC5abYnX1z+E/yEhbTaOof9Gu1vXfxbIF
 Orlc+X12uZMrdU1L2qlgk/89QLeKOTlp/b2eUXJ332Vl4DfJDoknMsqTqbMz9NrL
 EDj6vmG3JzJgKHmplwxd7d4CExQNh9khZr3lZ6Ae/hC1cZ1PVXD67akwxsc0i8E2
 dBGzwsP1SYXPrBMXONT1GNs/whXrKBHNyLWFueab3p0FsSkbc4sY/HaAPzjq7YzK
 aPV0FQi2YpIZ5IIunZA+vW+tQ8EvyOpOUPMGb7iiN/31YcFbf/k6Y92AlCFlHRq3
 9vvsNGzDeFBfvH2wWcR1XaAMQHmBpC4L+dYbCKIyuK9DjgZtpZUV3acagRMFXqMN
 TeN9h2GXAYxCdNc27I5v3wAvbNcx69E4Bs5F7voKGD4pb7V/4d/li2oix6v0FKIy
 qAt1PGhD4IgPE3oJOEOYsJRVQNyDfG4eKlFh5ijyyc5hQeGNeKiAH+G+a61R3ueo
 GvVOdkR/1QxK1XkbgrMyclb0nHN1kEjRHeIcMFW1+dMKA9UvOlJ6u8nkOFRvA3js
 r00O1uE7DT5WzdGXgEHCCDd7PybBKxpM32m4/vzgDM7xG+DHxp3j7LTQsDOfkFsH
 bkc31LqrAw3yDQ==
 =B+7s
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - Fix 32-bit syscall wrappers with 64-bit arguments of unaligned
   register-pairs. Notably this broke ftruncate64 & pread/write64, which
   can lead to file corruption.

 - Fix lost interrupts when returning to soft-masked context on 64-bit.

 - Fix build failure when CONFIG_DTL=n.

Thanks to Nicholas Piggin, Jason A. Donenfeld, Guenter Roeck, Arnd
Bergmann, and Sachin Sant.

* tag 'powerpc-6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/pseries: Fix CONFIG_DTL=n build
  powerpc/64s/interrupt: Fix lost interrupts when returning to soft-masked context
  powerpc/32: fix syscall wrappers with 64-bit arguments of unaligned register-pairs
2022-10-14 11:16:18 -07:00
Nicholas Piggin
a4cb3651a1 powerpc/64s/interrupt: Fix lost interrupts when returning to soft-masked context
It's possible for an interrupt returning to an irqs-disabled context to
lose a pending soft-masked irq because it branches to part of the exit
code for irqs-enabled contexts, which is meant to clear only the
PACA_IRQS_HARD_DIS flag from PACAIRQHAPPENED by zeroing the byte. This
just looks like a simple thinko from a recent commit (if there was no
hard mask pending, there would be no reason to clear it anyway).

This also adds comment to the code that actually does need to clear the
flag.

Fixes: e485f6c751 ("powerpc/64/interrupt: Fix return to masked context after hard-mask irq becomes pending")
Reported-by: Sachin Sant <sachinp@linux.ibm.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221013064418.1311104-1-npiggin@gmail.com
2022-10-13 22:29:49 +11:00
Linus Torvalds
676cb49573 - hfs and hfsplus kmap API modernization from Fabio Francesco
- Valentin Schneider makes crash-kexec work properly when invoked from
   an NMI-time panic.
 
 - ntfs bugfixes from Hawkins Jiawei
 
 - Jiebin Sun improves IPC msg scalability by replacing atomic_t's with
   percpu counters.
 
 - nilfs2 cleanups from Minghao Chi
 
 - lots of other single patches all over the tree!
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY0Yf0gAKCRDdBJ7gKXxA
 joapAQDT1d1zu7T8yf9cQXkYnZVuBKCjxKE/IsYvqaq1a42MjQD/SeWZg0wV05B8
 DhJPj9nkEp6R3Rj3Mssip+3vNuceAQM=
 =lUQY
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - hfs and hfsplus kmap API modernization (Fabio Francesco)

 - make crash-kexec work properly when invoked from an NMI-time panic
   (Valentin Schneider)

 - ntfs bugfixes (Hawkins Jiawei)

 - improve IPC msg scalability by replacing atomic_t's with percpu
   counters (Jiebin Sun)

 - nilfs2 cleanups (Minghao Chi)

 - lots of other single patches all over the tree!

* tag 'mm-nonmm-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (71 commits)
  include/linux/entry-common.h: remove has_signal comment of arch_do_signal_or_restart() prototype
  proc: test how it holds up with mapping'less process
  mailmap: update Frank Rowand email address
  ia64: mca: use strscpy() is more robust and safer
  init/Kconfig: fix unmet direct dependencies
  ia64: update config files
  nilfs2: replace WARN_ONs by nilfs_error for checkpoint acquisition failure
  fork: remove duplicate included header files
  init/main.c: remove unnecessary (void*) conversions
  proc: mark more files as permanent
  nilfs2: remove the unneeded result variable
  nilfs2: delete unnecessary checks before brelse()
  checkpatch: warn for non-standard fixes tag style
  usr/gen_init_cpio.c: remove unnecessary -1 values from int file
  ipc/msg: mitigate the lock contention with percpu counter
  percpu: add percpu_counter_add_local and percpu_counter_sub_local
  fs/ocfs2: fix repeated words in comments
  relay: use kvcalloc to alloc page array in relay_alloc_page_array
  proc: make config PROC_CHILDREN depend on PROC_FS
  fs: uninline inode_maybe_inc_iversion()
  ...
2022-10-12 11:00:22 -07:00
Nicholas Piggin
e237506238 powerpc/32: fix syscall wrappers with 64-bit arguments of unaligned register-pairs
powerpc 32-bit system call (and function) calling convention for 64-bit
arguments requires the next available odd-pair (two sequential registers
with the first being odd-numbered) from the standard register argument
allocation.

The first argument register is r3, so a 64-bit argument that appears at
an even position in the argument list must skip a register (unless there
were preceding 64-bit arguments, which might throw things off). This
requires non-standard compat definitions to deal with the holes in the
argument register allocation.

With pt_regs syscall wrappers which use a standard mapper to map pt_regs
GPRs to function arguments, 32-bit kernels hit the same basic problem,
the standard definitions don't cope with the unused argument registers.

Fix this by having 32-bit kernels share those syscall definitions with
compat.

Thanks to Jason for spending a lot of time finding and bisecting this
and developing a trivial reproducer. The perfect bug report.

Reported-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Fixes: 7e92e01b72 ("powerpc: Provide syscall wrapper")
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221012035335.866440-1-npiggin@gmail.com
2022-10-13 00:49:58 +11:00
Jason A. Donenfeld
81895a65ec treewide: use prandom_u32_max() when possible, part 1
Rather than incurring a division or requesting too many random bytes for
the given range, use the prandom_u32_max() function, which only takes
the minimum required bytes from the RNG and avoids divisions. This was
done mechanically with this coccinelle script:

@basic@
expression E;
type T;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
typedef u64;
@@
(
- ((T)get_random_u32() % (E))
+ prandom_u32_max(E)
|
- ((T)get_random_u32() & ((E) - 1))
+ prandom_u32_max(E * XXX_MAKE_SURE_E_IS_POW2)
|
- ((u64)(E) * get_random_u32() >> 32)
+ prandom_u32_max(E)
|
- ((T)get_random_u32() & ~PAGE_MASK)
+ prandom_u32_max(PAGE_SIZE)
)

@multi_line@
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
identifier RAND;
expression E;
@@

-       RAND = get_random_u32();
        ... when != RAND
-       RAND %= (E);
+       RAND = prandom_u32_max(E);

// Find a potential literal
@literal_mask@
expression LITERAL;
type T;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
position p;
@@

        ((T)get_random_u32()@p & (LITERAL))

// Add one to the literal.
@script:python add_one@
literal << literal_mask.LITERAL;
RESULT;
@@

value = None
if literal.startswith('0x'):
        value = int(literal, 16)
elif literal[0] in '123456789':
        value = int(literal, 10)
if value is None:
        print("I don't know how to handle %s" % (literal))
        cocci.include_match(False)
elif value == 2**32 - 1 or value == 2**31 - 1 or value == 2**24 - 1 or value == 2**16 - 1 or value == 2**8 - 1:
        print("Skipping 0x%x for cleanup elsewhere" % (value))
        cocci.include_match(False)
elif value & (value + 1) != 0:
        print("Skipping 0x%x because it's not a power of two minus one" % (value))
        cocci.include_match(False)
elif literal.startswith('0x'):
        coccinelle.RESULT = cocci.make_expr("0x%x" % (value + 1))
else:
        coccinelle.RESULT = cocci.make_expr("%d" % (value + 1))

// Replace the literal mask with the calculated result.
@plus_one@
expression literal_mask.LITERAL;
position literal_mask.p;
expression add_one.RESULT;
identifier FUNC;
@@

-       (FUNC()@p & (LITERAL))
+       prandom_u32_max(RESULT)

@collapse_ret@
type T;
identifier VAR;
expression E;
@@

 {
-       T VAR;
-       VAR = (E);
-       return VAR;
+       return E;
 }

@drop_var@
type T;
identifier VAR;
@@

 {
-       T VAR;
        ... when != VAR
 }

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: KP Singh <kpsingh@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz> # for ext4 and sbitmap
Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> # for drbd
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-10-11 17:42:55 -06:00
Joel Stanley
ae5b6779fa powerpc: Fix 85xx build
The merge of the kbuild tree dropped the renaming of the FSL_BOOKE
kconfig option.

Fixes: 8afc66e8d4 ("Merge tag 'kbuild-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild")
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-10-11 10:13:34 -07:00
Linus Torvalds
27bc50fc90 - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
linux-next for a couple of months without, to my knowledge, any negative
   reports (or any positive ones, come to that).
 
 - Also the Maple Tree from Liam R.  Howlett.  An overlapping range-based
   tree for vmas.  It it apparently slight more efficient in its own right,
   but is mainly targeted at enabling work to reduce mmap_lock contention.
 
   Liam has identified a number of other tree users in the kernel which
   could be beneficially onverted to mapletrees.
 
   Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
   (https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com).
   This has yet to be addressed due to Liam's unfortunately timed
   vacation.  He is now back and we'll get this fixed up.
 
 - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer.  It uses
   clang-generated instrumentation to detect used-unintialized bugs down to
   the single bit level.
 
   KMSAN keeps finding bugs.  New ones, as well as the legacy ones.
 
 - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
   memory into THPs.
 
 - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support
   file/shmem-backed pages.
 
 - userfaultfd updates from Axel Rasmussen
 
 - zsmalloc cleanups from Alexey Romanov
 
 - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure
 
 - Huang Ying adds enhancements to NUMA balancing memory tiering mode's
   page promotion, with a new way of detecting hot pages.
 
 - memcg updates from Shakeel Butt: charging optimizations and reduced
   memory consumption.
 
 - memcg cleanups from Kairui Song.
 
 - memcg fixes and cleanups from Johannes Weiner.
 
 - Vishal Moola provides more folio conversions
 
 - Zhang Yi removed ll_rw_block() :(
 
 - migration enhancements from Peter Xu
 
 - migration error-path bugfixes from Huang Ying
 
 - Aneesh Kumar added ability for a device driver to alter the memory
   tiering promotion paths.  For optimizations by PMEM drivers, DRM
   drivers, etc.
 
 - vma merging improvements from Jakub Matěn.
 
 - NUMA hinting cleanups from David Hildenbrand.
 
 - xu xin added aditional userspace visibility into KSM merging activity.
 
 - THP & KSM code consolidation from Qi Zheng.
 
 - more folio work from Matthew Wilcox.
 
 - KASAN updates from Andrey Konovalov.
 
 - DAMON cleanups from Kaixu Xia.
 
 - DAMON work from SeongJae Park: fixes, cleanups.
 
 - hugetlb sysfs cleanups from Muchun Song.
 
 - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY0HaPgAKCRDdBJ7gKXxA
 joPjAQDZ5LlRCMWZ1oxLP2NOTp6nm63q9PWcGnmY50FjD/dNlwEAnx7OejCLWGWf
 bbTuk6U2+TKgJa4X7+pbbejeoqnt5QU=
 =xfWx
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
   linux-next for a couple of months without, to my knowledge, any
   negative reports (or any positive ones, come to that).

 - Also the Maple Tree from Liam Howlett. An overlapping range-based
   tree for vmas. It it apparently slightly more efficient in its own
   right, but is mainly targeted at enabling work to reduce mmap_lock
   contention.

   Liam has identified a number of other tree users in the kernel which
   could be beneficially onverted to mapletrees.

   Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
   at [1]. This has yet to be addressed due to Liam's unfortunately
   timed vacation. He is now back and we'll get this fixed up.

 - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
   clang-generated instrumentation to detect used-unintialized bugs down
   to the single bit level.

   KMSAN keeps finding bugs. New ones, as well as the legacy ones.

 - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
   memory into THPs.

 - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to
   support file/shmem-backed pages.

 - userfaultfd updates from Axel Rasmussen

 - zsmalloc cleanups from Alexey Romanov

 - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and
   memory-failure

 - Huang Ying adds enhancements to NUMA balancing memory tiering mode's
   page promotion, with a new way of detecting hot pages.

 - memcg updates from Shakeel Butt: charging optimizations and reduced
   memory consumption.

 - memcg cleanups from Kairui Song.

 - memcg fixes and cleanups from Johannes Weiner.

 - Vishal Moola provides more folio conversions

 - Zhang Yi removed ll_rw_block() :(

 - migration enhancements from Peter Xu

 - migration error-path bugfixes from Huang Ying

 - Aneesh Kumar added ability for a device driver to alter the memory
   tiering promotion paths. For optimizations by PMEM drivers, DRM
   drivers, etc.

 - vma merging improvements from Jakub Matěn.

 - NUMA hinting cleanups from David Hildenbrand.

 - xu xin added aditional userspace visibility into KSM merging
   activity.

 - THP & KSM code consolidation from Qi Zheng.

 - more folio work from Matthew Wilcox.

 - KASAN updates from Andrey Konovalov.

 - DAMON cleanups from Kaixu Xia.

 - DAMON work from SeongJae Park: fixes, cleanups.

 - hugetlb sysfs cleanups from Muchun Song.

 - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.

Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1]

* tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits)
  hugetlb: allocate vma lock for all sharable vmas
  hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer
  hugetlb: fix vma lock handling during split vma and range unmapping
  mglru: mm/vmscan.c: fix imprecise comments
  mm/mglru: don't sync disk for each aging cycle
  mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol
  mm: memcontrol: use do_memsw_account() in a few more places
  mm: memcontrol: deprecate swapaccounting=0 mode
  mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled
  mm/secretmem: remove reduntant return value
  mm/hugetlb: add available_huge_pages() func
  mm: remove unused inline functions from include/linux/mm_inline.h
  selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
  selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd
  selftests/vm: add thp collapse shmem testing
  selftests/vm: add thp collapse file and tmpfs testing
  selftests/vm: modularize thp collapse memory operations
  selftests/vm: dedup THP helpers
  mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
  mm/madvise: add file and shmem support to MADV_COLLAPSE
  ...
2022-10-10 17:53:04 -07:00
Linus Torvalds
d4013bc4d4 bitmap patches for v6.1-rc1
From Phil Auld:
 drivers/base: Fix unsigned comparison to -1 in CPUMAP_FILE_MAX_BYTES
 
 From me:
 cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess
 
 This series cleans that mess and adds new config FORCE_NR_CPUS that
 allows to optimize cpumask subsystem if the number of CPUs is known
 at compile-time.
 
 From me:
 lib: optimize find_bit() functions
 
 Reworks find_bit() functions based on new FIND_{FIRST,NEXT}_BIT() macros.
 
 From me:
 lib/find: add find_nth_bit()
 
 Adds find_nth_bit(), which is ~70 times faster than bitcounting with
 for_each() loop:
         for_each_set_bit(bit, mask, size)
                 if (n-- == 0)
                         return bit;
 
 Also adds bitmap_weight_and() to let people replace this pattern:
 	tmp = bitmap_alloc(nbits);
 	bitmap_and(tmp, map1, map2, nbits);
 	weight = bitmap_weight(tmp, nbits);
 	bitmap_free(tmp);
 with a single bitmap_weight_and() call.
 
 From me:
 cpumask: repair cpumask_check()
 
 After switching cpumask to use nr_cpu_ids, cpumask_check() started
 generating many false-positive warnings. This series fixes it.
 
 From Valentin Schneider:
 bitmap,cpumask: Add for_each_cpu_andnot() and for_each_cpu_andnot()
 
 Extends the API with one more function and applies it in sched/core.
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmNBwmUACgkQsUSA/Tof
 vshPRwv+KlqnZlKtuSPgbo/Kgswworpi/7TqfnN9GWlb8AJ2uhjBKI3GFwv4TDow
 7KV6wdKdXYLr4pktcIhWy3qLrT+bDDExfarHRo3QI1A1W42EJ+ZiUaGnQGcnVMzD
 5q/K1YMJYq0oaesHEw5PVUh8mm6h9qRD8VbX1u+riW/VCWBj3bho9Dp4mffQ48Q6
 hVy/SnMGgClQwNYp+sxkqYx38xUqUGYoU5MzeziUmoS6pZQh+4lF33MULnI3EKmc
 /ehXilPPtOV/Tm0RovDWFfm3rjNapV9FXHu8Ob2z/c+1A29EgXnE3pwrBDkAx001
 TQrL9qbCANRDGPLzWQHw0dwFIaXvTdrSttCsfYYfU5hI4JbnJEe0Pqkaaohy7jqm
 r0dW/TlyOG5T+k8Kwdx9w9A+jKs8TbKKZ8HOaN8BpkXswVnpbzpQbj3TITZI4aeV
 6YR4URBQ5UkrVLEXFXbrOzwjL2zqDdyNoBdTJmGLJ+5b/n0HHzmyMVkegNIwLLM3
 GR7sMQae
 =Q/+F
 -----END PGP SIGNATURE-----

Merge tag 'bitmap-6.1-rc1' of https://github.com/norov/linux

Pull bitmap updates from Yury Norov:

 - Fix unsigned comparison to -1 in CPUMAP_FILE_MAX_BYTES (Phil Auld)

 - cleanup nr_cpu_ids vs nr_cpumask_bits mess (me)

   This series cleans that mess and adds new config FORCE_NR_CPUS that
   allows to optimize cpumask subsystem if the number of CPUs is known
   at compile-time.

 - optimize find_bit() functions (me)

   Reworks find_bit() functions based on new FIND_{FIRST,NEXT}_BIT()
   macros.

 - add find_nth_bit() (me)

   Adds find_nth_bit(), which is ~70 times faster than bitcounting with
   for_each() loop:

	for_each_set_bit(bit, mask, size)
		if (n-- == 0)
			return bit;

   Also adds bitmap_weight_and() to let people replace this pattern:

	tmp = bitmap_alloc(nbits);
	bitmap_and(tmp, map1, map2, nbits);
	weight = bitmap_weight(tmp, nbits);
	bitmap_free(tmp);

   with a single bitmap_weight_and() call.

 - repair cpumask_check() (me)

   After switching cpumask to use nr_cpu_ids, cpumask_check() started
   generating many false-positive warnings. This series fixes it.

 - Add for_each_cpu_andnot() and for_each_cpu_andnot() (Valentin
   Schneider)

   Extends the API with one more function and applies it in sched/core.

* tag 'bitmap-6.1-rc1' of https://github.com/norov/linux: (28 commits)
  sched/core: Merge cpumask_andnot()+for_each_cpu() into for_each_cpu_andnot()
  lib/test_cpumask: Add for_each_cpu_and(not) tests
  cpumask: Introduce for_each_cpu_andnot()
  lib/find_bit: Introduce find_next_andnot_bit()
  cpumask: fix checking valid cpu range
  lib/bitmap: add tests for for_each() loops
  lib/find: optimize for_each() macros
  lib/bitmap: introduce for_each_set_bit_wrap() macro
  lib/find_bit: add find_next{,_and}_bit_wrap
  cpumask: switch for_each_cpu{,_not} to use for_each_bit()
  net: fix cpu_max_bits_warn() usage in netif_attrmask_next{,_and}
  cpumask: add cpumask_nth_{,and,andnot}
  lib/bitmap: remove bitmap_ord_to_pos
  lib/bitmap: add tests for find_nth_bit()
  lib: add find_nth{,_and,_andnot}_bit()
  lib/bitmap: add bitmap_weight_and()
  lib/bitmap: don't call __bitmap_weight() in kernel code
  tools: sync find_bit() implementation
  lib/find_bit: optimize find_next_bit() functions
  lib/find_bit: create find_first_zero_bit_le()
  ...
2022-10-10 12:49:34 -07:00
Linus Torvalds
8afc66e8d4 Kbuild updates for v6.1
- Remove potentially incomplete targets when Kbuid is interrupted by
    SIGINT etc. in case GNU Make may miss to do that when stderr is piped
    to another program.
 
  - Rewrite the single target build so it works more correctly.
 
  - Fix rpm-pkg builds with V=1.
 
  - List top-level subdirectories in ./Kbuild.
 
  - Ignore auto-generated __kstrtab_* and __kstrtabns_* symbols in kallsyms.
 
  - Avoid two different modules in lib/zstd/ having shared code, which
    potentially causes building the common code as build-in and modular
    back-and-forth.
 
  - Unify two modpost invocations to optimize the build process.
 
  - Remove head-y syntax in favor of linker scripts for placing particular
    sections in the head of vmlinux.
 
  - Bump the minimal GNU Make version to 3.82.
 
  - Clean up misc Makefiles and scripts.
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmM+4vcVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGY2IQAInr0JUNnkkxwUSXtOcQuA3IK8RJ
 FbU9HXJRoV9H+7+l3SMlN7mIbrs5eE5fTY3iwQ3CVe139d1+1q7nvTMRv8owywJx
 GBgzswncuu1lk7iQQ//CxiqMwSCG8GJdYn1uDVy4I5jg3o+DtFZJtyq2Wb7pqsMm
 ZhZ4PozRN+idYQJSF6Vx/zEVLHI7quMBwfe4CME8/0Kg2+hnYzbXV/aUf0ED2emq
 zdCMDQgIOK5AhY+8qgMXKYnBUJMTqBp6LoR4p3ApfUkwRFY0sGa0/LK3U/B22OE7
 uWyR4fCUExGyerlcHEVev+9eBfmsLLPyqlchNwpSDOPf5OSdnKmgqJEBR/Cvx0eh
 URerPk7EHxyH3G8yi+cU2GtofNTGc5RHPRgJE2ADsQEi5TAUKGmbXMlsFRL/51Vn
 lTANZObBNa1d4enljF6TfTL5nuccOa+DKvXnH9fQ49t0QdtSikv6J/lGwilwm1Sr
 BctmCsySPuURZfkpI9OQnLuouloMXl9f7Q/+S39haS/tSgvPpyITyO71nxDnXn/s
 BbFObZJUk9QkqOACjBP1hNErTLt83uBxQ9z+rDCw/SbLIe4nw0wyneuygfHI5rI8
 3RZB2DbGauuJHX2Zs6YGS14SLSY33IsLqKR1/Vy3LrPvOHuEvNiOR8LITq5E0YCK
 OffK2Y5cIlXR0QWf
 =DHiN
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Remove potentially incomplete targets when Kbuid is interrupted by
   SIGINT etc in case GNU Make may miss to do that when stderr is piped
   to another program.

 - Rewrite the single target build so it works more correctly.

 - Fix rpm-pkg builds with V=1.

 - List top-level subdirectories in ./Kbuild.

 - Ignore auto-generated __kstrtab_* and __kstrtabns_* symbols in
   kallsyms.

 - Avoid two different modules in lib/zstd/ having shared code, which
   potentially causes building the common code as build-in and modular
   back-and-forth.

 - Unify two modpost invocations to optimize the build process.

 - Remove head-y syntax in favor of linker scripts for placing
   particular sections in the head of vmlinux.

 - Bump the minimal GNU Make version to 3.82.

 - Clean up misc Makefiles and scripts.

* tag 'kbuild-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (41 commits)
  docs: bump minimal GNU Make version to 3.82
  ia64: simplify esi object addition in Makefile
  Revert "kbuild: Check if linker supports the -X option"
  kbuild: rebuild .vmlinux.export.o when its prerequisite is updated
  kbuild: move modules.builtin(.modinfo) rules to Makefile.vmlinux_o
  zstd: Fixing mixed module-builtin objects
  kallsyms: ignore __kstrtab_* and __kstrtabns_* symbols
  kallsyms: take the input file instead of reading stdin
  kallsyms: drop duplicated ignore patterns from kallsyms.c
  kbuild: reuse mksysmap output for kallsyms
  mksysmap: update comment about __crc_*
  kbuild: remove head-y syntax
  kbuild: use obj-y instead extra-y for objects placed at the head
  kbuild: hide error checker logs for V=1 builds
  kbuild: re-run modpost when it is updated
  kbuild: unify two modpost invocations
  kbuild: move vmlinux.o rule to the top Makefile
  kbuild: move .vmlinux.objs rule to Makefile.modpost
  kbuild: list sub-directories in ./Kbuild
  Makefile.compiler: replace cc-ifversion with compiler-specific macros
  ...
2022-10-10 12:00:45 -07:00
Linus Torvalds
3871d93b82 Perf events updates for v6.1:
- PMU driver updates:
 
      - Add AMD Last Branch Record Extension Version 2 (LbrExtV2)
        feature support for Zen 4 processors.
 
      - Extend the perf ABI to provide branch speculation information,
        if available, and use this on CPUs that have it (eg. LbrExtV2).
 
      - Improve Intel PEBS TSC timestamp handling & integration.
 
      - Add Intel Raptor Lake S CPU support.
 
      - Add 'perf mem' and 'perf c2c' memory profiling support on
        AMD CPUs by utilizing IBS tagged load/store samples.
 
      - Clean up & optimize various x86 PMU details.
 
  - HW breakpoints:
 
      - Big rework to optimize the code for systems with hundreds of CPUs and
        thousands of breakpoints:
 
         - Replace the nr_bp_mutex global mutex with the bp_cpuinfo_sem
 	  per-CPU rwsem that is read-locked during most of the key operations.
 
 	- Improve the O(#cpus * #tasks) logic in toggle_bp_slot()
 	  and fetch_bp_busy_slots().
 
 	- Apply micro-optimizations & cleanups.
 
   - Misc cleanups & enhancements.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmM/2pMRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1iIMA/+J+MCEVTt9kwZeBtHoPX7iZ5gnq1+McoQ
 f6ALX19AO/ZSuA7EBA3cS3Ny5eyGy3ofYUnRW+POezu9CpflLW/5N27R2qkZFrWC
 A09B86WH676ZrmXt+oI05rpZ2y/NGw4gJxLLa4/bWF3g9xLfo21i+YGKwdOnNFpl
 DEdCVHtjlMcOAU3+on6fOYuhXDcYd7PKGcCfLE7muOMOAtwyj0bUDBt7m+hneZgy
 qbZHzDU2DZ5L/LXiMyuZj5rC7V4xUbfZZfXglG38YCW1WTieS3IjefaU2tREhu7I
 rRkCK48ULDNNJR3dZK8IzXJRxusq1ICPG68I+nm/K37oZyTZWtfYZWehW/d/TnPa
 tUiTwimabz7UUqaGq9ZptxwINcAigax0nl6dZ3EseeGhkDE6j71/3kqrkKPz4jth
 +fCwHLOrI3c4Gq5qWgPvqcUlUneKf3DlOMtzPKYg7sMhla2zQmFpYCPzKfm77U/Z
 BclGOH3FiwaK6MIjPJRUXTePXqnUseqCR8PCH/UPQUeBEVHFcMvqCaa15nALed8x
 dFi76VywR9mahouuLNq6sUNePlvDd2B124PygNwegLlBfY9QmKONg9qRKOnQpuJ6
 UprRJjLOOucZ/N/jn6+ShHkqmXsnY2MhfUoHUoMQ0QAI+n++e+2AuePo251kKWr8
 QlqKxd9PMQU=
 =LcGg
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf events updates from Ingo Molnar:
 "PMU driver updates:

   - Add AMD Last Branch Record Extension Version 2 (LbrExtV2) feature
     support for Zen 4 processors.

   - Extend the perf ABI to provide branch speculation information, if
     available, and use this on CPUs that have it (eg. LbrExtV2).

   - Improve Intel PEBS TSC timestamp handling & integration.

   - Add Intel Raptor Lake S CPU support.

   - Add 'perf mem' and 'perf c2c' memory profiling support on AMD CPUs
     by utilizing IBS tagged load/store samples.

   - Clean up & optimize various x86 PMU details.

  HW breakpoints:

   - Big rework to optimize the code for systems with hundreds of CPUs
     and thousands of breakpoints:

      - Replace the nr_bp_mutex global mutex with the bp_cpuinfo_sem
        per-CPU rwsem that is read-locked during most of the key
        operations.

      - Improve the O(#cpus * #tasks) logic in toggle_bp_slot() and
        fetch_bp_busy_slots().

      - Apply micro-optimizations & cleanups.

  - Misc cleanups & enhancements"

* tag 'perf-core-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (75 commits)
  perf/hw_breakpoint: Annotate tsk->perf_event_mutex vs ctx->mutex
  perf: Fix pmu_filter_match()
  perf: Fix lockdep_assert_event_ctx()
  perf/x86/amd/lbr: Adjust LBR regardless of filtering
  perf/x86/utils: Fix uninitialized var in get_branch_type()
  perf/uapi: Define PERF_MEM_SNOOPX_PEER in kernel header file
  perf/x86/amd: Support PERF_SAMPLE_PHY_ADDR
  perf/x86/amd: Support PERF_SAMPLE_ADDR
  perf/x86/amd: Support PERF_SAMPLE_{WEIGHT|WEIGHT_STRUCT}
  perf/x86/amd: Support PERF_SAMPLE_DATA_SRC
  perf/x86/amd: Add IBS OP_DATA2 DataSrc bit definitions
  perf/mem: Introduce PERF_MEM_LVLNUM_{EXTN_MEM|IO}
  perf/x86/uncore: Add new Raptor Lake S support
  perf/x86/cstate: Add new Raptor Lake S support
  perf/x86/msr: Add new Raptor Lake S support
  perf/x86: Add new Raptor Lake S support
  bpf: Check flags for branch stack in bpf_read_branch_records helper
  perf, hw_breakpoint: Fix use-after-free if perf_event_open() fails
  perf: Use sample_flags for raw_data
  perf: Use sample_flags for addr
  ...
2022-10-10 09:27:46 -07:00
Linus Torvalds
4899a36f91 powerpc updates for 6.1
- Remove our now never-true definitions for pgd_huge() and p4d_leaf().
 
  - Add pte_needs_flush() and huge_pmd_needs_flush() for 64-bit.
 
  - Add support for syscall wrappers.
 
  - Add support for KFENCE on 64-bit.
 
  - Update 64-bit HV KVM to use the new guest state entry/exit accounting API.
 
  - Support execute-only memory when using the Radix MMU (P9 or later).
 
  - Implement CONFIG_PARAVIRT_TIME_ACCOUNTING for pseries guests.
 
  - Updates to our linker script to move more data into read-only sections.
 
  - Allow the VDSO to be randomised on 32-bit.
 
  - Many other small features and fixes.
 
 Thanks to: Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Athira Rajeev, Christophe
 Leroy, David Hildenbrand, Disha Goel, Fabiano Rosas, Gaosheng Cui, Gustavo A. R. Silva,
 Haren Myneni, Hari Bathini, Jilin Yuan, Joel Stanley, Kajol Jain, Kees Cook, Krzysztof
 Kozlowski, Laurent Dufour, Liang He, Li Huafei, Lukas Bulwahn, Madhavan Srinivasan, Nathan
 Chancellor, Nathan Lynch, Nicholas Miehlbradt, Nicholas Piggin, Pali Rohár, Rohan McLure,
 Russell Currey, Sachin Sant, Segher Boessenkool, Shrikanth Hegde, Tyrel Datwyler, Wolfram
 Sang, ye xingchen, Zheng Yongjun.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmNCpBMTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgDx3EACCf86iumFF3RyvENtDwoTRgH3H0z2E
 /ZC4LKrtxgaPFJzKUT4F0kLK85Hw5GzMEKK42NIhAB0o5vFwmEzxOtnlHOyEufAm
 EDIZDIfxV2J9Qx/cW2DSojPj/o9O6noXwhw9SBqMwiDWd8gXmNgOUEklAO7aR7Vq
 Ne2N2FLMNthZydCoHR6dAEjfe2ceFXP5cALwzQO+ILDdZQ0UcF2Yq4yw/gEDoCrB
 FH7mmE7UaQQHvYzo85VTZu7XfUys1P7kUcnhVurOg7/07ITnvnQR+itKZXC+bSft
 1K7ULtjd2QiCgxZA/apFc3lO46kqHVFsB3onRQw12/Ku5vfGFfY0L0iK97OgM4s0
 0u4r+J7A+MM5YBJVVjwZ6woYO5CWMHYKBZepxOpcvftPxj1LNkiHsryqKILGISEC
 aIY/lI0hpeNU4QshDMXzSTgeb/VF9O5cGPncTPkOFbXxD4RpVyz8tSngsG1+D8lj
 S6B2h3k4A14rnblLOxP22jcedBlTYQcRQS4vwr0a7+63QTjfSJ12xT3ucIAKU9f7
 65rVSS/igbrfxqHDmrd60WWZBMXeK0Zy7YIG6iYPTxpP31eFpSp9wtDlV7V2+EH2
 F2p+TJY8aTA8UW+2L5gigN3RsBeeEB8zxJkB14ivICM7+XzVu11PxPDqjDZYkfzC
 ueKKvCcHhHAYqQ==
 =TFBA
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Remove our now never-true definitions for pgd_huge() and p4d_leaf().

 - Add pte_needs_flush() and huge_pmd_needs_flush() for 64-bit.

 - Add support for syscall wrappers.

 - Add support for KFENCE on 64-bit.

 - Update 64-bit HV KVM to use the new guest state entry/exit accounting
   API.

 - Support execute-only memory when using the Radix MMU (P9 or later).

 - Implement CONFIG_PARAVIRT_TIME_ACCOUNTING for pseries guests.

 - Updates to our linker script to move more data into read-only
   sections.

 - Allow the VDSO to be randomised on 32-bit.

 - Many other small features and fixes.

Thanks to Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Athira
Rajeev, Christophe Leroy, David Hildenbrand, Disha Goel, Fabiano Rosas,
Gaosheng Cui, Gustavo A. R. Silva, Haren Myneni, Hari Bathini, Jilin
Yuan, Joel Stanley, Kajol Jain, Kees Cook, Krzysztof Kozlowski, Laurent
Dufour, Liang He, Li Huafei, Lukas Bulwahn, Madhavan Srinivasan, Nathan
Chancellor, Nathan Lynch, Nicholas Miehlbradt, Nicholas Piggin, Pali
Rohár, Rohan McLure, Russell Currey, Sachin Sant, Segher Boessenkool,
Shrikanth Hegde, Tyrel Datwyler, Wolfram Sang, ye xingchen, and Zheng
Yongjun.

* tag 'powerpc-6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (214 commits)
  KVM: PPC: Book3S HV: Fix stack frame regs marker
  powerpc: Don't add __powerpc_ prefix to syscall entry points
  powerpc/64s/interrupt: Fix stack frame regs marker
  powerpc/64: Fix msr_check_and_set/clear MSR[EE] race
  powerpc/64s/interrupt: Change must-hard-mask interrupt check from BUG to WARN
  powerpc/pseries: Add firmware details to the hardware description
  powerpc/powernv: Add opal details to the hardware description
  powerpc: Add device-tree model to the hardware description
  powerpc/64: Add logical PVR to the hardware description
  powerpc: Add PVR & CPU name to hardware description
  powerpc: Add hardware description string
  powerpc/configs: Enable PPC_UV in powernv_defconfig
  powerpc/configs: Update config files for removed/renamed symbols
  powerpc/mm: Fix UBSAN warning reported on hugetlb
  powerpc/mm: Always update max/min_low_pfn in mem_topology_setup()
  powerpc/mm/book3s/hash: Rename flush_tlb_pmd_range
  powerpc: Drops STABS_DEBUG from linker scripts
  powerpc/64s: Remove lost/old comment
  powerpc/64s: Remove old STAB comment
  powerpc: remove orphan systbl_chk.sh
  ...
2022-10-09 14:05:15 -07:00
Michael Ellerman
9474689020 powerpc: Don't add __powerpc_ prefix to syscall entry points
When using syscall wrappers the __SYSCALL_DEFINEx() and related macros
add a "__powerpc_" prefix to all syscall entry points.

So for example sys_mmap becomes __powerpc_sys_mmap.

This risks breaking workflows and tools that expect the old naming
scheme. At a minimum setting a breakpoint on eg. sys_mmap with gdb no
longer works.

There seems to be no compelling reason to add the "__powerpc_" prefix,
other than that it follows what some other arches do (x86, arm64, s390).

But unlike other arches powerpc doesn't always enable syscall wrappers,
so the syscall entry points can change name depending on CONFIG options.

For those reasons drop the "__powerpc_" prefix, reverting to the
existing naming.

Doing so reveals two prototypes in signal.h that have the incorrect type
when syscall wrappers are enabled. There are already prototypes for both
functions in syscalls.h, so drop the ones from signal.h.

Fixes: 7e92e01b72 ("powerpc: Provide syscall wrapper")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221006135940.1223988-1-mpe@ellerman.id.au
2022-10-07 00:59:54 +11:00
Nicholas Piggin
b2e82e495a powerpc/64s/interrupt: Fix stack frame regs marker
The value of the stack frame regs marker that gets saved on the stack in
interrupt entry code does not match the regs marker value, which breaks
stack frame marker matching.

This stray instruction looks to have been introduced in a mismerge.

Fixes: bf75a3258a ("powerpc/64s/interrupt: move early boot ILE fixup into a macro")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Mismerge by yours truly -_-]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221004132952.984341-1-npiggin@gmail.com
2022-10-05 11:09:22 +11:00
Nicholas Piggin
0fa6831811 powerpc/64: Fix msr_check_and_set/clear MSR[EE] race
irq soft-masking means that when Linux irqs are disabled, the MSR[EE]
value can change from 1 to 0 asynchronously: if a masked interrupt of
the PACA_IRQ_MUST_HARD_MASK variety fires while irqs are disabled,
the masked handler will return with MSR[EE]=0.

This means a sequence like mtmsr(mfmsr() | MSR_FP) is racy if it can
be called with local irqs disabled, unless a hard_irq_disable has been
done.

Reported-by: Sachin Sant <sachinp@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221004051157.308999-2-npiggin@gmail.com
2022-10-04 23:16:20 +11:00
Nicholas Piggin
8154850b28 powerpc/64s/interrupt: Change must-hard-mask interrupt check from BUG to WARN
This new assertion added is generally harmless and gets fixed up
naturally, but it does indicate a problem with MSR manipulation
somewhere.

Fixes: c39fb71a54 ("powerpc/64s/interrupt: masked handler debug check for previous hard disable")
Reported-by: Sachin Sant <sachinp@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221004051157.308999-1-npiggin@gmail.com
2022-10-04 23:16:20 +11:00
Masahiro Yamada
3216484550 kbuild: use obj-y instead extra-y for objects placed at the head
The objects placed at the head of vmlinux need special treatments:

 - arch/$(SRCARCH)/Makefile adds them to head-y in order to place
   them before other archives in the linker command line.

 - arch/$(SRCARCH)/kernel/Makefile adds them to extra-y instead of
   obj-y to avoid them going into built-in.a.

This commit gets rid of the latter.

Create vmlinux.a to collect all the objects that are unconditionally
linked to vmlinux. The objects listed in head-y are moved to the head
of vmlinux.a by using 'ar m'.

With this, arch/$(SRCARCH)/kernel/Makefile can consistently use obj-y
for builtin objects.

There is no *.o that is directly linked to vmlinux. Drop unneeded code
in scripts/clang-tools/gen_compile_commands.py.

$(AR) mPi needs 'T' to workaround the llvm-ar bug. The fix was suggested
by Nathan Chancellor [1].

[1]: https://lore.kernel.org/llvm/YyjjT5gQ2hGMH0ni@dev-arch.thelio-3990X/

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2022-10-02 18:04:05 +09:00
Michael Ellerman
5412297079 powerpc: Add device-tree model to the hardware description
Add the model of the machine we're on to the hardware description, which
is printed at boot and in case of an oops.

eg: Hardware name: IBM,8247-22L

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220930082709.55830-4-mpe@ellerman.id.au
2022-09-30 18:35:53 +10:00
Michael Ellerman
48b7019b6a powerpc/64: Add logical PVR to the hardware description
If we detect a logical PVR add that to the hardware description, which
is printed at boot and in case of an oops.

eg: Hardware name: ... 0xf000004

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220930082709.55830-3-mpe@ellerman.id.au
2022-09-30 18:35:53 +10:00
Michael Ellerman
bd649d40e0 powerpc: Add PVR & CPU name to hardware description
Add the PVR and CPU name to the hardware description, which is printed
at boot and in case of an oops.

eg: Hardware name: ... POWER8E (raw) 0x4b0201

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220930082709.55830-2-mpe@ellerman.id.au
2022-09-30 18:35:52 +10:00
Michael Ellerman
41dc056391 powerpc: Add hardware description string
Create a hardware description string, which we will use to record
various details of the hardware platform we are running on.

Print the accumulated description at boot, and use it to set the generic
description which is printed in oopses.

To begin with add ppc_md.name, aka the "machine description".

Example output at boot with the full series applied:

  Linux version 6.0.0-rc2-gcc-11.1.0-00199-g893f9007a5ce-dirty (michael@alpine1-p1) (powerpc64-linux-gcc (GCC) 11.1.0, GNU ld (GNU Binutils) 2.36.1) #844 SMP Thu Sep 29 22:29:53 AEST 2022
  Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1200 0xf000005 of:SLOF,git-5b4c5a pSeries
  printk: bootconsole [udbg0] enabled

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220930082709.55830-1-mpe@ellerman.id.au
2022-09-30 18:35:52 +10:00
Michael Ellerman
7673335e2a powerpc: Drops STABS_DEBUG from linker scripts
No toolchain we support should be generating stabs debug information
anymore. Drop the sections entirely from our linker scripts.

We removed all the manual stabs annotations in commit
1231816373 ("powerpc/32: Remove remaining .stabs annotations").

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220928130951.1732983-1-mpe@ellerman.id.au
2022-09-30 18:35:52 +10:00
Michael Ellerman
0c36099642 powerpc/64s: Remove lost/old comment
The bulk of this was moved/reworded in:
  57f266497d ("powerpc: Use gas sections for arranging exception vectors")

And now appears around line 700 in arch/powerpc/kernel/exceptions-64s.S.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220928130941.1732818-1-mpe@ellerman.id.au
2022-09-30 18:35:51 +10:00
Michael Ellerman
57a8e4b26e powerpc/64s: Remove old STAB comment
This used to be about the 0x4300 handler, but that was moved in commit
da2bc4644c ("powerpc/64s: Add new exception vector macros").

Note that "STAB" here refers to "Segment Table" not the debug format.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220928130912.1732466-1-mpe@ellerman.id.au
2022-09-30 18:35:51 +10:00
Nicholas Piggin
a08661af4c powerpc: remove orphan systbl_chk.sh
arch/powerpc/kernel/systbl_chk.sh has not been referenced since commit
ab66dcc76d ("powerpc: generate uapi header and system call table
files"). Remove it.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220929032120.3592593-1-npiggin@gmail.com
2022-09-30 18:35:51 +10:00
Peter Zijlstra
a1ebcd5943 Linux 6.0-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmMwwY4eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGdlwH/0ESzdb6F9zYWwHR
 E08har56/IfwjOsn1y+JuHibpwUjzskLzdwIfI5zshSZAQTj5/UyC0P7G/wcYh/Z
 INh1uHGazmDUkx4O3lwuWLR+mmeUxZRWdq4NTwYDRNPMSiPInVxz+cZJ7y0aPr2e
 wii7kMFRHgXmX5DMDEwuHzehsJF7vZrp8zBu2DqzVUGnbwD50nPbyMM3H4g9mute
 fAEpDG0X3+smqMaKL+2rK0W/Av/87r3U8ZAztBem3nsCJ9jT7hqMO1ICcKmFMviA
 DTERRMwWjPq+mBPE2CiuhdaXvNZBW85Ds81mSddS6MsO6+Tvuzfzik/zSLQJxlBi
 vIqYphY=
 =NqG+
 -----END PGP SIGNATURE-----

Merge branch 'v6.0-rc7'

Merge upstream to get RAPTORLAKE_S

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
2022-09-29 12:20:50 +02:00
Haren Myneni
335e1a9104 powerpc: Ignore DSI error caused by the copy/paste instruction
The data storage interrupt (DSI) error will be generated when the
paste operation is issued on the suspended Nest Accelerator (NX)
window due to NX state changes. The hypervisor expects the
partition to ignore this error during page fault handling.
To differentiate DSI caused by an actual HW configuration or by
the NX window, a new “ibm,pi-features” type value is defined.
Byte 0, bit 3 of pi-attribute-specifier-type is now defined to
indicate this DSI error. If this error is not ignored, the user
space can get SIGBUS when the NX request is issued.

This patch adds changes to read ibm,pi-features property and ignore
DSI error during page fault handling if MMU_FTR_NX_DSI is defined.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
[mpe: Mention PAPR version in comment]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b9cd844b85eb8f70459109ce1b14e44c4cc85fa7.camel@linux.ibm.com
2022-09-28 22:52:32 +10:00
Li Huafei
97f88a3d72 powerpc/kprobes: Fix null pointer reference in arch_prepare_kprobe()
I found a null pointer reference in arch_prepare_kprobe():

  # echo 'p cmdline_proc_show' > kprobe_events
  # echo 'p cmdline_proc_show+16' >> kprobe_events
  Kernel attempted to read user page (0) - exploit attempt? (uid: 0)
  BUG: Kernel NULL pointer dereference on read at 0x00000000
  Faulting instruction address: 0xc000000000050bfc
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
  Modules linked in:
  CPU: 0 PID: 122 Comm: sh Not tainted 6.0.0-rc3-00007-gdcf8e5633e2e #10
  NIP:  c000000000050bfc LR: c000000000050bec CTR: 0000000000005bdc
  REGS: c0000000348475b0 TRAP: 0300   Not tainted  (6.0.0-rc3-00007-gdcf8e5633e2e)
  MSR:  9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 88002444  XER: 20040006
  CFAR: c00000000022d100 DAR: 0000000000000000 DSISR: 40000000 IRQMASK: 0
  ...
  NIP arch_prepare_kprobe+0x10c/0x2d0
  LR  arch_prepare_kprobe+0xfc/0x2d0
  Call Trace:
    0xc0000000012f77a0 (unreliable)
    register_kprobe+0x3c0/0x7a0
    __register_trace_kprobe+0x140/0x1a0
    __trace_kprobe_create+0x794/0x1040
    trace_probe_create+0xc4/0xe0
    create_or_delete_trace_kprobe+0x2c/0x80
    trace_parse_run_command+0xf0/0x210
    probes_write+0x20/0x40
    vfs_write+0xfc/0x450
    ksys_write+0x84/0x140
    system_call_exception+0x17c/0x3a0
    system_call_vectored_common+0xe8/0x278
  --- interrupt: 3000 at 0x7fffa5682de0
  NIP:  00007fffa5682de0 LR: 0000000000000000 CTR: 0000000000000000
  REGS: c000000034847e80 TRAP: 3000   Not tainted  (6.0.0-rc3-00007-gdcf8e5633e2e)
  MSR:  900000000280f033 <SF,HV,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 44002408  XER: 00000000

The address being probed has some special:

  cmdline_proc_show: Probe based on ftrace
  cmdline_proc_show+16: Probe for the next instruction at the ftrace location

The ftrace-based kprobe does not generate kprobe::ainsn::insn, it gets
set to NULL. In arch_prepare_kprobe() it will check for:

  ...
  prev = get_kprobe(p->addr - 1);
  preempt_enable_no_resched();
  if (prev && ppc_inst_prefixed(ppc_inst_read(prev->ainsn.insn))) {
  ...

If prev is based on ftrace, 'ppc_inst_read(prev->ainsn.insn)' will occur
with a null pointer reference. At this point prev->addr will not be a
prefixed instruction, so the check can be skipped.

Check if prev is ftrace-based kprobe before reading 'prev->ainsn.insn'
to fix this problem.

Fixes: b4657f7650 ("powerpc/kprobes: Don't allow breakpoints on suffixes")
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
[mpe: Trim oops]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220923093253.177298-1-lihuafei1@huawei.com
2022-09-28 22:19:52 +10:00
Nathan Lynch
b37ac1894a powerpc/smp: poll cpu_callin_map more aggressively in __cpu_up()
At boot time, it is not necessary to delay between polls of
cpu_callin_map when waiting for a kicked CPU to come up. Remove the
delay intervals, but preserve the overall deadline (five seconds).

At run time, the first poll result is usually negative and we incur a
sleeping wait. If we spin on the callin word for a short time first,
we can reduce __cpu_up() from dozens of milliseconds to under 1ms in
the common case on a P9 LPAR:

$ ppc64_cpu --smt=off
$ bpftrace -e 'kprobe:__cpu_up {
                 @start[tid] = nsecs;
               }
               kretprobe:__cpu_up /@start[tid]/ {
                 @us = hist((nsecs - @start[tid]) / 1000);
                 delete(@start[tid]);
               }' -c 'ppc64_cpu --smt=on'

Before:

@us:
[16K, 32K)        85 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[32K, 64K)        13 |@@@@@@@                                             |

After:

@us:
[128, 256)        95 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[256, 512)         3 |@                                                   |

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220926220250.157022-1-nathanl@linux.ibm.com
2022-09-28 19:22:14 +10:00
Nathan Lynch
b8f3e48834 powerpc/rtas: block error injection when locked down
The error injection facility on pseries VMs allows corruption of
arbitrary guest memory, potentially enabling a sufficiently privileged
user to disable lockdown or perform other modifications of the running
kernel via the rtas syscall.

Block the PAPR error injection facility from being opened or called
when locked down.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Acked-by: Paul Moore <paul@paul-moore.com> (LSM)
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220926131643.146502-3-nathanl@linux.ibm.com
2022-09-28 19:22:14 +10:00
Nicholas Piggin
e1100cee05 powerpc/64s/interrupt: halt early boot interrupts if paca is not set up
Ensure r13 is zero from very early in boot until it gets set to the
boot paca pointer. This allows early program and mce handlers to halt
if there is no valid paca, rather than potentially run off into the
weeds. This preserves register and memory contents for low level
debugging tools.

Nothing could be printed to console at this point in any case because
even udbg is only set up after the boot paca is set, so this shouldn't
be missed.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220926055620.2676869-6-npiggin@gmail.com
2022-09-28 19:22:13 +10:00
Nicholas Piggin
519b2e317e powerpc/64: don't set boot CPU's r13 to paca until the structure is set up
The idea is to get to the point where if r13 is non-zero, then it should
contain a reasonable paca. This can be used in early boot program check
and machine check handlers to avoid running off into the weeds if they
hit before r13 has a paca.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220926055620.2676869-5-npiggin@gmail.com
2022-09-28 19:22:13 +10:00
Nicholas Piggin
b830c8754e powerpc/64: avoid using r13 in relocate
relocate() uses r13 in early boot before it is used for the paca. Use
a different register for this so r13 is kept unchanged until it is
set to the paca pointer.

Avoid r14 as well while we're here, there's no reason not to use the
volatile registers which is a bit less surprising, and r14 could be used
as another fixed reg one day.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220926055620.2676869-4-npiggin@gmail.com
2022-09-28 19:22:13 +10:00
Nicholas Piggin
2f5182cffa powerpc/64s: early boot machine check handler
Use the early boot interrupt fixup in the machine check handler to allow
the machine check handler to run before interrupt endian is set up.
Branch to an early boot handler that just does a basic crash, which
allows it to run before ppc_md is set up. MSR[ME] is enabled on the boot
CPU earlier, and the machine check stack is temporarily set to the
middle of the init task stack.

This allows machine checks (e.g., due to invalid data access in real
mode) to print something useful earlier in boot (as soon as udbg is set
up, if CONFIG_PPC_EARLY_DEBUG=y).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220926055620.2676869-3-npiggin@gmail.com
2022-09-28 19:22:13 +10:00
Nicholas Piggin
bf75a3258a powerpc/64s/interrupt: move early boot ILE fixup into a macro
In preparation for using this sequence in machine check interrupt, move
it into a macro, with a small change to make it position independent.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220926055620.2676869-2-npiggin@gmail.com
2022-09-28 19:22:12 +10:00
Nicholas Piggin
3569d84bb2 powerpc/64e: provide an addressing macro for use with TOC in alternate register
The interrupt entry code carefully saves a minimal number of registers,
so in some places the TOC is required, it is loaded into a different
register, so provide a macro that can supply an alternate TOC register.

This continues to use got addressing because TOC-relative results in
"got/toc optimization is not supported" messages by the linker. Having
r2 be one of the saved registers and using that for TOC addressing may
be the best way to avoid that and switch this to TOC addressing.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220926034057.2360083-6-npiggin@gmail.com
2022-09-28 19:22:12 +10:00
Nicholas Piggin
8e93fb33c8 powerpc/64: provide a helper macro to load r2 with the kernel TOC
A later change stops the kernel using r2 and loads it with a poison
value.  Provide a PACATOC loading abstraction which can hide this
detail.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220926034057.2360083-5-npiggin@gmail.com
2022-09-28 19:22:12 +10:00
Nicholas Piggin
dab3b8f4fd powerpc/64: asm use consistent global variable declaration and access
Use helper macros to access global variables, and place them in .data
sections rather than in .toc. Putting addresses in TOC is not required
because the kernel is linked with a single TOC.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220926034057.2360083-3-npiggin@gmail.com
2022-09-28 19:22:12 +10:00
Nicholas Piggin
17773afdcd powerpc/64: use 32-bit immediate for STACK_FRAME_REGS_MARKER
Using a 32-bit constant for this marker allows it to be loaded with
two ALU instructions, like 32-bit. This avoids a TOC entry and a
TOC load that depends on the r2 value that has just been loaded from
the PACA.

This changes the value for 32-bit as well, so both have the same
value in the low 4 bytes and 64-bit has 0 in the top bytes.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220926034057.2360083-2-npiggin@gmail.com
2022-09-28 19:22:12 +10:00
Nicholas Piggin
1da5351f9e powerpc/64/irq: tidy soft-masked irq replay and improve documentation
irq replay is quite complicated because of softirq processing which
itself enables and disables irqs. Several considerations need to be
accounted for due to this, and they are not clearly documented.

Refactor the irq replay code a bit to tidy and deduplicate some common
functions. Add comments, debug checks.

This has a minor functional change that irq tracing enable/disable is
done after each interrupt replayed, rather than after a batch. It also
re-sets state to IRQS_ALL_DISABLED after an interrupt, which doesn't
matter much because interrupts are hard disabled at this point, but it
is more consistent with how interrupt handlers are called.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220926054305.2671436-8-npiggin@gmail.com
2022-09-28 19:22:11 +10:00
Nicholas Piggin
c39fb71a54 powerpc/64s/interrupt: masked handler debug check for previous hard disable
Prior changes eliminated cases of masked PACA_IRQ_MUST_HARD_MASK
interrupts that re-fire due to MSR[EE] being enabled while they are
pending. Add a debug check in the masked interrupt handler to catch
if this occurs.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220926054305.2671436-6-npiggin@gmail.com
2022-09-28 19:22:11 +10:00
Nicholas Piggin
e485f6c751 powerpc/64/interrupt: Fix return to masked context after hard-mask irq becomes pending
If a synchronous interrupt (e.g., hash fault) is taken inside an
irqs-disabled region which has MSR[EE]=1, then an asynchronous interrupt
that is PACA_IRQ_MUST_HARD_MASK (e.g., PMI) is taken inside the
synchronous interrupt handler, then the synchronous interrupt will
return with MSR[EE]=1 and the asynchronous interrupt fires again.

If the asynchronous interrupt is a PMI and the original context does not
have PMIs disabled (only Linux IRQs), the asynchronous interrupt will
fire despite having the PMI marked soft pending. This can confuse the
perf code and cause warnings.

This patch changes the interrupt return so that irqs-disabled MSR[EE]=1
contexts will be returned to with MSR[EE]=0 if a PACA_IRQ_MUST_HARD_MASK
interrupt has become pending in the meantime.

The longer explanation for what happens:
1. local_irq_disable()
2. Hash fault interrupt fires, do_hash_fault handler runs
3. interrupt_enter_prepare() sets IRQS_ALL_DISABLED
4. interrupt_enter_prepare() sets MSR[EE]=1
5. PMU interrupt fires, masked handler runs
6. Masked handler marks PMI pending
7. Masked handler returns with PACA_IRQ_HARD_DIS set, MSR[EE]=0
8. do_hash_fault interrupt return handler runs
9. interrupt_exit_kernel_prepare() clears PACA_IRQ_HARD_DIS
10. interrupt returns with MSR[EE]=1
11. PMU interrupt fires, perf handler runs

Fixes: 4423eb5ae3 ("powerpc/64/interrupt: make normal synchronous interrupts enable MSR[EE] if possible")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220926054305.2671436-4-npiggin@gmail.com
2022-09-28 19:22:11 +10:00
Nicholas Piggin
799f7063c7 powerpc/64: mark irqs hard disabled in boot paca
This prevents interrupts in early boot (e.g., program check) from
enabling MSR[EE], potentially causing endian mismatch or other
crashes when reporting early boot traps.

Fixes: 4423eb5ae3 ("powerpc/64/interrupt: make normal synchronous interrupts enable MSR[EE] if possible")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220926054305.2671436-3-npiggin@gmail.com
2022-09-28 19:22:11 +10:00
Nicholas Piggin
dabeb572ad powerpc: add ISA v3.0 / v3.1 wait opcode macro
The wait instruction encoding changed between ISA v2.07 and ISA v3.0.
In v3.1 the instruction gained a new field.

Update the PPC_WAIT macro to the current encoding. Rename the older
incompatible one with a _v203 suffix as it was introduced in v2.03
(the WC field was introduced in v2.07 but the kernel only uses WC=0).

Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220920122259.363092-1-npiggin@gmail.com
2022-09-28 19:22:10 +10:00
Nicholas Piggin
c84550203b powerpc/time: avoid programming DEC at the start of the timer interrupt
Setting DEC to maximum at the start of the timer interrupt is not
necessary and can be avoided for performance when MSR[EE] is not
enabled during the handler as explained in commit 0faf20a1ad
("powerpc/64s/interrupt: Don't enable MSR[EE] in irq handlers unless
perf is in use"), where this change was first attempted.

The idea is that the timer interrupt runs with MSR[EE]=0, and at the end
of the interrupt DEC is programmed to the next timer interval, so there
is no need to clear the decrementer exception before then.

When the above commit was merged, that was not quite true. The low res
timer subsystem had some cases in the oneshot timer code where if the
tick was to be stopped and no timers active, the clock device would not
get the ->set_state_oneshot_stopped() call, so DEC would not get
reprogrammed, and this would hang taking continual timer interrupts.

So this was reverted in commit d2b9be1f4a ("powerpc/time: Always set
decrementer in timer_interrupt()"), which was a partial revert of the
above commit.

Commit 62c1256d54 ("timers/nohz: Switch to ONESHOT_STOPPED in the
low-res handler when the tick is stopped") was later merged to fix this
missing case in the timer subsystem, so now the behaviour can be
restored.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220909142457.278032-1-npiggin@gmail.com
2022-09-28 19:22:09 +10:00
Pali Rohár
b19448fe84 powerpc: Add support for early debugging via Serial 16550 console
Currently powerpc early debugging contains lot of platform specific
options, but does not support standard UART / serial 16550 console.

Later legacy_serial.c code supports registering UART as early debug console
from device tree but it is not early during booting, but rather later after
machine description code finishes.

So for real early debugging via UART is current code unsuitable.

Add support for new early debugging option CONFIG_PPC_EARLY_DEBUG_16550
which enable Serial 16550 console on address defined by new option
CONFIG_PPC_EARLY_DEBUG_16550_PHYSADDR and by stride by option
CONFIG_PPC_EARLY_DEBUG_16550_STRIDE.

With this change it is possible to debug powerpc machine descriptor code.
For example this early debugging code can print on serial console also
"No suitable machine description found" error which is done before
legacy_serial.c code.

Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220822231501.16827-1-pali@kernel.org
2022-09-28 19:22:09 +10:00
Rohan McLure
7e92e01b72 powerpc: Provide syscall wrapper
Implement syscall wrapper as per s390, x86, arm64. When enabled
cause handlers to accept parameters from a stack frame rather than
from user scratch register state. This allows for user registers to be
safely cleared in order to reduce caller influence on speculation
within syscall routine. The wrapper is a macro that emits syscall
handler symbols that call into the target handler, obtaining its
parameters from a struct pt_regs on the stack.

As registers are already saved to the stack prior to calling
system_call_exception, it appears that this function is executed more
efficiently with the new stack-pointer convention than with parameters
passed by registers, avoiding the allocation of a stack frame for this
method. On a 32-bit system, we see >20% performance increases on the
null_syscall microbenchmark, and on a Power 8 the performance gains
amortise the cost of clearing and restoring registers which is
implemented at the end of this series, seeing final result of ~5.6%
performance improvement on null_syscall.

Syscalls are wrapped in this fashion on all platforms except for the
Cell processor as this commit does not provide SPU support. This can be
quickly fixed in a successive patch, but requires spu_sys_callback to
allocate a pt_regs structure to satisfy the wrapped calling convention.

Co-developed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmai.com>
[mpe: Make incompatible with COMPAT to retain clearing of high bits of args]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220921065605.1051927-22-rmclure@linux.ibm.com
2022-09-28 19:22:09 +10:00
Rohan McLure
f8971c627b powerpc: Change system_call_exception calling convention
Change system_call_exception arguments to pass a pointer to a stack
frame container caller state, as well as the original r0, which
determines the number of the syscall. This has been observed to yield
improved performance to passing them by registers, circumventing the
need to allocate a stack frame.

Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Retain clearing of high bits of args for compat tasks]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220921065605.1051927-21-rmclure@linux.ibm.com
2022-09-28 19:22:09 +10:00
Rohan McLure
8640de0dee powerpc: Use common syscall handler type
Cause syscall handlers to be typed as follows when called indirectly
throughout the kernel. This is to allow for better type checking.

typedef long (*syscall_fn)(unsigned long, unsigned long, unsigned long,
                           unsigned long, unsigned long, unsigned long);

Since both 32 and 64-bit abis allow for at least the first six
machine-word length parameters to a function to be passed by registers,
even handlers which admit fewer than six parameters may be viewed as
having the above type.

Coercing syscalls to syscall_fn requires a cast to void* to avoid
-Wcast-function-type.

Fixup comparisons in VDSO to avoid pointer-integer comparison. Introduce
explicit cast on systems with SPUs.

Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220921065605.1051927-19-rmclure@linux.ibm.com
2022-09-28 19:22:09 +10:00
Rohan McLure
39859aea41 powerpc: Enable compile-time check for syscall handlers
The table of syscall handlers and registered compatibility syscall
handlers has in past been produced using assembly, with function
references resolved at link time. This moves link-time errors to
compile-time, by rewriting systbl.S in C, and including the
linux/syscalls.h, linux/compat.h and asm/syscalls.h headers for
prototypes.

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220921065605.1051927-18-rmclure@linux.ibm.com
2022-09-28 19:22:09 +10:00
Rohan McLure
8cd1def4b8 powerpc: Include all arch-specific syscall prototypes
Forward declare all syscall handler prototypes where a generic prototype
is not provided in either linux/syscalls.h or linux/compat.h in
asm/syscalls.h. This is required for compile-time type-checking for
syscall handlers, which is implemented later in this series.

32-bit compatibility syscall handlers are expressed in terms of types in
ppc32.h. Expose this header globally.

Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Use standard include guard naming for syscalls_32.h]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220921065605.1051927-17-rmclure@linux.ibm.com
2022-09-28 19:22:08 +10:00
Rohan McLure
dec20c50df powerpc: Adopt SYSCALL_DEFINE for arch-specific syscall handlers
Arch-specific implementations of syscall handlers are currently used
over generic implementations for the following reasons:

1. Semantics unique to powerpc
2. Compatibility syscalls require 'argument padding' to comply with
   64-bit argument convention in ELF32 abi.
3. Parameter types or order is different in other architectures.

These syscall handlers have been defined prior to this patch series
without invoking the SYSCALL_DEFINE or COMPAT_SYSCALL_DEFINE macros with
custom input and output types. We remove every such direct definition in
favour of the aforementioned macros.

Also update syscalls.tbl in order to refer to the symbol names generated
by each of these macros. Since ppc64_personality can be called by both
64 bit and 32 bit binaries through compatibility, we must generate both
both compat_sys_ and sys_ symbols for this handler.

As an aside:
A number of architectures including arm and powerpc agree on an
alternative argument order and numbering for most of these arch-specific
handlers. A future patch series may allow for asm/unistd.h to signal
through its defines that a generic implementation of these syscall
handlers with the correct calling convention be emitted, through the
__ARCH_WANT_COMPAT_SYS_... convention.

Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220921065605.1051927-16-rmclure@linux.ibm.com
2022-09-28 19:22:08 +10:00
Rohan McLure
ac17defbeb powerpc: Provide do_ppc64_personality helper
Avoid duplication in future patch that will define the ppc64_personality
syscall handler in terms of the SYSCALL_DEFINE and COMPAT_SYSCALL_DEFINE
macros, by extracting the common body of ppc64_personality into a helper
function.

Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220921065605.1051927-15-rmclure@linux.ibm.com
2022-09-28 19:22:08 +10:00
Rohan McLure
b7fa9ce86d powerpc: Remove direct call to mmap2 syscall handlers
Syscall handlers should not be invoked internally by their symbol names,
as these symbols defined by the architecture-defined SYSCALL_DEFINE
macro. Move the compatibility syscall definition for mmap2 to
syscalls.c, so that all mmap implementations can share a helper function.

Remove 'inline' on static mmap helper.

Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Fix compat_sys_mmap2() prototype and offset handling]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220921065605.1051927-14-rmclure@linux.ibm.com
2022-09-28 19:21:26 +10:00
Matthew Wilcox (Oracle)
405e669172 powerpc: remove mmap linked list walks
Use the VMA iterator instead.

Link: https://lkml.kernel.org/r/20220906194824.2110408-34-Liam.Howlett@oracle.com
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Tested-by: Yu Zhao <yuzhao@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-26 19:46:19 -07:00
Rohan McLure
4df0221f9d powerpc: Remove direct call to personality syscall handler
Syscall handlers should not be invoked internally by their symbol names,
as these symbols defined by the architecture-defined SYSCALL_DEFINE
macro. Fortunately, in the case of ppc64_personality, its call to
sys_personality can be replaced with an invocation to the
equivalent ksys_personality inline helper in <linux/syscalls.h>.

Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220921065605.1051927-13-rmclure@linux.ibm.com
2022-09-26 23:00:16 +10:00
Rohan McLure
b6b1334c95 powerpc/32: Remove powerpc select specialisation
Syscall #82 has been implemented for 32-bit platforms in a unique way on
powerpc systems. This hack will in effect guess whether the caller is
expecting new select semantics or old select semantics. It does so via a
guess, based off the first parameter. In new select, this parameter
represents the length of a user-memory array of file descriptors, and in
old select this is a pointer to an arguments structure.

The heuristic simply interprets sufficiently large values of its first
parameter as being a call to old select. The following is a discussion
on how this syscall should be handled.


As discussed in this thread, the existence of such a hack suggests that for
whatever powerpc binaries may predate glibc, it is most likely that they
would have taken use of the old select semantics. x86 and arm64 both
implement this syscall with oldselect semantics.

Remove the powerpc implementation, and update syscall.tbl to refer to emit
a reference to sys_old_select and compat_sys_old_select
for 32-bit binaries, in keeping with how other architectures support
syscall #82.

Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/lkml/13737de5-0eb7-e881-9af0-163b0d29a1a0@csgroup.eu/
Link: https://lore.kernel.org/r/20220921065605.1051927-12-rmclure@linux.ibm.com
2022-09-26 23:00:15 +10:00
Rohan McLure
c2e7a19827 powerpc: Use generic fallocate compatibility syscall
The powerpc fallocate compat syscall handler is identical to the
generic implementation provided by commit 59c10c52f5 ("riscv:
compat: syscall: Add compat_sys_call_table implementation"), and as
such can be removed in favour of the generic implementation.

A future patch series will replace more architecture-defined syscall
handlers with generic implementations, dependent on introducing generic
implementations that are compatible with powerpc and arm's parameter
reorderings.

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220921065605.1051927-11-rmclure@linux.ibm.com
2022-09-26 23:00:15 +10:00
Rohan McLure
016ff72bd2 powerpc: Fix fallocate and fadvise64_64 compat parameter combination
As reported[1] by Arnd, the arch-specific fadvise64_64 and fallocate
compatibility handlers assume parameters are passed with 32-bit
big-endian ABI. This affects the assignment of odd-even parameter pairs
to the high or low words of a 64-bit syscall parameter.

Fix fadvise64_64 fallocate compat handlers to correctly swap upper/lower
32 bits conditioned on endianness.

A future patch will replace the arch-specific compat fallocate with an
asm-generic implementation. This patch is intended for ease of
back-port.

[1]: https://lore.kernel.org/all/be29926f-226e-48dc-871a-e29a54e80583@www.fastmail.com/

Fixes: 57f48b4b74 ("powerpc/compat_sys: swap hi/lo parts of 64-bit syscall args in LE mode")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220921065605.1051927-9-rmclure@linux.ibm.com
2022-09-26 23:00:15 +10:00
Rohan McLure
620f5c59c8 powerpc/64s: Fix comment on interrupt handler prologue
Interrupt handlers on 64s systems will often need to save register state
from the interrupted process to make space for loading special purpose
registers or for internal state.

Fix a comment documenting a common code path macro in the beginning of
interrupt handlers where r10 is saved to the PACA to afford space for
the value of the CFAR. Comment is currently written as if r10-r12 are
saved to PACA, but in fact only r10 is saved, with r11-r12 saved much
later. The distance in code between these saves has grown over the many
revisions of this macro. Fix this by signalling with a comment where
r11-r12 are saved to the PACA.

Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reported-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220921065605.1051927-8-rmclure@linux.ibm.com
2022-09-26 23:00:15 +10:00
Rohan McLure
53ecaa6778 powerpc/64e: Clarify register saves and clears with {SAVE,ZEROIZE}_GPRS
The common interrupt handler prologue macro and the bad_stack
trampolines include consecutive sequences of register saves, and some
register clears. Neaten such instances by expanding use of the SAVE_GPRS
macro and employing the ZEROIZE_GPR macro when appropriate.

Also simplify an invocation of SAVE_GPRS targetting all non-volatile
registers to SAVE_NVGPRS.

Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reported-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220921065605.1051927-7-rmclure@linux.ibm.com
2022-09-26 23:00:15 +10:00
Rohan McLure
15ba74502c powerpc/32: Clarify interrupt restores with REST_GPR macro in entry_32.S
Restoring the register state of the interrupted thread involves issuing
a large number of predictable loads to the kernel stack frame. Issue the
REST_GPR{,S} macros to clearly signal when this is happening, and bunch
together restores at the end of the interrupt handler where the saved
value is not consumed earlier in the handler code.

Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220921065605.1051927-6-rmclure@linux.ibm.com
2022-09-26 23:00:15 +10:00
Rohan McLure
2b1dac4b5f powerpc/64s: Use {ZEROIZE,SAVE,REST}_GPRS macros in sc, scv 0 handlers
Use the convenience macros for saving/clearing/restoring gprs in keeping
with syscall calling conventions. The plural variants of these macros
can store a range of registers for concision.

This works well when the user gpr value we are hoping to save is still
live. In the syscall interrupt handlers, user register state is
sometimes juggled between registers. Hold-off from issuing the SAVE_GPR
macro for applicable neighbouring lines to highlight the delicate
register save logic.

Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220921065605.1051927-5-rmclure@linux.ibm.com
2022-09-26 23:00:15 +10:00
Rohan McLure
2c27d4a419 powerpc: Save caller r3 prior to system_call_exception
This reverts commit 8875f47b76 ("powerpc/syscall: Save r3 in regs->orig_r3
").

Save caller's original r3 state to the kernel stackframe before entering
system_call_exception. This allows for user registers to be cleared by
the time system_call_exception is entered, reducing the influence of
user registers on speculation within the kernel.

Prior to this commit, orig_r3 was saved at the beginning of
system_call_exception. Instead, save orig_r3 while the user value is
still live in r3.

Also replicate this early save in 32-bit. A similar save was removed in
commit 6f76a01173 ("powerpc/syscall: implement system call entry/exit
logic in C for PPC32") when 32-bit adopted system_call_exception. Revert
its removal of orig_r3 saves.

Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220921065605.1051927-3-rmclure@linux.ibm.com
2022-09-26 23:00:14 +10:00
Rohan McLure
5ba6c9a912 powerpc: Remove asmlinkage from syscall handler definitions
The asmlinkage macro has no special meaning in powerpc, and prior to
this patch is used sporadically on some syscall handler definitions. On
architectures that do not define asmlinkage, it resolves to extern "C"
for C++ compilers and a nop otherwise. The current invocations of
asmlinkage provide far from complete support for C++ toolchains, and so
the macro serves no purpose in powerpc.

Remove all invocations of asmlinkage in arch/powerpc. These incidentally
only occur in syscall definitions and prototypes.

Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220921065605.1051927-2-rmclure@linux.ibm.com
2022-09-26 23:00:14 +10:00
Christophe Leroy
6556fd1a1e powerpc: Cleanup idle for e500
e500 idle setup is a bit messy.

e500_idle() is used for PPC32 while book3e_idle() is used for PPC64.
As they are mutually exclusive, call them all e500_idle().

Use CONFIG_MPC_85xx instead of PPC32 + E500 in Makefile and rename
idle_e500.c to idle_85xx.c .

Rename idle_book3e.c to idle_64e.c and remove #ifdef PPC64 in
as it's only built on PPC64.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/8039301334e948974c85ec5ef2db37751075185b.1663606876.git.christophe.leroy@csgroup.eu
2022-09-26 23:00:14 +10:00
Christophe Leroy
aa5f59df20 powerpc: Remove CONFIG_PPC_BOOK3E_MMU
CONFIG_PPC_BOOK3E_MMU is redundant with CONFIG_PPC_E500.

Remove it.

Also rename mmu-book3e.h to mmu-e500.h

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c5549cd59a131204ff94ab909cad2e2dad4ddf2f.1663606876.git.christophe.leroy@csgroup.eu
2022-09-26 23:00:14 +10:00
Christophe Leroy
3e7318584d powerpc: Remove CONFIG_PPC_FSL_BOOK3E
CONFIG_PPC_FSL_BOOK3E is redundant with CONFIG_PPC_E500.

Remove it.

And rename five files accordingly.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Rename include guards to match new file names]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/795cb93b88c9a0279289712e674f39e3b108a1b4.1663606876.git.christophe.leroy@csgroup.eu
2022-09-26 23:00:13 +10:00
Christophe Leroy
688de017ef powerpc: Change CONFIG_E500 to CONFIG_PPC_E500
It will be used outside arch/powerpc, make it clear its a
powerpc configuration item.

And we already have CONFIG_PPC_E500MC, so that will make
it more consistent.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e63b22083c11c4300f4a82d3123a46e5fdd54fa6.1663606876.git.christophe.leroy@csgroup.eu
2022-09-26 23:00:13 +10:00
Christophe Leroy
e0d68273d7 powerpc: Remove CONFIG_PPC_BOOK3E
CONFIG_PPC_BOOK3E is redundant with CONFIG_PPC_BOOK3E_64.

The later is more explicit about the fact that it's a 64 bits target.

Remove CONFIG_PPC_BOOK3E.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/5d0891490813c19cdcfc04678f512ea68cba3e64.1663606876.git.christophe.leroy@csgroup.eu
2022-09-26 23:00:13 +10:00
Christophe Leroy
d7216567c6 powerpc/cputable: Split cpu_specs[] for mpc85xx and e500mc
e500v1/v2 and e500mc are said to be mutually exclusive in Kconfig.

Split e500 cpu_specs[] and then restrict the non e500mc to PPC32
which is then 85xx.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Tweak formatting]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/553b901ea91e393df231103da4b018e9b251b0e9.1663606876.git.christophe.leroy@csgroup.eu
2022-09-26 23:00:05 +10:00
Christophe Leroy
dfc3095cec powerpc: Remove CONFIG_FSL_BOOKE
PPC_85xx is PPC32 only.
PPC_85xx always selects E500 and is the only PPC32 that
selects E500.
FSL_BOOKE is selected when E500 and PPC32 are selected.

So FSL_BOOKE is redundant with PPC_85xx.

Remove FSL_BOOKE.

And rename four files accordingly.

cpu_setup_fsl_booke.S is not renamed because it is linked to
PPC_FSL_BOOK3E and not to FSL_BOOKE as suggested by its name.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/08e3e15594e66d63b9e89c5b4f9c35153913c28f.1663606875.git.christophe.leroy@csgroup.eu
2022-09-26 22:47:37 +10:00
Christophe Leroy
e320a76db4 powerpc/cputable: Split cpu_specs[] out of cputable.h
cpu_specs[] is full of #ifdefs depending on the different
types of CPU.

CPUs are mutually exclusive, it is therefore possible to split
cpu_specs[] into smaller more readable pieces.

Create cpu_specs_XXX.h that will each be dedicated on one
of the following mutually exclusive families:
- 40x
- 44x
- 47x
- 8xx
- e500
- book3s/32
- book3s/64

In book3s/32, the block for 603 has been moved in front in order
to not have two 604 blocks.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Fix CONFIG_47x to be CONFIG_PPC_47x, tweak some formatting]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/a44b865e0318286155273b10cdf524ab697928c1.1663606875.git.christophe.leroy@csgroup.eu
2022-09-26 22:47:13 +10:00
Christophe Leroy
76b719881a powerpc/cputable: Move __cpu_setup() prototypes out of cputable.h
Move all prototypes out of cputable.h

For that rename cpu_setup_power.h to cpu_setup.h and move all
prototypes in it.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Standardise cpu_spec *spec formatting]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f45118489ee450db654db8bbcdfd8f5907337c22.1663606875.git.christophe.leroy@csgroup.eu
2022-09-26 22:26:49 +10:00
Christophe Leroy
afd2288a4c powerpc/cputable: Remove __machine_check_early_realmode_p{7/8/9} prototypes
__machine_check_early_realmode_p{7/8/9} are already in mce.h
which is included. Remove them from cputable.c

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b77fc0f90e3a9c065324cbff549b718ccf0809f8.1663606875.git.christophe.leroy@csgroup.eu
2022-09-26 21:00:42 +10:00
Christophe Leroy
b6100bedf1 powerpc/64e: Remove unnecessary #ifdef CONFIG_PPC_FSL_BOOK3E
CONFIG_PPC_BOOK3E_64 implies CONFIG_PPC_FSL_BOOK3E so no need of
additional #ifdefs in files built exclusively for CONFIG_PPC_BOOK3E_64.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/df16255c13b63b0221c9be63b94a6864bed22c12.1663606875.git.christophe.leroy@csgroup.eu
2022-09-26 21:00:41 +10:00
Christophe Leroy
0069f3d14e powerpc/64e: Tie PPC_BOOK3E_64 to PPC_E500MC
The only 64-bit Book3E CPUs we support require the selection
of CONFIG_PPC_E500MC.

However our Kconfig allows configurating a kernel that has 64-bit
Book3E support, but without CONFIG_PPC_E500MC enabled. Such a kernel
would never boot, it doesn't know about any CPUs.

To fix this, force CONFIG_PPC_E500MC to be selected whenever we are
building a 64-bit Book3E kernel.

And add a test to detect future situations where cpu_specs is empty.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/ae5d8b8b3ccc346e61d2ec729767f92766273f0b.1663606875.git.christophe.leroy@csgroup.eu
2022-09-26 21:00:41 +10:00
David Hildenbrand
c4167aec98 powerpc/prom_init: drop PROM_BUG()
Unused, let's drop it.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220920122302.99195-3-david@redhat.com
2022-09-26 20:58:18 +10:00
Michael Ellerman
456c300510 powerpc/microwatt: Remove unused early debug code
The original microwatt submission[1] included some early debug code for
using the Microwatt "potato" UART.

The series that was eventually merged switched to using a standard UART,
and so doesn't need any special early debug handling. But some of the
original code was merged accidentally under the non-existent
CONFIG_PPC_EARLY_DEBUG_MICROWATT.

Drop the unused code.

1: https://lore.kernel.org/linuxppc-dev/20200509050340.GD1464954@thinks.paulus.ozlabs.org/

Fixes: 48b545b801 ("powerpc/microwatt: Use standard 16550 UART for console")
Reported-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220919052755.800907-1-mpe@ellerman.id.au
2022-09-26 20:58:17 +10:00
Michael Ellerman
e74611aa91 powerpc/64: Remove unused SYS_CALL_TABLE symbol
In interrupt_64.S, formerly entry_64.S, there are two toc entries
created for sys_call_table and compat_sys_call_table.

These are no longer used, since the system call entry was converted from
asm to C, so remove them.

Fixes: 68b34588e2 ("powerpc/64/sycall: Implement syscall entry/exit logic in C")
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220913124545.2817825-1-mpe@ellerman.id.au
2022-09-26 20:58:17 +10:00
Nicholas Piggin
fdfdcfd504 powerpc/build: put sys_call_table in .data.rel.ro if RELOCATABLE
Const function pointers by convention live in .data.rel.ro if they need
to be relocated. Now that .data.rel.ro is linked into the read-only
region, put them in the right section. This doesn't make much practical
difference, but it will make the C conversion of sys_call_table a
smaller change as far as linking goes.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220916040755.2398112-8-npiggin@gmail.com
2022-09-26 20:58:17 +10:00
Nicholas Piggin
1e9eca485a powerpc/64/build: merge .got and .toc input sections
Follow the binutils ld internal linker script and merge .got and .toc
input sections in the .got output section.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220916040755.2398112-7-npiggin@gmail.com
2022-09-26 20:58:17 +10:00
Nicholas Piggin
c787fed118 powerpc/64/build: only include .opd with ELFv1
ELFv2 does not use function descriptors so .opd is not required.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220916040755.2398112-6-npiggin@gmail.com
2022-09-26 20:58:17 +10:00
Nicholas Piggin
b6adc6d6d3 powerpc/build: move .data.rel.ro, .sdata2 to read-only
.sdata2 is a readonly small data section for ppc32, and .data.rel.ro
is data that needs relocating but is read-only after that so these
can both be moved to the read only memory region.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220916040755.2398112-5-npiggin@gmail.com
2022-09-26 20:58:17 +10:00
Nicholas Piggin
f21ba4499a powerpc/build: move got, toc, plt, branch_lt sections to read-only
This moves linker-related tables from .data to read-only area.
Relocations are performed at early boot time before memory is protected,
after which there should be no modifications required.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Don't use SPECIAL as reported by lkp@intel.com]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220916040755.2398112-4-npiggin@gmail.com
2022-09-26 20:58:17 +10:00
Nicholas Piggin
1faa1235c1 powerpc/32/build: move got1/got2 sections out of text
Following the example from the binutils default linker script, move
.got1 and .got2 out of .text, to just after RO_DATA.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220916040755.2398112-3-npiggin@gmail.com
2022-09-26 20:58:17 +10:00
Nicholas Piggin
7082f8e7d2 powerpc: move __end_rodata to cover arch read-only sections
powerpc has a number of read-only sections and tables that are put after
RO_DATA(). Move the __end_rodata symbol to cover these as well.

Setting memory to read-only at boot is done using __init_begin, change
that to use __end_rodata.

This makes is_kernel_rodata() exactly cover the read-only region, as
well as other things using __end_rodata (e.g., kernel/dma/debug.c).
Boot dmesg also prints the rodata size more accurately.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220916040755.2398112-2-npiggin@gmail.com
2022-09-26 20:58:16 +10:00
Michael Ellerman
b150a4d12b powerpc/vmlinux.lds: Add an explicit symbol for the SRWX boundary
Currently __init_begin is used as the boundary for strict RWX between
executable/read-only text and data, and non-executable (after boot) code
and data.

But that's a little subtle, so add an explicit symbol to document that
the SRWX boundary lies there, and add a comment making it clear that
__init_begin must also begin there.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220916131422.318752-2-mpe@ellerman.id.au
2022-09-26 20:58:16 +10:00
Michael Ellerman
331771e836 powerpc/vmlinux.lds: Ensure STRICT_ALIGN_SIZE is at least page aligned
Add a check that STRICT_ALIGN_SIZE is aligned to at least PAGE_SIZE.

That then makes the alignment to PAGE_SIZE immediately after the
alignment to STRICT_ALIGN_SIZE redundant, so remove it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220916131422.318752-1-mpe@ellerman.id.au
2022-09-26 20:58:16 +10:00
Michael Ellerman
0c32903197 powerpc/64: Remove unused prom_init_toc symbols
Commit 24d33ac5b8 ("powerpc/64s: Make prom_init require RELOCATABLE")
made prom_init depend on CONFIG_RELOCATABLE.

But it missed cleaning up a case in the linker script for RELOCATABLE=n,
and associated symbols. Remove them now.

Fixes: 24d33ac5b8 ("powerpc/64s: Make prom_init require RELOCATABLE")
Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220920131157.1032707-1-mpe@ellerman.id.au
2022-09-26 20:58:16 +10:00
Michael Ellerman
ec5c3a359c Merge branch 'fixes' into next
Merge our fixes branch to bring in a few things that new feature patches
rely on or conflict with.
2022-09-24 00:40:14 +10:00
Yury Norov
546a073d62 powerpc/64: don't refer nr_cpu_ids in asm code when it's undefined
generic_secondary_common_init() calls LOAD_REG_ADDR(r7, nr_cpu_ids)
conditionally on CONFIG_SMP. However, if 'NR_CPUS == 1', kernel doesn't
use the nr_cpu_ids, and in C code, it's just:
  #if NR_CPUS == 1
  #define nr_cpu_ids
  ...

This series makes declaration of nr_cpu_ids conditional on NR_CPUS == 1,
and that reveals the issue, because compiler can't link the
LOAD_REG_ADDR(r7, nr_cpu_ids) against nonexisting symbol.

Current code looks unsafe for those who build kernel with CONFIG_SMP=y and
NR_CPUS == 1. This is weird configuration, but not disallowed.

Fix the linker error by replacing LOAD_REG_ADDR() with LOAD_REG_IMMEDIATE()
conditionally on NR_CPUS == 1.

As the following patch adds CONFIG_FORCE_NR_CPUS option that has the
similar effect on nr_cpu_ids, make the generic_secondary_common_init()
conditional on it too.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-09-20 16:11:44 -07:00
Nathan Lynch
f88aabad33 Revert "powerpc/rtas: Implement reentrant rtas call"
At the time this was submitted by Leonardo, I confirmed -- or thought
I had confirmed -- with PowerVM partition firmware development that
the following RTAS functions:

- ibm,get-xive
- ibm,int-off
- ibm,int-on
- ibm,set-xive

were safe to call on multiple CPUs simultaneously, not only with
respect to themselves as indicated by PAPR, but with arbitrary other
RTAS calls:

https://lore.kernel.org/linuxppc-dev/875zcy2v8o.fsf@linux.ibm.com/

Recent discussion with firmware development makes it clear that this
is not true, and that the code in commit b664db8e3f ("powerpc/rtas:
Implement reentrant rtas call") is unsafe, likely explaining several
strange bugs we've seen in internal testing involving DLPAR and
LPM. These scenarios use ibm,configure-connector, whose internal state
can be corrupted by the concurrent use of the "reentrant" functions,
leading to symptoms like endless busy statuses from RTAS.

Fixes: b664db8e3f ("powerpc/rtas: Implement reentrant rtas call")
Cc: stable@vger.kernel.org # v5.8+
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Laurent Dufour <laurent.dufour@fr.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220907220111.223267-1-nathanl@linux.ibm.com
2022-09-14 22:43:13 +10:00
Kefeng Wang
2be9880dc8 kernel: exit: cleanup release_thread()
Only x86 has own release_thread(), introduce a new weak release_thread()
function to clean empty definitions in other ARCHs.

Link: https://lkml.kernel.org/r/20220819014406.32266-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Guo Ren <guoren@kernel.org>				[csky]
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Brian Cain <bcain@quicinc.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>			[powerpc]
Acked-by: Stafford Horne <shorne@gmail.com>			[openrisc]
Acked-by: Catalin Marinas <catalin.marinas@arm.com>		[arm64]
Acked-by: Huacai Chen <chenhuacai@kernel.org>			[LoongArch]
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Chris Zankel <chris@zankel.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Guo Ren <guoren@kernel.org> [csky]
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Xuerui Wang <kernel@xen0n.name>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:07 -07:00
Linus Torvalds
f448dda895 asm-generic: SOFTIRQ_ON_OWN_STACK rework
Just one fixup patch, reworking the softirq_on_own_stack logic for
 preempt-rt kernels as discussed in
 https://lore.kernel.org/all/CAHk-=wgZSD3W2y6yczad2Am=EfHYyiPzTn3CfXxrriJf9i5W5w@mail.gmail.com/
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmMZ/FUACgkQmmx57+YA
 GNl4DBAAutecM/RGJnK//0+NEcCsuHoIWusifeEIhR8yNFxm1UOiwBNYV4M/73UA
 x2lLGQCu4ze1zSU3BXX1Ug3MfpmrBOhlsW2+x7j0b3tG9vXpvojitTsCPo6lgNDu
 nUx58rtN0wfZgwT7c4DaueEL1aIZ1yUOgCE+SCGw3subgpZeoaYUi3sbZxI2K+T8
 MnzcVVpmccMaxLh0xwCn6u4csYrarj0XodY11a8+DAN9Afek9nEq4UNhIewjvQw0
 2OTbGuxIBZoXNmWdjcDBO97eUtq0oJcRIgPkRi0xlas2hzRJMwO6liqsA34r4r7r
 Irxyyw4//jlCtJOKwrv+rSxR6ULvGw2+UBk/Q/nYeN9hL2qKnwJIa8hryj+cQ/os
 KWclMGEYtKvOzciMIvViXAOY4sa7jZO1Q9rntyUIVkNiqQqLDRxocW+GjsqG5KZI
 bl+akScQhX3a7bEZePO0EJqfitjaxk5JtxF3KcV5apKaWto+EwOpvRf6eUVT97/n
 Zpoy2dBygOtWiMJR7o143xeOQlhutDCOh6VeDfJiUA57ekABRY/8+R+ZltkmDP84
 4iCCtE4oWuMXz390ybBHxKcNjiknLUMvDUNq5wfjsyJ8ftBRF+1+vKR+BdLqsx8O
 d5/FjRXC5MWaW4GLO89l+MZh+BrOkobdq333VxIStpS9Jd0pUa4=
 =0A1T
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-fixes-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull SOFTIRQ_ON_OWN_STACK rework from Arnd Bergmann:
 "Just one fixup patch, reworking the softirq_on_own_stack logic for
  preempt-rt kernels as discussed in

    https://lore.kernel.org/all/CAHk-=wgZSD3W2y6yczad2Am=EfHYyiPzTn3CfXxrriJf9i5W5w@mail.gmail.com/"

* tag 'asm-generic-fixes-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  asm-generic: Conditionally enable do_softirq_own_stack() via Kconfig.
2022-09-09 07:23:29 -04:00
Christophe Leroy
78c73c80fd powerpc/math-emu: Inhibit W=1 warnings
When building with W=1 you get:

	arch/powerpc/math-emu/fre.c:6:5: error: no previous prototype for 'fre' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fsqrt.c:11:1: error: no previous prototype for 'fsqrt' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fsqrts.c:12:1: error: no previous prototype for 'fsqrts' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/frsqrtes.c:6:5: error: no previous prototype for 'frsqrtes' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/mtfsf.c:10:1: error: no previous prototype for 'mtfsf' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/mtfsfi.c:10:1: error: no previous prototype for 'mtfsfi' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fabs.c:7:1: error: no previous prototype for 'fabs' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fadd.c:11:1: error: no previous prototype for 'fadd' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fadds.c:12:1: error: no previous prototype for 'fadds' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fcmpo.c:11:1: error: no previous prototype for 'fcmpo' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fcmpu.c:11:1: error: no previous prototype for 'fcmpu' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fcmpu.c:14:19: error: variable 'B_c' set but not used [-Werror=unused-but-set-variable]
	arch/powerpc/math-emu/fcmpu.c:13:19: error: variable 'A_c' set but not used [-Werror=unused-but-set-variable]
	arch/powerpc/math-emu/fctiw.c:11:1: error: no previous prototype for 'fctiw' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fctiwz.c:11:1: error: no previous prototype for 'fctiwz' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fdiv.c:11:1: error: no previous prototype for 'fdiv' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fdivs.c:12:1: error: no previous prototype for 'fdivs' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fmadd.c:11:1: error: no previous prototype for 'fmadd' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fmadds.c:12:1: error: no previous prototype for 'fmadds' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fmsub.c:11:1: error: no previous prototype for 'fmsub' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fmsubs.c:12:1: error: no previous prototype for 'fmsubs' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fmul.c:11:1: error: no previous prototype for 'fmul' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fmuls.c:12:1: error: no previous prototype for 'fmuls' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fnabs.c:7:1: error: no previous prototype for 'fnabs' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fneg.c:7:1: error: no previous prototype for 'fneg' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fnmadd.c:11:1: error: no previous prototype for 'fnmadd' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fnmadds.c:12:1: error: no previous prototype for 'fnmadds' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fnmsub.c:11:1: error: no previous prototype for 'fnmsub' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fnmsubs.c:12:1: error: no previous prototype for 'fnmsubs' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fres.c:7:1: error: no previous prototype for 'fres' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/frsp.c:12:1: error: no previous prototype for 'frsp' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fsel.c:11:1: error: no previous prototype for 'fsel' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/lfs.c:12:1: error: no previous prototype for 'lfs' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/frsqrte.c:7:1: error: no previous prototype for 'frsqrte' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fsub.c:11:1: error: no previous prototype for 'fsub' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fsubs.c:12:1: error: no previous prototype for 'fsubs' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/mcrfs.c:10:1: error: no previous prototype for 'mcrfs' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/mffs.c:10:1: error: no previous prototype for 'mffs' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/mtfsb0.c:10:1: error: no previous prototype for 'mtfsb0' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/mtfsb1.c:10:1: error: no previous prototype for 'mtfsb1' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/stfiwx.c:7:1: error: no previous prototype for 'stfiwx' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/stfs.c:12:1: error: no previous prototype for 'stfs' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/fmr.c:7:1: error: no previous prototype for 'fmr' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/lfd.c:10:1: error: no previous prototype for 'lfd' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/stfd.c:7:1: error: no previous prototype for 'stfd' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/math_efp.c:177:5: error: no previous prototype for 'do_spe_mathemu' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/math_efp.c:726:5: error: no previous prototype for 'speround_handler' [-Werror=missing-prototypes]
	arch/powerpc/math-emu/math_efp.c:893:12: error: no previous prototype for 'spe_mathemu_init' [-Werror=missing-prototypes]

Fix the warnings in math_efp.c by adding prototypes of do_spe_mathemu()
and speround_handler() to asm/processor.h and declare spe_mathemu_init()
static.

The other warnings are benign and not worth the churn of fixing them,
expecially the 'unused-but-set-variable' which would impact the core
part of 'math-emu'.

So silence them by adding -Wno-missing-prototypes -Wno-unused-but-set-variable.

But then you get:

	arch/powerpc/math-emu/fre.c:6:5: error: no previous declaration for 'fre' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fsqrt.c:11:1: error: no previous declaration for 'fsqrt' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fsqrts.c:12:1: error: no previous declaration for 'fsqrts' [-Werror=missing-declarations]
	arch/powerpc/math-emu/frsqrtes.c:6:5: error: no previous declaration for 'frsqrtes' [-Werror=missing-declarations]
	arch/powerpc/math-emu/mtfsf.c:10:1: error: no previous declaration for 'mtfsf' [-Werror=missing-declarations]
	arch/powerpc/math-emu/mtfsfi.c:10:1: error: no previous declaration for 'mtfsfi' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fabs.c:7:1: error: no previous declaration for 'fabs' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fadd.c:11:1: error: no previous declaration for 'fadd' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fadds.c:12:1: error: no previous declaration for 'fadds' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fcmpo.c:11:1: error: no previous declaration for 'fcmpo' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fcmpu.c:11:1: error: no previous declaration for 'fcmpu' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fctiw.c:11:1: error: no previous declaration for 'fctiw' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fctiwz.c:11:1: error: no previous declaration for 'fctiwz' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fdiv.c:11:1: error: no previous declaration for 'fdiv' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fdivs.c:12:1: error: no previous declaration for 'fdivs' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fmadd.c:11:1: error: no previous declaration for 'fmadd' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fmadds.c:12:1: error: no previous declaration for 'fmadds' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fmsub.c:11:1: error: no previous declaration for 'fmsub' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fmsubs.c:12:1: error: no previous declaration for 'fmsubs' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fmul.c:11:1: error: no previous declaration for 'fmul' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fmuls.c:12:1: error: no previous declaration for 'fmuls' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fnabs.c:7:1: error: no previous declaration for 'fnabs' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fneg.c:7:1: error: no previous declaration for 'fneg' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fnmadd.c:11:1: error: no previous declaration for 'fnmadd' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fnmadds.c:12:1: error: no previous declaration for 'fnmadds' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fnmsub.c:11:1: error: no previous declaration for 'fnmsub' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fnmsubs.c:12:1: error: no previous declaration for 'fnmsubs' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fres.c:7:1: error: no previous declaration for 'fres' [-Werror=missing-declarations]
	arch/powerpc/math-emu/frsp.c:12:1: error: no previous declaration for 'frsp' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fsel.c:11:1: error: no previous declaration for 'fsel' [-Werror=missing-declarations]
	arch/powerpc/math-emu/lfs.c:12:1: error: no previous declaration for 'lfs' [-Werror=missing-declarations]
	arch/powerpc/math-emu/frsqrte.c:7:1: error: no previous declaration for 'frsqrte' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fsub.c:11:1: error: no previous declaration for 'fsub' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fsubs.c:12:1: error: no previous declaration for 'fsubs' [-Werror=missing-declarations]
	arch/powerpc/math-emu/mcrfs.c:10:1: error: no previous declaration for 'mcrfs' [-Werror=missing-declarations]
	arch/powerpc/math-emu/mffs.c:10:1: error: no previous declaration for 'mffs' [-Werror=missing-declarations]
	arch/powerpc/math-emu/mtfsb0.c:10:1: error: no previous declaration for 'mtfsb0' [-Werror=missing-declarations]
	arch/powerpc/math-emu/mtfsb1.c:10:1: error: no previous declaration for 'mtfsb1' [-Werror=missing-declarations]
	arch/powerpc/math-emu/stfiwx.c:7:1: error: no previous declaration for 'stfiwx' [-Werror=missing-declarations]
	arch/powerpc/math-emu/stfs.c:12:1: error: no previous declaration for 'stfs' [-Werror=missing-declarations]
	arch/powerpc/math-emu/fmr.c:7:1: error: no previous declaration for 'fmr' [-Werror=missing-declarations]
	arch/powerpc/math-emu/lfd.c:10:1: error: no previous declaration for 'lfd' [-Werror=missing-declarations]
	arch/powerpc/math-emu/stfd.c:7:1: error: no previous declaration for 'stfd' [-Werror=missing-declarations]

So also add -Wno-missing-declarations.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/688084b40b5ac88f2905cb207d5dad947d8d34dc.1662531153.git.christophe.leroy@csgroup.eu
2022-09-08 11:11:18 +10:00
Sebastian Andrzej Siewior
8cbb2b50ee asm-generic: Conditionally enable do_softirq_own_stack() via Kconfig.
Remove the CONFIG_PREEMPT_RT symbol from the ifdef around
do_softirq_own_stack() and move it to Kconfig instead.

Enable softirq stacks based on SOFTIRQ_ON_OWN_STACK which depends on
HAVE_SOFTIRQ_ON_OWN_STACK and its default value is set to !PREEMPT_RT.
This ensures that softirq stacks are not used on PREEMPT_RT and avoids
a 'select' statement on an option which has a 'depends' statement.

Link: https://lore.kernel.org/YvN5E%2FPrHfUhggr7@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-09-05 17:20:55 +02:00
Liang He
ce63c44b63 powerpc/pci-common: Fix refcount bug for 'phb->dn'
In pcibios_alloc_controller(), 'phb' is allocated and escaped into
global 'hose_list'. So we should call of_node_get() when a new reference
created into 'phb->dn'. And when phb is freed, we should call
of_node_put() on it.

NOTE: This function is called in the iteration of for_each_xx in
chrp_find_bridges() and pSeries_discover_phbs(). If there is no
of_node_get(), the object may be prematurely freed.

Signed-off-by: Liang He <windhl@126.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220702022936.266146-1-windhl@126.com
2022-09-05 17:30:29 +10:00
Liang He
110a1fcb6c powerpc/pci_dn: Add missing of_node_put()
In pci_add_device_node_info(), use of_node_put() to drop the reference
to 'parent' returned by of_get_parent() to keep refcount balance.

Fixes: cca87d303c ("powerpc/pci: Refactor pci_dn")
Co-authored-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Liang He <windhl@126.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Link: https://lore.kernel.org/r/20220701131750.240170-1-windhl@126.com
2022-09-05 17:30:25 +10:00
Liang He
d1aabbbb25 powerpc/kernel: Add missing of_node_put() in legacy_serial.c
In find_legacy_serial_ports(), of_find_node_by_path() will return a node
pointer with refcount incremented. The reference should be dropped with
of_node_put() when it is not used anymore.

Signed-off-by: Liang He <windhl@126.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220619070811.4067215-1-windhl@126.com
2022-09-05 17:28:26 +10:00
Christophe Leroy
1d53c0192b powerpc/vdso: link with -z noexecstack
With recent binutils, the following warning appears:

  VDSO32L arch/powerpc/kernel/vdso/vdso32.so.dbg
/opt/gcc-12.2.0-nolibc/powerpc64-linux/bin/../lib/gcc/powerpc64-linux/12.2.0/../../../../powerpc64-linux/bin/ld: warning: arch/powerpc/kernel/vdso/getcpu-32.o: missing .note.GNU-stack section implies executable stack
/opt/gcc-12.2.0-nolibc/powerpc64-linux/bin/../lib/gcc/powerpc64-linux/12.2.0/../../../../powerpc64-linux/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker

To avoid that, explicitly tell the linker we don't want executable
stack.

For more explanations, see commit ffcf9c5700 ("x86: link vdso
and boot with -z noexecstack --no-warn-rwx-segments")

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b95f2e3216a574837dd61208444e9515c3423da4.1662132312.git.christophe.leroy@csgroup.eu
2022-09-05 17:28:25 +10:00
Nicholas Piggin
6ba5aa541a powerpc/pseries: Move dtl scanning and steal time accounting to pseries platform
dtl is the PAPR Dispatch Trace Log, which is entirely a pseries feature.
The pseries platform alrady has a file dealing with the dtl, so move
scanning for stolen time accounting there from kernel/time.c.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220902085316.2071519-5-npiggin@gmail.com
2022-09-05 14:14:27 +10:00
Marco Elver
f95e5a3d59 powerpc/hw_breakpoint: Avoid relying on caller synchronization
Internal data structures (cpu_bps, task_bps) of powerpc's hw_breakpoint
implementation have relied on nr_bp_mutex serializing access to them.

Before overhauling synchronization of kernel/events/hw_breakpoint.c,
introduce 2 spinlocks to synchronize cpu_bps and task_bps respectively,
thus avoiding reliance on callers synchronizing powerpc's hw_breakpoint.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20220829124719.675715-10-elver@google.com
2022-08-30 10:56:23 +02:00
Wolfram Sang
14be375634 powerpc: move from strlcpy with unused retval to strscpy
Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Link: https://lore.kernel.org/r/20220818205946.6336-1-wsa+renesas@sang-engineering.com
2022-08-26 11:02:20 +10:00
Michael Ellerman
eb316ae798 powerpc: Move patch sites out of asm-prototypes.h
The definitions for the patch sites etc. don't belong in
asm-prototypes.h, they are not EXPORT'ed asm symbols.

Move them into sections.h which is traditionally used for asm symbols.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220818050659.187181-1-mpe@ellerman.id.au
2022-08-26 11:02:20 +10:00
Michael Ellerman
91926d8b7e powerpc/rtas: Fix RTAS MSR[HV] handling for Cell
The semi-recent changes to MSR handling when entering RTAS (firmware)
cause crashes on IBM Cell machines. An example trace:

  kernel tried to execute user page (2fff01a8) - exploit attempt? (uid: 0)
  BUG: Unable to handle kernel instruction fetch
  Faulting instruction address: 0x2fff01a8
  Oops: Kernel access of bad area, sig: 11 [#1]
  BE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=4 NUMA Cell
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W          6.0.0-rc2-00433-gede0a8d3307a #207
  NIP:  000000002fff01a8 LR: 0000000000032608 CTR: 0000000000000000
  REGS: c0000000015236b0 TRAP: 0400   Tainted: G        W           (6.0.0-rc2-00433-gede0a8d3307a)
  MSR:  0000000008001002 <ME,RI>  CR: 00000000  XER: 20000000
  ...
  NIP 0x2fff01a8
  LR  0x32608
  Call Trace:
    0xc00000000143c5f8 (unreliable)
    .rtas_call+0x224/0x320
    .rtas_get_boot_time+0x70/0x150
    .read_persistent_clock64+0x114/0x140
    .read_persistent_wall_and_boot_offset+0x24/0x80
    .timekeeping_init+0x40/0x29c
    .start_kernel+0x674/0x8f0
    start_here_common+0x1c/0x50

Unlike PAPR platforms where RTAS is only used in guests, on the IBM Cell
machines Linux runs with MSR[HV] set but also uses RTAS, provided by
SLOF.

Fix it by copying the MSR[HV] bit from the MSR value we've just read
using mfmsr into the value used for RTAS.

It seems like we could also fix it using an #ifdef CELL to set MSR[HV],
but that doesn't work because it's possible to build a single kernel
image that runs on both Cell native and pseries.

Fixes: b6b1c3ce06 ("powerpc/rtas: Keep MSR[RI] set when calling RTAS")
Cc: stable@vger.kernel.org # v5.19+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Jordan Niethe <jniethe5@gmail.com>
Link: https://lore.kernel.org/r/20220823115952.1203106-2-mpe@ellerman.id.au
2022-08-26 08:41:54 +10:00
Masahiro Yamada
c7acee3d2f powerpc: align syscall table for ppc32
Christophe Leroy reported that commit 7b4537199a ("kbuild: link
symbol CRCs at final link,  removing CONFIG_MODULE_REL_CRCS") broke
mpc85xx_defconfig + CONFIG_RELOCATABLE=y.

    LD      vmlinux
    SYSMAP  System.map
    SORTTAB vmlinux
    CHKREL  vmlinux
  WARNING: 451 bad relocations
  c0b312a9 R_PPC_UADDR32     .head.text-0x3ff9ed54
  c0b312ad R_PPC_UADDR32     .head.text-0x3ffac224
  c0b312b1 R_PPC_UADDR32     .head.text-0x3ffb09f4
  c0b312b5 R_PPC_UADDR32     .head.text-0x3fe184dc
  c0b312b9 R_PPC_UADDR32     .head.text-0x3fe183a8
      ...

The compiler emits a bunch of R_PPC_UADDR32, which is not supported by
arch/powerpc/kernel/reloc_32.S.

The reason is there exists an unaligned symbol.

  $ powerpc-linux-gnu-nm -n vmlinux
    ...
  c0b31258 d spe_aligninfo
  c0b31298 d __func__.0
  c0b312a9 D sys_call_table
  c0b319b8 d __func__.0

Commit 7b4537199a is not the root cause. Even before that, I can
reproduce the same issue for mpc85xx_defconfig + CONFIG_RELOCATABLE=y
+ CONFIG_MODVERSIONS=n.

It is just that nobody noticed because when CONFIG_MODVERSIONS is
enabled, a __crc_* symbol inserted before sys_call_table was hiding the
unalignment issue.

Adding alignment to the syscall table for ppc32 fixes the issue.

Cc: stable@vger.kernel.org
Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Trim change log discussion, add Cc stable]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/lkml/38605f6a-a568-f884-f06f-ea4da5b214f0@csgroup.eu/
Link: https://lore.kernel.org/r/20220820165129.1147589-1-masahiroy@kernel.org
2022-08-26 08:41:40 +10:00
Pali Rohár
0382a35bef powerpc/pci: Enable PCI domains in /proc when PCI bus numbers are not unique
On 32-bit powerpc systems with more PCIe controllers and more PCI
domains, where on more PCI domains are same PCI numbers, when kernel is
compiled with CONFIG_PROC_FS=y and
CONFIG_PPC_PCI_BUS_NUM_DOMAIN_DEPENDENT=y options, kernel prints
"proc_dir_entry 'pci/01' already registered" error message.

  proc_dir_entry 'pci/01' already registered
  WARNING: CPU: 0 PID: 1 at fs/proc/generic.c:377 proc_register+0x1a8/0x1ac
  ...
  NIP proc_register+0x1a8/0x1ac
  LR  proc_register+0x1a8/0x1ac
  Call Trace:
    proc_register+0x1a8/0x1ac (unreliable)
    _proc_mkdir+0x78/0xa4
    pci_proc_attach_device+0x11c/0x168
    pci_proc_init+0x80/0x98
    do_one_initcall+0x80/0x284
    kernel_init_freeable+0x1f4/0x2a0
    kernel_init+0x24/0x150
    ret_from_kernel_thread+0x5c/0x64

This regression started appearing after commit
5663568130 ("powerpc/pci: Add config option for using all 256 PCI
buses") in case in each mPCIe slot is connected PCIe card and therefore
PCI bus 1 is populated in for every PCIe controller / PCI domain.

The reason is that PCI procfs code expects that when PCI bus numbers are
not unique across all PCI domains, function pci_proc_domain() returns
true for domain dependent buses.

Fix this issue by setting PCI_ENABLE_PROC_DOMAINS and
PCI_COMPAT_DOMAIN_0 flags for 32-bit powerpc code when
CONFIG_PPC_PCI_BUS_NUM_DOMAIN_DEPENDENT is enabled. Same approach is
already implemented for 64-bit powerpc code (where PCI bus numbers are
always domain dependent).

Fixes: 5663568130 ("powerpc/pci: Add config option for using all 256 PCI buses")
Signed-off-by: Pali Rohár <pali@kernel.org>
[mpe: Trim change log oops message]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220820115113.30581-1-pali@kernel.org
2022-08-25 17:47:08 +10:00
Christophe Leroy
8a8f786666 powerpc/vdso: Don't map VDSO at a fixed address on PPC32
PPC64 removed default mapping address from VDSO in
commit 30d0b36828 ("powerpc: Move 64bit VDSO to improve context
switch performance").

Do like PPC64 and let get_unmapped_area() place the VDSO mapping
at the address it wants, don't force a default address.

This allows randomisation of VDSO address.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/cba76f5a5b01fcc49415e632d92c11c1c5998cab.1660843877.git.christophe.leroy@csgroup.eu
2022-08-22 13:36:59 +10:00
Michael Ellerman
8d48562a27 powerpc/pci: Fix get_phb_number() locking
The recent change to get_phb_number() causes a DEBUG_ATOMIC_SLEEP
warning on some systems:

  BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
  in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper
  preempt_count: 1, expected: 0
  RCU nest depth: 0, expected: 0
  1 lock held by swapper/1:
   #0: c157efb0 (hose_spinlock){+.+.}-{2:2}, at: pcibios_alloc_controller+0x64/0x220
  Preemption disabled at:
  [<00000000>] 0x0
  CPU: 0 PID: 1 Comm: swapper Not tainted 5.19.0-yocto-standard+ #1
  Call Trace:
  [d101dc90] [c073b264] dump_stack_lvl+0x50/0x8c (unreliable)
  [d101dcb0] [c0093b70] __might_resched+0x258/0x2a8
  [d101dcd0] [c0d3e634] __mutex_lock+0x6c/0x6ec
  [d101dd50] [c0a84174] of_alias_get_id+0x50/0xf4
  [d101dd80] [c002ec78] pcibios_alloc_controller+0x1b8/0x220
  [d101ddd0] [c140c9dc] pmac_pci_init+0x198/0x784
  [d101de50] [c140852c] discover_phbs+0x30/0x4c
  [d101de60] [c0007fd4] do_one_initcall+0x94/0x344
  [d101ded0] [c1403b40] kernel_init_freeable+0x1a8/0x22c
  [d101df10] [c00086e0] kernel_init+0x34/0x160
  [d101df30] [c001b334] ret_from_kernel_thread+0x5c/0x64

This is because pcibios_alloc_controller() holds hose_spinlock but
of_alias_get_id() takes of_mutex which can sleep.

The hose_spinlock protects the phb_bitmap, and also the hose_list, but
it doesn't need to be held while get_phb_number() calls the OF routines,
because those are only looking up information in the device tree.

So fix it by having get_phb_number() take the hose_spinlock itself, only
where required, and then dropping the lock before returning.
pcibios_alloc_controller() then needs to take the lock again before the
list_add() but that's safe, the order of the list is not important.

Fixes: 0fe1e96fef ("powerpc/pci: Prefer PCI domain assignment via DT 'linux,pci-domain' and alias")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220815065550.1303620-1-mpe@ellerman.id.au
2022-08-15 16:56:14 +10:00
Linus Torvalds
d785610f05 powerpc fixes for 6.0 #2
- Ensure we never emit lwarx with EH=1 on 32-bit, because some 32-bit CPUs trap on it
    rather than ignoring it as they should.
 
  - Fix ftrace when building with clang, which was broken by some refactoring.
 
  - A couple of other minor fixes.
 
 Thanks to: Christophe Leroy, Naveen N. Rao, Nick Desaulniers, Ondrej Mosnacek, Pali Rohár,
 Russell Currey Segher Boessenkool.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmL4KksTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgEdED/4/YGF3az2N9a0cHt+qnuO2gMGuRdZ2
 COV1d2+2+37nCk2sqrPX+ASBYK0F6vXTivOCIYIItkWD1vYktHC4LrG+slUltb1S
 ZHttp+IYLC5S24v89ViKSQuQ1MNYWU86ii9Wq/mKilsDCrGWpAHpnAv/wip2Epzv
 Xk/9XWE8MCEBeVwJIXBKgbvOC0+pmftyrNtuHDW+AmST+km54ch1OpomAYHZOnsr
 ix1FdJwr/HeZn8EwFYyUePcFnUjNdTa5/Ty4BYn1VPKzEmmAae9UEPM61fmxlOhr
 JiYG0S8/QgZDcjkSsFZGNWmsOPACrYMMCnHQi3VAHRndnXjX6gTv3mPyLBblTbCf
 C3o5xYDqfY0rgXZSCM4NT0IutZ+5yv7q2p/jYn55NR6OpB/D/94pOJdZr6a1H4Cz
 e5Pr0bN2rLlaHTofqscDoWMCjAcyyymQclB/dqrBCvDViiJj3efCS4wrFCoSNnlG
 a0xEYecxprtL+wFD+E5LH8A10Ll1y5zNfh3rXzIMWtX5iWT0Rjx040SAFDG2a/cG
 A3IBKqT85dPGgxQEEV2uP6GGIjl+7Fd6nX+FVgl5AkFo1irdndng7G4SlJmBI3or
 nM5KqJ1nizjkDuCrYboOoZEdQKULHMv6lWS3hbgXS+rJkf4K75wuIC37V1SRcNmy
 VjNvssP3rVX63A==
 =SpYx
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - Ensure we never emit lwarx with EH=1 on 32-bit, because some 32-bit
   CPUs trap on it rather than ignoring it as they should.

 - Fix ftrace when building with clang, which was broken by some
   refactoring.

 - A couple of other minor fixes.

Thanks to Christophe Leroy, Naveen N.  Rao, Nick Desaulniers, Ondrej
Mosnacek, Pali Rohár, Russell Currey, and Segher Boessenkool.

* tag 'powerpc-6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/kexec: Fix build failure from uninitialised variable
  powerpc/ppc-opcode: Fix PPC_RAW_TW()
  powerpc64/ftrace: Fix ftrace for clang builds
  powerpc: Make eh value more explicit when using lwarx
  powerpc: Don't hide eh field of lwarx behind a macro
  powerpc: Fix eh field when calling lwarx on PPC32
2022-08-14 08:48:13 -07:00
Naveen N. Rao
cb928ac192 powerpc64/ftrace: Fix ftrace for clang builds
Clang doesn't support -mprofile-kernel ABI, so guard the checks against
CONFIG_DYNAMIC_FTRACE_WITH_REGS, rather than the elf ABI version.

Fixes: 23b44fc248 ("powerpc/ftrace: Make __ftrace_make_{nop/call}() common to PPC32 and PPC64")
Cc: stable@vger.kernel.org # v5.19+
Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Reported-by: Ondrej Mosnacek <omosnacek@gmail.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Tested-by: Ondrej Mosnacek <omosnacek@gmail.com>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://github.com/llvm/llvm-project/issues/57031
Link: https://github.com/ClangBuiltLinux/linux/issues/1682
Link: https://lore.kernel.org/r/20220809095907.418764-1-naveen.n.rao@linux.vnet.ibm.com
2022-08-10 15:38:16 +10:00
Linus Torvalds
4e23eeebb2 Bitmap patches for v6.0-rc1
This branch consists of:
 
 Qu Wenruo:
 lib: bitmap: fix the duplicated comments on bitmap_to_arr64()
 https://lore.kernel.org/lkml/0d85e1dbad52ad7fb5787c4432bdb36cbd24f632.1656063005.git.wqu@suse.com/
 
 Alexander Lobakin:
 bitops: let optimize out non-atomic bitops on compile-time constants
 https://lore.kernel.org/lkml/20220624121313.2382500-1-alexandr.lobakin@intel.com/T/
 
 Yury Norov:
 lib: cleanup bitmap-related headers
 https://lore.kernel.org/linux-arm-kernel/YtCVeOGLiQ4gNPSf@yury-laptop/T/#m305522194c4d38edfdaffa71fcaaf2e2ca00a961
 
 Alexander Lobakin:
 x86/olpc: fix 'logical not is only applied to the left hand side'
 https://www.spinics.net/lists/kernel/msg4440064.html
 
 Yury Norov:
 lib/nodemask: inline wrappers around bitmap
 https://lore.kernel.org/all/20220723214537.2054208-1-yury.norov@gmail.com/
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmLpVvwACgkQsUSA/Tof
 vsiAHgwAwS9pl8GJ+fKYnue2CYo9349d2oT6BBUs/Rv8uqYEa4QkpYsR7NS733TG
 pos0hhoRvSOzrUP4qppXUjfJ+NkzLgpnKFOeWfFoNAKlHuaaMRvF3Y0Q/P8g0/Kg
 HPWcCQLHyCH9Wjs3e2TTgRjxTrHuruD2VJ401/PX/lw0DicUhmev5mUFa10uwFkP
 ZJRprjoFn9HJ0Hk16pFZDi36d3YumhACOcWRiJdoBDrEPV3S6lm9EeOy/yHBNp5k
 9bKj+RboeT2t70KaZcKv+M5j1nu0cAhl7kRkjcxcmGyimI0l82Vgq9yFxhGqvWg8
 RnCrJ5EaO08FGCAKG9GEwzdiNa24Gdq5XZSpQA7JZHmhmchpnnlNenJicyv0gOQi
 abChZeWSEsyA+78l2+kk9nezfVKUOnKDEZQxBVTOyWsmZYxHZV94oam340VjQDaY
 4/fETdOy/qqPIxnpxAeFGWxZjcVaYiYPLj7KLPMsB0aAAF7pZrem465vSfgbrE81
 +gCdqrWd
 =4dTW
 -----END PGP SIGNATURE-----

Merge tag 'bitmap-6.0-rc1' of https://github.com/norov/linux

Pull bitmap updates from Yury Norov:

 - fix the duplicated comments on bitmap_to_arr64() (Qu Wenruo)

 - optimize out non-atomic bitops on compile-time constants (Alexander
   Lobakin)

 - cleanup bitmap-related headers (Yury Norov)

 - x86/olpc: fix 'logical not is only applied to the left hand side'
   (Alexander Lobakin)

 - lib/nodemask: inline wrappers around bitmap (Yury Norov)

* tag 'bitmap-6.0-rc1' of https://github.com/norov/linux: (26 commits)
  lib/nodemask: inline next_node_in() and node_random()
  powerpc: drop dependency on <asm/machdep.h> in archrandom.h
  x86/olpc: fix 'logical not is only applied to the left hand side'
  lib/cpumask: move some one-line wrappers to header file
  headers/deps: mm: align MANITAINERS and Docs with new gfp.h structure
  headers/deps: mm: Split <linux/gfp_types.h> out of <linux/gfp.h>
  headers/deps: mm: Optimize <linux/gfp.h> header dependencies
  lib/cpumask: move trivial wrappers around find_bit to the header
  lib/cpumask: change return types to unsigned where appropriate
  cpumask: change return types to bool where appropriate
  lib/bitmap: change type of bitmap_weight to unsigned long
  lib/bitmap: change return types to bool where appropriate
  arm: align find_bit declarations with generic kernel
  iommu/vt-d: avoid invalid memory access via node_online(NUMA_NO_NODE)
  lib/test_bitmap: test the tail after bitmap_to_arr64()
  lib/bitmap: fix off-by-one in bitmap_to_arr64()
  lib: test_bitmap: add compile-time optimization/evaluations assertions
  bitmap: don't assume compiler evaluates small mem*() builtins calls
  net/ice: fix initializing the bitmap in the switch code
  bitops: let optimize out non-atomic bitops on compile-time constants
  ...
2022-08-07 17:52:35 -07:00
Linus Torvalds
eb5699ba31 Updates to various subsystems which I help look after. lib, ocfs2,
fatfs, autofs, squashfs, procfs, etc.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYu9BeQAKCRDdBJ7gKXxA
 jp1DAP4mjCSvAwYzXklrIt+Knv3CEY5oVVdS+pWOAOGiJpldTAD9E5/0NV+VmlD9
 kwS/13j38guulSlXRzDLmitbg81zAAI=
 =Zfum
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2022-08-06-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc updates from Andrew Morton:
 "Updates to various subsystems which I help look after. lib, ocfs2,
  fatfs, autofs, squashfs, procfs, etc. A relatively small amount of
  material this time"

* tag 'mm-nonmm-stable-2022-08-06-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (72 commits)
  scripts/gdb: ensure the absolute path is generated on initial source
  MAINTAINERS: kunit: add David Gow as a maintainer of KUnit
  mailmap: add linux.dev alias for Brendan Higgins
  mailmap: update Kirill's email
  profile: setup_profiling_timer() is moslty not implemented
  ocfs2: fix a typo in a comment
  ocfs2: use the bitmap API to simplify code
  ocfs2: remove some useless functions
  lib/mpi: fix typo 'the the' in comment
  proc: add some (hopefully) insightful comments
  bdi: remove enum wb_congested_state
  kernel/hung_task: fix address space of proc_dohung_task_timeout_secs
  lib/lzo/lzo1x_compress.c: replace ternary operator with min() and min_t()
  squashfs: support reading fragments in readahead call
  squashfs: implement readahead
  squashfs: always build "file direct" version of page actor
  Revert "squashfs: provide backing_dev_info in order to disable read-ahead"
  fs/ocfs2: Fix spelling typo in comment
  ia64: old_rr4 added under CONFIG_HUGETLB_PAGE
  proc: fix test for "vsyscall=xonly" boot option
  ...
2022-08-07 10:03:24 -07:00
Linus Torvalds
cae4199f93 powerpc updates for 6.0
- Add support for syscall stack randomization.
 
  - Add support for atomic operations to the 32 & 64-bit BPF JIT.
 
  - Full support for KASAN on 64-bit Book3E.
 
  - Add a watchdog driver for the new PowerVM hypervisor watchdog.
 
  - Add a number of new selftests for the Power10 PMU support.
 
  - Add a driver for the PowerVM Platform KeyStore.
 
  - Increase the NMI watchdog timeout during live partition migration, to avoid timeouts
    due to increased memory access latency.
 
  - Add support for using the 'linux,pci-domain' device tree property for PCI domain
    assignment.
 
  - Many other small features and fixes.
 
 Thanks to: Alexey Kardashevskiy, Andy Shevchenko, Arnd Bergmann, Athira Rajeev, Bagas
 Sanjaya, Christophe Leroy, Erhard Furtner, Fabiano Rosas, Greg Kroah-Hartman, Greg Kurz,
 Haowen Bai, Hari Bathini, Jason A. Donenfeld, Jason Wang, Jiang Jian, Joel Stanley, Juerg
 Haefliger, Kajol Jain, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Masahiro Yamada,
 Maxime Bizon, Miaoqian Lin, Murilo Opsfelder Araújo, Nathan Lynch, Naveen N. Rao, Nayna
 Jain, Nicholas Piggin, Ning Qiang, Pali Rohár, Petr Mladek, Rashmica Gupta, Sachin Sant,
 Scott Cheloha, Segher Boessenkool, Stephen Rothwell, Uwe Kleine-König, Wolfram Sang, Xiu
 Jianfeng, Zhouyi Zhou.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmLuAPgTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgBPpD/9kY/T0qlOXABxlZCgtqeAjPX+2xpnY
 BF+TlsN1TS1auFcEZL2BapmVacsvOeGEFDVuZHZvZJc69Hx+gSjnjFCnZjp6n+Yz
 wt6y9w9Pu0t/sjD5vNQ46O15/dXqm6RoVI7um12j/WLMN8Ko5+x3gKAyQONjQd2/
 1kPcxVH6FUosAdnCuvIcqCX4e4IIHl2ZkitHOTXoQUvUy9oAK/mOBnwqZ6zLGUKC
 E5M+Zyt4RFGxhPs48FkX6Nq6crDGU/P0VJpDKkR/t7GHnE67Bm70gZougAPrzrgP
 nx8zoTWgDKpqDeuqK7pFcyKgNS3dKbxsN3sAfKHOWu/YnV4wMyy+7fmwagMauki7
 lXccKN6F/r+8JcMNx80Jp/dAw3ZdLceP38M3Ryf8IL6lTfkNySumUvrKJn6r1Cu1
 wvzhgyEuDawss9KHdEmXcA2i3+XVZvitaipO7JWUC8pblrP1SJMoPfIIe9zh3y3M
 pyZj0TcGJ8XaK+badvI+PW/K/KeRgXEY8HpC3wDHSoIkli3OE4jDwXn6TiZgvm3n
 k0sKL8YSmQZ8hP8QAkR+r8NQKYqLlfyPxdslK5omDPxfub5Uzk9ZV2Ep7svkaiQn
 Wqjq27Dpz8+w0XPjsQ0Tkv+ByTkOhrawOH7x9SpFLHpv9g5otcYmS79NkO/htx8C
 6LyPNx1VYn5IRA==
 =tRkm
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Add support for syscall stack randomization

 - Add support for atomic operations to the 32 & 64-bit BPF JIT

 - Full support for KASAN on 64-bit Book3E

 - Add a watchdog driver for the new PowerVM hypervisor watchdog

 - Add a number of new selftests for the Power10 PMU support

 - Add a driver for the PowerVM Platform KeyStore

 - Increase the NMI watchdog timeout during live partition migration, to
   avoid timeouts due to increased memory access latency

 - Add support for using the 'linux,pci-domain' device tree property for
   PCI domain assignment

 - Many other small features and fixes

Thanks to Alexey Kardashevskiy, Andy Shevchenko, Arnd Bergmann, Athira
Rajeev, Bagas Sanjaya, Christophe Leroy, Erhard Furtner, Fabiano Rosas,
Greg Kroah-Hartman, Greg Kurz, Haowen Bai, Hari Bathini, Jason A.
Donenfeld, Jason Wang, Jiang Jian, Joel Stanley, Juerg Haefliger, Kajol
Jain, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Masahiro Yamada,
Maxime Bizon, Miaoqian Lin, Murilo Opsfelder Araújo, Nathan Lynch,
Naveen N.  Rao, Nayna Jain, Nicholas Piggin, Ning Qiang, Pali Rohár,
Petr Mladek, Rashmica Gupta, Sachin Sant, Scott Cheloha, Segher
Boessenkool, Stephen Rothwell, Uwe Kleine-König, Wolfram Sang, Xiu
Jianfeng, and Zhouyi Zhou.

* tag 'powerpc-6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (191 commits)
  powerpc/64e: Fix kexec build error
  EDAC/ppc_4xx: Include required of_irq header directly
  powerpc/pci: Fix PHB numbering when using opal-phbid
  powerpc/64: Init jump labels before parse_early_param()
  selftests/powerpc: Avoid GCC 12 uninitialised variable warning
  powerpc/cell/axon_msi: Fix refcount leak in setup_msi_msg_address
  powerpc/xive: Fix refcount leak in xive_get_max_prio
  powerpc/spufs: Fix refcount leak in spufs_init_isolated_loader
  powerpc/perf: Include caps feature for power10 DD1 version
  powerpc: add support for syscall stack randomization
  powerpc: Move system_call_exception() to syscall.c
  powerpc/powernv: rename remaining rng powernv_ functions to pnv_
  powerpc/powernv/kvm: Use darn for H_RANDOM on Power9
  powerpc/powernv: Avoid crashing if rng is NULL
  selftests/powerpc: Fix matrix multiply assist test
  powerpc/signal: Update comment for clarity
  powerpc: make facility_unavailable_exception 64s
  powerpc/platforms/83xx/suspend: Remove write-only global variable
  powerpc/platforms/83xx/suspend: Prevent unloading the driver
  powerpc/platforms/83xx/suspend: Reorder to get rid of a forward declaration
  ...
2022-08-06 16:38:17 -07:00
Linus Torvalds
3bd6e5854b asm-generic: updates for 6.0
There are three independent sets of changes:
 
  - Sai Prakash Ranjan adds tracing support to the asm-generic
    version of the MMIO accessors, which is intended to help
    understand problems with device drivers and has been part
    of Qualcomm's vendor kernels for many years.
 
  - A patch from Sebastian Siewior to rework the handling of
    IRQ stacks in softirqs across architectures, which is
    needed for enabling PREEMPT_RT.
 
  - The last patch to remove the CONFIG_VIRT_TO_BUS option and
    some of the code behind that, after the last users of this
    old interface made it in through the netdev, scsi, media and
    staging trees.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmLqPPEACgkQmmx57+YA
 GNlUbQ/+NpIsiA0JUrCGtySt8KrLHdA2dH9lJOR5/iuxfphscPFfWtpcPvcXQWmt
 a8u7wyI8SHW1ku4U0Y5sO0dBSldDnoIqJ5t4X5d7YNU9yVtEtucqQhZf+GkrPlVD
 1HkRu05B7y0k2BMn7BLhSvkpafs3f1lNGXjs8oFBdOF1/zwp/GjcrfCK7KFzqjwU
 dYrX0SOFlKFd4BZC75VfK+XcKg4LtwIOmJraRRl7alz2Q5Oop2hgjgZxXDPf//vn
 SPOhXJN/97i1FUpY2TkfHVH1NxbPfjCV4pUnjmLG0Y4NSy9UQ/ZcXHcywIdeuhfa
 0LySOIsAqBeccpYYYdg2ubiMDZOXkBfANu/sB9o/EhoHfB4svrbPRDhBIQZMFXJr
 MJYu+IYce2rvydA/nydo4q++pxR8v1ES1ZIo8bDux+q1CI/zbpQV+f98kPVRA0M7
 ajc+5GTIqNIsvHzzadq7eYxcj5Bi8Li2JA9sVkAQ+6iq1TVyeYayMc9eYwONlmqw
 MD+PFYc651pKtXZCfkLXPIKSwS0uPqBndAibuVhpZ0hxWaCBBdKvY9mrWcPxt0kA
 tMR8lrosbbrV2K48BFdWTOHvCs2FhHQxPGVPZ/iWuxTA0hHZ9tUlaEkSX+VM57IU
 KCYQLdWzT8J9vrgqSbgYKlb6pSPz6FIjTfut6NZMmshIbavHV/Q=
 =aTR0
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic updates from Arnd Bergmann:
 "There are three independent sets of changes:

   - Sai Prakash Ranjan adds tracing support to the asm-generic version
     of the MMIO accessors, which is intended to help understand
     problems with device drivers and has been part of Qualcomm's vendor
     kernels for many years

   - A patch from Sebastian Siewior to rework the handling of IRQ stacks
     in softirqs across architectures, which is needed for enabling
     PREEMPT_RT

   - The last patch to remove the CONFIG_VIRT_TO_BUS option and some of
     the code behind that, after the last users of this old interface
     made it in through the netdev, scsi, media and staging trees"

* tag 'asm-generic-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  uapi: asm-generic: fcntl: Fix typo 'the the' in comment
  arch/*/: remove CONFIG_VIRT_TO_BUS
  soc: qcom: geni: Disable MMIO tracing for GENI SE
  serial: qcom_geni_serial: Disable MMIO tracing for geni serial
  asm-generic/io: Add logging support for MMIO accessors
  KVM: arm64: Add a flag to disable MMIO trace for nVHE KVM
  lib: Add register read/write tracing support
  drm/meson: Fix overflow implicit truncation warnings
  irqchip/tegra: Fix overflow implicit truncation warnings
  coresight: etm4x: Use asm-generic IO memory barriers
  arm64: io: Use asm-generic high level MMIO accessors
  arch/*: Disable softirq stacks on PREEMPT_RT.
2022-08-05 10:07:23 -07:00
Linus Torvalds
c1c76700a0 SPDX changes for 6.0-rc1
Here is the set of SPDX comment updates for 6.0-rc1.
 
 Nothing huge here, just a number of updated SPDX license tags and
 cleanups based on the review of a number of common patterns in GPLv2
 boilerplate text.  Also included in here are a few other minor updates,
 2 USB files, and one Documentation file update to get the SPDX lines
 correct.
 
 All of these have been in the linux-next tree for a very long time.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYupz3g8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynPUgCgslaf2ssCgW5IeuXbhla+ZBRAzisAnjVgOvLN
 4AKdqbiBNlFbCroQwmeQ
 =v1sg
 -----END PGP SIGNATURE-----

Merge tag 'spdx-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx

Pull SPDX updates from Greg KH:
 "Here is the set of SPDX comment updates for 6.0-rc1.

  Nothing huge here, just a number of updated SPDX license tags and
  cleanups based on the review of a number of common patterns in GPLv2
  boilerplate text.

  Also included in here are a few other minor updates, two USB files,
  and one Documentation file update to get the SPDX lines correct.

  All of these have been in the linux-next tree for a very long time"

* tag 'spdx-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: (28 commits)
  Documentation: samsung-s3c24xx: Add blank line after SPDX directive
  x86/crypto: Remove stray comment terminator
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_406.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_398.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_391.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_390.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_385.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_320.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_319.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_318.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_298.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_292.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_179.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_168.RULE (part 2)
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_168.RULE (part 1)
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_160.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_152.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_149.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_147.RULE
  treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_133.RULE
  ...
2022-08-04 12:12:54 -07:00
Michael Ellerman
f4b39e88b4 powerpc/pci: Fix PHB numbering when using opal-phbid
The recent change to the PHB numbering logic has a logic error in the
handling of "ibm,opal-phbid".

When an "ibm,opal-phbid" property is present, &prop is written to and
ret is set to zero.

The following call to of_alias_get_id() is skipped because ret == 0.

But then the if (ret >= 0) is true, and the body of that if statement
sets prop = ret which throws away the value that was just read from
"ibm,opal-phbid".

Fix the logic by only doing the ret >= 0 check in the of_alias_get_id()
case.

Fixes: 0fe1e96fef ("powerpc/pci: Prefer PCI domain assignment via DT 'linux,pci-domain' and alias")
Reviewed-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220802105723.1055178-1-mpe@ellerman.id.au
2022-08-02 22:30:06 +10:00
Yury Norov
3e73120315 powerpc: drop dependency on <asm/machdep.h> in archrandom.h
archrandom.h includes <asm/machdep.h> to refer ppc_md. This causes
circular header dependency, if generic nodemask.h  includes random.h:

In file included from include/linux/cred.h:16,
                 from include/linux/seq_file.h:13,
                 from arch/powerpc/include/asm/machdep.h:6,
                 from arch/powerpc/include/asm/archrandom.h:5,
                 from include/linux/random.h:109,
                 from include/linux/nodemask.h:97,
                 from include/linux/list_lru.h:12,
                 from include/linux/fs.h:13,
                 from include/linux/compat.h:17,
                 from arch/powerpc/kernel/asm-offsets.c:12:
include/linux/sched.h:1203:9: error: unknown type name 'nodemask_t'
 1203 |         nodemask_t                      mems_allowed;
      |         ^~~~~~~~~~

Fix it by removing <asm/machdep.h> dependency from archrandom.h

Now as arch_get_random_seed_long() moved to c-file, and not exported,
it's not available for modules. As Michael Ellerman says:

  I think we actually don't need it exported to modules, I think it's
  a private detail of the RNG <-> architecture interface, not something
  that modules should be calling.

CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Michael Ellerman <mpe@ellerman.id.au>
CC: Paul Mackerras <paulus@samba.org>
CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
CC: Stephen Rothwell <sfr@canb.auug.org.au>
CC: linuxppc-dev@lists.ozlabs.org
Suggested-by: Michael Ellerman <mpe@ellerman.id.au> (for non-exporting)
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-08-01 08:13:21 -07:00
Zhouyi Zhou
ca829e05d3 powerpc/64: Init jump labels before parse_early_param()
On 64-bit, calling jump_label_init() in setup_feature_keys() is too
late because static keys may be used in subroutines of
parse_early_param() which is again subroutine of early_init_devtree().

For example booting with "threadirqs":

  static_key_enable_cpuslocked(): static key '0xc000000002953260' used before call to jump_label_init()
  WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:166 static_key_enable_cpuslocked+0xfc/0x120
  ...
  NIP static_key_enable_cpuslocked+0xfc/0x120
  LR  static_key_enable_cpuslocked+0xf8/0x120
  Call Trace:
    static_key_enable_cpuslocked+0xf8/0x120 (unreliable)
    static_key_enable+0x30/0x50
    setup_forced_irqthreads+0x28/0x40
    do_early_param+0xa0/0x108
    parse_args+0x290/0x4e0
    parse_early_options+0x48/0x5c
    parse_early_param+0x58/0x84
    early_init_devtree+0xd4/0x518
    early_setup+0xb4/0x214

So call jump_label_init() just before parse_early_param() in
early_init_devtree().

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
[mpe: Add call trace to change log and minor wording edits.]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220726015747.11754-1-zhouzhouyi@gmail.com
2022-08-01 22:21:19 +10:00
Ben Dooks
787dbea11a profile: setup_profiling_timer() is moslty not implemented
The setup_profiling_timer() is mostly un-implemented by many
architectures.  In many places it isn't guarded by CONFIG_PROFILE which is
needed for it to be used.  Make it a weak symbol in kernel/profile.c and
remove the 'return -EINVAL' implementations from the kenrel.

There are a couple of architectures which do return 0 from the
setup_profiling_timer() function but they don't seem to do anything else
with it.  To keep the /proc compatibility for now, leave these for a
future update or removal.

On ARM, this fixes the following sparse warning:
arch/arm/kernel/smp.c:793:5: warning: symbol 'setup_profiling_timer' was not declared. Should it be static?

Link: https://lkml.kernel.org/r/20220721195509.418205-1-ben-linux@fluff.org
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-29 18:12:36 -07:00
Xiu Jianfeng
f4a0318f27 powerpc: add support for syscall stack randomization
Add support for adding a random offset to the stack while handling
syscalls. This patch uses mftb() instead of get_random_int() for better
performance.

In order to avoid unconditional stack canaries on syscall entry (due to
the use of alloca()), also disable stack protector to avoid triggering
needless checks and slowing down the entry path. As there is no general
way to control stack protector coverage with a function attribute, this
must be disabled at the compilation unit level.

Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220701082435.126596-3-xiujianfeng@huawei.com
2022-07-28 16:22:15 +10:00
Xiu Jianfeng
1547db7d1f powerpc: Move system_call_exception() to syscall.c
This is a lead-up patch to enable syscall stack randomization, which
uses alloca() and makes the compiler add unconditional stack canaries
on syscall entry. In order to avoid triggering needless checks and
slowing down the entry path, the feature needs to disable stack
protector at the compilation unit level as there is no general way to
control stack protector coverage with a function attribute.

So move system_call_exception() to syscall.c to avoid affecting other
functions in interrupt.c.

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220701082435.126596-2-xiujianfeng@huawei.com
2022-07-28 16:22:15 +10:00
Rashmica Gupta
e4787e71ae powerpc/signal: Update comment for clarity
The comment being referred to was deleted in commit af1bbc3dd3 ("powerpc:
Remove UP only lazy floating point and vector optimisations").

Add a bit more detail so it's clear why we need to clear the FP/VEC/VSX
bits here.

Signed-off-by: Rashmica Gupta <rashmica@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220617043135.426897-1-rashmica@linux.ibm.com
2022-07-28 16:22:14 +10:00
Rashmica Gupta
fcdb758ce1 powerpc: make facility_unavailable_exception 64s
The facility unavailable exception is only available on ppc book3s
machines so use CONFIG_PPC_BOOK3S_64 rather than CONFIG_PPC64.
tm_unavailable is only called from facility_unavailable_exception so can
also be under this Kconfig symbol.

Signed-off-by: Rashmica Gupta <rashmica@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220617042805.426231-1-rashmica@linux.ibm.com
2022-07-28 16:22:14 +10:00
Pali Rohár
0fe1e96fef powerpc/pci: Prefer PCI domain assignment via DT 'linux,pci-domain' and alias
Other Linux architectures use DT property 'linux,pci-domain' for
specifying fixed PCI domain of PCI controller specified in Device-Tree.

And lot of Freescale powerpc boards have defined numbered pci alias in
Device-Tree for every PCIe controller which number specify preferred PCI
domain.

So prefer usage of DT property 'linux,pci-domain' (via function
of_get_pci_domain_nr()) and DT pci alias (via function
of_alias_get_id()) on powerpc architecture for assigning PCI domain to
PCI controller.

Fixes: 63a72284b1 ("powerpc/pci: Assign fixed PHB number based on device-tree properties")
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220706102148.5060-2-pali@kernel.org
2022-07-28 16:22:13 +10:00
Alexey Kardashevskiy
d80f6de9d6 powerpc/iommu: Fix iommu_table_in_use for a small default DMA window case
The existing iommu_table_in_use() helper checks if the kernel is using
any of TCEs. There are some reserved TCEs:
1) the very first one if DMA window starts from 0 to avoid having a zero
but still valid DMA handle;
2) it_reserved_start..it_reserved_end to exclude MMIO32 window in case
the default window spans across that - this is the default for the first
DMA window on PowerNV.

When 1) is the case and 2) is not the helper does not skip 1) and returns
wrong status.

This only seems occurring when passing through a PCI device to a nested
guest (not something we support really well) so it has not been seen
before.

This fixes the bug by adding a special case for no MMIO32 reservation.

Fixes: 3c33066a21 ("powerpc/kernel/iommu: Add new iommu_table_in_use() helper")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220714081119.3714605-1-aik@ozlabs.ru
2022-07-28 16:22:13 +10:00
Hari Bathini
c7255058b5 powerpc/crash: save cpu register data in crash_smp_send_stop()
During kdump, two set of NMI IPIs are sent to secondary CPUs, if
'crash_kexec_post_notifiers' option is set. The first set of NMI IPIs
to stop the CPUs and the other set to collect register data. Instead,
capture register data for secondary CPUs while stopping them itself.
Also, fallback to smp_send_stop() in case the function gets called
without kdump configured.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220630064942.192283-1-hbathini@linux.ibm.com
2022-07-28 16:22:13 +10:00
Christophe Leroy
4d5c5bad51 powerpc: Remove asm/prom.h from asm/mpc52xx.h and asm/pci.h
asm/pci.h and asm/mpc52xx.h don't need asm/prom.h

Declare struct device_node locally to avoid including of.h

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Add missing include of prom.h to of_rtc.c]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/cf5243343e2364c2b40f22ee5ad9a6e2453d1121.1657264228.git.christophe.leroy@csgroup.eu
2022-07-28 16:22:12 +10:00
Christophe Leroy
62ccae7882 powerpc: Remove remaining parts of oprofile
Commit 9850b6c693 ("arch: powerpc: Remove oprofile") removed
oprofile.

Remove all remaining parts of it.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/298432fe1a14c0a415760011d72c3f0999efd5e2.1657204631.git.christophe.leroy@csgroup.eu
2022-07-27 21:36:05 +10:00
Christophe Leroy
2a0fb3c155 powerpc/32: Set an IBAT covering up to _einittext during init
Always set an IBAT covering up to _einittext during init because when
CONFIG_MODULES is not selected there is no reason to have an exception
handler for kernel instruction TLB misses.

It implies DBAT and IBAT are now totaly independent, IBATs are set
by setibat() and DBAT by setbat().

This allows to revert commit 9bb162fa26 ("powerpc/603: Fix
boot failure with DEBUG_PAGEALLOC and KFENCE")

Reported-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/ce7f04a39593934d9b1ee68c69144ccd3d4da4a1.1655202804.git.christophe.leroy@csgroup.eu
2022-07-27 21:36:05 +10:00
Nicholas Piggin
f57261e698 powerpc/mce: use early_cpu_to_node() in mce_init()
cpu_to_node() is not yet available (setup_arch() is called before
setup_per_cpu_areas() by start_kernel()).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220711030653.150950-1-npiggin@gmail.com
2022-07-27 21:36:04 +10:00
Nicholas Piggin
28f07fab26 powerpc/vdso: Fix __kernel_sync_dicache sequence with coherent icache
Processors with coherent icache require the sequence sync ; icbi ; isync
to entire store->execute coherency. icbi (to any address) must be
executed to ensure isync flushes the pipeline. See "POWER9 Processor
User's Manual, 4.6.2.2 Instruction Cache Block Invalidate (icbi)" for
details.

__kernel_sync_dicache is missing icbi for the coherent icache path.
Add it.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220520123649.258440-1-npiggin@gmail.com
2022-07-27 21:36:04 +10:00
Pali Rohár
5663568130 powerpc/pci: Add config option for using all 256 PCI buses
By default on PPC32 PCI bus numbers are unique across all PCI domains.
So a system could have only 256 PCI buses independently of available PCI
domains.

This is due to filling DT property pci-OF-bus-map which does not support
a multi-domain setup.

On all powerpc platforms except chrp and powermac there is no DT
property pci-OF-bus-map anymore and therefore it is possible on
non-chrp/powermac platforms to avoid this limitation of maximum number
of 256 PCI buses in a system even on multi-domain setup.

But avoiding this limitation would mean that all PCI and PCIe devices
would be present on completely different BDF addresses as every PCI
domain starts numbering PCI bueses from zero (instead of the last bus
number of previous enumerated PCI domain). Such change could break
existing software which expects fixed PCI bus numbers.

So add a new config option CONFIG_PPC_PCI_BUS_NUM_DOMAIN_DEPENDENT which
enables this change. By default it is disabled. It causes the initial
value of hose->first_busno to be zero.

Signed-off-by: Pali Rohár <pali@kernel.org>
[mpe: Minor change log wording]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220706104308.5390-6-pali@kernel.org
2022-07-27 21:36:04 +10:00
Pali Rohár
7f102d6198 powerpc/pci: Disable filling pci-OF-bus-map for non-chrp/powermac
Creating or filling pci-OF-bus-map property in the device-tree is
deprecated since May 2006 [1] and was used only in old platforms like
PowerMac.

Currently kernel code handles it only for chrp and powermac code. So
completely disable filling pci-OF-bus-map property for non-chrp and
non-powermac platforms.

[1] - https://lore.kernel.org/linuxppc-dev/1148016268.13249.14.camel@localhost.localdomain/

Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220706104308.5390-5-pali@kernel.org
2022-07-27 21:36:04 +10:00
Pali Rohár
7045445887 powerpc/pci: Hide pci_create_OF_bus_map() for non-chrp code
Function pci_create_OF_bus_map() is used only in chrp code.
So hide it from all other platforms.

Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220706104308.5390-4-pali@kernel.org
2022-07-27 21:36:03 +10:00
Pali Rohár
407a767182 powerpc/pci: Make pcibios_make_OF_bus_map() static
Function pcibios_make_OF_bus_map() is used only in pci_32.c. So make it
static.

Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220706104308.5390-3-pali@kernel.org
2022-07-27 21:36:03 +10:00
Pali Rohár
a2954a7e47 powerpc/pci: Hide pci_device_from_OF_node() for non-powermac code
Function pci_device_from_OF_node() is used only in powermac code. So
hide it from all other platforms.

Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220706104308.5390-2-pali@kernel.org
2022-07-27 21:36:03 +10:00
Laurent Dufour
f5e74e8360 powerpc/watchdog: introduce a NMI watchdog's factor
Introduce a factor which would apply to the NMI watchdog timeout.

This factor is a percentage added to the watchdog_tresh value. The value is
set under the watchdog_mutex protection and lockup_detector_reconfigure()
is called to recompute wd_panic_timeout_tb.

Once the factor is set, it remains until it is set back to 0, which means
no impact.

Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220713154729.80789-4-ldufour@linux.ibm.com
2022-07-27 21:36:02 +10:00
Michael Ellerman
2b461880c2 powerpc: Fix all occurences of duplicate words
Since commit 87c78b612f ("powerpc: Fix all occurences of "the the"")
fixed "the the", there's now a steady stream of patches fixing other
duplicate words.

Just fix them all at once, to save the overhead of dealing with
individual patches for each case.

This leaves a few cases of "that that", which in some contexts is
correct.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220718095158.326606-1-mpe@ellerman.id.au
2022-07-25 12:05:15 +10:00
Michael Ellerman
e7c45a0845 Merge branch 'fixes' into next
Bring in a build fix for GCC12 from our fixes branch.
2022-07-25 12:04:44 +10:00
Michael Ellerman
be640317a1 powerpc/64s: Disable stack variable initialisation for prom_init
With GCC 12 allmodconfig prom_init fails to build:

  Error: External symbol 'memset' referenced from prom_init.c
  make[2]: *** [arch/powerpc/kernel/Makefile:204: arch/powerpc/kernel/prom_init_check] Error 1

The allmodconfig build enables KASAN, so all calls to memset in
prom_init should be converted to __memset by the #ifdefs in
asm/string.h, because prom_init must use the non-KASAN instrumented
versions.

The build failure happens because there's a call to memset that hasn't
been caught by the pre-processor and converted to __memset. Typically
that's because it's a memset generated by the compiler itself, and that
is the case here.

With GCC 12, allmodconfig enables CONFIG_INIT_STACK_ALL_PATTERN, which
causes the compiler to emit memset calls to initialise on-stack
variables with a pattern.

Because prom_init is non-user-facing boot-time only code, as a
workaround just disable stack variable initialisation to unbreak the
build.

Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220718134418.354114-1-mpe@ellerman.id.au
2022-07-20 15:13:02 +10:00
Michael Ellerman
ac2a230301 Merge branch 'topic/ppc-kvm' into next
Merge KVM related commits we are keeping in a topic branch in case of
any conflicts with generic KVM changes.
2022-07-09 19:32:48 +10:00
Michael Ellerman
7e74dabc3d Merge branch 'fixes' into next
Merge our fixes branch. In particular this brings in commit
9864816180 ("powerpc/book3e: Fix PUD allocation size in
map_kernel_page()") which fixes a build failure in next, because commit
2db2008e63 ("powerpc/64e: Rewrite p4d_populate() as a static inline
function") depends on it.
2022-07-09 19:29:34 +10:00
Michael Ellerman
2a83afe72a powerpc/64: Drop ppc_inst_as_str()
The ppc_inst_as_str() macro tries to make printing variable length,
aka "prefixed", instructions convenient. It mostly succeeds, but it does
hide an on-stack buffer, which triggers stack protector.

More problematically it doesn't compile at all with GCC 12,
with -Wdangling-pointer, due to the fact that it returns the char buffer
declared inside the macro:

  arch/powerpc/kernel/trace/ftrace.c: In function '__ftrace_modify_call':
  ./include/linux/printk.h:475:44: error: using a dangling pointer to '__str' [-Werror=dangling-pointer=]
    475 | #define printk(fmt, ...) printk_index_wrap(_printk, fmt, ##__VA_ARGS__)
    ...
  arch/powerpc/kernel/trace/ftrace.c:567:17: note: in expansion of macro 'pr_err'
    567 |                 pr_err("Not expected bl: opcode is %s\n", ppc_inst_as_str(op));
        |                 ^~~~~~
  ./arch/powerpc/include/asm/inst.h:156:14: note: '__str' declared here
    156 |         char __str[PPC_INST_STR_LEN];   \
        |              ^~~~~

This could be fixed by having the caller declare the buffer, but in some
places there'd need to be two buffers. In all cases where
ppc_inst_as_str() is used the output is not really meant for user
consumption, it's almost always indicative of a kernel bug.

A simpler solution is to just print the value as an unsigned long. For
normal instructions the output is identical. For prefixed instructions
the value is printed as a single 64-bit quantity, whereas previously the
low half was printed first. But that is good enough for debug output,
especially as prefixed instructions will be rare in kernel code in
practice.

Old:
  c000000000111170  60420000      ori     r2,r2,0
  c000000000111174  04100001 e580fb00     .long 0xe580fb0004100001

New:
  c00000000010f90c  60420000      ori     r2,r2,0
  c00000000010f910  e580fb0004100001      .long 0xe580fb0004100001

Reported-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reported-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://lore.kernel.org/r/20220531065936.3674348-1-mpe@ellerman.id.au
2022-06-29 19:37:07 +10:00
Fabiano Rosas
3f8ed993be KVM: PPC: Book3S HV: Add a new config for P8 debug timing
Turn the existing Kconfig KVM_BOOK3S_HV_EXIT_TIMING into
KVM_BOOK3S_HV_P8_TIMING in preparation for the addition of a new
config for P9 timings.

This applies only to P8 code, the generic timing code is still kept
under KVM_BOOK3S_HV_EXIT_TIMING.

Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220525130554.2614394-3-farosas@linux.ibm.com
2022-06-29 19:21:21 +10:00
Christophe Leroy
c7b9ed7c34 powerpc/64e: KASAN Full support for BOOK3E/64
We now have memory organised in a way that allows
implementing KASAN.

Unlike book3s/64, book3e always has translation active so the only
thing needed to use KASAN is to setup an early zero shadow mapping
just after setting a stack pointer and before calling early_setup().

The memory layout is now as follows

   +------------------------+  Kernel virtual map end (0xc000200000000000)
   |                        |
   |    16TB of KASAN map   |
   |                        |
   +------------------------+  Kernel KASAN shadow map start
   |                        |
   |    16TB of IO map      |
   |                        |
   +------------------------+  Kernel IO map start
   |                        |
   |    16TB of vmemmap     |
   |                        |
   +------------------------+  Kernel vmemmap start
   |                        |
   |    16TB of vmap        |
   |                        |
   +------------------------+  Kernel virt start (0xc000100000000000)
   |                        |
   |    64TB of linear mem  |
   |                        |
   +------------------------+  Kernel linear (0xc.....)

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/0bef8beda27baf71e3b9e8b13e620fba6e19499b.1656427701.git.christophe.leroy@csgroup.eu
2022-06-29 17:04:15 +10:00
Christophe Leroy
3adfb457b8 powerpc/64e: Remove MMU_FTR_USE_TLBRSRV and MMU_FTR_USE_PAIRED_MAS
Commit fb5a515704 ("powerpc: Remove platforms/wsp and associated
pieces") removed the last CPU having features MMU_FTRS_A2 and
commit cd68098bce ("powerpc: Clean up MMU_FTRS_A2 and
MMU_FTR_TYPE_3E") removed MMU_FTRS_A2 which was the last user of
MMU_FTR_USE_TLBRSRV and MMU_FTR_USE_PAIRED_MAS.

Remove all code that relies on MMU_FTR_USE_TLBRSRV and
MMU_FTR_USE_PAIRED_MAS.

With this change done, TLB miss can happen before the mmu feature
fixups.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/cfd5a0ecdb1598da968832e1bddf7431ec267200.1656427701.git.christophe.leroy@csgroup.eu
2022-06-29 17:04:14 +10:00
Christophe Leroy
78f1c24abd powerpc/irq: Simplify __do_irq()
Remove duplicated code by implementing a proper if/else.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/5a3b21311191f1240850db6ab29b19ac7885fe03.1654769775.git.christophe.leroy@csgroup.eu
2022-06-29 16:58:31 +10:00
Christophe Leroy
e90855be9e powerpc/irq: Perform stack_overflow detection after switching to IRQ stack
When KASAN is enabled, as shown by the Oops below, the 2k limit is not
enough to allow stack dump after a stack overflow detection when
CONFIG_DEBUG_STACKOVERFLOW is selected:

	do_IRQ: stack overflow: 1984
	CPU: 0 PID: 126 Comm: systemd-udevd Not tainted 5.18.0-gentoo-PMacG4 #1
	Call Trace:
	Oops: Kernel stack overflow, sig: 11 [#1]
	BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
	Modules linked in: sr_mod cdrom radeon(+) ohci_pci(+) hwmon i2c_algo_bit drm_ttm_helper ttm drm_dp_helper snd_aoa_i2sbus snd_aoa_soundbus snd_pcm ehci_pci snd_timer ohci_hcd snd ssb ehci_hcd 8250_pci soundcore drm_kms_helper pcmcia 8250 pcmcia_core syscopyarea usbcore sysfillrect 8250_base sysimgblt serial_mctrl_gpio fb_sys_fops usb_common pkcs8_key_parser fuse drm drm_panel_orientation_quirks configfs
	CPU: 0 PID: 126 Comm: systemd-udevd Not tainted 5.18.0-gentoo-PMacG4 #1
	NIP:  c02e5558 LR: c07eb3bc CTR: c07f46a8
	REGS: e7fe9f50 TRAP: 0000   Not tainted  (5.18.0-gentoo-PMacG4)
	MSR:  00001032 <ME,IR,DR,RI>  CR: 44a14824  XER: 20000000

	GPR00: c07eb3bc eaa1c000 c26baea0 eaa1c0a0 00000008 00000000 c07eb3bc eaa1c010
	GPR08: eaa1c0a8 04f3f3f3 f1f1f1f1 c07f4c84 44a14824 0080f7e4 00000005 00000010
	GPR16: 00000025 eaa1c154 eaa1c158 c0dbad64 00000020 fd543810 eaa1c0a0 eaa1c29e
	GPR24: c0dbad44 c0db8740 05ffffff fd543802 eaa1c150 c0c9a3c0 eaa1c0a0 c0c9a3c0
	NIP [c02e5558] kasan_check_range+0xc/0x2b4
	LR [c07eb3bc] format_decode+0x80/0x604
	Call Trace:
	[eaa1c000] [c07eb3bc] format_decode+0x80/0x604 (unreliable)
	[eaa1c070] [c07f4dac] vsnprintf+0x128/0x938
	[eaa1c110] [c07f5788] sprintf+0xa0/0xc0
	[eaa1c180] [c0154c1c] __sprint_symbol.constprop.0+0x170/0x198
	[eaa1c230] [c07ee71c] symbol_string+0xf8/0x260
	[eaa1c430] [c07f46d0] pointer+0x15c/0x710
	[eaa1c4b0] [c07f4fbc] vsnprintf+0x338/0x938
	[eaa1c550] [c00e8fa0] vprintk_store+0x2a8/0x678
	[eaa1c690] [c00e94e4] vprintk_emit+0x174/0x378
	[eaa1c6d0] [c00ea008] _printk+0x9c/0xc0
	[eaa1c750] [c000ca94] show_stack+0x21c/0x260
	[eaa1c7a0] [c07d0bd4] dump_stack_lvl+0x60/0x90
	[eaa1c7c0] [c0009234] __do_IRQ+0x170/0x174
	[eaa1c800] [c0009258] do_IRQ+0x20/0x34
	[eaa1c820] [c00045b4] HardwareInterrupt_virt+0x108/0x10c
...

As the detection is asynchronously performed at IRQs, we could be long
after the limit has been crossed, so increasing the limit would not
solve the problem completely.

In order to be sure that there is enough stack space for the stack
dump, do it after the switch to the IRQ stack. That way it is sure
that the stack is large enough, unless the IRQ stack has been
overfilled in which case the end of life is close.

Reported-by: Erhard Furtner <erhard_f@mailbox.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c215d714329f475b431a6193369035aadfc0d182.1654769775.git.christophe.leroy@csgroup.eu
2022-06-29 16:58:31 +10:00
Christophe Leroy
051bd351a2 powerpc/irq: Make __do_irq() static
Since commit 48cf12d889 ("powerpc/irq: Inline call_do_irq() and
call_do_softirq()"), __do_irq() is not used outside irq.c

Reorder functions and make __do_irq() static and
drop the declaration in irq.h.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/adbe1c8315ec2d63259f41468e82e51677bb1eda.1654769775.git.christophe.leroy@csgroup.eu
2022-06-29 16:58:27 +10:00
Christophe Leroy
41f20d6db2 powerpc/irq: Increase stack_overflow detection limit when KASAN is enabled
When KASAN is enabled, as shown by the Oops below, the 2k limit is not
enough to allow stack dump after a stack overflow detection when
CONFIG_DEBUG_STACKOVERFLOW is selected:

	do_IRQ: stack overflow: 1984
	CPU: 0 PID: 126 Comm: systemd-udevd Not tainted 5.18.0-gentoo-PMacG4 #1
	Call Trace:
	Oops: Kernel stack overflow, sig: 11 [#1]
	BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
	Modules linked in: sr_mod cdrom radeon(+) ohci_pci(+) hwmon i2c_algo_bit drm_ttm_helper ttm drm_dp_helper snd_aoa_i2sbus snd_aoa_soundbus snd_pcm ehci_pci snd_timer ohci_hcd snd ssb ehci_hcd 8250_pci soundcore drm_kms_helper pcmcia 8250 pcmcia_core syscopyarea usbcore sysfillrect 8250_base sysimgblt serial_mctrl_gpio fb_sys_fops usb_common pkcs8_key_parser fuse drm drm_panel_orientation_quirks configfs
	CPU: 0 PID: 126 Comm: systemd-udevd Not tainted 5.18.0-gentoo-PMacG4 #1
	NIP:  c02e5558 LR: c07eb3bc CTR: c07f46a8
	REGS: e7fe9f50 TRAP: 0000   Not tainted  (5.18.0-gentoo-PMacG4)
	MSR:  00001032 <ME,IR,DR,RI>  CR: 44a14824  XER: 20000000

	GPR00: c07eb3bc eaa1c000 c26baea0 eaa1c0a0 00000008 00000000 c07eb3bc eaa1c010
	GPR08: eaa1c0a8 04f3f3f3 f1f1f1f1 c07f4c84 44a14824 0080f7e4 00000005 00000010
	GPR16: 00000025 eaa1c154 eaa1c158 c0dbad64 00000020 fd543810 eaa1c0a0 eaa1c29e
	GPR24: c0dbad44 c0db8740 05ffffff fd543802 eaa1c150 c0c9a3c0 eaa1c0a0 c0c9a3c0
	NIP [c02e5558] kasan_check_range+0xc/0x2b4
	LR [c07eb3bc] format_decode+0x80/0x604
	Call Trace:
	[eaa1c000] [c07eb3bc] format_decode+0x80/0x604 (unreliable)
	[eaa1c070] [c07f4dac] vsnprintf+0x128/0x938
	[eaa1c110] [c07f5788] sprintf+0xa0/0xc0
	[eaa1c180] [c0154c1c] __sprint_symbol.constprop.0+0x170/0x198
	[eaa1c230] [c07ee71c] symbol_string+0xf8/0x260
	[eaa1c430] [c07f46d0] pointer+0x15c/0x710
	[eaa1c4b0] [c07f4fbc] vsnprintf+0x338/0x938
	[eaa1c550] [c00e8fa0] vprintk_store+0x2a8/0x678
	[eaa1c690] [c00e94e4] vprintk_emit+0x174/0x378
	[eaa1c6d0] [c00ea008] _printk+0x9c/0xc0
	[eaa1c750] [c000ca94] show_stack+0x21c/0x260
	[eaa1c7a0] [c07d0bd4] dump_stack_lvl+0x60/0x90
	[eaa1c7c0] [c0009234] __do_IRQ+0x170/0x174
	[eaa1c800] [c0009258] do_IRQ+0x20/0x34
	[eaa1c820] [c00045b4] HardwareInterrupt_virt+0x108/0x10c
...

An investigation shows that on PPC32, calling dump_stack() requires
more than 1k when KASAN is not selected and a bit more than 2k bytes
when KASAN is selected.

On PPC64 the registers are twice the size of PPC32 registers, so the
need should be approximately twice the need on PPC32.

In the meantime we have THREAD_SIZE which is twice larger on PPC64
than PPC32, and twice larger when KASAN is selected.

So we can easily use the value of THREAD_SIZE to set the limit.

On PPC32, THREAD_SIZE is 8k without KASAN and 16k with KASAN.
On PPC64, THREAD_SIZE is 16k without KASAN.

To be on the safe side, leave 2k on PPC32 without KASAN, 4k with
KASAN, and 4k on PPC64 without KASAN. It means the limit should be
one fourth of THREAD_SIZE.

Reported-by: Erhard Furtner <erhard_f@mailbox.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e8b4eb82a126c3c6c352173a544fe94609ff660b.1654261687.git.christophe.leroy@csgroup.eu
2022-06-29 16:57:57 +10:00
Christophe Leroy
98552307e3 powerpc/irq64: Remove get_irq_happened()
No need to open code the read of local_paca->irq_happened in
assembly, we have READ_ONCE() for doing the same.

Replace get_irq_happened() by READ_ONCE(local_paca->irq_happened).

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/af511b53e4eb51f8fbc51eda7f5597175e68dce6.1652859593.git.christophe.leroy@csgroup.eu
2022-06-29 16:47:43 +10:00
Christophe Leroy
7d7b28b302 powerpc/irq: Split irq.c
More than half of irq.c is dedicated to PPC64.

Move PPC64 code out of irq.c into irq_64.c

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/9f1a47de80f78d3dd270a7a72f69f55f581c4054.1652859593.git.christophe.leroy@csgroup.eu
2022-06-29 16:47:42 +10:00
Christophe Leroy
e93dee186f powerpc: Don't include asm/ppc_asm.h in other headers
asm/ppc_asm.h is not needed in any of the header it is included.

It is only needed by irq.c. Include it there and remove it from
other headers.

word-at-a-time.h only need ex_table.h, so include it instead.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e2d7b96547037f852c7ed164e4f79e8918c2607a.1651828453.git.christophe.leroy@csgroup.eu
2022-06-29 16:45:13 +10:00
Christophe Leroy
46d60bdb12 powerpc: Include asm/firmware.h in all users of firmware_has_feature()
Trying to remove asm/ppc_asm.h from all places that don't need it
leads to several failures linked to firmware_has_feature().

To fix it, include asm/firmware.h in all files using
firmware_has_feature()

All users found with:

	git grep -L "firmware\.h" ` git grep -l "firmware_has_feature("`

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/11956ec181a034b51a881ac9c059eea72c679a73.1651828453.git.christophe.leroy@csgroup.eu
2022-06-29 16:45:05 +10:00
Liam Howlett
6886da5f49 powerpc/prom_init: Fix kernel config grep
When searching for config options, use the KCONFIG_CONFIG shell variable
so that builds using non-standard config locations work.

Fixes: 26deb04342 ("powerpc: prepare string/mem functions for KASAN")
Cc: stable@vger.kernel.org # v5.2+
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220624011745.4060795-1-Liam.Howlett@oracle.com
2022-06-24 13:47:26 +10:00
Christophe Leroy
113fe88eed powerpc: Don't include asm/setup.h in asm/machdep.h
asm/machdep.h doesn't need asm/setup.h

Remove it.

Add it directly in files that needs it.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/3b1dfb19a2c3265fb4abc2bfc7b6eae9261a998b.1654966508.git.christophe.leroy@csgroup.eu
2022-06-20 11:29:49 +10:00
Christophe Leroy
ca5dabcff1 powerpc/prom_init: Fix build failure with GCC_PLUGIN_STRUCTLEAK_BYREF_ALL and KASAN
When CONFIG_KASAN is selected, we expect prom_init to use __memset()
because it is too early to use memset().

But with CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL, the compiler adds calls
to memset() to clear objects on stack, hence the following failure:

	  PROMCHK arch/powerpc/kernel/prom_init_check
	Error: External symbol 'memset' referenced from prom_init.c
	make[2]: *** [arch/powerpc/kernel/Makefile:204 : arch/powerpc/kernel/prom_init_check] Erreur 1

prom_find_machine_type() is called from prom_init() and is called only
once, so lets put compat[] in BSS instead of stack to avoid that.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/3802811f7cf94f730be44688539c01bba3a3b5c0.1654875808.git.christophe.leroy@csgroup.eu
2022-06-19 21:57:56 +10:00
Andrew Donnellan
7bc08056a6 powerpc/rtas: Allow ibm,platform-dump RTAS call with null buffer address
Add a special case to block_rtas_call() to allow the ibm,platform-dump RTAS
call through the RTAS filter if the buffer address is 0.

According to PAPR, ibm,platform-dump is called with a null buffer address
to notify the platform firmware that processing of a particular dump is
finished.

Without this, on a pseries machine with CONFIG_PPC_RTAS_FILTER enabled, an
application such as rtas_errd that is attempting to retrieve a dump will
encounter an error at the end of the retrieval process.

Fixes: bd59380c5b ("powerpc/rtas: Restrict RTAS requests from userspace")
Cc: stable@vger.kernel.org
Reported-by: Sathvika Vasireddy <sathvika@linux.ibm.com>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220614134952.156010-1-ajd@linux.ibm.com
2022-06-18 10:19:10 +10:00
Naveen N. Rao
ec6d0dde71 powerpc: Enable execve syscall exit tracepoint
On execve[at], we are zero'ing out most of the thread register state
including gpr[0], which contains the syscall number. Due to this, we
fail to trigger the syscall exit tracepoint properly. Fix this by
retaining gpr[0] in the thread register state.

Before this patch:
  # tail /sys/kernel/debug/tracing/trace
	       cat-123     [000] .....    61.449351: sys_execve(filename:
  7fffa6b23448, argv: 7fffa6b233e0, envp: 7fffa6b233f8)
	       cat-124     [000] .....    62.428481: sys_execve(filename:
  7fffa6b23448, argv: 7fffa6b233e0, envp: 7fffa6b233f8)
	      echo-125     [000] .....    65.813702: sys_execve(filename:
  7fffa6b23378, argv: 7fffa6b233a0, envp: 7fffa6b233b0)
	      echo-125     [000] .....    65.822214: sys_execveat(fd: 0,
  filename: 1009ac48, argv: 7ffff65d0c98, envp: 7ffff65d0ca8, flags: 0)

After this patch:
  # tail /sys/kernel/debug/tracing/trace
	       cat-127     [000] .....   100.416262: sys_execve(filename:
  7fffa41b3448, argv: 7fffa41b33e0, envp: 7fffa41b33f8)
	       cat-127     [000] .....   100.418203: sys_execve -> 0x0
	      echo-128     [000] .....   103.873968: sys_execve(filename:
  7fffa41b3378, argv: 7fffa41b33a0, envp: 7fffa41b33b0)
	      echo-128     [000] .....   103.875102: sys_execve -> 0x0
	      echo-128     [000] .....   103.882097: sys_execveat(fd: 0,
  filename: 1009ac48, argv: 7fffd10d2148, envp: 7fffd10d2158, flags: 0)
	      echo-128     [000] .....   103.883225: sys_execveat -> 0x0

Cc: stable@vger.kernel.org
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Tested-by: Sumit Dubey2 <Sumit.Dubey2@ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220609103328.41306-1-naveen.n.rao@linux.vnet.ibm.com
2022-06-18 10:19:10 +10:00
Michael Ellerman
6cf06c17e9 powerpc/mm: Move CMA reservations after initmem_init()
After commit 11ac3e87ce ("mm: cma: use pageblock_order as the single
alignment") there is an error at boot about the KVM CMA reservation
failing, eg:

  kvm_cma_reserve: reserving 6553 MiB for global area
  cma: Failed to reserve 6553 MiB

That makes it impossible to start KVM guests using the hash MMU with
more than 2G of memory, because the VM is unable to allocate a large
enough region for the hash page table, eg:

  $ qemu-system-ppc64 -enable-kvm -M pseries -m 4G ...
  qemu-system-ppc64: Failed to allocate KVM HPT of order 25: Cannot allocate memory

Aneesh pointed out that this happens because when kvm_cma_reserve() is
called, pageblock_order has not been initialised yet, and is still zero,
causing the checks in cma_init_reserved_mem() against
CMA_MIN_ALIGNMENT_PAGES to fail.

Fix it by moving the call to kvm_cma_reserve() after initmem_init(). The
pageblock_order is initialised in sparse_init() which is called from
initmem_init().

Also move the hugetlb CMA reservation.

Fixes: 11ac3e87ce ("mm: cma: use pageblock_order as the single alignment")
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220616120033.1976732-1-mpe@ellerman.id.au
2022-06-18 10:18:55 +10:00
Sebastian Andrzej Siewior
f2c5092190 arch/*: Disable softirq stacks on PREEMPT_RT.
PREEMPT_RT preempts softirqs and the current implementation avoids
do_softirq_own_stack() and only uses __do_softirq().

Disable the unused softirqs stacks on PREEMPT_RT to save some memory and
ensure that do_softirq_own_stack() is not used bwcause it is not expected.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-15 17:40:59 +02:00
Thomas Gleixner
577b61cee5 treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_398.RULE
Based on the normalized pattern:

    this file is licensed under the terms of the gnu general public
    license version 2 this program as licensed as is without any warranty
    of any kind whether express or implied

extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

has been chosen to replace the boilerplate/reference.

Reviewed-by: Allison Randal <allison@lohutok.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-10 14:51:37 +02:00
Linus Torvalds
95fc76c81b powerpc fixes for 5.19 #2
- On 32-bit fix overread/overwrite of thread_struct via ptrace PEEK/POKE.
 
  - Fix softirqs not switching to the softirq stack since we moved irq_exit().
 
  - Force thread size increase when KASAN is enabled to avoid stack overflows.
 
  - On Book3s 64 mark more code as not to be instrumented by KASAN to avoid crashes.
 
  - Exempt __get_wchan() from KASAN checking, as it's inherently racy.
 
  - Fix a recently introduced crash in the papr_scm driver in some configurations.
 
  - Remove include of <generated/compile.h> which is forbidden.
 
 Thanks to: Ariel Miculas, Chen Jingwen, Christophe Leroy, Erhard Furtner, He Ying, Kees
 Cook, Masahiro Yamada, Nageswara R Sastry, Paul Mackerras, Sachin Sant, Vaibhav Jain,
 Wanming Hu.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmKiBo8THG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgKnND/9wmr3nlA9gaXYCFyfnH5m8u/x7gNFs
 yW150WbcF/9FkhGwGED1kozyCrKqNt9UmBdyEqDjdtHE3ydW3dRnNaBG1wqLzlJd
 BpRUfe8VW5hR7aWkX7Vqzx7bnZACb2bxXv1upYDSeWDF0Bk7szdW2y2ucvRhx8Vv
 8hxDsqIo+AsJV9oP6WSkIMFst0GDVBhbnd70zgu/4j9AYgWlfyJoFRw3vwUMWzS9
 sW6niAZd9PsFboJkBj0LYxPQ6nHXUFEQp5OIn/ZPHmzHW4rnuYao8qgfzWSfxvA1
 LcKLLtvobR9t0jK7SEh7efyOoIY3iprPNtP00jXdvtKMI8gv9bxCNyZBKap1TgQN
 TDexS6ShwcoHTMV6HyR4zA6OqFCn/stsuUNIrywPUvsEBgUPjRLiA2Nzk8wbEGG2
 DXIBqYaNHIr34bRxjQ7K6JzovCGF/VOoRiJqwaAEujz4R164fuw7td/yuflLjL0Q
 JtT9DHXdzskzi+JWxHhsK3sLSHY5BInxt5GNNVsbecgpeizGFjwpGsxhWFOZNrBG
 P8zy7FZ7EzZe1a3qYdFROwZY3kkR4Rtp207ZdgAoWjTrt62ACwyx5HheaxGx+0R1
 LL8LbHPxoWXJyQomMR1zVIzNhivcPqlH+yGkVaZJJY971HxjaCxIbbrH7rNpOf+w
 2l5lxjzq5WrCNQ==
 =Odes
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - On 32-bit fix overread/overwrite of thread_struct via ptrace
   PEEK/POKE.

 - Fix softirqs not switching to the softirq stack since we moved
   irq_exit().

 - Force thread size increase when KASAN is enabled to avoid stack
   overflows.

 - On Book3s 64 mark more code as not to be instrumented by KASAN to
   avoid crashes.

 - Exempt __get_wchan() from KASAN checking, as it's inherently racy.

 - Fix a recently introduced crash in the papr_scm driver in some
   configurations.

 - Remove include of <generated/compile.h> which is forbidden.

Thanks to Ariel Miculas, Chen Jingwen, Christophe Leroy, Erhard Furtner,
He Ying, Kees Cook, Masahiro Yamada, Nageswara R Sastry, Paul Mackerras,
Sachin Sant, Vaibhav Jain, and Wanming Hu.

* tag 'powerpc-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/32: Fix overread/overwrite of thread_struct via ptrace
  powerpc/book3e: get rid of #include <generated/compile.h>
  powerpc/kasan: Force thread size increase with KASAN
  powerpc/papr_scm: don't requests stats with '0' sized stats buffer
  powerpc: Don't select HAVE_IRQ_EXIT_ON_IRQ_STACK
  powerpc/kasan: Silence KASAN warnings in __get_wchan()
  powerpc/kasan: Mark more real-mode code as not to be instrumented
2022-06-09 12:17:43 -07:00
Michael Ellerman
8e12784444 powerpc/32: Fix overread/overwrite of thread_struct via ptrace
The ptrace PEEKUSR/POKEUSR (aka PEEKUSER/POKEUSER) API allows a process
to read/write registers of another process.

To get/set a register, the API takes an index into an imaginary address
space called the "USER area", where the registers of the process are
laid out in some fashion.

The kernel then maps that index to a particular register in its own data
structures and gets/sets the value.

The API only allows a single machine-word to be read/written at a time.
So 4 bytes on 32-bit kernels and 8 bytes on 64-bit kernels.

The way floating point registers (FPRs) are addressed is somewhat
complicated, because double precision float values are 64-bit even on
32-bit CPUs. That means on 32-bit kernels each FPR occupies two
word-sized locations in the USER area. On 64-bit kernels each FPR
occupies one word-sized location in the USER area.

Internally the kernel stores the FPRs in an array of u64s, or if VSX is
enabled, an array of pairs of u64s where one half of each pair stores
the FPR. Which half of the pair stores the FPR depends on the kernel's
endianness.

To handle the different layouts of the FPRs depending on VSX/no-VSX and
big/little endian, the TS_FPR() macro was introduced.

Unfortunately the TS_FPR() macro does not take into account the fact
that the addressing of each FPR differs between 32-bit and 64-bit
kernels. It just takes the index into the "USER area" passed from
userspace and indexes into the fp_state.fpr array.

On 32-bit there are 64 indexes that address FPRs, but only 32 entries in
the fp_state.fpr array, meaning the user can read/write 256 bytes past
the end of the array. Because the fp_state sits in the middle of the
thread_struct there are various fields than can be overwritten,
including some pointers. As such it may be exploitable.

It has also been observed to cause systems to hang or otherwise
misbehave when using gdbserver, and is probably the root cause of this
report which could not be easily reproduced:
  https://lore.kernel.org/linuxppc-dev/dc38afe9-6b78-f3f5-666b-986939e40fc6@keymile.com/

Rather than trying to make the TS_FPR() macro even more complicated to
fix the bug, or add more macros, instead add a special-case for 32-bit
kernels. This is more obvious and hopefully avoids a similar bug
happening again in future.

Note that because 32-bit kernels never have VSX enabled the code doesn't
need to consider TS_FPRWIDTH/OFFSET at all. Add a BUILD_BUG_ON() to
ensure that 32-bit && VSX is never enabled.

Fixes: 87fec0514f ("powerpc: PTRACE_PEEKUSR/PTRACE_POKEUSER of FPR registers in little endian builds")
Cc: stable@vger.kernel.org # v3.13+
Reported-by: Ariel Miculas <ariel.miculas@belden.com>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220609133245.573565-1-mpe@ellerman.id.au
2022-06-09 23:32:56 +10:00
Linus Torvalds
1ec6574a3c This set of changes updates init and user mode helper tasks to be
ordinary user mode tasks.
 
 In commit 40966e316f ("kthread: Ensure struct kthread is present for
 all kthreads") caused init and the user mode helper threads that call
 kernel_execve to have struct kthread allocated for them.  This struct
 kthread going away during execve in turned made a use after free of
 struct kthread possible.
 
 The commit 343f4c49f2 ("kthread: Don't allocate kthread_struct for
 init and umh") is enough to fix the use after free and is simple enough
 to be backportable.
 
 The rest of the changes pass struct kernel_clone_args to clean things
 up and cause the code to make sense.
 
 In making init and the user mode helpers tasks purely user mode tasks
 I ran into two complications.  The function task_tick_numa was
 detecting tasks without an mm by testing for the presence of
 PF_KTHREAD.  The initramfs code in populate_initrd_image was using
 flush_delayed_fput to ensuere the closing of all it's file descriptors
 was complete, and flush_delayed_fput does not work in a userspace thread.
 
 I have looked and looked and more complications and in my code review
 I have not found any, and neither has anyone else with the code sitting
 in linux-next.
 
 Link: https://lkml.kernel.org/r/87mtfu4up3.fsf@email.froward.int.ebiederm.org
 
 Eric W. Biederman (8):
       kthread: Don't allocate kthread_struct for init and umh
       fork: Pass struct kernel_clone_args into copy_thread
       fork: Explicity test for idle tasks in copy_thread
       fork: Generalize PF_IO_WORKER handling
       init: Deal with the init process being a user mode process
       fork: Explicitly set PF_KTHREAD
       fork: Stop allowing kthreads to call execve
       sched: Update task_tick_numa to ignore tasks without an mm
 
  arch/alpha/kernel/process.c      | 13 ++++++------
  arch/arc/kernel/process.c        | 13 ++++++------
  arch/arm/kernel/process.c        | 12 ++++++-----
  arch/arm64/kernel/process.c      | 12 ++++++-----
  arch/csky/kernel/process.c       | 15 ++++++-------
  arch/h8300/kernel/process.c      | 10 ++++-----
  arch/hexagon/kernel/process.c    | 12 ++++++-----
  arch/ia64/kernel/process.c       | 15 +++++++------
  arch/m68k/kernel/process.c       | 12 ++++++-----
  arch/microblaze/kernel/process.c | 12 ++++++-----
  arch/mips/kernel/process.c       | 13 ++++++------
  arch/nios2/kernel/process.c      | 12 ++++++-----
  arch/openrisc/kernel/process.c   | 12 ++++++-----
  arch/parisc/kernel/process.c     | 18 +++++++++-------
  arch/powerpc/kernel/process.c    | 15 +++++++------
  arch/riscv/kernel/process.c      | 12 ++++++-----
  arch/s390/kernel/process.c       | 12 ++++++-----
  arch/sh/kernel/process_32.c      | 12 ++++++-----
  arch/sparc/kernel/process_32.c   | 12 ++++++-----
  arch/sparc/kernel/process_64.c   | 12 ++++++-----
  arch/um/kernel/process.c         | 15 +++++++------
  arch/x86/include/asm/fpu/sched.h |  2 +-
  arch/x86/include/asm/switch_to.h |  8 +++----
  arch/x86/kernel/fpu/core.c       |  4 ++--
  arch/x86/kernel/process.c        | 18 +++++++++-------
  arch/xtensa/kernel/process.c     | 17 ++++++++-------
  fs/exec.c                        |  8 ++++---
  include/linux/sched/task.h       |  8 +++++--
  init/initramfs.c                 |  2 ++
  init/main.c                      |  2 +-
  kernel/fork.c                    | 46 +++++++++++++++++++++++++++++++++-------
  kernel/sched/fair.c              |  2 +-
  kernel/umh.c                     |  6 +++---
  33 files changed, 234 insertions(+), 160 deletions(-)
 
 Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEgjlraLDcwBA2B+6cC/v6Eiajj0AFAmKaR/MACgkQC/v6Eiaj
 j0Aayg/7Bx66872d9c6igkJ+MPCTuh+v9QKCGwiYEmiU4Q5sVAFB0HPJO27qC14u
 630X0RFNZTkPzNNEJNIW4kw6Dj8s8YRKf+FgQAVt4SzdRwT7eIPDjk1nGraopPJ3
 O04pjvuTmUyidyViRyFcf2ptx/pnkrwP8jUSc+bGTgfASAKAgAokqKE5ecjewbBc
 Y/EAkQ6QW7KxPjeSmpAHwI+t3BpBev9WEC4PbhRhsBCQFO2+PJiklvqdhVNBnIjv
 qUezll/1xv9UYgniB15Q4Nb722SmnWSU3r8as1eFPugzTHizKhufrrpyP+KMK1A0
 tdtEJNs5t2DZF7ZbGTFSPqJWmyTYLrghZdO+lOmnaSjHxK4Nda1d4NzbefJ0u+FE
 tutewowvHtBX6AFIbx+H3O+DOJM2IgNMf+ReQDU/TyNyVf3wBrTbsr9cLxypIJIp
 zze8npoLMlB7B4yxVo5ES5e63EXfi3iHl0L3/1EhoGwriRz1kWgVLUX/VZOUpscL
 RkJHsW6bT8sqxPWAA5kyWjEN+wNR2PxbXi8OE4arT0uJrEBMUgDCzydzOv5tJB00
 mSQdytxH9LVdsmxBKAOBp5X6WOLGA4yb1cZ6E/mEhlqXMpBDF1DaMfwbWqxSYi4q
 sp5zU3SBAW0qceiZSsWZXInfbjrcQXNV/DkDRDO9OmzEZP4m1j0=
 =x6fy
 -----END PGP SIGNATURE-----

Merge tag 'kthread-cleanups-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace

Pull kthread updates from Eric Biederman:
 "This updates init and user mode helper tasks to be ordinary user mode
  tasks.

  Commit 40966e316f ("kthread: Ensure struct kthread is present for
  all kthreads") caused init and the user mode helper threads that call
  kernel_execve to have struct kthread allocated for them. This struct
  kthread going away during execve in turned made a use after free of
  struct kthread possible.

  Here, commit 343f4c49f2 ("kthread: Don't allocate kthread_struct for
  init and umh") is enough to fix the use after free and is simple
  enough to be backportable.

  The rest of the changes pass struct kernel_clone_args to clean things
  up and cause the code to make sense.

  In making init and the user mode helpers tasks purely user mode tasks
  I ran into two complications. The function task_tick_numa was
  detecting tasks without an mm by testing for the presence of
  PF_KTHREAD. The initramfs code in populate_initrd_image was using
  flush_delayed_fput to ensuere the closing of all it's file descriptors
  was complete, and flush_delayed_fput does not work in a userspace
  thread.

  I have looked and looked and more complications and in my code review
  I have not found any, and neither has anyone else with the code
  sitting in linux-next"

* tag 'kthread-cleanups-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  sched: Update task_tick_numa to ignore tasks without an mm
  fork: Stop allowing kthreads to call execve
  fork: Explicitly set PF_KTHREAD
  init: Deal with the init process being a user mode process
  fork: Generalize PF_IO_WORKER handling
  fork: Explicity test for idle tasks in copy_thread
  fork: Pass struct kernel_clone_args into copy_thread
  kthread: Don't allocate kthread_struct for init and umh
2022-06-03 16:03:05 -07:00
Linus Torvalds
7c9e960c63 Livepatching changes for 5.19
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmKYlREACgkQUqAMR0iA
 lPLHUQ/9EdD1kQsxuvZRTBAr79WmAI9vS41R3owXRUhmpFf0pSjIkD23gRni1Y+a
 f2NVxQY8y9mLUEp/YXeDUX5lqdCDD2iJtin+ZlUGv8+F7rRlMyIEDoudSUKsTRlF
 ufkzKE00tqUkz7J/U1KEbD2u/nWFS5q8DyHc81pC0bHpkiBT8y+wgEtIQy1oGVqV
 OilcHPXpQUV/sw1RsaRGdwJAgJSJcoHk57JelWWlV5fo0ogK1gjIl/kUadAChNzT
 2o8zoWu6fQlqCAI1AJkKLngdcybbXPwWKclcGUze1sfR9+fGGRzqorwAupm5bmkp
 1oZsODqxDdVfxz2/VMtdjjacm3ECJmznqBumdMrdM3WVjh944xkb7xVOp/1xJWfc
 wrmQy4dshXa4OTnvAivBYlgbaUzld5HPQD/v89KBLP3SJkd6p9PVT+qa87hjJ2uH
 sRDjtZxTkcKKoYU8CnASFBkaOHYHOWbRHbEjCkIgo2nHXE4W3fOnaQKBF3jUiZJr
 OimQMWjAqtXh8Gb0B7IIPxuGFjAjKepTEdN+Jwk8/eqXNTFqgGPbu1hOf+zpq/oj
 ekMIy3CUIsempqZfdrJlMOAGIEsx5JGBrdC5KwAEsHymW1XNJVrkpLtTFrtsF1tW
 X6qWzhkNMkOv9/0yoR2GmXiBR10pcXGMsd+EvNewpTIDCcjkYxI=
 =X4cs
 -----END PGP SIGNATURE-----

Merge tag 'livepatching-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching

Pull livepatching cleanup from Petr Mladek:

 - Remove duplicated livepatch code [Christophe]

* tag 'livepatching-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching:
  livepatch: Remove klp_arch_set_pc() and asm/livepatch.h
2022-06-02 08:55:01 -07:00
Linus Torvalds
1ff7bc3ba7 More power management updates for 5.19-rc1
- Add Tegra234 cpufreq support (Sumit Gupta).
 
  - Clean up and enhance the Mediatek cpufreq driver (Wan Jiabing,
    Rex-BC Chen, and Jia-Wei Chang).
 
  - Fix up the CPPC cpufreq driver after recent changes (Zheng Bin,
    Pierre Gondois).
 
  - Minor update to dt-binding for Qcom's opp-v2-kryo-cpu (Yassine
    Oudjana).
 
  - Use list iterator only inside the list_for_each_entry loop (Xiaomeng
    Tong, and Jakob Koschel).
 
  - New APIs related to finding OPP based on interconnect bandwidth
    (Krzysztof Kozlowski).
 
  - Fix the missing of_node_put() in _bandwidth_supported() (Dan
    Carpenter).
 
  - Cleanups (Krzysztof Kozlowski, and Viresh Kumar).
 
  - Add Out of Band mode description to the intel-speed-select utility
    documentation (Srinivas Pandruvada).
 
  - Add power sequences support to the system reboot and power off
    code and make related platform-specific changes for multiple
    platforms (Dmitry Osipenko, Geert Uytterhoeven).
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmKU8lESHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxVz0P91LNCbkDSt60jzNkXdEjsvUnI/YjJ+QJ
 /+ta7iCwf90obb6s9soBkTyU8Ia7hJ/IWDJW/5xhdG0ySYF17hGNIGKK9xKGsJFK
 tzzWtjFsvT3PeUZQERekqWp8OYskHYmQMj8o4jqqFF7DZD/AswTgkVLALUd7YhVL
 UvLmcKsUA7eXy3ZrhtrGSzVSEbKOGXBLFyjy3IuWjfz6Uk/nGQRNKGf7byRWLM44
 y7zb75/5+p4MPyyJP8M/uiXzEYDKuubRtfx9PdmLgBUSMbtho6eB1x47dZWooaxe
 YKmcFjF80AmnwxHb+Te2rZHPeIYr+5hLBaEq7xaLQf/nAS3y5z1PIfI2wVQ5mXPz
 D599jHHda/6oSAKCVTq2fKfnlR6fetm5j66xOQINpD+G5b5tNSpllXJDamFZxFgP
 DiQAOFzdnRYnK7yTiLWVl1q76SVRxqsGz7/5Ak+NRj2OQK2wRkLzHuZfiV/8r0pk
 ksi6Ew9TerXkstoTQsSToPQxB2VvosSajNU3Oy27pmM0oal1XxP0LIPz9sMor5/g
 tfk5f6Yz/+FFIfXj3cZffZNdhsJgejmcqPdrSdCOV3sBrblnIMQNpHiYg4jGztoj
 IjYKYPVpSaWiSZLQOaK2moTEvm9CfQz1TQCF+/Kz88LX6/7ZaDJFxHG2FDEob0sg
 6KVbrZWweLI=
 =PAh+
 -----END PGP SIGNATURE-----

Merge tag 'pm-5.19-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more power management updates from Rafael Wysocki:
 "These update the ARM cpufreq drivers and fix up the CPPC cpufreq
  driver after recent changes, update the OPP code and PM documentation
  and add power sequences support to the system reboot and power off
  code.

  Specifics:

   - Add Tegra234 cpufreq support (Sumit Gupta)

   - Clean up and enhance the Mediatek cpufreq driver (Wan Jiabing,
     Rex-BC Chen, and Jia-Wei Chang)

   - Fix up the CPPC cpufreq driver after recent changes (Zheng Bin,
     Pierre Gondois)

   - Minor update to dt-binding for Qcom's opp-v2-kryo-cpu (Yassine
     Oudjana)

   - Use list iterator only inside the list_for_each_entry loop
     (Xiaomeng Tong, and Jakob Koschel)

   - New APIs related to finding OPP based on interconnect bandwidth
     (Krzysztof Kozlowski)

   - Fix the missing of_node_put() in _bandwidth_supported() (Dan
     Carpenter)

   - Cleanups (Krzysztof Kozlowski, and Viresh Kumar)

   - Add Out of Band mode description to the intel-speed-select utility
     documentation (Srinivas Pandruvada)

   - Add power sequences support to the system reboot and power off code
     and make related platform-specific changes for multiple platforms
     (Dmitry Osipenko, Geert Uytterhoeven)"

* tag 'pm-5.19-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (60 commits)
  cpufreq: CPPC: Fix unused-function warning
  cpufreq: CPPC: Fix build error without CONFIG_ACPI_CPPC_CPUFREQ_FIE
  Documentation: admin-guide: PM: Add Out of Band mode
  kernel/reboot: Change registration order of legacy power-off handler
  m68k: virt: Switch to new sys-off handler API
  kernel/reboot: Add devm_register_restart_handler()
  kernel/reboot: Add devm_register_power_off_handler()
  soc/tegra: pmc: Use sys-off handler API to power off Nexus 7 properly
  reboot: Remove pm_power_off_prepare()
  regulator: pfuze100: Use devm_register_sys_off_handler()
  ACPI: power: Switch to sys-off handler API
  memory: emif: Use kernel_can_power_off()
  mips: Use do_kernel_power_off()
  ia64: Use do_kernel_power_off()
  x86: Use do_kernel_power_off()
  sh: Use do_kernel_power_off()
  m68k: Switch to new sys-off handler API
  powerpc: Use do_kernel_power_off()
  xen/x86: Use do_kernel_power_off()
  parisc: Use do_kernel_power_off()
  ...
2022-05-30 11:37:26 -07:00
He Ying
a1b29ba2f2 powerpc/kasan: Silence KASAN warnings in __get_wchan()
The following KASAN warning was reported in our kernel.

  BUG: KASAN: stack-out-of-bounds in get_wchan+0x188/0x250
  Read of size 4 at addr d216f958 by task ps/14437

  CPU: 3 PID: 14437 Comm: ps Tainted: G           O      5.10.0 #1
  Call Trace:
  [daa63858] [c0654348] dump_stack+0x9c/0xe4 (unreliable)
  [daa63888] [c035cf0c] print_address_description.constprop.3+0x8c/0x570
  [daa63908] [c035d6bc] kasan_report+0x1ac/0x218
  [daa63948] [c00496e8] get_wchan+0x188/0x250
  [daa63978] [c0461ec8] do_task_stat+0xce8/0xe60
  [daa63b98] [c0455ac8] proc_single_show+0x98/0x170
  [daa63bc8] [c03cab8c] seq_read_iter+0x1ec/0x900
  [daa63c38] [c03cb47c] seq_read+0x1dc/0x290
  [daa63d68] [c037fc94] vfs_read+0x164/0x510
  [daa63ea8] [c03808e4] ksys_read+0x144/0x1d0
  [daa63f38] [c005b1dc] ret_from_syscall+0x0/0x38
  --- interrupt: c00 at 0x8fa8f4
      LR = 0x8fa8cc

  The buggy address belongs to the page:
  page:98ebcdd2 refcount:0 mapcount:0 mapping:00000000 index:0x2 pfn:0x1216f
  flags: 0x0()
  raw: 00000000 00000000 01010122 00000000 00000002 00000000 ffffffff 00000000
  raw: 00000000
  page dumped because: kasan: bad access detected

  Memory state around the buggy address:
   d216f800: 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00 00 00 00
   d216f880: f2 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  >d216f900: 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00
                                            ^
   d216f980: f2 f2 f2 f2 f2 f2 f2 00 00 00 00 00 00 00 00 00
   d216fa00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

After looking into this issue, I find the buggy address belongs
to the task stack region. It seems KASAN has something wrong.
I look into the code of __get_wchan in x86 architecture and
find the same issue has been resolved by the commit
f7d27c35dd ("x86/mm, kasan: Silence KASAN warnings in get_wchan()").
The solution could be applied to powerpc architecture too.

As Andrey Ryabinin said, get_wchan() is racy by design, it may
access volatile stack of running task, thus it may access
redzone in a stack frame and cause KASAN to warn about this.

Use READ_ONCE_NOCHECK() to silence these warnings.

Reported-by: Wanming Hu <huwanming@huaweil.com>
Signed-off-by: He Ying <heying24@huawei.com>
Signed-off-by: Chen Jingwen <chenjingwen6@huawei.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220121014418.155675-1-heying24@huawei.com
2022-05-29 10:30:42 +10:00
Paul Mackerras
743cdb7bd0 powerpc/kasan: Mark more real-mode code as not to be instrumented
This marks more files and functions that can possibly be called in
real mode as not to be instrumented by KASAN.  Most were found by
inspection, except for get_pseries_errorlog() which was reported as
causing a crash in testing.

Reported-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YoX1kZPnmUX4RZEK@cleo
2022-05-29 10:30:42 +10:00
Linus Torvalds
6112bd00e8 powerpc updates for 5.19
- Convert to the generic mmap support (ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT).
 
  - Add support for outline-only KASAN with 64-bit Radix MMU (P9 or later).
 
  - Increase SIGSTKSZ and MINSIGSTKSZ and add support for AT_MINSIGSTKSZ.
 
  - Enable the DAWR (Data Address Watchpoint) on POWER9 DD2.3 or later.
 
  - Drop support for system call instruction emulation.
 
  - Many other small features and fixes.
 
 Thanks to: Alexey Kardashevskiy, Alistair Popple, Andy Shevchenko, Bagas Sanjaya, Bjorn
 Helgaas, Bo Liu, Chen Huang, Christophe Leroy, Colin Ian King, Daniel Axtens, Dwaipayan
 Ray, Fabiano Rosas, Finn Thain, Frank Rowand, Fuqian Huang, Guilherme G. Piccoli, Hangyu
 Hua, Haowen Bai, Haren Myneni, Hari Bathini, He Ying, Jason Wang, Jiapeng Chong, Jing
 Yangyang, Joel Stanley, Julia Lawall, Kajol Jain, Kevin Hao, Krzysztof Kozlowski, Laurent
 Dufour, Lv Ruyi, Madhavan Srinivasan, Magali Lemes, Miaoqian Lin, Minghao Chi, Nathan
 Chancellor, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Oscar Salvador, Pali Rohár,
 Paul Mackerras, Peng Wu, Qing Wang, Randy Dunlap, Reza Arbab, Russell Currey, Sohaib
 Mohamed, Vaibhav Jain, Vasant Hegde, Wang Qing, Wang Wensheng, Xiang wangx, Xiaomeng Tong,
 Xu Wang, Yang Guang, Yang Li, Ye Bin, YueHaibing, Yu Kuai, Zheng Bin, Zou Wei, Zucheng
 Zheng.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmKSEgETHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgJpLEACee7mu2I00Z7VWtW5ckT4RFbAXYZcM
 Hv5DbTnVB2ItoQMRHvG52DNbR73j9HnYrz8kpwfTBVk90udxVP14L/swXDs3xbT4
 riXEYtJ1DRVc/bLiOK637RLPWNrmmZStWZme7k0Y9Ki5Aif8i1Erjjq7EIy47m9j
 j1MTcwp3ND7IsBON2nZ3PkttEHhevKvOwCPb/BWtPMDV0OhyQUFKB2SNegrlCrkT
 wshDgdQcYqbIix98PoGa2ZfUVgFQD3JVLzXa4sLpqouzGD+HvEFStOFa2Gq/ZEvV
 zunaeXDdZUCjlib6KvA8+aumBbIQ1s/urrDbxd+3BuYxZ094vNP1B428NT1AWVtl
 3bEZQIN8GSx0v9aHxZ8HePsAMXgG9d2o0xC9EMQ430+cqroN+6UHP7lkekwkprb7
 U9EpZCG9U8jV6SDcaMigW3tooEjn657we0R8nZG2NgUNssdSHVh/JYxGDALPXIAk
 awL3NQrR0tYF3Y3LJm5AxdQrK1hJH8E+hZFCZvIpUXGsr/uf9Gemy/62pD1rhrr/
 niULpxIneRGkJiXB5qdGy8pRu27ED53k7Ky6+8MWSEFQl1mUsHSryYACWz939D8c
 DydhBwQqDTl6Ozs41a5TkVjIRLOCrZADUd/VZM6A4kEOqPJ5t2Gz22Bn8ya1z6Ks
 5Sx6vrGH7GnDjA==
 =15oQ
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Convert to the generic mmap support (ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT)

 - Add support for outline-only KASAN with 64-bit Radix MMU (P9 or later)

 - Increase SIGSTKSZ and MINSIGSTKSZ and add support for AT_MINSIGSTKSZ

 - Enable the DAWR (Data Address Watchpoint) on POWER9 DD2.3 or later

 - Drop support for system call instruction emulation

 - Many other small features and fixes

Thanks to Alexey Kardashevskiy, Alistair Popple, Andy Shevchenko, Bagas
Sanjaya, Bjorn Helgaas, Bo Liu, Chen Huang, Christophe Leroy, Colin Ian
King, Daniel Axtens, Dwaipayan Ray, Fabiano Rosas, Finn Thain, Frank
Rowand, Fuqian Huang, Guilherme G. Piccoli, Hangyu Hua, Haowen Bai,
Haren Myneni, Hari Bathini, He Ying, Jason Wang, Jiapeng Chong, Jing
Yangyang, Joel Stanley, Julia Lawall, Kajol Jain, Kevin Hao, Krzysztof
Kozlowski, Laurent Dufour, Lv Ruyi, Madhavan Srinivasan, Magali Lemes,
Miaoqian Lin, Minghao Chi, Nathan Chancellor, Naveen N. Rao, Nicholas
Piggin, Oliver O'Halloran, Oscar Salvador, Pali Rohár, Paul Mackerras,
Peng Wu, Qing Wang, Randy Dunlap, Reza Arbab, Russell Currey, Sohaib
Mohamed, Vaibhav Jain, Vasant Hegde, Wang Qing, Wang Wensheng, Xiang
wangx, Xiaomeng Tong, Xu Wang, Yang Guang, Yang Li, Ye Bin, YueHaibing,
Yu Kuai, Zheng Bin, Zou Wei, and Zucheng Zheng.

* tag 'powerpc-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (200 commits)
  powerpc/64: Include cache.h directly in paca.h
  powerpc/64s: Only set HAVE_ARCH_UNMAPPED_AREA when CONFIG_PPC_64S_HASH_MMU is set
  powerpc/xics: Include missing header
  powerpc/powernv/pci: Drop VF MPS fixup
  powerpc/fsl_book3e: Don't set rodata RO too early
  powerpc/microwatt: Add mmu bits to device tree
  powerpc/powernv/flash: Check OPAL flash calls exist before using
  powerpc/powermac: constify device_node in of_irq_parse_oldworld()
  powerpc/powermac: add missing g5_phy_disable_cpu1() declaration
  selftests/powerpc/pmu: fix spelling mistake "mis-match" -> "mismatch"
  powerpc: Enable the DAWR on POWER9 DD2.3 and above
  powerpc/64s: Add CPU_FTRS_POWER10 to ALWAYS mask
  powerpc/64s: Add CPU_FTRS_POWER9_DD2_2 to CPU_FTRS_ALWAYS mask
  powerpc: Fix all occurences of "the the"
  selftests/powerpc/pmu/ebb: remove fixed_instruction.S
  powerpc/platforms/83xx: Use of_device_get_match_data()
  powerpc/eeh: Drop redundant spinlock initialization
  powerpc/iommu: Add missing of_node_put in iommu_init_early_dart
  powerpc/pseries/vas: Call misc_deregister if sysfs init fails
  powerpc/papr_scm: Fix leaking nvdimm_events_map elements
  ...
2022-05-28 11:27:17 -07:00
Linus Torvalds
6f664045c8 Not a lot of material this cycle. Many singleton patches against various
subsystems.   Most notably some maintenance work in ocfs2 and initramfs.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYo/6xQAKCRDdBJ7gKXxA
 jkD9AQCPczLBbRWpe1edL+5VHvel9ePoHQmvbHQnufdTh9rB5QEAu0Uilxz4q9cx
 xSZypNhj2n9f8FCYca/ZrZneBsTnAA8=
 =AJEO
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2022-05-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc updates from Andrew Morton:
 "The non-MM patch queue for this merge window.

  Not a lot of material this cycle. Many singleton patches against
  various subsystems. Most notably some maintenance work in ocfs2
  and initramfs"

* tag 'mm-nonmm-stable-2022-05-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (65 commits)
  kcov: update pos before writing pc in trace function
  ocfs2: dlmfs: fix error handling of user_dlm_destroy_lock
  ocfs2: dlmfs: don't clear USER_LOCK_ATTACHED when destroying lock
  fs/ntfs: remove redundant variable idx
  fat: remove time truncations in vfat_create/vfat_mkdir
  fat: report creation time in statx
  fat: ignore ctime updates, and keep ctime identical to mtime in memory
  fat: split fat_truncate_time() into separate functions
  MAINTAINERS: add Muchun as a memcg reviewer
  proc/sysctl: make protected_* world readable
  ia64: mca: drop redundant spinlock initialization
  tty: fix deadlock caused by calling printk() under tty_port->lock
  relay: remove redundant assignment to pointer buf
  fs/ntfs3: validate BOOT sectors_per_clusters
  lib/string_helpers: fix not adding strarray to device's resource list
  kernel/crash_core.c: remove redundant check of ck_cmdline
  ELF, uapi: fixup ELF_ST_TYPE definition
  ipc/mqueue: use get_tree_nodev() in mqueue_get_tree()
  ipc: update semtimedop() to use hrtimer
  ipc/sem: remove redundant assignments
  ...
2022-05-27 11:22:03 -07:00
Linus Torvalds
3f306ea2e1 dma-mapping updates for Linux 5.19
- don't over-decrypt memory (Robin Murphy)
  - takes min align mask into account for the swiotlb max mapping size
    (Tianyu Lan)
  - use GFP_ATOMIC in dma-debug (Mikulas Patocka)
  - fix DMA_ATTR_NO_KERNEL_MAPPING on xen/arm (me)
  - don't fail on highmem CMA pages in dma_direct_alloc_pages (me)
  - cleanup swiotlb initialization and share more code with swiotlb-xen
    (me, Stefano Stabellini)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmKObTQLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYObmA//dIcDB/q4iFGD+WJh4MhM+asx0ZsdF2OJz42WEhgT
 Z9duOrgcneEQundCamqJP9rNTs980LHDA8uWQC5rZEc9vxuRVOdS7bSgYRUwWh6B
 r0ZjOsvQCn+ChoZML8uyk4rfmEINq+EvJuec3G5fgecZOhPuJS2i2uzzv5cHwqgP
 ChC0fwyZlkfdECXgvZXbEoCJLfTgGNlziN6Ai8dirSoqgEQUoCsY89/M7OiEBvV2
 R4XUWD7OvQERfB4t6xLuUHyzf9PAuWB+OiblRVNeAmK3lMjxVrc3k4kIowgklnzD
 8hfmphAa9Zou3zdfi6Gd4fiQRHRVOwKVp1rtqUmJ+lPSiwyMzu64z9ld2+2qac0h
 V4sSr/yJkhxnBT4/0MkTChvhnRobisackpUzNRpiM4ck7cNVb7eAvkISsbH+pWI9
 aEexPhbyskjlV+GOyM4QL4ygG0dpXY0HSyoh6uaSVsaXMycnWIsJCPidXxV1HGV0
 q2/RLHuHwYxia8cYCF01/DQvwOKSjwbU0zModxtRezGD5GYh2C0a+SrA1aX+qiTu
 yGJCs2UHtSQstAt78tTVp499YeDeL/oGSQkPAu8zyRkSczzF+CncGTuXyoJbAWyK
 otcgERWljgZ4scxjfu1uacfoVhKQ7nOu7hiJokL0U80FESAennLC3ZlocvB9h/ff
 HNA=
 =n2rk
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-5.19-2022-05-25' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping updates from Christoph Hellwig:

 - don't over-decrypt memory (Robin Murphy)

 - takes min align mask into account for the swiotlb max mapping size
   (Tianyu Lan)

 - use GFP_ATOMIC in dma-debug (Mikulas Patocka)

 - fix DMA_ATTR_NO_KERNEL_MAPPING on xen/arm (me)

 - don't fail on highmem CMA pages in dma_direct_alloc_pages (me)

 - cleanup swiotlb initialization and share more code with swiotlb-xen
   (me, Stefano Stabellini)

* tag 'dma-mapping-5.19-2022-05-25' of git://git.infradead.org/users/hch/dma-mapping: (23 commits)
  dma-direct: don't over-decrypt memory
  swiotlb: max mapping size takes min align mask into account
  swiotlb: use the right nslabs-derived sizes in swiotlb_init_late
  swiotlb: use the right nslabs value in swiotlb_init_remap
  swiotlb: don't panic when the swiotlb buffer can't be allocated
  dma-debug: change allocation mode from GFP_NOWAIT to GFP_ATIOMIC
  dma-direct: don't fail on highmem CMA pages in dma_direct_alloc_pages
  swiotlb-xen: fix DMA_ATTR_NO_KERNEL_MAPPING on arm
  x86: remove cruft from <asm/dma-mapping.h>
  swiotlb: remove swiotlb_init_with_tbl and swiotlb_init_late_with_tbl
  swiotlb: merge swiotlb-xen initialization into swiotlb
  swiotlb: provide swiotlb_init variants that remap the buffer
  swiotlb: pass a gfp_mask argument to swiotlb_init_late
  swiotlb: add a SWIOTLB_ANY flag to lift the low memory restriction
  swiotlb: make the swiotlb_init interface more useful
  x86: centralize setting SWIOTLB_FORCE when guest memory encryption is enabled
  x86: remove the IOMMU table infrastructure
  MIPS/octeon: use swiotlb_init instead of open coding it
  arm/xen: don't check for xen_initial_domain() in xen_create_contiguous_region
  swiotlb: rename swiotlb_late_init_with_default_size
  ...
2022-05-25 19:18:36 -07:00
Rafael J. Wysocki
14c03a4a75 Merge back reboot/poweroff notifiers rework for 5.19-rc1. 2022-05-25 14:38:29 +02:00
Christophe Leroy
5d7c854593 livepatch: Remove klp_arch_set_pc() and asm/livepatch.h
All three versions of klp_arch_set_pc() do exactly the same: they
call ftrace_instruction_pointer_set().

Call ftrace_instruction_pointer_set() directly and remove
klp_arch_set_pc().

As klp_arch_set_pc() was the only thing remaining in asm/livepatch.h
on x86 and s390, remove asm/livepatch.h

livepatch.h remains on powerpc but its content is exclusively used
by powerpc specific code.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Petr Mladek <pmladek@suse.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Petr Mladek <pmladek@suse.com>
2022-05-24 08:46:37 +02:00
Linus Torvalds
de8ac81747 - Remove all the code around GS switching on 32-bit now that it is not
needed anymore
 
 - Other misc improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmKLp74ACgkQEsHwGGHe
 VUpqrhAAgNdNw/vNTTzeOH5ZSNxyIoTQapmrSNev0cXRW4tV2hxuYSa2wPZPJZXx
 aYhnFxwL7rVy0er7jG/5KaOyzHmrh6PcmqgFdPVo8+yVrfcsPIUqg/4L5peFZh7T
 ETV2pvFIiB4njkL/pR3mU5uAtTjyO89tD/LclKmc4ndv19vI8maj+k/dCDOnNnEz
 m4wJMXYWh4bG47/izU5TcTYU7ttTLEiVQ/mC5kEuj7PQeUR0kXKvvLo4rX+lOI2v
 dQRHgHg/qoNM7uVLd7vV/YdMWwcHchmKG5Y7+a/ogdlwR7a/X9e+lklFSeuxNvyH
 8dOHIyzcb6lKTijpqhisZ3o9150ax3Q5FlSWuE3F/9Rcuc1T5eY82kTW2RTOTdV9
 xsjob4y+hlpsUfuImupxJLHn685xsYAdqyiG/SPkcnJL++tNBlWiGHX9NqXF5cgw
 bq4/94Aouxevl0OBxnFBeoQOJvOnf60OY3LHcYR78yEEJyi4iWsC0/TEmD+9IE+r
 EpC1wz9bHCYbSwZ+yv8u2tNPd/rKxdspPL/6SxT9a+WAVrOZbQAN3VmlOIon6W9O
 bW5ye6suqBbl/Q1FACVU1xxSNjLTJUTFsB1X3QKGm8E+Kr7/zD1ZtT0WQNvyLMfT
 p/I4VRcdIxV3eDiYqeTfJ3sTS7IjKHSaZVBnpkZvRh869mMdqCg=
 =CfX1
 -----END PGP SIGNATURE-----

Merge tag 'x86_core_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull core x86 updates from Borislav Petkov:

 - Remove all the code around GS switching on 32-bit now that it is not
   needed anymore

 - Other misc improvements

* tag 'x86_core_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  bug: Use normal relative pointers in 'struct bug_entry'
  x86/nmi: Make register_nmi_handler() more robust
  x86/asm: Merge load_gs_index()
  x86/32: Remove lazy GS macros
  ELF: Remove elf_core_copy_kernel_regs()
  x86/32: Simplify ELF_CORE_COPY_REGS
2022-05-23 18:42:07 -07:00
Reza Arbab
26b78c81e8 powerpc: Enable the DAWR on POWER9 DD2.3 and above
The hardware bug in POWER9 preventing use of the DAWR was fixed in
DD2.3. Set the CPU_FTR_DAWR feature bit on these newer systems to start
using it again, and update the documentation accordingly.

The CPU features for DD2.3 are currently determined by "DD2.2 or later"
logic. In adding DD2.3 as a discrete case for the first time here, I'm
carrying the quirks of DD2.2 forward to keep all behavior outside of
this DAWR change the same. This leaves the assessment and potential
removal of those quirks on DD2.3 for later.

Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220503170152.23412-1-arbab@linux.ibm.com
2022-05-22 15:59:53 +10:00
Michael Ellerman
87c78b612f powerpc: Fix all occurences of "the the"
Rather than waiting for the bots to fix these one-by-one, fix all
occurences of "the the" throughout arch/powerpc.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220518142629.513007-1-mpe@ellerman.id.au
2022-05-22 15:59:43 +10:00
Daniel Axtens
41b7a347bf powerpc: Book3S 64-bit outline-only KASAN support
Implement a limited form of KASAN for Book3S 64-bit machines running under
the Radix MMU, supporting only outline mode.

 - Enable the compiler instrumentation to check addresses and maintain the
   shadow region. (This is the guts of KASAN which we can easily reuse.)

 - Require kasan-vmalloc support to handle modules and anything else in
   vmalloc space.

 - KASAN needs to be able to validate all pointer accesses, but we can't
   instrument all kernel addresses - only linear map and vmalloc. On boot,
   set up a single page of read-only shadow that marks all iomap and
   vmemmap accesses as valid.

 - Document KASAN in powerpc docs.

Background
----------

KASAN support on Book3S is a bit tricky to get right:

 - It would be good to support inline instrumentation so as to be able to
   catch stack issues that cannot be caught with outline mode.

 - Inline instrumentation requires a fixed offset.

 - Book3S runs code with translations off ("real mode") during boot,
   including a lot of generic device-tree parsing code which is used to
   determine MMU features.

    [ppc64 mm note: The kernel installs a linear mapping at effective
    address c000...-c008.... This is a one-to-one mapping with physical
    memory from 0000... onward. Because of how memory accesses work on
    powerpc 64-bit Book3S, a kernel pointer in the linear map accesses the
    same memory both with translations on (accessing as an 'effective
    address'), and with translations off (accessing as a 'real
    address'). This works in both guests and the hypervisor. For more
    details, see s5.7 of Book III of version 3 of the ISA, in particular
    the Storage Control Overview, s5.7.3, and s5.7.5 - noting that this
    KASAN implementation currently only supports Radix.]

 - Some code - most notably a lot of KVM code - also runs with translations
   off after boot.

 - Therefore any offset has to point to memory that is valid with
   translations on or off.

One approach is just to give up on inline instrumentation. This way
boot-time checks can be delayed until after the MMU is set is up, and we
can just not instrument any code that runs with translations off after
booting. Take this approach for now and require outline instrumentation.

Previous attempts allowed inline instrumentation. However, they came with
some unfortunate restrictions: only physically contiguous memory could be
used and it had to be specified at compile time. Maybe we can do better in
the future.

[paulus@ozlabs.org - Rebased onto 5.17.  Note that a kernel with
 CONFIG_KASAN=y will crash during boot on a machine using HPT
 translation because not all the entry points to the generic
 KASAN code are protected with a call to kasan_arch_is_ready().]

Originally-by: Balbir Singh <bsingharora@gmail.com> # ppc64 out-of-line radix version
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
[mpe: Update copyright year and comment formatting]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YoTE69OQwiG7z+Gu@cleo
2022-05-22 15:58:29 +10:00
Daniel Axtens
5352090a99 powerpc/kasan: Don't instrument non-maskable or raw interrupts
Disable address sanitization for raw and non-maskable interrupt
handlers, because they can run in real mode, where we cannot access
the shadow memory.  (Note that kasan_arch_is_ready() doesn't test for
real mode, since it is a static branch for speed, and in any case not
all the entry points to the generic KASAN code are protected by
kasan_arch_is_ready guards.)

The changes to interrupt_nmi_enter/exit_prepare() look larger than
they actually are.  The changes are equivalent to adding
!IS_ENABLED(CONFIG_KASAN) to the conditions for calling nmi_enter() or
nmi_exit() in real mode.  That is, the code is equivalent to using the
following condition for calling nmi_enter/exit:

	if (((!IS_ENABLED(CONFIG_PPC_BOOK3S_64) ||
			!firmware_has_feature(FW_FEATURE_LPAR) ||
			radix_enabled()) &&
		    !IS_ENABLED(CONFIG_KASAN) ||
		(mfmsr() & MSR_DR))

That unwieldy condition has been split into several statements with
comments, for easier reading.

The nmi_ipi_lock functions that call atomic functions (i.e.,
nmi_ipi_lock_start(), nmi_ipi_lock() and nmi_ipi_unlock()), besides
being marked noinstr, now call arch_atomic_* functions instead of
atomic_* functions because with KASAN enabled, the atomic_* functions
are wrappers which explicitly do address sanitization on their
arguments.  Since we are trying to avoid address sanitization, we have
to use the lower-level arch_atomic_* versions.

In hv_nmi_check_nonrecoverable(), the regs_set_unrecoverable() call
has been open-coded so as to avoid having to either trust the inlining
or mark regs_set_unrecoverable() as noinstr.

[paulus@ozlabs.org: combined a few work-in-progress commits of
 Daniel's and wrote the commit message.]

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YoTFGaKM8Pd46PIK@cleo
2022-05-22 15:58:29 +10:00
Naveen N. Rao
84ade0a665 powerpc/ftrace: Remove ftrace init tramp once kernel init is complete
Stop using the ftrace trampoline for init section once kernel init is
complete.

Fixes: 67361cf807 ("powerpc/ftrace: Handle large kernel configs")
Cc: stable@vger.kernel.org # v4.20+
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220516071422.463738-1-naveen.n.rao@linux.vnet.ibm.com
2022-05-22 15:58:29 +10:00
Christophe Leroy
5fe855169f powerpc/irq: Remove arch_local_irq_restore() for !CONFIG_CC_HAS_ASM_GOTO
All supported versions of GCC & clang support asm goto.

Remove the !CONFIG_CC_HAS_ASM_GOTO version of arch_local_irq_restore()

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/58df50c9e77e2ed945bacdead30412770578886b.1652715336.git.christophe.leroy@csgroup.eu
2022-05-22 15:58:28 +10:00
Christophe Leroy
e0c2ef4321 powerpc/modules: Use PPC_LI macros instead of opencoding
Use PPC_LI_MASK and PPC_LI() instead of opencoding.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/3d56d7bc3200403773d54e62659d0e01292a055d.1652074503.git.christophe.leroy@csgroup.eu
2022-05-22 15:58:27 +10:00
Christophe Leroy
8052d043a4 powerpc/ftrace: Don't use copy_from_kernel_nofault() in module_trampoline_target()
module_trampoline_target() is quite a hot path used when
activating/deactivating function tracer.

Avoid the heavy copy_from_kernel_nofault() by doing four calls
to copy_inst_from_kernel_nofault().

Use __copy_inst_from_kernel_nofault() for the 3 last calls. First call
is done to copy_from_kernel_nofault() to check address is within
kernel space. No risk to wrap out the top of kernel space because the
last page is never mapped so if address is in last page the first copy
will fails and the other ones will never be performed.

And also make it notrace just like all functions that call it.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c55559103e014b7863161559d340e8e9484eaaa6.1652074503.git.christophe.leroy@csgroup.eu
2022-05-22 15:58:27 +10:00
Christophe Leroy
af8b9f352f powerpc/ftrace: Minimise number of #ifdefs
A lot of #ifdefs can be replaced by IS_ENABLED()

Do so.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Fold in changes suggested by Naveen and Christophe on list]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/18ce6708d6f8c71d87436f9c6019f04df4125128.1652074503.git.christophe.leroy@csgroup.eu
2022-05-22 15:58:26 +10:00
Christophe Leroy
b97d0e3dcf powerpc/ftrace: Simplify expected_nop_sequence()
Avoid ifdefs around expected_nop_sequence().

While at it make it a bool.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/305d22472f1f92127fba09692df6bb5d079a8cd0.1652074503.git.christophe.leroy@csgroup.eu
2022-05-22 15:58:26 +10:00
Christophe Leroy
c8deb28095 powerpc/ftrace: Use size macro instead of opencoding
0x80000000 is SZ_2G. Use it.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Fix comparison against unsigned -SZ_2G]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/bb6626e884acffe87b58736291df57db3deaa9b9.1652074503.git.christophe.leroy@csgroup.eu
2022-05-22 15:57:38 +10:00
Dmitry Osipenko
c33fd0b17e powerpc: Use do_kernel_power_off()
Kernel now supports chained power-off handlers. Use do_kernel_power_off()
that invokes chained power-off handlers. It also invokes legacy
pm_power_off() for now, which will be removed once all drivers will
be converted to the new sys-off API.

Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-19 19:30:30 +02:00
Christophe Leroy
e89aa642be powerpc/ftrace: Use PPC_RAW_xxx() macros instead of opencoding.
PPC_RAW_xxx() macros are self explanatory and less error prone
than open coding.

Use them in ftrace.c

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/9292094c9a69cef6d29ee83f435a557b59c45065.1652074503.git.christophe.leroy@csgroup.eu
2022-05-19 23:11:29 +10:00
Christophe Leroy
cf9df92a82 powerpc/ftrace: Use BRANCH_SET_LINK instead of value 1
To make it explicit, use BRANCH_SET_LINK instead of value 1
when calling create_branch().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/d57847063ac93660a5af620d4df1847f10edf61a.1652074503.git.christophe.leroy@csgroup.eu
2022-05-19 23:11:29 +10:00
Christophe Leroy
ccf6607e45 powerpc/ftrace: Remove ftrace_plt_tramps[]
ftrace_plt_tramps table is never filled so it is useless.

Remove it.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/daeeb618a6619e3a7e3f82f1bd83ca7c25af6330.1652074503.git.christophe.leroy@csgroup.eu
2022-05-19 23:11:29 +10:00
Christophe Leroy
c2cba93d1a powerpc/ftrace: Use CONFIG_FUNCTION_TRACER instead of CONFIG_DYNAMIC_FTRACE
Since commit 0c0c52306f ("powerpc: Only support DYNAMIC_FTRACE not
static"), CONFIG_DYNAMIC_FTRACE is always selected when
CONFIG_FUNCTION_TRACER is selected.

To avoid confusion and have the reader wonder what's happen when
CONFIG_FUNCTION_TRACER is selected and CONFIG_DYNAMIC_FTRACE is not,
use CONFIG_FUNCTION_TRACER in ifdefs instead of CONFIG_DYNAMIC_FTRACE.

As CONFIG_FUNCTION_GRAPH_TRACER depends on CONFIG_FUNCTION_TRACER,
ftrace.o doesn't need to appear for both symbols in Makefile.

Then as ftrace.o is built only when CONFIG_FUNCTION_TRACER is selected
ifdef CONFIG_FUNCTION_TRACER is not needed in ftrace.c, and since it
implies CONFIG_DYNAMIC_FTRACE, CONFIG_DYNAMIC_FTRACE is not needed
in ftrace.c

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/628d357503eb90b4a034f99b7df516caaff4d279.1652074503.git.christophe.leroy@csgroup.eu
2022-05-19 23:11:29 +10:00
Christophe Leroy
a3d0f5b4b7 powerpc/ftrace: Don't include ftrace.o for CONFIG_FTRACE_SYSCALLS
Since commit 7bea7ac0ca ("powerpc/syscalls: Fix syscall tracing")
ftrace.o is not needed anymore for CONFIG_FTRACE_SYSCALLS.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/275932a5d61543b825ff9a64f61abed6da5d4a2a.1652074503.git.christophe.leroy@csgroup.eu
2022-05-19 23:11:29 +10:00
Christophe Leroy
23b44fc248 powerpc/ftrace: Make __ftrace_make_{nop/call}() common to PPC32 and PPC64
Since c93d4f6ecf ("powerpc/ftrace: Add module_trampoline_target()
for PPC32"), __ftrace_make_nop() for PPC32 is very similar to the
one for PPC64.

Same for __ftrace_make_call().

Make them common.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/96f53c237316dab4b1b8c682685266faa92da816.1652074503.git.christophe.leroy@csgroup.eu
2022-05-19 23:11:29 +10:00
Christophe Leroy
5b89492c03 powerpc: Finalise cleanup around ABI use
Now that we have CONFIG_PPC64_ELF_ABI_V1 and CONFIG_PPC64_ELF_ABI_V2,
get rid of all indirect detection of ABI version.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/709d9d69523c14c8a9fba4486395dca0f2d675b1.1652074503.git.christophe.leroy@csgroup.eu
2022-05-19 23:11:29 +10:00
Christophe Leroy
7d40aff821 powerpc: Replace PPC64_ELF_ABI_v{1/2} by CONFIG_PPC64_ELF_ABI_V{1/2}
Replace all uses of PPC64_ELF_ABI_v1 and PPC64_ELF_ABI_v2 by
resp CONFIG_PPC64_ELF_ABI_V1 and CONFIG_PPC64_ELF_ABI_V2.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/ba13d59e8c50bc9aa6328f1c7f0c0d0278e0a3a7.1652074503.git.christophe.leroy@csgroup.eu
2022-05-19 23:11:29 +10:00
Christophe Leroy
bbffdd2fc7 powerpc/ftrace: Use patch_instruction() return directly
Instead of returning -EPERM when patch_instruction() fails,
just return what patch_instruction returns.

That simplifies ftrace_modify_code():

	   0:	94 21 ff c0 	stwu    r1,-64(r1)
	   4:	93 e1 00 3c 	stw     r31,60(r1)
	   8:	7c 7f 1b 79 	mr.     r31,r3
	   c:	40 80 00 30 	bge     3c <ftrace_modify_code+0x3c>
	  10:	93 c1 00 38 	stw     r30,56(r1)
	  14:	7c 9e 23 78 	mr      r30,r4
	  18:	7c a4 2b 78 	mr      r4,r5
	  1c:	80 bf 00 00 	lwz     r5,0(r31)
	  20:	7c 1e 28 40 	cmplw   r30,r5
	  24:	40 82 00 34 	bne     58 <ftrace_modify_code+0x58>
	  28:	83 c1 00 38 	lwz     r30,56(r1)
	  2c:	7f e3 fb 78 	mr      r3,r31
	  30:	83 e1 00 3c 	lwz     r31,60(r1)
	  34:	38 21 00 40 	addi    r1,r1,64
	  38:	48 00 00 00 	b       38 <ftrace_modify_code+0x38>
				38: R_PPC_REL24	patch_instruction

Before:

	   0:	94 21 ff c0 	stwu    r1,-64(r1)
	   4:	93 e1 00 3c 	stw     r31,60(r1)
	   8:	7c 7f 1b 79 	mr.     r31,r3
	   c:	40 80 00 4c 	bge     58 <ftrace_modify_code+0x58>
	  10:	93 c1 00 38 	stw     r30,56(r1)
	  14:	7c 9e 23 78 	mr      r30,r4
	  18:	7c a4 2b 78 	mr      r4,r5
	  1c:	80 bf 00 00 	lwz     r5,0(r31)
	  20:	7c 08 02 a6 	mflr    r0
	  24:	90 01 00 44 	stw     r0,68(r1)
	  28:	7c 1e 28 40 	cmplw   r30,r5
	  2c:	40 82 00 48 	bne     74 <ftrace_modify_code+0x74>
	  30:	7f e3 fb 78 	mr      r3,r31
	  34:	48 00 00 01 	bl      34 <ftrace_modify_code+0x34>
				34: R_PPC_REL24	patch_instruction
	  38:	80 01 00 44 	lwz     r0,68(r1)
	  3c:	20 63 00 00 	subfic  r3,r3,0
	  40:	83 c1 00 38 	lwz     r30,56(r1)
	  44:	7c 63 19 10 	subfe   r3,r3,r3
	  48:	7c 08 03 a6 	mtlr    r0
	  4c:	83 e1 00 3c 	lwz     r31,60(r1)
	  50:	38 21 00 40 	addi    r1,r1,64
	  54:	4e 80 00 20 	blr

It improves ftrace activation/deactivation duration by about 3%.

Modify patch_instruction() return on failure to -EPERM in order to
match with ftrace expectations. Other users of patch_instruction()
do not care about the exact error value returned.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/49a8597230713e2633e7d9d7b56140787c4a7e20.1652074503.git.christophe.leroy@csgroup.eu
2022-05-19 23:11:28 +10:00
Christophe Leroy
2c920fca8c powerpc/ftrace: Inline ftrace_modify_code()
Inlining ftrace_modify_code(), it increases a bit the
size of ftrace code but brings 5% improvment on ftrace
activation.

Usually in C files we let gcc decide what to do but here
it really help to 'help' gcc to decide to inline, thought
we don't want to force it with an __always_inline that
would be too much for CONFIG_CC_OPTIMIZE_FOR_SIZE.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1597a06d57cfc80e6853838c4066e799bf6c7977.1652074503.git.christophe.leroy@csgroup.eu
2022-05-19 23:11:28 +10:00
Christophe Leroy
a1facd2578 powerpc/ftrace: Use is_offset_in_branch_range()
Use is_offset_in_branch_range() instead of create_branch()
to check if a target is within branch range.

This patch together with the previous one improves
ftrace activation time by 7%

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/912ae51782f5a53c44e435497c8c3fb5cc632387.1652074503.git.christophe.leroy@csgroup.eu
2022-05-19 23:11:28 +10:00
Christophe Leroy
ae3a2a2188 powerpc/ftrace: Remove redundant create_branch() calls
Since commit d5937db114 ("powerpc/code-patching: Fix patch_branch()
return on out-of-range failure") patch_branch() fails with -ERANGE
when trying to branch out of range.

No need to perform the test twice. Remove redundant create_branch()
calls.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/aa45fbad0b4b7493080835d8276c0cb4ce146503.1652074503.git.christophe.leroy@csgroup.eu
2022-05-19 23:11:28 +10:00
Christophe Leroy
d996d5053e powerpc/ftrace: Refactor prepare_ftrace_return()
When we have CONFIG_DYNAMIC_FTRACE_WITH_ARGS,
prepare_ftrace_return() is called by ftrace_graph_func()
otherwise prepare_ftrace_return() is called from assembly.

Refactor prepare_ftrace_return() into a static
__prepare_ftrace_return() that will be called by both
prepare_ftrace_return() and ftrace_graph_func().

It will allow GCC to fold __prepare_ftrace_return() inside
ftrace_graph_func().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/0d42deafe353980c66cf19d3132638c05ba9f4a9.1652074503.git.christophe.leroy@csgroup.eu
2022-05-19 23:11:28 +10:00
Nicholas Piggin
804c0a166f powerpc/rtas: enture rtas_call is called with MMU enabled
rtas_call must not be called with the MMU disabled because in case
of rtas error, log_error is called which requires MMU enabled. Add
a test and warning for this.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220308135047.478297-14-npiggin@gmail.com
2022-05-19 23:11:27 +10:00
Nicholas Piggin
014b2e896c powerpc/rtas: Leave MSR[RI] enabled over RTAS call
PAPR specifies that RTAS may be called with MSR[RI] enabled if the
calling context is recoverable, and RTAS will manage RI as necessary.
Call the rtas entry point with RI enabled, and add a check to ensure
the caller has RI enabled.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220308135047.478297-10-npiggin@gmail.com
2022-05-19 23:11:27 +10:00
Nicholas Piggin
5c86bd02b3 powerpc/rtas: PACA can be restored directly from SPRG
On 64-bit, PACA is saved in a SPRG so it does not need to be saved on
stack. We also don't need to mask off the top bits for real mode
addresses because the architecture does this for us.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220308135047.478297-8-npiggin@gmail.com
2022-05-19 23:11:27 +10:00
Nicholas Piggin
c5a65e0a42 powerpc/rtas: Call enter_rtas with MSR[EE] disabled
Disable MSR[EE] in C code rather than asm.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220308135047.478297-5-npiggin@gmail.com
2022-05-19 23:11:27 +10:00
Nicholas Piggin
4e949faae2 powerpc/rtas: Fix whitespace in rtas_entry.S
The code was moved verbatim including whitespace cruft. Fix that.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220308135047.478297-4-npiggin@gmail.com
2022-05-19 23:11:27 +10:00
Nicholas Piggin
07940b4b61 powerpc/rtas: Make enter_rtas a nokprobe symbol on 64-bit
This symbol is marked nokprobe on 32-bit but not 64-bit, add it.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220308135047.478297-3-npiggin@gmail.com
2022-05-19 23:11:27 +10:00
Nicholas Piggin
838ee286ec powerpc/rtas: Move rtas entry assembly into its own file
This makes working on the code a bit easier.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220308135047.478297-2-npiggin@gmail.com
2022-05-19 23:11:27 +10:00
Nicholas Piggin
2896b2dff4 powerpc/signal: Report minimum signal frame size to userspace via AT_MINSIGSTKSZ
Implement the AT_MINSIGSTKSZ AUXV entry, allowing userspace to
dynamically size stack allocations in a manner forward-compatible with
new processor state saved in the signal frame

For now these statically find the maximum signal frame size rather than
doing any runtime testing of features to minimise the size.

glibc 2.34 will take advantage of this, as will applications that use
use _SC_MINSIGSTKSZ and _SC_SIGSTKSZ.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
References: 94b07c1f8c ("arm64: signal: Report signal frame size to userspace via auxv")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220307182734.289289-2-npiggin@gmail.com
2022-05-19 23:11:26 +10:00
Nathan Chancellor
4406b12214 powerpc/vdso: Link with ld.lld when requested
The PowerPC vDSO uses $(CC) to link, which differs from the rest of the
kernel, which uses $(LD) directly. As a result, the default linker of
the compiler is used, which may differ from the linker requested by the
builder. For example:

  $ make ARCH=powerpc LLVM=1 mrproper defconfig arch/powerpc/kernel/vdso/
  ...

  $ llvm-readelf -p .comment arch/powerpc/kernel/vdso/vdso{32,64}.so.dbg

  File: arch/powerpc/kernel/vdso/vdso32.so.dbg
  String dump of section '.comment':
  [     0] clang version 14.0.0 (Fedora 14.0.0-1.fc37)

  File: arch/powerpc/kernel/vdso/vdso64.so.dbg
  String dump of section '.comment':
  [     0] clang version 14.0.0 (Fedora 14.0.0-1.fc37)

LLVM=1 sets LD=ld.lld but ld.lld is not used to link the vDSO; GNU ld is
because "ld" is the default linker for clang on most Linux platforms.

This is a problem for Clang's Link Time Optimization as implemented in
the kernel because use of GNU ld with LTO requires the LLVMgold plugin,
which is not technically supported for ld.bfd per
https://llvm.org/docs/GoldPlugin.html. Furthermore, if LLVMgold.so is
missing from a user's system, the build will fail, even though LTO as it
is implemented in the kernel requires ld.lld to avoid this dependency in
the first place.

Ultimately, the PowerPC vDSO should be converted to compiling and
linking with $(CC) and $(LD) respectively but there were issues last
time this was tried, potentially due to older but supported tool
versions. To avoid regressing GCC + binutils, use the compiler option
'-fuse-ld', which tells the compiler which linker to use when it is
invoked as both the compiler and linker. Use '-fuse-ld=lld' when
LD=ld.lld has been specified (CONFIG_LD_IS_LLD) so that the vDSO is
linked with the same linker as the rest of the kernel.

  $ llvm-readelf -p .comment arch/powerpc/kernel/vdso/vdso{32,64}.so.dbg

  File: arch/powerpc/kernel/vdso/vdso32.so.dbg
  String dump of section '.comment':
  [     0] Linker: LLD 14.0.0
  [    14] clang version 14.0.0 (Fedora 14.0.0-1.fc37)

  File: arch/powerpc/kernel/vdso/vdso64.so.dbg
  String dump of section '.comment':
  [     0] Linker: LLD 14.0.0
  [    14] clang version 14.0.0 (Fedora 14.0.0-1.fc37)

LD can be a full path to ld.lld, which will not be handled properly by
'-fuse-ld=lld' if the full path to ld.lld is outside of the compiler's
search path. '-fuse-ld' can take a path to the linker but it is
deprecated in clang 12.0.0; '--ld-path' is preferred for this scenario.

Use '--ld-path' if it is supported, as it will handle a full path or
just 'ld.lld' properly. See the LLVM commit below for the full details
of '--ld-path'.

Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://github.com/ClangBuiltLinux/linux/issues/774
Link: 1bc5c84710
Link: https://lore.kernel.org/r/20220511185001.3269404-3-nathan@kernel.org
2022-05-19 23:11:26 +10:00
Nathan Chancellor
e247172854 powerpc/vdso: Remove unused ENTRY in linker scripts
When linking vdso{32,64}.so.dbg with ld.lld, there is a warning about
not finding _start for the starting address:

  ld.lld: warning: cannot find entry symbol _start; not setting start address
  ld.lld: warning: cannot find entry symbol _start; not setting start address

Looking at GCC + GNU ld, the entry point address is 0x0:

  $ llvm-readelf -h vdso{32,64}.so.dbg &| rg "(File|Entry point address):"
  File: vdso32.so.dbg
    Entry point address:               0x0
  File: vdso64.so.dbg
    Entry point address:               0x0

This matches what ld.lld emits:

  $ powerpc64le-linux-gnu-readelf -p .comment vdso{32,64}.so.dbg

  File: vdso32.so.dbg

  String dump of section '.comment':
    [     0]  Linker: LLD 14.0.0
    [    14]  clang version 14.0.0 (Fedora 14.0.0-1.fc37)

  File: vdso64.so.dbg

  String dump of section '.comment':
    [     0]  Linker: LLD 14.0.0
    [    14]  clang version 14.0.0 (Fedora 14.0.0-1.fc37)

  $ llvm-readelf -h vdso{32,64}.so.dbg &| rg "(File|Entry point address):"
  File: vdso32.so.dbg
    Entry point address:               0x0
  File: vdso64.so.dbg
    Entry point address:               0x0

Remove ENTRY to remove the warning, as it is unnecessary for the vDSO to
function correctly.

Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220511185001.3269404-2-nathan@kernel.org
2022-05-19 23:11:26 +10:00
Kevin Hao
d9e5c3e9e7 powerpc: Export mmu_feature_keys[] as non-GPL
When the mmu_feature_keys[] was introduced in the commit c12e6f24d4
("powerpc: Add option to use jump label for mmu_has_feature()"),
it is unlikely that it would be used either directly or indirectly in
the out of tree modules. So we exported it as GPL only.

But with the evolution of the codes, especially the PPC_KUAP support, it
may be indirectly referenced by some primitive macro or inline functions
such as get_user() or __copy_from_user_inatomic(), this will make it
impossible to build many non GPL modules (such as ZFS) on ppc
architecture. Fix this by exposing the mmu_feature_keys[] to the non-GPL
modules too.

Fixes: 7613f5a66b ("powerpc/64s/kuap: Use mmu_has_feature()")
Reported-by: Nathaniel Filardo <nwfilardo@gmail.com>
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220329085709.4132729-1-haokexin@gmail.com
2022-05-19 23:11:26 +10:00
Guilherme G. Piccoli
e2aa34ce80 powerpc/setup: Refactor/untangle panic notifiers
The panic notifiers infrastructure is a bit limited in the scope of
the callbacks - basically every kind of functionality is dropped
in a list that runs in the same point during the kernel panic path.
This is not really on par with the complexities and particularities
of architecture / hypervisors' needs, and a refactor is ongoing.

As part of this refactor, it was observed that powerpc has 2 notifiers,
with mixed goals: one is just a KASLR offset dumper, whereas the other
aims to hard-disable IRQs (necessary on panic path), warn firmware of
the panic event (fadump) and run low-level platform-specific machinery
that might stop kernel execution and never come back.

Clearly, the 2nd notifier has opposed goals: disable IRQs / fadump
should run earlier while low-level platform actions should
run late since it might not even return. Hence, this patch decouples
the notifiers splitting them in three:

- First one is responsible for hard-disable IRQs and fadump,
should run early;

- The kernel KASLR offset dumper is really an informative notifier,
harmless and may run at any moment in the panic path;

- The last notifier should run last, since it aims to perform
low-level actions for specific platforms, and might never return.
It is also only registered for 2 platforms, pseries and ps3.

The patch better documents the notifiers and clears the code too,
also removing a useless header.

Currently no functionality change should be observed, but after
the planned panic refactor we should expect more panic reliability
with this patch.

Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Reviewed-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220427224924.592546-9-gpiccoli@igalia.com
2022-05-19 23:11:26 +10:00
Michael Ellerman
b104e41cda Merge branch 'topic/ppc-kvm' into next
Merge our KVM topic branch.
2022-05-19 23:10:42 +10:00
Alexey Kardashevskiy
cad32d9d42 KVM: PPC: Book3s: Retire H_PUT_TCE/etc real mode handlers
LoPAPR defines guest visible IOMMU with hypercalls to use it -
H_PUT_TCE/etc. Implemented first on POWER7 where hypercalls would trap
in the KVM in the real mode (with MMU off). The problem with the real mode
is some memory is not available and some API usage crashed the host but
enabling MMU was an expensive operation.

The problems with the real mode handlers are:
1. Occasionally these cannot complete the request so the code is
copied+modified to work in the virtual mode, very little is shared;
2. The real mode handlers have to be linked into vmlinux to work;
3. An exception in real mode immediately reboots the machine.

If the small DMA window is used, the real mode handlers bring better
performance. However since POWER8, there has always been a bigger DMA
window which VMs use to map the entire VM memory to avoid calling
H_PUT_TCE. Such 1:1 mapping happens once and uses H_PUT_TCE_INDIRECT
(a bulk version of H_PUT_TCE) which virtual mode handler is even closer
to its real mode version.

On POWER9 hypercalls trap straight to the virtual mode so the real mode
handlers never execute on POWER9 and later CPUs.

So with the current use of the DMA windows and MMU improvements in
POWER9 and later, there is no point in duplicating the code.
The 32bit passed through devices may slow down but we do not have many
of these in practice. For example, with this applied, a 1Gbit ethernet
adapter still demostrates above 800Mbit/s of actual throughput.

This removes the real mode handlers from KVM and related code from
the powernv platform.

This updates the list of implemented hcalls in KVM-HV as the realmode
handlers are removed.

This changes ABI - kvmppc_h_get_tce() moves to the KVM module and
kvmppc_find_table() is static now.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220506053755.3820702-1-aik@ozlabs.ru
2022-05-19 00:44:01 +10:00
Michael Ellerman
a5fc286f69 Merge branch 'fixes' into next
Merge our fixes branch from this cycle. In particular this brings in a
papr_scm.c change which a subsequent patch has a dependency on.
2022-05-19 00:11:51 +10:00
Laurent Dufour
b6b1c3ce06 powerpc/rtas: Keep MSR[RI] set when calling RTAS
RTAS runs in real mode (MSR[DR] and MSR[IR] unset) and in 32-bit big
endian mode (MSR[SF,LE] unset).

The change in MSR is done in enter_rtas() in a relatively complex way,
since the MSR value could be hardcoded.

Furthermore, a panic has been reported when hitting the watchdog interrupt
while running in RTAS, this leads to the following stack trace:

  watchdog: CPU 24 Hard LOCKUP
  watchdog: CPU 24 TB:997512652051031, last heartbeat TB:997504470175378 (15980ms ago)
  ...
  Supported: No, Unreleased kernel
  CPU: 24 PID: 87504 Comm: drmgr Kdump: loaded Tainted: G            E  X    5.14.21-150400.71.1.bz196362_2-default #1 SLE15-SP4 (unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c
  NIP:  000000001fb41050 LR: 000000001fb4104c CTR: 0000000000000000
  REGS: c00000000fc33d60 TRAP: 0100   Tainted: G            E  X     (5.14.21-150400.71.1.bz196362_2-default)
  MSR:  8000000002981000 <SF,VEC,VSX,ME>  CR: 48800002  XER: 20040020
  CFAR: 000000000000011c IRQMASK: 1
  GPR00: 0000000000000003 ffffffffffffffff 0000000000000001 00000000000050dc
  GPR04: 000000001ffb6100 0000000000000020 0000000000000001 000000001fb09010
  GPR08: 0000000020000000 0000000000000000 0000000000000000 0000000000000000
  GPR12: 80040000072a40a8 c00000000ff8b680 0000000000000007 0000000000000034
  GPR16: 000000001fbf6e94 000000001fbf6d84 000000001fbd1db0 000000001fb3f008
  GPR20: 000000001fb41018 ffffffffffffffff 000000000000017f fffffffffffff68f
  GPR24: 000000001fb18fe8 000000001fb3e000 000000001fb1adc0 000000001fb1cf40
  GPR28: 000000001fb26000 000000001fb460f0 000000001fb17f18 000000001fb17000
  NIP [000000001fb41050] 0x1fb41050
  LR [000000001fb4104c] 0x1fb4104c
  Call Trace:
  Instruction dump:
  XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
  XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
  Oops: Unrecoverable System Reset, sig: 6 [#1]
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
  ...
  Supported: No, Unreleased kernel
  CPU: 24 PID: 87504 Comm: drmgr Kdump: loaded Tainted: G            E  X    5.14.21-150400.71.1.bz196362_2-default #1 SLE15-SP4 (unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c
  NIP:  000000001fb41050 LR: 000000001fb4104c CTR: 0000000000000000
  REGS: c00000000fc33d60 TRAP: 0100   Tainted: G            E  X     (5.14.21-150400.71.1.bz196362_2-default)
  MSR:  8000000002981000 <SF,VEC,VSX,ME>  CR: 48800002  XER: 20040020
  CFAR: 000000000000011c IRQMASK: 1
  GPR00: 0000000000000003 ffffffffffffffff 0000000000000001 00000000000050dc
  GPR04: 000000001ffb6100 0000000000000020 0000000000000001 000000001fb09010
  GPR08: 0000000020000000 0000000000000000 0000000000000000 0000000000000000
  GPR12: 80040000072a40a8 c00000000ff8b680 0000000000000007 0000000000000034
  GPR16: 000000001fbf6e94 000000001fbf6d84 000000001fbd1db0 000000001fb3f008
  GPR20: 000000001fb41018 ffffffffffffffff 000000000000017f fffffffffffff68f
  GPR24: 000000001fb18fe8 000000001fb3e000 000000001fb1adc0 000000001fb1cf40
  GPR28: 000000001fb26000 000000001fb460f0 000000001fb17f18 000000001fb17000
  NIP [000000001fb41050] 0x1fb41050
  LR [000000001fb4104c] 0x1fb4104c
  Call Trace:
  Instruction dump:
  XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
  XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
  ---[ end trace 3ddec07f638c34a2 ]---

This happens because MSR[RI] is unset when entering RTAS but there is no
valid reason to not set it here.

RTAS is expected to be called with MSR[RI] as specified in PAPR+ section
"7.2.1 Machine State":

  R1–7.2.1–9. If called with MSR[RI] equal to 1, then RTAS must protect
  its own critical regions from recursion by setting the MSR[RI] bit to
  0 when in the critical regions.

Fixing this by reviewing the way MSR is compute before calling RTAS. Now a
hardcoded value meaning real mode, 32 bits big endian mode and Recoverable
Interrupt is loaded. In the case MSR[S] is set, it will remain set while
entering RTAS as only urfid can unset it (thanks Fabiano).

In addition a check is added in do_enter_rtas() to detect calls made with
MSR[RI] unset, as we are forcing it on later.

This patch has been tested on the following machines:
Power KVM Guest
  P8 S822L (host Ubuntu kernel 5.11.0-49-generic)
PowerVM LPAR
  P8 9119-MME (FW860.A1)
  p9 9008-22L (FW950.00)
  P10 9080-HEX (FW1010.00)

Suggested-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220504101244.12107-1-ldufour@linux.ibm.com
2022-05-11 23:06:40 +10:00
Linus Torvalds
e3de3a1cda powerpc fixes for 5.18 #4
- Fix the DWARF CFI in our VDSO time functions, allowing gdb to backtrace through them
    correctly.
 
  - Fix a buffer overflow in the papr_scm driver, only triggerable by hypervisor input.
 
  - A fix in the recently added QoS handling for VAS (used for communicating with
    coprocessors).
 
 Thanks to: Alan Modra, Haren Myneni, Kajol Jain, Segher Boessenkool.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmJ3r7kTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgM1LEACU87gXiSJJW99qD5IvoPncR1nRjd+3
 zTAch9Dk70Kt/1Zkn0bEGjkua6YwcOdlA7QXiHBR2HAYli85VWK/w9Vz/TLTbL9/
 SrDvSfzwmqbvU61A2ZppvH487z8pfEBmviv3SCrmB3xZWtttYkatEj4A1EjdBJUI
 +xq0JgrXj8rAayRpkJS5XEUpkw8eXJ85X1WXdx+peIGvcRKB+n46HyCYhsl3/YVv
 pAn6jcnNnPqYzXeE0sQ4ZpybzCkzqVHC4SyCemkp8PWfqyL8LqgIvz2qtCvXAzij
 SJikxmHUN3XvHCF4aOgJG6OTkMEz92cpH0hGsb59c4usPCNlRFMuqhJxuYYmNGT3
 FtDUrrptsQ5ZRdWttQzi0RFjK0klro5VKvMJRv7uDWIlaPwl3Vi8DxvSrzCSdQPE
 Q1XsJS7B/io1JCCzrhrJyRQkIt+Z9e1/3xm618ide1UKQkqVYApZcImJDk7DlDCV
 QL1mtqHgasQjDLRkQq/fil/vyt2byw2uzCy6j0X/WQYrKJlUmbXKlcbdbbeCH2qZ
 8NndD96wlw+dd/q2u/ZdngQ8f/68n6+a/Zxv7eAbb01JY04VFsCXgdgGHpzhBiJ5
 YWqp5qzzuTcvhSYhyfdqFd9KCxxGRIag6jmhs6vbfQbMlZMoo9Jr4vaVm1aIW27t
 MRhuEwi2KzQI2A==
 =sVeX
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.18-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - Fix the DWARF CFI in our VDSO time functions, allowing gdb to
   backtrace through them correctly.

 - Fix a buffer overflow in the papr_scm driver, only triggerable by
   hypervisor input.

 - A fix in the recently added QoS handling for VAS (used for
   communicating with coprocessors).

Thanks to Alan Modra, Haren Myneni, Kajol Jain, and Segher Boessenkool.

* tag 'powerpc-5.18-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/papr_scm: Fix buffer overflow issue with CONFIG_FORTIFY_SOURCE
  powerpc/vdso: Fix incorrect CFI in gettimeofday.S
  powerpc/pseries/vas: Use QoS credits from the userspace
2022-05-08 11:38:23 -07:00
Christophe Leroy
e59596a2d6 powerpc: Use static call for get_irq()
__do_irq() inconditionnaly calls ppc_md.get_irq()

That's definitely a hot path.

At the time being ppc_md.get_irq address is read every time
from ppc_md structure.

Replace that call by a static call, and initialise that
call after ppc_md.init_IRQ() has set ppc_md.get_irq.

Emit a warning and don't set the static call if ppc_md.init_IRQ()
is still NULL, that way the kernel won't blow up if for some
reason ppc_md.get_irq() doesn't get properly set.

With the patch:

	00000000 <__SCT__ppc_get_irq>:
	   0:	48 00 00 20 	b       20 <__static_call_return0>	<== Replaced by 'b <ppc_md.get_irq>' at runtime
...
	00000020 <__static_call_return0>:
	  20:	38 60 00 00 	li      r3,0
	  24:	4e 80 00 20 	blr
...
	00000058 <__do_irq>:
...
	  64:	48 00 00 01 	bl      64 <__do_irq+0xc>
				64: R_PPC_REL24	__SCT__ppc_get_irq
	  68:	2c 03 00 00 	cmpwi   r3,0
...

Before the patch:

	00000038 <__do_irq>:
...
	  3c:	3d 20 00 00 	lis     r9,0
				3e: R_PPC_ADDR16_HA	ppc_md+0x1c
...
	  44:	81 29 00 00 	lwz     r9,0(r9)
				46: R_PPC_ADDR16_LO	ppc_md+0x1c
...
	  4c:	7d 29 03 a6 	mtctr   r9
	  50:	4e 80 04 21 	bctrl
	  54:	2c 03 00 00 	cmpwi   r3,0
...

On PPC64 which doesn't implement static calls yet we get:

00000000000000d0 <__do_irq>:
...
      dc:	00 00 22 3d 	addis   r9,r2,0
			dc: R_PPC64_TOC16_HA	.data+0x8
...
      e4:	00 00 89 e9 	ld      r12,0(r9)
			e4: R_PPC64_TOC16_LO_DS	.data+0x8
...
      f0:	a6 03 89 7d 	mtctr   r12
      f4:	18 00 41 f8 	std     r2,24(r1)
      f8:	21 04 80 4e 	bctrl
      fc:	18 00 41 e8 	ld      r2,24(r1)
...

So on PPC64 that's similar to what we get without static calls.
But at least until ppc_md.get_irq() is set the call is to
__static_call_return0.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/afb92085f930651d8b1063e4d4bf0396c80ebc7d.1647002274.git.christophe.leroy@csgroup.eu
2022-05-08 22:15:40 +10:00
Christophe Leroy
e6f6390ab7 powerpc: Add missing headers
Don't inherit headers "by chances" from asm/prom.h, asm/mpc52xx.h,
asm/pci.h etc...

Include the needed headers, and remove asm/prom.h when it was
needed exclusively for pulling necessary headers.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/be8bdc934d152a7d8ee8d1a840d5596e2f7d85e0.1646767214.git.christophe.leroy@csgroup.eu
2022-05-08 22:15:40 +10:00
Christophe Leroy
86c38fec69 powerpc: Remove asm/prom.h from all files that don't need it
Several files include asm/prom.h for no reason.

Clean it up.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Drop change to prom_parse.c as reported by lkp@intel.com]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/7c9b8fda63dcf63e1b28f43e7ebdb95182cbc286.1646767214.git.christophe.leroy@csgroup.eu
2022-05-08 22:15:04 +10:00
Eric W. Biederman
5bd2e97c86 fork: Generalize PF_IO_WORKER handling
Add fn and fn_arg members into struct kernel_clone_args and test for
them in copy_thread (instead of testing for PF_KTHREAD | PF_IO_WORKER).
This allows any task that wants to be a user space task that only runs
in kernel mode to use this functionality.

The code on x86 is an exception and still retains a PF_KTHREAD test
because x86 unlikely everything else handles kthreads slightly
differently than user space tasks that start with a function.

The functions that created tasks that start with a function
have been updated to set ".fn" and ".fn_arg" instead of
".stack" and ".stack_size".  These functions are fork_idle(),
create_io_thread(), kernel_thread(), and user_mode_thread().

Link: https://lkml.kernel.org/r/20220506141512.516114-4-ebiederm@xmission.com
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-05-07 09:01:59 -05:00
Eric W. Biederman
c5febea095 fork: Pass struct kernel_clone_args into copy_thread
With io_uring we have started supporting tasks that are for most
purposes user space tasks that exclusively run code in kernel mode.

The kernel task that exec's init and tasks that exec user mode
helpers are also user mode tasks that just run kernel code
until they call kernel execve.

Pass kernel_clone_args into copy_thread so these oddball
tasks can be supported more cleanly and easily.

v2: Fix spelling of kenrel_clone_args on h8300
Link: https://lkml.kernel.org/r/20220506141512.516114-2-ebiederm@xmission.com
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2022-05-07 09:01:48 -05:00
Christophe Leroy
0aa297e73b powerpc/64: Move pci_device_from_OF_node() out of asm/pci-bridge.h
Move pci_device_from_OF_node() in pci64.c because it needs definition
of struct device_node and is not worth inlining.

ppc32.c already has it in pci32.c.

That way pci-bridge.h doesn't need linux/of.h (Brought by asm/prom.h
via asm/pci.h)

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/3c88286b55413730d7784133993a46ef4a3607ce.1646767214.git.christophe.leroy@csgroup.eu
2022-05-06 00:00:20 +10:00
Nicholas Piggin
a553476c44 powerpc/64: remove system call instruction emulation
emulate_step() instruction emulation including sc instruction emulation
initially appeared in xmon. It was then moved into sstep.c where kprobes
could use it too, and later hw_breakpoint and uprobes started to use it.

Until uprobes, the only instruction emulation users were for kernel
mode instructions.

- xmon only steps / breaks on kernel addresses.
- kprobes is kernel only.
- hw_breakpoint only emulates kernel instructions, single steps user.

At one point, there was support for the kernel to execute sc
instructions, although that is long removed and it's not clear whether
there were any in-tree users. So system call emulation is not required
by the above users.

uprobes uses emulate_step and it appears possible to emulate sc
instruction in userspace. Userspace system call emulation is broken and
it's not clear it ever worked well.

The big complication is that userspace takes an interrupt to the kernel
to emulate the instruction. The user->kernel interrupt sets up registers
and interrupt stack frame expecting to return to userspace, then system
call instruction emulation re-directs that stack frame to the kernel,
early in the system call interrupt handler. This means the interrupt
return code takes the kernel->kernel restore path, which does not
restore everything as the system call interrupt handler would expect
coming from userspace. regs->iamr appears to get lost for example,
because the kernel->kernel return does not restore the user iamr.
Accounting such as irqflags tracing and CPU accounting does not get
flipped back to user mode as the system call handler expects, so those
appear to enter the kernel twice without returning to userspace.

These things may be individually fixable with various complication, but
it is a big complexity for unclear real benefit.

Furthermore, it is not possible to single step a system call instruction
since it causes an interrupt. As such, a separate patch disables probing
on system call instructions.

This patch removes system call emulation and disables stepping system
calls.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[minor commit log edit, and also get rid of '#ifdef CONFIG_PPC64']
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/a412e3b3791ed83de18704c8d90f492e7a0049c0.1648648712.git.naveen.n.rao@linux.vnet.ibm.com
2022-05-06 00:00:20 +10:00
Naveen N. Rao
54cdacd7d3 powerpc: Reject probes on instructions that can't be single stepped
Per the ISA, a Trace interrupt is not generated for:
- [h|u]rfi[d]
- rfscv
- sc, scv, and Trap instructions that trap
- Power-Saving Mode instructions
- other instructions that cause interrupts (other than Trace interrupts)
- the first instructions of any interrupt handler (applies to Branch and Single Step tracing;
CIABR matches may still occur)
- instructions that are emulated by software

Add a helper to check for instructions belonging to the first four
categories above and to reject kprobes, uprobes and xmon breakpoints on
such instructions. We reject probing on instructions belonging to these
categories across all ISA versions and across both BookS and BookE.

For trap instructions, we can't know in advance if they can cause a
trap, and there is no good reason to allow probing on those. Also,
uprobes already refuses to probe trap instructions and kprobes does not
allow probes on trap instructions used for kernel warnings and bugs. As
such, stop allowing any type of probes/breakpoints on trap instruction
across uprobes, kprobes and xmon.

For some of the fp/altivec instructions that can generate an interrupt
and which we emulate in the kernel (altivec assist, for example), we
check and turn off single stepping in emulate_single_step().

Instructions generating a DSI are restarted and single stepping normally
completes once the instruction is completed.

In uprobes, if a single stepped instruction results in a non-fatal
signal to be delivered to the task, such signals are "delayed" until
after the instruction completes. For fatal signals, single stepping is
cancelled and the instruction restarted in-place so that core dump
captures proper addresses.

In kprobes, we do not allow probes on instructions having an extable
entry and we also do not allow probing interrupt vectors.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f56ee979d50b8711fae350fc97870f3ca34acd75.1648648712.git.naveen.n.rao@linux.vnet.ibm.com
2022-05-06 00:00:20 +10:00
Julia Lawall
1fd02f6605 powerpc: fix typos in comments
Various spelling mistakes in comments.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220430185654.5855-1-Julia.Lawall@inria.fr
2022-05-05 22:12:44 +10:00