mirror_ubuntu-kernels

mirror of https://git.proxmox.com/git/mirror_ubuntu-kernels.git synced 2025-12-24 02:01:53 +00:00

Author	SHA1	Message	Date
Christophe Leroy	d3e32b997a	powerpc: Use 'struct func_desc' instead of 'struct ppc64_opd_entry' 'struct ppc64_opd_entry' is somehow redundant with 'struct func_desc', the later is more correct/complete as it includes the third field which is unused. So use 'struct func_desc' instead of 'struct ppc64_opd_entry' Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Daniel Axtens <dja@axtens.net> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/34e76bac6cbe95a63ecd37df69fb7feb93b0ea7c.1644928018.git.christophe.leroy@csgroup.eu	2022-02-16 23:25:10 +11:00
Christophe Leroy	5b23cb8cc6	powerpc: Move and rename func_descr_t There are three architectures with function descriptors, try to have common names for the address they contain in order to refactor some functions into generic functions later. powerpc has 'entry' ia64 has 'ip' parisc has 'addr' Vote for 'addr' and update 'func_descr_t' accordingly. Move it in asm/elf.h to have it at the same place on all three architectures, remove the typedef which hides its real type, and change it to a smoother name 'struct func_desc'. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/529b2ba1d001e8f628ef0d30e8044c9b3d0a4921.1644928018.git.christophe.leroy@csgroup.eu	2022-02-16 23:25:10 +11:00
Vaibhav Jain	bbbca72352	powerpc/papr_scm: Implement initial support for injecting smart errors Presently PAPR doesn't support injecting smart errors on an NVDIMM. This makes testing the NVDIMM health reporting functionality difficult as simulating NVDIMM health related events need a hacked up qemu version. To solve this problem this patch proposes simulating certain set of NVDIMM health related events in papr_scm. Specifically 'fatal' health state and 'dirty' shutdown state. These error can be injected via the user-space 'ndctl-inject-smart(1)' command. With the proposed patch and corresponding ndctl patches following command flow is expected: $ sudo ndctl list -DH -d nmem0 ... "health_state":"ok", "shutdown_state":"clean", ... # inject unsafe shutdown and fatal health error $ sudo ndctl inject-smart nmem0 -Uf ... "health_state":"fatal", "shutdown_state":"dirty", ... # uninject all errors $ sudo ndctl inject-smart nmem0 -N ... "health_state":"ok", "shutdown_state":"clean", ... The patch adds a new member 'health_bitmap_inject_mask' inside struct papr_scm_priv which is then bitwise ANDed to the health bitmap fetched from the hypervisor. The value for 'health_bitmap_inject_mask' is accessible from sysfs at nmemX/papr/health_bitmap_inject. A new PDSM named 'SMART_INJECT' is proposed that accepts newly introduced 'struct nd_papr_pdsm_smart_inject' as payload thats exchanged between libndctl and papr_scm to indicate the requested smart-error states. When the processing the PDSM 'SMART_INJECT', papr_pdsm_smart_inject() constructs a pair or 'inject_mask' and 'clear_mask' bitmaps from the payload and bit-blt it to the 'health_bitmap_inject_mask'. This ensures the after being fetched from the hypervisor, the health_bitmap reflects requested smart-error states. Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220124202204.1488346-1-vaibhav@linux.ibm.com	2022-02-16 23:10:47 +11:00
Christophe Leroy	fc75f87337	powerpc/ftrace: Have arch_ftrace_get_regs() return NULL unless FL_SAVE_REGS is set When FL_SAVE_REGS is not set we get here via ftrace_caller() which doesn't save all registers. ftrace_caller() explicitely clears regs.msr, so we can rely on it to know where we come from. We don't expect MSR register to be 0 at all when involving ftrace. Fixes: `40b035efe2` ("powerpc/ftrace: Implement CONFIG_DYNAMIC_FTRACE_WITH_ARGS") Reported-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/2f9a7e898c93cc7438ef5ccd47cb9c3a9c5b53ef.1644949750.git.christophe.leroy@csgroup.eu	2022-02-16 23:09:47 +11:00
Christophe Leroy	38a1756861	powerpc: Don't allow the use of EMIT_BUG_ENTRY with BUGFLAG_WARNING Warnings in assembly must use EMIT_WARN_ENTRY in order to generate the necessary entry in exception table. Check in EMIT_BUG_ENTRY that flags don't include BUGFLAG_WARNING. This change avoids problems like the one fixed by commit `fd1eaaaaa6` ("powerpc/64s: Use EMIT_WARN_ENTRY for SRR debug warnings"). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/ddcb422102a37eb45f57694c7ef0ec6187964dff.1644742951.git.christophe.leroy@csgroup.eu	2022-02-14 13:06:43 +11:00
Michael Ellerman	5a72345e6a	powerpc: Fix STACKTRACE=n build Our skiroot_defconfig doesn't enable FTRACE, and so doesn't get STACKTRACE enabled either. That leads to a build failure since commit `1614b2b11f` ("arch: Make ARCH_STACKWALK independent of STACKTRACE") made stacktrace.c build even when STACKTRACE=n. arch/powerpc/kernel/stacktrace.c: In function ‘handle_backtrace_ipi’: arch/powerpc/kernel/stacktrace.c:171:2: error: implicit declaration of function ‘nmi_cpu_backtrace’ 171 \| nmi_cpu_backtrace(regs); \| ^~~~~~~~~~~~~~~~~ arch/powerpc/kernel/stacktrace.c: In function ‘arch_trigger_cpumask_backtrace’: arch/powerpc/kernel/stacktrace.c:226:2: error: implicit declaration of function ‘nmi_trigger_cpumask_backtrace’ 226 \| nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_backtrace_ipi); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ This happens because our headers haven't defined arch_trigger_cpumask_backtrace, which causes lib/nmi_backtrace.c not to build nmi_cpu_backtrace(). The code in question doesn't actually depend on STACKTRACE=y, that was just added because arch_trigger_cpumask_backtrace() lived in stacktrace.c for convenience. So drop the dependency on CONFIG_STACKTRACE, that causes lib/nmi_backtrace.c to build nmi_cpu_backtrace() etc. and fixes the build. Fixes: `1614b2b11f` ("arch: Make ARCH_STACKWALK independent of STACKTRACE") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220212111349.2806972-1-mpe@ellerman.id.au	2022-02-12 22:47:44 +11:00
Aneesh Kumar K.V	2354ad252b	powerpc/mm: Update default hugetlb size early commit: `d9c2340052` ("Do not depend on MAX_ORDER when grouping pages by mobility") introduced pageblock_order which will be used to group pages better. The kernel now groups pages based on the value of HPAGE_SHIFT. Hence HPAGE_SHIFT should be set before we call set_pageblock_order. set_pageblock_order happens early in the boot and default hugetlb page size should be initialized before that to compute the right pageblock_order value. Currently, default hugetlbe page size is set via arch_initcalls which happens late in the boot as shown via the below callstack: [c000000007383b10] [c000000001289328] hugetlbpage_init+0x2b8/0x2f8 [c000000007383bc0] [c0000000012749e4] do_one_initcall+0x14c/0x320 [c000000007383c90] [c00000000127505c] kernel_init_freeable+0x410/0x4e8 [c000000007383da0] [c000000000012664] kernel_init+0x30/0x15c [c000000007383e10] [c00000000000cf14] ret_from_kernel_thread+0x5c/0x64 and the pageblock_order initialization is done early during the boot. [c0000000018bfc80] [c0000000012ae120] set_pageblock_order+0x50/0x64 [c0000000018bfca0] [c0000000012b3d94] sparse_init+0x188/0x268 [c0000000018bfd60] [c000000001288bfc] initmem_init+0x28c/0x328 [c0000000018bfe50] [c00000000127b370] setup_arch+0x410/0x480 [c0000000018bfed0] [c00000000127401c] start_kernel+0xb8/0x934 [c0000000018bff90] [c00000000000d984] start_here_common+0x1c/0x98 delaying default hugetlb page size initialization implies the kernel will initialize pageblock_order to (MAX_ORDER - 1) which is not an optimal value for mobility grouping. IIUC we always had this issue. But it was not a problem for hash translation mode because (MAX_ORDER - 1) is the same as HUGETLB_PAGE_ORDER (8) in the case of hash (16MB). With radix, HUGETLB_PAGE_ORDER will be 5 (2M size) and hence pageblock_order should be 5 instead of 8. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220211065215.101767-1-aneesh.kumar@linux.ibm.com	2022-02-12 22:47:44 +11:00
Nathan Lynch	92e6dc257b	powerpc/pseries: make pseries_devicetree_update() static pseries_devicetree_update() has only one call site, in the same file in which it is defined. Make it static. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220207221247.354454-1-nathanl@linux.ibm.com	2022-02-12 22:47:44 +11:00
Christophe Leroy	692b21d780	powerpc/vdso: Move cvdso_call macro into gettimeofday.S Now that gettimeofday.S is unique, move cvdso_call macro into that file which is the only user. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/72720359d4c58e3a3b96dd74952741225faac3de.1642782130.git.christophe.leroy@csgroup.eu	2022-02-12 22:47:44 +11:00
Christophe Leroy	9b97bea900	powerpc/vdso: Remove cvdso_call_time macro cvdso_call_time macro is very similar to cvdso_call macro. Add a call_time argument to cvdso_call which is 0 by default and set to 1 when using cvdso_call to call __c_kernel_time(). Return returned value as is with CR[SO] cleared when it is used for time(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/837a260ad86fc1ce297a562c2117fd69be5f7b5c.1642782130.git.christophe.leroy@csgroup.eu	2022-02-12 22:47:43 +11:00
Christophe Leroy	f061fb03ee	powerpc/vdso: augment VDSO32 functions to support 64 bits build VDSO64 cacheflush.S datapage.S gettimeofday.S and vgettimeofday.c are very similar to their VDSO32 counterpart. VDSO32 counterpart is already more complete than the VDSO64 version as it supports both PPC32 vdso and 32 bits VDSO for PPC64. Use compat macros wherever necessary in PPC32 files so that they can also be used to build VDSO64. vdso64/note.S is already a link to vdso32/note.S so no change is required. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c2cbb8f046b7efc251053521dc39b752795e26b7.1642782130.git.christophe.leroy@csgroup.eu	2022-02-12 22:47:43 +11:00
Christophe Leroy	f222ab83df	powerpc: Add set_memory_{p/np}() and remove set_memory_attr() set_memory_attr() was implemented by commit `4d1755b6a7` ("powerpc/mm: implement set_memory_attr()") because the set_memory_xx() couldn't be used at that time to modify memory "on the fly" as explained it the commit. But set_memory_attr() uses set_pte_at() which leads to warnings when CONFIG_DEBUG_VM is selected, because set_pte_at() is unexpected for updating existing page table entries. The check could be bypassed by using __set_pte_at() instead, as it was the case before commit `c988cfd38e` ("powerpc/32: use set_memory_attr()") but since commit `9f7853d760` ("powerpc/mm: Fix set_memory_() against concurrent accesses") it is now possible to use set_memory_xx() functions to update page table entries "on the fly" because the update is now atomic. For DEBUG_PAGEALLOC we need to clear and set back _PAGE_PRESENT. Add set_memory_np() and set_memory_p() for that. Replace all uses of set_memory_attr() by the relevant set_memory_xx() and remove set_memory_attr(). Fixes: `c988cfd38e` ("powerpc/32: use set_memory_attr()") Cc: stable@vger.kernel.org Reported-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Maxime Bizon <mbizon@freebox.fr> Reviewed-by: Russell Currey <ruscur@russell.cc> Depends-on: `9f7853d760` ("powerpc/mm: Fix set_memory_() against concurrent accesses") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/cda2b44b55c96f9ac69fa92e68c01084ec9495c5.1640344012.git.christophe.leroy@csgroup.eu	2022-02-12 22:47:42 +11:00
Ard Biesheuvel	297565aa22	lib/xor: make xor prototypes more friendly to compiler vectorization Modern compilers are perfectly capable of extracting parallelism from the XOR routines, provided that the prototypes reflect the nature of the input accurately, in particular, the fact that the input vectors are expected not to overlap. This is not documented explicitly, but is implied by the interchangeability of the various C routines, some of which use temporary variables while others don't: this means that these routines only behave identically for non-overlapping inputs. So let's decorate these input vectors with the __restrict modifier, which informs the compiler that there is no overlap. While at it, make the input-only vectors pointer-to-const as well. Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://github.com/ClangBuiltLinux/linux/issues/563 Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2022-02-11 20:39:39 +11:00
Christophe Leroy	830213786c	powerpc/ftrace: directly call of function graph tracer by ftrace caller Modify function graph tracer to be handled directly by the standard ftrace caller. This is made possible as powerpc now supports CONFIG_DYNAMIC_FTRACE_WITH_ARGS. This change simplifies the call of function graph ftrace. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/04d196585ff81bde06a000bd9c633a33a5b21130.1640017960.git.christophe.leroy@csgroup.eu	2022-02-07 21:03:11 +11:00
Christophe Leroy	40b035efe2	powerpc/ftrace: Implement CONFIG_DYNAMIC_FTRACE_WITH_ARGS Implement CONFIG_DYNAMIC_FTRACE_WITH_ARGS. It accelerates the call of livepatching. Also note that powerpc being the last one to convert to CONFIG_DYNAMIC_FTRACE_WITH_ARGS, it will now be possible to remove klp_arch_set_pc() on all architectures. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/5831f711a778fcd6eb51eb5898f1faae4378b35b.1640017960.git.christophe.leroy@csgroup.eu	2022-02-07 21:03:11 +11:00
Christophe Leroy	d95bf254be	powerpc/ftrace: Prepare PPC32's ftrace_caller() for CONFIG_DYNAMIC_FTRACE_WITH_ARGS In order to implement CONFIG_DYNAMIC_FTRACE_WITH_ARGS, change ftrace_caller() stack layout to match struct pt_regs. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/da9734eba504998fb914aca12131c9f6bf6120a8.1640017960.git.christophe.leroy@csgroup.eu	2022-02-07 21:03:11 +11:00
Christophe Leroy	a4520b2527	powerpc/ftrace: Add support for livepatch to PPC32 PPC64 needs some special logic to properly set up the TOC. See commit `85baa09549` ("powerpc/livepatch: Add live patching support on ppc64le") for details. PPC32 doesn't have TOC so it doesn't need that logic, so adding LIVEPATCH support is straight forward. Add CONFIG_LIVEPATCH_64 and move livepatch stack logic into that item. Livepatch sample modules all work. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/63cb094125b6a6038c65eeac2abaabbabe63addd.1640017960.git.christophe.leroy@csgroup.eu	2022-02-07 21:03:10 +11:00
Christophe Leroy	27e21e8f12	powerpc/32: Remove _ENTRY() macro _ENTRY() is now redundant with _GLOBAL(). Remove it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/62a35f8dde2bb74c8d0d7a5430cce07a5a3a6fb6.1638273868.git.christophe.leroy@csgroup.eu	2022-02-07 21:03:10 +11:00
Christophe Leroy	1231816373	powerpc/32: Remove remaining .stabs annotations STABS debug format has been superseded long time ago by DWARF. Remove the few remaining .stabs annotations from old 32 bits code. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/68932ec2ba6b868d35006b96e90f0890f3da3c05.1638273868.git.christophe.leroy@csgroup.eu	2022-02-07 21:03:10 +11:00
Christophe Leroy	e6d03ac156	powerpc/machdep: Move sys_ctrler_t definition into pmac_feature.h sys_ctrler_t definitions are tied to pmac. Move it into pmac_feature.h Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> [mpe: Move to pmac_feature.h to fix some build errors] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7dd5ead4bbca749e2da089ff6fe2b1878d6bf40e.1630667612.git.christophe.leroy@csgroup.eu	2022-02-07 21:02:20 +11:00
Christophe Leroy	d6a6c725a2	powerpc/machdep: Remove CONFIG_PPC_HAS_FEATURE_CALLS Last user was removed by commit `7bbd827750` ("[PATCH] ppc64: very basic desktop g5 sound support"). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/803779fffb4ee0801746b2173d37cea3b273f821.1630667612.git.christophe.leroy@csgroup.eu	2022-02-07 16:43:35 +11:00
Christophe Leroy	4291d085b0	powerpc/32s: Make pte_update() non atomic on 603 core On 603 core, TLB miss handler don't do any change to the page tables so pte_update() doesn't need to be atomic. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/cc89d3c11fc9c742d0df3454a657a3a00be24046.1643538554.git.christophe.leroy@csgroup.eu	2022-02-03 22:57:19 +11:00
Christophe Leroy	535bda36db	powerpc/nohash: Remove pte_same() arch/powerpc/include/asm/nohash/{32/64}/pgtable.h has #define __HAVE_ARCH_PTE_SAME #define pte_same(A,B) ((pte_val(A) ^ pte_val(B)) == 0) include/linux/pgtable.h has #ifndef __HAVE_ARCH_PTE_SAME static inline int pte_same(pte_t pte_a, pte_t pte_b) { return pte_val(pte_a) == pte_val(pte_b); } #endif Remove the powerpc version which is similar to the generic one. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/83c97bd58a3596ef1b0ff28b1e41fd492d005520.1643616989.git.christophe.leroy@csgroup.eu	2022-02-03 22:57:00 +11:00
Bhaskar Chowdhury	a1c4140933	powerpc/epapr: Fix parmeters typo s/parmeters/parameters/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210320213932.22697-1-unixbhaskar@gmail.com	2022-02-03 21:35:56 +11:00
Alexey Kardashevskiy	faf01aef05	KVM: PPC: Merge powerpc's debugfs entry content into generic entry At the moment KVM on PPC creates 4 types of entries under the kvm debugfs: 1) "%pid-%fd" per a KVM instance (for all platforms); 2) "vm%pid" (for PPC Book3s HV KVM); 3) "vm%u_vcpu%u_timing" (for PPC Book3e KVM); 4) "kvm-xive-%p" (for XIVE PPC Book3s KVM, the same for XICS); The problem with this is that multiple VMs per process is not allowed for 2) and 3) which makes it possible for userspace to trigger errors when creating duplicated debugfs entries. This merges all these into 1). This defines kvm_arch_create_kvm_debugfs() similar to kvm_arch_create_vcpu_debugfs(). This defines 2 hooks in kvmppc_ops that allow specific KVM implementations add necessary entries, this adds the _e500 suffix to kvmppc_create_vcpu_debugfs_e500() to make it clear what platform it is for. This makes use of already existing kvm_arch_create_vcpu_debugfs() on PPC. This removes no more used debugfs_dir pointers from PPC kvm_arch structs. This stops removing vcpu entries as once created vcpus stay around for the entire life of a VM and removed when the KVM instance is closed, see commit `d56f5136b0` ("KVM: let kvm_destroy_vm_debugfs clean up vCPU debugfs directories"). Suggested-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220111005404.162219-1-aik@ozlabs.ru	2022-02-02 20:30:26 +11:00
Al Viro	0c9dceb9bb	asm/user.h: killed unused macros Some of them used to be used by libbfd for a.out coredump handling. Seeing that * libbfd has their copies anyway * we don't export them into userland headers * we don't support a.out coredumps anymore let's bury the definitions. They never had in-kernel users anyway... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2022-01-30 21:17:00 -05:00
Arnd Bergmann	3e17314c22	agp: define proper stubs for empty helpers The empty unmap_page_from_agp() macro causes a warning when building with 'make W=1' on a couple of architectures: drivers/char/agp/generic.c: In function 'agp_generic_destroy_page': drivers/char/agp/generic.c:1265:28: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body] 1265 \| unmap_page_from_agp(page); Change the definitions to a 'do { } while (0)' construct to make these more reliable. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Helge Deller <deller@gmx.de> # parisc Signed-off-by: Helge Deller <deller@gmx.de>	2022-01-29 22:24:25 +01:00
Nicholas Piggin	22f7ff0dea	KVM: PPC: Book3S HV Nested: Fix nested HFSCR being clobbered with multiple vCPUs The L0 is storing HFSCR requested by the L1 for the L2 in struct kvm_nested_guest when the L1 requests a vCPU enter L2. kvm_nested_guest is not a per-vCPU structure. Hilarity ensues. Fix it by moving the nested hfscr into the vCPU structure together with the other per-vCPU nested fields. Fixes: `8b210a880b` ("KVM: PPC: Book3S HV Nested: Make nested HFSCR state accessible") Cc: stable@vger.kernel.org # v5.15+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220122105530.3477250-1-npiggin@gmail.com	2022-01-25 16:39:41 +11:00
Christophe Leroy	aec982603a	powerpc/fixmap: Fix VM debug warning on unmap Unmapping a fixmap entry is done by calling __set_fixmap() with FIXMAP_PAGE_CLEAR as flags. Today, powerpc __set_fixmap() calls map_kernel_page(). map_kernel_page() is not happy when called a second time for the same page. WARNING: CPU: 0 PID: 1 at arch/powerpc/mm/pgtable.c:194 set_pte_at+0xc/0x1e8 CPU: 0 PID: 1 Comm: swapper Not tainted 5.16.0-rc3-s3k-dev-01993-g350ff07feb7d-dirty #682 NIP: c0017cd4 LR: c00187f0 CTR: 00000010 REGS: e1011d50 TRAP: 0700 Not tainted (5.16.0-rc3-s3k-dev-01993-g350ff07feb7d-dirty) MSR: 00029032 <EE,ME,IR,DR,RI> CR: 42000208 XER: 00000000 GPR00: c0165fec e1011e10 c14c0000 c0ee2550 ff800000 c0f3d000 00000000 c001686c GPR08: 00001000 b00045a9 00000001 c0f58460 c0f50000 00000000 c0007e10 00000000 GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 GPR24: 00000000 00000000 c0ee2550 00000000 c0f57000 00000ff8 00000000 ff800000 NIP [c0017cd4] set_pte_at+0xc/0x1e8 LR [c00187f0] map_kernel_page+0x9c/0x100 Call Trace: [e1011e10] [c0736c68] vsnprintf+0x358/0x6c8 (unreliable) [e1011e30] [c0165fec] __set_fixmap+0x30/0x44 [e1011e40] [c0c13bdc] early_iounmap+0x11c/0x170 [e1011e70] [c0c06cb0] ioremap_legacy_serial_console+0x88/0xc0 [e1011e90] [c0c03634] do_one_initcall+0x80/0x178 [e1011ef0] [c0c0385c] kernel_init_freeable+0xb4/0x250 [e1011f20] [c0007e34] kernel_init+0x24/0x140 [e1011f30] [c0016268] ret_from_kernel_thread+0x5c/0x64 Instruction dump: 7fe3fb78 48019689 80010014 7c630034 83e1000c 5463d97e 7c0803a6 38210010 4e800020 81250000 712a0001 41820008 <0fe00000> 9421ffe0 93e1001c 48000030 Implement unmap_kernel_page() which clears an existing pte. Reported-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b0b752f6f6ecc60653e873f385c6f0dce4e9ab6a.1638789098.git.christophe.leroy@csgroup.eu	2022-01-24 17:29:05 +11:00
Linus Torvalds	dd81e1c7d5	powerpc fixes for 5.17 #2 - A series of bpf fixes, including an oops fix and some codegen fixes. - Fix a regression in syscall_get_arch() for compat processes. - Fix boot failure on some 32-bit systems with KASAN enabled. - A couple of other build/minor fixes. Thanks to: Athira Rajeev, Christophe Leroy, Dmitry V. Levin, Jiri Olsa, Johan Almbladh, Maxime Bizon, Naveen N. Rao, Nicholas Piggin. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmHtOJgTHG1wZUBlbGxl cm1hbi5pZC5hdQAKCRBR6+o8yOGlgLMeD/9eBHqMy+yhXfTw9mjAf8rVlsdY6Xjq 2LYgHCJpeP0VjxBm8GcyFtAfJc778LyykU18xsi0WVsuwPmVdVwDfZpzxWJrYqeq 6AuK9uRZAbEx6gZVCsWlk3Em7GyUEqHm5PTxKsDfbzh+hBTNQ8CcKKbCVmEZ+loA nWdaWJOZMpE4hQOx0Ph5JRAos/ZUEt2I57qZkRgvIEV+hroFrmdP1rnncdalnLHN pXsEn380Q+rpH+48HIpaTpl5+R8CvgFnnlI6/luZC6HoA8K6jph8FdOkkI9eU1v6 TgdNMiNVIe1wRFzVp+nH7SGqL3l9X/UPS57hL52X+ccpHoioBW9CLPha06g/+UKT kJU3mYZg3jvQGrM9H/2R3P97UH9SwHE5SH0xcXj3ItH1NpOKpMO0h5UuWx+Ze8Wo zO80xbYLbALtEfx8JMCf95rnUVFL0n55xVJN+7+fs2hU3Be/ZpIyHfDujlHlbd1m o3KRhkg1a0RyGsun6Gd2vwLGaBbwYNWvgHM9IDeiAFdJ7HFpp1YrVGpQRkAIUbJS bzYiRczjbSof7c9WqLyKNatgNFO/JPI5vi1iNQ3u9Oh9fzjpv5Zok4vK534mOnqU QnhiqLJbY+yBpoAvSePGQTCCzDAnJq0r53n0d1O/2T1GcpX9MKM4YlL+Wu4m0yLn RcbOCTkmoqWZEw== =KjWT -----END PGP SIGNATURE----- Merge tag 'powerpc-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - A series of bpf fixes, including an oops fix and some codegen fixes. - Fix a regression in syscall_get_arch() for compat processes. - Fix boot failure on some 32-bit systems with KASAN enabled. - A couple of other build/minor fixes. Thanks to Athira Rajeev, Christophe Leroy, Dmitry V. Levin, Jiri Olsa, Johan Almbladh, Maxime Bizon, Naveen N. Rao, and Nicholas Piggin. * tag 'powerpc-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s: Mask SRR0 before checking against the masked NIP powerpc/perf: Only define power_pmu_wants_prompt_pmi() for CONFIG_PPC64 powerpc/32s: Fix kasan_init_region() for KASAN powerpc/time: Fix build failure due to do_hard_irq_enable() on PPC32 powerpc/audit: Fix syscall_get_arch() powerpc64/bpf: Limit 'ldbrx' to processors compliant with ISA v2.06 tools/bpf: Rename 'struct event' to avoid naming conflict powerpc/bpf: Update ldimm64 instructions during extra pass powerpc32/bpf: Fix codegen for bpf-to-bpf calls bpf: Guard against accessing NULL pt_regs in bpf_get_task_stack()	2022-01-23 17:52:42 +02:00
Linus Torvalds	3689f9f8b0	bitmap patches for 5.17-rc1 -----BEGIN PGP SIGNATURE----- iQHJBAABCgAzFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmHi+xgVHHl1cnkubm9y b3ZAZ21haWwuY29tAAoJELFEgP06H77IxdoMAMf3E+L51Ys/4iAiyJQNVoT3aIBC A8ZVOB9he1OA3o3wBNIRKmICHk+ovnfCWcXTr9fG/Ade2wJz88NAsGPQ1Phywb+s iGlpySllFN72RT9ZqtJhLEzgoHHOL0CzTW07TN9GJy4gQA2h2G9CTP+OmsQdnVqE m9Fn3PSlJ5lhzePlKfnln8rGZFgrriJakfEFPC79n/7an4+2Hvkb5rWigo7KQc4Z 9YNqYUcHWZFUgq80adxEb9LlbMXdD+Z/8fCjOrAatuwVkD4RDt6iKD0mFGjHXGL7 MZ9KRS8AfZXawmetk3jjtsV+/QkeS+Deuu7k0FoO0Th2QV7BGSDhsLXAS5By/MOC nfSyHhnXHzCsBMyVNrJHmNhEZoN29+tRwI84JX9lWcf/OLANcCofnP6f2UIX7tZY CAZAgVELp+0YQXdybrfzTQ8BT3TinjS/aZtCrYijRendI1GwUXcyl69vdOKqAHuk 5jy8k/xHyp+ZWu6v+PyAAAEGowY++qhL0fmszA== =RKW4 -----END PGP SIGNATURE----- Merge tag 'bitmap-5.17-rc1' of git://github.com/norov/linux Pull bitmap updates from Yury Norov: - introduce for_each_set_bitrange() - use find_first__bit() instead of find_next__bit() where possible - unify for_each_bit() macros * tag 'bitmap-5.17-rc1' of git://github.com/norov/linux: vsprintf: rework bitmap_list_string lib: bitmap: add performance test for bitmap_print_to_pagebuf bitmap: unify find_bit operations mm/percpu: micro-optimize pcpu_is_populated() Replace for_each__bit_from() with for_each__bit() where appropriate find: micro-optimize for_each_{set,clear}_bit() include/linux: move for_each_bit() macros from bitops.h to find.h cpumask: replace cpumask_next_* with cpumask_first_* where appropriate tools: sync tools/bitmap with mother linux all: replace find_next{,_zero}_bit with find_first{,_zero}_bit where appropriate cpumask: use find_first_and_bit() lib: add find_first_and_bit() arch: remove GENERIC_FIND_FIRST_BIT entirely include: move find.h from asm_generic to linux bitops: move find_bit_*_le functions from le.h to find.h bitops: protect find_first_{,zero}_bit properly	2022-01-23 06:20:44 +02:00
Linus Torvalds	79e06c4c49	RISCV: - Use common KVM implementation of MMU memory caches - SBI v0.2 support for Guest - Initial KVM selftests support - Fix to avoid spurious virtual interrupts after clearing hideleg CSR - Update email address for Anup and Atish ARM: - Simplification of the 'vcpu first run' by integrating it into KVM's 'pid change' flow - Refactoring of the FP and SVE state tracking, also leading to a simpler state and less shared data between EL1 and EL2 in the nVHE case - Tidy up the header file usage for the nvhe hyp object - New HYP unsharing mechanism, finally allowing pages to be unmapped from the Stage-1 EL2 page-tables - Various pKVM cleanups around refcounting and sharing - A couple of vgic fixes for bugs that would trigger once the vcpu xarray rework is merged, but not sooner - Add minimal support for ARMv8.7's PMU extension - Rework kvm_pgtable initialisation ahead of the NV work - New selftest for IRQ injection - Teach selftests about the lack of default IPA space and page sizes - Expand sysreg selftest to deal with Pointer Authentication - The usual bunch of cleanups and doc update s390: - fix sigp sense/start/stop/inconsistency - cleanups x86: - Clean up some function prototypes more - improved gfn_to_pfn_cache with proper invalidation, used by Xen emulation - add KVM_IRQ_ROUTING_XEN_EVTCHN and event channel delivery - completely remove potential TOC/TOU races in nested SVM consistency checks - update some PMCs on emulated instructions - Intel AMX support (joint work between Thomas and Intel) - large MMU cleanups - module parameter to disable PMU virtualization - cleanup register cache - first part of halt handling cleanups - Hyper-V enlightened MSR bitmap support for nested hypervisors Generic: - clean up Makefiles - introduce CONFIG_HAVE_KVM_DIRTY_RING - optimize memslot lookup using a tree - optimize vCPU array usage by converting to xarray -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmHhxvsUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroPZkAf+Nz92UL/5nNGcdHtE4m7AToMmitE9 bYkesf9BMQvAe5wjkABLuoHGi6ay4jabo4fiGzbdkiK7lO5YgfsWiMB3/MT5fl4E jRPzaVQabp3YZLM8UYCBmfUVuRj524S967SfSRe0AvYjDEH8y7klPf4+7sCsFT0/ Px9Vf2KGuOlf0eM78yKg4rGaF0jS22eLgXm6FfNMY8/e29ZAo/jyUmqBY+Z2xxZG aWhceDtSheW1jwLHLj3nOlQJvHTn8LVGXBE/R8Gda3ZjrBV2rKaDi4Fh+HD+dz86 2zVXwzQ7uck2CMW73GMoXMTWoKSHMyvlBOs1BdvBm4UsnGcXR+q8IFCeuQ== =s73m -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "RISCV: - Use common KVM implementation of MMU memory caches - SBI v0.2 support for Guest - Initial KVM selftests support - Fix to avoid spurious virtual interrupts after clearing hideleg CSR - Update email address for Anup and Atish ARM: - Simplification of the 'vcpu first run' by integrating it into KVM's 'pid change' flow - Refactoring of the FP and SVE state tracking, also leading to a simpler state and less shared data between EL1 and EL2 in the nVHE case - Tidy up the header file usage for the nvhe hyp object - New HYP unsharing mechanism, finally allowing pages to be unmapped from the Stage-1 EL2 page-tables - Various pKVM cleanups around refcounting and sharing - A couple of vgic fixes for bugs that would trigger once the vcpu xarray rework is merged, but not sooner - Add minimal support for ARMv8.7's PMU extension - Rework kvm_pgtable initialisation ahead of the NV work - New selftest for IRQ injection - Teach selftests about the lack of default IPA space and page sizes - Expand sysreg selftest to deal with Pointer Authentication - The usual bunch of cleanups and doc update s390: - fix sigp sense/start/stop/inconsistency - cleanups x86: - Clean up some function prototypes more - improved gfn_to_pfn_cache with proper invalidation, used by Xen emulation - add KVM_IRQ_ROUTING_XEN_EVTCHN and event channel delivery - completely remove potential TOC/TOU races in nested SVM consistency checks - update some PMCs on emulated instructions - Intel AMX support (joint work between Thomas and Intel) - large MMU cleanups - module parameter to disable PMU virtualization - cleanup register cache - first part of halt handling cleanups - Hyper-V enlightened MSR bitmap support for nested hypervisors Generic: - clean up Makefiles - introduce CONFIG_HAVE_KVM_DIRTY_RING - optimize memslot lookup using a tree - optimize vCPU array usage by converting to xarray" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (268 commits) x86/fpu: Fix inline prefix warnings selftest: kvm: Add amx selftest selftest: kvm: Move struct kvm_x86_state to header selftest: kvm: Reorder vcpu_load_state steps for AMX kvm: x86: Disable interception for IA32_XFD on demand x86/fpu: Provide fpu_sync_guest_vmexit_xfd_state() kvm: selftests: Add support for KVM_CAP_XSAVE2 kvm: x86: Add support for getting/setting expanded xstate buffer x86/fpu: Add uabi_size to guest_fpu kvm: x86: Add CPUID support for Intel AMX kvm: x86: Add XCR0 support for Intel AMX kvm: x86: Disable RDMSR interception of IA32_XFD_ERR kvm: x86: Emulate IA32_XFD_ERR for guest kvm: x86: Intercept #NM for saving IA32_XFD_ERR x86/fpu: Prepare xfd_err in struct fpu_guest kvm: x86: Add emulation for IA32_XFD x86/fpu: Provide fpu_update_guest_xfd() for IA32_XFD emulation kvm: x86: Enable dynamic xfeatures at KVM_SET_CPUID2 x86/fpu: Provide fpu_enable_guest_xfd_features() for KVM x86/fpu: Add guest support to xfd_enable_feature() ...	2022-01-16 16:15:14 +02:00
Christophe Leroy	d37823c352	powerpc/32s: Fix kasan_init_region() for KASAN It has been reported some configuration where the kernel doesn't boot with KASAN enabled. This is due to wrong BAT allocation for the KASAN area: ---[ Data Block Address Translation ]--- 0: 0xc0000000-0xcfffffff 0x00000000 256M Kernel rw m 1: 0xd0000000-0xdfffffff 0x10000000 256M Kernel rw m 2: 0xe0000000-0xefffffff 0x20000000 256M Kernel rw m 3: 0xf8000000-0xf9ffffff 0x2a000000 32M Kernel rw m 4: 0xfa000000-0xfdffffff 0x2c000000 64M Kernel rw m A BAT must have both virtual and physical addresses alignment matching the size of the BAT. This is not the case for BAT 4 above. Fix kasan_init_region() by using block_size() function that is in book3s32/mmu.c. To be able to reuse it here, make it non static and change its name to bat_block_size() in order to avoid name conflict with block_size() defined in <linux/blkdev.h> Also reuse find_free_bat() to avoid an error message from setbat() when no BAT is available. And allocate memory outside of linear memory mapping to avoid wasting that precious space. With this change we get correct alignment for BATs and KASAN shadow memory is allocated outside the linear memory space. ---[ Data Block Address Translation ]--- 0: 0xc0000000-0xcfffffff 0x00000000 256M Kernel rw 1: 0xd0000000-0xdfffffff 0x10000000 256M Kernel rw 2: 0xe0000000-0xefffffff 0x20000000 256M Kernel rw 3: 0xf8000000-0xfbffffff 0x7c000000 64M Kernel rw 4: 0xfc000000-0xfdffffff 0x7a000000 32M Kernel rw Fixes: `7974c47326` ("powerpc/32s: Implement dedicated kasan_init_region()") Cc: stable@vger.kernel.org Reported-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7a50ef902494d1325227d47d33dada01e52e5518.1641818726.git.christophe.leroy@csgroup.eu	2022-01-16 20:51:05 +11:00
Christophe Leroy	87b9d74fb0	powerpc/time: Fix build failure due to do_hard_irq_enable() on PPC32 CC arch/powerpc/kernel/time.o In file included from <command-line>: ./arch/powerpc/include/asm/hw_irq.h: In function 'do_hard_irq_enable': ././include/linux/compiler_types.h:335:45: error: call to '__compiletime_assert_35' declared with attribute error: BUILD_BUG failed 335 \| _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) \| ^ ././include/linux/compiler_types.h:316:25: note: in definition of macro '__compiletime_assert' 316 \| prefix ## suffix(); \ \| ^~~~~~ ././include/linux/compiler_types.h:335:9: note: in expansion of macro '_compiletime_assert' 335 \| _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) \| ^~~~~~~~~~~~~~~~~~~ ./include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert' 39 \| #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) \| ^~~~~~~~~~~~~~~~~~ ./include/linux/build_bug.h:59:21: note: in expansion of macro 'BUILD_BUG_ON_MSG' 59 \| #define BUILD_BUG() BUILD_BUG_ON_MSG(1, "BUILD_BUG failed") \| ^~~~~~~~~~~~~~~~ ./arch/powerpc/include/asm/hw_irq.h:483:9: note: in expansion of macro 'BUILD_BUG' 483 \| BUILD_BUG(); \| ^~~~~~~~~ should_hard_irq_enable() returns false on PPC32 so this BUILD_BUG() shouldn't trigger. Force inlining of should_hard_irq_enable() Fixes: `0faf20a1ad` ("powerpc/64s/interrupt: Don't enable MSR[EE] in irq handlers unless perf is in use") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/247e01e0e10f4dbc59b5ff89e81702eb1ee7641e.1641828571.git.christophe.leroy@csgroup.eu	2022-01-16 20:50:20 +11:00
Yury Norov	47d8c15615	include: move find.h from asm_generic to linux find_bit API and bitmap API are closely related, but inclusion paths are different - include/asm-generic and include/linux, correspondingly. In the past it made a lot of troubles due to circular dependencies and/or undefined symbols. Fix this by moving find.h under include/linux. Signed-off-by: Yury Norov <yury.norov@gmail.com> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	2022-01-15 08:47:31 -08:00
Christophe Leroy	252745240b	powerpc/audit: Fix syscall_get_arch() Commit `770cec16cd` ("powerpc/audit: Simplify syscall_get_arch()") and commit `898a1ef06a` ("powerpc/audit: Avoid unneccessary #ifdef in syscall_get_arguments()") replaced test_tsk_thread_flag(task, TIF_32BIT)) by is_32bit_task(). But is_32bit_task() applies on current task while be want the test done on task 'task' So add a new macro is_tsk_32bit_task() to check any task. Fixes: `770cec16cd` ("powerpc/audit: Simplify syscall_get_arch()") Fixes: `898a1ef06a` ("powerpc/audit: Avoid unneccessary #ifdef in syscall_get_arguments()") Cc: stable@vger.kernel.org Reported-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c55cddb8f65713bf5859ed675d75a50cb37d5995.1642159570.git.christophe.leroy@csgroup.eu	2022-01-15 12:21:26 +11:00
Naveen N. Rao	3f5f766d5f	powerpc64/bpf: Limit 'ldbrx' to processors compliant with ISA v2.06 Johan reported the below crash with test_bpf on ppc64 e5500: test_bpf: #296 ALU_END_FROM_LE 64: 0x0123456789abcdef -> 0x67452301 jited:1 Oops: Exception in kernel mode, sig: 4 [#1] BE PAGE_SIZE=4K SMP NR_CPUS=24 QEMU e500 Modules linked in: test_bpf(+) CPU: 0 PID: 76 Comm: insmod Not tainted 5.14.0-03771-g98c2059e008a-dirty #1 NIP: 8000000000061c3c LR: 80000000006dea64 CTR: 8000000000061c18 REGS: c0000000032d3420 TRAP: 0700 Not tainted (5.14.0-03771-g98c2059e008a-dirty) MSR: 0000000080089000 <EE,ME> CR: 88002822 XER: 20000000 IRQMASK: 0 <...> NIP [8000000000061c3c] 0x8000000000061c3c LR [80000000006dea64] .__run_one+0x104/0x17c [test_bpf] Call Trace: .__run_one+0x60/0x17c [test_bpf] (unreliable) .test_bpf_init+0x6a8/0xdc8 [test_bpf] .do_one_initcall+0x6c/0x28c .do_init_module+0x68/0x28c .load_module+0x2460/0x2abc .__do_sys_init_module+0x120/0x18c .system_call_exception+0x110/0x1b8 system_call_common+0xf0/0x210 --- interrupt: c00 at 0x101d0acc <...> ---[ end trace 47b2bf19090bb3d0 ]--- Illegal instruction The illegal instruction turned out to be 'ldbrx' emitted for BPF_FROM_[L\|B]E, which was only introduced in ISA v2.06. Guard use of the same and implement an alternative approach for older processors. Fixes: `156d0e290e` ("powerpc/ebpf/jit: Implement JIT compiler for extended BPF") Reported-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Tested-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d1e51c6fdf572062cf3009a751c3406bda01b832.1641468127.git.naveen.n.rao@linux.vnet.ibm.com	2022-01-15 12:21:25 +11:00
Linus Torvalds	29ec39fcf1	powerpc updates for 5.17 - Optimise radix KVM guest entry/exit by 2x on Power9/Power10. - Allow firmware to tell us whether to disable the entry and uaccess flushes on Power10 or later CPUs. - Add BPF_PROBE_MEM support for 32 and 64-bit BPF jits. - Several fixes and improvements to our hard lockup watchdog. - Activate HAVE_DYNAMIC_FTRACE_WITH_REGS on 32-bit. - Allow building the 64-bit Book3S kernel without hash MMU support, ie. Radix only. - Add KUAP (SMAP) support for 40x, 44x, 8xx, Book3E (64-bit). - Add new encodings for perf_mem_data_src.mem_hops field, and use them on Power10. - A series of small performance improvements to 64-bit interrupt entry. - Several commits fixing issues when building with the clang integrated assembler. - Many other small features and fixes. Thanks to: Alan Modra, Alexey Kardashevskiy, Ammar Faizi, Anders Roxell, Arnd Bergmann, Athira Rajeev, Cédric Le Goater, Christophe JAILLET, Christophe Leroy, Christoph Hellwig, Daniel Axtens, David Yang, Erhard Furtner, Fabiano Rosas, Greg Kroah-Hartman, Guo Ren, Hari Bathini, Jason Wang, Joel Stanley, Julia Lawall, Kajol Jain, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Mark Brown, Minghao Chi, Nageswara R Sastry, Naresh Kamboju, Nathan Chancellor, Nathan Lynch, Nicholas Piggin, Nick Child, Oliver O'Halloran, Peiwei Hu, Randy Dunlap, Ravi Bangoria, Rob Herring, Russell Currey, Sachin Sant, Sean Christopherson, Segher Boessenkool, Thadeu Lima de Souza Cascardo, Tyrel Datwyler, Xiang wangx, Yang Guang. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmHhVFMTHG1wZUBlbGxl cm1hbi5pZC5hdQAKCRBR6+o8yOGlgKwzD/9UUEZzWyzMVRJvP9FPZByN2M8czxHJ tuqEuVqnfks8ad8tfm2ebng5t8ZuVASBQU2fpPA1+lpdvprgZN5RFGMRh729vskn 2aHQPmFvFObNbXOgCoXzk+C5xYi3zoRMVM968neSPBneYo+xDicn/zN5CHAgsjhX +baemJQ7/xzwLiZgTHe8fWw3nTk3IbPBpha59SdTvR8Moy6I4O8CDPIYEm3U3/J3 x14ZRETqjksL7YOzEBk0avm1dDZRw/johz29oRYSmCj7dyy5OqrkPwokJiRY90eA 1lVdofDc0zElaSWkVGzKdSWRUIXjKIVdtejvDeEvl6H/mI6q4TVZE8rFmn+3Rvgf 9q0iKtmw5Kn11cqgY/pgEGmxnQtIdAodNfI/t939E7+O5LbcznuYUiy0J/kTD/vl Xduotg2dsCI+5ukf1wrk2wt9LhqZL+ziOeaBhyDM4orV8T3HBYL6zWBptun//IGO lK6TvvCHSYnGqY4bnrAmiOnbbEtnP6nN3zbcXgSvPM0wCRHPIEqd0NRXtfISo32d vBPq1neXWo4wrRJj9X3yOuP+5fEA4I+hB3yrCJOkcEcz+8NhlboQXU7raVsJL+bd kze75H8hwX7kE71oJFFl13LbSNABgiLFARTBXKfvdQA2iLdR0Snvm+OouvwWRPo/ Po7Nm3zqdLc/1A== =BxhQ -----END PGP SIGNATURE----- Merge tag 'powerpc-5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Optimise radix KVM guest entry/exit by 2x on Power9/Power10. - Allow firmware to tell us whether to disable the entry and uaccess flushes on Power10 or later CPUs. - Add BPF_PROBE_MEM support for 32 and 64-bit BPF jits. - Several fixes and improvements to our hard lockup watchdog. - Activate HAVE_DYNAMIC_FTRACE_WITH_REGS on 32-bit. - Allow building the 64-bit Book3S kernel without hash MMU support, ie. Radix only. - Add KUAP (SMAP) support for 40x, 44x, 8xx, Book3E (64-bit). - Add new encodings for perf_mem_data_src.mem_hops field, and use them on Power10. - A series of small performance improvements to 64-bit interrupt entry. - Several commits fixing issues when building with the clang integrated assembler. - Many other small features and fixes. Thanks to Alan Modra, Alexey Kardashevskiy, Ammar Faizi, Anders Roxell, Arnd Bergmann, Athira Rajeev, Cédric Le Goater, Christophe JAILLET, Christophe Leroy, Christoph Hellwig, Daniel Axtens, David Yang, Erhard Furtner, Fabiano Rosas, Greg Kroah-Hartman, Guo Ren, Hari Bathini, Jason Wang, Joel Stanley, Julia Lawall, Kajol Jain, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Mark Brown, Minghao Chi, Nageswara R Sastry, Naresh Kamboju, Nathan Chancellor, Nathan Lynch, Nicholas Piggin, Nick Child, Oliver O'Halloran, Peiwei Hu, Randy Dunlap, Ravi Bangoria, Rob Herring, Russell Currey, Sachin Sant, Sean Christopherson, Segher Boessenkool, Thadeu Lima de Souza Cascardo, Tyrel Datwyler, Xiang wangx, and Yang Guang. * tag 'powerpc-5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (240 commits) powerpc/xmon: Dump XIVE information for online-only processors. powerpc/opal: use default_groups in kobj_type powerpc/cacheinfo: use default_groups in kobj_type powerpc/sched: Remove unused TASK_SIZE_OF powerpc/xive: Add missing null check after calling kmalloc powerpc/floppy: Remove usage of the deprecated "pci-dma-compat.h" API selftests/powerpc: Add a test of sigreturning to an unaligned address powerpc/64s: Use EMIT_WARN_ENTRY for SRR debug warnings powerpc/64s: Mask NIP before checking against SRR0 powerpc/perf: Fix spelling of "its" powerpc/32: Fix boot failure with GCC latent entropy plugin powerpc/code-patching: Replace patch_instruction() by ppc_inst_write() in selftests powerpc/code-patching: Move code patching selftests in its own file powerpc/code-patching: Move instr_is_branch_{i/b}form() in code-patching.h powerpc/code-patching: Move patch_exception() outside code-patching.c powerpc/code-patching: Use test_trampoline for prefixed patch test powerpc/code-patching: Fix patch_branch() return on out-of-range failure powerpc/code-patching: Reorganise do_patch_instruction() to ease error handling powerpc/code-patching: Fix unmap_patch_area() error handling powerpc/code-patching: Fix error handling in do_patch_instruction() ...	2022-01-14 15:17:26 +01:00
Guo Ren	08035a67f3	powerpc/sched: Remove unused TASK_SIZE_OF This macro isn't used in Linux sched, now. Delete in include/linux/sched.h and arch's include/asm. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211228064730.2882351-5-guoren@kernel.org	2022-01-04 23:12:27 +11:00
Christophe JAILLET	e57c2fd6cd	powerpc/floppy: Remove usage of the deprecated "pci-dma-compat.h" API In [1], Christoph Hellwig has proposed to remove the wrappers in include/linux/pci-dma-compat.h. Some reasons why this API should be removed have been given by Julia Lawall in [2]. A coccinelle script has been used to perform the needed transformation Only relevant parts are given below. @@ @@ - PCI_DMA_TODEVICE + DMA_TO_DEVICE @@ @@ - PCI_DMA_FROMDEVICE + DMA_FROM_DEVICE @@ expression e1, e2, e3, e4; @@ - pci_map_single(e1, e2, e3, e4) + dma_map_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_single(e1, e2, e3, e4) + dma_unmap_single(&e1->dev, e2, e3, e4) [1]: https://lore.kernel.org/kernel-janitors/20200421081257.GA131897@infradead.org/ [2]: https://lore.kernel.org/kernel-janitors/alpine.DEB.2.22.394.2007120902170.2424@hadrien/ Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/9e24eedeab44cbb840598bb188561a48811de845.1641119338.git.christophe.jaillet@wanadoo.fr	2022-01-04 16:00:59 +11:00
Christophe Leroy	309a0a6018	powerpc/code-patching: Replace patch_instruction() by ppc_inst_write() in selftests The purpose of selftests is to check that instructions are properly formed. Not to check that they properly run. For that test it uses normal memory, not special test memory. In preparation of a future patch enforcing patch_instruction() to be used only on valid text areas, implement a ppc_inst_write() instruction which is the complement of ppc_inst_read(). This new function writes the formated instruction in valid kernel memory and doesn't bother about icache. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7cf5335cc07ca9b6f8cdaa20ca9887fce4df3bea.1638446239.git.christophe.leroy@csgroup.eu	2021-12-23 22:36:58 +11:00
Christophe Leroy	31acc59956	powerpc/code-patching: Move instr_is_branch_{i/b}form() in code-patching.h To enable moving selftests in their own C file in following patch, move instr_is_branch_iform() and instr_is_branch_bform() to code-patching.h Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/fca0f3b191211b3681020885a611bf73eef20563.1638446239.git.christophe.leroy@csgroup.eu	2021-12-23 22:36:58 +11:00
Christophe Leroy	29562a9da2	powerpc/code-patching: Move patch_exception() outside code-patching.c patch_exception() is dedicated to book3e/64 is nothing more than a normal use of patch_branch(), so move it into a place dedicated to book3e/64. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/0968622b98b1fb51838c35b844c42ad6609de62e.1638446239.git.christophe.leroy@csgroup.eu	2021-12-23 22:36:55 +11:00
Christophe Leroy	af5304a750	powerpc/code-patching: Remove init_mem_is_free A new state has been added by commit `d2635f2012` ("mm: create a new system state and fix core_kernel_text()"). That state tells when initmem is about to be released and is redundant with init_mem_is_free. Remove init_mem_is_free. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/ad8c3ccb39c8edaa89fd3eda1cc7218baea1cde5.1638446239.git.christophe.leroy@csgroup.eu	2021-12-23 22:35:24 +11:00
Alexey Kardashevskiy	62479e6e26	powerpc/mm/book3s64/hash: Switch pre 2.06 tlbiel to .long The llvm integrated assembler does not recognise the ISA 2.05 tlbiel version. Work around it by switching to .long when an old arch level detected. Signed-off-by: Daniel Axtens <dja@axtens.net> [aik: did "Eventually do this more smartly"] Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211221055904.555763-7-aik@ozlabs.ru	2021-12-23 22:35:13 +11:00
Alexey Kardashevskiy	d51f86cfd8	powerpc/mm: Switch obsolete dssall to .long The dssall ("Data Stream Stop All") instruction is obsolete altogether with other Data Cache Instructions since ISA 2.03 (year 2006). LLVM IAS does not support it but PPC970 seems to be using it. This switches dssall to .long as there is no much point in fixing LLVM. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211221055904.555763-6-aik@ozlabs.ru	2021-12-23 22:35:13 +11:00
Daniel Axtens	d72c4a36d7	powerpc/64/asm: Do not reassign labels The LLVM integrated assembler really does not like us reassigning things to the same label: <instantiation>:7:9: error: invalid reassignment of non-absolute variable 'fs_label' This happens across a bunch of platforms: https://github.com/ClangBuiltLinux/linux/issues/1043 https://github.com/ClangBuiltLinux/linux/issues/1008 https://github.com/ClangBuiltLinux/linux/issues/920 https://github.com/ClangBuiltLinux/linux/issues/1050 There is no hope of getting this fixed in LLVM (see https://github.com/ClangBuiltLinux/linux/issues/1043#issuecomment-641571200 and https://bugs.llvm.org/show_bug.cgi?id=47798#c1 ) so if we want to build with LLVM_IAS, we need to hack around it ourselves. For us the big problem comes from this: \#define USE_FIXED_SECTION(sname) \ fs_label = start_##sname; \ fs_start = sname##_start; \ use_ftsec sname; \#define USE_TEXT_SECTION() fs_label = start_text; \ fs_start = text_start; \ .text and in particular fs_label. This works around it by not setting those 'variables' and requiring that users of the variables instead track for themselves what section they are in. This isn't amazing, by any stretch, but it gets us further in the compilation. Note that even though users have to keep track of the section, using a wrong one produces an error with both binutils and llvm which prevents from using wrong section at the compile time: llvm error example: AS arch/powerpc/kernel/head_64.o <unknown>:0: error: Cannot represent a difference across sections make[3]: * [/home/aik/p/kernels-llvm/llvm/scripts/Makefile.build:388: arch/powerpc/kernel/head_64.o] Error 1 binutils error example: /home/aik/p/kernels-llvm/llvm/arch/powerpc/kernel/exceptions-64s.S: Assembler messages: /home/aik/p/kernels-llvm/llvm/arch/powerpc/kernel/exceptions-64s.S:1974: Error: can't resolve `system_call_common' {.text section} - `start_r eal_vectors' {.head.text.real_vectors section} make[3]: * [/home/aik/p/kernels-llvm/llvm/scripts/Makefile.build:388: arch/powerpc/kernel/head_64.o] Error 1 Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211221055904.555763-5-aik@ozlabs.ru	2021-12-23 22:35:12 +11:00
Alan Modra	a3ad84da07	powerpc/toc: Future proof kernel toc This patch future-proofs the kernel against linker changes that might put the toc pointer at some location other than .got+0x8000, by replacing __toc_start+0x8000 with .TOC. throughout. If the kernel's idea of the toc pointer doesn't agree with the linker, bad things happen. prom_init.c code relocating its toc is also changed so that a symbolic __prom_init_toc_start toc-pointer relative address is calculated rather than assuming that it is always at toc-pointer - 0x8000. The length calculations loading values from the toc are also avoided. It's a little incestuous to do that with unreloc_toc picking up adjusted values (which is fine in practice, they both adjust by the same amount if all goes well). I've also changed the way .got is aligned in vmlinux.lds and zImage.lds, mostly so that dumping out section info by objdump or readelf plainly shows the alignment is 256. This linker script feature was added 2005-09-27, available in FSF binutils releases from 2.17 onwards. Should be safe to use in the kernel, I think. Finally, put (.got) before the prom_init.o entry which only needs (.toc), so that the GOT header goes in the correct place. I don't believe this makes any difference for the kernel as it would for dynamic objects being loaded by ld.so. That change is just to stop lusers who blindly copy kernel scripts being led astray. Of course, this change needs the prom_init.c changes. Some notes on .toc and .got. .toc is a compiler generated section of addresses. .got is a linker generated section of addresses, generally built when the linker sees R__GOT* relocations. In the case of powerpc64 ld.bfd, there are multiple generated .got sections, one per input object file. So you can somewhat reasonably write in a linker script an input section statement like prom_init.o(.got .toc) to mean "the .got and .toc section for files matching prom_init.o". On other architectures that doesn't make sense, because the linker generally has just one .got section. Even on powerpc64, note well that the GOT entries for prom_init.o may be merged with GOT entries from other objects. That means that if prom_init.o references, say, _end via some GOT relocation, and some other object also references _end via a GOT relocation, the GOT entry for _end may be in the range __prom_init_toc_start to __prom_init_toc_end and if the kernel does something special to GOT/TOC entries in that range then the value of _end as seen by objects other than prom_init.o will be affected. On the other hand the GOT entry for _end may not be in the range __prom_init_toc_start to __prom_init_toc_end. Which way it turns out is deterministic but a detail of linker operation that should not be relied on. A feature of ld.bfd is that input .toc (and .got) sections matching one linker input section statement may be sorted, to put entries used by small-model code first, near the toc base. This is why scripts for powerpc64 normally use (.got .toc) rather than (.got) *(.toc), since the first form allows more freedom to sort. Another feature of ld.bfd is that indirect addressing sequences using the GOT/TOC may be edited by the linker to relative addressing. In many cases relative addressing would be emitted by gcc for -mcmodel=medium if you appropriately decorate variable declarations with non-default visibility. The original patch is here: https://lore.kernel.org/linuxppc-dev/20210310034813.GM6042@bubble.grove.modra.org/ Signed-off-by: Alan Modra <amodra@au1.ibm.com> [aik: removed non-relocatable which is gone in `24d33ac5b8`] [aik: added <=2.24 check] [aik: because of llvm-as, kernel_toc_addr() uses "mr" instead of global register variable] Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211221055904.555763-2-aik@ozlabs.ru	2021-12-23 22:35:01 +11:00
Nick Child	e14ff96d08	powerpc/pseries: Add __init attribute to eligible functions Some functions defined in 'arch/powerpc/platforms/pseries' are deserving of an `__init` macro attribute. These functions are only called by other initialization functions and therefore should inherit the attribute. Also, change function declarations in header files to include `__init`. Signed-off-by: Nick Child <nick.child@ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211216220035.605465-13-nick.child@ibm.com	2021-12-23 22:33:15 +11:00
Nick Child	e5913db1ef	powerpc/powernv: Add __init attribute to eligible functions Some functions defined in 'arch/powerpc/platforms/powernv' are deserving of an `__init` macro attribute. These functions are only called by other initialization functions and therefore should inherit the attribute. Also, change function declarations in header files to include `__init`. Signed-off-by: Nick Child <nick.child@ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211216220035.605465-12-nick.child@ibm.com	2021-12-23 22:33:15 +11:00
Nick Child	b346f57100	powerpc/powermac: Add __init attribute to eligible functions Some functions defined in 'arch/powerpc/platforms/powermac` are only called by other initialization functions and therefore should inherit the attribute. Also, change function declarations in header files to include `__init`. Signed-off-by: Nick Child <nick.child@ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211216220035.605465-11-nick.child@ibm.com	2021-12-23 22:33:14 +11:00
Nick Child	456e8eb324	powerpc/xmon: Add __init attribute to eligible functions `xmon_register_spus` defined in 'arch/powerpc/xmon' is deserving of an `__init` macro attribute. This functions is only called by other initialization functions and therefore should inherit the attribute. Also, change the function declaration in the header file to include `__init`. Signed-off-by: Nick Child <nick.child@ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211216220035.605465-7-nick.child@ibm.com	2021-12-23 22:33:12 +11:00
Nick Child	6c552983d0	powerpc/sysdev: Add __init attribute to eligible functions Some files functions in 'arch/powerpc/sysdev' are deserving of an `__init` macro attribute. These functions are only called by other initialization functions and therefore should inherit the attribute. Also, change function declarations in header files to include `__init`. Signed-off-by: Nick Child <nick.child@ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211216220035.605465-6-nick.child@ibm.com	2021-12-23 22:33:12 +11:00
Nick Child	c49f5d88ff	powerpc/perf: Add __init attribute to eligible functions Some functions defined in 'arch/powerpc/perf' are deserving of an `__init` macro attribute. These functions are only called by other initialization functions and therefore should inherit the attribute. Also, change function declarations in header files to include `__init`. Signed-off-by: Nick Child <nick.child@ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211216220035.605465-5-nick.child@ibm.com	2021-12-23 22:33:11 +11:00
Nick Child	c13f2b2bb5	powerpc/mm: Add __init attribute to eligible functions Some functions defined in 'arch/powerpc/mm' are deserving of an `__init` macro attribute. These functions are only called by other initialization functions and therefore should inherit the attribute. Also, change function declarations in header files to include `__init`. Signed-off-by: Nick Child <nick.child@ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211216220035.605465-4-nick.child@ibm.com	2021-12-23 22:33:11 +11:00
Nick Child	ce0c6be9c6	powerpc/lib: Add __init attribute to eligible functions Some functions defined in 'arch/powerpc/lib' are deserving of an `__init` macro attribute. These functions are only called by other initialization functions and therefore should inherit the attribute. Also, change function declarations in header files to include `__init`. Signed-off-by: Nick Child <nick.child@ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211216220035.605465-3-nick.child@ibm.com	2021-12-23 22:33:10 +11:00
Nick Child	d276960d92	powerpc/kernel: Add __init attribute to eligible functions Some functions defined in `arch/powerpc/kernel` (and one in `arch/powerpc/ kexec`) are deserving of an `__init` macro attribute. These functions are only called by other initialization functions and therefore should inherit the attribute. Also, change function declarations in header files to include `__init`. Signed-off-by: Nick Child <nick.child@ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211216220035.605465-2-nick.child@ibm.com	2021-12-23 22:33:10 +11:00
Paolo Bonzini	5a213b9220	Merge branch 'topic/ppc-kvm' of https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux into HEAD Fix conflicts between memslot overhaul and commit `511d25d6b7` ("KVM: PPC: Book3S: Suppress warnings when allocating too big memory slots") from the powerpc tree.	2021-12-19 15:27:21 +01:00
Nicholas Piggin	ecb1057c0f	powerpc/64/interrupt: reduce expensive debug tests Move the assertions requiring restart table searches under CONFIG_PPC_IRQ_SOFT_MASK_DEBUG. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210922145452.352571-6-npiggin@gmail.com	2021-12-16 21:31:45 +11:00
Nicholas Piggin	0faf20a1ad	powerpc/64s/interrupt: Don't enable MSR[EE] in irq handlers unless perf is in use Enabling MSR[EE] in interrupt handlers while interrupts are still soft masked allows PMIs to profile interrupt handlers to some degree, beyond what SIAR latching allows. When perf is not being used, this is almost useless work. It requires an extra mtmsrd in the irq handler, and it also opens the door to masked interrupts hitting and requiring replay, which is more expensive than just taking them directly. This effect can be noticable in high IRQ workloads. Avoid enabling MSR[EE] unless perf is currently in use. This saves about 60 cycles (or 8%) on a simple decrementer interrupt microbenchmark. Replayed interrupts drop from 1.4% of all interrupts taken, to 0.003%. This does prevent the soft-nmi interrupt being taken in these handlers, but that's not too reliable anyway. The SMP watchdog will continue to be the reliable way to catch lockups. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210922145452.352571-5-npiggin@gmail.com	2021-12-16 21:31:45 +11:00
Nicholas Piggin	5a7745b96f	powerpc/64s/perf: add power_pmu_wants_prompt_pmi to say whether perf wants PMIs to be soft-NMI Interrupt code enables MSR[EE] in some irq handlers while keeping local irqs disabled via soft-mask, allowing PMI interrupts to be taken as soft-NMI to improve profiling of irq handlers. When perf is not enabled, there is no point to doing this, it's additional overhead. So provide a function that can say if PMIs should be taken promptly if possible. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210922145452.352571-4-npiggin@gmail.com	2021-12-16 21:31:45 +11:00
Nicholas Piggin	ff0b0d6e1a	powerpc/64s/interrupt: handle MSR EE and RI in interrupt entry wrapper The mtmsrd to enable MSR[RI] can be combined with the mtmsrd to enable MSR[EE] in interrupt entry code, for those interrupts which enable EE. This helps performance of important synchronous interrupts (e.g., page faults). This is similar to what commit `dd152f70bd` ("powerpc/64s: system call avoid setting MSR[RI] until we set MSR[EE]") does for system calls. Do this by enabling EE and RI together at the beginning of the entry wrapper if PACA_IRQ_HARD_DIS is clear, and only enabling RI if it is set. Asynchronous interrupts set PACA_IRQ_HARD_DIS, but synchronous ones leave it unchanged, so by default they always get EE=1 unless they have interrupted a caller that is hard disabled. When the sync interrupt later calls interrupt_cond_local_irq_enable(), it will not require another mtmsrd because MSR[EE] was already enabled here. This avoids one mtmsrd L=1 for synchronous interrupts on 64s, which saves about 20 cycles on POWER9. And for kernel-mode interrupts, both synchronous and asynchronous, this saves an additional 40 cycles due to the mtmsrd being moved ahead of mfspr SPRN_AMR, which prevents a SPR scoreboard stall. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210922145452.352571-3-npiggin@gmail.com	2021-12-16 21:31:45 +11:00
Nicholas Piggin	4423eb5ae3	powerpc/64/interrupt: make normal synchronous interrupts enable MSR[EE] if possible Make synchronous interrupt handler entry wrappers enable MSR[EE] if MSR[EE] was enabled in the interrupted context. IRQs are soft-disabled at this point so there is no change to high level code, but it's a masked interrupt could fire. This is a performance disadvantage for interrupts which do not later call interrupt_cond_local_irq_enable(), because an an additional mtmsrd or wrtee instruction is executed. However the important synchronous interrupts (e.g., page fault) do enable interrupts, so the performance disadvantage is mostly avoided. In the next patch, MSR[RI] enabling can be combined with MSR[EE] enabling, which mitigates the performance drop for the former and gives a performance advanage for the latter interrupts, on 64s machines. 64e is coming along for the ride for now to avoid divergences with 64s in this tricky code. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210922145452.352571-2-npiggin@gmail.com	2021-12-16 21:31:45 +11:00
Christophe Leroy	0d76914a4c	powerpc/inst: Optimise copy_inst_from_kernel_nofault() copy_inst_from_kernel_nofault() uses copy_from_kernel_nofault() to copy one or two 32bits words. This means calling an out-of-line function which itself calls back copy_from_kernel_nofault_allowed() then performs a generic copy with loops. Rewrite copy_inst_from_kernel_nofault() to do everything at a single place and use __get_kernel_nofault() directly to perform single accesses without loops. Allthough the generic function uses pagefault_disable(), it is not required on powerpc because do_page_fault() bails earlier when a kernel mode fault happens on a kernel address. As the function has now become very small, inline it. With this change, on an 8xx the time spent in the loop in ftrace_replace_code() is reduced by 23% at function tracer activation and 27% at nop tracer activation. The overall time to activate function tracer (measured with shell command 'time') is 570ms before the patch and 470ms after the patch. Even vmlinux size is reduced (by 152 instruction). Before the patch: 00000018 <copy_inst_from_kernel_nofault>: 18: 94 21 ff e0 stwu r1,-32(r1) 1c: 7c 08 02 a6 mflr r0 20: 38 a0 00 04 li r5,4 24: 93 e1 00 1c stw r31,28(r1) 28: 7c 7f 1b 78 mr r31,r3 2c: 38 61 00 08 addi r3,r1,8 30: 90 01 00 24 stw r0,36(r1) 34: 48 00 00 01 bl 34 <copy_inst_from_kernel_nofault+0x1c> 34: R_PPC_REL24 copy_from_kernel_nofault 38: 2c 03 00 00 cmpwi r3,0 3c: 40 82 00 0c bne 48 <copy_inst_from_kernel_nofault+0x30> 40: 81 21 00 08 lwz r9,8(r1) 44: 91 3f 00 00 stw r9,0(r31) 48: 80 01 00 24 lwz r0,36(r1) 4c: 83 e1 00 1c lwz r31,28(r1) 50: 38 21 00 20 addi r1,r1,32 54: 7c 08 03 a6 mtlr r0 58: 4e 80 00 20 blr After the patch (before inlining): 00000018 <copy_inst_from_kernel_nofault>: 18: 3d 20 b0 00 lis r9,-20480 1c: 7c 04 48 40 cmplw r4,r9 20: 7c 69 1b 78 mr r9,r3 24: 41 80 00 14 blt 38 <copy_inst_from_kernel_nofault+0x20> 28: 81 44 00 00 lwz r10,0(r4) 2c: 38 60 00 00 li r3,0 30: 91 49 00 00 stw r10,0(r9) 34: 4e 80 00 20 blr 38: 38 60 ff de li r3,-34 3c: 4e 80 00 20 blr 40: 38 60 ff f2 li r3,-14 44: 4e 80 00 20 blr Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> [mpe: Add clang workaround, with version check as suggested by Nathan] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/0d5b12183d5176dd702d29ad94c39c384e51c78f.1638208156.git.christophe.leroy@csgroup.eu	2021-12-09 22:41:21 +11:00
Christophe Leroy	9b307576f3	powerpc/inst: Move ppc_inst_t definition in asm/reg.h Because of circular inclusion of asm/hw_breakpoint.h, we need to move definition of asm/reg.h outside of inst.h so that asm/hw_breakpoint.h gets it without including asm/inst.h Also remove asm/inst.h from asm/uprobes.h as it's not needed anymore. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4b79f1491118af96b1ac0735e74aeca02ea4c04e.1638208156.git.christophe.leroy@csgroup.eu	2021-12-09 22:41:21 +11:00
Christophe Leroy	07b863aef5	powerpc/inst: Define ppc_inst_t as u32 on PPC32 Unlike PPC64 ABI, PPC32 uses the stack to pass a parameter defined as a struct, even when the struct has a single simple element. To avoid that, define ppc_inst_t as u32 on PPC32. Keep it as 'struct ppc_inst' when __CHECKER__ is defined so that sparse can perform type checking. Also revert commit `511eea5e2c` ("powerpc/kprobes: Fix Oops by passing ppc_inst as a pointer to emulate_step() on ppc32") as now the instruction to be emulated is passed as a register to emulate_step(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c6d0c46f598f76ad0b0a88bc0d84773bd921b17c.1638208156.git.christophe.leroy@csgroup.eu	2021-12-09 22:41:21 +11:00
Christophe Leroy	c545b9f040	powerpc/inst: Define ppc_inst_t In order to stop using 'struct ppc_inst' on PPC32, define a ppc_inst_t typedef. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/fe5baa2c66fea9db05a8b300b3e8d2880a42596c.1638208156.git.christophe.leroy@csgroup.eu	2021-12-09 22:41:21 +11:00
Christophe Leroy	3261d99adb	powerpc/inst: Refactor ___get_user_instr() PPC64 version of ___get_user_instr() can be used for PPC32 as well, by simply disabling the suffix part with IS_ENABLED(CONFIG_PPC64). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1f0ede830ccb33a659119a55cb590820c27004db.1638208156.git.christophe.leroy@csgroup.eu	2021-12-09 22:41:20 +11:00
Christophe Leroy	43afcf8f01	powerpc: Add KUAP support for BOOKE and 40x On booke/40x we don't have segments like book3s/32. On booke/40x we don't have access protection groups like 8xx. Use the PID register to provide user access protection. Kernel address space can be accessed with any PID. User address space has to be accessed with the PID of the user. User PID is always not null. Everytime the kernel is entered, set PID register to 0 and restore PID register when returning to user. Everytime kernel needs to access user data, PID is restored for the access. In TLB miss handlers, check the PID and bail out to data storage exception when PID is 0 and accessed address is in user space. Note that also forbids execution of user text by kernel except when user access is unlocked. But this shouldn't be a problem as the kernel is not supposed to ever run user text. This patch prepares the infrastructure but the real activation of KUAP is done by following patches for each processor type one by one. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/5d65576a8e31e9480415785a180c92dd4e72306d.1634627931.git.christophe.leroy@csgroup.eu	2021-12-09 22:41:19 +11:00
Christophe Leroy	42e03bc524	powerpc/kuap: Prepare for supporting KUAP on BOOK3E/64 Also call kuap_lock() and kuap_save_and_lock() from interrupt functions with CONFIG_PPC64. For book3s/64 we keep them empty as it is done in assembly. Also do the locked assert when switching task unless it is book3s/64. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1cbf94e26e6d6e2e028fd687588a7e6622d454a6.1634627931.git.christophe.leroy@csgroup.eu	2021-12-09 22:41:19 +11:00
Christophe Leroy	047a6fd401	powerpc/config: Add CONFIG_BOOKE_OR_40x We have many functionnalities common to 40x and BOOKE, it leads to many places with #if defined(CONFIG_BOOKE) \|\| defined(CONFIG_40x). We are going to add a few more with KUAP for booke/40x, so create a new symbol which is defined when either BOOKE or 40x is defined. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/9a3dbd60924cb25c9f944d3d8205ac5a0d15e229.1634627931.git.christophe.leroy@csgroup.eu	2021-12-09 22:41:19 +11:00
Christophe Leroy	937fb7003e	powerpc/kuap: Add kuap_lock() Add kuap_lock() and call it when entering interrupts from user. It is called kuap_lock() as it is similar to kuap_save_and_lock() without the save. However book3s/32 already have a kuap_lock(). Rename it kuap_lock_addr(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4437e2deb9f6f549f7089d45e9c6f96a7e77905a.1634627931.git.christophe.leroy@csgroup.eu	2021-12-09 22:41:19 +11:00
Christophe Leroy	2341964e27	powerpc/kuap: Remove __kuap_assert_locked() __kuap_assert_locked() is redundant with __kuap_get_and_assert_locked(). Move the verification of CONFIG_PPC_KUAP_DEBUG in kuap_assert_locked() and make it call __kuap_get_and_assert_locked() directly. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1a60198a25d2ba38a37f1b92bc7d096435df4224.1634627931.git.christophe.leroy@csgroup.eu	2021-12-09 22:41:18 +11:00
Christophe Leroy	c252f3846d	powerpc/kuap: Check KUAP activation in generic functions Today, every platform checks that KUAP is not de-activated before doing the real job. Move the verification out of platform specific functions. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/894f110397fcd248e125fb855d1e863e4e633a0d.1634627931.git.christophe.leroy@csgroup.eu	2021-12-09 22:41:18 +11:00
Christophe Leroy	ba454f9c8e	powerpc/kuap: Add a generic intermediate layer Make the following functions generic to all platforms. - bad_kuap_fault() - kuap_assert_locked() - kuap_save_and_lock() (PPC32 only) - kuap_kernel_restore() - kuap_get_and_assert_locked() And for all platforms except book3s/64 - allow_user_access() - prevent_user_access() - prevent_user_access_return() - restore_user_access() Prepend __ in front of the name of platform specific ones. For now the generic just calls the platform specific, but next patch will move redundant parts of specific functions into the generic one. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/eaef143a8dae7288cd34565ffa7b49c16aee1ec3.1634627931.git.christophe.leroy@csgroup.eu	2021-12-09 22:41:18 +11:00
Christophe Leroy	6754862249	powerpc/kuep: Remove 'nosmep' boot time parameter except for book3s/64 Deactivating KUEP at boot time is unrelevant for PPC32 and BOOK3E/64. Remove it. It allows to refactor setup_kuep() via a __weak function that only PPC64s will overide for now. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> [mpe: Fix CONFIG_PPC_BOOKS_64 -> CONFIG_PPC_BOOK3S_64 typo] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4c36df18b41c988c4512f45d96220486adbe4c99.1634627931.git.christophe.leroy@csgroup.eu	2021-12-09 22:41:18 +11:00
Christophe Leroy	70428da94c	powerpc/32s: Save content of sr0 to avoid 'mfsr' Calling 'mfsr' to get the content of segment registers is heavy, in addition it requires clearing of the 'reserved' bits. In order to avoid this operation, save it in mm context and in thread struct. The saved sr0 is the one used by kernel, this means that on locking entry it can be used as is. For unlocking, the only thing to do is to clear SR_NX. This improves null_syscall selftest by 12 cycles, ie 4%. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b02baf2ed8f09bad910dfaeeb7353b2ae6830525.1634627931.git.christophe.leroy@csgroup.eu	2021-12-09 22:41:17 +11:00
Christophe Leroy	526d4a4c77	powerpc/32s: Do kuep_lock() and kuep_unlock() in assembly When interrupt and syscall entries where converted to C, KUEP locking and unlocking was also converted. It improved performance by unrolling the loop, and allowed easily implementing boot time deactivation of KUEP. However, null_syscall selftest shows that KUEP is still heavy (361 cycles with KUEP, 212 cycles without). A way to improve more is to group 'mtsr's together, instead of repeating 'addi' + 'mtsr' several times. In order to do that, more registers need to be available. In C, GCC will always be able to provide the requested number of registers, but at the cost of saving some data on the stack, which is counter performant here. So let's do it in assembly, when we have full control of which register can be used. It also has the advantage of locking earlier and unlocking later and it helps GCC generating less tricky code. The only drawback is to make boot time deactivation less straight forward and require 'hand' instruction patching. Group 'mtsr's by 4. With this change, null_syscall selftest reports 336 cycles. Without the change it was 361 cycles, that's a 7% reduction. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/115cb279e9b9948dfd93a065e047081c59e3a2a6.1634627931.git.christophe.leroy@csgroup.eu	2021-12-09 22:41:17 +11:00
Christophe Leroy	df415cd758	powerpc/32s: Remove capability to disable KUEP at boottime Disabling KUEP at boottime makes things unnecessarily complex. Still allow disabling KUEP at build time, but when it's built-in it is always there. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/96f583f82423a29a4205c60b9721079111b35567.1634627931.git.christophe.leroy@csgroup.eu	2021-12-09 22:41:17 +11:00
Christophe Leroy	ee2631603f	powerpc/44x: Activate KUEP at all time On 44x, KUEP is implemented by clearing SX bit during TLB miss for user pages. The impact is minimal and not worth neither boot time nor build time selection. Activate it at all time. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/2414d662558e7fb27d1ed41c8e47c591d576acac.1634627931.git.christophe.leroy@csgroup.eu	2021-12-09 22:41:17 +11:00
Christophe Leroy	13dac4e31e	powerpc/8xx: Activate KUEP at all time On the 8xx, there is absolutely no runtime impact with KUEP. Protection against execution of user code in kernel mode is set up at boot time by configuring the groups with contain all user pages as having swapped protection rights, in extenso EX for user and NA for supervisor. Configure KUEP at startup and force selection of CONFIG_PPC_KUEP. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/2129e86944323ffe9ed07fffbeafdfd2e363690a.1634627931.git.christophe.leroy@csgroup.eu	2021-12-09 22:41:17 +11:00
Christophe Leroy	6c1fa60d36	Revert "powerpc: Inline setup_kup()" This reverts commit `1791ebd131`. setup_kup() was inlined to manage conflict between PPC32 marking setup_{kuap/kuep}() __init and PPC64 not marking them __init. But in fact PPC32 has removed the __init mark for all but 8xx in order to properly handle SMP. In order to make setup_kup() grow a bit, revert the commit mentioned above but remove __init for 8xx as well so that we don't have to mark setup_kup() as __ref. Also switch the order so that KUAP is initialised before KUEP because on the 40x, KUEP will depend on the activation of KUAP. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7691088fd0994ee3c8db6298dc8c00259e3f6a7f.1634627931.git.christophe.leroy@csgroup.eu	2021-12-09 22:41:16 +11:00
Nicholas Piggin	387e220a2e	powerpc/64s: Move hash MMU support code under CONFIG_PPC_64S_HASH_MMU Compiling out hash support code when CONFIG_PPC_64S_HASH_MMU=n saves 128kB kernel image size (90kB text) on powernv_defconfig minus KVM, 350kB on pseries_defconfig minus KVM, 40kB on a tiny config. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Fixup defined(ARCH_HAS_MEMREMAP_COMPAT_ALIGN), which needs CONFIG. Fix radix_enabled() use in setup_initial_memory_limit(). Add some stubs to reduce number of ifdefs.] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-18-npiggin@gmail.com	2021-12-09 22:41:13 +11:00
Nicholas Piggin	c28573744b	powerpc/64s: Make hash MMU support configurable This adds Kconfig selection which allows 64s hash MMU support to be disabled. It can be disabled if radix support is enabled, the minimum supported CPU type is POWER9 (or higher), and KVM is not selected. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-17-npiggin@gmail.com	2021-12-09 22:40:24 +11:00
Nicholas Piggin	debeda0171	powerpc/64s: Always define arch unmapped area calls To avoid any functional changes to radix paths when building with hash MMU support disabled (and CONFIG_PPC_MM_SLICES=n), always define the arch get_unmapped_area calls on 64s platforms. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-16-npiggin@gmail.com	2021-12-09 22:40:24 +11:00
Sean Christopherson	005467e06b	KVM: Drop obsolete kvm_arch_vcpu_block_finish() Drop kvm_arch_vcpu_block_finish() now that all arch implementations are nops. No functional change intended. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211009021236.4122790-10-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-12-08 04:24:50 -05:00
Sean Christopherson	510958e997	KVM: Force PPC to define its own rcuwait object Do not define/reference kvm_vcpu.wait if __KVM_HAVE_ARCH_WQP is true, and instead force the architecture (PPC) to define its own rcuwait object. Allowing common KVM to directly access vcpu->wait without a guard makes it all too easy to introduce potential bugs, e.g. kvm_vcpu_block(), kvm_vcpu_on_spin(), and async_pf_execute() all operate on vcpu->wait, not the result of kvm_arch_vcpu_get_wait(), and so may do the wrong thing for PPC. Due to PPC's shenanigans with respect to callbacks and waits (it switches to the virtual core's wait object at KVM_RUN!?!?), it's not clear whether or not this fixes any bugs. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211009021236.4122790-5-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-12-08 04:24:46 -05:00
Sean Christopherson	eaaaed137e	KVM: PPC: Avoid referencing userspace memory region in memslot updates For PPC HV, get the number of pages directly from the new memslot instead of computing the same from the userspace memory region, and explicitly check for !DELETE instead of inferring the same when toggling mmio_update. The motivation for these changes is to avoid referencing the @mem param so that it can be dropped in a future commit. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Message-Id: <1e97fb5198be25f98ef82e63a8d770c682264cc9.1638817639.git.maciej.szmigiero@oracle.com>	2021-12-08 04:24:22 -05:00
Sean Christopherson	537a17b314	KVM: Let/force architectures to deal with arch specific memslot data Pass the "old" slot to kvm_arch_prepare_memory_region() and force arch code to handle propagating arch specific data from "new" to "old" when necessary. This is a baby step towards dynamically allocating "new" from the get go, and is a (very) minor performance boost on x86 due to not unnecessarily copying arch data. For PPC HV, copy the rmap in the !CREATE and !DELETE paths, i.e. for MOVE and FLAGS_ONLY. This is functionally a nop as the previous behavior would overwrite the pointer for CREATE, and eventually discard/ignore it for DELETE. For x86, copy the arch data only for FLAGS_ONLY changes. Unlike PPC HV, x86 needs to reallocate arch data in the MOVE case as the size of x86's allocations depend on the alignment of the memslot's gfn. Opportunistically tweak kvm_arch_prepare_memory_region()'s param order to match the "commit" prototype. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> [mss: add missing RISCV kvm_arch_prepare_memory_region() change] Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Message-Id: <67dea5f11bbcfd71e3da5986f11e87f5dd4013f9.1638817639.git.maciej.szmigiero@oracle.com>	2021-12-08 04:24:20 -05:00
Sebastian Andrzej Siewior	77993b595a	locking: Allow to include asm/spinlock_types.h from linux/spinlock_types_raw.h The printk header file includes ratelimit_types.h for its __ratelimit() based usage. It is required for the static initializer used in printk_ratelimited(). It uses a raw_spinlock_t and includes the spinlock_types.h. PREEMPT_RT substitutes spinlock_t with a rtmutex based implementation and so its spinlock_t implmentation (provided by spinlock_rt.h) includes rtmutex.h and atomic.h which leads to recursive includes where defines are missing. By including only the raw_spinlock_t defines it avoids the atomic.h related includes at this stage. An example on powerpc: \| CALL scripts/atomic/check-atomics.sh \|In file included from include/linux/bug.h:5, \| from include/linux/page-flags.h:10, \| from kernel/bounds.c:10: \|arch/powerpc/include/asm/page_32.h: In function âclear_pageâ: \|arch/powerpc/include/asm/bug.h:87:4: error: implicit declaration of function â=80=98__WARNâ=80=99 [-Werror=3Dimplicit-function-declaration] \| 87 \| __WARN(); \ \| \| ^~~~~~ \|arch/powerpc/include/asm/page_32.h:48:2: note: in expansion of macro âWARN_ONâ=99 \| 48 \| WARN_ON((unsigned long)addr & (L1_CACHE_BYTES - 1)); \| \| ^~~~~~~ \|arch/powerpc/include/asm/bug.h:58:17: error: invalid application of âsizeofâ=99 to incomplete type âstruct bug_entryâ=99 \| 58 \| "i" (sizeof(struct bug_entry)), \ \| \| ^~~~~~ \|arch/powerpc/include/asm/bug.h:89:3: note: in expansion of macro âBUG_ENTRYâ=99 \| 89 \| BUG_ENTRY(PPC_TLNEI " %4, 0", \ \| \| ^~~~~~~~~ \|arch/powerpc/include/asm/page_32.h:48:2: note: in expansion of macro âWARN_ONâ=99 \| 48 \| WARN_ON((unsigned long)addr & (L1_CACHE_BYTES - 1)); \| \| ^~~~~~~ \|In file included from arch/powerpc/include/asm/ptrace.h:298, \| from arch/powerpc/include/asm/hw_irq.h:12, \| from arch/powerpc/include/asm/irqflags.h:12, \| from include/linux/irqflags.h:16, \| from include/asm-generic/cmpxchg-local.h:6, \| from arch/powerpc/include/asm/cmpxchg.h:526, \| from arch/powerpc/include/asm/atomic.h:11, \| from include/linux/atomic.h:7, \| from include/linux/rwbase_rt.h:6, \| from include/linux/rwlock_types.h:55, \| from include/linux/spinlock_types.h:74, \| from include/linux/ratelimit_types.h:7, \| from include/linux/printk.h:10, \| from include/asm-generic/bug.h:22, \| from arch/powerpc/include/asm/bug.h:109, \| from include/linux/bug.h:5, \| from include/linux/page-flags.h:10, \| from kernel/bounds.c:10: \|include/linux/thread_info.h: In function â=80=98copy_overflowâ=80=99: \|include/linux/thread_info.h:210:2: error: implicit declaration of function â=80=98WARNâ=80=99 [-Werror=3Dimplicit-function-declaration] \| 210 \| WARN(1, "Buffer overflow detected (%d < %lu)!\n", size, count); \| \| ^~~~ The WARN / BUG include pulls in printk.h and then ptrace.h expects WARN (from bug.h) which is not yet complete. Even hw_irq.h has WARN_ON() statements. On POWERPC64 there are missing atomic64 defines while building 32bit VDSO: \| VDSO32C arch/powerpc/kernel/vdso32/vgettimeofday.o \|In file included from include/linux/atomic.h:80, \| from include/linux/rwbase_rt.h:6, \| from include/linux/rwlock_types.h:55, \| from include/linux/spinlock_types.h:74, \| from include/linux/ratelimit_types.h:7, \| from include/linux/printk.h:10, \| from include/linux/kernel.h:19, \| from arch/powerpc/include/asm/page.h:11, \| from arch/powerpc/include/asm/vdso/gettimeofday.h:5, \| from include/vdso/datapage.h:137, \| from lib/vdso/gettimeofday.c:5, \| from <command-line>: \|include/linux/atomic-arch-fallback.h: In function âarch_atomic64_incâ=99: \|include/linux/atomic-arch-fallback.h:1447:2: error: implicit declaration of function âarch_atomic64_addâ; did you mean âarch_atomic_addâ? [-Werror=3Dimpl \|icit-function-declaration] \| 1447 \| arch_atomic64_add(1, v); \| \| ^~~~~~~~~~~~~~~~~ \| \| arch_atomic_add The generic fallback is not included, atomics itself are not used. If kernel.h does not include printk.h then it comes later from the bug.h include. Allow asm/spinlock_types.h to be included from linux/spinlock_types_raw.h. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20211129174654.668506-12-bigeasy@linutronix.de	2021-12-07 15:14:12 +01:00
Nicholas Piggin	935b534c24	powerpc/64s: Move and rename do_bad_slb_fault as it is not hash specific slb.c is hash-specific SLB management, but do_bad_slb_fault deals with segment interrupts that occur with radix MMU as well. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-5-npiggin@gmail.com	2021-12-02 22:57:23 +11:00
Nicholas Piggin	a4135cbebd	powerpc/pseries: Stop selecting PPC_HASH_MMU_NATIVE The pseries platform does not use the native hash code but the PAPR virtualised hash interfaces, so remove PPC_HASH_MMU_NATIVE. This requires moving tlbiel code from hash_native.c to hash_utils.c. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-4-npiggin@gmail.com	2021-12-02 22:57:23 +11:00
Nicholas Piggin	79b74a6848	powerpc: Remove unused FW_FEATURE_NATIVE references FW_FEATURE_NATIVE_ALWAYS and FW_FEATURE_NATIVE_POSSIBLE are always zero and never do anything. Remove them. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-2-npiggin@gmail.com	2021-12-02 22:57:22 +11:00
Kees Cook	62ea67e319	powerpc/signal32: Use struct_group() to zero spe regs In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memset(), avoid intentionally writing across neighboring fields. Add a struct_group() for the spe registers so that memset() can correctly reason about the size: In function 'fortify_memset_chk', inlined from 'restore_user_regs.part.0' at arch/powerpc/kernel/signal_32.c:539:3: >> include/linux/fortify-string.h:195:4: error: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning] 195 \| __write_overflow_field(); \| ^~~~~~~~~~~~~~~~~~~~~~~~ Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211118203604.1288379-1-keescook@chromium.org	2021-12-02 10:39:00 +11:00
Nicholas Piggin	5402e239d0	powerpc/64s: Get LPID bit width from device tree Allow the LPID bit width and partition table size to be set at runtime from the device tree. Move the PID bit width detection into the same place. KVM does not support using the extra bits yet, this is mainly required to get the PTCR register values correct (so KVM will run but it will not allocate > 4096 LPIDs). OPAL firmware provides this property for POWER10 CPUs since skiboot commit 9b85f7d961f2 ("hdata: add mmu-pid-bits and mmu-lpid-bits for POWER10 CPUs"). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211129030915.1888332-1-npiggin@gmail.com	2021-11-30 22:27:07 +11:00
Athira Rajeev	2c9ac51b85	powerpc/perf: Fix PMU callbacks to clear pending PMI before resetting an overflown PMC Running perf fuzzer showed below in dmesg logs: "Can't find PMC that caused IRQ" This means a PMU exception happened, but none of the PMC's (Performance Monitor Counter) were found to be overflown. There are some corner cases that clears the PMCs after PMI gets masked. In such cases, the perf interrupt handler will not find the active PMC values that had caused the overflow and thus leads to this message while replaying. Case 1: PMU Interrupt happens during replay of other interrupts and counter values gets cleared by PMU callbacks before replay: During replay of interrupts like timer, __do_irq() and doorbell exception, we conditionally enable interrupts via may_hard_irq_enable(). This could potentially create a window to generate a PMI. Since irq soft mask is set to ALL_DISABLED, the PMI will get masked here. We could get IPIs run before perf interrupt is replayed and the PMU events could be deleted or stopped. This will change the PMU SPR values and resets the counters. Snippet of ftrace log showing PMU callbacks invoked in __do_irq(): <idle>-0 [051] dns. 132025441306354: __do_irq <-call_do_irq <idle>-0 [051] dns. 132025441306430: irq_enter <-__do_irq <idle>-0 [051] dns. 132025441306503: irq_enter_rcu <-__do_irq <idle>-0 [051] dnH. 132025441306599: xive_get_irq <-__do_irq <<>> <idle>-0 [051] dnH. 132025441307770: generic_smp_call_function_single_interrupt <-smp_ipi_demux_relaxed <idle>-0 [051] dnH. 132025441307839: flush_smp_call_function_queue <-smp_ipi_demux_relaxed <idle>-0 [051] dnH. 132025441308057: _raw_spin_lock <-event_function <idle>-0 [051] dnH. 132025441308206: power_pmu_disable <-perf_pmu_disable <idle>-0 [051] dnH. 132025441308337: power_pmu_del <-event_sched_out <idle>-0 [051] dnH. 132025441308407: power_pmu_read <-power_pmu_del <idle>-0 [051] dnH. 132025441308477: read_pmc <-power_pmu_read <idle>-0 [051] dnH. 132025441308590: isa207_disable_pmc <-power_pmu_del <idle>-0 [051] dnH. 132025441308663: write_pmc <-power_pmu_del <idle>-0 [051] dnH. 132025441308787: power_pmu_event_idx <-perf_event_update_userpage <idle>-0 [051] dnH. 132025441308859: rcu_read_unlock_strict <-perf_event_update_userpage <idle>-0 [051] dnH. 132025441308975: power_pmu_enable <-perf_pmu_enable <<>> <idle>-0 [051] dnH. 132025441311108: irq_exit <-__do_irq <idle>-0 [051] dns. 132025441311319: performance_monitor_exception <-replay_soft_interrupts Case 2: PMI's masked during local_* operations, example local_add(). If the local_add() operation happens within a local_irq_save(), replay of PMI will be during local_irq_restore(). Similar to case 1, this could also create a window before replay where PMU events gets deleted or stopped. Fix it by updating the PMU callback function power_pmu_disable() to check for pending perf interrupt. If there is an overflown PMC and pending perf interrupt indicated in paca, clear the PMI bit in paca to drop that sample. Clearing of PMI bit is done in power_pmu_disable() since disable is invoked before any event gets deleted/stopped. With this fix, if there are more than one event running in the PMU, there is a chance that we clear the PMI bit for the event which is not getting deleted/stopped. The other events may still remain active. Hence to make sure we don't drop valid sample in such cases, another check is added in power_pmu_enable. This checks if there is an overflown PMC found among the active events and if so enable back the PMI bit. Two new helper functions are introduced to clear/set the PMI, ie clear_pmi_irq_pending() and set_pmi_irq_pending(). Helper function pmi_irq_pending() is introduced to give a warning if there is pending PMI bit in paca, but no PMC is overflown. Also there are corner cases which result in performance monitor interrupts being triggered during power_pmu_disable(). This happens since PMXE bit is not cleared along with disabling of other MMCR0 bits in the pmu_disable. Such PMI's could leave the PMU running and could trigger PMI again which will set MMCR0 PMAO bit. This could lead to spurious interrupts in some corner cases. Example, a timer after power_pmu_del() which will re-enable interrupts and triggers a PMI again since PMAO bit is still set. But fails to find valid overflow since PMC was cleared in power_pmu_del(). Fix that by disabling PMXE along with disabling of other MMCR0 bits in power_pmu_disable(). We can't just replay PMI any time. Hence this approach is preferred rather than replaying PMI before resetting overflown PMC. Patch also documents core-book3s on a race condition which can trigger these PMC messages during idle path in PowerNV. Fixes: `f442d00480` ("powerpc/64s: Add support to mask perf interrupts and replay them") Reported-by: Nageswara R Sastry <nasastry@in.ibm.com> Suggested-by: Nicholas Piggin <npiggin@gmail.com> Suggested-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Make pmi_irq_pending() return bool, reflow/reword some comments] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1626846509-1350-2-git-send-email-atrajeev@linux.vnet.ibm.com	2021-11-30 17:15:49 +11:00
Christophe Leroy	f05cab0034	powerpc/atomics: Remove atomic_inc()/atomic_dec() and friends Now that atomic_add() and atomic_sub() handle immediate operands, atomic_inc() and atomic_dec() have no added value compared to the generic fallback which calls atomic_add(1) and atomic_sub(1). Also remove atomic_inc_not_zero() which fallsback to atomic_add_unless() which itself fallsback to atomic_fetch_add_unless() which now handles immediate operands. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/0bc64a2f18726055093dbb2e479cefc60a409cfd.1632236981.git.christophe.leroy@csgroup.eu	2021-11-30 11:45:57 +11:00
Christophe Leroy	41d65207de	powerpc/atomics: Use immediate operand when possible Today we get the following code generation for atomic operations: c001bb2c: 39 20 00 01 li r9,1 c001bb30: 7d 40 18 28 lwarx r10,0,r3 c001bb34: 7d 09 50 50 subf r8,r9,r10 c001bb38: 7d 00 19 2d stwcx. r8,0,r3 c001c7a8: 39 40 00 01 li r10,1 c001c7ac: 7d 00 18 28 lwarx r8,0,r3 c001c7b0: 7c ea 42 14 add r7,r10,r8 c001c7b4: 7c e0 19 2d stwcx. r7,0,r3 By allowing GCC to choose between immediate or regular operation, we get: c001bb2c: 7d 20 18 28 lwarx r9,0,r3 c001bb30: 39 49 ff ff addi r10,r9,-1 c001bb34: 7d 40 19 2d stwcx. r10,0,r3 -- c001c7a4: 7d 40 18 28 lwarx r10,0,r3 c001c7a8: 39 0a 00 01 addi r8,r10,1 c001c7ac: 7d 00 19 2d stwcx. r8,0,r3 For "and", the dot form has to be used because "andi" doesn't exist. For logical operations we use unsigned 16 bits immediate. For arithmetic operations we use signed 16 bits immediate. On pmac32_defconfig, it reduces the text by approx another 8 kbytes. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/2ec558d44db8045752fe9dbd29c9ba84bab6030b.1632236981.git.christophe.leroy@csgroup.eu	2021-11-30 11:45:57 +11:00
Christophe Leroy	fb350784d8	powerpc/bitops: Use immediate operand when possible Today we get the following code generation for bitops like set or clear bit: c0009fe0: 39 40 08 00 li r10,2048 c0009fe4: 7c e0 40 28 lwarx r7,0,r8 c0009fe8: 7c e7 53 78 or r7,r7,r10 c0009fec: 7c e0 41 2d stwcx. r7,0,r8 c000d568: 39 00 18 00 li r8,6144 c000d56c: 7c c0 38 28 lwarx r6,0,r7 c000d570: 7c c6 40 78 andc r6,r6,r8 c000d574: 7c c0 39 2d stwcx. r6,0,r7 Most set bits are constant on lower 16 bits, so it can easily be replaced by the "immediate" version of the operation. Allow GCC to choose between the normal or immediate form. For clear bits, on 32 bits 'rlwinm' can be used instead of 'andc' for when all bits to be cleared are consecutive. On 64 bits we don't have any equivalent single operation for clearing, single bits or a few bits, we'd need two 'rldicl' so it is not worth it, the li/andc sequence is doing the same. With this patch we get: c0009fe0: 7d 00 50 28 lwarx r8,0,r10 c0009fe4: 61 08 08 00 ori r8,r8,2048 c0009fe8: 7d 00 51 2d stwcx. r8,0,r10 c000d558: 7c e0 40 28 lwarx r7,0,r8 c000d55c: 54 e7 05 64 rlwinm r7,r7,0,21,18 c000d560: 7c e0 41 2d stwcx. r7,0,r8 On pmac32_defconfig, it reduces the text by approx 10 kbytes. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e6f815d9181bab09df3b350af51149437863e9f9.1632236981.git.christophe.leroy@csgroup.eu	2021-11-30 11:45:50 +11:00
Nicholas Piggin	aebd1fb45c	powerpc: flexible GPR range save/restore macros Introduce macros that operate on a (start, end) range of GPRs, which reduces lines of code and need to do mental arithmetic while reading the code. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211022061322.2671178-1-npiggin@gmail.com	2021-11-29 23:15:20 +11:00

1 2 3 4 5 ...

6677 Commits