Commit Graph

1512 Commits

Author SHA1 Message Date
Jason A. Donenfeld
7ef3d06f1b powerpc/powernv/kvm: Use darn for H_RANDOM on Power9
The existing logic in KVM to support guests calling H_RANDOM only works
on Power8, because it looks for an RNG in the device tree, but on Power9
we just use darn.

In addition the existing code needs to work in real mode, so we have the
special cased powernv_get_random_real_mode() to deal with that.

Instead just have KVM call ppc_md.get_random_seed(), and do the real
mode check inside of there, that way we use whatever RNG is available,
including darn on Power9.

Fixes: e928e9cb36 ("KVM: PPC: Book3S HV: Add fast real-mode H_RANDOM implementation.")
Cc: stable@vger.kernel.org # v4.1+
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
[mpe: Rebase on previous commit, update change log appropriately]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220727143219.2684192-2-mpe@ellerman.id.au
2022-07-28 16:22:15 +10:00
Michael Ellerman
90b5d4fe0b powerpc/powernv: Avoid crashing if rng is NULL
On a bare-metal Power8 system that doesn't have an "ibm,power-rng", a
malicious QEMU and guest that ignore the absence of the
KVM_CAP_PPC_HWRNG flag, and calls H_RANDOM anyway, will dereference a
NULL pointer.

In practice all Power8 machines have an "ibm,power-rng", but let's not
rely on that, add a NULL check and early return in
powernv_get_random_real_mode().

Fixes: e928e9cb36 ("KVM: PPC: Book3S HV: Add fast real-mode H_RANDOM implementation.")
Cc: stable@vger.kernel.org # v4.1+
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220727143219.2684192-1-mpe@ellerman.id.au
2022-07-28 16:22:15 +10:00
Alexey Kardashevskiy
d73b46c3c1 powerpc/ioda/iommu/debugfs: Generate unique debugfs entries
The iommu_table::it_index is a LIOBN which is not initialized on PowerNV
as it is not used except IOMMU debugfs where it is used for a node name.

This initializes it_index witn a unique number to avoid warnings and
have a node for every iommu_table.

This should not cause any behavioral change without CONFIG_IOMMU_DEBUGFS.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220714080800.3712998-1-aik@ozlabs.ru
2022-07-28 16:22:13 +10:00
Michael Ellerman
2b461880c2 powerpc: Fix all occurences of duplicate words
Since commit 87c78b612f ("powerpc: Fix all occurences of "the the"")
fixed "the the", there's now a steady stream of patches fixing other
duplicate words.

Just fix them all at once, to save the overhead of dealing with
individual patches for each case.

This leaves a few cases of "that that", which in some contexts is
correct.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220718095158.326606-1-mpe@ellerman.id.au
2022-07-25 12:05:15 +10:00
Jason A. Donenfeld
9592eef7c1 random: remove CONFIG_ARCH_RANDOM
When RDRAND was introduced, there was much discussion on whether it
should be trusted and how the kernel should handle that. Initially, two
mechanisms cropped up, CONFIG_ARCH_RANDOM, a compile time switch, and
"nordrand", a boot-time switch.

Later the thinking evolved. With a properly designed RNG, using RDRAND
values alone won't harm anything, even if the outputs are malicious.
Rather, the issue is whether those values are being *trusted* to be good
or not. And so a new set of options were introduced as the real
ones that people use -- CONFIG_RANDOM_TRUST_CPU and "random.trust_cpu".
With these options, RDRAND is used, but it's not always credited. So in
the worst case, it does nothing, and in the best case, maybe it helps.

Along the way, CONFIG_ARCH_RANDOM's meaning got sort of pulled into the
center and became something certain platforms force-select.

The old options don't really help with much, and it's a bit odd to have
special handling for these instructions when the kernel can deal fine
with the existence or untrusted existence or broken existence or
non-existence of that CPU capability.

Simplify the situation by removing CONFIG_ARCH_RANDOM and using the
ordinary asm-generic fallback pattern instead, keeping the two options
that are actually used. For now it leaves "nordrand" for now, as the
removal of that will take a different route.

Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Borislav Petkov <bp@suse.de>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-07-18 15:03:37 +02:00
Michael Ellerman
7e74dabc3d Merge branch 'fixes' into next
Merge our fixes branch. In particular this brings in commit
9864816180 ("powerpc/book3e: Fix PUD allocation size in
map_kernel_page()") which fixes a build failure in next, because commit
2db2008e63 ("powerpc/64e: Rewrite p4d_populate() as a static inline
function") depends on it.
2022-07-09 19:29:34 +10:00
Jason A. Donenfeld
8875028265 powerpc/powernv: delay rng platform device creation until later in boot
The platform device for the rng must be created much later in boot.
Otherwise it tries to connect to a parent that doesn't yet exist,
resulting in this splat:

  [    0.000478] kobject: '(null)' ((____ptrval____)): is not initialized, yet kobject_get() is being called.
  [    0.002925] [c000000002a0fb30] [c00000000073b0bc] kobject_get+0x8c/0x100 (unreliable)
  [    0.003071] [c000000002a0fba0] [c00000000087e464] device_add+0xf4/0xb00
  [    0.003194] [c000000002a0fc80] [c000000000a7f6e4] of_device_add+0x64/0x80
  [    0.003321] [c000000002a0fcb0] [c000000000a800d0] of_platform_device_create_pdata+0xd0/0x1b0
  [    0.003476] [c000000002a0fd00] [c00000000201fa44] pnv_get_random_long_early+0x240/0x2e4
  [    0.003623] [c000000002a0fe20] [c000000002060c38] random_init+0xc0/0x214

This patch fixes the issue by doing the platform device creation inside
of machine_subsys_initcall.

Fixes: f3eac42665 ("powerpc/powernv: wire up rng during setup_arch")
Cc: stable@vger.kernel.org
Reported-by: Sachin Sant <sachinp@linux.ibm.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
[mpe: Change "of node" to "platform device" in change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220630121654.1939181-1-Jason@zx2c4.com
2022-07-04 21:11:47 +10:00
Juerg Haefliger
1e2e5e8274 powerpc/powernv: Kconfig: Replace single quotes
Replace single quotes with double quotes which seems to be the convention
for strings.

Signed-off-by: Juerg Haefliger <juergh@canonical.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220520115229.147368-1-juergh@canonical.com
2022-06-29 19:43:26 +10:00
Jason A. Donenfeld
f3eac42665 powerpc/powernv: wire up rng during setup_arch
The platform's RNG must be available before random_init() in order to be
useful for initial seeding, which in turn means that it needs to be
called from setup_arch(), rather than from an init call.

Complicating things, however, is that POWER8 systems need some per-cpu
state and kmalloc, which isn't available at this stage. So we split
things up into an early phase and a later opportunistic phase. This
commit also removes some noisy log messages that don't add much.

Fixes: a4da0d50b2 ("powerpc: Implement arch_get_random_long/int() for powernv")
Cc: stable@vger.kernel.org # v3.13+
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Add of_node_put(), use pnv naming, minor change log editing]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220621140849.127227-1-Jason@zx2c4.com
2022-06-22 19:47:22 +10:00
Paul Mackerras
743cdb7bd0 powerpc/kasan: Mark more real-mode code as not to be instrumented
This marks more files and functions that can possibly be called in
real mode as not to be instrumented by KASAN.  Most were found by
inspection, except for get_pseries_errorlog() which was reported as
causing a crash in testing.

Reported-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YoX1kZPnmUX4RZEK@cleo
2022-05-29 10:30:42 +10:00
Linus Torvalds
6112bd00e8 powerpc updates for 5.19
- Convert to the generic mmap support (ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT).
 
  - Add support for outline-only KASAN with 64-bit Radix MMU (P9 or later).
 
  - Increase SIGSTKSZ and MINSIGSTKSZ and add support for AT_MINSIGSTKSZ.
 
  - Enable the DAWR (Data Address Watchpoint) on POWER9 DD2.3 or later.
 
  - Drop support for system call instruction emulation.
 
  - Many other small features and fixes.
 
 Thanks to: Alexey Kardashevskiy, Alistair Popple, Andy Shevchenko, Bagas Sanjaya, Bjorn
 Helgaas, Bo Liu, Chen Huang, Christophe Leroy, Colin Ian King, Daniel Axtens, Dwaipayan
 Ray, Fabiano Rosas, Finn Thain, Frank Rowand, Fuqian Huang, Guilherme G. Piccoli, Hangyu
 Hua, Haowen Bai, Haren Myneni, Hari Bathini, He Ying, Jason Wang, Jiapeng Chong, Jing
 Yangyang, Joel Stanley, Julia Lawall, Kajol Jain, Kevin Hao, Krzysztof Kozlowski, Laurent
 Dufour, Lv Ruyi, Madhavan Srinivasan, Magali Lemes, Miaoqian Lin, Minghao Chi, Nathan
 Chancellor, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Oscar Salvador, Pali Rohár,
 Paul Mackerras, Peng Wu, Qing Wang, Randy Dunlap, Reza Arbab, Russell Currey, Sohaib
 Mohamed, Vaibhav Jain, Vasant Hegde, Wang Qing, Wang Wensheng, Xiang wangx, Xiaomeng Tong,
 Xu Wang, Yang Guang, Yang Li, Ye Bin, YueHaibing, Yu Kuai, Zheng Bin, Zou Wei, Zucheng
 Zheng.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmKSEgETHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgJpLEACee7mu2I00Z7VWtW5ckT4RFbAXYZcM
 Hv5DbTnVB2ItoQMRHvG52DNbR73j9HnYrz8kpwfTBVk90udxVP14L/swXDs3xbT4
 riXEYtJ1DRVc/bLiOK637RLPWNrmmZStWZme7k0Y9Ki5Aif8i1Erjjq7EIy47m9j
 j1MTcwp3ND7IsBON2nZ3PkttEHhevKvOwCPb/BWtPMDV0OhyQUFKB2SNegrlCrkT
 wshDgdQcYqbIix98PoGa2ZfUVgFQD3JVLzXa4sLpqouzGD+HvEFStOFa2Gq/ZEvV
 zunaeXDdZUCjlib6KvA8+aumBbIQ1s/urrDbxd+3BuYxZ094vNP1B428NT1AWVtl
 3bEZQIN8GSx0v9aHxZ8HePsAMXgG9d2o0xC9EMQ430+cqroN+6UHP7lkekwkprb7
 U9EpZCG9U8jV6SDcaMigW3tooEjn657we0R8nZG2NgUNssdSHVh/JYxGDALPXIAk
 awL3NQrR0tYF3Y3LJm5AxdQrK1hJH8E+hZFCZvIpUXGsr/uf9Gemy/62pD1rhrr/
 niULpxIneRGkJiXB5qdGy8pRu27ED53k7Ky6+8MWSEFQl1mUsHSryYACWz939D8c
 DydhBwQqDTl6Ozs41a5TkVjIRLOCrZADUd/VZM6A4kEOqPJ5t2Gz22Bn8ya1z6Ks
 5Sx6vrGH7GnDjA==
 =15oQ
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Convert to the generic mmap support (ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT)

 - Add support for outline-only KASAN with 64-bit Radix MMU (P9 or later)

 - Increase SIGSTKSZ and MINSIGSTKSZ and add support for AT_MINSIGSTKSZ

 - Enable the DAWR (Data Address Watchpoint) on POWER9 DD2.3 or later

 - Drop support for system call instruction emulation

 - Many other small features and fixes

Thanks to Alexey Kardashevskiy, Alistair Popple, Andy Shevchenko, Bagas
Sanjaya, Bjorn Helgaas, Bo Liu, Chen Huang, Christophe Leroy, Colin Ian
King, Daniel Axtens, Dwaipayan Ray, Fabiano Rosas, Finn Thain, Frank
Rowand, Fuqian Huang, Guilherme G. Piccoli, Hangyu Hua, Haowen Bai,
Haren Myneni, Hari Bathini, He Ying, Jason Wang, Jiapeng Chong, Jing
Yangyang, Joel Stanley, Julia Lawall, Kajol Jain, Kevin Hao, Krzysztof
Kozlowski, Laurent Dufour, Lv Ruyi, Madhavan Srinivasan, Magali Lemes,
Miaoqian Lin, Minghao Chi, Nathan Chancellor, Naveen N. Rao, Nicholas
Piggin, Oliver O'Halloran, Oscar Salvador, Pali Rohár, Paul Mackerras,
Peng Wu, Qing Wang, Randy Dunlap, Reza Arbab, Russell Currey, Sohaib
Mohamed, Vaibhav Jain, Vasant Hegde, Wang Qing, Wang Wensheng, Xiang
wangx, Xiaomeng Tong, Xu Wang, Yang Guang, Yang Li, Ye Bin, YueHaibing,
Yu Kuai, Zheng Bin, Zou Wei, and Zucheng Zheng.

* tag 'powerpc-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (200 commits)
  powerpc/64: Include cache.h directly in paca.h
  powerpc/64s: Only set HAVE_ARCH_UNMAPPED_AREA when CONFIG_PPC_64S_HASH_MMU is set
  powerpc/xics: Include missing header
  powerpc/powernv/pci: Drop VF MPS fixup
  powerpc/fsl_book3e: Don't set rodata RO too early
  powerpc/microwatt: Add mmu bits to device tree
  powerpc/powernv/flash: Check OPAL flash calls exist before using
  powerpc/powermac: constify device_node in of_irq_parse_oldworld()
  powerpc/powermac: add missing g5_phy_disable_cpu1() declaration
  selftests/powerpc/pmu: fix spelling mistake "mis-match" -> "mismatch"
  powerpc: Enable the DAWR on POWER9 DD2.3 and above
  powerpc/64s: Add CPU_FTRS_POWER10 to ALWAYS mask
  powerpc/64s: Add CPU_FTRS_POWER9_DD2_2 to CPU_FTRS_ALWAYS mask
  powerpc: Fix all occurences of "the the"
  selftests/powerpc/pmu/ebb: remove fixed_instruction.S
  powerpc/platforms/83xx: Use of_device_get_match_data()
  powerpc/eeh: Drop redundant spinlock initialization
  powerpc/iommu: Add missing of_node_put in iommu_init_early_dart
  powerpc/pseries/vas: Call misc_deregister if sysfs init fails
  powerpc/papr_scm: Fix leaking nvdimm_events_map elements
  ...
2022-05-28 11:27:17 -07:00
Oliver O'Halloran
a5d28039ec powerpc/powernv/pci: Drop VF MPS fixup
The MPS field in the VF config space is marked as reserved in current
versions of the SR-IOV spec. In other words, this fixup doesn't do
anything.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200902035159.1762596-1-oohall@gmail.com
2022-05-22 15:59:54 +10:00
Vasant Hegde
25e69962ef powerpc/powernv/flash: Check OPAL flash calls exist before using
Currently only FSP based powernv systems supports firmware update
interfaces. Hence check that the token OPAL_FLASH_VALIDATE exists
before initalising the flash driver.

Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210914101630.30613-1-hegdevasant@linux.vnet.ibm.com
2022-05-22 15:59:54 +10:00
Michael Ellerman
87c78b612f powerpc: Fix all occurences of "the the"
Rather than waiting for the bots to fix these one-by-one, fix all
occurences of "the the" throughout arch/powerpc.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220518142629.513007-1-mpe@ellerman.id.au
2022-05-22 15:59:43 +10:00
Lv Ruyi
3ffa9fd471 powerpc/powernv: fix missing of_node_put in uv_init()
of_find_compatible_node() returns node pointer with refcount incremented,
use of_node_put() on it when done.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220407090043.2491854-1-lv.ruyi@zte.com.cn
2022-05-22 15:58:30 +10:00
Daniel Axtens
41b7a347bf powerpc: Book3S 64-bit outline-only KASAN support
Implement a limited form of KASAN for Book3S 64-bit machines running under
the Radix MMU, supporting only outline mode.

 - Enable the compiler instrumentation to check addresses and maintain the
   shadow region. (This is the guts of KASAN which we can easily reuse.)

 - Require kasan-vmalloc support to handle modules and anything else in
   vmalloc space.

 - KASAN needs to be able to validate all pointer accesses, but we can't
   instrument all kernel addresses - only linear map and vmalloc. On boot,
   set up a single page of read-only shadow that marks all iomap and
   vmemmap accesses as valid.

 - Document KASAN in powerpc docs.

Background
----------

KASAN support on Book3S is a bit tricky to get right:

 - It would be good to support inline instrumentation so as to be able to
   catch stack issues that cannot be caught with outline mode.

 - Inline instrumentation requires a fixed offset.

 - Book3S runs code with translations off ("real mode") during boot,
   including a lot of generic device-tree parsing code which is used to
   determine MMU features.

    [ppc64 mm note: The kernel installs a linear mapping at effective
    address c000...-c008.... This is a one-to-one mapping with physical
    memory from 0000... onward. Because of how memory accesses work on
    powerpc 64-bit Book3S, a kernel pointer in the linear map accesses the
    same memory both with translations on (accessing as an 'effective
    address'), and with translations off (accessing as a 'real
    address'). This works in both guests and the hypervisor. For more
    details, see s5.7 of Book III of version 3 of the ISA, in particular
    the Storage Control Overview, s5.7.3, and s5.7.5 - noting that this
    KASAN implementation currently only supports Radix.]

 - Some code - most notably a lot of KVM code - also runs with translations
   off after boot.

 - Therefore any offset has to point to memory that is valid with
   translations on or off.

One approach is just to give up on inline instrumentation. This way
boot-time checks can be delayed until after the MMU is set is up, and we
can just not instrument any code that runs with translations off after
booting. Take this approach for now and require outline instrumentation.

Previous attempts allowed inline instrumentation. However, they came with
some unfortunate restrictions: only physically contiguous memory could be
used and it had to be specified at compile time. Maybe we can do better in
the future.

[paulus@ozlabs.org - Rebased onto 5.17.  Note that a kernel with
 CONFIG_KASAN=y will crash during boot on a machine using HPT
 translation because not all the entry points to the generic
 KASAN code are protected with a call to kasan_arch_is_ready().]

Originally-by: Balbir Singh <bsingharora@gmail.com> # ppc64 out-of-line radix version
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
[mpe: Update copyright year and comment formatting]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YoTE69OQwiG7z+Gu@cleo
2022-05-22 15:58:29 +10:00
Daniel Axtens
5352090a99 powerpc/kasan: Don't instrument non-maskable or raw interrupts
Disable address sanitization for raw and non-maskable interrupt
handlers, because they can run in real mode, where we cannot access
the shadow memory.  (Note that kasan_arch_is_ready() doesn't test for
real mode, since it is a static branch for speed, and in any case not
all the entry points to the generic KASAN code are protected by
kasan_arch_is_ready guards.)

The changes to interrupt_nmi_enter/exit_prepare() look larger than
they actually are.  The changes are equivalent to adding
!IS_ENABLED(CONFIG_KASAN) to the conditions for calling nmi_enter() or
nmi_exit() in real mode.  That is, the code is equivalent to using the
following condition for calling nmi_enter/exit:

	if (((!IS_ENABLED(CONFIG_PPC_BOOK3S_64) ||
			!firmware_has_feature(FW_FEATURE_LPAR) ||
			radix_enabled()) &&
		    !IS_ENABLED(CONFIG_KASAN) ||
		(mfmsr() & MSR_DR))

That unwieldy condition has been split into several statements with
comments, for easier reading.

The nmi_ipi_lock functions that call atomic functions (i.e.,
nmi_ipi_lock_start(), nmi_ipi_lock() and nmi_ipi_unlock()), besides
being marked noinstr, now call arch_atomic_* functions instead of
atomic_* functions because with KASAN enabled, the atomic_* functions
are wrappers which explicitly do address sanitization on their
arguments.  Since we are trying to avoid address sanitization, we have
to use the lower-level arch_atomic_* versions.

In hv_nmi_check_nonrecoverable(), the regs_set_unrecoverable() call
has been open-coded so as to avoid having to either trust the inlining
or mark regs_set_unrecoverable() as noinstr.

[paulus@ozlabs.org: combined a few work-in-progress commits of
 Daniel's and wrote the commit message.]

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YoTFGaKM8Pd46PIK@cleo
2022-05-22 15:58:29 +10:00
Russell Currey
d2a3c13198 powerpc/powernv: Get STF barrier requirements from device-tree
The device-tree property no-need-store-drain-on-priv-state-switch is
equivalent to H_CPU_BEHAV_NO_STF_BARRIER from the
H_CPU_GET_CHARACTERISTICS hcall on pseries.

Since commit 84ed26fd00 ("powerpc/security: Add a security feature for
STF barrier") powernv systems with this device-tree property have been
enabling the STF barrier when they have no need for it.  This patch
fixes this by clearing the STF barrier feature on those systems.

Fixes: 84ed26fd00 ("powerpc/security: Add a security feature for STF barrier")
Reported-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220404101536.104794-2-ruscur@russell.cc
2022-05-22 15:58:28 +10:00
Russell Currey
2efee6adb5 powerpc/powernv: Get L1D flush requirements from device-tree
The device-tree properties no-need-l1d-flush-msr-pr-1-to-0 and
no-need-l1d-flush-kernel-on-user-access are the equivalents of
H_CPU_BEHAV_NO_L1D_FLUSH_ENTRY and H_CPU_BEHAV_NO_L1D_FLUSH_UACCESS
from the H_GET_CPU_CHARACTERISTICS hcall on pseries respectively.

In commit d02fa40d75 ("powerpc/powernv: Remove POWER9 PVR version
check for entry and uaccess flushes") the condition for disabling the
L1D flush on kernel entry and user access was changed from any non-P9
CPU to only checking P7 and P8.  Without the appropriate device-tree
checks for newer processors on powernv, these flushes are unnecessarily
enabled on those systems.  This patch corrects this.

Fixes: d02fa40d75 ("powerpc/powernv: Remove POWER9 PVR version check for entry and uaccess flushes")
Reported-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220404101536.104794-1-ruscur@russell.cc
2022-05-22 15:58:28 +10:00
Haren Myneni
c127d130f6 powerpc/powernv/vas: Assign real address to rx_fifo in vas_rx_win_attr
In init_winctx_regs(), __pa() is called on winctx->rx_fifo and this
function is called to initialize registers for receive and fault
windows. But the real address is passed in winctx->rx_fifo for
receive windows and the virtual address for fault windows which
causes errors with DEBUG_VIRTUAL enabled. Fixes this issue by
assigning only real address to rx_fifo in vas_rx_win_attr struct
for both receive and fault windows.

Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/338e958c7ab8f3b266fa794a1f80f99b9671829e.camel@linux.ibm.com
2022-05-22 15:58:27 +10:00
Michael Ellerman
b104e41cda Merge branch 'topic/ppc-kvm' into next
Merge our KVM topic branch.
2022-05-19 23:10:42 +10:00
Alexey Kardashevskiy
cad32d9d42 KVM: PPC: Book3s: Retire H_PUT_TCE/etc real mode handlers
LoPAPR defines guest visible IOMMU with hypercalls to use it -
H_PUT_TCE/etc. Implemented first on POWER7 where hypercalls would trap
in the KVM in the real mode (with MMU off). The problem with the real mode
is some memory is not available and some API usage crashed the host but
enabling MMU was an expensive operation.

The problems with the real mode handlers are:
1. Occasionally these cannot complete the request so the code is
copied+modified to work in the virtual mode, very little is shared;
2. The real mode handlers have to be linked into vmlinux to work;
3. An exception in real mode immediately reboots the machine.

If the small DMA window is used, the real mode handlers bring better
performance. However since POWER8, there has always been a bigger DMA
window which VMs use to map the entire VM memory to avoid calling
H_PUT_TCE. Such 1:1 mapping happens once and uses H_PUT_TCE_INDIRECT
(a bulk version of H_PUT_TCE) which virtual mode handler is even closer
to its real mode version.

On POWER9 hypercalls trap straight to the virtual mode so the real mode
handlers never execute on POWER9 and later CPUs.

So with the current use of the DMA windows and MMU improvements in
POWER9 and later, there is no point in duplicating the code.
The 32bit passed through devices may slow down but we do not have many
of these in practice. For example, with this applied, a 1Gbit ethernet
adapter still demostrates above 800Mbit/s of actual throughput.

This removes the real mode handlers from KVM and related code from
the powernv platform.

This updates the list of implemented hcalls in KVM-HV as the realmode
handlers are removed.

This changes ABI - kvmppc_h_get_tce() moves to the KVM module and
kvmppc_find_table() is static now.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220506053755.3820702-1-aik@ozlabs.ru
2022-05-19 00:44:01 +10:00
Christophe Leroy
e6f6390ab7 powerpc: Add missing headers
Don't inherit headers "by chances" from asm/prom.h, asm/mpc52xx.h,
asm/pci.h etc...

Include the needed headers, and remove asm/prom.h when it was
needed exclusively for pulling necessary headers.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/be8bdc934d152a7d8ee8d1a840d5596e2f7d85e0.1646767214.git.christophe.leroy@csgroup.eu
2022-05-08 22:15:40 +10:00
Christophe Leroy
86c38fec69 powerpc: Remove asm/prom.h from all files that don't need it
Several files include asm/prom.h for no reason.

Clean it up.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Drop change to prom_parse.c as reported by lkp@intel.com]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/7c9b8fda63dcf63e1b28f43e7ebdb95182cbc286.1646767214.git.christophe.leroy@csgroup.eu
2022-05-08 22:15:04 +10:00
Julia Lawall
1fd02f6605 powerpc: fix typos in comments
Various spelling mistakes in comments.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220430185654.5855-1-Julia.Lawall@inria.fr
2022-05-05 22:12:44 +10:00
Dwaipayan Ray
4ac751d3f3 powerpc/powernv: Switch from __FUNCTION__ to __func__
__FUNCTION__ exists only for backwards compatibility reasons with old
gcc versions. Replace it with __func__.

Signed-off-by: Dwaipayan Ray <dwaipayanray1@gmail.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210711084837.95577-1-dwaipayanray1@gmail.com
2022-05-04 19:37:44 +10:00
Hari Bathini
a3ceb5882e powerpc/fadump: print start of preserved area
Print preserved area start address in fadump_region_show() function.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220406093839.206608-4-hbathini@linux.ibm.com
2022-04-26 22:38:19 +10:00
Hari Bathini
b74196af37 powerpc/fadump: Fix fadump to work with a different endian capture kernel
Dump capture would fail if capture kernel is not of the endianess as the
production kernel, because the in-memory data structure (struct
opal_fadump_mem_struct) shared across production kernel and capture
kernel assumes the same endianess for both the kernels, which doesn't
have to be true always. Fix it by having a well-defined endianess for
struct opal_fadump_mem_struct.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/161902744901.86147.14719228311655123526.stgit@hbathini
2022-04-26 22:35:46 +10:00
Brian Gerst
9554e908fb ELF: Remove elf_core_copy_kernel_regs()
x86-32 was the last architecture that implemented separate user and
kernel registers.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220325153953.162643-3-brgerst@gmail.com
2022-04-14 14:08:26 +02:00
Linus Torvalds
148a650476 pci-v5.18-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmI7iOwUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vxkuhAAtJkVwfeyUjZ8sms+qWdZaucJmFF1
 PDeKy8O8upLzRRykdWoAOjKKVcCB9ohxBjPMco2oYNTmSozxeau8jjMA9OTQvTOS
 ZhDDoi49/vHRHuq3WIeAMCuk7tH3H1L3f0UHJxJ3H/oObQ+eMsitPcGFK+QrISDX
 pYokOnXZvf7BT7NpVtogSe2mhniOD1zQSicAMiH6WKNHHZcxewrzV9LP3MFOoBAr
 VMhlhzJbOp9spvCt7M1DycJEQ2RNe+wGLBFDalhPuprwnkNchRV+0AwWfD90zc9u
 h/0J8jkXfqS6QfSd/lOlTvI6kGsV8UKZEt4h4X/hlHFebFM5ktD9X7GmcoYUDFd9
 aHV3I/Jf62uGJ31IrT0V/cSYNlMO+IVFwXLGir4B1cFPOkzyIG/i60iV/C6bnnCa
 TCMH6vxalFycYaHBFqw/K/Dlq+mrAX74nQDfbk8y6rprczM1BN220Z8BkpG13TBu
 MxgCEul2/BJmNcPS1IWb/mCfBy+rdrVn2DZuID3J9KTwKNOUTIuAF0FuxLP4Bk4o
 sti3vKIXOcHnAcJB9tEnpEfstPv2JT13eWDIMmp/qCwqcujOvsg/DSYrx+8ogmBF
 DJ/sbPy3BdIOAeTgepWHAxYcv9SlZTGJGl+oaR1zV0qLBogyQUWZ9Ijx5aAEAw3j
 AJicpdk3BkH3LC8=
 =5Q9H
 -----END PGP SIGNATURE-----

Merge tag 'pci-v5.18-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull pci updates from Bjorn Helgaas:
 "Enumeration:
   - Move the VGA arbiter from drivers/gpu to drivers/pci because it's
     PCI-specific, not GPU-specific (Bjorn Helgaas)
   - Select the default VGA device consistently whether it's enumerated
     before or after VGA arbiter init, which fixes arches that enumerate
     PCI devices late (Huacai Chen)

  Resource management:
   - Support BAR sizes up to 8TB (Dongdong Liu)

  PCIe native device hotplug:
   - Fix "Command Completed" tracking to avoid spurious timouts when
     powering off empty slots (Liguang Zhang)
   - Quirk Qualcomm devices that don't implement Command Completed
     correctly, again to avoid spurious timeouts (Manivannan Sadhasivam)

  Peer-to-peer DMA:
   - Add Intel 3rd Gen Intel Xeon Scalable Processors to whitelist
     (Michael J. Ruhl)

  APM X-Gene PCIe controller driver:
   - Revert generic DT parsing changes that broke some machines in the
     field (Marc Zyngier)

  Freescale i.MX6 PCIe controller driver:
   - Allow controller probe to succeed even when no devices currently
     present to allow hot-add later (Fabio Estevam)
   - Enable power management on i.MX6QP (Richard Zhu)
   - Assert CLKREQ# on i.MX8MM so enumeration doesn't hang when no
     device is connected (Richard Zhu)

  Marvell Aardvark PCIe controller driver:
   - Fix MSI and MSI-X support (Marek Behún, Pali Rohár)
   - Add support for ERR and PME interrupts (Pali Rohár)

  Marvell MVEBU PCIe controller driver:
   - Add DT binding and support for "num-lanes" (Pali Rohár)
   - Add support for INTx interrupts (Pali Rohár)

  Microsoft Hyper-V host bridge driver:
   - Avoid unnecessary hypercalls when unmasking IRQs on ARM64 (Boqun
     Feng)

  Qualcomm PCIe controller driver:
   - Add SM8450 DT binding and driver support (Dmitry Baryshkov)

  Renesas R-Car PCIe controller driver:
   - Help the controller get to the L1 state since the hardware can't do
     it on its own (Marek Vasut)
   - Return PCI_ERROR_RESPONSE (~0) for reads that fail on PCIe (Marek
     Vasut)

  SiFive FU740 PCIe controller driver:
   - Drop redundant '-gpios' from DT GPIO lookup (Ben Dooks)
   - Force 2.5GT/s for initial device probe (Ben Dooks)

  Socionext UniPhier Pro5 controller driver:
   - Add NX1 DT binding and driver support (Kunihiko Hayashi)

  Synopsys DesignWare PCIe controller driver:
   - Restore MSI configuration so MSI works after resume (Jisheng
     Zhang)"

* tag 'pci-v5.18-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (94 commits)
  x86/PCI: Add #includes to asm/pci_x86.h
  PCI: ibmphp: Remove unused assignments
  PCI: cpqphp: Remove unused assignments
  PCI: fu740: Remove unused assignments
  PCI: kirin: Remove unused assignments
  PCI: Remove unused assignments
  PCI: Declare pci_filp_private only when HAVE_PCI_MMAP
  PCI: Avoid broken MSI on SB600 USB devices
  PCI: fu740: Force 2.5GT/s for initial device probe
  PCI: xgene: Revert "PCI: xgene: Fix IB window setup"
  PCI: xgene: Revert "PCI: xgene: Use inbound resources for setup"
  PCI: imx6: Assert i.MX8MM CLKREQ# even if no device present
  PCI: imx6: Invoke the PHY exit function after PHY power off
  PCI: rcar: Use PCI_SET_ERROR_RESPONSE after read which triggered an exception
  PCI: rcar: Finish transition to L1 state in rcar_pcie_config_access()
  PCI: dwc: Restore MSI Receiver mask during resume
  PCI: fu740: Drop redundant '-gpios' from DT GPIO lookup
  PCI/VGA: Replace full MIT license text with SPDX identifier
  PCI/VGA: Use unsigned format string to print lock counts
  PCI/VGA: Log bridge control messages when adding devices
  ...
2022-03-25 13:02:05 -07:00
Rohan McLure
6b3a3e12f8 powerpc: declare unmodified attribute_group usages const
Inspired by (bd75b4ef49: Constify static attribute_group structs),
accepted by linux-next, reported:
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20220210202805.7750-4-rikard.falkeborn@gmail.com/

Nearly all singletons of type struct attribute_group are never modified,
and so are candidates for being const. Declare them as const.

Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220307231414.86560-1-rmclure@linux.ibm.com
2022-03-08 22:15:32 +11:00
Christophe Leroy
76222808fc powerpc: Move C prototypes out of asm-prototypes.h
We originally added asm-prototypes.h in commit 42f5b4cacd ("powerpc:
Introduce asm-prototypes.h"). It's purpose was for prototypes of C
functions that are only called from asm, in order to fix sparse
warnings about missing prototypes.

A few months later Nick added a different use case in
commit 4efca4ed05 ("kbuild: modversions for EXPORT_SYMBOL() for asm")
for C prototypes for exported asm functions. This is basically the
inverse of our original usage.

Since then we've added various prototypes to asm-prototypes.h for both
reasons, meaning we now need to unstitch it all.

Dispatch prototypes of C functions into relevant headers and keep
only the prototypes for functions defined in assembly.

For the time being, leave prom_init() there because moving it
into asm/prom.h or asm/setup.h conflicts with
drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowrom.o
This will be fixed later by untaggling asm/pci.h and asm/prom.h
or by renaming the function in shadowrom.c

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/62d46904eca74042097acf4cb12c175e3067f3d1.1646413435.git.christophe.leroy@csgroup.eu
2022-03-08 22:06:25 +11:00
Anders Roxell
8667d0d64d powerpc: Fix build errors with newer binutils
Building tinyconfig with gcc (Debian 11.2.0-16) and assembler (Debian
2.37.90.20220207) the following build error shows up:

  {standard input}: Assembler messages:
  {standard input}:1190: Error: unrecognized opcode: `stbcix'
  {standard input}:1433: Error: unrecognized opcode: `lwzcix'
  {standard input}:1453: Error: unrecognized opcode: `stbcix'
  {standard input}:1460: Error: unrecognized opcode: `stwcix'
  {standard input}:1596: Error: unrecognized opcode: `stbcix'
  ...

Rework to add assembler directives [1] around the instruction. Going
through them one by one shows that the changes should be safe.  Like
__get_user_atomic_128_aligned() is only called in p9_hmi_special_emu(),
which according to the name is specific to power9.  And __raw_rm_read*()
are only called in things that are powernv or book3s_hv specific.

[1] https://sourceware.org/binutils/docs/as/PowerPC_002dPseudo.html#PowerPC_002dPseudo

Cc: stable@vger.kernel.org
Co-developed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
[mpe: Make commit subject more descriptive]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220224162215.3406642-2-anders.roxell@linaro.org
2022-03-01 23:51:08 +11:00
Pali Rohár
904b10fb18 PCI: Add defines for normal and subtractive PCI bridges
Add these PCI class codes to pci_ids.h:

  PCI_CLASS_BRIDGE_PCI_NORMAL
  PCI_CLASS_BRIDGE_PCI_SUBTRACTIVE

Use these defines in all kernel code for describing PCI class codes for
normal and subtractive PCI bridges.

[bhelgaas: similar change in pci-mvebu.c]
Link: https://lore.kernel.org/r/20220214114109.26809-1-pali@kernel.org
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2022-02-17 15:29:35 -06:00
Linus Torvalds
29ec39fcf1 powerpc updates for 5.17
- Optimise radix KVM guest entry/exit by 2x on Power9/Power10.
 
  - Allow firmware to tell us whether to disable the entry and uaccess flushes on Power10
    or later CPUs.
 
  - Add BPF_PROBE_MEM support for 32 and 64-bit BPF jits.
 
  - Several fixes and improvements to our hard lockup watchdog.
 
  - Activate HAVE_DYNAMIC_FTRACE_WITH_REGS on 32-bit.
 
  - Allow building the 64-bit Book3S kernel without hash MMU support, ie. Radix only.
 
  - Add KUAP (SMAP) support for 40x, 44x, 8xx, Book3E (64-bit).
 
  - Add new encodings for perf_mem_data_src.mem_hops field, and use them on Power10.
 
  - A series of small performance improvements to 64-bit interrupt entry.
 
  - Several commits fixing issues when building with the clang integrated assembler.
 
  - Many other small features and fixes.
 
 Thanks to: Alan Modra, Alexey Kardashevskiy, Ammar Faizi, Anders Roxell, Arnd Bergmann,
 Athira Rajeev, Cédric Le Goater, Christophe JAILLET, Christophe Leroy, Christoph Hellwig,
 Daniel Axtens, David Yang, Erhard Furtner, Fabiano Rosas, Greg Kroah-Hartman, Guo Ren,
 Hari Bathini, Jason Wang, Joel Stanley, Julia Lawall, Kajol Jain, Kees Cook, Laurent
 Dufour, Madhavan Srinivasan, Mark Brown, Minghao Chi, Nageswara R Sastry, Naresh Kamboju,
 Nathan Chancellor, Nathan Lynch, Nicholas Piggin, Nick Child, Oliver O'Halloran, Peiwei
 Hu, Randy Dunlap, Ravi Bangoria, Rob Herring, Russell Currey, Sachin Sant, Sean
 Christopherson, Segher Boessenkool, Thadeu Lima de Souza Cascardo, Tyrel Datwyler, Xiang
 wangx, Yang Guang.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmHhVFMTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgKwzD/9UUEZzWyzMVRJvP9FPZByN2M8czxHJ
 tuqEuVqnfks8ad8tfm2ebng5t8ZuVASBQU2fpPA1+lpdvprgZN5RFGMRh729vskn
 2aHQPmFvFObNbXOgCoXzk+C5xYi3zoRMVM968neSPBneYo+xDicn/zN5CHAgsjhX
 +baemJQ7/xzwLiZgTHe8fWw3nTk3IbPBpha59SdTvR8Moy6I4O8CDPIYEm3U3/J3
 x14ZRETqjksL7YOzEBk0avm1dDZRw/johz29oRYSmCj7dyy5OqrkPwokJiRY90eA
 1lVdofDc0zElaSWkVGzKdSWRUIXjKIVdtejvDeEvl6H/mI6q4TVZE8rFmn+3Rvgf
 9q0iKtmw5Kn11cqgY/pgEGmxnQtIdAodNfI/t939E7+O5LbcznuYUiy0J/kTD/vl
 Xduotg2dsCI+5ukf1wrk2wt9LhqZL+ziOeaBhyDM4orV8T3HBYL6zWBptun//IGO
 lK6TvvCHSYnGqY4bnrAmiOnbbEtnP6nN3zbcXgSvPM0wCRHPIEqd0NRXtfISo32d
 vBPq1neXWo4wrRJj9X3yOuP+5fEA4I+hB3yrCJOkcEcz+8NhlboQXU7raVsJL+bd
 kze75H8hwX7kE71oJFFl13LbSNABgiLFARTBXKfvdQA2iLdR0Snvm+OouvwWRPo/
 Po7Nm3zqdLc/1A==
 =BxhQ
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Optimise radix KVM guest entry/exit by 2x on Power9/Power10.

 - Allow firmware to tell us whether to disable the entry and uaccess
   flushes on Power10 or later CPUs.

 - Add BPF_PROBE_MEM support for 32 and 64-bit BPF jits.

 - Several fixes and improvements to our hard lockup watchdog.

 - Activate HAVE_DYNAMIC_FTRACE_WITH_REGS on 32-bit.

 - Allow building the 64-bit Book3S kernel without hash MMU support, ie.
   Radix only.

 - Add KUAP (SMAP) support for 40x, 44x, 8xx, Book3E (64-bit).

 - Add new encodings for perf_mem_data_src.mem_hops field, and use them
   on Power10.

 - A series of small performance improvements to 64-bit interrupt entry.

 - Several commits fixing issues when building with the clang integrated
   assembler.

 - Many other small features and fixes.

Thanks to Alan Modra, Alexey Kardashevskiy, Ammar Faizi, Anders Roxell,
Arnd Bergmann, Athira Rajeev, Cédric Le Goater, Christophe JAILLET,
Christophe Leroy, Christoph Hellwig, Daniel Axtens, David Yang, Erhard
Furtner, Fabiano Rosas, Greg Kroah-Hartman, Guo Ren, Hari Bathini, Jason
Wang, Joel Stanley, Julia Lawall, Kajol Jain, Kees Cook, Laurent Dufour,
Madhavan Srinivasan, Mark Brown, Minghao Chi, Nageswara R Sastry, Naresh
Kamboju, Nathan Chancellor, Nathan Lynch, Nicholas Piggin, Nick Child,
Oliver O'Halloran, Peiwei Hu, Randy Dunlap, Ravi Bangoria, Rob Herring,
Russell Currey, Sachin Sant, Sean Christopherson, Segher Boessenkool,
Thadeu Lima de Souza Cascardo, Tyrel Datwyler, Xiang wangx, and Yang
Guang.

* tag 'powerpc-5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (240 commits)
  powerpc/xmon: Dump XIVE information for online-only processors.
  powerpc/opal: use default_groups in kobj_type
  powerpc/cacheinfo: use default_groups in kobj_type
  powerpc/sched: Remove unused TASK_SIZE_OF
  powerpc/xive: Add missing null check after calling kmalloc
  powerpc/floppy: Remove usage of the deprecated "pci-dma-compat.h" API
  selftests/powerpc: Add a test of sigreturning to an unaligned address
  powerpc/64s: Use EMIT_WARN_ENTRY for SRR debug warnings
  powerpc/64s: Mask NIP before checking against SRR0
  powerpc/perf: Fix spelling of "its"
  powerpc/32: Fix boot failure with GCC latent entropy plugin
  powerpc/code-patching: Replace patch_instruction() by ppc_inst_write() in selftests
  powerpc/code-patching: Move code patching selftests in its own file
  powerpc/code-patching: Move instr_is_branch_{i/b}form() in code-patching.h
  powerpc/code-patching: Move patch_exception() outside code-patching.c
  powerpc/code-patching: Use test_trampoline for prefixed patch test
  powerpc/code-patching: Fix patch_branch() return on out-of-range failure
  powerpc/code-patching: Reorganise do_patch_instruction() to ease error handling
  powerpc/code-patching: Fix unmap_patch_area() error handling
  powerpc/code-patching: Fix error handling in do_patch_instruction()
  ...
2022-01-14 15:17:26 +01:00
Greg Kroah-Hartman
32a1bda4b1 powerpc/opal: use default_groups in kobj_type
There are currently 2 ways to create a set of sysfs files for a
kobj_type, through the default_attrs field, and the default_groups
field.  Move the powerpc opal dump and elog sysfs code to use
default_groups field which has been the preferred way since aa30f47cf6
("kobject: Add support for default attribute groups to kobj_type") so
that we can soon get rid of the obsolete default_attrs field.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220104161318.1306023-1-gregkh@linuxfoundation.org
2022-01-05 10:59:52 +11:00
Nick Child
e5913db1ef powerpc/powernv: Add __init attribute to eligible functions
Some functions defined in 'arch/powerpc/platforms/powernv' are
deserving of an `__init` macro attribute. These functions are only
called by other initialization functions and therefore should inherit
the attribute.
Also, change function declarations in header files to include `__init`.

Signed-off-by: Nick Child <nick.child@ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211216220035.605465-12-nick.child@ibm.com
2021-12-23 22:33:15 +11:00
Nicholas Piggin
387e220a2e powerpc/64s: Move hash MMU support code under CONFIG_PPC_64S_HASH_MMU
Compiling out hash support code when CONFIG_PPC_64S_HASH_MMU=n saves
128kB kernel image size (90kB text) on powernv_defconfig minus KVM,
350kB on pseries_defconfig minus KVM, 40kB on a tiny config.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Fixup defined(ARCH_HAS_MEMREMAP_COMPAT_ALIGN), which needs CONFIG.
      Fix radix_enabled() use in setup_initial_memory_limit(). Add some
      stubs to reduce number of ifdefs.]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211201144153.2456614-18-npiggin@gmail.com
2021-12-09 22:41:13 +11:00
Nicholas Piggin
c28573744b powerpc/64s: Make hash MMU support configurable
This adds Kconfig selection which allows 64s hash MMU support to be
disabled. It can be disabled if radix support is enabled, the minimum
supported CPU type is POWER9 (or higher), and KVM is not selected.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211201144153.2456614-17-npiggin@gmail.com
2021-12-09 22:40:24 +11:00
Thomas Gleixner
e58f2259b9 genirq/msi, treewide: Use a named struct for PCI/MSI attributes
The unnamed struct sucks and is in the way of further cleanups. Stick the
PCI related MSI data into a real data structure and cleanup all users.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211206210224.374863119@linutronix.de
2021-12-09 11:52:21 +01:00
Nicholas Piggin
7ebc49031d powerpc: Rename PPC_NATIVE to PPC_HASH_MMU_NATIVE
PPC_NATIVE now only controls the native HPT code, so rename it to be
more descriptive. Restrict it to Book3S only.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211201144153.2456614-3-npiggin@gmail.com
2021-12-02 22:57:22 +11:00
Nicholas Piggin
b350111bf7 powerpc: remove cpu_online_cores_map function
This function builds the cores online map with on-stack cpumasks which
can cause high stack usage with large NR_CPUS.

It is not used in any performance sensitive paths, so instead just check
for first thread sibling.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211105035042.1398309-1-npiggin@gmail.com
2021-11-29 22:48:32 +11:00
Nicholas Piggin
d02fa40d75 powerpc/powernv: Remove POWER9 PVR version check for entry and uaccess flushes
These aren't necessarily POWER9 only, and it's not to say some new
vulnerability may not get discovered on other processors for which
we would like the flexibility of having the workaround enabled by
firmware.

Remove the restriction that the workarounds only apply to POWER9.

However POWER7 and POWER8 are not affected, and they may not have
older firmware that does not advertise this, so clear these workarounds
manually.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
[mpe: Incorporate changes from Nick, reword comment slightly.]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210503130243.891868-5-npiggin@gmail.com
2021-11-25 11:25:29 +11:00
Julia Lawall
7d405a939c powerpc/powernv: add missing of_node_put
for_each_compatible_node performs an of_node_get on each iteration, so
a break out of the loop requires an of_node_put.

A simplified version of the semantic patch that fixes this problem is as
follows (http://coccinelle.lip6.fr):

// <smpl>
@@
local idexpression n;
expression e;
@@

 for_each_compatible_node(n,...) {
   ...
(
   of_node_put(n);
|
   e = n
|
+  of_node_put(n);
?  break;
)
   ...
 }
... when != n
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1448051604-25256-4-git-send-email-Julia.Lawall@lip6.fr
2021-11-25 11:25:29 +11:00
Nicholas Piggin
46f9caf1a2 powerpc/64s: Keep AMOR SPR a constant ~0 at runtime
This register controls supervisor SPR modifications, and as such is only
relevant for KVM. KVM always sets AMOR to ~0 on guest entry, and never
restores it coming back out to the host, so it can be kept constant and
avoid the mtSPR in KVM guest entry.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-10-npiggin@gmail.com
2021-11-24 21:08:57 +11:00
Nicholas Piggin
f53884b1bf powerpc/64s: Remove WORT SPR from POWER9/10 (take 2)
This removes a missed remnant of the WORT SPR.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211123095231.1036501-2-npiggin@gmail.com
2021-11-24 21:08:56 +11:00
Linus Torvalds
dd72945c43 cxl for v5.16
- Fix support for platforms that do not enumerate every ACPI0016 (CXL
   Host Bridge) in the CHBS (ACPI Host Bridge Structure).
 
 - Introduce a common pci_find_dvsec_capability() helper, clean up open
   coded implementations in various drivers.
 
 - Add 'cxl_test' for regression testing CXL subsystem ABIs. 'cxl_test'
   is a module built from tools/testing/cxl/ that mocks up a CXL topology
   to augment the nascent support for emulation of CXL devices in QEMU.
 
 - Convert libnvdimm to use the uuid API.
 
 - Complete the definition of CXL namespace labels in libnvdimm.
 
 - Tunnel libnvdimm label operations from nd_ioctl() back to the CXL
   mailbox driver. Enable 'ndctl {read,write}-labels' for CXL.
 
 - Continue to sort and refactor functionality into distinct driver and
   core-infrastructure buckets. For example, mailbox handling is now a
   generic core capability consumed by the PCI and cxl_test drivers.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCYYRtyQAKCRDfioYZHlFs
 Z7UsAP9DzUN6IWWnYk1R95YXYVxFriRtRsBjujAqTg49EMghawEAoHaA9lxO3Hho
 l25TLYUOmB/zFTlUbe6YQptMJZ5YLwY=
 =im9j
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull cxl updates from Dan Williams:
 "More preparation and plumbing work in the CXL subsystem.

  From an end user perspective the highlight here is lighting up the CXL
  Persistent Memory related commands (label read / write) with the
  generic ioctl() front-end in LIBNVDIMM.

  Otherwise, the ability to instantiate new persistent and volatile
  memory regions is still on track for v5.17.

  Summary:

   - Fix support for platforms that do not enumerate every ACPI0016 (CXL
     Host Bridge) in the CHBS (ACPI Host Bridge Structure).

   - Introduce a common pci_find_dvsec_capability() helper, clean up
     open coded implementations in various drivers.

   - Add 'cxl_test' for regression testing CXL subsystem ABIs.
     'cxl_test' is a module built from tools/testing/cxl/ that mocks up
     a CXL topology to augment the nascent support for emulation of CXL
     devices in QEMU.

   - Convert libnvdimm to use the uuid API.

   - Complete the definition of CXL namespace labels in libnvdimm.

   - Tunnel libnvdimm label operations from nd_ioctl() back to the CXL
     mailbox driver. Enable 'ndctl {read,write}-labels' for CXL.

   - Continue to sort and refactor functionality into distinct driver
     and core-infrastructure buckets. For example, mailbox handling is
     now a generic core capability consumed by the PCI and cxl_test
     drivers"

* tag 'cxl-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (34 commits)
  ocxl: Use pci core's DVSEC functionality
  cxl/pci: Use pci core's DVSEC functionality
  PCI: Add pci_find_dvsec_capability to find designated VSEC
  cxl/pci: Split cxl_pci_setup_regs()
  cxl/pci: Add @base to cxl_register_map
  cxl/pci: Make more use of cxl_register_map
  cxl/pci: Remove pci request/release regions
  cxl/pci: Fix NULL vs ERR_PTR confusion
  cxl/pci: Remove dev_dbg for unknown register blocks
  cxl/pci: Convert register block identifiers to an enum
  cxl/acpi: Do not fail cxl_acpi_probe() based on a missing CHBS
  cxl/pci: Disambiguate cxl_pci further from cxl_mem
  Documentation/cxl: Add bus internal docs
  cxl/core: Split decoder setup into alloc + add
  tools/testing/cxl: Introduce a mock memory device + driver
  cxl/mbox: Move command definitions to common location
  cxl/bus: Populate the target list at decoder create
  tools/testing/cxl: Introduce a mocked-up CXL port hierarchy
  cxl/pmem: Add support for multiple nvdimm-bridge objects
  cxl/pmem: Translate NVDIMM label commands to CXL label commands
  ...
2021-11-08 11:49:48 -08:00
Linus Torvalds
0c5c62ddf8 pci-v5.16-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmGFXBkUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vx6Tg/7BsGWm8f+uw/mr9lLm47q2mc4XyoO
 7bR9KDp5NM84W/8ZOU7dqqqsnY0ddrSOLBRyhJJYMW3SwJd1y1ajTBsL1Ujqv+eN
 z+JUFmhq4Laqm4k6Spc9CEJE+Ol5P6gGUtxLYo6PM2R0VxnSs/rDxctT5i7YOpCi
 COJ+NVT/mc/by2loz1kLTSR9GgtBBgd+Y8UA33GFbHKssROw02L0OI3wffp81Oba
 EhMGPoD+0FndAniDw+vaOSoO+YaBuTfbM92T/O00mND69Fj1PWgmNWZz7gAVgsXb
 3RrNENUFxgw6CDt7LZWB8OyT04iXe0R2kJs+PA9gigFCGbypwbd/Nbz5M7e9HUTR
 ray+1EpZib6+nIksQBL2mX8nmtyHMcLiM57TOEhq0+ECDO640MiRm8t0FIG/1E8v
 3ZYd9w20o/NxlFNXHxxpZ3D/osGH5ocyF5c5m1rfB4RGRwztZGL172LWCB0Ezz9r
 eHB8sWxylxuhrH+hp2BzQjyddg7rbF+RA4AVfcQSxUpyV01hoRocKqknoDATVeLH
 664nJIINFxKJFwfuL3E6OhrInNe1LnAhCZsHHqbS+NNQFgvPRznbixBeLkI9dMf5
 Yf6vpsWO7ur8lHHbRndZubVu8nxklXTU7B/w+C11sq6k9LLRJSHzanr3Fn9WA80x
 sznCxwUvbTCu1r0=
 =nsMh
 -----END PGP SIGNATURE-----

Merge tag 'pci-v5.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull pci updates from Bjorn Helgaas:
 "Enumeration:
   - Conserve IRQs by setting up portdrv IRQs only when there are users
     (Jan Kiszka)
   - Rework and simplify _OSC negotiation for control of PCIe features
     (Joerg Roedel)
   - Remove struct pci_dev.driver pointer since it's redundant with the
     struct device.driver pointer (Uwe Kleine-König)

  Resource management:
   - Coalesce contiguous host bridge apertures from _CRS to accommodate
     BARs that cover more than one aperture (Kai-Heng Feng)

  Sysfs:
   - Check CAP_SYS_ADMIN before parsing user input (Krzysztof
     Wilczyński)
   - Return -EINVAL consistently from "store" functions (Krzysztof
     Wilczyński)
   - Use sysfs_emit() in endpoint "show" functions to avoid buffer
     overruns (Kunihiko Hayashi)

  PCIe native device hotplug:
   - Ignore Link Down/Up caused by resets during error recovery so
     endpoint drivers can remain bound to the device (Lukas Wunner)

  Virtualization:
   - Avoid bus resets on Atheros QCA6174, where they hang the device
     (Ingmar Klein)
   - Work around Pericom PI7C9X2G switch packet drop erratum by using
     store and forward mode instead of cut-through (Nathan Rossi)
   - Avoid trying to enable AtomicOps on VFs; the PF setting applies to
     all VFs (Selvin Xavier)

  MSI:
   - Document that /sys/bus/pci/devices/.../irq contains the legacy INTx
     interrupt or the IRQ of the first MSI (not MSI-X) vector (Barry
     Song)

  VPD:
   - Add pci_read_vpd_any() and pci_write_vpd_any() to access anywhere
     in the possible VPD space; use these to simplify the cxgb3 driver
     (Heiner Kallweit)

  Peer-to-peer DMA:
   - Add (not subtract) the bus offset when calculating DMA address
     (Wang Lu)

  ASPM:
   - Re-enable LTR at Downstream Ports so they don't report Unsupported
     Requests when reset or hot-added devices send LTR messages
     (Mingchuang Qiao)

  Apple PCIe controller driver:
   - Add driver for Apple M1 PCIe controller (Alyssa Rosenzweig, Marc
     Zyngier)

  Cadence PCIe controller driver:
   - Return success when probe succeeds instead of falling into error
     path (Li Chen)

  HiSilicon Kirin PCIe controller driver:
   - Reorganize PHY logic and add support for external PHY drivers
     (Mauro Carvalho Chehab)
   - Support PERST# GPIOs for HiKey970 external PEX 8606 bridge (Mauro
     Carvalho Chehab)
   - Add Kirin 970 support (Mauro Carvalho Chehab)
   - Make driver removable (Mauro Carvalho Chehab)

  Intel VMD host bridge driver:
   - If IOMMU supports interrupt remapping, leave VMD MSI-X remapping
     enabled (Adrian Huang)
   - Number each controller so we can tell them apart in
     /proc/interrupts (Chunguang Xu)
   - Avoid building on UML because VMD depends on x86 bare metal APIs
     (Johannes Berg)

  Marvell Aardvark PCIe controller driver:
   - Define macros for PCI_EXP_DEVCTL_PAYLOAD_* (Pali Rohár)
   - Set Max Payload Size to 512 bytes per Marvell spec (Pali Rohár)
   - Downgrade PIO Response Status messages to debug level (Marek Behún)
   - Preserve CRS SV (Config Request Retry Software Visibility) bit in
     emulated Root Control register (Pali Rohár)
   - Fix issue in configuring reference clock (Pali Rohár)
   - Don't clear status bits for masked interrupts (Pali Rohár)
   - Don't mask unused interrupts (Pali Rohár)
   - Avoid code repetition in advk_pcie_rd_conf() (Marek Behún)
   - Retry config accesses on CRS response (Pali Rohár)
   - Simplify emulated Root Capabilities initialization (Pali Rohár)
   - Fix several link training issues (Pali Rohár)
   - Fix link-up checking via LTSSM (Pali Rohár)
   - Fix reporting of Data Link Layer Link Active (Pali Rohár)
   - Fix emulation of W1C bits (Marek Behún)
   - Fix MSI domain .alloc() method to return zero on success (Marek
     Behún)
   - Read entire 16-bit MSI vector in MSI handler, not just low 8 bits
     (Marek Behún)
   - Clear Root Port I/O Space, Memory Space, and Bus Master Enable bits
     at startup; PCI core will set those as necessary (Pali Rohár)
   - When operating as a Root Port, set class code to "PCI Bridge"
     instead of the default "Mass Storage Controller" (Pali Rohár)
   - Add emulation for PCI_BRIDGE_CTL_BUS_RESET since aardvark doesn't
     implement this per spec (Pali Rohár)
   - Add emulation of option ROM BAR since aardvark doesn't implement
     this per spec (Pali Rohár)

  MediaTek MT7621 PCIe controller driver:
   - Add MediaTek MT7621 PCIe host controller driver and DT binding
     (Sergio Paracuellos)

  Qualcomm PCIe controller driver:
   - Add SC8180x compatible string (Bjorn Andersson)
   - Add endpoint controller driver and DT binding (Manivannan
     Sadhasivam)
   - Restructure to use of_device_get_match_data() (Prasad Malisetty)
   - Add SC7280-specific pcie_1_pipe_clk_src handling (Prasad Malisetty)

  Renesas R-Car PCIe controller driver:
   - Remove unnecessary includes (Geert Uytterhoeven)

  Rockchip DesignWare PCIe controller driver:
   - Add DT binding (Simon Xue)

  Socionext UniPhier Pro5 controller driver:
   - Serialize INTx masking/unmasking (Kunihiko Hayashi)

  Synopsys DesignWare PCIe controller driver:
   - Run dwc .host_init() method before registering MSI interrupt
     handler so we can deal with pending interrupts left by bootloader
     (Bjorn Andersson)
   - Clean up Kconfig dependencies (Andy Shevchenko)
   - Export symbols to allow more modular drivers (Luca Ceresoli)

  TI DRA7xx PCIe controller driver:
   - Allow host and endpoint drivers to be modules (Luca Ceresoli)
   - Enable external clock if present (Luca Ceresoli)

  TI J721E PCIe driver:
   - Disable PHY when probe fails after initializing it (Christophe
     JAILLET)

  MicroSemi Switchtec management driver:
   - Return error to application when command execution fails because an
     out-of-band reset has cleared the device BARs, Memory Space Enable,
     etc (Kelvin Cao)
   - Fix MRPC error status handling issue (Kelvin Cao)
   - Mask out other bits when reading of management VEP instance ID
     (Kelvin Cao)
   - Return EOPNOTSUPP instead of ENOTSUPP from sysfs show functions
     (Kelvin Cao)
   - Add check of event support (Logan Gunthorpe)

  Miscellaneous:
   - Remove unused pci_pool wrappers, which have been replaced by
     dma_pool (Cai Huoqing)
   - Use 'unsigned int' instead of bare 'unsigned' (Krzysztof
     Wilczyński)
   - Use kstrtobool() directly, sans strtobool() wrapper (Krzysztof
     Wilczyński)
   - Fix some sscanf(), sprintf() format mismatches (Krzysztof
     Wilczyński)
   - Update PCI subsystem information in MAINTAINERS (Krzysztof
     Wilczyński)
   - Correct some misspellings (Krzysztof Wilczyński)"

* tag 'pci-v5.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (137 commits)
  PCI: Add ACS quirk for Pericom PI7C9X2G switches
  PCI: apple: Configure RID to SID mapper on device addition
  iommu/dart: Exclude MSI doorbell from PCIe device IOVA range
  PCI: apple: Implement MSI support
  PCI: apple: Add INTx and per-port interrupt support
  PCI: kirin: Allow removing the driver
  PCI: kirin: De-init the dwc driver
  PCI: kirin: Disable clkreq during poweroff sequence
  PCI: kirin: Move the power-off code to a common routine
  PCI: kirin: Add power_off support for Kirin 960 PHY
  PCI: kirin: Allow building it as a module
  PCI: kirin: Add MODULE_* macros
  PCI: kirin: Add Kirin 970 compatible
  PCI: kirin: Support PERST# GPIOs for HiKey970 external PEX 8606 bridge
  PCI: apple: Set up reference clocks when probing
  PCI: apple: Add initial hardware bring-up
  PCI: of: Allow matching of an interrupt-map local to a PCI device
  of/irq: Allow matching of an interrupt-map local to an interrupt controller
  irqdomain: Make of_phandle_args_to_fwspec() generally available
  PCI: Do not enable AtomicOps on VFs
  ...
2021-11-06 14:36:12 -07:00
Linus Torvalds
512b7931ad Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:
 "257 patches.

  Subsystems affected by this patch series: scripts, ocfs2, vfs, and
  mm (slab-generic, slab, slub, kconfig, dax, kasan, debug, pagecache,
  gup, swap, memcg, pagemap, mprotect, mremap, iomap, tracing, vmalloc,
  pagealloc, memory-failure, hugetlb, userfaultfd, vmscan, tools,
  memblock, oom-kill, hugetlbfs, migration, thp, readahead, nommu, ksm,
  vmstat, madvise, memory-hotplug, rmap, zsmalloc, highmem, zram,
  cleanups, kfence, and damon)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (257 commits)
  mm/damon: remove return value from before_terminate callback
  mm/damon: fix a few spelling mistakes in comments and a pr_debug message
  mm/damon: simplify stop mechanism
  Docs/admin-guide/mm/pagemap: wordsmith page flags descriptions
  Docs/admin-guide/mm/damon/start: simplify the content
  Docs/admin-guide/mm/damon/start: fix a wrong link
  Docs/admin-guide/mm/damon/start: fix wrong example commands
  mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on
  mm/damon: remove unnecessary variable initialization
  Documentation/admin-guide/mm/damon: add a document for DAMON_RECLAIM
  mm/damon: introduce DAMON-based Reclamation (DAMON_RECLAIM)
  selftests/damon: support watermarks
  mm/damon/dbgfs: support watermarks
  mm/damon/schemes: activate schemes based on a watermarks mechanism
  tools/selftests/damon: update for regions prioritization of schemes
  mm/damon/dbgfs: support prioritization weights
  mm/damon/vaddr,paddr: support pageout prioritization
  mm/damon/schemes: prioritize regions within the quotas
  mm/damon/selftests: support schemes quotas
  mm/damon/dbgfs: support quotas of schemes
  ...
2021-11-06 14:08:17 -07:00
David Hildenbrand
50f9481ed9 mm/memory_hotplug: remove CONFIG_MEMORY_HOTPLUG_SPARSE
CONFIG_MEMORY_HOTPLUG depends on CONFIG_SPARSEMEM, so there is no need for
CONFIG_MEMORY_HOTPLUG_SPARSE anymore; adjust all instances to use
CONFIG_MEMORY_HOTPLUG and remove CONFIG_MEMORY_HOTPLUG_SPARSE.

Link: https://lkml.kernel.org/r/20210929143600.49379-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>	[kselftest]
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Oscar Salvador <osalvador@suse.de>
Cc: Alex Shi <alexs@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:42 -07:00
Mike Rapoport
4421cca0a3 memblock: use memblock_free for freeing virtual pointers
Rename memblock_free_ptr() to memblock_free() and use memblock_free()
when freeing a virtual pointer so that memblock_free() will be a
counterpart of memblock_alloc()

The callers are updated with the below semantic patch and manual
addition of (void *) casting to pointers that are represented by
unsigned long variables.

    @@
    identifier vaddr;
    expression size;
    @@
    (
    - memblock_phys_free(__pa(vaddr), size);
    + memblock_free(vaddr, size);
    |
    - memblock_free_ptr(vaddr, size);
    + memblock_free(vaddr, size);
    )

[sfr@canb.auug.org.au: fixup]
  Link: https://lkml.kernel.org/r/20211018192940.3d1d532f@canb.auug.org.au

Link: https://lkml.kernel.org/r/20210930185031.18648-7-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Juergen Gross <jgross@suse.com>
Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:41 -07:00
Mike Rapoport
3ecc68349b memblock: rename memblock_free to memblock_phys_free
Since memblock_free() operates on a physical range, make its name
reflect it and rename it to memblock_phys_free(), so it will be a
logical counterpart to memblock_phys_alloc().

The callers are updated with the below semantic patch:

    @@
    expression addr;
    expression size;
    @@
    - memblock_free(addr, size);
    + memblock_phys_free(addr, size);

Link: https://lkml.kernel.org/r/20210930185031.18648-6-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Juergen Gross <jgross@suse.com>
Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:41 -07:00
Ben Widawsky
c6d7e1341c ocxl: Use pci core's DVSEC functionality
Reduce maintenance burden of DVSEC query implementation by using the
centralized PCI core implementation.

There are two obvious places to simply drop in the new core
implementation. There remains find_dvsec_from_pos() which would benefit
from using a core implementation. As that change is less trivial it is
reserved for later.

Cc: linuxppc-dev@lists.ozlabs.org
Cc: Andrew Donnellan <ajd@linux.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com> (v1)
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Link: https://lore.kernel.org/r/163379789065.692348.7117946955275586530.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-10-29 11:53:52 -07:00
Vasant Hegde
52862ab33c powerpc/powernv/prd: Unregister OPAL_MSG_PRD2 notifier during module unload
Commit 587164cd, introduced new opal message type (OPAL_MSG_PRD2) and
added opal notifier. But I missed to unregister the notifier during
module unload path. This results in below call trace if you try to
unload and load opal_prd module.

Also add new notifier_block for OPAL_MSG_PRD2 message.

Sample calltrace (modprobe -r opal_prd; modprobe opal_prd)
  BUG: Unable to handle kernel data access on read at 0xc0080000192200e0
  Faulting instruction address: 0xc00000000018d1cc
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
  CPU: 66 PID: 7446 Comm: modprobe Kdump: loaded Tainted: G            E     5.14.0prd #759
  NIP:  c00000000018d1cc LR: c00000000018d2a8 CTR: c0000000000cde10
  REGS: c0000003c4c0f0a0 TRAP: 0300   Tainted: G            E      (5.14.0prd)
  MSR:  9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE>  CR: 24224824  XER: 20040000
  CFAR: c00000000018d2a4 DAR: c0080000192200e0 DSISR: 40000000 IRQMASK: 1
  ...
  NIP notifier_chain_register+0x2c/0xc0
  LR  atomic_notifier_chain_register+0x48/0x80
  Call Trace:
    0xc000000002090610 (unreliable)
    atomic_notifier_chain_register+0x58/0x80
    opal_message_notifier_register+0x7c/0x1e0
    opal_prd_probe+0x84/0x150 [opal_prd]
    platform_probe+0x78/0x130
    really_probe+0x110/0x5d0
    __driver_probe_device+0x17c/0x230
    driver_probe_device+0x60/0x130
    __driver_attach+0xfc/0x220
    bus_for_each_dev+0xa8/0x130
    driver_attach+0x34/0x50
    bus_add_driver+0x1b0/0x300
    driver_register+0x98/0x1a0
    __platform_driver_register+0x38/0x50
    opal_prd_driver_init+0x34/0x50 [opal_prd]
    do_one_initcall+0x60/0x2d0
    do_init_module+0x7c/0x320
    load_module+0x3394/0x3650
    __do_sys_finit_module+0xd4/0x160
    system_call_exception+0x140/0x290
    system_call_common+0xf4/0x258

Fixes: 587164cd59 ("powerpc/powernv: Add new opal message type")
Cc: stable@vger.kernel.org # v5.4+
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211028165716.41300-1-hegdevasant@linux.vnet.ibm.com
2021-10-29 16:47:32 +11:00
Vasant Hegde
ee87843795 powerpc/powernv/dump: Fix typo in comment
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210914143802.54325-1-hegdevasant@linux.vnet.ibm.com
2021-10-09 00:15:59 +11:00
Niklas Schnelle
452f145eca powerpc: Drop superfluous pci_dev_is_added() calls
On powerpc, pci_dev_is_added() is called as part of SR-IOV fixups
that are done under pcibios_add_device() which in turn is only called in
pci_device_add() whih is called when a PCI device is scanned.

pci_dev_assign_added() is called in pci_bus_add_device() which is only
called after scanning the device. Thus pci_dev_is_added() is always
false and can be dropped.

Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
[mpe: Tweak change log slightly to reflect Oliver's comments]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210910141940.2598035-2-schnelle@linux.ibm.com
2021-10-09 00:15:58 +11:00
Oliver O'Halloran
06dc660e6e PCI: Rename pcibios_add_device() to pcibios_device_add()
The general convention for pcibios_* hooks is that they're named after the
corresponding pci_* function they provide a hook for. The exception is
pcibios_add_device() which provides a hook for pci_device_add().

Rename pcibios_add_device() to pcibios_device_add() so it matches
pci_device_add().

Also, remove the export of the microblaze version. The only caller must be
compiled as a built-in so there's no reason for the export.

Link: https://lore.kernel.org/r/20210913152709.48013-1-oohall@gmail.com
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Niklas Schnelle <schnelle@linux.ibm.com>	# s390
2021-09-21 15:26:09 -05:00
Michael Ellerman
465e333e77 Merge branch 'topic/ppc-kvm' into next
Merge some KVM patches we are keeping in a topic branch in case there
are any merge conflicts that need resolving.
2021-08-26 21:21:11 +10:00
Christophe Leroy
806c0e6e7e powerpc: Refactor verification of MSR_RI
40x and BOOKE don't have MSR_RI therefore all tests involving
MSR_RI may be problematic on those plateforms.

Create helpers to check or set MSR_RI in regs, and use them
in common code.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c2fb93708196734f4176dda334aaa3055f213b89.1629707037.git.christophe.leroy@csgroup.eu
2021-08-26 21:21:07 +10:00
Nicholas Piggin
0c8fb653d4 powerpc/64s: Remove WORT SPR from POWER9/10
This register is not architected and not implemented in POWER9 or 10,
it just reads back zeroes for compatibility.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Link: https://lore.kernel.org/r/20210811160134.904987-11-npiggin@gmail.com
2021-08-25 16:37:18 +10:00
Aneesh Kumar K.V
dbf77fed8b powerpc: rename powerpc_debugfs_root to arch_debugfs_dir
No functional change in this patch. arch_debugfs_dir is the generic kernel
name declared in linux/debugfs.h for arch-specific debugfs directory.
Architectures like x86/s390 already use the name. Rename powerpc
specific powerpc_debugfs_root to arch_debugfs_dir.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210812132831.233794-2-aneesh.kumar@linux.ibm.com
2021-08-13 22:04:26 +10:00
Marc Zyngier
1bce542500 powerpc: Bulk conversion to generic_handle_domain_irq()
Wherever possible, replace constructs that match either
generic_handle_irq(irq_find_mapping()) or
generic_handle_irq(irq_linear_revmap()) to a single call to
generic_handle_domain_irq().

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210802162630.2219813-13-maz@kernel.org
2021-08-10 23:15:02 +10:00
Cédric Le Goater
c325712b5f powerpc/powernv/pci: Rework pnv_opal_pci_msi_eoi()
pnv_opal_pci_msi_eoi() is called from KVM to EOI passthrough interrupts
when in real mode. Adding MSI domain broke the hack using the
'ioda.irq_chip' field to deduce the owning PHB. Fix that by using the
IRQ chip data in the MSI domain.

The 'ioda.irq_chip' field is now unused and could be removed from the
pnv_phb struct.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-30-clg@kaod.org
2021-08-10 23:15:01 +10:00
Cédric Le Goater
5cd69651ce powerpc/powernv/pci: Set the IRQ chip data for P8/CXL devices
Before MSI domains, the default IRQ chip of PHB3 MSIs was patched by
pnv_set_msi_irq_chip() with the custom EOI handler pnv_ioda2_msi_eoi()
and the owning PHB was deduced from the 'ioda.irq_chip' field. This
path has been deprecated by the MSI domains but it is still in use by
the P8 CAPI 'cxl' driver.

Rewriting this driver to support MSI would be a waste of time.
Nevertheless, we can still remove the IRQ chip patch and set the IRQ
chip data instead. This is cleaner.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-29-clg@kaod.org
2021-08-10 23:15:01 +10:00
Cédric Le Goater
f1a377f86f powerpc/powernv/pci: Adapt is_pnv_opal_msi() to detect passthrough interrupt
The pnv_ioda2_msi_eoi() chip handler is not used anymore for MSIs.
Simply use the check on the PSI-MSI chip.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-27-clg@kaod.org
2021-08-10 23:15:00 +10:00
Cédric Le Goater
6d9ba6121b powerpc/powernv/pci: Drop unused MSI code
MSIs should be fully managed by the PCI and IRQ subsystems now.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-26-clg@kaod.org
2021-08-10 23:15:00 +10:00
Cédric Le Goater
679e30b953 powerpc/pci: Drop XIVE restriction on MSI domains
The PowerNV and pSeries platforms now have support for both the XICS
and XIVE IRQ domains.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-23-clg@kaod.org
2021-08-10 23:15:00 +10:00
Cédric Le Goater
bbb25af8fb powerpc/powernv/pci: Customize the MSI EOI handler to support PHB3
PHB3s need an extra OPAL call to EOI the interrupt. The call takes an
OPAL HW IRQ number but it is translated into a vector number in OPAL.
Here, we directly use the vector number of the in-the-middle "PNV-MSI"
domain instead of grabbing the OPAL HW IRQ number in the XICS parent
domain.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-22-clg@kaod.org
2021-08-10 23:15:00 +10:00
Cédric Le Goater
ba418a0278 KVM: PPC: Book3S HV: Use the new IRQ chip to detect passthrough interrupts
Passthrough PCI MSI interrupts are detected in KVM with a check on a
specific EOI handler (P8) or on XIVE (P9). We can now check the
PCI-MSI IRQ chip which is cleaner.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-14-clg@kaod.org
2021-08-10 23:14:58 +10:00
Cédric Le Goater
0fcfe2247e powerpc/powernv/pci: Add MSI domains
This is very similar to the MSI domains of the pSeries platform. The
MSI allocator is directly handled under the Linux PHB in the
in-the-middle "PNV-MSI" domain.

Only the XIVE (P9/P10) parent domain is supported for now. Support for
XICS will come later.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-13-clg@kaod.org
2021-08-10 23:14:58 +10:00
Cédric Le Goater
2c50d7e99e powerpc/powernv/pci: Introduce __pnv_pci_ioda_msi_setup()
It will be used as a 'compose_msg' handler of the MSI domain introduced
later.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-12-clg@kaod.org
2021-08-10 23:14:58 +10:00
Sebastian Andrzej Siewior
5ae36401ca powerpc: Replace deprecated CPU-hotplug functions.
The functions get_online_cpus() and put_online_cpus() have been
deprecated during the CPU hotplug rework. They map directly to
cpus_read_lock() and cpus_read_unlock().

Replace deprecated CPU-hotplug functions with the official version.
The behavior remains unchanged.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210803141621.780504-4-bigeasy@linutronix.de
2021-08-10 23:14:56 +10:00
Nicholas Piggin
59dc5bfca0 powerpc/64s: avoid reloading (H)SRR registers if they are still valid
When an interrupt is taken, the SRR registers are set to return to where
it left off. Unless they are modified in the meantime, or the return
address or MSR are modified, there is no need to reload these registers
when returning from interrupt.

Introduce per-CPU flags that track the validity of SRR and HSRR
registers. These are cleared when returning from interrupt, when
using the registers for something else (e.g., OPAL calls), when
adjusting the return address or MSR of a context, and when context
switching (which changes the return address and MSR).

This improves the performance of interrupt returns.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Fold in fixup patch from Nick]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210617155116.2167984-5-npiggin@gmail.com
2021-06-25 00:06:55 +10:00
Haren Myneni
7bc6f71bdf powerpc/vas: Define and use common vas_window struct
Many elements in vas_struct are used on PowerNV and PowerVM
platforms. vas_window is used for both TX and RX windows on
PowerNV and for TX windows on PowerVM. So some elements are
specific to these platforms.

So this patch defines common vas_window and platform
specific window structs (pnv_vas_window on PowerNV). Also adds
the corresponding changes in PowerNV vas code.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1698c35c158dfe52c6d2166667823d3d4a463353.camel@linux.ibm.com
2021-06-20 21:58:56 +10:00
Haren Myneni
3b26797350 powerpc/vas: Move update_csb/dump_crb to common book3s platform
If a coprocessor encounters an error translating an address, the
VAS will cause an interrupt in the host. The kernel processes
the fault by updating CSB. This functionality is same for both
powerNV and pseries. So this patch moves these functions to
common vas-api.c and the actual functionality is not changed.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/bf8d5b0770fa1ef5cba88c96580caa08d999d3b5.camel@linux.ibm.com
2021-06-20 21:58:56 +10:00
Haren Myneni
3856aa542d powerpc/vas: Create take/drop pid and mm reference functions
Take pid and mm references when each window opens and drops during
close. This functionality is needed for powerNV and pseries. So
this patch defines the existing code as functions in common book3s
platform vas-api.c

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/2fa40df962250a737c804e58202924717b39e381.camel@linux.ibm.com
2021-06-20 21:58:56 +10:00
Haren Myneni
1a0d0d5ed5 powerpc/vas: Add platform specific user window operations
PowerNV uses registers to open/close VAS windows, and getting the
paste address. Whereas the hypervisor calls are used on PowerVM.

This patch adds the platform specific user space window operations
and register with the common VAS user space interface.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f85091f4ace67f951ac04d60394d67b21e2f5d3c.camel@linux.ibm.com
2021-06-20 21:58:55 +10:00
Haren Myneni
06c6fad9bf powerpc/powernv/vas: Rename register/unregister functions
powerNV and pseries drivers register / unregister to the corresponding
platform specific VAS separately. Then these VAS functions call the
common API with the specific window operations. So rename powerNV VAS
API register/unregister functions.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/9db00d58dbdcb7cfc07a1df95f3d2a9e3e5d746a.camel@linux.ibm.com
2021-06-20 21:58:55 +10:00
Haren Myneni
413d6ed3ea powerpc/vas: Move VAS API to book3s common platform
The pseries platform will share vas and nx code and interfaces
with the PowerNV platform, so create the
arch/powerpc/platforms/book3s/ directory and move VAS API code
there. Functionality is not changed.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e05c8db17b9eabe3545b902d034238e4c6c08180.camel@linux.ibm.com
2021-06-20 21:58:55 +10:00
Haren Myneni
91cdbb955a powerpc/powernv/vas: Release reference to tgid during window close
The kernel handles the NX fault by updating CSB or sending
signal to process. In multithread applications, children can
open VAS windows and can exit without closing them. But the
parent can continue to send NX requests with these windows. To
prevent pid reuse, reference will be taken on pid and tgid
when the window is opened and release them during window close.

The current code is not releasing the tgid reference which can
cause pid leak and this patch fixes the issue.

Fixes: db1c08a740 ("powerpc/vas: Take reference to PID and mm for user space windows")
Cc: stable@vger.kernel.org # 5.8+
Reported-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/6020fc4d444864fe20f7dcdc5edfe53e67480a1c.camel@linux.ibm.com
2021-06-20 21:58:55 +10:00
Michael Ellerman
3c53642324 Merge branch 'topic/ppc-kvm' into next
Merge some powerpc KVM patches from our topic branch.

In particular this brings in Nick's big series rewriting parts of the
guest entry/exit path in C.

Conflicts:
	arch/powerpc/kernel/security.c
	arch/powerpc/kvm/book3s_hv_rmhandlers.S
2021-06-17 16:51:38 +10:00
Christophe Leroy
ab3aab292c powerpc: Move update_power8_hid0() into its only user
update_power8_hid0() is used only by powernv platform subcore.c

Move it there.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/37f41d74faa0c66f90b373e243e8b1ee37a1f6fa.1623219019.git.christophe.leroy@csgroup.eu
2021-06-17 00:09:11 +10:00
Nicholas Piggin
fae5c9f366 KVM: PPC: Book3S HV: remove ISA v3.0 and v3.1 support from P7/8 path
POWER9 and later processors always go via the P9 guest entry path now.
Remove the remaining support from the P7/8 path.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210528090752.3542186-33-npiggin@gmail.com
2021-06-10 22:12:15 +10:00
Nick Desaulniers
73e6e4e011 powerpc/powernv/pci: fix header guard
While looking at -Wundef warnings, the #if CONFIG_EEH stood out as a
possible candidate to convert to #ifdef CONFIG_EEH.

It seems that based on Kconfig dependencies it's not possible to build
this file without CONFIG_EEH enabled, but based on upstream discussion,
it's not clear yet that CONFIG_EEH should be enabled by default.

For now, simply fix the -Wundef warning.

Suggested-by: Nathan Chancellor <nathan@kernel.org>
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://github.com/ClangBuiltLinux/linux/issues/570
Link: https://lore.kernel.org/lkml/67f6cd269684c9aa8463ff4812c3b4605e6739c3.camel@perches.com/
Link: https://lore.kernel.org/lkml/CAOSf1CGoN5R0LUrU=Y=UWho1Z_9SLgCX8s3SbFJXwJXc5BYz4A@mail.gmail.com/
Link: https://lore.kernel.org/r/20210518204044.2390064-1-ndesaulniers@google.com
2021-05-23 20:51:35 +10:00
Sandipan Das
b910fcbada powerpc/powernv/memtrace: Fix dcache flushing
Trace memory is cleared and the corresponding dcache lines
are flushed after allocation. However, this should not be
done using the PFN. This adds the missing conversion to
virtual address.

Fixes: 2ac02e5ece ("powerpc/mm: Remove dcache flush from memory remove.")
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210501160254.1179831-1-sandipan@linux.ibm.com
2021-05-04 22:27:56 +10:00
Christoph Hellwig
562d1e207d powerpc/powernv: remove the nvlink support
This code was only used by the vfio-nvlink2 code, which itself had no
proper use.  Drop this huge chunk of code build into every powernv
or generic build.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210326061311.1497642-3-hch@lst.de
2021-05-02 23:35:32 +10:00
Alexey Kardashevskiy
4be518d838 powerpc/iommu: Do not immediately panic when failed IOMMU table allocation
Most platforms allocate IOMMU table structures (specifically it_map)
at the boot time and when this fails - it is a valid reason for panic().

However the powernv platform allocates it_map after a device is returned
to the host OS after being passed through and this happens long after
the host OS booted. It is quite possible to trigger the it_map allocation
panic() and kill the host even though it is not necessary - the host OS
can still use the DMA bypass mode (requires a tiny fraction of it_map's
memory) and even if that fails, the host OS is runnnable as it was without
the device for which allocating it_map causes the panic.

Instead of immediately crashing in a powernv/ioda2 system, this prints
an error and continues. All other platforms still call panic().

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Leonardo Bras <leobras.c@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210216033307.69863-3-aik@ozlabs.ru
2021-04-23 01:38:04 +10:00
Yang Li
caea7b833d powerpc/64s: remove unneeded semicolon
Eliminate the following coccicheck warning:
./arch/powerpc/platforms/powernv/setup.c:160:2-3: Unneeded semicolon

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1612236877-104974-1-git-send-email-yang.lee@linux.alibaba.com
2021-04-23 01:38:04 +10:00
Bixuan Cui
ff0b4155ae powerpc/powernv: make symbol 'mpipl_kobj' static
The sparse tool complains as follows:

arch/powerpc/platforms/powernv/opal-core.c:74:16: warning:
 symbol 'mpipl_kobj' was not declared.

This symbol is not used outside of opal-core.c, so marks it static.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Bixuan Cui <cuibixuan@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210409063855.57347-1-cuibixuan@huawei.com
2021-04-14 23:04:17 +10:00
Jordan Niethe
08a022ad3d powerpc/powernv/memtrace: Allow mmaping trace buffers
Let the memory removed from the linear mapping to be used for the trace
buffers be mmaped. This is a useful way of providing cache-inhibited
memory for the alignment_handler selftest.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
[mpe: make memtrace_mmap() static as noticed by lkp@intel.com]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210225032108.1458352-1-jniethe5@gmail.com
2021-04-08 21:17:44 +10:00
dingsenjie
69931cc387 powerpc/powernv: Remove unneeded variable: "rc"
Remove unneeded variable: "rc".

Signed-off-by: dingsenjie <dingsenjie@yulong.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210326115356.12444-1-dingsenjie@163.com
2021-03-29 13:22:19 +11:00
Linus Torvalds
21a6ab2131 Modules updates for v5.12
Summary of modules changes for the 5.12 merge window:
 
 - Retire EXPORT_UNUSED_SYMBOL() and EXPORT_SYMBOL_GPL_FUTURE(). These export
   types were introduced between 2006 - 2008. All the of the unused symbols have
   been long removed and gpl future symbols were converted to gpl quite a long
   time ago, and I don't believe these export types have been used ever since.
   So, I think it should be safe to retire those export types now. (Christoph Hellwig)
 
 - Refactor and clean up some aged code cruft in the module loader (Christoph Hellwig)
 
 - Build {,module_}kallsyms_on_each_symbol only when livepatching is enabled, as
   it is the only caller (Christoph Hellwig)
 
 - Unexport find_module() and module_mutex and fix the last module
   callers to not rely on these anymore. Make module_mutex internal to
   the module loader. (Christoph Hellwig)
 
 - Harden ELF checks on module load and validate ELF structures before checking
   the module signature (Frank van der Linden)
 
 - Fix undefined symbol warning for clang (Fangrui Song)
 
 - Fix smatch warning (Dan Carpenter)
 
 Signed-off-by: Jessica Yu <jeyu@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEVrp26glSWYuDNrCUwEV+OM47wXIFAmA0/KMQHGpleXVAa2Vy
 bmVsLm9yZwAKCRDARX44zjvBcu0uD/4nmRp18EKAtdUZivsZHat0aEWGlkmrVueY
 5huYw6iwM8b/wIAl3xwLki1Iv0/l0a83WXZhLG4ekl0/Nj8kgllA+jtBrZWpoLMH
 CZusN5dS9YwwyD2vu3ak83ARcehcDEPeA9thvc3uRFGis6Hi4bt1rkzGdrzsgqR4
 tybfN4qaQx4ZAKFxA8bnS58l7QTFwUzTxJfM6WWzl1Q+mLZDr/WP+loJ/f1/oFFg
 ufN31KrqqFpdQY5UKq5P4H8FVq/eXE1Mwl8vo3HsnAj598fznyPUmA3D/j+N4GuR
 sTGBVZ9CSehUj7uZRs+Qgg6Bd+y3o44N29BrdZWA6K3ieTeQQpA+VgPUNrDBjGhP
 J/9Y4ms4PnuNEWWRaa73m9qsVqAsjh9+T2xp9PYn9uWLCM8BvQFtWcY7tw4/nB0/
 INmyiP/tIRpwWkkBl47u1TPR09FzBBGDZjBiSn3lm3VX+zCYtHoma5jWyejG11cf
 ybDrTsci9ANyHNP2zFQsUOQJkph78PIal0i3k4ODqGJvaC0iEIH3Xjv+0dmE14rq
 kGRrG/HN6HhMZPjashudVUktyTZ63+PJpfFlQbcUzdvjQQIkzW0vrCHMWx9vD1xl
 Na7vZLl4Nb03WSJp6saY6j2YSRKL0poGETzGqrsUAHEhpEOPHduaiCVlAr/EmeMk
 p6SrWv8+UQ==
 =T29Q
 -----END PGP SIGNATURE-----

Merge tag 'modules-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux

Pull module updates from Jessica Yu:

 - Retire EXPORT_UNUSED_SYMBOL() and EXPORT_SYMBOL_GPL_FUTURE(). These
   export types were introduced between 2006 - 2008. All the of the
   unused symbols have been long removed and gpl future symbols were
   converted to gpl quite a long time ago, and I don't believe these
   export types have been used ever since. So, I think it should be safe
   to retire those export types now (Christoph Hellwig)

 - Refactor and clean up some aged code cruft in the module loader
   (Christoph Hellwig)

 - Build {,module_}kallsyms_on_each_symbol only when livepatching is
   enabled, as it is the only caller (Christoph Hellwig)

 - Unexport find_module() and module_mutex and fix the last module
   callers to not rely on these anymore. Make module_mutex internal to
   the module loader (Christoph Hellwig)

 - Harden ELF checks on module load and validate ELF structures before
   checking the module signature (Frank van der Linden)

 - Fix undefined symbol warning for clang (Fangrui Song)

 - Fix smatch warning (Dan Carpenter)

* tag 'modules-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
  module: potential uninitialized return in module_kallsyms_on_each_symbol()
  module: remove EXPORT_UNUSED_SYMBOL*
  module: remove EXPORT_SYMBOL_GPL_FUTURE
  module: move struct symsearch to module.c
  module: pass struct find_symbol_args to find_symbol
  module: merge each_symbol_section into find_symbol
  module: remove each_symbol_in_section
  module: mark module_mutex static
  kallsyms: only build {,module_}kallsyms_on_each_symbol when required
  kallsyms: refactor {,module_}kallsyms_on_each_symbol
  module: use RCU to synchronize find_module
  module: unexport find_module and module_mutex
  drm: remove drm_fb_helper_modinit
  powerpc/powernv: remove get_cxl_module
  module: harden ELF info handling
  module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for undefined symbols
2021-02-23 10:15:33 -08:00
Linus Torvalds
b12b472496 powerpc updates for 5.12
A large series adding wrappers for our interrupt handlers, so that irq/nmi/user
 tracking can be isolated in the wrappers rather than spread in each handler.
 
 Conversion of the 32-bit syscall handling into C.
 
 A series from Nick to streamline our TLB flushing when using the Radix MMU.
 
 Switch to using queued spinlocks by default for 64-bit server CPUs.
 
 A rework of our PCI probing so that it happens later in boot, when more generic
 infrastructure is available.
 
 Two small fixes to allow 32-bit little-endian processes to run on 64-bit
 kernels.
 
 Other smaller features, fixes & cleanups.
 
 Thanks to:
   Alexey Kardashevskiy, Ananth N Mavinakayanahalli, Aneesh Kumar K.V, Athira
   Rajeev, Bhaskar Chowdhury, Cédric Le Goater, Chengyang Fan, Christophe Leroy,
   Christopher M. Riedl, Fabiano Rosas, Florian Fainelli, Frederic Barrat, Ganesh
   Goudar, Hari Bathini, Jiapeng Chong, Joseph J Allen, Kajol Jain, Markus
   Elfring, Michal Suchanek, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Oliver
   O'Halloran, Pingfan Liu, Po-Hsu Lin, Qian Cai, Ram Pai, Randy Dunlap, Sandipan
   Das, Stephen Rothwell, Tyrel Datwyler, Will Springer, Yury Norov, Zheng
   Yongjun.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmAzMagTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgAbBD/wMS2g1Q9oAGZPsx2NGd2RoeAauGxUs
 Yj6cZVmR+oa6sJyFYgEG7dT7tcwJITQxLBD3HpsHSnJ/rLrMloE33+cZNA9c4STz
 0mlzm3R7M5pOgcEqZglsgLP0RQeUuHSSF01g0kf1N3r+HYtmbmPjuUIl8CnAjlbT
 iMD2ZN2p8/r3kDDht0iBO534HUpsqhc00duSZgQhsV/PR7ZWVxoPk7PEJeo4vXlJ
 77986F7J5NLUTjMiLv5lTx49FcPbRd7a1jubsBtahJrwXj2GVvuy2i86G7HY+a+B
 eSxN7zJQgaFeLo0YPo7fZLBI0MAsIQt3nnZhKX0TMglbv/K8Aq64xiJqsVQdJ883
 CeEt0HvSJhsSC0C4O595NEINfDhDd+5IeSF9MvsujYXiUKRXtRkm1EPuAzTcZIzW
 NwkCLRo33NMXa+khMKaiqF/g7INayPUXoWESx75NXFsuNfcORvstkeUuEoi5GwJo
 TSlmosFqwRjghQ8eTLZuWBzmh3EpPGdtC4gm6D+lbzhzjah5c/1whyuLqra275kK
 E3Qt0/V0ixKyvlG7MI5yYh3L7+R/hrsflH7xIJJxZp2DW6mwBJzQYmkxDbSS8PzK
 nWien2XgpIQhSFat3QqreEFSfNkzdN2MClVi2Y1hpAgi+2Zm9rPdPNGcQI+DSOsB
 kpJkjOjWNJU/PQ==
 =dB2S
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - A large series adding wrappers for our interrupt handlers, so that
   irq/nmi/user tracking can be isolated in the wrappers rather than
   spread in each handler.

 - Conversion of the 32-bit syscall handling into C.

 - A series from Nick to streamline our TLB flushing when using the
   Radix MMU.

 - Switch to using queued spinlocks by default for 64-bit server CPUs.

 - A rework of our PCI probing so that it happens later in boot, when
   more generic infrastructure is available.

 - Two small fixes to allow 32-bit little-endian processes to run on
   64-bit kernels.

 - Other smaller features, fixes & cleanups.

Thanks to: Alexey Kardashevskiy, Ananth N Mavinakayanahalli, Aneesh
Kumar K.V, Athira Rajeev, Bhaskar Chowdhury, Cédric Le Goater, Chengyang
Fan, Christophe Leroy, Christopher M. Riedl, Fabiano Rosas, Florian
Fainelli, Frederic Barrat, Ganesh Goudar, Hari Bathini, Jiapeng Chong,
Joseph J Allen, Kajol Jain, Markus Elfring, Michal Suchanek, Nathan
Lynch, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Pingfan Liu,
Po-Hsu Lin, Qian Cai, Ram Pai, Randy Dunlap, Sandipan Das, Stephen
Rothwell, Tyrel Datwyler, Will Springer, Yury Norov, and Zheng Yongjun.

* tag 'powerpc-5.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (188 commits)
  powerpc/perf: Adds support for programming of Thresholding in P10
  powerpc/pci: Remove unimplemented prototypes
  powerpc/uaccess: Merge raw_copy_to_user_allowed() into raw_copy_to_user()
  powerpc/uaccess: Merge __put_user_size_allowed() into __put_user_size()
  powerpc/uaccess: get rid of small constant size cases in raw_copy_{to,from}_user()
  powerpc/64: Fix stack trace not displaying final frame
  powerpc/time: Remove get_tbl()
  powerpc/time: Avoid using get_tbl()
  spi: mpc52xx: Avoid using get_tbl()
  powerpc/syscall: Avoid storing 'current' in another pointer
  powerpc/32: Handle bookE debugging in C in syscall entry/exit
  powerpc/syscall: Do not check unsupported scv vector on PPC32
  powerpc/32: Remove the counter in global_dbcr0
  powerpc/32: Remove verification of MSR_PR on syscall in the ASM entry
  powerpc/syscall: implement system call entry/exit logic in C for PPC32
  powerpc/32: Always save non volatile GPRs at syscall entry
  powerpc/syscall: Change condition to check MSR_RI
  powerpc/syscall: Save r3 in regs->orig_r3
  powerpc/syscall: Use is_compat_task()
  powerpc/syscall: Make interrupt.c buildable on PPC32
  ...
2021-02-22 14:34:00 -08:00
Aneesh Kumar K.V
2ac02e5ece powerpc/mm: Remove dcache flush from memory remove.
We added dcache flush on memory add/remove in commit
fb5924fddf ("powerpc/mm: Flush cache on memory hot(un)plug") to
handle crashes on GPU hotplug. Instead of adding dcache flush in
generic memory add/remove routine which is used even for regular
memory, we should handle these devices specific flush in the device
driver code.

memtrace did handle this in the driver and that was removed by commit
7fd6641de2 ("powerpc/powernv/memtrace: Let the arch hotunplug code
flush cache"). This patch reverts that commit.

The dcache flush in memory add was removed by commit
ea458effa8 ("powerpc: Don't flush caches when adding memory") which
I don't think is correct. The reason why we require dcache flush in
memtrace is to make sure we don't have a dirty cache when we remap a
pfn to cache inhibited. We should do that when the memtrace module
removes the memory and make the pfn available for HTM traces to map it
as cache inhibited.

The other device mentioned in commit fb5924fddf ("powerpc/mm: Flush
cache on memory hot(un)plug") is nvlink device with coherent memory.
The support for that was removed in commit
7eb3cf7619 ("powerpc/powernv: remove unused NPU DMA code") and
commit 25b2995a35 ("mm: remove MEMORY_DEVICE_PUBLIC support")

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210203045812.234439-3-aneesh.kumar@linux.ibm.com
2021-02-11 23:35:07 +11:00
Michael Ellerman
dea6f4c696 powerpc/powernv/pci: Use kzalloc() for phb related allocations
As part of commit fbbefb3202 ("powerpc/pci: Move PHB discovery for
PCI_DN using platforms"), I switched some allocations from
memblock_alloc() to kmalloc(), otherwise memblock would warn that it
was being called after slab init.

However I missed that the code relied on the allocations being zeroed,
without which we could end up crashing:

  pci_bus 0000:00: busn_res: [bus 00-ff] end is updated to ff
  BUG: Unable to handle kernel data access on read at 0x6b6b6b6b6b6b6af7
  Faulting instruction address: 0xc0000000000dbc90
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
  ...
  NIP  pnv_ioda_get_pe_state+0xe0/0x1d0
  LR   pnv_ioda_get_pe_state+0xb4/0x1d0
  Call Trace:
    pnv_ioda_get_pe_state+0xb4/0x1d0 (unreliable)
    pnv_pci_config_check_eeh.isra.9+0x78/0x270
    pnv_pci_read_config+0xf8/0x160
    pci_bus_read_config_dword+0xa4/0x120
    pci_bus_generic_read_dev_vendor_id+0x54/0x270
    pci_scan_single_device+0xb8/0x140
    pci_scan_slot+0x80/0x1b0
    pci_scan_child_bus_extend+0x94/0x490
    pcibios_scan_phb+0x1f8/0x3c0
    pcibios_init+0x8c/0x12c
    do_one_initcall+0x94/0x510
    kernel_init_freeable+0x35c/0x3fc
    kernel_init+0x2c/0x168
    ret_from_kernel_thread+0x5c/0x70

Switch them to kzalloc().

Fixes: fbbefb3202 ("powerpc/pci: Move PHB discovery for PCI_DN using platforms")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210211112749.3410771-1-mpe@ellerman.id.au
2021-02-11 23:28:34 +11:00
Chengyang Fan
6c6fdbb2b7 powerpc: remove unneeded semicolons
Remove superfluous semicolons after function definitions.

Signed-off-by: Chengyang Fan <cy.fan@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210125095338.1719405-1-cy.fan@huawei.com
2021-02-09 00:10:50 +11:00
Nicholas Piggin
3a96570ffc powerpc: convert interrupt handlers to use wrappers
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210130130852.2952424-29-npiggin@gmail.com
2021-02-09 00:02:12 +11:00
Nicholas Piggin
209e9d500e powerpc: introduce die_mce
As explained by commit daf00ae71d ("powerpc/traps: restore
recoverability of machine_check interrupts"), die() can't be called from
within nmi_enter to nicely kill a process context that was interrupted.
nmi_exit must be called first.

This adds a function die_mce which takes care of this for machine check
handlers.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210130130852.2952424-24-npiggin@gmail.com
2021-02-09 00:02:11 +11:00
Oliver O'Halloran
fbbefb3202 powerpc/pci: Move PHB discovery for PCI_DN using platforms
Make powernv, pseries, powermac and maple use ppc_mc.discover_phbs.
These platforms need to be done together because they all depend on
pci_dn's being created from the DT. The pci_dn contains a pointer to
the relevant pci_controller so they need to be created after the
pci_controller structures are available, but before PCI devices are
scanned. Currently this ordering is provided by initcalls and the
sequence is:

  1. PHBs are discovered (setup_arch) (early boot, pre-initcalls)
  2. pci_dn are created from the unflattended DT (core initcall)
  3. PHBs are scanned pcibios_init() (subsys initcall)

The new ppc_md.discover_phbs() function is also a core_initcall so we
can't guarantee ordering between the creation of pci_controllers and
the creation of pci_dn's which require a pci_controller. We could use
the postcore, or core_sync initcall levels, but it's cleaner to just
move the pci_dn setup into the per-PHB inits which occur inside of
.discover_phb() for these platforms. This brings the boot-time path in
line with the PHB hotplug path that is used for pseries DLPAR
operations too.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
[mpe: Squash powermac & maple in to avoid breakage those platforms,
      convert memblock allocs to use kmalloc to avoid warnings]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201103043523.916109-2-oohall@gmail.com
2021-02-09 00:01:05 +11:00
Christoph Hellwig
8b1b4eccb9 powerpc/powernv: remove get_cxl_module
The static inline get_cxl_module function is entirely unused since commit
8bf6b91a51 ("Revert "powerpc/powernv: Add support for the cxl kernel
api on the real phb"), so remove it.

Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
2021-02-08 12:20:17 +01:00
Oliver O'Halloran
24b4c6b1a7 powerpc/powernv/pci: Drop pnv_phb->initialized
The pnv_phb->initialized flag is an odd beast. It was added back in 2012 in
commit db1266c852 ("powerpc/powernv: Skip check on PE if necessary") to
allow devices to be enabled even if the device had not yet been assigned to
a PE. Allowing the device to be enabled before the PE is configured may
cause spurious EEH events since none of the IOMMU context has been setup.

I'm not entirely sure why this was ever necessary. My best guess is that it
was an workaround for a bug or some other undesireable behaviour from the
PCI core. Either way, it's unnecessary now since as of commit dc3d8f85bb
("powerpc/powernv/pci: Re-work bus PE configuration") we can guarantee that
the PE will be configured before the PCI core will allow drivers to bind to
the device.

It's also worth pointing out that the ->initialized flag is only set in
pnv_pci_ioda_create_dbgfs(). That function has its entire body wrapped
in #ifdef CONFIG_DEBUG_FS. As a result, for kernels built without debugfs
(i.e. petitboot) the other checks in pnv_pci_enable_device_hook() are
bypassed entirely.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200902013657.1753830-1-oohall@gmail.com
2021-01-31 22:35:51 +11:00
Qian Cai
c9790fb5df powerpc/powernv/pci: fix a RCU-list lock
It is unsafe to traverse tbl->it_group_list without the RCU read lock.

 WARNING: suspicious RCU usage
 5.7.0-rc4-next-20200508 #1 Not tainted
 -----------------------------
 arch/powerpc/platforms/powernv/pci-ioda-tce.c:355 RCU-list traversed in non-reader section!!

 other info that might help us debug this:

 rcu_scheduler_active = 2, debug_locks = 1
 3 locks held by qemu-kvm/4305:
  #0: c000000bc3fe6988 (&container->group_lock){++++}-{3:3}, at: vfio_fops_unl_ioctl+0x108/0x410 [vfio]
  #1: c00800000fcc7400 (&vfio.iommu_drivers_lock){+.+.}-{3:3}, at: vfio_fops_unl_ioctl+0x148/0x410 [vfio]
  #2: c000000bc3fe4d68 (&container->lock){+.+.}-{3:3}, at: tce_iommu_attach_group+0x3c/0x4f0 [vfio_iommu_spapr_tce]

 stack backtrace:
 CPU: 4 PID: 4305 Comm: qemu-kvm Not tainted 5.7.0-rc4-next-20200508 #1
 Call Trace:
 [c0000010f29afa60] [c0000000007154c8] dump_stack+0xfc/0x174 (unreliable)
 [c0000010f29afab0] [c0000000001d8ff0] lockdep_rcu_suspicious+0x140/0x164
 [c0000010f29afb30] [c0000000000dae2c] pnv_pci_unlink_table_and_group+0x11c/0x200
 [c0000010f29afb70] [c0000000000d4a34] pnv_pci_ioda2_unset_window+0xc4/0x190
 [c0000010f29afbf0] [c0000000000d4b4c] pnv_ioda2_take_ownership+0x4c/0xd0
 [c0000010f29afc30] [c00800000fd60ee0] tce_iommu_attach_group+0x2c8/0x4f0 [vfio_iommu_spapr_tce]
 [c0000010f29afcd0] [c00800000fcc11a0] vfio_fops_unl_ioctl+0x238/0x410 [vfio]
 [c0000010f29afd50] [c0000000005430a8] ksys_ioctl+0xd8/0x130
 [c0000010f29afda0] [c000000000543128] sys_ioctl+0x28/0x40
 [c0000010f29afdc0] [c000000000038af4] system_call_exception+0x114/0x1e0
 [c0000010f29afe20] [c00000000000c8f0] system_call_common+0xf0/0x278

Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200510051347.1906-1-cai@lca.pw
2021-01-31 22:35:49 +11:00
Cédric Le Goater
9dd31b1137 powerpc/vas: Fix IRQ name allocation
The VAS device allocates a generic interrupt to handle page faults but
the IRQ name doesn't show under /proc. This is because it's on
stack. Allocate the name.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Haren Myneni <haren@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201212142707.2102141-1-clg@kaod.org
2021-01-30 11:39:31 +11:00
Al Viro
f2485a2dc9 elf_prstatus: collect the common part (everything before pr_reg) into a struct
Preparations to doing i386 compat elf_prstatus sanely - rather than duplicating
the beginning of compat_elf_prstatus, take these fields into a separate
structure (compat_elf_prstatus_common), so that it could be reused.  Due to
the incestous relationship between binfmt_elf.c and compat_binfmt_elf.c we
need the same shape change done to native struct elf_prstatus, gathering the
fields prior to pr_reg into a new structure (struct elf_prstatus_common).

Fortunately, offset of pr_reg is always a multiple of 16 with no padding
right before it, so it's possible to turn all the stuff prior to it into
a single member without disturbing the layout.

[build fix from Geert Uytterhoeven folded in]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2021-01-06 08:38:29 -05:00
Linus Torvalds
8a5be36b93 powerpc updates for 5.11
- Switch to the generic C VDSO, as well as some cleanups of our VDSO
    setup/handling code.
 
  - Support for KUAP (Kernel User Access Prevention) on systems using the hashed
    page table MMU, using memory protection keys.
 
  - Better handling of PowerVM SMT8 systems where all threads of a core do not
    share an L2, allowing the scheduler to make better scheduling decisions.
 
  - Further improvements to our machine check handling.
 
  - Show registers when unwinding interrupt frames during stack traces.
 
  - Improvements to our pseries (PowerVM) partition migration code.
 
  - Several series from Christophe refactoring and cleaning up various parts of
    the 32-bit code.
 
  - Other smaller features, fixes & cleanups.
 
 Thanks to:
   Alan Modra, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Ard
   Biesheuvel, Athira Rajeev, Balamuruhan S, Bill Wendling, Cédric Le Goater,
   Christophe Leroy, Christophe Lombard, Colin Ian King, Daniel Axtens, David
   Hildenbrand, Frederic Barrat, Ganesh Goudar, Gautham R. Shenoy, Geert
   Uytterhoeven, Giuseppe Sacco, Greg Kurz, Harish, Jan Kratochvil, Jordan
   Niethe, Kaixu Xia, Laurent Dufour, Leonardo Bras, Madhavan Srinivasan, Mahesh
   Salgaonkar, Mathieu Desnoyers, Nathan Lynch, Nicholas Piggin, Oleg Nesterov,
   Oliver O'Halloran, Oscar Salvador, Po-Hsu Lin, Qian Cai, Qinglang Miao, Randy
   Dunlap, Ravi Bangoria, Sachin Sant, Sandipan Das, Sebastian Andrzej Siewior ,
   Segher Boessenkool, Srikar Dronamraju, Tyrel Datwyler, Uwe Kleine-König,
   Vincent Stehlé, Youling Tang, Zhang Xiaoxu.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl/bURITHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgEzBEAC1Vwibcog2P9rkJPb0q3UGWSYSx25V
 h/LwwxtM9Tm14j/LZsSgkOgIsfMaWEBIw/8D4efQ7AX9aFo+R0c2DdQMx1MG5MXz
 gZk58+l3LwId6h9+OrwurpEW+ZmURLAtGMSyFdkeiZ3/XTnkbf1XnewC0QWQe56a
 EGLmjx1MFl45jspoy7UIUXsXoNJIfflEKhrgUzSUh8X2eLmvB9ws6A4BXxbVzyZl
 lZv3+uWimU2pFgdkB9jOCxoG4zFEr2o5ovLHG7zCCVo5JoXmTPQ5cMVBraH206ms
 +5vCmu4qI8uP5UlZW/mZfhrtDiMdHdQqqFOaQwVlOmoUbU6L6E6rxm3iVnov2Bbi
 iUgxoeJDxAb2cM2EWFK6oWVgr7+NkwvXM1IG8xtprhHrCdnC9r+psQr/dswb3LSg
 MJ7u/RCq3uixy2kWP8E0NEHw7ngQZ/ZKPqzfnmIWOC7tYUxgaL02I8Ff9/ZXAI2J
 CnmqFYOjrimHkcwXGOtKkXNvfU0DiL97qpK2AQNWElE8+bUUmpw+ltUrsdSycYmv
 Afc4WIcVrTA+a9laSLgjdZbolbNSa3p+cMIYdPrVx9g+xqygbxIKv+EDGNv1WHfD
 GU1gmohMY+ZkMOvFRMi8LAsEm0DH/etWE0py/8uyxDYKnGyD1Ur6452DStkmgGb2
 azmcaOyLdb+HXA==
 =Ga3K
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Switch to the generic C VDSO, as well as some cleanups of our VDSO
   setup/handling code.

 - Support for KUAP (Kernel User Access Prevention) on systems using the
   hashed page table MMU, using memory protection keys.

 - Better handling of PowerVM SMT8 systems where all threads of a core
   do not share an L2, allowing the scheduler to make better scheduling
   decisions.

 - Further improvements to our machine check handling.

 - Show registers when unwinding interrupt frames during stack traces.

 - Improvements to our pseries (PowerVM) partition migration code.

 - Several series from Christophe refactoring and cleaning up various
   parts of the 32-bit code.

 - Other smaller features, fixes & cleanups.

Thanks to: Alan Modra, Alexey Kardashevskiy, Andrew Donnellan, Aneesh
Kumar K.V, Ard Biesheuvel, Athira Rajeev, Balamuruhan S, Bill Wendling,
Cédric Le Goater, Christophe Leroy, Christophe Lombard, Colin Ian King,
Daniel Axtens, David Hildenbrand, Frederic Barrat, Ganesh Goudar,
Gautham R. Shenoy, Geert Uytterhoeven, Giuseppe Sacco, Greg Kurz,
Harish, Jan Kratochvil, Jordan Niethe, Kaixu Xia, Laurent Dufour,
Leonardo Bras, Madhavan Srinivasan, Mahesh Salgaonkar, Mathieu
Desnoyers, Nathan Lynch, Nicholas Piggin, Oleg Nesterov, Oliver
O'Halloran, Oscar Salvador, Po-Hsu Lin, Qian Cai, Qinglang Miao, Randy
Dunlap, Ravi Bangoria, Sachin Sant, Sandipan Das, Sebastian Andrzej
Siewior , Segher Boessenkool, Srikar Dronamraju, Tyrel Datwyler, Uwe
Kleine-König, Vincent Stehlé, Youling Tang, and Zhang Xiaoxu.

* tag 'powerpc-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (304 commits)
  powerpc/32s: Fix cleanup_cpu_mmu_context() compile bug
  powerpc: Add config fragment for disabling -Werror
  powerpc/configs: Add ppc64le_allnoconfig target
  powerpc/powernv: Rate limit opal-elog read failure message
  powerpc/pseries/memhotplug: Quieten some DLPAR operations
  powerpc/ps3: use dma_mapping_error()
  powerpc: force inlining of csum_partial() to avoid multiple csum_partial() with GCC10
  powerpc/perf: Fix Threshold Event Counter Multiplier width for P10
  powerpc/mm: Fix hugetlb_free_pmd_range() and hugetlb_free_pud_range()
  KVM: PPC: Book3S HV: Fix mask size for emulated msgsndp
  KVM: PPC: fix comparison to bool warning
  KVM: PPC: Book3S: Assign boolean values to a bool variable
  powerpc: Inline setup_kup()
  powerpc/64s: Mark the kuap/kuep functions non __init
  KVM: PPC: Book3S HV: XIVE: Add a comment regarding VP numbering
  powerpc/xive: Improve error reporting of OPAL calls
  powerpc/xive: Simplify xive_do_source_eoi()
  powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_EOI_FW
  powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_MASK_FW
  powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_SHIFT_BUG
  ...
2020-12-17 13:34:25 -08:00
Andrew Donnellan
c88017cf2a powerpc/powernv: Rate limit opal-elog read failure message
Sometimes we can't read an error log from OPAL, and we print an error
message accordingly. But the OPAL userspace tools seem to like retrying a
lot, in which case we flood the kernel log with a lot of messages.

Change pr_err() to pr_err_ratelimited() to help with this.

Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201211021140.28402-1-ajd@linux.ibm.com
2020-12-15 22:53:27 +11:00
Nicholas Piggin
59d512e437 powerpc/64: irq replay remove decrementer overflow check
This is way to catch some cases of decrementer overflow, when the
decrementer has underflowed an odd number of times, while MSR[EE] was
disabled.

With a typical small decrementer, a timer that fires when MSR[EE] is
disabled will be "lost" if MSR[EE] remains disabled for between 4.3 and
8.6 seconds after the timer expires. In any case, the decrementer
interrupt would be taken at 8.6 seconds and the timer would be found at
that point.

So this check is for catching extreme latency events, and it prevents
those latencies from being a further few seconds long.  It's not obvious
this is a good tradeoff. This is already a watchdog magnitude event and
that situation is not improved a significantly with this check. For
large decrementers, it's useless.

Therefore remove this check, which avoids a mftb when enabling hard
disabled interrupts (e.g., when enabling after coming from hardware
interrupt handlers). Perhaps more importantly, it also removes the
clunky MSR[EE] vs PACA_IRQ_HARD_DIS incoherency in soft-interrupt replay
which simplifies the code.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201107014336.2337337-1-npiggin@gmail.com
2020-12-09 23:48:15 +11:00
Jordan Niethe
250ad7a45b powerpc/powernv/idle: Restore CIABR after idle for Power9
On Power9, CIABR is lost after idle. This means that instruction
breakpoints set by xmon which use CIABR do not work. Fix this by
restoring CIABR after idle.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201207010519.15597-2-jniethe5@gmail.com
2020-12-07 23:26:02 +11:00
Alexey Kardashevskiy
b1198a8823 powerpc/powernv/npu: Do not attempt NPU2 setup on POWER8NVL NPU
We execute certain NPU2 setup code (such as mapping an LPID to a device
in NPU2) unconditionally if an Nvlink bridge is detected. However this
cannot succeed on POWER8NVL machines and errors appear in dmesg. This is
harmless as skiboot returns an error and the only place we check it is
vfio-pci but that code does not get called on P8+ either.

This adds a check if pnv_npu2_xxx helpers are called on a machine with
NPU2 which initializes pnv_phb::npu in pnv_npu2_init();
pnv_phb::npu==NULL on POWER8/NVL (Naples).

While at this, fix NULL derefencing in pnv_npu_peers_take_ownership/
pnv_npu_peers_release_ownership which occurs when GPUs on mentioned P8s
cause EEH which happens if "vfio-pci" disables devices using
the D3 power state; the vfio-pci's disable_idle_d3 module parameter
controls this and must be set on Naples. The EEH handling clears
the entire pnv_ioda_pe struct in pnv_ioda_free_pe() hence
the NULL derefencing. We cannot recover from that but at least we stop
crashing.

Tested on
- POWER9 pvr=004e1201, Ubuntu 19.04 host, Ubuntu 18.04 vm,
  NVIDIA GV100 10de:1db1 driver 418.39
- POWER8 pvr=004c0100, RHEL 7.6 host, Ubuntu 16.10 vm,
  NVIDIA P100 10de:15f9 driver 396.47

Fixes: 1b785611e1 ("powerpc/powernv/npu: Add release_ownership hook")
Cc: stable@vger.kernel.org # 5.0
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201122073828.15446-1-aik@ozlabs.ru
2020-12-04 16:55:31 +11:00
Oliver O'Halloran
6c58b1b41b powernv/pci: Print an error when device enable is blocked
If the platform decides to block enabling the device nothing is printed
currently. This can lead to some confusion since the dmesg output will
usually print an error with no context e.g.

	e1000e: probe of 0022:01:00.0 failed with error -22

This shouldn't be spammy since pci_enable_device() already prints a
messages when it succeeds.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200409061337.9187-1-oohall@gmail.com
2020-12-04 01:01:34 +11:00
Christophe Lombard
19b311ca51 ocxl: Initiate a TLB invalidate command
When a TLB Invalidate is required for the Logical Partition, the following
sequence has to be performed:

1. Load MMIO ATSD AVA register with the necessary value, if required.
2. Write the MMIO ATSD launch register to initiate the TLB Invalidate
command.
3. Poll the MMIO ATSD status register to determine when the TLB Invalidate
   has been completed.

Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201125155013.39955-3-clombard@linux.vnet.ibm.com
2020-12-04 01:01:30 +11:00
Christophe Lombard
fc1347b5fe ocxl: Assign a register set to a Logical Partition
Platform specific function to assign a register set to a Logical Partition.
The "ibm,mmio-atsd" property, provided by the firmware, contains the 16
base ATSD physical addresses (ATSD0 through ATSD15) of the set of MMIO
registers (XTS MMIO ATSDx LPARID/AVA/launch/status register).

For the time being, the ATSD0 set of registers is used by default.

Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201125155013.39955-2-clombard@linux.vnet.ibm.com
2020-12-04 01:01:30 +11:00
Nicholas Piggin
f4b239e4c6 powerpc/64s/powernv: Ratelimit harmless HMI error printing
Harmless HMI errors can be triggered by guests in some cases, and don't
contain much useful information anyway. Ratelimit these to avoid
flooding the console/logs.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Use dedicated ratelimit state, not printk_ratelimit()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201128070728.825934-6-npiggin@gmail.com
2020-12-04 01:01:23 +11:00
Nicholas Piggin
a1ee281170 powerpc/64s/powernv: Fix memory corruption when saving SLB entries on MCE
This can be hit by an HPT guest running on an HPT host and bring down
the host, so it's quite important to fix.

Fixes: 7290f3b3d3 ("powerpc/64s/powernv: machine check dump SLB contents")
Cc: stable@vger.kernel.org # v5.4+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201128070728.825934-2-npiggin@gmail.com
2020-12-02 23:16:40 +11:00
Nicholas Piggin
01b0f0eae0 powerpc/64s: Trim offlined CPUs from mm_cpumasks
When offlining a CPU, powerpc/64s does not flush TLBs, rather it just
leaves the CPU set in mm_cpumasks, so it continues to receive TLBIEs
to manage its TLBs.

However the exit_flush_lazy_tlbs() function expects that after
returning, all CPUs (except self) have flushed TLBs for that mm, in
which case TLBIEL can be used for this flush. This breaks for offline
CPUs because they don't get the IPI to flush their TLB. This can lead
to stale translations.

Fix this by clearing the CPU from mm_cpumasks, then flushing all TLBs
before going offline.

These offlined CPU bits stuck in the cpumask also prevents the cpumask
from being trimmed back to local mode, which means continual broadcast
IPIs or TLBIEs are needed for TLB flushing. This patch prevents that
situation too.

A cast of many were involved in working this out, but in particular
Milton, Aneesh, Paul made key discoveries.

Fixes: 0cef77c779 ("powerpc/64s/radix: flush remote CPUs out of single-threaded mm_cpumask")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Debugged-by: Milton Miller <miltonm@us.ibm.com>
Debugged-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Debugged-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201126102530.691335-5-npiggin@gmail.com
2020-11-27 00:10:39 +11:00
Michael Ellerman
20fa40b147 Merge branch 'fixes' into next
Merge our fixes branch, in particular to bring in the changes for the
entry/uaccess flush.
2020-11-25 23:17:31 +11:00
Daniel Axtens
da631f7fd6 powerpc/64s: rename pnv|pseries_setup_rfi_flush to _setup_security_mitigations
pseries|pnv_setup_rfi_flush already does the count cache flush setup, and
we just added entry and uaccess flushes. So the name is not very accurate
any more. In both platforms we then also immediately setup the STF flush.

Rename them to _setup_security_mitigations and fold the STF flush in.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2020-11-19 23:47:25 +11:00
Nicholas Piggin
9a32a7e78b powerpc/64s: flush L1D after user accesses
IBM Power9 processors can speculatively operate on data in the L1 cache
before it has been completely validated, via a way-prediction mechanism. It
is not possible for an attacker to determine the contents of impermissible
memory using this method, since these systems implement a combination of
hardware and software security measures to prevent scenarios where
protected data could be leaked.

However these measures don't address the scenario where an attacker induces
the operating system to speculatively execute instructions using data that
the attacker controls. This can be used for example to speculatively bypass
"kernel user access prevention" techniques, as discovered by Anthony
Steinhauser of Google's Safeside Project. This is not an attack by itself,
but there is a possibility it could be used in conjunction with
side-channels or other weaknesses in the privileged code to construct an
attack.

This issue can be mitigated by flushing the L1 cache between privilege
boundaries of concern. This patch flushes the L1 cache after user accesses.

This is part of the fix for CVE-2020-4788.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2020-11-19 23:47:18 +11:00
Nicholas Piggin
f79643787e powerpc/64s: flush L1D on kernel entry
IBM Power9 processors can speculatively operate on data in the L1 cache
before it has been completely validated, via a way-prediction mechanism. It
is not possible for an attacker to determine the contents of impermissible
memory using this method, since these systems implement a combination of
hardware and software security measures to prevent scenarios where
protected data could be leaked.

However these measures don't address the scenario where an attacker induces
the operating system to speculatively execute instructions using data that
the attacker controls. This can be used for example to speculatively bypass
"kernel user access prevention" techniques, as discovered by Anthony
Steinhauser of Google's Safeside Project. This is not an attack by itself,
but there is a possibility it could be used in conjunction with
side-channels or other weaknesses in the privileged code to construct an
attack.

This issue can be mitigated by flushing the L1 cache between privilege
boundaries of concern. This patch flushes the L1 cache on kernel entry.

This is part of the fix for CVE-2020-4788.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2020-11-19 23:47:15 +11:00
David Hildenbrand
0bd4b96d99 powernv/memtrace: don't abuse memory hot(un)plug infrastructure for memory allocations
Let's use alloc_contig_pages() for allocating memory and remove the
linear mapping manually via arch_remove_linear_mapping(). Mark all pages
PG_offline, such that they will definitely not get touched - e.g.,
when hibernating. When freeing memory, try to revert what we did.

The original idea was discussed in:
 https://lkml.kernel.org/r/48340e96-7e6b-736f-9e23-d3111b915b6e@redhat.com

This is similar to CONFIG_DEBUG_PAGEALLOC handling on other
architectures, whereby only single pages are unmapped from the linear
mapping. Let's mimic what memory hot(un)plug would do with the linear
mapping.

We now need MEMORY_HOTPLUG and CONTIG_ALLOC as dependencies. Add a TODO
that we want to use __GFP_ZERO for clearing once alloc_contig_pages()
understands that.

Tested with in QEMU/TCG with 10 GiB of main memory:
  [root@localhost ~]# echo 0x40000000 > /sys/kernel/debug/powerpc/memtrace/enable
  [  105.903043][ T1080] memtrace: Allocated trace memory on node 0 at 0x0000000080000000
  [root@localhost ~]# echo 0x40000000 > /sys/kernel/debug/powerpc/memtrace/enable
  [  145.042493][ T1080] radix-mmu: Mapped 0x0000000080000000-0x00000000c0000000 with 64.0 KiB pages
  [  145.049019][ T1080] memtrace: Freed trace memory back on node 0
  [  145.333960][ T1080] memtrace: Allocated trace memory on node 0 at 0x0000000080000000
  [root@localhost ~]# echo 0x80000000 > /sys/kernel/debug/powerpc/memtrace/enable
  [  213.606916][ T1080] radix-mmu: Mapped 0x0000000080000000-0x00000000c0000000 with 64.0 KiB pages
  [  213.613855][ T1080] memtrace: Freed trace memory back on node 0
  [  214.185094][ T1080] memtrace: Allocated trace memory on node 0 at 0x0000000080000000
  [root@localhost ~]# echo 0x100000000 > /sys/kernel/debug/powerpc/memtrace/enable
  [  234.874872][ T1080] radix-mmu: Mapped 0x0000000080000000-0x0000000100000000 with 64.0 KiB pages
  [  234.886974][ T1080] memtrace: Freed trace memory back on node 0
  [  234.890153][ T1080] memtrace: Failed to allocate trace memory on node 0
  [root@localhost ~]# echo 0x40000000 > /sys/kernel/debug/powerpc/memtrace/enable
  [  259.490196][ T1080] memtrace: Allocated trace memory on node 0 at 0x0000000080000000

I also made sure allocated memory is properly zeroed.

Note 1: We currently won't be allocating from ZONE_MOVABLE - because our
	pages are not movable. However, as we don't run with any memory
	hot(un)plug mechanism around, we could make an exception to
	increase the chance of allocations succeeding.

Note 2: PG_reserved isn't sufficient. E.g., kernel_page_present() used
	along PG_reserved in hibernation code will always return "true"
	on powerpc, resulting in the pages getting touched. It's too
	generic - e.g., indicates boot allocations.

Note 3: For now, we keep using memory_block_size_bytes() as minimum
	granularity.

Suggested-by: Michal Hocko <mhocko@kernel.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201111145322.15793-9-david@redhat.com
2020-11-19 16:57:00 +11:00
David Hildenbrand
d6718941a2 powerpc/powernv/memtrace: Fix crashing the kernel when enabling concurrently
It's very easy to crash the kernel right now by simply trying to
enable memtrace concurrently, hammering on the "enable" interface

loop.sh:
  #!/bin/bash

  dmesg --console-off

  while true; do
          echo 0x40000000 > /sys/kernel/debug/powerpc/memtrace/enable
  done

[root@localhost ~]# loop.sh &
[root@localhost ~]# loop.sh &

Resulting quickly in a kernel crash. Let's properly protect using a
mutex.

Fixes: 9d5171a8f2 ("powerpc/powernv: Enable removal of memory for in memory tracing")
Cc: stable@vger.kernel.org# v4.14+
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201111145322.15793-3-david@redhat.com
2020-11-19 16:56:58 +11:00
David Hildenbrand
c74cf7a3d5 powerpc/powernv/memtrace: Don't leak kernel memory to user space
We currently leak kernel memory to user space, because memory
offlining doesn't do any implicit clearing of memory and we are
missing explicit clearing of memory.

Let's keep it simple and clear pages before removing the linear
mapping.

Reproduced in QEMU/TCG with 10 GiB of main memory:
  [root@localhost ~]# dd obs=9G if=/dev/urandom of=/dev/null
  [... wait until "free -m" used counter no longer changes and cancel]
  19665802+0 records in
  1+0 records out
  9663676416 bytes (9.7 GB, 9.0 GiB) copied, 135.548 s, 71.3 MB/s
  [root@localhost ~]# cat /sys/devices/system/memory/block_size_bytes
  40000000
  [root@localhost ~]# echo 0x40000000 > /sys/kernel/debug/powerpc/memtrace/enable
  [  402.978663][ T1086] page:000000001bc4bc74 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x24900
  [  402.980063][ T1086] flags: 0x7ffff000001000(reserved)
  [  402.980415][ T1086] raw: 007ffff000001000 c00c000000924008 c00c000000924008 0000000000000000
  [  402.980627][ T1086] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
  [  402.980845][ T1086] page dumped because: unmovable page
  [  402.989608][ T1086] Offlined Pages 16384
  [  403.324155][ T1086] memtrace: Allocated trace memory on node 0 at 0x0000000200000000

Before this patch:
  [root@localhost ~]# hexdump -C /sys/kernel/debug/powerpc/memtrace/00000000/trace  | head
  00000000  c8 25 72 51 4d 26 36 c5  5c c2 56 15 d5 1a cd 10  |.%rQM&6.\.V.....|
  00000010  19 b9 50 b2 cb e3 60 b8  ec 0a f3 ec 4b 3c 39 f0  |..P...`.....K<9.|$
  00000020  4e 5a 4c cf bd 26 19 ff  37 79 13 67 24 b7 b8 57  |NZL..&..7y.g$..W|$
  00000030  98 3e f5 be 6f 14 6a bd  a4 52 bc 6e e9 e0 c1 5d  |.>..o.j..R.n...]|$
  00000040  76 b3 ae b5 88 d7 da e3  64 23 85 2c 10 88 07 b6  |v.......d#.,....|$
  00000050  9a d8 91 de f7 50 27 69  2e 64 9c 6f d3 19 45 79  |.....P'i.d.o..Ey|$
  00000060  6a 6f 8a 61 71 19 1f c7  f1 df 28 26 ca 0f 84 55  |jo.aq.....(&...U|$
  00000070  01 3f be e4 e2 e1 da ff  7b 8c 8e 32 37 b4 24 53  |.?......{..27.$S|$
  00000080  1b 70 30 45 56 e6 8c c4  0e b5 4c fb 9f dd 88 06  |.p0EV.....L.....|$
  00000090  ef c4 18 79 f1 60 b1 5c  79 59 4d f4 36 d7 4a 5c  |...y.`.\yYM.6.J\|$

After this patch:
  [root@localhost ~]# hexdump -C /sys/kernel/debug/powerpc/memtrace/00000000/trace  | head
  00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
  *
  40000000

Fixes: 9d5171a8f2 ("powerpc/powernv: Enable removal of memory for in memory tracing")
Cc: stable@vger.kernel.org # v4.14+
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201111145322.15793-2-david@redhat.com
2020-11-19 16:56:57 +11:00
Kaixu Xia
027717a45c powerpc/powernv/sriov: fix unsigned int win compared to less than zero
Fix coccicheck warning:

  arch/powerpc/platforms/powernv/pci-sriov.c:443:7-10:
  WARNING: Unsigned expression compared with zero: win < 0

  arch/powerpc/platforms/powernv/pci-sriov.c:462:7-10:
  WARNING: Unsigned expression compared with zero: win < 0

Fixes: 39efc03e3e ("powerpc/powernv/sriov: Move M64 BAR allocation into a helper")
Reported-by: Tosk Robot <tencent_os_robot@tencent.com>
Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1605007170-22171-1-git-send-email-kaixuxia@tencent.com
2020-11-19 16:56:54 +11:00
Linus Torvalds
b6f96e75ae powerpc fixes for 5.10 #2
A fix for undetected data corruption on Power9 Nimbus <= DD2.1 in the emulation
 of VSX loads. The affected CPUs were not widely available.
 
 Two fixes for machine check handling in guests under PowerVM.
 
 A fix for our recent changes to SMP setup, when CONFIG_CPUMASK_OFFSTACK=y.
 
 Three fixes for races in the handling of some of our powernv sysfs attributes.
 
 One change to remove TM from the set of Power10 CPU features.
 
 A couple of other minor fixes.
 
 Thanks to:
   Aneesh Kumar K.V, Christophe Leroy, Ganesh Goudar, Jordan Niethe, Mahesh
   Salgaonkar, Michael Neuling, Oliver O'Halloran, Qian Cai, Srikar Dronamraju,
   Vasant Hegde.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl+UASATHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgAbpD/4nN+0cM7M2iCPL1cqd3nmzziJ/tXsq
 1ZxU+2B+cU+pUy4LHgtH1arJb85iVqFR3cC9j705uo6kO9vqsppTj2752srSEioM
 er1UxzRza/lNZaVGaywCD9oApayPkzg74IbenXDDduI+oWvQuvWZbSBskJfdARg2
 7kBFhV7w8sUGa8e/JS1FITndPPO9tMurk+s0FgP4cjsGM/iTW8eUfGcOFsOlc+uZ
 tybZUCY/G4E77etE1KHVjw8IcwSh0P/ibQ6nLnIFpOtPCRs5tTqbuARYN8U55M9H
 0ebt3sv5QTyNvZY0bm5p9ZsC1AKyciUO5SWPNEEwzOdyYVQjlofHj3UvcHKW2D1t
 ymbglsdQeXM5uuexa23ape1e3UuwW1JhsHTQLnCbI3C/snkMA3ZegVsS66GIMXn2
 C0gef0RzQ7HrvwUEl3V/b6W87LL6NpGU6RRWyva7/0pLMZkMtKpGgWg/hVzPRTcC
 6yoUVWNN5p7pZu6VDkoqdJuw7hQPyo7t5Kj71G+/SdH5engcFjnbBxDiEge/4a7+
 RluvswpCn9SyyEvS2BL262LSPq8iYH4+at6n+uLbonZSY0P9Z5zSpPpkNJkyTnwz
 GXj1DBSEOBDZQ7pFeoCFOeYoo1Yk5EQpmA7YuxnZkzOdxFpIUgFU1wdRemzVZw2o
 PTw5VHoRgCmIsQ==
 =LMZv
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - A fix for undetected data corruption on Power9 Nimbus <= DD2.1 in the
   emulation of VSX loads. The affected CPUs were not widely available.

 - Two fixes for machine check handling in guests under PowerVM.

 - A fix for our recent changes to SMP setup, when
   CONFIG_CPUMASK_OFFSTACK=y.

 - Three fixes for races in the handling of some of our powernv sysfs
   attributes.

 - One change to remove TM from the set of Power10 CPU features.

 - A couple of other minor fixes.

Thanks to: Aneesh Kumar K.V, Christophe Leroy, Ganesh Goudar, Jordan
Niethe, Mahesh Salgaonkar, Michael Neuling, Oliver O'Halloran, Qian Cai,
Srikar Dronamraju, Vasant Hegde.

* tag 'powerpc-5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/pseries: Avoid using addr_to_pfn in real mode
  powerpc/uaccess: Don't use "m<>" constraint with GCC 4.9
  powerpc/eeh: Fix eeh_dev_check_failure() for PE#0
  powerpc/64s: Remove TM from Power10 features
  selftests/powerpc: Make alignment handler test P9N DD2.1 vector CI load workaround
  powerpc: Fix undetected data corruption with P9N DD2.1 VSX CI load emulation
  powerpc/powernv/dump: Handle multiple writes to ack attribute
  powerpc/powernv/dump: Fix race while processing OPAL dump
  powerpc/smp: Use GFP_ATOMIC while allocating tmp mask
  powerpc/smp: Remove unnecessary variable
  powerpc/mce: Avoid nmi_enter/exit in real mode on pseries hash
  powerpc/opal_elog: Handle multiple writes to ack attribute
2020-10-24 11:09:13 -07:00
Vasant Hegde
358ab796ce powerpc/powernv/dump: Handle multiple writes to ack attribute
Even though we use self removing sysfs helper, we still need
to make sure we do the final kobject delete conditionally.
sysfs_remove_file_self() will handle parallel calls to remove
the sysfs attribute file and returns true only in the caller
that removed the attribute file. The other parallel callers
are returned false. Do the final kobject delete checking
the return value of sysfs_remove_file_self().

Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201017164236.264713-1-hegdevasant@linux.vnet.ibm.com
2020-10-19 22:58:52 +11:00
Vasant Hegde
0a43ae3e2b powerpc/powernv/dump: Fix race while processing OPAL dump
Every dump reported by OPAL is exported to userspace through a sysfs
interface and notified using kobject_uevent(). The userspace daemon
(opal_errd) then reads the dump and acknowledges that the dump is
saved safely to disk. Once acknowledged the kernel removes the
respective sysfs file entry causing respective resources to be
released including kobject.

However it's possible the userspace daemon may already be scanning
dump entries when a new sysfs dump entry is created by the kernel.
User daemon may read this new entry and ack it even before kernel can
notify userspace about it through kobject_uevent() call. If that
happens then we have a potential race between
dump_ack_store->kobject_put() and kobject_uevent which can lead to
use-after-free of a kernfs object resulting in a kernel crash.

This patch fixes this race by protecting the sysfs file
creation/notification by holding a reference count on kobject until we
safely send kobject_uevent().

The function create_dump_obj() returns the dump object which if used
by caller function will end up in use-after-free problem again.
However, the return value of create_dump_obj() function isn't being
used today and there is no need as well. Hence change it to return
void to make this fix complete.

Fixes: c7e64b9ce0 ("powerpc/powernv Platform dump interface")
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201017164210.264619-1-hegdevasant@linux.vnet.ibm.com
2020-10-19 22:52:08 +11:00
Linus Torvalds
96685f8666 powerpc updates for 5.10
- A series from Nick adding ARCH_WANT_IRQS_OFF_ACTIVATE_MM & selecting it for
    powerpc, as well as a related fix for sparc.
 
  - Remove support for PowerPC 601.
 
  - Some fixes for watchpoints & addition of a new ptrace flag for detecting ISA
    v3.1 (Power10) watchpoint features.
 
  - A fix for kernels using 4K pages and the hash MMU on bare metal Power9
    systems with > 16TB of RAM, or RAM on the 2nd node.
 
  - A basic idle driver for shallow stop states on Power10.
 
  - Tweaks to our sched domains code to better inform the scheduler about the
    hardware topology on Power9/10, where two SMT4 cores can be presented by
    firmware as an SMT8 core.
 
  - A series doing further reworks & cleanups of our EEH code.
 
  - Addition of a filter for RTAS (firmware) calls done via sys_rtas(), to
    prevent root from overwriting kernel memory.
 
  - Other smaller features, fixes & cleanups.
 
 Thanks to:
   Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Athira Rajeev, Biwen
   Li, Cameron Berkenpas, Cédric Le Goater, Christophe Leroy, Christoph Hellwig,
   Colin Ian King, Daniel Axtens, David Dai, Finn Thain, Frederic Barrat, Gautham
   R. Shenoy, Greg Kurz, Gustavo Romero, Ira Weiny, Jason Yan, Joel Stanley,
   Jordan Niethe, Kajol Jain, Konrad Rzeszutek Wilk, Laurent Dufour, Leonardo
   Bras, Liu Shixin, Luca Ceresoli, Madhavan Srinivasan, Mahesh Salgaonkar,
   Nathan Lynch, Nicholas Mc Guire, Nicholas Piggin, Nick Desaulniers, Oliver
   O'Halloran, Pedro Miraglia Franco de Carvalho, Pratik Rajesh Sampat, Qian Cai,
   Qinglang Miao, Ravi Bangoria, Russell Currey, Satheesh Rajendran, Scott
   Cheloha, Segher Boessenkool, Srikar Dronamraju, Stan Johnson, Stephen Kitt,
   Stephen Rothwell, Thiago Jung Bauermann, Tyrel Datwyler, Vaibhav Jain,
   Vaidyanathan Srinivasan, Vasant Hegde, Wang Wensheng, Wolfram Sang, Yang
   Yingliang, zhengbin.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl+JBQoTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgJJAD/0e3tsFP+9rFlxKSJlDcMW3w7kXDRXE
 tG40F1ubYFLU8wtFVR0De3njTRsz5HyaNU6SI8CwPq48mCa7OFn1D1OeHonHXDX9
 w6v3GE2S1uXXQnjm+czcfdjWQut0IwWBLx007/S23WcPff3Abc2irupKLNu+Gx29
 b/yxJHZSRJVX59jSV94HkdJS75mDHQ3oUOlFGXtuGcUZDufpD1ynRcQOjr0V/8JU
 F4WAblFSe7hiczHGqIvfhFVJ+OikEhnj2aEMAL8U7vxzrAZ7RErKCN9s/0Tf0Ktx
 FzNEFNLHZGqh+qNDpKKmM+RnaeO2Lcoc9qVn7vMHOsXPzx9F5LJwkI/DgPjtgAq/
 mFvGnQB/FapATnQeMluViC/qhEe5bQXLUfPP5i2+QOjK0QqwyFlUMgaVNfsY8jRW
 0Q/sNA72Opzst4WUTveCd4SOInlUuat09e5nLooCRLW7u7/jIiXNRSFNvpOiwkfF
 EcIPJsi6FUQ4SNbqpRSNEO9fK5JZrrUtmr0pg8I7fZhHYGcxEjqPR6IWCs3DTsak
 4/KhjhhTnP/IWJRw6qKAyNhEyEwpWqYZ97SIQbvSb1g/bS47AIdQdJRb0eEoRjhx
 sbbnnYFwPFkG4c1yQSIFanT9wNDQ2hFx/c/mRfbd7J+ordx9JsoqXjqrGuhsU/pH
 GttJLmkJ5FH+pQ==
 =akeX
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - A series from Nick adding ARCH_WANT_IRQS_OFF_ACTIVATE_MM & selecting
   it for powerpc, as well as a related fix for sparc.

 - Remove support for PowerPC 601.

 - Some fixes for watchpoints & addition of a new ptrace flag for
   detecting ISA v3.1 (Power10) watchpoint features.

 - A fix for kernels using 4K pages and the hash MMU on bare metal
   Power9 systems with > 16TB of RAM, or RAM on the 2nd node.

 - A basic idle driver for shallow stop states on Power10.

 - Tweaks to our sched domains code to better inform the scheduler about
   the hardware topology on Power9/10, where two SMT4 cores can be
   presented by firmware as an SMT8 core.

 - A series doing further reworks & cleanups of our EEH code.

 - Addition of a filter for RTAS (firmware) calls done via sys_rtas(),
   to prevent root from overwriting kernel memory.

 - Other smaller features, fixes & cleanups.

Thanks to: Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V,
Athira Rajeev, Biwen Li, Cameron Berkenpas, Cédric Le Goater, Christophe
Leroy, Christoph Hellwig, Colin Ian King, Daniel Axtens, David Dai, Finn
Thain, Frederic Barrat, Gautham R. Shenoy, Greg Kurz, Gustavo Romero,
Ira Weiny, Jason Yan, Joel Stanley, Jordan Niethe, Kajol Jain, Konrad
Rzeszutek Wilk, Laurent Dufour, Leonardo Bras, Liu Shixin, Luca
Ceresoli, Madhavan Srinivasan, Mahesh Salgaonkar, Nathan Lynch, Nicholas
Mc Guire, Nicholas Piggin, Nick Desaulniers, Oliver O'Halloran, Pedro
Miraglia Franco de Carvalho, Pratik Rajesh Sampat, Qian Cai, Qinglang
Miao, Ravi Bangoria, Russell Currey, Satheesh Rajendran, Scott Cheloha,
Segher Boessenkool, Srikar Dronamraju, Stan Johnson, Stephen Kitt,
Stephen Rothwell, Thiago Jung Bauermann, Tyrel Datwyler, Vaibhav Jain,
Vaidyanathan Srinivasan, Vasant Hegde, Wang Wensheng, Wolfram Sang, Yang
Yingliang, zhengbin.

* tag 'powerpc-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (228 commits)
  Revert "powerpc/pci: unmap legacy INTx interrupts when a PHB is removed"
  selftests/powerpc: Fix eeh-basic.sh exit codes
  cpufreq: powernv: Fix frame-size-overflow in powernv_cpufreq_reboot_notifier
  powerpc/time: Make get_tb() common to PPC32 and PPC64
  powerpc/time: Make get_tbl() common to PPC32 and PPC64
  powerpc/time: Remove get_tbu()
  powerpc/time: Avoid using get_tbl() and get_tbu() internally
  powerpc/time: Make mftb() common to PPC32 and PPC64
  powerpc/time: Rename mftbl() to mftb()
  powerpc/32s: Remove #ifdef CONFIG_PPC_BOOK3S_32 in head_book3s_32.S
  powerpc/32s: Rename head_32.S to head_book3s_32.S
  powerpc/32s: Setup the early hash table at all time.
  powerpc/time: Remove ifdef in get_dec() and set_dec()
  powerpc: Remove get_tb_or_rtc()
  powerpc: Remove __USE_RTC()
  powerpc: Tidy up a bit after removal of PowerPC 601.
  powerpc: Remove support for PowerPC 601
  powerpc: Remove PowerPC 601
  powerpc: Drop SYNC_601() ISYNC_601() and SYNC()
  powerpc: Remove CONFIG_PPC601_SYNC_FIX
  ...
2020-10-16 12:21:15 -07:00
David Hildenbrand
b611719978 mm/memory_hotplug: prepare passing flags to add_memory() and friends
We soon want to pass flags, e.g., to mark added System RAM resources.
mergeable.  Prepare for that.

This patch is based on a similar patch by Oscar Salvador:

https://lkml.kernel.org/r/20190625075227.15193-3-osalvador@suse.de

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Juergen Gross <jgross@suse.com> # Xen related part
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Pingfan Liu <kernelfans@gmail.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Libor Pechacek <lpechacek@suse.cz>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Leonardo Bras <leobras.c@gmail.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Julien Grall <julien@xen.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Roger Pau Monné <roger.pau@citrix.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Link: https://lkml.kernel.org/r/20200911103459.10306-5-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:18 -07:00
Aneesh Kumar K.V
d4263b12a1 powerpc/opal_elog: Handle multiple writes to ack attribute
Even though we use self removing sysfs helper, we still need
to make sure we do the final kobject delete conditionally.
sysfs_remove_file_self() will handle parallel calls to remove
the sysfs attribute file and returns true only in the caller
that removed the attribute file. The other parallel callers
are returned false. Do the final kobject delete checking
the return value of sysfs_remove_file_self().

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201014064813.109515-1-aneesh.kumar@linux.ibm.com
2020-10-16 14:25:12 +11:00
Oliver O'Halloran
35d64734b6 powerpc/eeh: Clean up PE addressing
When support for EEH on PowerNV was added a lot of pseries specific code
was made "generic" and some of the quirks of pseries EEH came along for the
ride. One of the stranger quirks is eeh_pe containing two types of PE
address: pe->addr and pe->config_addr. There reason for this appears to be
historical baggage rather than any real requirements.

On pseries EEH PEs are manipulated using RTAS calls. Each EEH RTAS call
takes a "PE configuration address" as an input which is used to identify
which EEH PE is being manipulated by the call. When initialising the EEH
state for a device the first thing we need to do is determine the
configuration address for the PE which contains the device so we can enable
EEH on that PE. This process is outlined in PAPR which is the modern
(i.e post-2003) FW specification for pseries. However, EEH support was
first described in the pSeries RISC Platform Architecture (RPA) and
although they are mostly compatible EEH is one of the areas where they are
not.

The major difference is that RPA doesn't actually have the concept of a PE.
On RPA systems the EEH RTAS calls are done on a per-device basis using the
same config_addr that would be passed to the RTAS functions to access PCI
config space (e.g. ibm,read-pci-config). The config_addr is not identical
since the function and config register offsets of the config_addr must be
set to zero. EEH operations being done on a per-device basis doesn't make a
whole lot of sense when you consider how EEH was implemented on legacy PCI
systems.

For legacy PCI(-X) systems EEH was implemented using special PCI-PCI
bridges which contained logic to detect errors and freeze the secondary
bus when one occurred. This means that the EEH enabled state is shared
among all devices behind that EEH bridge. As a result there's no way to
implement the per-device control required for the semantics specified by
RPA. It can be made to work if we assume that a separate EEH bridge exists
for each EEH capable PCI slot and there are no bridges behind those slots.
However, RPA also specifies the ibm,configure-bridge RTAS call for
re-initalising bridges behind EEH capable slots after they are reset due
to an EEH event so that is probably not a valid assumption. This
incoherence was fixed in later PAPR, which succeeded RPA. Unfortunately,
since Linux EEH support seems to have been implemented based on the RPA
spec some of the legacy assumptions were carried over (probably for POWER4
compatibility).

The fix made in PAPR was the introduction of the "PE" concept and
redefining the EEH RTAS calls (set-eeh-option, reset-slot, etc) to operate
on a per-PE basis so all devices behind an EEH bride would share the same
EEH state. The "config_addr" argument to the EEH RTAS calls became the
"PE_config_addr" and the OS was required to use the
ibm,get-config-addr-info RTAS call to find the correct PE address for the
device. When support for the new interfaces was added to Linux it was
implemented using something like:

At probe time:

	pdn->eeh_config_addr = rtas_config_addr(pdn);
	pdn->eeh_pe_config_addr = rtas_get_config_addr_info(pdn);

When performing an RTAS call:

	config_addr = pdn->eeh_config_addr;
	if (pdn->eeh_pe_config_addr)
		config_addr = pdn->eeh_pe_config_addr;

	rtas_call(..., config_addr, ...);

In other words, if the ibm,get-config-addr-info RTAS call is implemented
and returned a valid result we'd use that as the argument to the EEH
RTAS calls. If not, Linux would fall back to using the device's
config_addr. Over time these addresses have moved around going from pci_dn
to eeh_dev and finally into eeh_pe. Today the users look like this:

	config_addr = pe->config_addr;
	if (pe->addr)
		config_addr = pe->addr;

	rtas_call(..., config_addr, ...);

However, considering the EEH core always operates on a per-PE basis and
even on pseries the only per-device operation is the initial call to
ibm,set-eeh-option I'm not sure if any of this actually works on an RPA
system today. It doesn't make much sense to have the fallback address in
a generic structure either since the bulk of the code which reference it
is in pseries anyway.

The EEH core makes a token effort to support looking up a PE using the
config_addr by having two arguments to eeh_pe_get(). However, a survey of
all the callers to eeh_pe_get() shows that all bar one have the config_addr
argument hard-coded to zero.The only caller that doesn't is in
eeh_pe_tree_insert() which has:

	if (!eeh_has_flag(EEH_VALID_PE_ZERO) && !edev->pe_config_addr)
		return -EINVAL;

	pe = eeh_pe_get(hose, edev->pe_config_addr, edev->bdfn);

The third argument (config_addr) is only used if the second (pe->addr)
argument is invalid. The preceding check ensures that the call to
eeh_pe_get() will never happen if edev->pe_config_addr is invalid so there
is no situation where eeh_pe_get() will search for a PE based on the 3rd
argument. The check also means that we'll never insert a PE into the tree
where pe_config_addr is zero since EEH_VALID_PE_ZERO is never set on
pseries. All the users of the fallback address on pseries never actually
use the fallback and all the only caller that supplies something for the
config_addr argument to eeh_pe_get() never use it either. It's all dead
code.

This patch removes the fallback address from eeh_pe since nothing uses it.
Specificly, we do this by:

1) Removing pe->config_addr
2) Removing the EEH_VALID_PE_ZERO flag
3) Removing the fallback address argument to eeh_pe_get().
4) Removing all the checks for pe->addr being zero in the pseries EEH code.

This leaves us with PE's only being identified by what's in their pe->addr
field and the EEH core relying on the platform to ensure that eeh_dev's are
only inserted into the EEH tree if they're actually inside a PE.

No functional changes, I hope.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200918093050.37344-9-oohall@gmail.com
2020-10-06 23:22:25 +11:00
Oliver O'Halloran
395ee2a2a1 powerpc/eeh: Move EEH initialisation to an arch initcall
The initialisation of EEH mostly happens in a core_initcall_sync initcall,
followed by registering a bus notifier later on in an arch_initcall.
Anything involving initcall dependecies is mostly incomprehensible unless
you've spent a while staring at code so here's the full sequence:

ppc_md.setup_arch       <-- pci_controllers are created here

...time passes...

core_initcall           <-- pci_dns are created from DT nodes
core_initcall_sync      <-- platforms call eeh_init()
postcore_initcall       <-- PCI bus type is registered
postcore_initcall_sync
arch_initcall           <-- EEH pci_bus notifier registered
subsys_initcall         <-- PHBs are scanned here

There's no real requirement to do the EEH setup at the core_initcall_sync
level. It just needs to be done after pci_dn's are created and before we
start scanning PHBs. Simplify the flow a bit by moving the platform EEH
inititalisation to an arch_initcall so we can fold the bus notifier
registration into eeh_init().

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200918093050.37344-5-oohall@gmail.com
2020-10-06 23:22:25 +11:00
Oliver O'Halloran
82a1ea21f1 powerpc/powernv: Stop using eeh_ops->init()
Fold pnv_eeh_init() into eeh_powernv_init() rather than having eeh_init()
call it via eeh_ops->init(). It's simpler and it'll let us delete
eeh_ops.init.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200918093050.37344-2-oohall@gmail.com
2020-10-06 23:22:24 +11:00
Oliver O'Halloran
d125aedb40 powerpc/eeh: Rework EEH initialisation
Drop the EEH register / unregister ops thing and have the platform pass the
ops structure into eeh_init() directly. This takes one initcall out of the
EEH setup path and it means we're only doing EEH setup on the platforms
which actually support it. It's also less code and generally easier to
follow.

No functional changes.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200918093050.37344-1-oohall@gmail.com
2020-10-06 23:22:24 +11:00
Mahesh Salgaonkar
aea948bb80 powerpc/powernv/elog: Fix race while processing OPAL error log event.
Every error log reported by OPAL is exported to userspace through a
sysfs interface and notified using kobject_uevent(). The userspace
daemon (opal_errd) then reads the error log and acknowledges the error
log is saved safely to disk. Once acknowledged the kernel removes the
respective sysfs file entry causing respective resources to be
released including kobject.

However it's possible the userspace daemon may already be scanning
elog entries when a new sysfs elog entry is created by the kernel.
User daemon may read this new entry and ack it even before kernel can
notify userspace about it through kobject_uevent() call. If that
happens then we have a potential race between
elog_ack_store->kobject_put() and kobject_uevent which can lead to
use-after-free of a kernfs object resulting in a kernel crash. eg:

  BUG: Unable to handle kernel data access on read at 0x6b6b6b6b6b6b6bfb
  Faulting instruction address: 0xc0000000008ff2a0
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
  CPU: 27 PID: 805 Comm: irq/29-opal-elo Not tainted 5.9.0-rc2-gcc-8.2.0-00214-g6f56a67bcbb5-dirty #363
  ...
  NIP kobject_uevent_env+0xa0/0x910
  LR  elog_event+0x1f4/0x2d0
  Call Trace:
    0x5deadbeef0000122 (unreliable)
    elog_event+0x1f4/0x2d0
    irq_thread_fn+0x4c/0xc0
    irq_thread+0x1c0/0x2b0
    kthread+0x1c4/0x1d0
    ret_from_kernel_thread+0x5c/0x6c

This patch fixes this race by protecting the sysfs file
creation/notification by holding a reference count on kobject until we
safely send kobject_uevent().

The function create_elog_obj() returns the elog object which if used
by caller function will end up in use-after-free problem again.
However, the return value of create_elog_obj() function isn't being
used today and there is no need as well. Hence change it to return
void to make this fix complete.

Fixes: 774fea1a38 ("powerpc/powernv: Read OPAL error log and export it through sysfs")
Cc: stable@vger.kernel.org # v3.15+
Reported-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
[mpe: Rework the logic to use a single return, reword comments, add oops]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201006122051.190176-1-mpe@ellerman.id.au
2020-10-06 23:22:22 +11:00
Qinglang Miao
8ec5cb12cd powerpc/powernv: fix wrong warning message in opalcore_config_init()
The logic of the warn output is incorrect. The two args should be
exchanged.

Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200916062129.190864-1-miaoqinglang@huawei.com
2020-09-18 19:59:45 +10:00
Michael Ellerman
39f8756145 powerpc/smp: Move ppc_md.cpu_die() to smp_ops.cpu_offline_self()
We have smp_ops->cpu_die() and ppc_md.cpu_die(). One of them offlines
the current CPU and one offlines another CPU, can you guess which is
which? Also one is in smp_ops and one is in ppc_md?

So rename ppc_md.cpu_die(), to cpu_offline_self(), because that's what
it does. And move it into smp_ops where it belongs.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200819015634.1974478-3-mpe@ellerman.id.au
2020-09-18 19:59:43 +10:00
Nicholas Piggin
ffd2961bb4 powerpc/powernv/idle: add a basic stop 0-3 driver for POWER10
This driver does not restore stop > 3 state, so it limits itself
to states which do not lose full state or TB.

The POWER10 SPRs are sufficiently different from P9 that it seems
easier to split out the P10 code. The POWER10 deep sleep code
(e.g., the BHRB restore) has been taken out, but it can be re-added
when stop > 3 support is added.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Pratik Rajesh Sampat<psampat@linux.ibm.com>
Tested-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Reviewed-by: Pratik Rajesh Sampat<psampat@linux.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200819094700.493399-1-npiggin@gmail.com
2020-09-15 22:13:38 +10:00
Michael Ellerman
960e370813 Merge branch 'fixes' into next
Bring in our fixes branch for this cycle which avoids some small
conflicts with upcoming commits.
2020-09-14 22:57:18 +10:00
Joel Stanley
8f55984f53 powerpc/powernv: Print helpful message when cores guarded
Often the firmware will guard out cores after a crash. This often
undesirable, and is not immediately noticeable.

This adds an informative message when a CPU device tree nodes are
marked bad in the device tree.

Signed-off-by: Joel Stanley <joel@jms.id.au>
[mpe: Use an eye-catcher that's less likely to get us in trouble]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190801051630.5804-1-joel@jms.id.au
2020-09-08 22:57:11 +10:00
Pratik Rajesh Sampat
16d83a540c Revert "powerpc/powernv/idle: Replace CPU feature check with PVR check"
cpuidle stop state implementation has minor optimizations for P10
where hardware preserves more SPR registers compared to P9. The
current P9 driver works for P10, although does few extra
save-restores. P9 driver can provide the required power management
features like SMT thread folding and core level power savings on a P10
platform.

Until the P10 stop driver is available, revert the commit which allows
for only P9 systems to utilize cpuidle and blocks all idle stop states
for P10. CPU idle states are enabled and tested on the P10 platform
with this fix.

This reverts commit 8747bf36f3.

Fixes: 8747bf36f3 ("powerpc/powernv/idle: Replace CPU feature check with PVR check")
Signed-off-by: Pratik Rajesh Sampat <psampat@linux.ibm.com>
Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200826082918.89306-1-psampat@linux.ibm.com
2020-08-27 17:41:45 +10:00
Oliver O'Halloran
fb248c3121 powerpc/powernv: Fix spurious kerneldoc warnings in opal-prd.c
Comments opening with /** are parsed by kerneldoc and this causes the
following warning to be printed:

	arch/powerpc/platforms/powernv/opal-prd.c:31: warning: cannot understand
	function prototype: 'struct opal_prd_msg_queue_item '

opal_prd_mesg_queue_item is an internal data structure so there's no real
need for it to be documented at all. Fix up the comment to squash the
warning.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200804005410.146094-5-oohall@gmail.com
2020-08-25 01:31:33 +10:00
Oliver O'Halloran
3b70464aa7 powerpc/powernv: Staticify functions without prototypes
There's a few scattered in the powernv platform.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200804005410.146094-4-oohall@gmail.com
2020-08-25 01:31:33 +10:00
Oliver O'Halloran
8471c1dd93 powerpc/powernv: Include asm/powernv.h from the local powernv.h
The asm/powernv.h header provides prototypes for functions which need to be
called by non-powernv platform code. Also include it in the powernv.h
that's local to the platform directory to squash some warnings about
non-static functions missing prototypes.

Also include powernv.h since from opal-memcons.c since it has the
prototypes for the memcons wrangling functions which are used for the opal
and ultravisor msglog.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200804005410.146094-3-oohall@gmail.com
2020-08-25 01:31:33 +10:00
Oliver O'Halloran
f6bac19cf6 powerpc/powernv/smp: Fix spurious DBG() warning
When building with W=1 we get the following warning:

 arch/powerpc/platforms/powernv/smp.c: In function ‘pnv_smp_cpu_kill_self’:
 arch/powerpc/platforms/powernv/smp.c:276:16: error: suggest braces around
 	empty body in an ‘if’ statement [-Werror=empty-body]
   276 |      cpu, srr1);
       |                ^
 cc1: all warnings being treated as errors

The full context is this block:

 if (srr1 && !generic_check_cpu_restart(cpu))
 	DBG("CPU%d Unexpected exit while offline srr1=%lx!\n",
 			cpu, srr1);

When building with DEBUG undefined DBG() expands to nothing and GCC emits
the warning due to the lack of braces around an empty statement.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200804005410.146094-2-oohall@gmail.com
2020-08-25 01:31:33 +10:00
zhengbin
18102e4bcc powerpc/powernv: Remove set but not used variable 'parent'
Fix gcc '-Wunused-but-set-variable' warning:

arch/powerpc/platforms/powernv/pci-ioda.c: In function pnv_ioda_configure_pe:
arch/powerpc/platforms/powernv/pci-ioda.c:867:18: warning: variable parent set but not used [-Wunused-but-set-variable]

It is not used since commit b131a8425c ("powerpc/powernv:
Set PELTV for compound PEs")

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1574144074-142032-6-git-send-email-zhengbin13@huawei.com
2020-08-25 01:31:32 +10:00
Frederic Barrat
374f6178f3 ocxl: Remove custom service to allocate interrupts
We now allocate interrupts through xive directly.

Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200403153838.29224-5-fbarrat@linux.ibm.com
2020-08-25 01:31:31 +10:00
Frederic Barrat
e17a7c0e0a powerpc/powernv/pci: Fix possible crash when releasing DMA resources
Fix a typo introduced during recent code cleanup, which could lead to
silently not freeing resources or an oops message (on PCI hotplug or
CAPI reset).

Only impacts ioda2, the code path for ioda1 is correct.

Fixes: 01e12629af ("powerpc/powernv/pci: Add explicit tracking of the DMA setup state")
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200819130741.16769-1-fbarrat@linux.ibm.com
2020-08-20 12:30:40 +10:00
Oliver O'Halloran
2075ec9896 powerpc/powernv/sriov: Fix use of uninitialised variable
Initialising the value before using it is generally regarded as a good
idea so do that.

Fixes: 4c51f3e1e8 ("powerpc/powernv/sriov: Make single PE mode a per-BAR setting")
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200803075408.132601-1-oohall@gmail.com
2020-08-03 22:13:13 +10:00
Wei Yongjun
cf1ae052e0 powerpc/powernv/sriov: Remove unused but set variable 'phb'
Gcc report warning as follows:

arch/powerpc/platforms/powernv/pci-sriov.c:602:25: warning:
 variable 'phb' set but not used [-Wunused-but-set-variable]
  602 |  struct pnv_phb        *phb;
      |                         ^~~

This variable is not used, so this commit removing it.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200727171112.2781-1-weiyongjun1@huawei.com
2020-07-29 22:30:34 +10:00
Gustavo A. R. Silva
5e66a0cb5f powerpc: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200727224201.GA10133@embeddedor
2020-07-29 21:09:37 +10:00