Commit Graph

1317 Commits

Author SHA1 Message Date
Helge Deller
b4b18f47f4 Revert "parisc: Show error if wrong 32/64-bit compiler is being used"
This reverts commit b160628e9e.

There is no need any longer to have this sanity check, because the
previous commit ("parisc: Make CONFIG_64BIT available for ARCH=parisc64
only") prevents that CONFIG_64BIT is set if ARCH==parisc.

Signed-off-by: Helge Deller <deller@gmx.de>
2022-08-22 11:09:17 +02:00
Linus Torvalds
6614a3c316 - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe
Lin, Yang Shi, Anshuman Khandual and Mike Rapoport
 
 - Some kmemleak fixes from Patrick Wang and Waiman Long
 
 - DAMON updates from SeongJae Park
 
 - memcg debug/visibility work from Roman Gushchin
 
 - vmalloc speedup from Uladzislau Rezki
 
 - more folio conversion work from Matthew Wilcox
 
 - enhancements for coherent device memory mapping from Alex Sierra
 
 - addition of shared pages tracking and CoW support for fsdax, from
   Shiyang Ruan
 
 - hugetlb optimizations from Mike Kravetz
 
 - Mel Gorman has contributed some pagealloc changes to improve latency
   and realtime behaviour.
 
 - mprotect soft-dirty checking has been improved by Peter Xu
 
 - Many other singleton patches all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYuravgAKCRDdBJ7gKXxA
 jpqSAQDrXSdII+ht9kSHlaCVYjqRFQz/rRvURQrWQV74f6aeiAD+NHHeDPwZn11/
 SPktqEUrF1pxnGQxqLh1kUFUhsVZQgE=
 =w/UH
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Most of the MM queue. A few things are still pending.

  Liam's maple tree rework didn't make it. This has resulted in a few
  other minor patch series being held over for next time.

  Multi-gen LRU still isn't merged as we were waiting for mapletree to
  stabilize. The current plan is to merge MGLRU into -mm soon and to
  later reintroduce mapletree, with a view to hopefully getting both
  into 6.1-rc1.

  Summary:

   - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe
     Lin, Yang Shi, Anshuman Khandual and Mike Rapoport

   - Some kmemleak fixes from Patrick Wang and Waiman Long

   - DAMON updates from SeongJae Park

   - memcg debug/visibility work from Roman Gushchin

   - vmalloc speedup from Uladzislau Rezki

   - more folio conversion work from Matthew Wilcox

   - enhancements for coherent device memory mapping from Alex Sierra

   - addition of shared pages tracking and CoW support for fsdax, from
     Shiyang Ruan

   - hugetlb optimizations from Mike Kravetz

   - Mel Gorman has contributed some pagealloc changes to improve
     latency and realtime behaviour.

   - mprotect soft-dirty checking has been improved by Peter Xu

   - Many other singleton patches all over the place"

 [ XFS merge from hell as per Darrick Wong in

   https://lore.kernel.org/all/YshKnxb4VwXycPO8@magnolia/ ]

* tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (282 commits)
  tools/testing/selftests/vm/hmm-tests.c: fix build
  mm: Kconfig: fix typo
  mm: memory-failure: convert to pr_fmt()
  mm: use is_zone_movable_page() helper
  hugetlbfs: fix inaccurate comment in hugetlbfs_statfs()
  hugetlbfs: cleanup some comments in inode.c
  hugetlbfs: remove unneeded header file
  hugetlbfs: remove unneeded hugetlbfs_ops forward declaration
  hugetlbfs: use helper macro SZ_1{K,M}
  mm: cleanup is_highmem()
  mm/hmm: add a test for cross device private faults
  selftests: add soft-dirty into run_vmtests.sh
  selftests: soft-dirty: add test for mprotect
  mm/mprotect: fix soft-dirty check in can_change_pte_writable()
  mm: memcontrol: fix potential oom_lock recursion deadlock
  mm/gup.c: fix formatting in check_and_migrate_movable_page()
  xfs: fail dax mount if reflink is enabled on a partition
  mm/memcontrol.c: remove the redundant updating of stats_flush_threshold
  userfaultfd: don't fail on unrecognized features
  hugetlb_cgroup: fix wrong hugetlb cgroup numa stat
  ...
2022-08-05 16:32:45 -07:00
Linus Torvalds
3bd6e5854b asm-generic: updates for 6.0
There are three independent sets of changes:
 
  - Sai Prakash Ranjan adds tracing support to the asm-generic
    version of the MMIO accessors, which is intended to help
    understand problems with device drivers and has been part
    of Qualcomm's vendor kernels for many years.
 
  - A patch from Sebastian Siewior to rework the handling of
    IRQ stacks in softirqs across architectures, which is
    needed for enabling PREEMPT_RT.
 
  - The last patch to remove the CONFIG_VIRT_TO_BUS option and
    some of the code behind that, after the last users of this
    old interface made it in through the netdev, scsi, media and
    staging trees.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmLqPPEACgkQmmx57+YA
 GNlUbQ/+NpIsiA0JUrCGtySt8KrLHdA2dH9lJOR5/iuxfphscPFfWtpcPvcXQWmt
 a8u7wyI8SHW1ku4U0Y5sO0dBSldDnoIqJ5t4X5d7YNU9yVtEtucqQhZf+GkrPlVD
 1HkRu05B7y0k2BMn7BLhSvkpafs3f1lNGXjs8oFBdOF1/zwp/GjcrfCK7KFzqjwU
 dYrX0SOFlKFd4BZC75VfK+XcKg4LtwIOmJraRRl7alz2Q5Oop2hgjgZxXDPf//vn
 SPOhXJN/97i1FUpY2TkfHVH1NxbPfjCV4pUnjmLG0Y4NSy9UQ/ZcXHcywIdeuhfa
 0LySOIsAqBeccpYYYdg2ubiMDZOXkBfANu/sB9o/EhoHfB4svrbPRDhBIQZMFXJr
 MJYu+IYce2rvydA/nydo4q++pxR8v1ES1ZIo8bDux+q1CI/zbpQV+f98kPVRA0M7
 ajc+5GTIqNIsvHzzadq7eYxcj5Bi8Li2JA9sVkAQ+6iq1TVyeYayMc9eYwONlmqw
 MD+PFYc651pKtXZCfkLXPIKSwS0uPqBndAibuVhpZ0hxWaCBBdKvY9mrWcPxt0kA
 tMR8lrosbbrV2K48BFdWTOHvCs2FhHQxPGVPZ/iWuxTA0hHZ9tUlaEkSX+VM57IU
 KCYQLdWzT8J9vrgqSbgYKlb6pSPz6FIjTfut6NZMmshIbavHV/Q=
 =aTR0
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic updates from Arnd Bergmann:
 "There are three independent sets of changes:

   - Sai Prakash Ranjan adds tracing support to the asm-generic version
     of the MMIO accessors, which is intended to help understand
     problems with device drivers and has been part of Qualcomm's vendor
     kernels for many years

   - A patch from Sebastian Siewior to rework the handling of IRQ stacks
     in softirqs across architectures, which is needed for enabling
     PREEMPT_RT

   - The last patch to remove the CONFIG_VIRT_TO_BUS option and some of
     the code behind that, after the last users of this old interface
     made it in through the netdev, scsi, media and staging trees"

* tag 'asm-generic-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  uapi: asm-generic: fcntl: Fix typo 'the the' in comment
  arch/*/: remove CONFIG_VIRT_TO_BUS
  soc: qcom: geni: Disable MMIO tracing for GENI SE
  serial: qcom_geni_serial: Disable MMIO tracing for geni serial
  asm-generic/io: Add logging support for MMIO accessors
  KVM: arm64: Add a flag to disable MMIO trace for nVHE KVM
  lib: Add register read/write tracing support
  drm/meson: Fix overflow implicit truncation warnings
  irqchip/tegra: Fix overflow implicit truncation warnings
  coresight: etm4x: Use asm-generic IO memory barriers
  arm64: io: Use asm-generic high level MMIO accessors
  arch/*: Disable softirq stacks on PREEMPT_RT.
2022-08-05 10:07:23 -07:00
Linus Torvalds
eff0cb3d91 pci-v5.20-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmLr+2wUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vxfZg//eChkC2EUdT6K3zuQDbJJhsGcuOQF
 lnZuUyDn4xw7BkEoZf8V6YdAnp7VvgKhLOq1/q3Geu/LBbCaczoEogOCaR/WcVOs
 C+MsN0RWZQtgfuZKncQoqp25NeLPK9PFToeiIX/xViAYZF7NVjDY7XQiZHQ6JkEA
 /7cUqv/4nS3KCMsKjfmiOxGnqohMWtICiw9qjFvJ40PEDnNB1b53rkiVTxBFePpI
 ePfsRfi/C7klE3xNfoiEgrPp+Jfw+oShsCwXUsId7bEL2oLBc7ClqP05ZYZD3bTK
 QQYyZ12Cq8TysciYpUGBjBnywUHS5DIO5YaV3wxyVAR2Z+6GY2/QVjOa2kKvoK0o
 Hba6TJf8bL58AhSI8Q62pBM0sS7dqJSff+9c2BGpZvII5spP/rQQLlJO56TJjwkw
 Dlf0d3thhZOc9vSKjKw+0v0FdAyc4L11EOwUsw95jZeT5WWgqJYGFnWPZwqBI1KM
 DI1E5wVO5tA2H3NEn+BTTHbLWL+UppqyXPXBHiW52b2q5Bt8fJWMsFvnEEjclxmG
 pYCI7VgF8jqbYKxjobxPFY2x6PH9hfaGMxwzZSdOX6e/Eh+1esgyyaC5APpCO+Pp
 e4OkJaOzCmggrD0jYeLWu+yDm5KRrYo5cdfKHrKgAof0Am41lAa1OhJ2iH4ckNqP
 1qmHereDOe0zNVw=
 =9TAR
 -----END PGP SIGNATURE-----

Merge tag 'pci-v5.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull pci updates from Bjorn Helgaas:
 "Enumeration:

   - Consolidate duplicated 'next function' scanning and extend to allow
     'isolated functions' on s390, similar to existing hypervisors
     (Niklas Schnelle)

  Resource management:
   - Implement pci_iobar_pfn() for sparc, which allows us to remove the
     sparc-specific pci_mmap_page_range() and pci_mmap_resource_range().

     This removes the ability to map the entire PCI I/O space using
     /proc/bus/pci, but we believe that's already been broken since
     v2.6.28 (Arnd Bergmann)

   - Move common PCI definitions to asm-generic/pci.h and rework others
     to be be more specific and more encapsulated in arches that need
     them (Stafford Horne)

  Power management:

   - Convert drivers to new *_PM_OPS macros to avoid need for '#ifdef
     CONFIG_PM_SLEEP' or '__maybe_unused' (Bjorn Helgaas)

  Virtualization:

   - Add ACS quirk for Broadcom BCM5750x multifunction NICs that isolate
     the functions but don't advertise an ACS capability (Pavan Chebbi)

  Error handling:

   - Clear PCI Status register during enumeration in case firmware left
     errors logged (Kai-Heng Feng)

   - When we have native control of AER, enable error reporting for all
     devices that support AER. Previously only a few drivers enabled
     this (Stefan Roese)

   - Keep AER error reporting enabled for switches. Previously we
     enabled this during enumeration but immediately disabled it (Stefan
     Roese)

   - Iterate over error counters instead of error strings to avoid
     printing junk in AER sysfs counters (Mohamed Khalfella)

  ASPM:

   - Remove pcie_aspm_pm_state_change() so ASPM config changes, e.g.,
     via sysfs, are not lost across power state changes (Kai-Heng Feng)

  Endpoint framework:

   - Don't stop an EPC when unbinding an EPF from it (Shunsuke Mie)

  Endpoint embedded DMA controller driver:

   - Simplify and clean up support for the DesignWare embedded DMA
     (eDMA) controller (Frank Li, Serge Semin)

  Broadcom STB PCIe controller driver:

   - Avoid config space accesses when link is down because we can't
     recover from the CPU aborts these cause (Jim Quinlan)

   - Look for power regulators described under Root Ports in DT and
     enable them before scanning the secondary bus (Jim Quinlan)

   - Disable/enable regulators in suspend/resume (Jim Quinlan)

  Freescale i.MX6 PCIe controller driver:

   - Simplify and clean up clock and PHY management (Richard Zhu)

   - Disable/enable regulators in suspend/resume (Richard Zhu)

   - Set PCIE_DBI_RO_WR_EN before writing DBI registers (Richard Zhu)

   - Allow speeds faster than Gen2 (Richard Zhu)

   - Make link being down a non-fatal error so controller probe doesn't
     fail if there are no Endpoints connected (Richard Zhu)

  Loongson PCIe controller driver:

   - Add ACPI and MCFG support for Loongson LS7A (Huacai Chen)

   - Avoid config reads to non-existent LS2K/LS7A devices because a
     hardware defect causes machine hangs (Huacai Chen)

   - Work around LS7A integrated devices that report incorrect Interrupt
     Pin values (Jianmin Lv)

  Marvell Aardvark PCIe controller driver:

   - Add support for AER and Slot capability on emulated bridge (Pali
     Rohár)

  MediaTek PCIe controller driver:

   - Add Airoha EN7532 to DT binding (John Crispin)

   - Allow building of driver for ARCH_AIROHA (Felix Fietkau)

  MediaTek PCIe Gen3 controller driver:

   - Print decoded LTSSM state when the link doesn't come up (Jianjun
     Wang)

  NVIDIA Tegra194 PCIe controller driver:

   - Convert DT binding to json-schema (Vidya Sagar)

   - Add DT bindings and driver support for Tegra234 Root Port and
     Endpoint mode (Vidya Sagar)

   - Fix some Root Port interrupt handling issues (Vidya Sagar)

   - Set default Max Payload Size to 256 bytes (Vidya Sagar)

   - Fix Data Link Feature capability programming (Vidya Sagar)

   - Extend Endpoint mode support to devices beyond Controller-5 (Vidya
     Sagar)

  Qualcomm PCIe controller driver:

   - Rework clock, reset, PHY power-on ordering to avoid hangs and
     improve consistency (Robert Marko, Christian Marangi)

   - Move pipe_clk handling to PHY drivers (Dmitry Baryshkov)

   - Add IPQ60xx support (Selvam Sathappan Periakaruppan)

   - Allow ASPM L1 and substates for 2.7.0 (Krishna chaitanya chundru)

   - Add support for more than 32 MSI interrupts (Dmitry Baryshkov)

  Renesas R-Car PCIe controller driver:

   - Convert DT binding to json-schema (Herve Codina)

   - Add Renesas RZ/N1D (R9A06G032) to rcar-gen2 DT binding and driver
     (Herve Codina)

  Samsung Exynos PCIe controller driver:

   - Fix phy-exynos-pcie driver so it follows the 'phy_init() before
     phy_power_on()' PHY programming model (Marek Szyprowski)

  Synopsys DesignWare PCIe controller driver:

   - Simplify and clean up the DWC core extensively (Serge Semin)

   - Fix an issue with programming the ATU for regions that cross a 4GB
     boundary (Serge Semin)

   - Enable the CDM check if 'snps,enable-cdm-check' exists; previously
     we skipped it if 'num-lanes' was absent (Serge Semin)

   - Allocate a 32-bit DMA-able page to be MSI target instead of using a
     driver data structure that may not be addressable with 32-bit
     address (Will McVicker)

   - Add DWC core support for more than 32 MSI interrupts (Dmitry
     Baryshkov)

  Xilinx Versal CPM PCIe controller driver:

   - Add DT binding and driver support for Versal CPM5 Gen5 Root Port
     (Bharat Kumar Gogada)"

* tag 'pci-v5.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (150 commits)
  PCI: imx6: Support more than Gen2 speed link mode
  PCI: imx6: Set PCIE_DBI_RO_WR_EN before writing DBI registers
  PCI: imx6: Reformat suspend callback to keep symmetric with resume
  PCI: imx6: Move the imx6_pcie_ltssm_disable() earlier
  PCI: imx6: Disable clocks in reverse order of enable
  PCI: imx6: Do not hide PHY driver callbacks and refine the error handling
  PCI: imx6: Reduce resume time by only starting link if it was up before suspend
  PCI: imx6: Mark the link down as non-fatal error
  PCI: imx6: Move regulator enable out of imx6_pcie_deassert_core_reset()
  PCI: imx6: Turn off regulator when system is in suspend mode
  PCI: imx6: Call host init function directly in resume
  PCI: imx6: Disable i.MX6QDL clock when disabling ref clocks
  PCI: imx6: Propagate .host_init() errors to caller
  PCI: imx6: Collect clock enables in imx6_pcie_clk_enable()
  PCI: imx6: Factor out ref clock disable to match enable
  PCI: imx6: Move imx6_pcie_clk_disable() earlier
  PCI: imx6: Move imx6_pcie_enable_ref_clk() earlier
  PCI: imx6: Move PHY management functions together
  PCI: imx6: Move imx6_pcie_grp_offset(), imx6_pcie_configure_type() earlier
  PCI: imx6: Convert to NOIRQ_SYSTEM_SLEEP_PM_OPS()
  ...
2022-08-04 19:30:35 -07:00
Fabio M. De Francesco
39ade048a3 highmem: Make __kunmap_{local,atomic}() take const void pointer
__kunmap_ {local,atomic}() currently take pointers to void. However, this
is semantically incorrect, since these functions do not change the memory
their arguments point to.

Therefore, make this semantics explicit by modifying the
__kunmap_{local,atomic}() prototypes to take pointers to const void.

As a side effect, compilers may produce more efficient code.

Acked-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Helge Deller <deller@gmx.de>  # parisc
Suggested-by: David Sterba <dsterba@suse.cz>
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-07-25 17:45:40 +02:00
Stafford Horne
abb4970ac3 PCI: Move isa_dma_bridge_buggy out of asm/dma.h
The isa_dma_bridge_buggy symbol is only used for x86_32, and only x86_32
platforms or quirks ever set it.

Add a new linux/isa-dma.h header that #defines isa_dma_bridge_buggy to 0
except on x86_32, where we keep it as a variable, and remove all the arch-
specific definitions.

[bhelgaas: commit log]
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Link: https://lore.kernel.org/r/20220722214944.831438-3-shorne@gmail.com
Signed-off-by: Stafford Horne <shorne@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
2022-07-22 17:24:47 -05:00
Stafford Horne
ae85b23c65 PCI: Remove pci_get_legacy_ide_irq() and asm-generic/pci.h
pci_get_legacy_ide_irq() is only used on platforms that support PNP, so
many architectures define it but never use it.  Replace uses of it with
ATA_PRIMARY_IRQ() and ATA_SECONDARY_IRQ(), which provide the same
functionality.

Since pci_get_legacy_ide_irq() is no longer used, remove all the
architecture-specific definitions of it as well as asm-generic/pci.h, which
only provides pci_get_legacy_ide_irq()

[bhelgaas: commit log]
Co-developed-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20220722214944.831438-2-shorne@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Stafford Horne <shorne@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Pierre Morel <pmorel@linux.ibm.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-07-22 17:23:45 -05:00
Mike Rapoport
4501a7a039 parisc: rename PGD_ORDER to PGD_TABLE_ORDER
This is the order of the page table allocation, not the order of a PGD.

Link: https://lkml.kernel.org/r/20220703141203.147893-14-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Xuerui Wang <kernel@xen0n.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-17 17:14:43 -07:00
Anshuman Khandual
252358f1a1 parisc/mm: enable ARCH_HAS_VM_GET_PAGE_PROT
This enables ARCH_HAS_VM_GET_PAGE_PROT on the platform and exports
standard vm_get_page_prot() implementation via DECLARE_VM_GET_PAGE_PROT,
which looks up a private and static protection_map[] array.  Subsequently
all __SXXX and __PXXX macros can be dropped which are no longer needed.

Link: https://lkml.kernel.org/r/20220711070600.2378316-14-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Brian Cain <bcain@quicinc.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-17 17:14:39 -07:00
Arnd Bergmann
4313a24985 arch/*/: remove CONFIG_VIRT_TO_BUS
All architecture-independent users of virt_to_bus() and bus_to_virt()
have been fixed to use the dma mapping interfaces or have been
removed now.  This means the definitions on most architectures, and the
CONFIG_VIRT_TO_BUS symbol are now obsolete and can be removed.

The only exceptions to this are a few network and scsi drivers for m68k
Amiga and VME machines and ppc32 Macintosh. These drivers work correctly
with the old interfaces and are probably not worth changing.

On alpha and parisc, virt_to_bus() were still used in asm/floppy.h.
alpha can use isa_virt_to_bus() like x86 does, and parisc can just
open-code the virt_to_phys() here, as this is architecture specific
code.

I tried updating the bus-virt-phys-mapping.rst documentation, which
started as an email from Linus to explain some details of the Linux-2.0
driver interfaces. The bits about virt_to_bus() were declared obsolete
backin 2000, and the rest is not all that relevant any more, so in the
end I just decided to remove the file completely.

Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Acked-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-28 13:20:21 +02:00
Helge Deller
1d0811b03e parisc/stifb: Fix fb_is_primary_device() only available with CONFIG_FB_STI
Fix this build error noticed by the kernel test robot:

drivers/video/console/sticore.c:1132:5: error: redefinition of 'fb_is_primary_device'
 arch/parisc/include/asm/fb.h:18:19: note: previous definition of 'fb_is_primary_device'

Signed-off-by: Helge Deller <deller@gmx.de>
Reported-by: kernel test robot <lkp@intel.com>
Cc: stable@vger.kernel.org   # v5.10+
2022-06-07 13:01:17 +02:00
Linus Torvalds
23df9ba64b parisc architecture fixes & updates for kernel v5.19-rc1
A fix to prevent crash at bootup if CONFIG_SCHED_MC is enabled, and
 add auto-detection of primary graphics card for framebuffer driver.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCYptoPQAKCRD3ErUQojoP
 X1MbAQDP86+7VP8vVFDoEx+apgD/r9+53Gb0Ud6ZZ6MbrwIjEAD/Ys1V1X/AjWa7
 ZkbBzRqPhHFRvx9sBUU537HaCem08Ak=
 =kBc2
 -----END PGP SIGNATURE-----

Merge tag 'for-5.19/parisc-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull more parisc architecture updates from Helge Deller:
 "A fix to prevent crash at bootup if CONFIG_SCHED_MC is enabled, and
  add auto-detection of primary graphics card for framebuffer driver"

* tag 'for-5.19/parisc-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc/stifb: Keep track of hardware path of graphics card
  parisc/stifb: Implement fb_is_primary_device()
  parisc: fix a crash with multicore scheduler
2022-06-04 13:50:23 -07:00
Helge Deller
cf936af790 parisc/stifb: Implement fb_is_primary_device()
Implement fb_is_primary_device() function, so that fbcon detects if this
framebuffer belongs to the default graphics card which was used to start
the system.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org   # v5.10+
2022-06-04 15:47:03 +02:00
Linus Torvalds
932c2989b5 TTY / Serial driver changes for 5.19-rc1
Here is the big set of tty and serial driver updates for 5.19-rc1.
 
 Lots of tiny cleanups in here, the major stuff is:
 	- termbit cleanups and unification by Ilpo.  A much needed
 	  change that goes a long way to making things simpler for all
 	  of the different arches
 	- tty documentation cleanups and movements to their own place in
 	  the documentation tree
 	- old tty driver cleanups and fixes from Jiri to bring some
 	  existing drivers into the modern world
 	- RS485 cleanups and unifications to make it easier for
 	  individual drivers to support this mode instead of having to
 	  duplicate logic in each driver
 	- Lots of 8250 driver updates and additions
 	- new device id additions
 	- n_gsm continued fixes and cleanups
 	- other minor serial driver updates and cleanups
 
 All of these have been in linux-next for weeks with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYpndTg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykFegCgizjLDyOepr72zMDWWdp0bBTekz8AoMWODfJY
 vB8/kzu329DImJMFB8ET
 =rmv0
 -----END PGP SIGNATURE-----

Merge tag 'tty-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty and serial driver updates from Greg KH:
 "Here is the big set of tty and serial driver updates for 5.19-rc1.

  Lots of tiny cleanups in here, the major stuff is:

   - termbit cleanups and unification by Ilpo. A much needed change that
     goes a long way to making things simpler for all of the different
     arches

   - tty documentation cleanups and movements to their own place in the
     documentation tree

   - old tty driver cleanups and fixes from Jiri to bring some existing
     drivers into the modern world

   - RS485 cleanups and unifications to make it easier for individual
     drivers to support this mode instead of having to duplicate logic
     in each driver

   - Lots of 8250 driver updates and additions

   - new device id additions

   - n_gsm continued fixes and cleanups

   - other minor serial driver updates and cleanups

  All of these have been in linux-next for weeks with no reported issues"

* tag 'tty-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (166 commits)
  tty: Rework receive flow control char logic
  pcmcia: synclink_cs: Don't allow CS5-6
  serial: stm32-usart: Correct CSIZE, bits, and parity
  serial: st-asc: Sanitize CSIZE and correct PARENB for CS7
  serial: sifive: Sanitize CSIZE and c_iflag
  serial: sh-sci: Don't allow CS5-6
  serial: txx9: Don't allow CS5-6
  serial: rda-uart: Don't allow CS5-6
  serial: digicolor-usart: Don't allow CS5-6
  serial: uartlite: Fix BRKINT clearing
  serial: cpm_uart: Fix build error without CONFIG_SERIAL_CPM_CONSOLE
  serial: core: Do stop_rx in suspend path for console if console_suspend is disabled
  tty: serial: qcom-geni-serial: Remove uart frequency table. Instead, find suitable frequency with call to clk_round_rate.
  dt-bindings: serial: renesas,em-uart: Add RZ/V2M clock to access the registers
  serial: 8250_fintek: Check SER_RS485_RTS_* only with RS485
  Revert "serial: 8250_mtk: Make sure to select the right FEATURE_SEL"
  serial: msm_serial: disable interrupts in __msm_console_write()
  serial: meson: acquire port->lock in startup()
  serial: 8250_dw: Use dev_err_probe()
  serial: 8250_dw: Use devm_add_action_or_reset()
  ...
2022-06-03 11:08:40 -07:00
Linus Torvalds
35b51afd23 RISC-V Patches for the 5.19 Merge Window, Part 1
* Support for the Svpbmt extension, which allows memory attributes to be
   encoded in pages.
 * Support for the Allwinner D1's implementation of page-based memory
   attributes.
 * Support for running rv32 binaries on rv64 systems, via the compat
   subsystem.
 * Support for kexec_file().
 * Support for the new generic ticket-based spinlocks, which allows us to
   also move to qrwlock.  These should have already gone in through the
   asm-geneic tree as well.
 * A handful of cleanups and fixes, include some larger ones around
   atomics and XIP.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmKWOx8THHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYieAiEADAUdP7ctoaSQwk5skd/fdA3b4KJuKn
 1Zjl+Br32WP0DlbirYBYWRUQZnCCsvABbTiwSJMcG7NBpU5pyQ5XDtB3OA5kJswO
 Fdp8Nd53//+GK1M5zdEM9OdgvT9fbfTZ3qTu8bKsROOQhGwnYL+Csc9KjFRqEmzN
 oQii0jlb3n5PM4FL3GsbV4uMn9zzkP9mnVAPQktcock2EKFEK/Fy3uNYMQiO2KPi
 n8O6bIDaeRdQ6SurzWOuOkt0cro0tEF85ilzT04mynQsOU0el5oGqCxnOhNH3VWg
 ndqPT6Yafw12hZOtbKJeP+nF8IIR6aJLP3jOtRwEVgcfbXYAw4QwbAV8kQZISefN
 ipn8JGY7GX9Y9TYU692OUGkcmAb3/dxb6c0WihBdvJ0M6YyLD5X+YKHNuG2onLgK
 ss43C5Mxsu629rsjdu/PV91B1+pve3rG9siVmF+g4eo0x9rjMq6/JB0Kal/8SLI1
 Je5T55d5ujV1a2XxhZLQOSD5owrK7J1M9owb0bloTnr9nVwFTWDrfEQEU82o3kP+
 Xm+FfXktnz9ai55NjkMbbEur5D++dKJhBavwCTnBcTrJmMtEH0R45GTK9ZehP+WC
 rNVrRXjIsS18wsTfJxnkZeFQA38as6VBKTzvwHvOgzTrrZU1/xk3lpkouYtAO6BG
 gKacHshVilmUuA==
 =Loi6
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.19-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for the Svpbmt extension, which allows memory attributes to
   be encoded in pages

 - Support for the Allwinner D1's implementation of page-based memory
   attributes

 - Support for running rv32 binaries on rv64 systems, via the compat
   subsystem

 - Support for kexec_file()

 - Support for the new generic ticket-based spinlocks, which allows us
   to also move to qrwlock. These should have already gone in through
   the asm-geneic tree as well

 - A handful of cleanups and fixes, include some larger ones around
   atomics and XIP

* tag 'riscv-for-linus-5.19-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (51 commits)
  RISC-V: Prepare dropping week attribute from arch_kexec_apply_relocations[_add]
  riscv: compat: Using seperated vdso_maps for compat_vdso_info
  RISC-V: Fix the XIP build
  RISC-V: Split out the XIP fixups into their own file
  RISC-V: ignore xipImage
  RISC-V: Avoid empty create_*_mapping definitions
  riscv: Don't output a bogus mmu-type on a no MMU kernel
  riscv: atomic: Add custom conditional atomic operation implementation
  riscv: atomic: Optimize dec_if_positive functions
  riscv: atomic: Cleanup unnecessary definition
  RISC-V: Load purgatory in kexec_file
  RISC-V: Add purgatory
  RISC-V: Support for kexec_file on panic
  RISC-V: Add kexec_file support
  RISC-V: use memcpy for kexec_file mode
  kexec_file: Fix kexec_file.c build error for riscv platform
  riscv: compat: Add COMPAT Kbuild skeletal support
  riscv: compat: ptrace: Add compat_arch_ptrace implement
  riscv: compat: signal: Add rt_frame implementation
  riscv: add memory-type errata for T-Head
  ...
2022-05-31 14:10:54 -07:00
Linus Torvalds
e11a93567d parisc architecture updates for kernel v5.19-rc1
Minor cleanups and code optimizations, e.g.:
 - improvements in assembly statements in the tmpalias code path,
 - added some additionals compile time checks,
 - drop some unneccesary assembler DMA syncs.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCYpUMsgAKCRD3ErUQojoP
 X/fNAQDGddN79CH+bABo6gq8dN/ko0NxxfJapAoPYEEZrKaFtwEA7FUZMfQqBVnm
 z+aSkbH6igueqxRCmbiW1OZ10KLwSgg=
 =SfFN
 -----END PGP SIGNATURE-----

Merge tag 'for-5.19/parisc-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull parisc architecture updates from Helge Deller:
 "Minor cleanups and code optimizations, e.g.:

   - improvements in assembly statements in the tmpalias code path

   - added some additionals compile time checks

   - drop some unneccesary assembler DMA syncs"

* tag 'for-5.19/parisc-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Drop __ARCH_WANT_OLD_READDIR and __ARCH_WANT_SYS_OLDUMOUNT
  parisc: Optimize tmpalias function calls
  parisc: Add dep_safe() macro to deposit a register in 32- and 64-kernels
  parisc: Fix wrong comment for shr macro
  parisc: Prevent ldil() to sign-extend into upper 32 bits
  parisc: Don't hardcode assembler bit definitions in tmpalias code
  parisc: Don't enforce DMA completion order in cache flushes
  parisc: video: fbdev: stifb: Add sti_dump_font() to dump STI font
2022-05-30 11:52:18 -07:00
Helge Deller
72acadfeb3 parisc: Drop __ARCH_WANT_OLD_READDIR and __ARCH_WANT_SYS_OLDUMOUNT
Those old syscalls aren't exported via our syscall table, so just drop
them.

Signed-off-by: Helge Deller <deller@gmx.de>
2022-05-30 17:43:44 +02:00
Linus Torvalds
98931dd95f Yang Shi has improved the behaviour of khugepaged collapsing of readonly
file-backed transparent hugepages.
 
 Johannes Weiner has arranged for zswap memory use to be tracked and
 managed on a per-cgroup basis.
 
 Munchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for runtime
 enablement of the recent huge page vmemmap optimization feature.
 
 Baolin Wang contributes a series to fix some issues around hugetlb
 pagetable invalidation.
 
 Zhenwei Pi has fixed some interactions between hwpoisoned pages and
 virtualization.
 
 Tong Tiangen has enabled the use of the presently x86-only
 page_table_check debugging feature on arm64 and riscv.
 
 David Vernet has done some fixup work on the memcg selftests.
 
 Peter Xu has taught userfaultfd to handle write protection faults against
 shmem- and hugetlbfs-backed files.
 
 More DAMON development from SeongJae Park - adding online tuning of the
 feature and support for monitoring of fixed virtual address ranges.  Also
 easier discovery of which monitoring operations are available.
 
 Nadav Amit has done some optimization of TLB flushing during mprotect().
 
 Neil Brown continues to labor away at improving our swap-over-NFS support.
 
 David Hildenbrand has some fixes to anon page COWing versus
 get_user_pages().
 
 Peng Liu fixed some errors in the core hugetlb code.
 
 Joao Martins has reduced the amount of memory consumed by device-dax's
 compound devmaps.
 
 Some cleanups of the arch-specific pagemap code from Anshuman Khandual.
 
 Muchun Song has found and fixed some errors in the TLB flushing of
 transparent hugepages.
 
 Roman Gushchin has done more work on the memcg selftests.
 
 And, of course, many smaller fixes and cleanups.  Notably, the customary
 million cleanup serieses from Miaohe Lin.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYo52xQAKCRDdBJ7gKXxA
 jtJFAQD238KoeI9z5SkPMaeBRYSRQmNll85mxs25KapcEgWgGQD9FAb7DJkqsIVk
 PzE+d9hEfirUGdL6cujatwJ6ejYR8Q8=
 =nFe6
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Almost all of MM here. A few things are still getting finished off,
  reviewed, etc.

   - Yang Shi has improved the behaviour of khugepaged collapsing of
     readonly file-backed transparent hugepages.

   - Johannes Weiner has arranged for zswap memory use to be tracked and
     managed on a per-cgroup basis.

   - Munchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for
     runtime enablement of the recent huge page vmemmap optimization
     feature.

   - Baolin Wang contributes a series to fix some issues around hugetlb
     pagetable invalidation.

   - Zhenwei Pi has fixed some interactions between hwpoisoned pages and
     virtualization.

   - Tong Tiangen has enabled the use of the presently x86-only
     page_table_check debugging feature on arm64 and riscv.

   - David Vernet has done some fixup work on the memcg selftests.

   - Peter Xu has taught userfaultfd to handle write protection faults
     against shmem- and hugetlbfs-backed files.

   - More DAMON development from SeongJae Park - adding online tuning of
     the feature and support for monitoring of fixed virtual address
     ranges. Also easier discovery of which monitoring operations are
     available.

   - Nadav Amit has done some optimization of TLB flushing during
     mprotect().

   - Neil Brown continues to labor away at improving our swap-over-NFS
     support.

   - David Hildenbrand has some fixes to anon page COWing versus
     get_user_pages().

   - Peng Liu fixed some errors in the core hugetlb code.

   - Joao Martins has reduced the amount of memory consumed by
     device-dax's compound devmaps.

   - Some cleanups of the arch-specific pagemap code from Anshuman
     Khandual.

   - Muchun Song has found and fixed some errors in the TLB flushing of
     transparent hugepages.

   - Roman Gushchin has done more work on the memcg selftests.

  ... and, of course, many smaller fixes and cleanups. Notably, the
  customary million cleanup serieses from Miaohe Lin"

* tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (381 commits)
  mm: kfence: use PAGE_ALIGNED helper
  selftests: vm: add the "settings" file with timeout variable
  selftests: vm: add "test_hmm.sh" to TEST_FILES
  selftests: vm: check numa_available() before operating "merge_across_nodes" in ksm_tests
  selftests: vm: add migration to the .gitignore
  selftests/vm/pkeys: fix typo in comment
  ksm: fix typo in comment
  selftests: vm: add process_mrelease tests
  Revert "mm/vmscan: never demote for memcg reclaim"
  mm/kfence: print disabling or re-enabling message
  include/trace/events/percpu.h: cleanup for "percpu: improve percpu_alloc_percpu event trace"
  include/trace/events/mmflags.h: cleanup for "tracing: incorrect gfp_t conversion"
  mm: fix a potential infinite loop in start_isolate_page_range()
  MAINTAINERS: add Muchun as co-maintainer for HugeTLB
  zram: fix Kconfig dependency warning
  mm/shmem: fix shmem folio swapoff hang
  cgroup: fix an error handling path in alloc_pagecache_max_30M()
  mm: damon: use HPAGE_PMD_SIZE
  tracing: incorrect isolate_mote_t cast in mm_vmscan_lru_isolate
  nodemask.h: fix compilation error with GCC12
  ...
2022-05-26 12:32:41 -07:00
Linus Torvalds
7e062cda7d Networking changes for 5.19.
Core
 ----
 
  - Support TCPv6 segmentation offload with super-segments larger than
    64k bytes using the IPv6 Jumbogram extension header (AKA BIG TCP).
 
  - Generalize skb freeing deferral to per-cpu lists, instead of
    per-socket lists.
 
  - Add a netdev statistic for packets dropped due to L2 address
    mismatch (rx_otherhost_dropped).
 
  - Continue work annotating skb drop reasons.
 
  - Accept alternative netdev names (ALT_IFNAME) in more netlink
    requests.
 
  - Add VLAN support for AF_PACKET SOCK_RAW GSO.
 
  - Allow receiving skb mark from the socket as a cmsg.
 
  - Enable memcg accounting for veth queues, sysctl tables and IPv6.
 
 BPF
 ---
 
  - Add libbpf support for User Statically-Defined Tracing (USDTs).
 
  - Speed up symbol resolution for kprobes multi-link attachments.
 
  - Support storing typed pointers to referenced and unreferenced
    objects in BPF maps.
 
  - Add support for BPF link iterator.
 
  - Introduce access to remote CPU map elements in BPF per-cpu map.
 
  - Allow middle-of-the-road settings for the
    kernel.unprivileged_bpf_disabled sysctl.
 
  - Implement basic types of dynamic pointers e.g. to allow for
    dynamically sized ringbuf reservations without extra memory copies.
 
 Protocols
 ---------
 
  - Retire port only listening_hash table, add a second bind table
    hashed by port and address. Avoid linear list walk when binding
    to very popular ports (e.g. 443).
 
  - Add bridge FDB bulk flush filtering support allowing user space
    to remove all FDB entries matching a condition.
 
  - Introduce accept_unsolicited_na sysctl for IPv6 to implement
    router-side changes for RFC9131.
 
  - Support for MPTCP path manager in user space.
 
  - Add MPTCP support for fallback to regular TCP for connections
    that have never connected additional subflows or transmitted
    out-of-sequence data (partial support for RFC8684 fallback).
 
  - Avoid races in MPTCP-level window tracking, stabilize and improve
    throughput.
 
  - Support lockless operation of GRE tunnels with seq numbers enabled.
 
  - WiFi support for host based BSS color collision detection.
 
  - Add support for SO_TXTIME/SCM_TXTIME on CAN sockets.
 
  - Support transmission w/o flow control in CAN ISOTP (ISO 15765-2).
 
  - Support zero-copy Tx with TLS 1.2 crypto offload (sendfile).
 
  - Allow matching on the number of VLAN tags via tc-flower.
 
  - Add tracepoint for tcp_set_ca_state().
 
 Driver API
 ----------
 
  - Improve error reporting from classifier and action offload.
 
  - Add support for listing line cards in switches (devlink).
 
  - Add helpers for reporting page pool statistics with ethtool -S.
 
  - Add support for reading clock cycles when using PTP virtual clocks,
    instead of having the driver convert to time before reporting.
    This makes it possible to report time from different vclocks.
 
  - Support configuring low-latency Tx descriptor push via ethtool.
 
  - Separate Clause 22 and Clause 45 MDIO accesses more explicitly.
 
 New hardware / drivers
 ----------------------
 
  - Ethernet:
    - Marvell's Octeon NIC PCI Endpoint support (octeon_ep)
    - Sunplus SP7021 SoC (sp7021_emac)
    - Add support for Renesas RZ/V2M (in ravb)
    - Add support for MediaTek mt7986 switches (in mtk_eth_soc)
 
  - Ethernet PHYs:
    - ADIN1100 industrial PHYs (w/ 10BASE-T1L and SQI reporting)
    - TI DP83TD510 PHY
    - Microchip LAN8742/LAN88xx PHYs
 
  - WiFi:
    - Driver for pureLiFi X, XL, XC devices (plfxlc)
    - Driver for Silicon Labs devices (wfx)
    - Support for WCN6750 (in ath11k)
    - Support Realtek 8852ce devices (in rtw89)
 
  - Mobile:
    - MediaTek T700 modems (Intel 5G 5000 M.2 cards)
 
  - CAN:
   - ctucanfd: add support for CTU CAN FD open-source IP core
     from Czech Technical University in Prague
 
 Drivers
 -------
 
  - Delete a number of old drivers still using virt_to_bus().
 
  - Ethernet NICs:
    - intel: support TSO on tunnels MPLS
    - broadcom: support multi-buffer XDP
    - nfp: support VF rate limiting
    - sfc: use hardware tx timestamps for more than PTP
    - mlx5: multi-port eswitch support
    - hyper-v: add support for XDP_REDIRECT
    - atlantic: XDP support (including multi-buffer)
    - macb: improve real-time perf by deferring Tx processing to NAPI
 
  - High-speed Ethernet switches:
    - mlxsw: implement basic line card information querying
    - prestera: add support for traffic policing on ingress and egress
 
  - Embedded Ethernet switches:
    - lan966x: add support for packet DMA (FDMA)
    - lan966x: add support for PTP programmable pins
    - ti: cpsw_new: enable bc/mc storm prevention
 
  - Qualcomm 802.11ax WiFi (ath11k):
    - Wake-on-WLAN support for QCA6390 and WCN6855
    - device recovery (firmware restart) support
    - support setting Specific Absorption Rate (SAR) for WCN6855
    - read country code from SMBIOS for WCN6855/QCA6390
    - enable keep-alive during WoWLAN suspend
    - implement remain-on-channel support
 
  - MediaTek WiFi (mt76):
    - support Wireless Ethernet Dispatch offloading packet movement
      between the Ethernet switch and WiFi interfaces
    - non-standard VHT MCS10-11 support
    - mt7921 AP mode support
    - mt7921 IPv6 NS offload support
 
  - Ethernet PHYs:
    - micrel: ksz9031/ksz9131: cabletest support
    - lan87xx: SQI support for T1 PHYs
    - lan937x: add interrupt support for link detection
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmKNMPQACgkQMUZtbf5S
 IrsRARAAuDyYs6jFYB3p+xazZdOnbF4iAgVv71+DQGvmsCl6CB9OrsNZMlvE85OL
 Q3gjcRbgjrkN4lhgI8DmiGYbsUJnAvVjFdNjccz1Z/vTLYvuIM0ol54MUp5S+9WY
 StncOJkOGJxxR/Gi5gzVmejPDsysU3Jik+hm/fpIcz8pybXxAsFKU5waY5qfl+/T
 TZepfV0VCfqRDjqcF1qA5+jJZNU8pdodQlZ1+mh8bwu6Jk1ZkWkj6Ov8MWdwQldr
 LnPeK/9hIGzkdJYHZfajxA3t8D0K5CHzSuih2bJ9ry8ZXgVBkXEThew778/R5izW
 uB0YZs9COFlrIP7XHjtRTy/2xHOdYIPlj2nWhVdfuQDX8Crvt4VRN6EZ1rjko1ZJ
 WanfG6WHF8NH5pXBRQbh3kIMKBnYn6OIzuCfCQSqd+niHcxFIM4vRiggeXI5C5TW
 vJgEWfK6X+NfDiFVa3xyCrEmp5ieA/pNecpwd8rVkql+MtFAAw4vfsotLKOJEAru
 J/XL6UE+YuLqIJV9ACZ9x1AFXXAo661jOxBunOo4VXhXVzWS9lYYz5r5ryIkgT/8
 /Fr0zjANJWgfIuNdIBtYfQ4qG+LozGq038VA06RhFUAZ5tF9DzhqJs2Q2AFuWWBC
 ewCePJVqo1j2Ceq2mGonXRt47OEnlePoOxTk9W+cKZb7ZWE+zEo=
 =Wjii
 -----END PGP SIGNATURE-----

Merge tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core
  ----

   - Support TCPv6 segmentation offload with super-segments larger than
     64k bytes using the IPv6 Jumbogram extension header (AKA BIG TCP).

   - Generalize skb freeing deferral to per-cpu lists, instead of
     per-socket lists.

   - Add a netdev statistic for packets dropped due to L2 address
     mismatch (rx_otherhost_dropped).

   - Continue work annotating skb drop reasons.

   - Accept alternative netdev names (ALT_IFNAME) in more netlink
     requests.

   - Add VLAN support for AF_PACKET SOCK_RAW GSO.

   - Allow receiving skb mark from the socket as a cmsg.

   - Enable memcg accounting for veth queues, sysctl tables and IPv6.

  BPF
  ---

   - Add libbpf support for User Statically-Defined Tracing (USDTs).

   - Speed up symbol resolution for kprobes multi-link attachments.

   - Support storing typed pointers to referenced and unreferenced
     objects in BPF maps.

   - Add support for BPF link iterator.

   - Introduce access to remote CPU map elements in BPF per-cpu map.

   - Allow middle-of-the-road settings for the
     kernel.unprivileged_bpf_disabled sysctl.

   - Implement basic types of dynamic pointers e.g. to allow for
     dynamically sized ringbuf reservations without extra memory copies.

  Protocols
  ---------

   - Retire port only listening_hash table, add a second bind table
     hashed by port and address. Avoid linear list walk when binding to
     very popular ports (e.g. 443).

   - Add bridge FDB bulk flush filtering support allowing user space to
     remove all FDB entries matching a condition.

   - Introduce accept_unsolicited_na sysctl for IPv6 to implement
     router-side changes for RFC9131.

   - Support for MPTCP path manager in user space.

   - Add MPTCP support for fallback to regular TCP for connections that
     have never connected additional subflows or transmitted
     out-of-sequence data (partial support for RFC8684 fallback).

   - Avoid races in MPTCP-level window tracking, stabilize and improve
     throughput.

   - Support lockless operation of GRE tunnels with seq numbers enabled.

   - WiFi support for host based BSS color collision detection.

   - Add support for SO_TXTIME/SCM_TXTIME on CAN sockets.

   - Support transmission w/o flow control in CAN ISOTP (ISO 15765-2).

   - Support zero-copy Tx with TLS 1.2 crypto offload (sendfile).

   - Allow matching on the number of VLAN tags via tc-flower.

   - Add tracepoint for tcp_set_ca_state().

  Driver API
  ----------

   - Improve error reporting from classifier and action offload.

   - Add support for listing line cards in switches (devlink).

   - Add helpers for reporting page pool statistics with ethtool -S.

   - Add support for reading clock cycles when using PTP virtual clocks,
     instead of having the driver convert to time before reporting. This
     makes it possible to report time from different vclocks.

   - Support configuring low-latency Tx descriptor push via ethtool.

   - Separate Clause 22 and Clause 45 MDIO accesses more explicitly.

  New hardware / drivers
  ----------------------

   - Ethernet:
      - Marvell's Octeon NIC PCI Endpoint support (octeon_ep)
      - Sunplus SP7021 SoC (sp7021_emac)
      - Add support for Renesas RZ/V2M (in ravb)
      - Add support for MediaTek mt7986 switches (in mtk_eth_soc)

   - Ethernet PHYs:
      - ADIN1100 industrial PHYs (w/ 10BASE-T1L and SQI reporting)
      - TI DP83TD510 PHY
      - Microchip LAN8742/LAN88xx PHYs

   - WiFi:
      - Driver for pureLiFi X, XL, XC devices (plfxlc)
      - Driver for Silicon Labs devices (wfx)
      - Support for WCN6750 (in ath11k)
      - Support Realtek 8852ce devices (in rtw89)

   - Mobile:
      - MediaTek T700 modems (Intel 5G 5000 M.2 cards)

   - CAN:
      - ctucanfd: add support for CTU CAN FD open-source IP core from
        Czech Technical University in Prague

  Drivers
  -------

   - Delete a number of old drivers still using virt_to_bus().

   - Ethernet NICs:
      - intel: support TSO on tunnels MPLS
      - broadcom: support multi-buffer XDP
      - nfp: support VF rate limiting
      - sfc: use hardware tx timestamps for more than PTP
      - mlx5: multi-port eswitch support
      - hyper-v: add support for XDP_REDIRECT
      - atlantic: XDP support (including multi-buffer)
      - macb: improve real-time perf by deferring Tx processing to NAPI

   - High-speed Ethernet switches:
      - mlxsw: implement basic line card information querying
      - prestera: add support for traffic policing on ingress and egress

   - Embedded Ethernet switches:
      - lan966x: add support for packet DMA (FDMA)
      - lan966x: add support for PTP programmable pins
      - ti: cpsw_new: enable bc/mc storm prevention

   - Qualcomm 802.11ax WiFi (ath11k):
      - Wake-on-WLAN support for QCA6390 and WCN6855
      - device recovery (firmware restart) support
      - support setting Specific Absorption Rate (SAR) for WCN6855
      - read country code from SMBIOS for WCN6855/QCA6390
      - enable keep-alive during WoWLAN suspend
      - implement remain-on-channel support

   - MediaTek WiFi (mt76):
      - support Wireless Ethernet Dispatch offloading packet movement
        between the Ethernet switch and WiFi interfaces
      - non-standard VHT MCS10-11 support
      - mt7921 AP mode support
      - mt7921 IPv6 NS offload support

   - Ethernet PHYs:
      - micrel: ksz9031/ksz9131: cabletest support
      - lan87xx: SQI support for T1 PHYs
      - lan937x: add interrupt support for link detection"

* tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1809 commits)
  ptp: ocp: Add firmware header checks
  ptp: ocp: fix PPS source selector debugfs reporting
  ptp: ocp: add .init function for sma_op vector
  ptp: ocp: vectorize the sma accessor functions
  ptp: ocp: constify selectors
  ptp: ocp: parameterize input/output sma selectors
  ptp: ocp: revise firmware display
  ptp: ocp: add Celestica timecard PCI ids
  ptp: ocp: Remove #ifdefs around PCI IDs
  ptp: ocp: 32-bit fixups for pci start address
  Revert "net/smc: fix listen processing for SMC-Rv2"
  ath6kl: Use cc-disable-warning to disable -Wdangling-pointer
  selftests/bpf: Dynptr tests
  bpf: Add dynptr data slices
  bpf: Add bpf_dynptr_read and bpf_dynptr_write
  bpf: Dynptr support for ring buffers
  bpf: Add bpf_dynptr_from_mem for local dynptrs
  bpf: Add verifier support for dynptrs
  bpf: Suppress 'passing zero to PTR_ERR' warning
  bpf: Introduce bpf_arch_text_invalidate for bpf_prog_pack
  ...
2022-05-25 12:22:58 -07:00
Linus Torvalds
ac2ab99072 Random number generator updates for Linux 5.19-rc1.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmKKpM8ACgkQSfxwEqXe
 A6726w/+OJimGd4arvpSmdn+vxepSyDLgKfwM0x5zprRVd16xg8CjJr4eMonTesq
 YvtJRqpetb53MB+sMhutlvQqQzrjtf2MBkgPwF4I2gUrk7vLD45Q+AGdGhi/rUwz
 wHGA7xg1FHLHia2M/9idSqi8QlZmUP4u4l5ZnMyTUHiwvRD6XOrWKfqvUSawNzyh
 hCWlTUxDrjizsW5YpsJX/MkRadSC8loJEk5ByZebow6nRPfurJvqfrcOMgHyNrbY
 pOZ/CGPxcetMqotL2TuuJt5wKmenqYhIWGAp3YM2SWWgU2ueBZekW8AYeMfgUcvh
 LWV93RpSuAnE5wsdjIULvjFnEDJBf8ihfMnMrd9G5QjQu44tuKWfY2MghLSpYzaR
 V6UFbRmhrqhqiStHQXOvk1oqxtpbHlc9zzJLmvPmDJcbvzXQ9Opk5GVXAmdtnHnj
 M/ty3wGWxucY6mHqT8MkCShSSslbgEtc1pEIWHdrUgnaiSVoCVBEO+9LqLbjvOTm
 XA/6YtoiCE5FasK51pir1zVb2GORQn0v8HnuAOsusD/iPAlRQ/G5jZkaXbwRQI6j
 atYL1svqvSKn5POnzqAlMUXfMUr19K5xqJdp7i6qmlO1Vq6Z+tWbCQgD1JV+Wjkb
 CMyvXomFCFu4aYKGRE2SBRnWLRghG3kYHqEQ15yTPMQerxbUDNg=
 =SUr3
 -----END PGP SIGNATURE-----

Merge tag 'random-5.19-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random

Pull random number generator updates from Jason Donenfeld:
 "These updates continue to refine the work began in 5.17 and 5.18 of
  modernizing the RNG's crypto and streamlining and documenting its
  code.

  New for 5.19, the updates aim to improve entropy collection methods
  and make some initial decisions regarding the "premature next" problem
  and our threat model. The cloc utility now reports that random.c is
  931 lines of code and 466 lines of comments, not that basic metrics
  like that mean all that much, but at the very least it tells you that
  this is very much a manageable driver now.

  Here's a summary of the various updates:

   - The random_get_entropy() function now always returns something at
     least minimally useful. This is the primary entropy source in most
     collectors, which in the best case expands to something like RDTSC,
     but prior to this change, in the worst case it would just return 0,
     contributing nothing. For 5.19, additional architectures are wired
     up, and architectures that are entirely missing a cycle counter now
     have a generic fallback path, which uses the highest resolution
     clock available from the timekeeping subsystem.

     Some of those clocks can actually be quite good, despite the CPU
     not having a cycle counter of its own, and going off-core for a
     stamp is generally thought to increase jitter, something positive
     from the perspective of entropy gathering. Done very early on in
     the development cycle, this has been sitting in next getting some
     testing for a while now and has relevant acks from the archs, so it
     should be pretty well tested and fine, but is nonetheless the thing
     I'll be keeping my eye on most closely.

   - Of particular note with the random_get_entropy() improvements is
     MIPS, which, on CPUs that lack the c0 count register, will now
     combine the high-speed but short-cycle c0 random register with the
     lower-speed but long-cycle generic fallback path.

   - With random_get_entropy() now always returning something useful,
     the interrupt handler now collects entropy in a consistent
     construction.

   - Rather than comparing two samples of random_get_entropy() for the
     jitter dance, the algorithm now tests many samples, and uses the
     amount of differing ones to determine whether or not jitter entropy
     is usable and how laborious it must be. The problem with comparing
     only two samples was that if the cycle counter was extremely slow,
     but just so happened to be on the cusp of a change, the slowness
     wouldn't be detected. Taking many samples fixes that to some
     degree.

     This, combined with the other improvements to random_get_entropy(),
     should make future unification of /dev/random and /dev/urandom
     maybe more possible. At the very least, were we to attempt it again
     today (we're not), it wouldn't break any of Guenter's test rigs
     that broke when we tried it with 5.18. So, not today, but perhaps
     down the road, that's something we can revisit.

   - We attempt to reseed the RNG immediately upon waking up from system
     suspend or hibernation, making use of the various timestamps about
     suspend time and such available, as well as the usual inputs such
     as RDRAND when available.

   - Batched randomness now falls back to ordinary randomness before the
     RNG is initialized. This provides more consistent guarantees to the
     types of random numbers being returned by the various accessors.

   - The "pre-init injection" code is now gone for good. I suspect you
     in particular will be happy to read that, as I recall you
     expressing your distaste for it a few months ago. Instead, to avoid
     a "premature first" issue, while still allowing for maximal amount
     of entropy availability during system boot, the first 128 bits of
     estimated entropy are used immediately as it arrives, with the next
     128 bits being buffered. And, as before, after the RNG has been
     fully initialized, it winds up reseeding anyway a few seconds later
     in most cases. This resulted in a pretty big simplification of the
     initialization code and let us remove various ad-hoc mechanisms
     like the ugly crng_pre_init_inject().

   - The RNG no longer pretends to handle the "premature next" security
     model, something that various academics and other RNG designs have
     tried to care about in the past. After an interesting mailing list
     thread, these issues are thought to be a) mainly academic and not
     practical at all, and b) actively harming the real security of the
     RNG by delaying new entropy additions after a potential compromise,
     making a potentially bad situation even worse. As well, in the
     first place, our RNG never even properly handled the premature next
     issue, so removing an incomplete solution to a fake problem was
     particularly nice.

     This allowed for numerous other simplifications in the code, which
     is a lot cleaner as a consequence. If you didn't see it before,
     https://lore.kernel.org/lkml/YmlMGx6+uigkGiZ0@zx2c4.com/ may be a
     thread worth skimming through.

   - While the interrupt handler received a separate code path years ago
     that avoids locks by using per-cpu data structures and a faster
     mixing algorithm, in order to reduce interrupt latency, input and
     disk events that are triggered in hardirq handlers were still
     hitting locks and more expensive algorithms. Those are now
     redirected to use the faster per-cpu data structures.

   - Rather than having the fake-crypto almost-siphash-based random32
     implementation be used right and left, and in many places where
     cryptographically secure randomness is desirable, the batched
     entropy code is now fast enough to replace that.

   - As usual, numerous code quality and documentation cleanups. For
     example, the initialization state machine now uses enum symbolic
     constants instead of just hard coding numbers everywhere.

   - Since the RNG initializes once, and then is always initialized
     thereafter, a pretty heavy amount of code used during that
     initialization is never used again. It is now completely cordoned
     off using static branches and it winds up in the .text.unlikely
     section so that it doesn't reduce cache compactness after the RNG
     is ready.

   - A variety of functions meant for waiting on the RNG to be
     initialized were only used by vsprintf, and in not a particularly
     optimal way. Replacing that usage with a more ordinary setup made
     it possible to remove those functions.

   - A cleanup of how we warn userspace about the use of uninitialized
     /dev/urandom and uninitialized get_random_bytes() usage.
     Interestingly, with the change you merged for 5.18 that attempts to
     use jitter (but does not block if it can't), the majority of users
     should never see those warnings for /dev/urandom at all now, and
     the one for in-kernel usage is mainly a debug thing.

   - The file_operations struct for /dev/[u]random now implements
     .read_iter and .write_iter instead of .read and .write, allowing it
     to also implement .splice_read and .splice_write, which makes
     splice(2) work again after it was broken here (and in many other
     places in the tree) during the set_fs() removal. This was a bit of
     a last minute arrival from Jens that hasn't had as much time to
     bake, so I'll be keeping my eye on this as well, but it seems
     fairly ordinary. Unfortunately, read_iter() is around 3% slower
     than read() in my tests, which I'm not thrilled about. But Jens and
     Al, spurred by this observation, seem to be making progress in
     removing the bottlenecks on the iter paths in the VFS layer in
     general, which should remove the performance gap for all drivers.

   - Assorted other bug fixes, cleanups, and optimizations.

   - A small SipHash cleanup"

* tag 'random-5.19-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (49 commits)
  random: check for signals after page of pool writes
  random: wire up fops->splice_{read,write}_iter()
  random: convert to using fops->write_iter()
  random: convert to using fops->read_iter()
  random: unify batched entropy implementations
  random: move randomize_page() into mm where it belongs
  random: remove mostly unused async readiness notifier
  random: remove get_random_bytes_arch() and add rng_has_arch_random()
  random: move initialization functions out of hot pages
  random: make consistent use of buf and len
  random: use proper return types on get_random_{int,long}_wait()
  random: remove extern from functions in header
  random: use static branch for crng_ready()
  random: credit architectural init the exact amount
  random: handle latent entropy and command line from random_init()
  random: use proper jiffies comparison macro
  random: remove ratelimiting for in-kernel unseeded randomness
  random: move initialization out of reseeding hot path
  random: avoid initializing twice in credit race
  random: use symbolic constants for crng_init states
  ...
2022-05-24 11:58:10 -07:00
Helge Deller
cdd00fe6aa parisc: Add dep_safe() macro to deposit a register in 32- and 64-kernels
Signed-off-by: Helge Deller <deller@gmx.de>
2022-05-23 13:44:24 +02:00
Helge Deller
be6aee1392 parisc: Fix wrong comment for shr macro
The comment that the source and target register can not be the same is
wrong. Instead on PA2.0 usage of extru can clobber upper 32-bits.
This patch fixes the comment.

Signed-off-by: Helge Deller <deller@gmx.de>
2022-05-23 13:44:24 +02:00
John David Anglin
c64c782e58 parisc: Don't hardcode assembler bit definitions in tmpalias code
Remove the hardcoded bit definitions in the tmpalias assembly code.
This makes it easy to change the size of the tmpalias region.

The alignment of the tmpalias region is reduced from 16 MB to 8 MB.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-05-23 13:44:24 +02:00
John David Anglin
1fc7db2401 parisc: Don't enforce DMA completion order in cache flushes
The only place we need to ensure all outstanding cache coherence
operations are complete is in invalidate_kernel_vmap_range. All
parisc drivers synchronize DMA operations internally and do not
call invalidate_kernel_vmap_range. We only need this for non-coherent
I/O operations.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-05-23 13:44:24 +02:00
Ilpo Järvinen
44e0b165b6 termbits.h: Remove posix_types.h include
Nothing in termbits seems to require anything from linux/posix_types.h.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20220509093446.6677-4-ilpo.jarvinen@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-19 18:25:26 +02:00
Ilpo Järvinen
c9b34088e8 termbits.h: Align lines & format
- Align c_cc defines.
- Remove extra newlines.
- Realign & adjust number of leading zeros.
- Reorder c_cflag defines to ascending order
- Make comment ending shorted (=remove period and one extra space from
  the comments in mips).

Co-developed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20220509093446.6677-3-ilpo.jarvinen@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-19 18:25:26 +02:00
Ilpo Järvinen
0b46ac44f2 termbits.h: create termbits-common.h for identical bits
Some defines are the same across all archs. Move the most obvious
intersection to termbits-common.h.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20220509093446.6677-2-ilpo.jarvinen@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-19 18:25:25 +02:00
John David Anglin
2de8b4cc20 parisc: Rewrite cache flush code for PA8800/PA8900
Originally, I was convinced that we needed to use tmpalias flushes
everwhere, for both user and kernel flushes. However, when I modified
flush_kernel_dcache_page_addr, to use a tmpalias flush, my c8000
would crash quite early when booting.

The PDC returns alias values of 0 for the icache and dcache. This
indicates that either the alias boundary is greater than 16MB or
equivalent aliasing doesn't work. I modified the tmpalias code to
make it easy to try alternate boundaries. I tried boundaries up to
128MB but still kernel tmpalias flushes didn't work on c8000.

This led me to conclude that tmpalias flushes don't work on PA8800
and PA8900 machines, and that we needed to flush directly using the
virtual address of user and kernel pages. This is likely the major
cause of instability on the c8000 and rp34xx machines.

Flushing user pages requires doing a temporary context switch as we
have to flush pages that don't belong to the current context. Further,
we have to deal with pages that aren't present. If a page isn't
present, the flush instructions fault on every line.

Other code has been rearranged and simplified based on testing. For
example, I introduced a flush_cache_dup_mm routine. flush_cache_mm
and flush_cache_dup_mm differ in that flush_cache_mm calls
purge_cache_pages and flush_cache_dup_mm calls flush_cache_pages.
In some implementations, pdc is more efficient than fdc. Based on
my testing, I don't believe there's any performance benefit on the
c8000.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-05-17 21:52:47 +02:00
Greg Kroah-Hartman
d6da35e0c6 Merge 5.18-rc7 into usb-next
We need the tty fixes in here as well, as we need to revert one of them :(

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-16 15:39:23 +02:00
Baolin Wang
ae07562909 mm: change huge_ptep_clear_flush() to return the original pte
Patch series "Fix CONT-PTE/PMD size hugetlb issue when unmapping or migrating", v4.

presently, migrating a hugetlb page or unmapping a poisoned hugetlb page,
we'll use ptep_clear_flush() and set_pte_at() to nuke the page table entry
and remap it, and this is incorrect for CONT-PTE or CONT-PMD size hugetlb
page, which will cause potential data consistent issue.  This patch set
will change to use hugetlb related APIs to fix this issue.

Note: Mike pointed out the huge_ptep_get() will only return the one
specific value, and it would not take into account the dirty or young bits
of CONT-PTE/PMDs like the huge_ptep_get_and_clear() [1].  This
inconsistent issue is not introduced by this patch set, and this issue
will be addressed in another thread [2].  Meanwhile the uffd for hugetlb
case [3] pointed out by Gerald also needs another patch to address.

[1] https://lore.kernel.org/linux-mm/85bd80b4-b4fd-0d3f-a2e5-149559f2f387@oracle.com/
[2] https://lore.kernel.org/all/cover.1651998586.git.baolin.wang@linux.alibaba.com/
[3] https://lore.kernel.org/linux-mm/20220503120343.6264e126@thinkpad/


This patch (of 3):

It is incorrect to use ptep_clear_flush() to nuke a hugetlb page table
when unmapping or migrating a hugetlb page, and will change to use
huge_ptep_clear_flush() instead in the following patches.

So this is a preparation patch, which changes the huge_ptep_clear_flush()
to return the original pte to help to nuke a hugetlb page table.

[baolin.wang@linux.alibaba.com: fix build in several more architectures]
  Link: https://lkml.kernel.org/r/0009a4cd-2826-e8be-e671-f050d4f18d5d@linux.alibaba.com
[sfr@canb.auug.org.au: fixup]
  Link: https://lkml.kernel.org/r/20220511181531.7f27a5c1@canb.auug.org.au
Link: https://lkml.kernel.org/r/cover.1652270205.git.baolin.wang@linux.alibaba.com
Link: https://lkml.kernel.org/r/20f77ddab90baa249bd24504c413189b82acde69.1652270205.git.baolin.wang@linux.alibaba.com
Link: https://lkml.kernel.org/r/cover.1652147571.git.baolin.wang@linux.alibaba.com
Link: https://lkml.kernel.org/r/dcf065868cce35bceaf138613ad27f17bb7c0c19.1652147571.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Cc: Rich Felker <dalias@libc.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-13 16:48:55 -07:00
Jason A. Donenfeld
8865bbe6ba parisc: define get_cycles macro for arch-override
PA-RISC defines a get_cycles() function, but it does not do the usual
`#define get_cycles get_cycles` dance, making it impossible for generic
code to see if an arch-specific function was defined. While the
get_cycles() ifdef is not currently used, the following timekeeping
patch in this series will depend on the macro existing (or not existing)
when defining random_get_entropy().

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-05-13 23:59:23 +02:00
Jakub Kicinski
9b19e57a3c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Build issue in drivers/net/ethernet/sfc/ptp.c
  54fccfdd7c ("sfc: efx_default_channel_type APIs can be static")
  49e6123c65 ("net: sfc: fix memory leak due to ptp channel")
https://lore.kernel.org/all/20220510130556.52598fe2@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-12 16:15:30 -07:00
Helge Deller
234ff4c585 parisc: Change MAX_ADDRESS to become unsigned long long
Dave noticed that for the 32-bit kernel MAX_ADDRESS should be a ULL,
otherwise this define would become 0:
	MAX_ADDRESS   (1UL << MAX_ADDRBITS)
It has no real effect on the kernel.

Signed-off-by: Helge Deller <deller@gmx.de>
Noticed-by: John David Anglin <dave.anglin@bell.net>
2022-05-08 20:01:11 +02:00
Ilpo Järvinen
6808b7f5c8 termbits: Convert octal defines to hex
Many archs have termbits.h as octal numbers. It makes hard for humans
to parse the magnitude of large numbers correctly and to compare with
hex ones of the same define.

Convert octal values to hex.

First step is an automated conversion with:

for i in $(git ls-files | grep 'termbits\.h'); do
	awk --non-decimal-data '/^#define\s+[A-Z][A-Z0-9]*\s+0[0-9]/ {
		l=int(((length($3) - 1) * 3 + 3) / 4);
		repl = sprintf("0x%0" l "x", $3);
		print gensub(/[^[:blank:]]+/, repl, 3);
		next} {print}' $i > $i~;
	mv $i~ $i;
done

On top of that, some manual processing on alignment and number of zeros.
In addition, small tweaks to formatting of a few comments on the same
lines.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Link: https://lore.kernel.org/r/2c8c96f-a12f-aadc-18ac-34c1d371929c@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-05 22:43:58 +02:00
Erin MacNeil
6fd1d51cfa net: SO_RCVMARK socket option for SO_MARK with recvmsg()
Adding a new socket option, SO_RCVMARK, to indicate that SO_MARK
should be included in the ancillary data returned by recvmsg().

Renamed the sock_recv_ts_and_drops() function to sock_recv_cmsgs().

Signed-off-by: Erin MacNeil <lnx.erin@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Link: https://lore.kernel.org/r/20220427200259.2564-1-lnx.erin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28 13:08:15 -07:00
Guo Ren
84a0c977ab
asm-generic: compat: Cleanup duplicate definitions
There are 7 64bit architectures that support Linux COMPAT mode to
run 32bit applications. A lot of definitions are duplicate:
 - COMPAT_USER_HZ
 - COMPAT_RLIM_INFINITY
 - COMPAT_OFF_T_MAX
 - __compat_uid_t, __compat_uid_t
 - compat_dev_t
 - compat_ipc_pid_t
 - struct compat_flock
 - struct compat_flock64
 - struct compat_statfs
 - struct compat_ipc64_perm, compat_semid64_ds,
	  compat_msqid64_ds, compat_shmid64_ds

Cleanup duplicate definitions and merge them into asm-generic.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Helge Deller <deller@gmx.de>  # parisc
Link: https://lore.kernel.org/r/20220405071314.3225832-7-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-04-26 13:35:54 -07:00
Guo Ren
f18ed30db2
fs: stat: compat: Add __ARCH_WANT_COMPAT_STAT
RISC-V doesn't neeed compat_stat, so using __ARCH_WANT_COMPAT_STAT
to exclude unnecessary SYSCALL functions.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Helge Deller <deller@gmx.de>  # parisc
Link: https://lore.kernel.org/r/20220405071314.3225832-6-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-04-26 13:35:45 -07:00
Christoph Hellwig
3ce0f2373f
compat: consolidate the compat_flock{,64} definition
Provide a single common definition for the compat_flock and
compat_flock64 structures using the same tricks as for the native
variants.  Another extra define is added for the packing required on
x86.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Helge Deller <deller@gmx.de>  # parisc
Link: https://lore.kernel.org/r/20220405071314.3225832-4-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-04-26 13:35:28 -07:00
Helge Deller
1afde47d08 parisc: Find a new timesync master if current CPU is removed
When CPU hotplugging is enabled, the user may want to remove the
current CPU which is providing the timer ticks. If this happens
we need to find a new timesync master.

Signed-off-by: Helge Deller <deller@gmx.de>
2022-03-29 21:37:13 +02:00
Helge Deller
88b3aac622 parisc: Implement __cpu_die() and __cpu_disable() for CPU hotplugging
Add relevant code to __cpu_die() and __cpu_disable() to finally enable
the CPU hotplugging features. Reset the irq count values in smp_callin()
to zero before bringing up the CPU.

It seems that the firmware may need up to 8 seconds to fully stop a CPU
in which no other PDC calls are allowed to be made. Use a timeout
__cpu_die() to accommodate for this.

Use "chcpu -d 1" to bring CPU1 down, and "chcpu -e 1" to bring it up.

Signed-off-by: Helge Deller <deller@gmx.de>
2022-03-29 21:37:12 +02:00
Helge Deller
db2b0d76cd parisc: Add PDC locking functions for rendezvous code
Add pdc_cpu_rendezvous_lock() and pdc_cpu_rendezvous_unlock()
to lock PDC while CPU is transitioning into rendezvous state.
This is needed, because the transition phase may take up to 8 seconds.

Add pdc_pat_get_PDC_entrypoint() to get PDC entry point for current CPU.

Signed-off-by: Helge Deller <deller@gmx.de>
2022-03-29 21:37:12 +02:00
Helge Deller
62773112ac parisc: Switch from GENERIC_CPU_DEVICES to GENERIC_ARCH_TOPOLOGY
Switch away from the own cpu topology code to common code which is used
by ARM64 and RISCV. That will allow us to enable CPU hotplug later on.

Signed-off-by: Helge Deller <deller@gmx.de>
2022-03-29 21:37:12 +02:00
Helge Deller
7e0b2be94c parisc: Add constants for control registers and clean up mfctl()
Clean up the code for the mfctl() and mtctl() functions and add often
used constants.

Signed-off-by: Helge Deller <deller@gmx.de>
2022-03-29 21:37:12 +02:00
Linus Torvalds
29c8c18363 Merge branch 'akpm' (patches from Andrew)
Merge yet more updates from Andrew Morton:
 "This is the material which was staged after willystuff in linux-next.

  Subsystems affected by this patch series: mm (debug, selftests,
  pagecache, thp, rmap, migration, kasan, hugetlb, pagemap, madvise),
  and selftests"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (113 commits)
  selftests: kselftest framework: provide "finished" helper
  mm: madvise: MADV_DONTNEED_LOCKED
  mm: fix race between MADV_FREE reclaim and blkdev direct IO read
  mm: generalize ARCH_HAS_FILTER_PGPROT
  mm: unmap_mapping_range_tree() with i_mmap_rwsem shared
  mm: warn on deleting redirtied only if accounted
  mm/huge_memory: remove stale locking logic from __split_huge_pmd()
  mm/huge_memory: remove stale page_trans_huge_mapcount()
  mm/swapfile: remove stale reuse_swap_page()
  mm/khugepaged: remove reuse_swap_page() usage
  mm/huge_memory: streamline COW logic in do_huge_pmd_wp_page()
  mm: streamline COW logic in do_swap_page()
  mm: slightly clarify KSM logic in do_swap_page()
  mm: optimize do_wp_page() for fresh pages in local LRU pagevecs
  mm: optimize do_wp_page() for exclusive pages in the swapcache
  mm/huge_memory: make is_transparent_hugepage() static
  userfaultfd/selftests: enable hugetlb remap and remove event testing
  selftests/vm: add hugetlb madvise MADV_DONTNEED MADV_REMOVE test
  mm: enable MADV_DONTNEED for hugetlb mappings
  kasan: disable LOCKDEP when printing reports
  ...
2022-03-25 10:21:20 -07:00
Linus Torvalds
1f1c153e40 powerpc updates for 5.18
- Enforce kernel RO, and implement STRICT_MODULE_RWX for 603.
 
  - Add support for livepatch to 32-bit.
 
  - Implement CONFIG_DYNAMIC_FTRACE_WITH_ARGS.
 
  - Merge vdso64 and vdso32 into a single directory.
 
  - Fix build errors with newer binutils.
 
  - Add support for UADDR64 relocations, which are emitted by some toolchains. This allows
    powerpc to build with the latest lld.
 
  - Fix (another) potential userspace r13 corruption in transactional memory handling.
 
  - Cleanups of function descriptor handling & related fixes to LKDTM.
 
 Thanks to: Abdul Haleem, Alexey Kardashevskiy, Anders Roxell, Aneesh Kumar K.V, Anton
 Blanchard, Arnd Bergmann, Athira Rajeev, Bhaskar Chowdhury, Cédric Le Goater, Chen
 Jingwen, Christophe JAILLET, Christophe Leroy, Corentin Labbe, Daniel Axtens, Daniel
 Henrique Barboza, David Dai, Fabiano Rosas, Ganesh Goudar, Guo Zhengkui, Hangyu Hua, Haren
 Myneni, Hari Bathini, Igor Zhbanov, Jakob Koschel, Jason Wang, Jeremy Kerr, Joachim
 Wiberg, Jordan Niethe, Julia Lawall, Kajol Jain, Kees Cook, Laurent Dufour, Madhavan
 Srinivasan, Mamatha Inamdar, Maxime Bizon, Maxim Kiselev, Maxim Kochetkov, Michal
 Suchanek, Nageswara R Sastry, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Nour-eddine
 Taleb, Paul Menzel, Ping Fang, Pratik R. Sampat, Randy Dunlap, Ritesh Harjani, Rohan
 McLure, Russell Currey, Sachin Sant, Segher Boessenkool, Shivaprasad G Bhat, Sourabh Jain,
 Thierry Reding, Tobias Waldekranz, Tyrel Datwyler, Vaibhav Jain, Vladimir Oltean, Wedson
 Almeida Filho, YueHaibing.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmI9TtQTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgLp2D/0dwoliEJubRCfoawYUGhxTRZuo6ZYw
 EQzprOiFA/MtrZyPfbrX/FwxeeetzQWysaw2r5JAuQwx5Jb7Od9dNIrVmueFEktC
 hD4fkO8YT+QuOD3Xhp/rDQTImdw4fkeofIjnWIqEAtz0XGInmiRQKOnojVe/Po7f
 72Yi1u80LxYBAnkN/Hhpmi/BsVmu0Nh3wELu+JZopQXjINj4RyD49ayCBSLbmiNc
 uo7oYzJ0/WsZHNTpX9kAzzCq+XmI3dKZPyf2AOCvoRxJTmUPCRZF9QCwsmQFikiI
 vZOdz4fI5e+C0aYJj8ODmWMrXiS+JUQdEShjGg9t9K6EN8idC8joKWpAuXjTA9KN
 kRjzXX7AvjxaMEGbLe8gjU0PmEjY3eSzMOy15Oc/C0DRRswXRzrXdx2AF+/J6bQb
 MWMM4aCKfcYs5/TENkEnV0xpbOCOy4ikHM1KZbxvVrShvjSlNIL9XTOnl/pNK5BJ
 XSSI2mfnjKkbI1+l0KQ4NBXIRTo6HLpu5jwY3Xh97Tq7kaEfqDbO5p2P2HoOCiLa
 ZjdzmpP99zM6wnqUSj+lyvjob7btyhoq6TKmPtxfKbR6OaSfRJ760BCJ5y15Y9Hc
 rHey4Y/NL7LqsVYFZxi4/T6Ncq1hNeYr2Fiis4gH+/1zjr6Cd4othnvw3Slaxhst
 AaHpN3pyx1QI6g==
 =8r2c
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Livepatch support for 32-bit is probably the standout new feature,
  otherwise mostly just lots of bits and pieces all over the board.

  There's a series of commits cleaning up function descriptor handling,
  which touches a few other arches as well as LKDTM. It has acks from
  Arnd, Kees and Helge.

  Summary:

   - Enforce kernel RO, and implement STRICT_MODULE_RWX for 603.

   - Add support for livepatch to 32-bit.

   - Implement CONFIG_DYNAMIC_FTRACE_WITH_ARGS.

   - Merge vdso64 and vdso32 into a single directory.

   - Fix build errors with newer binutils.

   - Add support for UADDR64 relocations, which are emitted by some
     toolchains. This allows powerpc to build with the latest lld.

   - Fix (another) potential userspace r13 corruption in transactional
     memory handling.

   - Cleanups of function descriptor handling & related fixes to LKDTM.

  Thanks to Abdul Haleem, Alexey Kardashevskiy, Anders Roxell, Aneesh
  Kumar K.V, Anton Blanchard, Arnd Bergmann, Athira Rajeev, Bhaskar
  Chowdhury, Cédric Le Goater, Chen Jingwen, Christophe JAILLET,
  Christophe Leroy, Corentin Labbe, Daniel Axtens, Daniel Henrique
  Barboza, David Dai, Fabiano Rosas, Ganesh Goudar, Guo Zhengkui, Hangyu
  Hua, Haren Myneni, Hari Bathini, Igor Zhbanov, Jakob Koschel, Jason
  Wang, Jeremy Kerr, Joachim Wiberg, Jordan Niethe, Julia Lawall, Kajol
  Jain, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Mamatha Inamdar,
  Maxime Bizon, Maxim Kiselev, Maxim Kochetkov, Michal Suchanek,
  Nageswara R Sastry, Nathan Lynch, Naveen N. Rao, Nicholas Piggin,
  Nour-eddine Taleb, Paul Menzel, Ping Fang, Pratik R. Sampat, Randy
  Dunlap, Ritesh Harjani, Rohan McLure, Russell Currey, Sachin Sant,
  Segher Boessenkool, Shivaprasad G Bhat, Sourabh Jain, Thierry Reding,
  Tobias Waldekranz, Tyrel Datwyler, Vaibhav Jain, Vladimir Oltean,
  Wedson Almeida Filho, and YueHaibing"

* tag 'powerpc-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (179 commits)
  powerpc/pseries: Fix use after free in remove_phb_dynamic()
  powerpc/time: improve decrementer clockevent processing
  powerpc/time: Fix KVM host re-arming a timer beyond decrementer range
  powerpc/tm: Fix more userspace r13 corruption
  powerpc/xive: fix return value of __setup handler
  powerpc/64: Add UADDR64 relocation support
  powerpc: 8xx: fix a return value error in mpc8xx_pic_init
  powerpc/ps3: remove unneeded semicolons
  powerpc/64: Force inlining of prevent_user_access() and set_kuap()
  powerpc/bitops: Force inlining of fls()
  powerpc: declare unmodified attribute_group usages const
  powerpc/spufs: Fix build warning when CONFIG_PROC_FS=n
  powerpc/secvar: fix refcount leak in format_show()
  powerpc/64e: Tie PPC_BOOK3E_64 to PPC_FSL_BOOK3E
  powerpc: Move C prototypes out of asm-prototypes.h
  powerpc/kexec: Declare kexec_paca static
  powerpc/smp: Declare current_set static
  powerpc: Cleanup asm-prototypes.c
  powerpc/ftrace: Use STK_GOT in ftrace_mprofile.S
  powerpc/ftrace: Regroup PPC64 specific operations in ftrace_mprofile.S
  ...
2022-03-25 09:39:36 -07:00
Johannes Weiner
9457056ac4 mm: madvise: MADV_DONTNEED_LOCKED
MADV_DONTNEED historically rejects mlocked ranges, but with MLOCK_ONFAULT
and MCL_ONFAULT allowing to mlock without populating, there are valid use
cases for depopulating locked ranges as well.

Users mlock memory to protect secrets.  There are allocators for secure
buffers that want in-use memory generally mlocked, but cleared and
invalidated memory to give up the physical pages.  This could be done with
explicit munlock -> mlock calls on free -> alloc of course, but that adds
two unnecessary syscalls, heavy mmap_sem write locks, vma splits and
re-merges - only to get rid of the backing pages.

Users also mlockall(MCL_ONFAULT) to suppress sustained paging, but are
okay with on-demand initial population.  It seems valid to selectively
free some memory during the lifetime of such a process, without having to
mess with its overall policy.

Why add a separate flag? Isn't this a pretty niche usecase?

- MADV_DONTNEED has been bailing on locked vmas forever. It's at least
  conceivable that someone, somewhere is relying on mlock to protect
  data from perhaps broader invalidation calls. Changing this behavior
  now could lead to quiet data corruption.

- It also clarifies expectations around MADV_FREE and maybe
  MADV_REMOVE. It avoids the situation where one quietly behaves
  different than the others. MADV_FREE_LOCKED can be added later.

- The combination of mlock() and madvise() in the first place is
  probably niche. But where it happens, I'd say that dropping pages
  from a locked region once they don't contain secrets or won't page
  anymore is much saner than relying on mlock to protect memory from
  speculative or errant invalidation calls. It's just that we can't
  change the default behavior because of the two previous points.

Given that, an explicit new flag seems to make the most sense.

[hannes@cmpxchg.org: fix mips build]

Link: https://lkml.kernel.org/r/20220304171912.305060-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-24 19:06:51 -07:00
Linus Torvalds
169e77764a Networking changes for 5.18.
Core
 ----
 
  - Introduce XDP multi-buffer support, allowing the use of XDP with
    jumbo frame MTUs and combination with Rx coalescing offloads (LRO).
 
  - Speed up netns dismantling (5x) and lower the memory cost a little.
    Remove unnecessary per-netns sockets. Scope some lists to a netns.
    Cut down RCU syncing. Use batch methods. Allow netdev registration
    to complete out of order.
 
  - Support distinguishing timestamp types (ingress vs egress) and
    maintaining them across packet scrubbing points (e.g. redirect).
 
  - Continue the work of annotating packet drop reasons throughout
    the stack.
 
  - Switch netdev error counters from an atomic to dynamically
    allocated per-CPU counters.
 
  - Rework a few preempt_disable(), local_irq_save() and busy waiting
    sections problematic on PREEMPT_RT.
 
  - Extend the ref_tracker to allow catching use-after-free bugs.
 
 BPF
 ---
 
  - Introduce "packing allocator" for BPF JIT images. JITed code is
    marked read only, and used to be allocated at page granularity.
    Custom allocator allows for more efficient memory use, lower
    iTLB pressure and prevents identity mapping huge pages from
    getting split.
 
  - Make use of BTF type annotations (e.g. __user, __percpu) to enforce
    the correct probe read access method, add appropriate helpers.
 
  - Convert the BPF preload to use light skeleton and drop
    the user-mode-driver dependency.
 
  - Allow XDP BPF_PROG_RUN test infra to send real packets, enabling
    its use as a packet generator.
 
  - Allow local storage memory to be allocated with GFP_KERNEL if called
    from a hook allowed to sleep.
 
  - Introduce fprobe (multi kprobe) to speed up mass attachment (arch
    bits to come later).
 
  - Add unstable conntrack lookup helpers for BPF by using the BPF
    kfunc infra.
 
  - Allow cgroup BPF progs to return custom errors to user space.
 
  - Add support for AF_UNIX iterator batching.
 
  - Allow iterator programs to use sleepable helpers.
 
  - Support JIT of add, and, or, xor and xchg atomic ops on arm64.
 
  - Add BTFGen support to bpftool which allows to use CO-RE in kernels
    without BTF info.
 
  - Large number of libbpf API improvements, cleanups and deprecations.
 
 Protocols
 ---------
 
  - Micro-optimize UDPv6 Tx, gaining up to 5% in test on dummy netdev.
 
  - Adjust TSO packet sizes based on min_rtt, allowing very low latency
    links (data centers) to always send full-sized TSO super-frames.
 
  - Make IPv6 flow label changes (AKA hash rethink) more configurable,
    via sysctl and setsockopt. Distinguish between server and client
    behavior.
 
  - VxLAN support to "collect metadata" devices to terminate only
    configured VNIs. This is similar to VLAN filtering in the bridge.
 
  - Support inserting IPv6 IOAM information to a fraction of frames.
 
  - Add protocol attribute to IP addresses to allow identifying where
    given address comes from (kernel-generated, DHCP etc.)
 
  - Support setting socket and IPv6 options via cmsg on ping6 sockets.
 
  - Reject mis-use of ECN bits in IP headers as part of DSCP/TOS.
    Define dscp_t and stop taking ECN bits into account in fib-rules.
 
  - Add support for locked bridge ports (for 802.1X).
 
  - tun: support NAPI for packets received from batched XDP buffs,
    doubling the performance in some scenarios.
 
  - IPv6 extension header handling in Open vSwitch.
 
  - Support IPv6 control message load balancing in bonding, prevent
    neighbor solicitation and advertisement from using the wrong port.
    Support NS/NA monitor selection similar to existing ARP monitor.
 
  - SMC
    - improve performance with TCP_CORK and sendfile()
    - support auto-corking
    - support TCP_NODELAY
 
  - MCTP (Management Component Transport Protocol)
    - add user space tag control interface
    - I2C binding driver (as specified by DMTF DSP0237)
 
  - Multi-BSSID beacon handling in AP mode for WiFi.
 
  - Bluetooth:
    - handle MSFT Monitor Device Event
    - add MGMT Adv Monitor Device Found/Lost events
 
  - Multi-Path TCP:
    - add support for the SO_SNDTIMEO socket option
    - lots of selftest cleanups and improvements
 
  - Increase the max PDU size in CAN ISOTP to 64 kB.
 
 Driver API
 ----------
 
  - Add HW counters for SW netdevs, a mechanism for devices which
    offload packet forwarding to report packet statistics back to
    software interfaces such as tunnels.
 
  - Select the default NIC queue count as a fraction of number of
    physical CPU cores, instead of hard-coding to 8.
 
  - Expose devlink instance locks to drivers. Allow device layer of
    drivers to use that lock directly instead of creating their own
    which always runs into ordering issues in devlink callbacks.
 
  - Add header/data split indication to guide user space enabling
    of TCP zero-copy Rx.
 
  - Allow configuring completion queue event size.
 
  - Refactor page_pool to enable fragmenting after allocation.
 
  - Add allocation and page reuse statistics to page_pool.
 
  - Improve Multiple Spanning Trees support in the bridge to allow
    reuse of topologies across VLANs, saving HW resources in switches.
 
  - DSA (Distributed Switch Architecture):
    - replay and offload of host VLAN entries
    - offload of static and local FDB entries on LAG interfaces
    - FDB isolation and unicast filtering
 
 New hardware / drivers
 ----------------------
 
  - Ethernet:
    - LAN937x T1 PHYs
    - Davicom DM9051 SPI NIC driver
    - Realtek RTL8367S, RTL8367RB-VB switch and MDIO
    - Microchip ksz8563 switches
    - Netronome NFP3800 SmartNICs
    - Fungible SmartNICs
    - MediaTek MT8195 switches
 
  - WiFi:
    - mt76: MediaTek mt7916
    - mt76: MediaTek mt7921u USB adapters
    - brcmfmac: Broadcom BCM43454/6
 
  - Mobile:
    - iosm: Intel M.2 7360 WWAN card
 
 Drivers
 -------
 
  - Convert many drivers to the new phylink API built for split PCS
    designs but also simplifying other cases.
 
  - Intel Ethernet NICs:
    - add TTY for GNSS module for E810T device
    - improve AF_XDP performance
    - GTP-C and GTP-U filter offload
    - QinQ VLAN support
 
  - Mellanox Ethernet NICs (mlx5):
    - support xdp->data_meta
    - multi-buffer XDP
    - offload tc push_eth and pop_eth actions
 
  - Netronome Ethernet NICs (nfp):
    - flow-independent tc action hardware offload (police / meter)
    - AF_XDP
 
  - Other Ethernet NICs:
    - at803x: fiber and SFP support
    - xgmac: mdio: preamble suppression and custom MDC frequencies
    - r8169: enable ASPM L1.2 if system vendor flags it as safe
    - macb/gem: ZynqMP SGMII
    - hns3: add TX push mode
    - dpaa2-eth: software TSO
    - lan743x: multi-queue, mdio, SGMII, PTP
    - axienet: NAPI and GRO support
 
  - Mellanox Ethernet switches (mlxsw):
    - source and dest IP address rewrites
    - RJ45 ports
 
  - Marvell Ethernet switches (prestera):
    - basic routing offload
    - multi-chain TC ACL offload
 
  - NXP embedded Ethernet switches (ocelot & felix):
    - PTP over UDP with the ocelot-8021q DSA tagging protocol
    - basic QoS classification on Felix DSA switch using dcbnl
    - port mirroring for ocelot switches
 
  - Microchip high-speed industrial Ethernet (sparx5):
    - offloading of bridge port flooding flags
    - PTP Hardware Clock
 
  - Other embedded switches:
    - lan966x: PTP Hardward Clock
    - qca8k: mdio read/write operations via crafted Ethernet packets
 
  - Qualcomm 802.11ax WiFi (ath11k):
    - add LDPC FEC type and 802.11ax High Efficiency data in radiotap
    - enable RX PPDU stats in monitor co-exist mode
 
  - Intel WiFi (iwlwifi):
    - UHB TAS enablement via BIOS
    - band disablement via BIOS
    - channel switch offload
    - 32 Rx AMPDU sessions in newer devices
 
  - MediaTek WiFi (mt76):
    - background radar detection
    - thermal management improvements on mt7915
    - SAR support for more mt76 platforms
    - MBSSID and 6 GHz band on mt7915
 
  - RealTek WiFi:
    - rtw89: AP mode
    - rtw89: 160 MHz channels and 6 GHz band
    - rtw89: hardware scan
 
  - Bluetooth:
    - mt7921s: wake on Bluetooth, SCO over I2S, wide-band-speed (WBS)
 
  - Microchip CAN (mcp251xfd):
    - multiple RX-FIFOs and runtime configurable RX/TX rings
    - internal PLL, runtime PM handling simplification
    - improve chip detection and error handling after wakeup
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmI7YBcACgkQMUZtbf5S
 IrveSBAAmSNJlUK6vPsnNzs7IhsZnfI/AUjm2TCLZnlhKttbpI4A/4Pohk33V7RS
 FGX7f8kjEfhUwrIiLDgeCnztNHRECrCmk6aZc/jLEvecmTauJ+f6kjShkDY/wix+
 AkPHmrZnQeLPAEVuljDdV+sL6ik08+zQL7PazIYHsaSKKC0MGQptRwcri8PLRAKE
 KPBAhVhleq2rAZ/ntprSN52F4Af6rpFTrPIWuN8Bqdbc9dy5094LT0mpOOWYvgr3
 /DLvvAPuLemwyIQkjWknVKBRUAQcmNPC+BY3J8K3LRaiNhekGqOFan46BfqP+k2J
 6DWu0Qrp2yWt4BMOeEToZR5rA6v5suUAMIBu8PRZIDkINXQMlIxHfGjZyNm0rVfw
 7edNri966yus9OdzwPa32MIG3oC6PnVAwYCJAjjBMNS8sSIkp7wgHLkgWN4UFe2H
 K/e6z8TLF4UQ+zFM0aGI5WZ+9QqWkTWEDF3R3OhdFpGrznna0gxmkOeV2YvtsgxY
 cbS0vV9Zj73o+bYzgBKJsw/dAjyLdXoHUGvus26VLQ78S/VGunVKtItwoxBAYmZo
 krW964qcC89YofzSi8RSKLHuEWtNWZbVm8YXr75u6jpr5GhMBu0CYefLs+BuZcxy
 dw8c69cGneVbGZmY2J3rBhDkchbuICl8vdUPatGrOJAoaFdYKuw=
 =ELpe
 -----END PGP SIGNATURE-----

Merge tag 'net-next-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "The sprinkling of SPI drivers is because we added a new one and Mark
  sent us a SPI driver interface conversion pull request.

  Core
  ----

   - Introduce XDP multi-buffer support, allowing the use of XDP with
     jumbo frame MTUs and combination with Rx coalescing offloads (LRO).

   - Speed up netns dismantling (5x) and lower the memory cost a little.
     Remove unnecessary per-netns sockets. Scope some lists to a netns.
     Cut down RCU syncing. Use batch methods. Allow netdev registration
     to complete out of order.

   - Support distinguishing timestamp types (ingress vs egress) and
     maintaining them across packet scrubbing points (e.g. redirect).

   - Continue the work of annotating packet drop reasons throughout the
     stack.

   - Switch netdev error counters from an atomic to dynamically
     allocated per-CPU counters.

   - Rework a few preempt_disable(), local_irq_save() and busy waiting
     sections problematic on PREEMPT_RT.

   - Extend the ref_tracker to allow catching use-after-free bugs.

  BPF
  ---

   - Introduce "packing allocator" for BPF JIT images. JITed code is
     marked read only, and used to be allocated at page granularity.
     Custom allocator allows for more efficient memory use, lower iTLB
     pressure and prevents identity mapping huge pages from getting
     split.

   - Make use of BTF type annotations (e.g. __user, __percpu) to enforce
     the correct probe read access method, add appropriate helpers.

   - Convert the BPF preload to use light skeleton and drop the
     user-mode-driver dependency.

   - Allow XDP BPF_PROG_RUN test infra to send real packets, enabling
     its use as a packet generator.

   - Allow local storage memory to be allocated with GFP_KERNEL if
     called from a hook allowed to sleep.

   - Introduce fprobe (multi kprobe) to speed up mass attachment (arch
     bits to come later).

   - Add unstable conntrack lookup helpers for BPF by using the BPF
     kfunc infra.

   - Allow cgroup BPF progs to return custom errors to user space.

   - Add support for AF_UNIX iterator batching.

   - Allow iterator programs to use sleepable helpers.

   - Support JIT of add, and, or, xor and xchg atomic ops on arm64.

   - Add BTFGen support to bpftool which allows to use CO-RE in kernels
     without BTF info.

   - Large number of libbpf API improvements, cleanups and deprecations.

  Protocols
  ---------

   - Micro-optimize UDPv6 Tx, gaining up to 5% in test on dummy netdev.

   - Adjust TSO packet sizes based on min_rtt, allowing very low latency
     links (data centers) to always send full-sized TSO super-frames.

   - Make IPv6 flow label changes (AKA hash rethink) more configurable,
     via sysctl and setsockopt. Distinguish between server and client
     behavior.

   - VxLAN support to "collect metadata" devices to terminate only
     configured VNIs. This is similar to VLAN filtering in the bridge.

   - Support inserting IPv6 IOAM information to a fraction of frames.

   - Add protocol attribute to IP addresses to allow identifying where
     given address comes from (kernel-generated, DHCP etc.)

   - Support setting socket and IPv6 options via cmsg on ping6 sockets.

   - Reject mis-use of ECN bits in IP headers as part of DSCP/TOS.
     Define dscp_t and stop taking ECN bits into account in fib-rules.

   - Add support for locked bridge ports (for 802.1X).

   - tun: support NAPI for packets received from batched XDP buffs,
     doubling the performance in some scenarios.

   - IPv6 extension header handling in Open vSwitch.

   - Support IPv6 control message load balancing in bonding, prevent
     neighbor solicitation and advertisement from using the wrong port.
     Support NS/NA monitor selection similar to existing ARP monitor.

   - SMC
      - improve performance with TCP_CORK and sendfile()
      - support auto-corking
      - support TCP_NODELAY

   - MCTP (Management Component Transport Protocol)
      - add user space tag control interface
      - I2C binding driver (as specified by DMTF DSP0237)

   - Multi-BSSID beacon handling in AP mode for WiFi.

   - Bluetooth:
      - handle MSFT Monitor Device Event
      - add MGMT Adv Monitor Device Found/Lost events

   - Multi-Path TCP:
      - add support for the SO_SNDTIMEO socket option
      - lots of selftest cleanups and improvements

   - Increase the max PDU size in CAN ISOTP to 64 kB.

  Driver API
  ----------

   - Add HW counters for SW netdevs, a mechanism for devices which
     offload packet forwarding to report packet statistics back to
     software interfaces such as tunnels.

   - Select the default NIC queue count as a fraction of number of
     physical CPU cores, instead of hard-coding to 8.

   - Expose devlink instance locks to drivers. Allow device layer of
     drivers to use that lock directly instead of creating their own
     which always runs into ordering issues in devlink callbacks.

   - Add header/data split indication to guide user space enabling of
     TCP zero-copy Rx.

   - Allow configuring completion queue event size.

   - Refactor page_pool to enable fragmenting after allocation.

   - Add allocation and page reuse statistics to page_pool.

   - Improve Multiple Spanning Trees support in the bridge to allow
     reuse of topologies across VLANs, saving HW resources in switches.

   - DSA (Distributed Switch Architecture):
      - replay and offload of host VLAN entries
      - offload of static and local FDB entries on LAG interfaces
      - FDB isolation and unicast filtering

  New hardware / drivers
  ----------------------

   - Ethernet:
      - LAN937x T1 PHYs
      - Davicom DM9051 SPI NIC driver
      - Realtek RTL8367S, RTL8367RB-VB switch and MDIO
      - Microchip ksz8563 switches
      - Netronome NFP3800 SmartNICs
      - Fungible SmartNICs
      - MediaTek MT8195 switches

   - WiFi:
      - mt76: MediaTek mt7916
      - mt76: MediaTek mt7921u USB adapters
      - brcmfmac: Broadcom BCM43454/6

   - Mobile:
      - iosm: Intel M.2 7360 WWAN card

  Drivers
  -------

   - Convert many drivers to the new phylink API built for split PCS
     designs but also simplifying other cases.

   - Intel Ethernet NICs:
      - add TTY for GNSS module for E810T device
      - improve AF_XDP performance
      - GTP-C and GTP-U filter offload
      - QinQ VLAN support

   - Mellanox Ethernet NICs (mlx5):
      - support xdp->data_meta
      - multi-buffer XDP
      - offload tc push_eth and pop_eth actions

   - Netronome Ethernet NICs (nfp):
      - flow-independent tc action hardware offload (police / meter)
      - AF_XDP

   - Other Ethernet NICs:
      - at803x: fiber and SFP support
      - xgmac: mdio: preamble suppression and custom MDC frequencies
      - r8169: enable ASPM L1.2 if system vendor flags it as safe
      - macb/gem: ZynqMP SGMII
      - hns3: add TX push mode
      - dpaa2-eth: software TSO
      - lan743x: multi-queue, mdio, SGMII, PTP
      - axienet: NAPI and GRO support

   - Mellanox Ethernet switches (mlxsw):
      - source and dest IP address rewrites
      - RJ45 ports

   - Marvell Ethernet switches (prestera):
      - basic routing offload
      - multi-chain TC ACL offload

   - NXP embedded Ethernet switches (ocelot & felix):
      - PTP over UDP with the ocelot-8021q DSA tagging protocol
      - basic QoS classification on Felix DSA switch using dcbnl
      - port mirroring for ocelot switches

   - Microchip high-speed industrial Ethernet (sparx5):
      - offloading of bridge port flooding flags
      - PTP Hardware Clock

   - Other embedded switches:
      - lan966x: PTP Hardward Clock
      - qca8k: mdio read/write operations via crafted Ethernet packets

   - Qualcomm 802.11ax WiFi (ath11k):
      - add LDPC FEC type and 802.11ax High Efficiency data in radiotap
      - enable RX PPDU stats in monitor co-exist mode

   - Intel WiFi (iwlwifi):
      - UHB TAS enablement via BIOS
      - band disablement via BIOS
      - channel switch offload
      - 32 Rx AMPDU sessions in newer devices

   - MediaTek WiFi (mt76):
      - background radar detection
      - thermal management improvements on mt7915
      - SAR support for more mt76 platforms
      - MBSSID and 6 GHz band on mt7915

   - RealTek WiFi:
      - rtw89: AP mode
      - rtw89: 160 MHz channels and 6 GHz band
      - rtw89: hardware scan

   - Bluetooth:
      - mt7921s: wake on Bluetooth, SCO over I2S, wide-band-speed (WBS)

   - Microchip CAN (mcp251xfd):
      - multiple RX-FIFOs and runtime configurable RX/TX rings
      - internal PLL, runtime PM handling simplification
      - improve chip detection and error handling after wakeup"

* tag 'net-next-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2521 commits)
  llc: fix netdevice reference leaks in llc_ui_bind()
  drivers: ethernet: cpsw: fix panic when interrupt coaleceing is set via ethtool
  ice: don't allow to run ice_send_event_to_aux() in atomic ctx
  ice: fix 'scheduling while atomic' on aux critical err interrupt
  net/sched: fix incorrect vlan_push_eth dest field
  net: bridge: mst: Restrict info size queries to bridge ports
  net: marvell: prestera: add missing destroy_workqueue() in prestera_module_init()
  drivers: net: xgene: Fix regression in CRC stripping
  net: geneve: add missing netlink policy and size for IFLA_GENEVE_INNER_PROTO_INHERIT
  net: dsa: fix missing host-filtered multicast addresses
  net/mlx5e: Fix build warning, detected write beyond size of field
  iwlwifi: mvm: Don't fail if PPAG isn't supported
  selftests/bpf: Fix kprobe_multi test.
  Revert "rethook: x86: Add rethook x86 implementation"
  Revert "arm64: rethook: Add arm64 rethook implementation"
  Revert "powerpc: Add rethook support"
  Revert "ARM: rethook: Add rethook arm implementation"
  netdevice: add missing dm_private kdoc
  net: bridge: mst: prevent NULL deref in br_mst_info_size()
  selftests: forwarding: Use same VRF for port and VLAN upper
  ...
2022-03-24 13:13:26 -07:00
Linus Torvalds
194dfe88d6 asm-generic updates for 5.18
There are three sets of updates for 5.18 in the asm-generic tree:
 
  - The set_fs()/get_fs() infrastructure gets removed for good. This
    was already gone from all major architectures, but now we can
    finally remove it everywhere, which loses some particularly
    tricky and error-prone code.
    There is a small merge conflict against a parisc cleanup, the
    solution is to use their new version.
 
  - The nds32 architecture ends its tenure in the Linux kernel. The
    hardware is still used and the code is in reasonable shape, but
    the mainline port is not actively maintained any more, as all
    remaining users are thought to run vendor kernels that would never
    be updated to a future release.
    There are some obvious conflicts against changes to the removed
    files.
 
  - A series from Masahiro Yamada cleans up some of the uapi header
    files to pass the compile-time checks.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmI69BsACgkQmmx57+YA
 GNn/zA//f4d5VTT0ThhRxRWTu9BdThGHoB8TUcY7iOhbsWu0X/913NItRC3UeWNl
 IdmisaXgVtirg1dcC2pWUmrcHdoWOCEGfK4+Zr2NhSWfuZDWvODHK9pGWk4WLnhe
 cQgUNBvIuuAMryGtrOBwHPO4TpfCyy2ioeVP36ZfcsWXdDxTrqfaq/56mk3sxIP6
 sUTk1UEjut9NG4C9xIIvcSU50R3l6LryQE/H9kyTLtaSvfvTOvprcVYCq0GPmSzo
 DtQ1Wwa9zbJ+4EqoMiP5RrgQwWvOTg2iRByLU8ytwlX3e/SEF0uihvMv1FQbL8zG
 G8RhGUOKQSEhaBfc3lIkm8GpOVPh0uHzB6zhn7daVmAWtazRD2Nu59BMjipa+ims
 a8Z58iHH7jRAnKeEkVZqXKb1CEiUxaQx/IeVPzN4QlwMhDtwrI76LY7ZJ1zCqTGY
 ENG0yRLav1XselYBslOYXGtOEWcY5EZPWqLyWbp4P9vz2g0Fe0gZxoIOvPmNQc89
 QnfXpCt7vm/DGkyO255myu08GOLeMkisVqUIzLDB9avlym5mri7T7vk9abBa2YyO
 CRpTL5gl1/qKPWuH1UI5mvhT+sbbBE2SUHSuy84btns39ZKKKynwCtdu+hSQkKLE
 h9pV30Gf1cLTD4JAE0RWlUgOmbBLVp34loTOexQj4MrLM1noOnw=
 =vtCN
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic updates from Arnd Bergmann:
 "There are three sets of updates for 5.18 in the asm-generic tree:

   - The set_fs()/get_fs() infrastructure gets removed for good.

     This was already gone from all major architectures, but now we can
     finally remove it everywhere, which loses some particularly tricky
     and error-prone code. There is a small merge conflict against a
     parisc cleanup, the solution is to use their new version.

   - The nds32 architecture ends its tenure in the Linux kernel.

     The hardware is still used and the code is in reasonable shape, but
     the mainline port is not actively maintained any more, as all
     remaining users are thought to run vendor kernels that would never
     be updated to a future release.

   - A series from Masahiro Yamada cleans up some of the uapi header
     files to pass the compile-time checks"

* tag 'asm-generic-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (27 commits)
  nds32: Remove the architecture
  uaccess: remove CONFIG_SET_FS
  ia64: remove CONFIG_SET_FS support
  sh: remove CONFIG_SET_FS support
  sparc64: remove CONFIG_SET_FS support
  lib/test_lockup: fix kernel pointer check for separate address spaces
  uaccess: generalize access_ok()
  uaccess: fix type mismatch warnings from access_ok()
  arm64: simplify access_ok()
  m68k: fix access_ok for coldfire
  MIPS: use simpler access_ok()
  MIPS: Handle address errors for accesses above CPU max virtual user address
  uaccess: add generic __{get,put}_kernel_nofault
  nios2: drop access_ok() check from __put_user()
  x86: use more conventional access_ok() definition
  x86: remove __range_not_ok()
  sparc64: add __{get,put}_kernel_nofault()
  nds32: fix access_ok() checks in get/put_user
  uaccess: fix nios2 and microblaze get_user_8()
  sparc64: fix building assembly files
  ...
2022-03-23 18:03:08 -07:00
Linus Torvalds
9c4b86ebf5 fbdev fixes and updates for kernel v5.18-rc1
Lots of small fixes and code cleanups across most of the fbdev drivers.
 
 This includes conversions to use helper functions, const conversions, spelling
 fixes, help text updates, adding return value checks, small build fixes, and
 much more.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCYjogFQAKCRD3ErUQojoP
 X3Y7AQD/0Qd0zm6klv4EPeyLXOYzs6uXdyHiJGyCBABP3WxKZQD+J4yNXjd1g50A
 iGbsawaUpFMcaXTETr0NcrtRkc5jCgY=
 =HHa3
 -----END PGP SIGNATURE-----

Merge tag 'for-5.18/fbdev-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev

Pull fbdev updates from Helge Deller:
 "Lots of small fixes and code cleanups across most of the fbdev
  drivers.

  This includes conversions to use helper functions, const conversions,
  spelling fixes, help text updates, adding return value checks, small
  build fixes, and much more"

* tag 'for-5.18/fbdev-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev: (59 commits)
  video: fbdev: kyro: make read-only array ODValues static const
  video: fbdev: offb: fix warning comparing pointer to 0
  video: fbdev: omapfb: Add missing of_node_put() in dvic_probe_of
  video: fbdev: sm712fb: Fix crash in smtcfb_write()
  video: fbdev: s3c-fb: fix platform_get_irq.cocci warning
  video: fbdev: sm712fb: Fix crash in smtcfb_read()
  video: fbdev: via: check the return value of kstrdup()
  video: fbdev: au1100fb: Spelling s/palette/palette/
  video: fbdev: atari: Atari 2 bpp (STe) palette bugfix
  video: fbdev: atari: Remove unused atafb_setcolreg()
  video: fbdev: atari: Convert to standard round_up() helper
  video: fbdev: atari: Fix TT High video mode
  video: fbdev: udlfb: replace snprintf in show functions with sysfs_emit
  video: fbdev: omapfb: panel-tpo-td043mtea1: Use sysfs_emit() instead of snprintf()
  video: fbdev: omapfb: panel-dsi-cm: Use sysfs_emit() instead of snprintf()
  video: fbdev: omapfb: Use sysfs_emit() instead of snprintf()
  video: fbdev: s3c-fb: Use platform_get_irq() to get the interrupt
  video: fbdev: Fix wrong file path for pvr2fb.c in Kconfig help text
  video: fbdev: pxa3xx-gcu: Remove unnecessary print function dev_err()
  video: fbdev: pxa168fb: Remove unnecessary print function dev_err()
  ...
2022-03-23 14:45:01 -07:00
Linus Torvalds
9030fb0bb9 Folio changes for 5.18
- Rewrite how munlock works to massively reduce the contention
    on i_mmap_rwsem (Hugh Dickins):
    https://lore.kernel.org/linux-mm/8e4356d-9622-a7f0-b2c-f116b5f2efea@google.com/
  - Sort out the page refcount mess for ZONE_DEVICE pages (Christoph Hellwig):
    https://lore.kernel.org/linux-mm/20220210072828.2930359-1-hch@lst.de/
  - Convert GUP to use folios and make pincount available for order-1
    pages. (Matthew Wilcox)
  - Convert a few more truncation functions to use folios (Matthew Wilcox)
  - Convert page_vma_mapped_walk to use PFNs instead of pages (Matthew Wilcox)
  - Convert rmap_walk to use folios (Matthew Wilcox)
  - Convert most of shrink_page_list() to use a folio (Matthew Wilcox)
  - Add support for creating large folios in readahead (Matthew Wilcox)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEejHryeLBw/spnjHrDpNsjXcpgj4FAmI4ucgACgkQDpNsjXcp
 gj69Wgf6AwqwmO5Tmy+fLScDPqWxmXJofbocae1kyoGHf7Ui91OK4U2j6IpvAr+g
 P/vLIK+JAAcTQcrSCjymuEkf4HkGZOR03QQn7maPIEe4eLrZRQDEsmHC1L9gpeJp
 s/GMvDWiGE0Tnxu0EOzfVi/yT+qjIl/S8VvqtCoJv1HdzxitZ7+1RDuqImaMC5MM
 Qi3uHag78vLmCltLXpIOdpgZhdZexCdL2Y/1npf+b6FVkAJRRNUnA0gRbS7YpoVp
 CbxEJcmAl9cpJLuj5i5kIfS9trr+/QcvbUlzRxh4ggC58iqnmF2V09l2MJ7YU3XL
 v1O/Elq4lRhXninZFQEm9zjrri7LDQ==
 =n9Ad
 -----END PGP SIGNATURE-----

Merge tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecache

Pull folio updates from Matthew Wilcox:

 - Rewrite how munlock works to massively reduce the contention on
   i_mmap_rwsem (Hugh Dickins):

     https://lore.kernel.org/linux-mm/8e4356d-9622-a7f0-b2c-f116b5f2efea@google.com/

 - Sort out the page refcount mess for ZONE_DEVICE pages (Christoph
   Hellwig):

     https://lore.kernel.org/linux-mm/20220210072828.2930359-1-hch@lst.de/

 - Convert GUP to use folios and make pincount available for order-1
   pages. (Matthew Wilcox)

 - Convert a few more truncation functions to use folios (Matthew
   Wilcox)

 - Convert page_vma_mapped_walk to use PFNs instead of pages (Matthew
   Wilcox)

 - Convert rmap_walk to use folios (Matthew Wilcox)

 - Convert most of shrink_page_list() to use a folio (Matthew Wilcox)

 - Add support for creating large folios in readahead (Matthew Wilcox)

* tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecache: (114 commits)
  mm/damon: minor cleanup for damon_pa_young
  selftests/vm/transhuge-stress: Support file-backed PMD folios
  mm/filemap: Support VM_HUGEPAGE for file mappings
  mm/readahead: Switch to page_cache_ra_order
  mm/readahead: Align file mappings for non-DAX
  mm/readahead: Add large folio readahead
  mm: Support arbitrary THP sizes
  mm: Make large folios depend on THP
  mm: Fix READ_ONLY_THP warning
  mm/filemap: Allow large folios to be added to the page cache
  mm: Turn can_split_huge_page() into can_split_folio()
  mm/vmscan: Convert pageout() to take a folio
  mm/vmscan: Turn page_check_references() into folio_check_references()
  mm/vmscan: Account large folios correctly
  mm/vmscan: Optimise shrink_page_list for non-PMD-sized folios
  mm/vmscan: Free non-shmem folios without splitting them
  mm/rmap: Constify the rmap_walk_control argument
  mm/rmap: Convert rmap_walk() to take a folio
  mm: Turn page_anon_vma() into folio_anon_vma()
  mm/rmap: Turn page_lock_anon_vma_read() into folio_lock_anon_vma_read()
  ...
2022-03-22 17:03:12 -07:00
Mike Rapoport
7106c51ee9 arch: Add pmd_pfn() where it is missing
We need to use this function in common code, so define it for
architectures and/or configrations that miss it.  The result of
pmd_pfn() will only be used if TRANSPARENT_HUGEPAGE is enabled,
but a function or macro called pmd_pfn() must be defined, even
on machines with two level page tables.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-03-21 12:59:02 -04:00
Helge Deller
9b046d0245 parisc: Avoid using hardware single-step in kprobes
This patch changes the kprobe and kretprobe feature to use another
break instruction instead of relying on the hardware single-step
feature.
That way those kprobes now work in qemu as well, because in qemu we
don't emulate yet single-stepping.

Signed-off-by: Helge Deller <deller@gmx.de>
2022-03-18 09:10:14 +01:00
Helge Deller
0a57549769 parisc: Avoid calling SMP cache flush functions on cache-less machines
At least the qemu virtual machine does not provide D- and I-caches,
so skip triggering SMP irqs to flush caches on such machines.

Further optimize the caching code by using static branches and making
some functions static.

Signed-off-by: Helge Deller <deller@gmx.de>
2022-03-16 22:52:12 +01:00
Helge Deller
8278cc1626 parisc: Reduce code size by optimizing get_current() function calls
The get_current() code uses the mfctl() macro to get the pointer to the
current task struct from %cr30. The problem with the mfctl() macro is,
that it is marked volatile which is basically correct, because mfctl()
is used to get e.g. the current internal timer or interrupt flags as
well.

But specifically the task struct pointer (%cr30) doesn't change over
time when the kernel executes code for a task.

So, by dropping the volatile when retrieving %cr30 the compiler is now
able to get this value only once and optimize the generated code a lot.

A bloat-o-meter comparism shows that this patch saves ~5kB kernel code
on a 32-bit kernel and ~6kB kernel code on a 64-bit kernel.

Signed-off-by: Helge Deller <deller@gmx.de>
2022-03-11 19:49:31 +01:00
Helge Deller
360bd6c658 parisc: Use constants to encode the space registers like SR_KERNEL
Use the provided space register constants instead of hardcoded values.

Signed-off-by: Helge Deller <deller@gmx.de>
2022-03-11 19:49:31 +01:00
Helge Deller
5613a93085 parisc: Use SR_USER and SR_KERNEL in get_user() and put_user()
Instead of hardcoding the space registers as strings, use the SR_USER
and SR_KERNEL constants to form the space register in the access
functions.

Signed-off-by: Helge Deller <deller@gmx.de>
2022-03-11 19:49:30 +01:00
Helge Deller
46b4016f7d parisc: Add defines for various space register
Provide defines for space registers (SR_KERNEL, SR_USER, ...) which
should be used instead of hardcoding the values.

Signed-off-by: Helge Deller <deller@gmx.de>
2022-03-11 19:49:30 +01:00
Helge Deller
df24e1783e parisc: Add vDSO support
Add minimal vDSO support, which provides the signal trampoline helpers,
but none of the userspace syscall helpers like time wrappers.

The big benefit of this vDSO implementation is, that we now don't need
an executeable stack any longer. PA-RISC is one of the last
architectures where an executeable stack was needed in oder to implement
the signal trampolines by putting assembly instructions on the stack
which then gets executed. Instead the kernel will provide the relevant
code in the vDSO page and only put the pointers to the signal
information on the stack.

By dropping the need for executable stacks we avoid running into issues
with applications which want non executable stacks for security reasons.
Additionally, alternative stacks on memory areas without exec
permissions are supported too.

This code is based on an initial implementation by Randolph Chung from 2006:
https://lore.kernel.org/linux-parisc/4544A34A.6080700@tausq.org/

I did the porting and lifted the code to current code base. Dave fixed
the unwind code so that gdb and glibc are able to backtrace through the
code. An additional patch to gdb will be pushed upstream by Dave.

Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Dave Anglin <dave.anglin@bell.net>
Cc: Randolph Chung <randolph@tausq.org>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-03-11 19:49:30 +01:00
John David Anglin
e00b0a2ab8 parisc: Fix handling off probe non-access faults
Currently, the parisc kernel does not fully support non-access TLB
fault handling for probe instructions. In the fast path, we set the
target register to zero if it is not a shadowed register. The slow
path is not implemented, so we call do_page_fault. The architecture
indicates that non-access faults should not cause a page fault from
disk.

This change adds to code to provide non-access fault support for
probe instructions. It also modifies the handling of faults on
userspace so that if the address lies in a valid VMA and the access
type matches that for the VMA, the probe target register is set to
one. Otherwise, the target register is set to zero.

This was done to make probe instructions more useful for userspace.
Probe instructions are not very useful if they set the target register
to zero whenever a page is not present in memory. Nominally, the
purpose of the probe instruction is determine whether read or write
access to a given address is allowed.

This fixes a problem in function pointer comparison noticed in the
glibc testsuite (stdio-common/tst-vfprintf-user-type). The same
problem is likely in glibc (_dl_lookup_address).

V2 adds flush and lpa instruction support to handle_nadtlb_fault.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-03-11 19:49:30 +01:00
Arnd Bergmann
dd865f090f
Merge branch 'set_fs-4' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic into asm-generic
Christoph Hellwig and a few others spent a huge effort on removing
set_fs() from most of the important architectures, but about half the
other architectures were never completed even though most of them don't
actually use set_fs() at all.

I did a patch for microblaze at some point, which turned out to be fairly
generic, and now ported it to most other architectures, using new generic
implementations of access_ok() and __{get,put}_kernel_nocheck().

Three architectures (sparc64, ia64, and sh) needed some extra work,
which I also completed.

* 'set_fs-4' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  uaccess: remove CONFIG_SET_FS
  ia64: remove CONFIG_SET_FS support
  sh: remove CONFIG_SET_FS support
  sparc64: remove CONFIG_SET_FS support
  lib/test_lockup: fix kernel pointer check for separate address spaces
  uaccess: generalize access_ok()
  uaccess: fix type mismatch warnings from access_ok()
  arm64: simplify access_ok()
  m68k: fix access_ok for coldfire
  MIPS: use simpler access_ok()
  MIPS: Handle address errors for accesses above CPU max virtual user address
  uaccess: add generic __{get,put}_kernel_nofault
  nios2: drop access_ok() check from __put_user()
  x86: use more conventional access_ok() definition
  x86: remove __range_not_ok()
  sparc64: add __{get,put}_kernel_nofault()
  nds32: fix access_ok() checks in get/put_user
  uaccess: fix nios2 and microblaze get_user_8()
  uaccess: fix integer overflow on access_ok()
2022-02-25 11:16:58 +01:00
Arnd Bergmann
967747bbc0 uaccess: remove CONFIG_SET_FS
There are no remaining callers of set_fs(), so CONFIG_SET_FS
can be removed globally, along with the thread_info field and
any references to it.

This turns access_ok() into a cheaper check against TASK_SIZE_MAX.

As CONFIG_SET_FS is now gone, drop all remaining references to
set_fs()/get_fs(), mm_segment_t, user_addr_max() and uaccess_kernel().

Acked-by: Sam Ravnborg <sam@ravnborg.org> # for sparc32 changes
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Tested-by: Sergey Matyukevich <sergey.matyukevich@synopsys.com> # for arc changes
Acked-by: Stafford Horne <shorne@gmail.com> # [openrisc, asm-generic]
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-25 09:36:06 +01:00
Arnd Bergmann
12700c17fc uaccess: generalize access_ok()
There are many different ways that access_ok() is defined across
architectures, but in the end, they all just compare against the
user_addr_max() value or they accept anything.

Provide one definition that works for most architectures, checking
against TASK_SIZE_MAX for user processes or skipping the check inside
of uaccess_kernel() sections.

For architectures without CONFIG_SET_FS(), this should be the fastest
check, as it comes down to a single comparison of a pointer against a
compile-time constant, while the architecture specific versions tend to
do something more complex for historic reasons or get something wrong.

Type checking for __user annotations is handled inconsistently across
architectures, but this is easily simplified as well by using an inline
function that takes a 'const void __user *' argument. A handful of
callers need an extra __user annotation for this.

Some architectures had trick to use 33-bit or 65-bit arithmetic on the
addresses to calculate the overflow, however this simpler version uses
fewer registers, which means it can produce better object code in the
end despite needing a second (statically predicted) branch.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Mark Rutland <mark.rutland@arm.com> [arm64, asm-generic]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Stafford Horne <shorne@gmail.com>
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-25 09:36:05 +01:00
Arnd Bergmann
34737e2698 uaccess: add generic __{get,put}_kernel_nofault
Nine architectures are still missing __{get,put}_kernel_nofault:
alpha, ia64, microblaze, nds32, nios2, openrisc, sh, sparc32, xtensa.

Add a generic version that lets everything use the normal
copy_{from,to}_kernel_nofault() code based on these, removing the last
use of get_fs()/set_fs() from architecture-independent code.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-25 09:36:05 +01:00
Jakub Kicinski
6b5567b1b2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17 11:44:20 -08:00
Masahiro Yamada
4a3233c1a6
shmbuf.h: add asm/shmbuf.h to UAPI compile-test coverage
asm/shmbuf.h is currently excluded from the UAPI compile-test because of
the errors like follows:

    HDRTEST usr/include/asm/shmbuf.h
  In file included from ./usr/include/asm/shmbuf.h:6,
                   from <command-line>:
  ./usr/include/asm-generic/shmbuf.h:26:33: error: field ‘shm_perm’ has incomplete type
     26 |         struct ipc64_perm       shm_perm;       /* operation perms */
        |                                 ^~~~~~~~
  ./usr/include/asm-generic/shmbuf.h:27:9: error: unknown type name ‘size_t’
     27 |         size_t                  shm_segsz;      /* size of segment (bytes) */
        |         ^~~~~~
  ./usr/include/asm-generic/shmbuf.h:40:9: error: unknown type name ‘__kernel_pid_t’
     40 |         __kernel_pid_t          shm_cpid;       /* pid of creator */
        |         ^~~~~~~~~~~~~~
  ./usr/include/asm-generic/shmbuf.h:41:9: error: unknown type name ‘__kernel_pid_t’
     41 |         __kernel_pid_t          shm_lpid;       /* pid of last operator */
        |         ^~~~~~~~~~~~~~

The errors can be fixed by replacing size_t with __kernel_size_t and by
including proper headers.

Then, remove the no-header-test entry from user/include/Makefile.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-17 09:09:37 +01:00
Masahiro Yamada
72113d0a7d
signal.h: add linux/signal.h and asm/signal.h to UAPI compile-test coverage
linux/signal.h and asm/signal.h are currently excluded from the UAPI
compile-test because of the errors like follows:

    HDRTEST usr/include/asm/signal.h
  In file included from <command-line>:
  ./usr/include/asm/signal.h:103:9: error: unknown type name ‘size_t’
    103 |         size_t ss_size;
        |         ^~~~~~

The errors can be fixed by replacing size_t with __kernel_size_t.

Then, remove the no-header-test entries from user/include/Makefile.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-17 09:09:36 +01:00
Christophe Leroy
e1478d8eaf asm-generic: Refactor dereference_[kernel]_function_descriptor()
dereference_function_descriptor() and
dereference_kernel_function_descriptor() are identical on the
three architectures implementing them.

Make them common and put them out-of-line in kernel/extable.c
which is one of the users and has similar type of functions.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/449db09b2eba57f4ab05f80102a67d8675bc8bcd.1644928018.git.christophe.leroy@csgroup.eu
2022-02-16 23:25:11 +11:00
Christophe Leroy
0dc690e4ef asm-generic: Define 'func_desc_t' to commonly describe function descriptors
We have three architectures using function descriptors, each with its
own type and name.

Add a common typedef that can be used in generic code.

Also add a stub typedef for architecture without function descriptors,
to avoid a forest of #ifdefs.

It replaces the similar 'func_desc_t' previously defined in
arch/powerpc/kernel/module_64.c

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f1f91b142b3c1082bdc1586ce71c9bac1e75213c.1644928018.git.christophe.leroy@csgroup.eu
2022-02-16 23:25:11 +11:00
Christophe Leroy
a257cacc38 asm-generic: Define CONFIG_HAVE_FUNCTION_DESCRIPTORS
Replace HAVE_DEREFERENCE_FUNCTION_DESCRIPTOR by a config option
named CONFIG_HAVE_FUNCTION_DESCRIPTORS and use it instead of
'dereference_function_descriptor' macro to know whether an
arch has function descriptors.

To limit churn in one of the following patches, use
an #ifdef/#else construct with empty first part
instead of an #ifndef in asm-generic/sections.h

On powerpc, make sure the config option matches the ABI used
by the compiler with a BUILD_BUG_ON() and add missing _CALL_ELF=2
when calling 'sparse' so that sparse sees the same piece of
code as GCC.

And include a helper to check whether an arch has function
descriptors or not : have_function_descriptors()

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/4a0f11fb0ea74a3197bc44dd7ba25e53a24fd03d.1644928018.git.christophe.leroy@csgroup.eu
2022-02-16 23:25:11 +11:00
Helge Deller
dbd0b42350 parisc: Fix some apparent put_user() failures
After commit 4b9d2a731c ("parisc: Switch user access functions
to signal errors in r29 instead of r8") bash suddenly started
to report those warnings after login:

-bash: cannot set terminal process group (-1): Bad file descriptor
-bash: no job control in this shell

It turned out, that a function call inside a put_user(), e.g.:
put_user(vt_do_kdgkbmode(console), (int __user *)arg);
clobbered the error register (r29) and thus the put_user() call itself
seem to have failed.

Rearrange the C-code to pre-calculate the intermediate value
and then do the put_user().
Additionally prefer the "+" constraint on pu_err and gu_err registers
to tell the compiler that those operands are both read and written by
the assembly instruction.

Reported-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: 4b9d2a731c ("parisc: Switch user access functions to signal errors in r29 instead of r8")
Signed-off-by: Helge Deller <deller@gmx.de>
2022-02-14 12:36:42 +01:00
Helge Deller
b160628e9e parisc: Show error if wrong 32/64-bit compiler is being used
It happens quite often that people use the wrong compiler to build the
kernel:

make ARCH=parisc   -> builds the 32-bit kernel
make ARCH=parisc64 -> builds the 64-bit kernel

This patch adds a sanity check which errors out with an instruction how
use the correct ARCH= option.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v5.15+
2022-02-13 22:45:53 +01:00
Akhmat Karakotov
26859240e4 txhash: Add socket option to control TX hash rethink behavior
Add the SO_TXREHASH socket option to control hash rethink behavior per socket.
When default mode is set, sockets disable rehash at initialization and use
sysctl option when entering listen state. setsockopt() overrides default
behavior.

Signed-off-by: Akhmat Karakotov <hmukos@yandex-team.ru>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-31 15:05:25 +00:00
Arnd Bergmann
3e17314c22 agp: define proper stubs for empty helpers
The empty unmap_page_from_agp() macro causes a warning when
building with 'make W=1' on a couple of architectures:

drivers/char/agp/generic.c: In function 'agp_generic_destroy_page':
drivers/char/agp/generic.c:1265:28: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body]
 1265 |   unmap_page_from_agp(page);

Change the definitions to a 'do { } while (0)' construct to
make these more reliable.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Helge Deller <deller@gmx.de>
2022-01-29 22:24:25 +01:00
Linus Torvalds
3689f9f8b0 bitmap patches for 5.17-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQHJBAABCgAzFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmHi+xgVHHl1cnkubm9y
 b3ZAZ21haWwuY29tAAoJELFEgP06H77IxdoMAMf3E+L51Ys/4iAiyJQNVoT3aIBC
 A8ZVOB9he1OA3o3wBNIRKmICHk+ovnfCWcXTr9fG/Ade2wJz88NAsGPQ1Phywb+s
 iGlpySllFN72RT9ZqtJhLEzgoHHOL0CzTW07TN9GJy4gQA2h2G9CTP+OmsQdnVqE
 m9Fn3PSlJ5lhzePlKfnln8rGZFgrriJakfEFPC79n/7an4+2Hvkb5rWigo7KQc4Z
 9YNqYUcHWZFUgq80adxEb9LlbMXdD+Z/8fCjOrAatuwVkD4RDt6iKD0mFGjHXGL7
 MZ9KRS8AfZXawmetk3jjtsV+/QkeS+Deuu7k0FoO0Th2QV7BGSDhsLXAS5By/MOC
 nfSyHhnXHzCsBMyVNrJHmNhEZoN29+tRwI84JX9lWcf/OLANcCofnP6f2UIX7tZY
 CAZAgVELp+0YQXdybrfzTQ8BT3TinjS/aZtCrYijRendI1GwUXcyl69vdOKqAHuk
 5jy8k/xHyp+ZWu6v+PyAAAEGowY++qhL0fmszA==
 =RKW4
 -----END PGP SIGNATURE-----

Merge tag 'bitmap-5.17-rc1' of git://github.com/norov/linux

Pull bitmap updates from Yury Norov:

 - introduce for_each_set_bitrange()

 - use find_first_*_bit() instead of find_next_*_bit() where possible

 - unify for_each_bit() macros

* tag 'bitmap-5.17-rc1' of git://github.com/norov/linux:
  vsprintf: rework bitmap_list_string
  lib: bitmap: add performance test for bitmap_print_to_pagebuf
  bitmap: unify find_bit operations
  mm/percpu: micro-optimize pcpu_is_populated()
  Replace for_each_*_bit_from() with for_each_*_bit() where appropriate
  find: micro-optimize for_each_{set,clear}_bit()
  include/linux: move for_each_bit() macros from bitops.h to find.h
  cpumask: replace cpumask_next_* with cpumask_first_* where appropriate
  tools: sync tools/bitmap with mother linux
  all: replace find_next{,_zero}_bit with find_first{,_zero}_bit where appropriate
  cpumask: use find_first_and_bit()
  lib: add find_first_and_bit()
  arch: remove GENERIC_FIND_FIRST_BIT entirely
  include: move find.h from asm_generic to linux
  bitops: move find_bit_*_le functions from le.h to find.h
  bitops: protect find_first_{,zero}_bit properly
2022-01-23 06:20:44 +02:00
Helge Deller
30f3089994 parisc: Fix missing prototype for 'toc_intr' warning in toc.c
Fix a missing prototype warning noticed by the kernel test robot.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-01-20 20:39:19 +01:00
Yury Norov
47d8c15615 include: move find.h from asm_generic to linux
find_bit API and bitmap API are closely related, but inclusion paths
are different - include/asm-generic and include/linux, correspondingly.
In the past it made a lot of troubles due to circular dependencies
and/or undefined symbols. Fix this by moving find.h under include/linux.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
2022-01-15 08:47:31 -08:00
Bart Van Assche
6968e707d3 parisc: io: Improve the outb(), outw() and outl() macros
This patch fixes the following build error for source file
drivers/scsi/pcmcia/sym53c500_cs.c:

In file included from ./include/linux/bug.h:5,
                 from ./include/linux/cpumask.h:14,
                 from ./include/linux/mm_types_task.h:14,
                 from ./include/linux/mm_types.h:5,
                 from ./include/linux/buildid.h:5,
                 from ./include/linux/module.h:14,
                 from drivers/scsi/pcmcia/sym53c500_cs.c:42:
drivers/scsi/pcmcia/sym53c500_cs.c: In function ‘SYM53C500_intr’:
./arch/parisc/include/asm/bug.h:28:2: error: expected expression before ‘do’
   28 |  do {        \
      |  ^~
./arch/parisc/include/asm/io.h:276:20: note: in expansion of macro ‘BUG’
  276 | #define outb(x, y) BUG()
      |                    ^~~
drivers/scsi/pcmcia/sym53c500_cs.c:124:19: note: in expansion of macro ‘outb’
  124 | #define REG0(x)  (outb(C4_IMG, (x) + CONFIG4))
      |                   ^~~~
drivers/scsi/pcmcia/sym53c500_cs.c:362:2: note: in expansion of macro ‘REG0’
  362 |  REG0(port_base);
      |  ^~~~

Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: linux-parisc@vger.kernel.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-01-07 01:29:22 +01:00
Helge Deller
712a270d2d parisc: Fix pdc_toc_pim_11 and pdc_toc_pim_20 definitions
The definitions for pdc_toc_pim_11 and pdc_toc_pim_20 are wrong since they
include an entry for a hversion field which doesn't exist in the specification.

Fix this and clean up some whitespaces so that the whole file will be in
sync with it's copy in the SeaBIOS-hppa sources.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v5.16
2022-01-07 01:29:22 +01:00
John David Anglin
d0585d742f parisc: Rewrite light-weight syscall and futex code
The parisc architecture lacks general hardware support for compare and swap.
Particularly for userspace, it is difficult to implement software atomic
support. Page faults in critical regions can cause processes to sleep and
block the forward progress of other processes.  Thus, it is essential that
page faults be disabled in critical regions. For performance reasons, we
also need to disable external interrupts in critical regions.

In order to do this, we need a mechanism to trigger COW breaks outside the
critical region. Fortunately, parisc has the "stbys,e" instruction. When
the leftmost byte of a word is addressed, this instruction triggers all
the exceptions of a normal store but it does not write to memory. Thus,
we can use it to trigger COW breaks outside the critical region without
modifying the data that is to be updated atomically.

COW breaks occur randomly.  So even if we have priviously executed a "stbys,e"
instruction, we still need to disable pagefaults around the critical region.
If a fault occurs in the critical region, we return -EAGAIN. I had to add
a wrapper around _arch_futex_atomic_op_inuser() as I found in testing that
returning -EAGAIN caused problems for some processes even though it is
listed as a possible return value.

The patch implements the above. The code no longer attempts to sleep with
interrupts disabled and I haven't seen any stalls with the change.

I have attempted to merge common code and streamline the fast path.  In the
futex code, we only compute the spinlock address once.

I eliminated some debug code in the original CAS routine that just made the
flow more complicated.

I don't clip the arguments when called from wide mode. As a result, the LWS
routines should work when called from 64-bit processes.

I defined TASK_PAGEFAULT_DISABLED offset for use in the lws_pagefault_disable
and lws_pagefault_enable macros.

Since we now disable interrupts on the gateway page where necessary, it
might be possible to allow processes to be scheduled when they are on the
gateway page.

Change has been tested on c8000 and rp3440. It improves glibc build and test
time by about 10%.

In v2, I removed the lws_atomic_xchg and and lws_atomic_store calls. I
also removed the bug fixes that were not directly related to this patch.

In v3, I removed the code to force interruptions from
arch_futex_atomic_op_inuser(). It is always called with page faults
disabled, so this code had no effect.

In v4, I fixed a typo in depi_safe line.

In v5, I moved the code to disable/enable page faults inside the spinlocks.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-01-07 01:29:21 +01:00
Helge Deller
4b9d2a731c parisc: Switch user access functions to signal errors in r29 instead of r8
Use register r29 instead of register r8 to signal faults when accessing
user memory. In case of faults, the fixup routine will store -EFAULT in
this register.

This change saves up to 752 bytes on a 32bit kernel, partly because the
compiler doesn't need to save and restore the old r8 value on the stack.

bloat-o-meter results for usage with r29 register:
add/remove: 0/0 grow/shrink: 23/86 up/down: 228/-980 (-752)

bloat-o-meter results for usage with r28 register:
add/remove: 0/0 grow/shrink: 28/83 up/down: 296/-956 (-660)

Signed-off-by: Helge Deller <deller@gmx.de>
2022-01-07 01:29:21 +01:00
John David Anglin
db19c6f1a2 parisc: Fix lpa and lpa_user defines
While working on the rewrite to the light-weight syscall and futex code, I
experimented with using a hash index based on the user physical address of
atomic variable. This exposed two problems with the lpa and lpa_user defines.

Because of the copy instruction, the pa argument needs to be an early clobber
argument. This prevents gcc from allocating the va and pa arguments to the same
register.

Secondly, the lpa instruction can cause a page fault so we need to catch
exceptions.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Fixes: 116d753308 ("parisc: Use lpa instruction to load physical addresses in driver code")
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v5.2+
2022-01-07 01:29:20 +01:00
John David Anglin
45458aa49a parisc: Define depi_safe macro
The depi instruction is similar to the extru instruction on 64-bit machines.
It leaves the most-significant 32 bits of the target register in an undefined
state.  On 64-bit machines, the macro uses depdi to perform safe deposits in
the least-significant 32 bits.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-01-07 01:29:20 +01:00
John David Anglin
d3a5a68cff parisc: Fix mask used to select futex spinlock
The address bits used to select the futex spinlock need to match those used in
the LWS code in syscall.S. The mask 0x3f8 only selects 7 bits.  It should
select 8 bits.

This change fixes the glibc nptl/tst-cond24 and nptl/tst-cond25 tests.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Fixes: 53a42b6324 ("parisc: Switch to more fine grained lws locks")
Cc: stable@vger.kernel.org # 5.10+
Signed-off-by: Helge Deller <deller@gmx.de>
2021-12-21 21:15:59 +01:00
Linus Torvalds
79941493ff Fixes for 5.16 folios:
- Fix compilation warnings on csky and sparc
  - Rename multipage folios to large folios
  - Rename AS_THP_SUPPORT and FS_THP_SUPPORT
  - Add functions to zero portions of a folio
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEejHryeLBw/spnjHrDpNsjXcpgj4FAmGem6wACgkQDpNsjXcp
 gj7uvwgAjNqDWOVgwYU98daN6nKQQf5Vv35f0bzeKcKcHIOEWZ2+MUeXkI55h8TD
 ss5L3O86sPtQmpKUQJJChZC4AhpIPRyjPA0JW6vYqXQd912M331WpGgFFyX5eI+3
 OxfKLRULmopeWP1RjWmkWqlhYQHL5OLgAMC4VaBSfDHd1UMRf+F9JNm9qR7GCp9Q
 Vb0qcmBMaQYt/K5sWRQyPUACVTF+27RLKAs+Om37NGekv1UqgOPMzi9nAyi9RjCi
 rRY6oGupNgC+Y41jzlpaNoL71RPS92H769FBh/Fe4qu55VSPjfcN77qAnVhX5Ykn
 4RhzZcEUoqlx9xG9xynk0mmbx2Bf4g==
 =kvqM
 -----END PGP SIGNATURE-----

Merge tag 'folio-5.16b' of git://git.infradead.org/users/willy/pagecache

Pull folio fixes from Matthew Wilcox:
 "In the course of preparing the folio changes for iomap for next merge
  window, we discovered some problems that would be nice to address now:

   - Renaming multi-page folios to large folios.

     mapping_multi_page_folio_support() is just a little too long, so we
     settled on mapping_large_folio_support(). That meant renaming, eg
     folio_test_multi() to folio_test_large().

     Rename AS_THP_SUPPORT to match

   - I hadn't included folio wrappers for zero_user_segments(), etc.
     Also, multi-page^W^W large folio support is now independent of
     CONFIG_TRANSPARENT_HUGEPAGE, so machines with HIGHMEM always need
     to fall back to the out-of-line zero_user_segments().

     Remove FS_THP_SUPPORT to match

   - The build bots finally got round to telling me that I missed a
     couple of architectures when adding flush_dcache_folio(). Christoph
     suggested that we just add linux/cacheflush.h and not rely on
     asm-generic/cacheflush.h"

* tag 'folio-5.16b' of git://git.infradead.org/users/willy/pagecache:
  mm: Add functions to zero portions of a folio
  fs: Rename AS_THP_SUPPORT and mapping_thp_support
  fs: Remove FS_THP_SUPPORT
  mm: Remove folio_test_single
  mm: Rename folio_test_multi to folio_test_large
  Add linux/cacheflush.h
2021-11-25 10:13:56 -08:00
Helge Deller
169d1a4a2a parisc: Provide an extru_safe() macro to extract unsigned bits
The extru instruction leaves the most significant 32 bits of the
target register in an undefined state on PA 2.0 systems.
Provide a macro to safely use extru on 32- and 64-bit machines.

Suggested-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2021-11-22 07:37:31 +01:00
Helge Deller
79df39d535 Revert "parisc: Reduce sigreturn trampoline to 3 instructions"
This reverts commit e4f2006f12.

This patch shows problems with signal handling. Revert it for now.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v5.15
2021-11-18 09:02:32 +01:00
Helge Deller
4017b230c9 parisc: Wrap assembler related defines inside __ASSEMBLY__
Building allmodconfig shows errors in the gpu/drm/msm snapdragon drivers,
because a COND() define is used there which conflicts with the COND() for
PA-RISC assembly.  Although the snapdragon driver isn't relevant for parisc, it
is nevertheless compiled when CONFIG_COMPILE_TEST is defined.

Move the COND() define and other PA-RISC mnemonics inside the #ifdef
__ASSEMBLY__ part to avoid this conflict.

Signed-off-by: Helge Deller <deller@gmx.de>
Reported-by: kernel test robot <lkp@intel.com>
2021-11-18 09:02:00 +01:00
Helge Deller
4d7804d201 parisc: Include stringify.h to avoid build error in crypto/api.c
Include stringify.h to avoid this build error:
 arch/parisc/include/asm/jump_label.h: error: expected ':' before '__stringify'
 arch/parisc/include/asm/jump_label.h: error: label 'l_yes' defined but not used [-Werror=unused-label]

Signed-off-by: Helge Deller <deller@gmx.de>
Reported-by: kernel test robot <lkp@intel.com>
2021-11-18 09:01:36 +01:00
Matthew Wilcox (Oracle)
522a0032af Add linux/cacheflush.h
Many architectures do not include asm-generic/cacheflush.h, so turn
the includes on their head and add linux/cacheflush.h which includes
asm/cacheflush.h.

Move the flush_dcache_folio() declaration from asm-generic/cacheflush.h
to linux/cacheflush.h and change linux/highmem.h to include
linux/cacheflush.h instead of asm/cacheflush.h so that all necessary
places will see flush_dcache_folio().

More functions should have their default implementations moved in the
future, but those are for follow-on patches.  This fixes csky, sparc and
sparc64 which were missed in the commit which added flush_dcache_folio().

Fixes: 08b0b0059b ("mm: Add flush_dcache_folio()")
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
2021-11-17 10:36:15 -05:00
John David Anglin
38860b2c8b parisc: Flush kernel data mapping in set_pte_at() when installing pte for user page
For years, there have been random segmentation faults in userspace on
SMP PA-RISC machines.  It occurred to me that this might be a problem in
set_pte_at().  MIPS and some other architectures do cache flushes when
installing PTEs with the present bit set.

Here I have adapted the code in update_mmu_cache() to flush the kernel
mapping when the kernel flush is deferred, or when the kernel mapping
may alias with the user mapping.  This simplifies calls to
update_mmu_cache().

I also changed the barrier in set_pte() from a compiler barrier to a
full memory barrier.  I know this change is not sufficient to fix the
problem.  It might not be needed.

I have had a few days of operation with 5.14.16 to 5.15.1 and haven't
seen any random segmentation faults on rp3440 or c8000 so far.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@kernel.org # 5.12+
2021-11-13 22:10:56 +01:00
Ard Biesheuvel
2a2e8202c7 parisc: move CPU field back into thread_info
In commit 2214c0e772 ("parisc: Move thread_info into task struct")
PA-RISC gained support for THREAD_INFO_IN_TASK while changes were
already underway to keep the CPU field in thread_info rather than move
it into task_struct when THREAD_INFO_IN_TASK is enabled. The result is a
broken build for all PA-RISC configs that enable SMP.

So let's partially revert that commit, and get rid of the ugly hack to
get at the offset of task_struct::cpu without having to include
linux/sched.h, and put the CPU field back where it was before.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Fixes: bcf9033e54 ("sched: move CPU field back into thread_info if THREAD_INFO_IN_TASK=y")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Helge Deller <deller@gmx.de>
2021-11-04 11:21:47 +01:00
Dave Anglin
7e992711dd parisc: Don't disable interrupts in cmpxchg and futex operations
I no longer think interrupts can be disabled in the futex and cmpxchg
operations because of COW breaks.  This not ideal but I suspect it's the
best we can do.

For the cmpxchg operations in syscall.S, we rely on the code to not
schedule off the gateway page.  For the futex, I added code to disable
preemption.

So far, I haven't seen the warnings with the attached change but the
change is only lightly tested.

Signed-off-by: Dave Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2021-11-04 11:21:20 +01:00
Linus Torvalds
fc02cb2b37 Core:
- Remove socket skb caches
 
  - Add a SO_RESERVE_MEM socket op to forward allocate buffer space
    and avoid memory accounting overhead on each message sent
 
  - Introduce managed neighbor entries - added by control plane and
    resolved by the kernel for use in acceleration paths (BPF / XDP
    right now, HW offload users will benefit as well)
 
  - Make neighbor eviction on link down controllable by userspace
    to work around WiFi networks with bad roaming implementations
 
  - vrf: Rework interaction with netfilter/conntrack
 
  - fq_codel: implement L4S style ce_threshold_ect1 marking
 
  - sch: Eliminate unnecessary RCU waits in mini_qdisc_pair_swap()
 
 BPF:
 
  - Add support for new btf kind BTF_KIND_TAG, arbitrary type tagging
    as implemented in LLVM14
 
  - Introduce bpf_get_branch_snapshot() to capture Last Branch Records
 
  - Implement variadic trace_printk helper
 
  - Add a new Bloomfilter map type
 
  - Track <8-byte scalar spill and refill
 
  - Access hw timestamp through BPF's __sk_buff
 
  - Disallow unprivileged BPF by default
 
  - Document BPF licensing
 
 Netfilter:
 
  - Introduce egress hook for looking at raw outgoing packets
 
  - Allow matching on and modifying inner headers / payload data
 
  - Add NFT_META_IFTYPE to match on the interface type either from
    ingress or egress
 
 Protocols:
 
  - Multi-Path TCP:
    - increase default max additional subflows to 2
    - rework forward memory allocation
    - add getsockopts: MPTCP_INFO, MPTCP_TCPINFO, MPTCP_SUBFLOW_ADDRS
 
  - MCTP flow support allowing lower layer drivers to configure msg
    muxing as needed
 
  - Automatic Multicast Tunneling (AMT) driver based on RFC7450
 
  - HSR support the redbox supervision frames (IEC-62439-3:2018)
 
  - Support for the ip6ip6 encapsulation of IOAM
 
  - Netlink interface for CAN-FD's Transmitter Delay Compensation
 
  - Support SMC-Rv2 eliminating the current same-subnet restriction,
    by exploiting the UDP encapsulation feature of RoCE adapters
 
  - TLS: add SM4 GCM/CCM crypto support
 
  - Bluetooth: initial support for link quality and audio/codec
    offload
 
 Driver APIs:
 
  - Add a batched interface for RX buffer allocation in AF_XDP
    buffer pool
 
  - ethtool: Add ability to control transceiver modules' power mode
 
  - phy: Introduce supported interfaces bitmap to express MAC
    capabilities and simplify PHY code
 
  - Drop rtnl_lock from DSA .port_fdb_{add,del} callbacks
 
 New drivers:
 
  - WiFi driver for Realtek 8852AE 802.11ax devices (rtw89)
 
  - Ethernet driver for ASIX AX88796C SPI device (x88796c)
 
 Drivers:
 
  - Broadcom PHYs
    - support 72165, 7712 16nm PHYs
    - support IDDQ-SR for additional power savings
 
  - PHY support for QCA8081, QCA9561 PHYs
 
  - NXP DPAA2: support for IRQ coalescing
 
  - NXP Ethernet (enetc): support for software TCP segmentation
 
  - Renesas Ethernet (ravb) - support DMAC and EMAC blocks of
    Gigabit-capable IP found on RZ/G2L SoC
 
  - Intel 100G Ethernet
    - support for eswitch offload of TC/OvS flow API, including
      offload of GRE, VxLAN, Geneve tunneling
    - support application device queues - ability to assign Rx and Tx
      queues to application threads
    - PTP and PPS (pulse-per-second) extensions
 
  - Broadcom Ethernet (bnxt)
    - devlink health reporting and device reload extensions
 
  - Mellanox Ethernet (mlx5)
    - offload macvlan interfaces
    - support HW offload of TC rules involving OVS internal ports
    - support HW-GRO and header/data split
    - support application device queues
 
  - Marvell OcteonTx2:
    - add XDP support for PF
    - add PTP support for VF
 
  - Qualcomm Ethernet switch (qca8k): support for QCA8328
 
  - Realtek Ethernet DSA switch (rtl8366rb)
    - support bridge offload
    - support STP, fast aging, disabling address learning
    - support for Realtek RTL8365MB-VC, a 4+1 port 10M/100M/1GE switch
 
  - Mellanox Ethernet/IB switch (mlxsw)
    - multi-level qdisc hierarchy offload (e.g. RED, prio and shaping)
    - offload root TBF qdisc as port shaper
    - support multiple routing interface MAC address prefixes
    - support for IP-in-IP with IPv6 underlay
 
  - MediaTek WiFi (mt76)
    - mt7921 - ASPM, 6GHz, SDIO and testmode support
    - mt7915 - LED and TWT support
 
  - Qualcomm WiFi (ath11k)
    - include channel rx and tx time in survey dump statistics
    - support for 80P80 and 160 MHz bandwidths
    - support channel 2 in 6 GHz band
    - spectral scan support for QCN9074
    - support for rx decapsulation offload (data frames in 802.3
      format)
 
  - Qualcomm phone SoC WiFi (wcn36xx)
    - enable Idle Mode Power Save (IMPS) to reduce power consumption
      during idle
 
  - Bluetooth driver support for MediaTek MT7922 and MT7921
 
  - Enable support for AOSP Bluetooth extension in Qualcomm WCN399x
    and Realtek 8822C/8852A
 
  - Microsoft vNIC driver (mana)
    - support hibernation and kexec
 
  - Google vNIC driver (gve)
    - support for jumbo frames
    - implement Rx page reuse
 
 Refactor:
 
  - Make all writes to netdev->dev_addr go thru helpers, so that we
    can add this address to the address rbtree and handle the updates
 
  - Various TCP cleanups and optimizations including improvements
    to CPU cache use
 
  - Simplify the gnet_stats, Qdisc stats' handling and remove
    qdisc->running sequence counter
 
  - Driver changes and API updates to address devlink locking
    deficiencies
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmGAzX4ACgkQMUZtbf5S
 IrvW3g//Q0ZLrOuHK9pZ8sCXMMhDj8qL6ajm0otMddHWA/+1UglwVBKFhsajfxOf
 wJ/5LZis+XKLpLqKTU5chKVfn39HuDGe/D3l+egi01Gv5BW0+XzEhagfyR5tJX5z
 wsGG5CXO/we/laVSzRiFtwwVEKHKN20YC+tIQwYOYP5Wy3q4G7qDsFhT7GqgsGCS
 n74QUEAIB5Tz0ODWFqLtbsySzIurXrskibwt5T9bvAAlPw/lCU68mmG+NVJ7VddO
 lBbNkLMOo8yW9Ci20H09SrYd4jZTmMARo9tsFO1tAvAMk7qpn0Wd8pnOYTjFFoMD
 +qjiFSVMh7E0JGb8Y7NCvwaB99suAK5rfGP68Xwe62DfP7vYWEx4pZGxBP19F4ld
 6Kn1ME33BX9rUF9tBecf0bdKfJUwB2Q2Xou/b9laG04bwiqsc9iG5FQq1C46lnLZ
 QdzNiS1My4dJMczkWt66HF3Kx30ibwHfvKMIHjf4PqkzEatkv6Y6SBZ57KXL+Lde
 0BQSFhbf0tm2Gf55etzrczLElI3uqHSFWUNZZ2Bt6WmzO1e6tpV9nAtRWF4C/dFg
 QDpLJtOOOY65uq+qz09zoPfv2lem868SrCAuFrVn99bEpYjx/CGNFDeEI02l6jyr
 84eUxd364UcbIk3fc+eTGdXHLQNVk30G0AHVBBxaWNIidwfqXeE=
 =srde
 -----END PGP SIGNATURE-----

Merge tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core:

   - Remove socket skb caches

   - Add a SO_RESERVE_MEM socket op to forward allocate buffer space and
     avoid memory accounting overhead on each message sent

   - Introduce managed neighbor entries - added by control plane and
     resolved by the kernel for use in acceleration paths (BPF / XDP
     right now, HW offload users will benefit as well)

   - Make neighbor eviction on link down controllable by userspace to
     work around WiFi networks with bad roaming implementations

   - vrf: Rework interaction with netfilter/conntrack

   - fq_codel: implement L4S style ce_threshold_ect1 marking

   - sch: Eliminate unnecessary RCU waits in mini_qdisc_pair_swap()

  BPF:

   - Add support for new btf kind BTF_KIND_TAG, arbitrary type tagging
     as implemented in LLVM14

   - Introduce bpf_get_branch_snapshot() to capture Last Branch Records

   - Implement variadic trace_printk helper

   - Add a new Bloomfilter map type

   - Track <8-byte scalar spill and refill

   - Access hw timestamp through BPF's __sk_buff

   - Disallow unprivileged BPF by default

   - Document BPF licensing

  Netfilter:

   - Introduce egress hook for looking at raw outgoing packets

   - Allow matching on and modifying inner headers / payload data

   - Add NFT_META_IFTYPE to match on the interface type either from
     ingress or egress

  Protocols:

   - Multi-Path TCP:
      - increase default max additional subflows to 2
      - rework forward memory allocation
      - add getsockopts: MPTCP_INFO, MPTCP_TCPINFO, MPTCP_SUBFLOW_ADDRS

   - MCTP flow support allowing lower layer drivers to configure msg
     muxing as needed

   - Automatic Multicast Tunneling (AMT) driver based on RFC7450

   - HSR support the redbox supervision frames (IEC-62439-3:2018)

   - Support for the ip6ip6 encapsulation of IOAM

   - Netlink interface for CAN-FD's Transmitter Delay Compensation

   - Support SMC-Rv2 eliminating the current same-subnet restriction, by
     exploiting the UDP encapsulation feature of RoCE adapters

   - TLS: add SM4 GCM/CCM crypto support

   - Bluetooth: initial support for link quality and audio/codec offload

  Driver APIs:

   - Add a batched interface for RX buffer allocation in AF_XDP buffer
     pool

   - ethtool: Add ability to control transceiver modules' power mode

   - phy: Introduce supported interfaces bitmap to express MAC
     capabilities and simplify PHY code

   - Drop rtnl_lock from DSA .port_fdb_{add,del} callbacks

  New drivers:

   - WiFi driver for Realtek 8852AE 802.11ax devices (rtw89)

   - Ethernet driver for ASIX AX88796C SPI device (x88796c)

  Drivers:

   - Broadcom PHYs
      - support 72165, 7712 16nm PHYs
      - support IDDQ-SR for additional power savings

   - PHY support for QCA8081, QCA9561 PHYs

   - NXP DPAA2: support for IRQ coalescing

   - NXP Ethernet (enetc): support for software TCP segmentation

   - Renesas Ethernet (ravb) - support DMAC and EMAC blocks of
     Gigabit-capable IP found on RZ/G2L SoC

   - Intel 100G Ethernet
      - support for eswitch offload of TC/OvS flow API, including
        offload of GRE, VxLAN, Geneve tunneling
      - support application device queues - ability to assign Rx and Tx
        queues to application threads
      - PTP and PPS (pulse-per-second) extensions

   - Broadcom Ethernet (bnxt)
      - devlink health reporting and device reload extensions

   - Mellanox Ethernet (mlx5)
      - offload macvlan interfaces
      - support HW offload of TC rules involving OVS internal ports
      - support HW-GRO and header/data split
      - support application device queues

   - Marvell OcteonTx2:
      - add XDP support for PF
      - add PTP support for VF

   - Qualcomm Ethernet switch (qca8k): support for QCA8328

   - Realtek Ethernet DSA switch (rtl8366rb)
      - support bridge offload
      - support STP, fast aging, disabling address learning
      - support for Realtek RTL8365MB-VC, a 4+1 port 10M/100M/1GE switch

   - Mellanox Ethernet/IB switch (mlxsw)
      - multi-level qdisc hierarchy offload (e.g. RED, prio and shaping)
      - offload root TBF qdisc as port shaper
      - support multiple routing interface MAC address prefixes
      - support for IP-in-IP with IPv6 underlay

   - MediaTek WiFi (mt76)
      - mt7921 - ASPM, 6GHz, SDIO and testmode support
      - mt7915 - LED and TWT support

   - Qualcomm WiFi (ath11k)
      - include channel rx and tx time in survey dump statistics
      - support for 80P80 and 160 MHz bandwidths
      - support channel 2 in 6 GHz band
      - spectral scan support for QCN9074
      - support for rx decapsulation offload (data frames in 802.3
        format)

   - Qualcomm phone SoC WiFi (wcn36xx)
      - enable Idle Mode Power Save (IMPS) to reduce power consumption
        during idle

   - Bluetooth driver support for MediaTek MT7922 and MT7921

   - Enable support for AOSP Bluetooth extension in Qualcomm WCN399x and
     Realtek 8822C/8852A

   - Microsoft vNIC driver (mana)
      - support hibernation and kexec

   - Google vNIC driver (gve)
      - support for jumbo frames
      - implement Rx page reuse

  Refactor:

   - Make all writes to netdev->dev_addr go thru helpers, so that we can
     add this address to the address rbtree and handle the updates

   - Various TCP cleanups and optimizations including improvements to
     CPU cache use

   - Simplify the gnet_stats, Qdisc stats' handling and remove
     qdisc->running sequence counter

   - Driver changes and API updates to address devlink locking
     deficiencies"

* tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2122 commits)
  Revert "net: avoid double accounting for pure zerocopy skbs"
  selftests: net: add arp_ndisc_evict_nocarrier
  net: ndisc: introduce ndisc_evict_nocarrier sysctl parameter
  net: arp: introduce arp_evict_nocarrier sysctl parameter
  libbpf: Deprecate AF_XDP support
  kbuild: Unify options for BTF generation for vmlinux and modules
  selftests/bpf: Add a testcase for 64-bit bounds propagation issue.
  bpf: Fix propagation of signed bounds from 64-bit min/max into 32-bit.
  bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off.
  net: vmxnet3: remove multiple false checks in vmxnet3_ethtool.c
  net: avoid double accounting for pure zerocopy skbs
  tcp: rename sk_wmem_free_skb
  netdevsim: fix uninit value in nsim_drv_configure_vfs()
  selftests/bpf: Fix also no-alu32 strobemeta selftest
  bpf: Add missing map_delete_elem method to bloom filter map
  selftests/bpf: Add bloom map success test for userspace calls
  bpf: Add alignment padding for "map_extra" + consolidate holes
  bpf: Bloom filter map naming fixups
  selftests/bpf: Add test cases for struct_ops prog
  bpf: Add dummy BPF STRUCT_OPS for test purpose
  ...
2021-11-02 06:20:58 -07:00
Linus Torvalds
552ebfe022 parisc architecture updates for kernel v5.16-rc1
Lots of new features and fixes:
 * Added TOC (table of content) support, which is a debugging feature which is
   either initiated by pressing the TOC button or via command in the BMC. If
   pressed the Linux built-in KDB/KGDB will be called (Sven Schnelle)
 * Fix CONFIG_PREEMPT (Sven)
 * Fix unwinder on 64-bit kernels (Sven)
 * Various kgdb fixes (Sven)
 * Added KFENCE support (me)
 * Switch to ARCH_STACKWALK implementation (me)
 * Fix ptrace check on syscall return (me)
 * Fix kernel crash with fixmaps on PA1.x machines (me)
 * Move thread_info into task struct, aka CONFIG_THREAD_INFO_IN_TASK (me)
 * Updated defconfigs
 * Smaller cleanups, including Makefile cleanups (Masahiro Yamada),
   use kthread_run() macro (Cai Huoqing), use swap() macro (Yihao Han).
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCYYALowAKCRD3ErUQojoP
 X3o1APwK7wJBdFKAV2hMEouFNZLz2ZTkSQrCMhPTxRupkwJ71QD+JeXvyPLZBLIu
 hlvi9mw9DKUKgCV+/Z65s8zjSHYC4wg=
 =A4Ci
 -----END PGP SIGNATURE-----

Merge tag 'for-5.16/parisc-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull parisc updates from Helge Deller:
 "Lots of new features and fixes:

   - Added TOC (table of content) support, which is a debugging feature
     which is either initiated by pressing the TOC button or via command
     in the BMC. If pressed the Linux built-in KDB/KGDB will be called
     (Sven Schnelle)

   - Fix CONFIG_PREEMPT (Sven)

   - Fix unwinder on 64-bit kernels (Sven)

   - Various kgdb fixes (Sven)

   - Added KFENCE support (me)

   - Switch to ARCH_STACKWALK implementation (me)

   - Fix ptrace check on syscall return (me)

   - Fix kernel crash with fixmaps on PA1.x machines (me)

   - Move thread_info into task struct, aka CONFIG_THREAD_INFO_IN_TASK
     (me)

   - Updated defconfigs

   - Smaller cleanups, including Makefile cleanups (Masahiro Yamada),
     use kthread_run() macro (Cai Huoqing), use swap() macro (Yihao
     Han)"

* tag 'for-5.16/parisc-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: (36 commits)
  parisc: Fix set_fixmap() on PA1.x CPUs
  parisc: Use swap() to swap values in setup_bootmem()
  parisc: Update defconfigs
  parisc: decompressor: clean up Makefile
  parisc: decompressor: remove repeated depenency of misc.o
  parisc: Remove unused constants from asm-offsets.c
  parisc/ftrace: use static key to enable/disable function graph tracer
  parisc/ftrace: set function trace function
  parisc: Make use of the helper macro kthread_run()
  parisc: mark xchg functions notrace
  parisc: enhance warning regarding usage of O_NONBLOCK
  parisc: Drop ifdef __KERNEL__ from non-uapi kernel headers
  parisc: Use PRIV_USER and PRIV_KERNEL in ptrace.h
  parisc: Use PRIV_USER in syscall.S
  parisc/kgdb: add kgdb_roundup() to make kgdb work with idle polling
  parisc: Move thread_info into task struct
  parisc: add support for TOC (transfer of control)
  parisc/firmware: add functions to retrieve TOC data
  parisc: add PIM TOC data structures
  parisc: move virt_map macro to assembly.h
  ...
2021-11-01 16:51:13 -07:00
Linus Torvalds
9a7e0a90a4 Scheduler updates:
- Revert the printk format based wchan() symbol resolution as it can leak
    the raw value in case that the symbol is not resolvable.
 
  - Make wchan() more robust and work with all kind of unwinders by
    enforcing that the task stays blocked while unwinding is in progress.
 
  - Prevent sched_fork() from accessing an invalid sched_task_group
 
  - Improve asymmetric packing logic
 
  - Extend scheduler statistics to RT and DL scheduling classes and add
    statistics for bandwith burst to the SCHED_FAIR class.
 
  - Properly account SCHED_IDLE entities
 
  - Prevent a potential deadlock when initial priority is assigned to a
    newly created kthread. A recent change to plug a race between cpuset and
    __sched_setscheduler() introduced a new lock dependency which is now
    triggered. Break the lock dependency chain by moving the priority
    assignment to the thread function.
 
  - Fix the idle time reporting in /proc/uptime for NOHZ enabled systems.
 
  - Improve idle balancing in general and especially for NOHZ enabled
    systems.
 
  - Provide proper interfaces for live patching so it does not have to
    fiddle with scheduler internals.
 
  - Add cluster aware scheduling support.
 
  - A small set of tweaks for RT (irqwork, wait_task_inactive(), various
    scheduler options and delaying mmdrop)
 
  - The usual small tweaks and improvements all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/OUkTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoR/5D/9ikdGNpKg9osNqJ3GjAmxsK6kVkB29
 iFe2k8pIpWDToWQf/wQRGih4Yj3Cl49QSnZcPIibh2/12EB1qrrW6iSPJkInz8Ec
 /1LS5/Vewn2OyoxyXZjdvGC5gTXEodSbIazASvX7nvdMeI4gsAsL5etzrMJirT/t
 aymqvr7zovvywrwMTQJrGjUMo9l4ewE8tafMNNhRu1BHU1U4ojM9yvThyRAAcmp7
 3Xy49A+Yq3IgrvYI4u8FMK5Zh08KaxSFjiLhePGm/bF+wSfYmWop2TP1jY05W2Uo
 ti8hfbJMUoFRYuMxAiEldkItnc0wV4M9PtWZZ/x+B71bs65Y4Zjt9cW+rxJv2+m1
 vzV31EsQwGnOti072dzWN4c/cZqngVXAjaNtErvDwJUr+Tw1ayv9KUvuodMQqZY6
 mu68bFUO2kV9EMe1CBOv51Uy1RGHyLj3rlNqrkw+Xp5ISE9Ad2vhUEiRp5bQx5Ci
 V/XFhGZkGUluh0vccrdFlNYZwhj8cZEzkOPCnPSeZ+bq8SyZE6xuHH/lTP1CJCOy
 s800rW1huM+kgV+zRN8adDkGXibAk9N3RtVGnQXmuEy8gB9LZmQg+JeM2wsc9B+6
 i0gdqZnsjNAfoK+BBAG4holxptSL8/eOJsFH8ZNIoxQ+iqooyPx9tFX7yXnRTBQj
 d2qWG7UvoseT+g==
 =fgtS
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Thomas Gleixner:

 - Revert the printk format based wchan() symbol resolution as it can
   leak the raw value in case that the symbol is not resolvable.

 - Make wchan() more robust and work with all kind of unwinders by
   enforcing that the task stays blocked while unwinding is in progress.

 - Prevent sched_fork() from accessing an invalid sched_task_group

 - Improve asymmetric packing logic

 - Extend scheduler statistics to RT and DL scheduling classes and add
   statistics for bandwith burst to the SCHED_FAIR class.

 - Properly account SCHED_IDLE entities

 - Prevent a potential deadlock when initial priority is assigned to a
   newly created kthread. A recent change to plug a race between cpuset
   and __sched_setscheduler() introduced a new lock dependency which is
   now triggered. Break the lock dependency chain by moving the priority
   assignment to the thread function.

 - Fix the idle time reporting in /proc/uptime for NOHZ enabled systems.

 - Improve idle balancing in general and especially for NOHZ enabled
   systems.

 - Provide proper interfaces for live patching so it does not have to
   fiddle with scheduler internals.

 - Add cluster aware scheduling support.

 - A small set of tweaks for RT (irqwork, wait_task_inactive(), various
   scheduler options and delaying mmdrop)

 - The usual small tweaks and improvements all over the place

* tag 'sched-core-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (69 commits)
  sched/fair: Cleanup newidle_balance
  sched/fair: Remove sysctl_sched_migration_cost condition
  sched/fair: Wait before decaying max_newidle_lb_cost
  sched/fair: Skip update_blocked_averages if we are defering load balance
  sched/fair: Account update_blocked_averages in newidle_balance cost
  x86: Fix __get_wchan() for !STACKTRACE
  sched,x86: Fix L2 cache mask
  sched/core: Remove rq_relock()
  sched: Improve wake_up_all_idle_cpus() take #2
  irq_work: Also rcuwait for !IRQ_WORK_HARD_IRQ on PREEMPT_RT
  irq_work: Handle some irq_work in a per-CPU thread on PREEMPT_RT
  irq_work: Allow irq_work_sync() to sleep if irq_work() no IRQ support.
  sched/rt: Annotate the RT balancing logic irqwork as IRQ_WORK_HARD_IRQ
  sched: Add cluster scheduler level for x86
  sched: Add cluster scheduler level in core and related Kconfig for ARM64
  topology: Represent clusters of CPUs within a die
  sched: Disable -Wunused-but-set-variable
  sched: Add wrapper for get_wchan() to keep task blocked
  x86: Fix get_wchan() to support the ORC unwinder
  proc: Use task_is_running() for wchan in /proc/$pid/stat
  ...
2021-11-01 13:48:52 -07:00
Linus Torvalds
595b28fb0c Locking updates:
- Move futex code into kernel/futex/ and split up the kitchen sink into
    seperate files to make integration of sys_futex_waitv() simpler.
 
  - Add a new sys_futex_waitv() syscall which allows to wait on multiple
    futexes. The main use case is emulating Windows' WaitForMultipleObjects
    which allows Wine to improve the performance of Windows Games. Also
    native Linux games can benefit from this interface as this is a common
    wait pattern for this kind of applications.
 
  - Add context to ww_mutex_trylock() to provide a path for i915 to rework
    their eviction code step by step without making lockdep upset until the
    final steps of rework are completed. It's also useful for regulator and
    TTM to avoid dropping locks in the non contended path.
 
  - Lockdep and might_sleep() cleanups and improvements
 
  - A few improvements for the RT substitutions.
 
  - The usual small improvements and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/FTITHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoVNZD/9vIm3Bu1Coz8tbNXz58AiCYq9Y/vp5
 mzFgSzz+VJTkW5Vh8jo5Uel4rCKZyt+rL276EoaRPzYl8KFtWDbpK3qd3PrXKqTX
 At49JO4ttAMJUHIBQ6vblEkykmfEd9YPU1uSWk5roJ+s7Jmr5VWnu0FEWHP00As5
 tWOca/TM0ei9kof26V2fl5aecTGII4i4Zsvy+LPsXtI+TnmP0gSBcGAS/5UnZTtJ
 vQRWTR3ojoYvh5iTmNqbaURYoQLe2j8yscn1DSW1CABWVmP12eDWs+N7jRP4b5S9
 73xOv5P7vpva41wxrK2ir5iNkpsLE97VL2JOHTW8nm7orblfiuxHLTCkTjEdd2pO
 h8blI2IBizEB3JYn2BMkOAaZQOSjN8hd6Ye/b2B4AMEGWeXEoEv6eVy/orYKCluQ
 XDqGn47Vce/SYmo5vfTB8VMt6nANx8PKvOP3IvjHInYEQBgiT6QrlUw3RRkXBp5s
 clQkjYYwjAMVIXowcCrdhoKjMROzi6STShVwHwGL8MaZXqr8Vl6BUO9ckU0pY+4C
 F000Hzwxi8lGEQ9k+P+BnYOEzH5osCty8lloKiQ/7ciX6T+CZHGJPGK/iY4YL8P5
 C3CJWMsHCqST7DodNFJmdfZt99UfIMmEhshMDduU9AAH0tHCn8vOu0U6WvCtpyBp
 BvHj68zteAtlYg==
 =RZ4x
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Thomas Gleixner:

 - Move futex code into kernel/futex/ and split up the kitchen sink into
   seperate files to make integration of sys_futex_waitv() simpler.

 - Add a new sys_futex_waitv() syscall which allows to wait on multiple
   futexes.

   The main use case is emulating Windows' WaitForMultipleObjects which
   allows Wine to improve the performance of Windows Games. Also native
   Linux games can benefit from this interface as this is a common wait
   pattern for this kind of applications.

 - Add context to ww_mutex_trylock() to provide a path for i915 to
   rework their eviction code step by step without making lockdep upset
   until the final steps of rework are completed. It's also useful for
   regulator and TTM to avoid dropping locks in the non contended path.

 - Lockdep and might_sleep() cleanups and improvements

 - A few improvements for the RT substitutions.

 - The usual small improvements and cleanups.

* tag 'locking-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
  locking: Remove spin_lock_flags() etc
  locking/rwsem: Fix comments about reader optimistic lock stealing conditions
  locking: Remove rcu_read_{,un}lock() for preempt_{dis,en}able()
  locking/rwsem: Disable preemption for spinning region
  docs: futex: Fix kernel-doc references
  futex: Fix PREEMPT_RT build
  futex2: Documentation: Document sys_futex_waitv() uAPI
  selftests: futex: Test sys_futex_waitv() wouldblock
  selftests: futex: Test sys_futex_waitv() timeout
  selftests: futex: Add sys_futex_waitv() test
  futex,arm: Wire up sys_futex_waitv()
  futex,x86: Wire up sys_futex_waitv()
  futex: Implement sys_futex_waitv()
  futex: Simplify double_lock_hb()
  futex: Split out wait/wake
  futex: Split out requeue
  futex: Rename mark_wake_futex()
  futex: Rename: match_futex()
  futex: Rename: hb_waiter_{inc,dec,pending}()
  futex: Split out PI futex
  ...
2021-11-01 13:15:36 -07:00
Helge Deller
3759778e6b parisc: enhance warning regarding usage of O_NONBLOCK
Instead of showing only the very first application which needs
recompile, show all of them, but print them only once.

Includes typo fix noticed by Colin Ian King.

Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
2021-11-01 07:36:00 +01:00
Helge Deller
0760a9157b parisc: Drop ifdef __KERNEL__ from non-uapi kernel headers
Signed-off-by: Helge Deller <deller@gmx.de>
2021-11-01 07:36:00 +01:00
Helge Deller
8d90dbfd4c parisc: Use PRIV_USER and PRIV_KERNEL in ptrace.h
Signed-off-by: Helge Deller <deller@gmx.de>
2021-11-01 07:36:00 +01:00
Helge Deller
2214c0e772 parisc: Move thread_info into task struct
This implements the CONFIG_THREAD_INFO_IN_TASK option.

With this change:
- before thread_info was part of the stack and located at the beginning of the stack
- now the thread_info struct is moved and located inside the task_struct structure
- the stack is allocated and handled like the major other platforms
- drop the cpu field of thread_info and use instead the one in task_struct

Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sven Schnelle <svens@stackframe.org>
2021-11-01 07:35:59 +01:00
Sven Schnelle
bc294838cc parisc: add support for TOC (transfer of control)
Almost all PA-RISC machines have either a button that
is labeled with 'TOC' or a BMC function to trigger a TOC.
TOC is a non-maskable interrupt that is sent to the processor.
This can be used for diagnostic purposes like obtaining a
stack trace/register dump or to enter KDB/KGDB.

As an example, on my c8000, TOC can be used with:

CONFIG_KGDB=y
CONFIG_KGDB_KDB=y

and the 'kgdboc=ttyS0,115200' appended to the command line.

Press ^[( on serial console, which will enter the BMC command line,
and enter 'TOC s':

root@(none):/# (
cli>TOC s
Sending TOC/INIT.
<Cpu3> 2800035d03e00000  0000000040c21ac8  CC_ERR_CHECK_TOC
<Cpu0> 2800035d00e00000  0000000040c21ad0  CC_ERR_CHECK_TOC
<Cpu2> 2800035d02e00000  0000000040c21ac8  CC_ERR_CHECK_TOC
<Cpu1> 2800035d01e00000  0000000040c21ad0  CC_ERR_CHECK_TOC
<Cpu3> 37000f7303e00000  2000000000000000  CC_ERR_CPU_CHECK_SUMMARY
<Cpu0> 37000f7300e00000  2000000000000000  CC_ERR_CPU_CHECK_SUMMARY
<Cpu2> 37000f7302e00000  2000000000000000  CC_ERR_CPU_CHECK_SUMMARY
<Cpu1> 37000f7301e00000  2000000000000000  CC_ERR_CPU_CHECK_SUMMARY
<Cpu3> 4300100803e00000  c0000000001d26cc  CC_MC_BR_TO_OS_TOC
<Cpu0> 4300100800e00000  c0000000001d26cc  CC_MC_BR_TO_OS_TOC
<Cpu2> 4300100802e00000  c0000000001d26cc  CC_MC_BR_TO_OS_TOC
<Cpu1> 4300100801e00000  c0000000001d26cc  CC_MC_BR_TO_OS_TOC

Entering kdb (current=0x00000000411cef80, pid 0) on processor 0 due to NonMaskable Interrupt @ 0x40c21ad0
[0]kdb>

Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Helge Deller <deller@gmx.de>
2021-11-01 07:35:59 +01:00
Sven Schnelle
ecac70366d parisc/firmware: add functions to retrieve TOC data
Add functions to retrieve TOC data from firmware both
for 1.1 and 2.0 PDC.

Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Helge Deller <deller@gmx.de>
2021-11-01 07:35:59 +01:00
Sven Schnelle
d9e2033669 parisc: add PIM TOC data structures
These data structures describe the TOC data we get from firmware
when issuing a PDC_PIM_TOC request.

Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Helge Deller <deller@gmx.de>
2021-11-01 07:35:59 +01:00
Sven Schnelle
b5f73da500 parisc: move virt_map macro to assembly.h
This macro will also be used by the TOC code, so move it
into asm/assembly.h to avoid duplication.

Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Helge Deller <deller@gmx.de>
2021-11-01 07:35:59 +01:00
Helge Deller
b7d8c16a58 parisc: Allocate task struct with stack frame alignment
We will put the stack directly behind the task struct, so
make sure that we allocate it with an alignment of 64 bytes.

Signed-off-by: Helge Deller <deller@gmx.de>
2021-10-30 23:11:01 +02:00
Helge Deller
9f6cfef1d0 parisc: Define FRAME_ALIGN and PRIV_USER/PRIV_KERNEL in assembly.h
Signed-off-by: Helge Deller <deller@gmx.de>
2021-10-30 23:11:01 +02:00
Helge Deller
ec5c115050 parisc: Add KFENCE support
Signed-off-by: Helge Deller <deller@gmx.de>
2021-10-30 23:11:00 +02:00
Helge Deller
a348eab327 parisc: make parisc_acctyp() available outside of faults.c
When adding kfence support, we need to tell kfence_handle_page_fault()
if the interrupted assembler statement is a read or write operation.

Signed-off-by: Helge Deller <deller@gmx.de>
2021-10-30 23:11:00 +02:00
Arnd Bergmann
f98a3dccfc locking: Remove spin_lock_flags() etc
parisc, ia64 and powerpc32 are the only remaining architectures that
provide custom arch_{spin,read,write}_lock_flags() functions, which are
meant to re-enable interrupts while waiting for a spinlock.

However, none of these can actually run into this codepath, because
it is only called on architectures without CONFIG_GENERIC_LOCKBREAK,
or when CONFIG_DEBUG_LOCK_ALLOC is set without CONFIG_LOCKDEP, and none
of those combinations are possible on the three architectures.

Going back in the git history, it appears that arch/mn10300 may have
been able to run into this code path, but there is a good chance that
it never worked. On the architectures that still exist, it was
already impossible to hit back in 2008 after the introduction of
CONFIG_GENERIC_LOCKBREAK, and possibly earlier.

As this is all dead code, just remove it and the helper functions built
around it. For arch/ia64, the inline asm could be cleaned up, but
it seems safer to leave it untouched.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Helge Deller <deller@gmx.de>  # parisc
Link: https://lore.kernel.org/r/20211022120058.1031690-1-arnd@kernel.org
2021-10-30 16:37:28 +02:00
Matthew Wilcox (Oracle)
08b0b0059b mm: Add flush_dcache_folio()
This is a default implementation which calls flush_dcache_page() on
each page in the folio.  If architectures can do better, they should
implement their own version of it.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18 07:49:36 -04:00
Kees Cook
42a20f86dc sched: Add wrapper for get_wchan() to keep task blocked
Having a stable wchan means the process must be blocked and for it to
stay that way while performing stack unwinding.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> [arm]
Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64]
Link: https://lkml.kernel.org/r/20211008111626.332092234@infradead.org
2021-10-15 11:25:14 +02:00
Stephen Rothwell
10d48705d5 fix up for "net: add new socket option SO_RESERVE_MEM"
Some architectures do not include uapi/asm/socket.h

Fixes: 2bb2f5fb21 ("net: add new socket option SO_RESERVE_MEM")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-01 15:00:21 +01:00
Helge Deller
90cc7bed1e parisc: Use absolute_pointer() to define PAGE0
Use absolute_pointer() wrapper for PAGE0 to avoid this compiler warning:

  arch/parisc/kernel/setup.c: In function 'start_parisc':
  error: '__builtin_memcmp_eq' specified bound 8 exceeds source size 0

Signed-off-by: Helge Deller <deller@gmx.de>
Co-Developed-by: Guenter Roeck <linux@roeck-us.net>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-16 08:35:42 +02:00
Helge Deller
6710287280 parisc: Implement __get/put_kernel_nofault()
Remove CONFIG_SET_FS from parisc, so we need to add
__get_kernel_nofault() and __put_kernel_nofault(), define
HAVE_GET_KERNEL_NOFAULT and remove set_fs(), get_fs(), load_sr2(),
thread_info->addr_limit, KERNEL_DS and USER_DS.

The nice side-effect of this patch is that we now can directly access
userspace via sr3 without the need to use a temporary sr2 which is
either copied from sr3 or set to zero (for kernel space).

Signed-off-by: Helge Deller <deller@gmx.de>
Suggested-by: Arnd Bergmann <arnd@kernel.org>
2021-09-09 22:53:09 +02:00
Helge Deller
e4f2006f12 parisc: Reduce sigreturn trampoline to 3 instructions
We can move the INSN_LDI_R20 instruction into the branch delay slot.

Signed-off-by: Helge Deller <deller@gmx.de>
2021-09-09 12:44:31 +02:00
Helge Deller
1260dea6d2 parisc: Drop strnlen_user() in favour of generic version
As suggested by Arnd Bergmann, drop the parisc version of
strnlen_user() and switch to the generic version.

Suggested-by: Arnd Bergmann <arnd@kernel.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Helge Deller <deller@gmx.de>
2021-09-09 12:44:30 +02:00
Arnd Bergmann
a7a08b275a arch: remove compat_alloc_user_space
All users of compat_alloc_user_space() and copy_in_user() have been
removed from the kernel, only a few functions in sparc remain that can be
changed to calling arch_copy_in_user() instead.

Link: https://lkml.kernel.org/r/20210727144859.4150043-7-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-08 15:32:35 -07:00
Linus Torvalds
11d5576880 parisc architecture fixes for kernel 5.15-rc1:
Fix an unaligned-access crash in the bootloader and drop asm/swab.h.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCYTHYXgAKCRD3ErUQojoP
 X1afAQCytCoygzHd6EyP6imoXsllx9hjzNHsii84zevhfQkBmQD9HARGlx23iYa+
 oXBuOL+pPKcauLVMFTIVKTvLtlrjbw4=
 =KH1L
 -----END PGP SIGNATURE-----

Merge tag 'for-5.15/parisc-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull parisc architecture fixes from Helge Deller:
 "Fix an unaligned-access crash in the bootloader and drop asm/swab.h"

* tag 'for-5.15/parisc-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Fix unaligned-access crash in bootloader
  parisc: Drop __arch_swab16(), arch_swab24(), _arch_swab32() and __arch_swab64() functions
2021-09-03 11:15:49 -07:00
Linus Torvalds
14726903c8 Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:
 "173 patches.

  Subsystems affected by this series: ia64, ocfs2, block, and mm (debug,
  pagecache, gup, swap, shmem, memcg, selftests, pagemap, mremap,
  bootmem, sparsemem, vmalloc, kasan, pagealloc, memory-failure,
  hugetlb, userfaultfd, vmscan, compaction, mempolicy, memblock,
  oom-kill, migration, ksm, percpu, vmstat, and madvise)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (173 commits)
  mm/madvise: add MADV_WILLNEED to process_madvise()
  mm/vmstat: remove unneeded return value
  mm/vmstat: simplify the array size calculation
  mm/vmstat: correct some wrong comments
  mm/percpu,c: remove obsolete comments of pcpu_chunk_populated()
  selftests: vm: add COW time test for KSM pages
  selftests: vm: add KSM merging time test
  mm: KSM: fix data type
  selftests: vm: add KSM merging across nodes test
  selftests: vm: add KSM zero page merging test
  selftests: vm: add KSM unmerge test
  selftests: vm: add KSM merge test
  mm/migrate: correct kernel-doc notation
  mm: wire up syscall process_mrelease
  mm: introduce process_mrelease system call
  memblock: make memblock_find_in_range method private
  mm/mempolicy.c: use in_task() in mempolicy_slab_node()
  mm/mempolicy: unify the create() func for bind/interleave/prefer-many policies
  mm/mempolicy: advertise new MPOL_PREFERRED_MANY
  mm/hugetlb: add support for mempolicy MPOL_PREFERRED_MANY
  ...
2021-09-03 10:08:28 -07:00
Christoph Hellwig
f358afc52c mm: remove flush_kernel_dcache_page
flush_kernel_dcache_page is a rather confusing interface that implements a
subset of flush_dcache_page by not being able to properly handle page
cache mapped pages.

The only callers left are in the exec code as all other previous callers
were incorrect as they could have dealt with page cache pages.  Replace
the calls to flush_kernel_dcache_page with calls to flush_dcache_page,
which for all architectures does either exactly the same thing, can
contains one or more of the following:

 1) an optimization to defer the cache flush for page cache pages not
    mapped into userspace
 2) additional flushing for mapped page cache pages if cache aliases
    are possible

Link: https://lkml.kernel.org/r/20210712060928.4161649-7-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Cc: Alex Shi <alexs@kernel.org>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Helge Deller <deller@gmx.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Cercueil <paul@crapouillou.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-03 09:58:13 -07:00
Helge Deller
e5a2cac908 parisc: Drop __arch_swab16(), arch_swab24(), _arch_swab32() and __arch_swab64() functions
No need to keep those as inline assembly functions, the compiler now
generates the same or even better optimized code.

Signed-off-by: Helge Deller <deller@gmx.de>
2021-09-02 23:54:03 +02:00
Linus Torvalds
a2d616b935 parisc architecture updates for kernel 5.15:
- Fix a kernel crash when a signal is delivered to bad userspace stack
 - Fix fall-through warnings in math-emu code
 - Increase size of gcc stack frame check
 - Switch coding from 'pci_' to 'dma_' API
 - Make struct parisc_driver::remove() return void
 - Some parisc related Makefile changes
 - Minor cleanups, e.g. change to octal permissions, fix macro collisions,
   fix PMD_ORDER collision, replace spaces with tabs
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCYTELwQAKCRD3ErUQojoP
 Xy/uAQChkDVD15kBvj0PUt4hDpGq7ryfAsEfMnxlV2k4Ue6SKAEA3Smfd242lpPF
 f89NNo6Y/ZhO+aWKfOLerXLfM6sB2QQ=
 =cxvN
 -----END PGP SIGNATURE-----

Merge tag 'for-5.15/parisc' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull parisc architecture updates from Helge Deller:

 - Fix a kernel crash when a signal is delivered to bad userspace stack

 - Fix fall-through warnings in math-emu code

 - Increase size of gcc stack frame check

 - Switch coding from 'pci_' to 'dma_' API

 - Make struct parisc_driver::remove() return void

 - Some parisc related Makefile changes

 - Minor cleanups, e.g. change to octal permissions, fix macro
   collisions, fix PMD_ORDER collision, replace spaces with tabs

* tag 'for-5.15/parisc' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: math-emu: Fix fall-through warnings
  parisc: fix crash with signals and alloca
  parisc: Fix compile failure when building 64-bit kernel natively
  parisc: ccio-dma.c: Added tab instead of spaces
  parisc/parport_gsc: switch from 'pci_' to 'dma_' API
  parisc: move core-y in arch/parisc/Makefile to arch/parisc/Kbuild
  parisc: switch from 'pci_' to 'dma_' API
  parisc: Make struct parisc_driver::remove() return void
  parisc: remove unused arch/parisc/boot/install.sh and its phony target
  parisc: Rename PMD_ORDER to PMD_TABLE_ORDER
  parisc: math-emu: Avoid "fmt" macro collision
  parisc: Increase size of gcc stack frame check
  parisc: Replace symbolic permissions with octal permissions
2021-09-02 13:16:00 -07:00
Linus Torvalds
9e9fb7655e Core:
- Enable memcg accounting for various networking objects.
 
 BPF:
 
  - Introduce bpf timers.
 
  - Add perf link and opaque bpf_cookie which the program can read
    out again, to be used in libbpf-based USDT library.
 
  - Add bpf_task_pt_regs() helper to access user space pt_regs
    in kprobes, to help user space stack unwinding.
 
  - Add support for UNIX sockets for BPF sockmap.
 
  - Extend BPF iterator support for UNIX domain sockets.
 
  - Allow BPF TCP congestion control progs and bpf iterators to call
    bpf_setsockopt(), e.g. to switch to another congestion control
    algorithm.
 
 Protocols:
 
  - Support IOAM Pre-allocated Trace with IPv6.
 
  - Support Management Component Transport Protocol.
 
  - bridge: multicast: add vlan support.
 
  - netfilter: add hooks for the SRv6 lightweight tunnel driver.
 
  - tcp:
     - enable mid-stream window clamping (by user space or BPF)
     - allow data-less, empty-cookie SYN with TFO_SERVER_COOKIE_NOT_REQD
     - more accurate DSACK processing for RACK-TLP
 
  - mptcp:
     - add full mesh path manager option
     - add partial support for MP_FAIL
     - improve use of backup subflows
     - optimize option processing
 
  - af_unix: add OOB notification support.
 
  - ipv6: add IFLA_INET6_RA_MTU to expose MTU value advertised by
          the router.
 
  - mac80211: Target Wake Time support in AP mode.
 
  - can: j1939: extend UAPI to notify about RX status.
 
 Driver APIs:
 
  - Add page frag support in page pool API.
 
  - Many improvements to the DSA (distributed switch) APIs.
 
  - ethtool: extend IRQ coalesce uAPI with timer reset modes.
 
  - devlink: control which auxiliary devices are created.
 
  - Support CAN PHYs via the generic PHY subsystem.
 
  - Proper cross-chip support for tag_8021q.
 
  - Allow TX forwarding for the software bridge data path to be
    offloaded to capable devices.
 
 Drivers:
 
  - veth: more flexible channels number configuration.
 
  - openvswitch: introduce per-cpu upcall dispatch.
 
  - Add internet mix (IMIX) mode to pktgen.
 
  - Transparently handle XDP operations in the bonding driver.
 
  - Add LiteETH network driver.
 
  - Renesas (ravb):
    - support Gigabit Ethernet IP
 
  - NXP Ethernet switch (sja1105)
    - fast aging support
    - support for "H" switch topologies
    - traffic termination for ports under VLAN-aware bridge
 
  - Intel 1G Ethernet
     - support getcrosststamp() with PCIe PTM (Precision Time
       Measurement) for better time sync
     - support Credit-Based Shaper (CBS) offload, enabling HW traffic
       prioritization and bandwidth reservation
 
  - Broadcom Ethernet (bnxt)
     - support pulse-per-second output
     - support larger Rx rings
 
  - Mellanox Ethernet (mlx5)
     - support ethtool RSS contexts and MQPRIO channel mode
     - support LAG offload with bridging
     - support devlink rate limit API
     - support packet sampling on tunnels
 
  - Huawei Ethernet (hns3):
     - basic devlink support
     - add extended IRQ coalescing support
     - report extended link state
 
  - Netronome Ethernet (nfp):
     - add conntrack offload support
 
  - Broadcom WiFi (brcmfmac):
     - add WPA3 Personal with FT to supported cipher suites
     - support 43752 SDIO device
 
  - Intel WiFi (iwlwifi):
     - support scanning hidden 6GHz networks
     - support for a new hardware family (Bz)
 
  - Xen pv driver:
     - harden netfront against malicious backends
 
  - Qualcomm mobile
     - ipa: refactor power management and enable automatic suspend
     - mhi: move MBIM to WWAN subsystem interfaces
 
 Refactor:
 
  - Ambient BPF run context and cgroup storage cleanup.
 
  - Compat rework for ndo_ioctl.
 
 Old code removal:
 
  - prism54 remove the obsoleted driver, deprecated by the p54 driver.
 
  - wan: remove sbni/granch driver.
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmEukBYACgkQMUZtbf5S
 IrsyHA//TO8dw18NYts4n9LmlJT2naJ7yBUUSSXK/M+DtW0MQ9nnHhqzPm5uJdRl
 IgQTNJrW3dYzRwgqaWZqEwO1t5/FI+f87ND1Nsekg7x9tF66a6ov5WxU26TwwSba
 U+si/inQ/4chuQ+LxMQobqCDxaLE46I2dIoRl+YfndJ24DRzYSwAEYIPPbSdfyU+
 +/l+3s4GaxO4k/hLciPAiOniyxLoUNiGUTNh+2yqRBXelSRJRKVnl+V22ANFrxRW
 nTEiplfVKhlPU1e4iLuRtaxDDiePHhw9I3j/lMHhfeFU2P/gKJIvz4QpGV0CAZg2
 1VvDU32WEx1GQLXJbKm0KwoNRUq1QSjOyyFti+BO7ugGaYAR4gKhShOqlSYLzUtB
 tbtzQhSNLWOGqgmSJOztZb5kFDm2EdRSll5/lP2uyFlPkIsIp0QbscJVzNTnS74b
 Xz15ZOw41Z4TfWPEMWgfrx6Zkm7pPWkly+7WfUkPcHa1gftNz6tzXXxSXcXIBPdi
 yQ5JCzzxrM5573YHuk5YedwZpn6PiAt4A/muFGk9C6aXP60TQAOS/ppaUzZdnk4D
 NfOk9mj06WEULjYjPcKEuT3GGWE6kmjb8Pu0QZWKOchv7vr6oZly1EkVZqYlXELP
 AfhcrFeuufie8mqm0jdb4LnYaAnqyLzlb1J4Zxh9F+/IX7G3yoc=
 =JDGD
 -----END PGP SIGNATURE-----

Merge tag 'net-next-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core:

   - Enable memcg accounting for various networking objects.

  BPF:

   - Introduce bpf timers.

   - Add perf link and opaque bpf_cookie which the program can read out
     again, to be used in libbpf-based USDT library.

   - Add bpf_task_pt_regs() helper to access user space pt_regs in
     kprobes, to help user space stack unwinding.

   - Add support for UNIX sockets for BPF sockmap.

   - Extend BPF iterator support for UNIX domain sockets.

   - Allow BPF TCP congestion control progs and bpf iterators to call
     bpf_setsockopt(), e.g. to switch to another congestion control
     algorithm.

  Protocols:

   - Support IOAM Pre-allocated Trace with IPv6.

   - Support Management Component Transport Protocol.

   - bridge: multicast: add vlan support.

   - netfilter: add hooks for the SRv6 lightweight tunnel driver.

   - tcp:
       - enable mid-stream window clamping (by user space or BPF)
       - allow data-less, empty-cookie SYN with TFO_SERVER_COOKIE_NOT_REQD
       - more accurate DSACK processing for RACK-TLP

   - mptcp:
       - add full mesh path manager option
       - add partial support for MP_FAIL
       - improve use of backup subflows
       - optimize option processing

   - af_unix: add OOB notification support.

   - ipv6: add IFLA_INET6_RA_MTU to expose MTU value advertised by the
     router.

   - mac80211: Target Wake Time support in AP mode.

   - can: j1939: extend UAPI to notify about RX status.

  Driver APIs:

   - Add page frag support in page pool API.

   - Many improvements to the DSA (distributed switch) APIs.

   - ethtool: extend IRQ coalesce uAPI with timer reset modes.

   - devlink: control which auxiliary devices are created.

   - Support CAN PHYs via the generic PHY subsystem.

   - Proper cross-chip support for tag_8021q.

   - Allow TX forwarding for the software bridge data path to be
     offloaded to capable devices.

  Drivers:

   - veth: more flexible channels number configuration.

   - openvswitch: introduce per-cpu upcall dispatch.

   - Add internet mix (IMIX) mode to pktgen.

   - Transparently handle XDP operations in the bonding driver.

   - Add LiteETH network driver.

   - Renesas (ravb):
       - support Gigabit Ethernet IP

   - NXP Ethernet switch (sja1105):
       - fast aging support
       - support for "H" switch topologies
       - traffic termination for ports under VLAN-aware bridge

   - Intel 1G Ethernet
       - support getcrosststamp() with PCIe PTM (Precision Time
         Measurement) for better time sync
       - support Credit-Based Shaper (CBS) offload, enabling HW traffic
         prioritization and bandwidth reservation

   - Broadcom Ethernet (bnxt)
       - support pulse-per-second output
       - support larger Rx rings

   - Mellanox Ethernet (mlx5)
       - support ethtool RSS contexts and MQPRIO channel mode
       - support LAG offload with bridging
       - support devlink rate limit API
       - support packet sampling on tunnels

   - Huawei Ethernet (hns3):
       - basic devlink support
       - add extended IRQ coalescing support
       - report extended link state

   - Netronome Ethernet (nfp):
       - add conntrack offload support

   - Broadcom WiFi (brcmfmac):
       - add WPA3 Personal with FT to supported cipher suites
       - support 43752 SDIO device

   - Intel WiFi (iwlwifi):
       - support scanning hidden 6GHz networks
       - support for a new hardware family (Bz)

   - Xen pv driver:
       - harden netfront against malicious backends

   - Qualcomm mobile
       - ipa: refactor power management and enable automatic suspend
       - mhi: move MBIM to WWAN subsystem interfaces

  Refactor:

   - Ambient BPF run context and cgroup storage cleanup.

   - Compat rework for ndo_ioctl.

  Old code removal:

   - prism54 remove the obsoleted driver, deprecated by the p54 driver.

   - wan: remove sbni/granch driver"

* tag 'net-next-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1715 commits)
  net: Add depends on OF_NET for LiteX's LiteETH
  ipv6: seg6: remove duplicated include
  net: hns3: remove unnecessary spaces
  net: hns3: add some required spaces
  net: hns3: clean up a type mismatch warning
  net: hns3: refine function hns3_set_default_feature()
  ipv6: remove duplicated 'net/lwtunnel.h' include
  net: w5100: check return value after calling platform_get_resource()
  net/mlxbf_gige: Make use of devm_platform_ioremap_resourcexxx()
  net: mdio: mscc-miim: Make use of the helper function devm_platform_ioremap_resource()
  net: mdio-ipq4019: Make use of devm_platform_ioremap_resource()
  fou: remove sparse errors
  ipv4: fix endianness issue in inet_rtm_getroute_build_skb()
  octeontx2-af: Set proper errorcode for IPv4 checksum errors
  octeontx2-af: Fix static code analyzer reported issues
  octeontx2-af: Fix mailbox errors in nix_rss_flowkey_cfg
  octeontx2-af: Fix loop in free and unmap counter
  af_unix: fix potential NULL deref in unix_dgram_connect()
  dpaa2-eth: Replace strlcpy with strscpy
  octeontx2-af: Use NDC TX for transmit packet data
  ...
2021-08-31 16:43:06 -07:00
Uwe Kleine-König
87875c1084 parisc: Make struct parisc_driver::remove() return void
The caller of this function (parisc_driver_remove() in
arch/parisc/kernel/drivers.c) ignores the return value, so better don't
return any value at all to not wake wrong expectations in driver authors.

The only function that could return a non-zero value before was
ipmi_parisc_remove() which returns the return value of
ipmi_si_remove_by_dev(). Make this function return void, too, as for all
other callers the value is ignored, too.

Also fold in a small checkpatch fix for:

WARNING: Unnecessary space before function pointer arguments
+	void (*remove) (struct parisc_device *dev);

Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> (for drivers/input)
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by:  Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Acked-by: Jiri Slaby <jirislaby@kernel.org>
Signed-off-by: Helge Deller <deller@gmx.de>
2021-08-30 10:18:25 +02:00
Matthew Wilcox (Oracle)
7bf82eb387 parisc: Rename PMD_ORDER to PMD_TABLE_ORDER
This is the order of the page table allocation, not the order of a PMD.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Helge Deller <deller@gmx.de>
2021-08-30 10:18:25 +02:00
Helge Deller
f6a3308d6f Revert "parisc: Add assembly implementations for memset, strlen, strcpy, strncpy and strcat"
This reverts commit 83af58f806.

It turns out that at least the assembly implementation for strncpy() was
buggy.  Revert the whole commit and return back to the default coding.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v5.4+
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-29 10:13:32 -07:00
Pavel Tikhomirov
04190bf894 sock: allow reading and changing sk_userlocks with setsockopt
SOCK_SNDBUF_LOCK and SOCK_RCVBUF_LOCK flags disable automatic socket
buffers adjustment done by kernel (see tcp_fixup_rcvbuf() and
tcp_sndbuf_expand()). If we've just created a new socket this adjustment
is enabled on it, but if one changes the socket buffer size by
setsockopt(SO_{SND,RCV}BUF*) it becomes disabled.

CRIU needs to call setsockopt(SO_{SND,RCV}BUF*) on each socket on
restore as it first needs to increase buffer sizes for packet queues
restore and second it needs to restore back original buffer sizes. So
after CRIU restore all sockets become non-auto-adjustable, which can
decrease network performance of restored applications significantly.

CRIU need to be able to restore sockets with enabled/disabled adjustment
to the same state it was before dump, so let's add special setsockopt
for it.

Let's also export SOCK_SNDBUF_LOCK and SOCK_RCVBUF_LOCK flags to uAPI so
that using these interface one can reenable automatic socket buffer
adjustment on their sockets.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 12:52:03 +01:00
Arnd Bergmann
1a33b18b3b compat: make linux/compat.h available everywhere
Parts of linux/compat.h are under an #ifdef, but we end up
using more of those over time, moving things around bit by
bit.

To get it over with once and for all, make all of this file
uncondititonal now so it can be accessed everywhere. There
are only a few types left that are in asm/compat.h but not
yet in the asm-generic version, so add those in the process.

This requires providing a few more types in asm-generic/compat.h
that were not already there. The only tricky one is
compat_sigset_t, which needs a little help on 32-bit architectures
and for x86.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23 14:20:24 +01:00
Aneesh Kumar K.V
9cf6fa2458 mm: rename pud_page_vaddr to pud_pgtable and make it return pmd_t *
No functional change in this patch.

[aneesh.kumar@linux.ibm.com: fix]
  Link: https://lkml.kernel.org/r/87wnqtnb60.fsf@linux.ibm.com
[sfr@canb.auug.org.au: another fix]
  Link: https://lkml.kernel.org/r/20210619134410.89559-1-aneesh.kumar@linux.ibm.com

Link: https://lkml.kernel.org/r/20210615110859.320299-1-aneesh.kumar@linux.ibm.com
Link: https://lore.kernel.org/linuxppc-dev/CAHk-=wi+J+iodze9FtjM3Zi4j4OeS+qqbKxME9QN4roxPEXH9Q@mail.gmail.com/
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Hugh Dickins <hughd@google.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-08 11:48:22 -07:00
Linus Torvalds
4cad671979 asm-generic/unaligned: Unify asm/unaligned.h around struct helper
The get_unaligned()/put_unaligned() helpers are traditionally architecture
 specific, with the two main variants being the "access-ok.h" version
 that assumes unaligned pointer accesses always work on a particular
 architecture, and the "le-struct.h" version that casts the data to a
 byte aligned type before dereferencing, for architectures that cannot
 always do unaligned accesses in hardware.
 
 Based on the discussion linked below, it appears that the access-ok
 version is not realiable on any architecture, but the struct version
 probably has no downsides. This series changes the code to use the
 same implementation on all architectures, addressing the few exceptions
 separately.
 
 Link: https://lore.kernel.org/lkml/75d07691-1e4f-741f-9852-38c0b4f520bc@synopsys.com/
 Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363
 Link: https://lore.kernel.org/lkml/20210507220813.365382-14-arnd@kernel.org/
 Link: git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git unaligned-rework-v2
 Link: https://lore.kernel.org/lkml/CAHk-=whGObOKruA_bU3aPGZfoDqZM1_9wBkwREp0H0FgR-90uQ@mail.gmail.com/
 Signed-off-by: Arnd Bergmann <arnd@arndb.de>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmDfFx4ACgkQmmx57+YA
 GNkqzRAAjdlIr8M+xI2CyT0/A9tswYfLMeWejmYopq3zlxI6RnvPiJJDIdY2I8US
 1npIiDo55w061CnXL9rV65ocL3XmGu1mabOvgM6ATsec+8t4WaXBV9tysxTJ9ea0
 ltLTa2P5DXWALvWiVMTME7hFaf1cW+8Uqt3LmXxDp2l5zasXajCHAH6YokON2PfM
 CsaRhwSxIu8Sbnu/IQGBI9JW5UXsBfKSyUwtM0OwP7jFOuIeZ4WBVA+j6UxONnFC
 wouKmAM/ThoOsaV9aP4EZLIfBx8d4/hfYQjZ958kYXurerruYkJeEqdIRbV0QqTy
 2O6ZrJ6uqPlzfWz9h458me2dt98YEtALHV/3DCWUcBfHmUQtxElyJYEhG0YjVF3H
 5RYtjw8Q2LS/QR5ask1Xn0JfT89rRnLi2migAtsA4Ce70JP4Us6wGobkj4SHlgDt
 P7+eVq2Mkhqw/kmV8N4p+ZS5lpkK0JniDN+ONDhkZqHL/zXG/HQzx9wLV69jlvo2
 ASevKxITdi+bKHWs5ANungkBOnBUQZacq46mVyi4HPDwMAFyWvVYTbFumy9koagQ
 o9NEgX3RsZcxxi7bU1xuFPFMLMlUQT3Nb30+84B4fKe9FmvHC1hizTiCnp7q4bZr
 z6a6AMHke7YLqKZOqzTJGRR3lPoZZDCb775SAd70LQp6XPZXOHs=
 =IY5U
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-unaligned-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm/unaligned.h unification from Arnd Bergmann:
 "Unify asm/unaligned.h around struct helper

  The get_unaligned()/put_unaligned() helpers are traditionally
  architecture specific, with the two main variants being the
  "access-ok.h" version that assumes unaligned pointer accesses always
  work on a particular architecture, and the "le-struct.h" version that
  casts the data to a byte aligned type before dereferencing, for
  architectures that cannot always do unaligned accesses in hardware.

  Based on the discussion linked below, it appears that the access-ok
  version is not realiable on any architecture, but the struct version
  probably has no downsides. This series changes the code to use the
  same implementation on all architectures, addressing the few
  exceptions separately"

Link: https://lore.kernel.org/lkml/75d07691-1e4f-741f-9852-38c0b4f520bc@synopsys.com/
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363
Link: https://lore.kernel.org/lkml/20210507220813.365382-14-arnd@kernel.org/
Link: git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git unaligned-rework-v2
Link: https://lore.kernel.org/lkml/CAHk-=whGObOKruA_bU3aPGZfoDqZM1_9wBkwREp0H0FgR-90uQ@mail.gmail.com/

* tag 'asm-generic-unaligned-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  asm-generic: simplify asm/unaligned.h
  asm-generic: uaccess: 1-byte access is always aligned
  netpoll: avoid put_unaligned() on single character
  mwifiex: re-fix for unaligned accesses
  apparmor: use get_unaligned() only for multi-byte words
  partitions: msdos: fix one-byte get_unaligned()
  asm-generic: unaligned always use struct helpers
  asm-generic: unaligned: remove byteshift helpers
  powerpc: use linux/unaligned/le_struct.h on LE power7
  m68k: select CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
  sh: remove unaligned access for sh4a
  openrisc: always use unaligned-struct header
  asm-generic: use asm-generic/unaligned.h for most architectures
2021-07-02 12:43:40 -07:00
Linus Torvalds
71bd934101 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
 "190 patches.

  Subsystems affected by this patch series: mm (hugetlb, userfaultfd,
  vmscan, kconfig, proc, z3fold, zbud, ras, mempolicy, memblock,
  migration, thp, nommu, kconfig, madvise, memory-hotplug, zswap,
  zsmalloc, zram, cleanups, kfence, and hmm), procfs, sysctl, misc,
  core-kernel, lib, lz4, checkpatch, init, kprobes, nilfs2, hfs,
  signals, exec, kcov, selftests, compress/decompress, and ipc"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (190 commits)
  ipc/util.c: use binary search for max_idx
  ipc/sem.c: use READ_ONCE()/WRITE_ONCE() for use_global_lock
  ipc: use kmalloc for msg_queue and shmid_kernel
  ipc sem: use kvmalloc for sem_undo allocation
  lib/decompressors: remove set but not used variabled 'level'
  selftests/vm/pkeys: exercise x86 XSAVE init state
  selftests/vm/pkeys: refill shadow register after implicit kernel write
  selftests/vm/pkeys: handle negative sys_pkey_alloc() return code
  selftests/vm/pkeys: fix alloc_random_pkey() to make it really, really random
  kcov: add __no_sanitize_coverage to fix noinstr for all architectures
  exec: remove checks in __register_bimfmt()
  x86: signal: don't do sas_ss_reset() until we are certain that sigframe won't be abandoned
  hfsplus: report create_date to kstat.btime
  hfsplus: remove unnecessary oom message
  nilfs2: remove redundant continue statement in a while-loop
  kprobes: remove duplicated strong free_insn_page in x86 and s390
  init: print out unknown kernel parameters
  checkpatch: do not complain about positive return values starting with EPOLL
  checkpatch: improve the indented label test
  checkpatch: scripts/spdxcheck.py now requires python3
  ...
2021-07-02 12:08:10 -07:00
Anshuman Khandual
1c2f7d14d8 mm/thp: define default pmd_pgtable()
Currently most platforms define pmd_pgtable() as pmd_page() duplicating
the same code all over.  Instead just define a default value i.e
pmd_page() for pmd_pgtable() and let platforms override when required via
<asm/pgtable.h>.  All the existing platform that override pmd_pgtable()
have been moved into their respective <asm/pgtable.h> header in order to
precede before the new generic definition.  This makes it much cleaner
with reduced code.

Link: https://lkml.kernel.org/r/1623646133-20306-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Stafford Horne <shorne@gmail.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-01 11:06:03 -07:00
Anshuman Khandual
fac7757e1f mm: define default value for FIRST_USER_ADDRESS
Currently most platforms define FIRST_USER_ADDRESS as 0UL duplication the
same code all over.  Instead just define a generic default value (i.e 0UL)
for FIRST_USER_ADDRESS and let the platforms override when required.  This
makes it much cleaner with reduced code.

The default FIRST_USER_ADDRESS here would be skipped in <linux/pgtable.h>
when the given platform overrides its value via <asm/pgtable.h>.

Link: https://lkml.kernel.org/r/1620615725-24623-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
Acked-by: Guo Ren <guoren@kernel.org>			[csky]
Acked-by: Stafford Horne <shorne@gmail.com>		[openrisc]
Acked-by: Catalin Marinas <catalin.marinas@arm.com>	[arm64]
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>	[RISC-V]
Cc: Richard Henderson <rth@twiddle.net>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Stafford Horne <shorne@gmail.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-01 11:06:02 -07:00
David Hildenbrand
4ca9b3859d mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault page tables
I. Background: Sparse Memory Mappings

When we manage sparse memory mappings dynamically in user space - also
sometimes involving MAP_NORESERVE - we want to dynamically populate/
discard memory inside such a sparse memory region.  Example users are
hypervisors (especially implementing memory ballooning or similar
technologies like virtio-mem) and memory allocators.  In addition, we want
to fail in a nice way (instead of generating SIGBUS) if populating does
not succeed because we are out of backend memory (which can happen easily
with file-based mappings, especially tmpfs and hugetlbfs).

While MADV_DONTNEED, MADV_REMOVE and FALLOC_FL_PUNCH_HOLE allow for
reliably discarding memory for most mapping types, there is no generic
approach to populate page tables and preallocate memory.

Although mmap() supports MAP_POPULATE, it is not applicable to the concept
of sparse memory mappings, where we want to populate/discard dynamically
and avoid expensive/problematic remappings.  In addition, we never
actually report errors during the final populate phase - it is best-effort
only.

fallocate() can be used to preallocate file-based memory and fail in a
safe way.  However, it cannot really be used for any private mappings on
anonymous files via memfd due to COW semantics.  In addition, fallocate()
does not actually populate page tables, so we still always get pagefaults
on first access - which is sometimes undesired (i.e., real-time workloads)
and requires real prefaulting of page tables, not just a preallocation of
backend storage.  There might be interesting use cases for sparse memory
regions along with mlockall(MCL_ONFAULT) which fallocate() cannot satisfy
as it does not prefault page tables.

II. On preallcoation/prefaulting from user space

Because we don't have a proper interface, what applications (like QEMU and
databases) end up doing is touching (i.e., reading+writing one byte to not
overwrite existing data) all individual pages.

However, that approach
1) Can result in wear on storage backing, because we end up reading/writing
   each page; this is especially a problem for dax/pmem.
2) Can result in mmap_sem contention when prefaulting via multiple
   threads.
3) Requires expensive signal handling, especially to catch SIGBUS in case
   of hugetlbfs/shmem/file-backed memory. For example, this is
   problematic in hypervisors like QEMU where SIGBUS handlers might already
   be used by other subsystems concurrently to e.g, handle hardware errors.
   "Simply" doing preallocation concurrently from other thread is not that
   easy.

III. On MADV_WILLNEED

Extending MADV_WILLNEED is not an option because
1. It would change the semantics: "Expect access in the near future." and
   "might be a good idea to read some pages" vs. "Definitely populate/
   preallocate all memory and definitely fail on errors.".
2. Existing users (like virtio-balloon in QEMU when deflating the balloon)
   don't want populate/prealloc semantics. They treat this rather as a hint
   to give a little performance boost without too much overhead - and don't
   expect that a lot of memory might get consumed or a lot of time
   might be spent.

IV. MADV_POPULATE_READ and MADV_POPULATE_WRITE

Let's introduce MADV_POPULATE_READ and MADV_POPULATE_WRITE, inspired by
MAP_POPULATE, with the following semantics:
1. MADV_POPULATE_READ can be used to prefault page tables just like
   manually reading each individual page. This will not break any COW
   mappings. The shared zero page might get mapped and no backend storage
   might get preallocated -- allocation might be deferred to
   write-fault time. Especially shared file mappings require an explicit
   fallocate() upfront to actually preallocate backend memory (blocks in
   the file system) in case the file might have holes.
2. If MADV_POPULATE_READ succeeds, all page tables have been populated
   (prefaulted) readable once.
3. MADV_POPULATE_WRITE can be used to preallocate backend memory and
   prefault page tables just like manually writing (or
   reading+writing) each individual page. This will break any COW
   mappings -- e.g., the shared zeropage is never populated.
4. If MADV_POPULATE_WRITE succeeds, all page tables have been populated
   (prefaulted) writable once.
5. MADV_POPULATE_READ and MADV_POPULATE_WRITE cannot be applied to special
   mappings marked with VM_PFNMAP and VM_IO. Also, proper access
   permissions (e.g., PROT_READ, PROT_WRITE) are required. If any such
   mapping is encountered, madvise() fails with -EINVAL.
6. If MADV_POPULATE_READ or MADV_POPULATE_WRITE fails, some page tables
   might have been populated.
7. MADV_POPULATE_READ and MADV_POPULATE_WRITE will return -EHWPOISON
   when encountering a HW poisoned page in the range.
8. Similar to MAP_POPULATE, MADV_POPULATE_READ and MADV_POPULATE_WRITE
   cannot protect from the OOM (Out Of Memory) handler killing the
   process.

While the use case for MADV_POPULATE_WRITE is fairly obvious (i.e.,
preallocate memory and prefault page tables for VMs), one issue is that
whenever we prefault pages writable, the pages have to be marked dirty,
because the CPU could dirty them any time.  while not a real problem for
hugetlbfs or dax/pmem, it can be a problem for shared file mappings: each
page will be marked dirty and has to be written back later when evicting.

MADV_POPULATE_READ allows for optimizing this scenario: Pre-read a whole
mapping from backend storage without marking it dirty, such that eviction
won't have to write it back.  As discussed above, shared file mappings
might require an explciit fallocate() upfront to achieve
preallcoation+prepopulation.

Although sparse memory mappings are the primary use case, this will also
be useful for other preallocate/prefault use cases where MAP_POPULATE is
not desired or the semantics of MAP_POPULATE are not sufficient: as one
example, QEMU users can trigger preallocation/prefaulting of guest RAM
after the mapping was created -- and don't want errors to be silently
suppressed.

Looking at the history, MADV_POPULATE was already proposed in 2013 [1],
however, the main motivation back than was performance improvements --
which should also still be the case.

V. Single-threaded performance comparison

I did a short experiment, prefaulting page tables on completely *empty
mappings/files* and repeated the experiment 10 times.  The results
correspond to the shortest execution time.  In general, the performance
benefit for huge pages is negligible with small mappings.

V.1: Private mappings

POPULATE_READ and POPULATE_WRITE is fastest.  Note that
Reading/POPULATE_READ will populate the shared zeropage where applicable
-- which result in short population times.

The fastest way to allocate backend storage (here: swap or huge pages) and
prefault page tables is POPULATE_WRITE.

V.2: Shared mappings

fallocate() is fastest, however, doesn't prefault page tables.
POPULATE_WRITE is faster than simple writes and read/writes.
POPULATE_READ is faster than simple reads.

Without a fd, the fastest way to allocate backend storage and prefault
page tables is POPULATE_WRITE.  With an fd, the fastest way is usually
FALLOCATE+POPULATE_READ or FALLOCATE+POPULATE_WRITE respectively; one
exception are actual files: FALLOCATE+Read is slightly faster than
FALLOCATE+POPULATE_READ.

The fastest way to allocate backend storage prefault page tables is
FALLOCATE+POPULATE_WRITE -- except when dealing with actual files; then,
FALLOCATE+POPULATE_READ is fastest and won't directly mark all pages as
dirty.

v.3: Detailed results

==================================================
2 MiB MAP_PRIVATE:
**************************************************
Anon 4 KiB     : Read                     :     0.119 ms
Anon 4 KiB     : Write                    :     0.222 ms
Anon 4 KiB     : Read/Write               :     0.380 ms
Anon 4 KiB     : POPULATE_READ            :     0.060 ms
Anon 4 KiB     : POPULATE_WRITE           :     0.158 ms
Memfd 4 KiB    : Read                     :     0.034 ms
Memfd 4 KiB    : Write                    :     0.310 ms
Memfd 4 KiB    : Read/Write               :     0.362 ms
Memfd 4 KiB    : POPULATE_READ            :     0.039 ms
Memfd 4 KiB    : POPULATE_WRITE           :     0.229 ms
Memfd 2 MiB    : Read                     :     0.030 ms
Memfd 2 MiB    : Write                    :     0.030 ms
Memfd 2 MiB    : Read/Write               :     0.030 ms
Memfd 2 MiB    : POPULATE_READ            :     0.030 ms
Memfd 2 MiB    : POPULATE_WRITE           :     0.030 ms
tmpfs          : Read                     :     0.033 ms
tmpfs          : Write                    :     0.313 ms
tmpfs          : Read/Write               :     0.406 ms
tmpfs          : POPULATE_READ            :     0.039 ms
tmpfs          : POPULATE_WRITE           :     0.285 ms
file           : Read                     :     0.033 ms
file           : Write                    :     0.351 ms
file           : Read/Write               :     0.408 ms
file           : POPULATE_READ            :     0.039 ms
file           : POPULATE_WRITE           :     0.290 ms
hugetlbfs      : Read                     :     0.030 ms
hugetlbfs      : Write                    :     0.030 ms
hugetlbfs      : Read/Write               :     0.030 ms
hugetlbfs      : POPULATE_READ            :     0.030 ms
hugetlbfs      : POPULATE_WRITE           :     0.030 ms
**************************************************
4096 MiB MAP_PRIVATE:
**************************************************
Anon 4 KiB     : Read                     :   237.940 ms
Anon 4 KiB     : Write                    :   708.409 ms
Anon 4 KiB     : Read/Write               :  1054.041 ms
Anon 4 KiB     : POPULATE_READ            :   124.310 ms
Anon 4 KiB     : POPULATE_WRITE           :   572.582 ms
Memfd 4 KiB    : Read                     :   136.928 ms
Memfd 4 KiB    : Write                    :   963.898 ms
Memfd 4 KiB    : Read/Write               :  1106.561 ms
Memfd 4 KiB    : POPULATE_READ            :    78.450 ms
Memfd 4 KiB    : POPULATE_WRITE           :   805.881 ms
Memfd 2 MiB    : Read                     :   357.116 ms
Memfd 2 MiB    : Write                    :   357.210 ms
Memfd 2 MiB    : Read/Write               :   357.606 ms
Memfd 2 MiB    : POPULATE_READ            :   356.094 ms
Memfd 2 MiB    : POPULATE_WRITE           :   356.937 ms
tmpfs          : Read                     :   137.536 ms
tmpfs          : Write                    :   954.362 ms
tmpfs          : Read/Write               :  1105.954 ms
tmpfs          : POPULATE_READ            :    80.289 ms
tmpfs          : POPULATE_WRITE           :   822.826 ms
file           : Read                     :   137.874 ms
file           : Write                    :   987.025 ms
file           : Read/Write               :  1107.439 ms
file           : POPULATE_READ            :    80.413 ms
file           : POPULATE_WRITE           :   857.622 ms
hugetlbfs      : Read                     :   355.607 ms
hugetlbfs      : Write                    :   355.729 ms
hugetlbfs      : Read/Write               :   356.127 ms
hugetlbfs      : POPULATE_READ            :   354.585 ms
hugetlbfs      : POPULATE_WRITE           :   355.138 ms
**************************************************
2 MiB MAP_SHARED:
**************************************************
Anon 4 KiB     : Read                     :     0.394 ms
Anon 4 KiB     : Write                    :     0.348 ms
Anon 4 KiB     : Read/Write               :     0.400 ms
Anon 4 KiB     : POPULATE_READ            :     0.326 ms
Anon 4 KiB     : POPULATE_WRITE           :     0.273 ms
Anon 2 MiB     : Read                     :     0.030 ms
Anon 2 MiB     : Write                    :     0.030 ms
Anon 2 MiB     : Read/Write               :     0.030 ms
Anon 2 MiB     : POPULATE_READ            :     0.030 ms
Anon 2 MiB     : POPULATE_WRITE           :     0.030 ms
Memfd 4 KiB    : Read                     :     0.412 ms
Memfd 4 KiB    : Write                    :     0.372 ms
Memfd 4 KiB    : Read/Write               :     0.419 ms
Memfd 4 KiB    : POPULATE_READ            :     0.343 ms
Memfd 4 KiB    : POPULATE_WRITE           :     0.288 ms
Memfd 4 KiB    : FALLOCATE                :     0.137 ms
Memfd 4 KiB    : FALLOCATE+Read           :     0.446 ms
Memfd 4 KiB    : FALLOCATE+Write          :     0.330 ms
Memfd 4 KiB    : FALLOCATE+Read/Write     :     0.454 ms
Memfd 4 KiB    : FALLOCATE+POPULATE_READ  :     0.379 ms
Memfd 4 KiB    : FALLOCATE+POPULATE_WRITE :     0.268 ms
Memfd 2 MiB    : Read                     :     0.030 ms
Memfd 2 MiB    : Write                    :     0.030 ms
Memfd 2 MiB    : Read/Write               :     0.030 ms
Memfd 2 MiB    : POPULATE_READ            :     0.030 ms
Memfd 2 MiB    : POPULATE_WRITE           :     0.030 ms
Memfd 2 MiB    : FALLOCATE                :     0.030 ms
Memfd 2 MiB    : FALLOCATE+Read           :     0.031 ms
Memfd 2 MiB    : FALLOCATE+Write          :     0.031 ms
Memfd 2 MiB    : FALLOCATE+Read/Write     :     0.031 ms
Memfd 2 MiB    : FALLOCATE+POPULATE_READ  :     0.030 ms
Memfd 2 MiB    : FALLOCATE+POPULATE_WRITE :     0.030 ms
tmpfs          : Read                     :     0.416 ms
tmpfs          : Write                    :     0.369 ms
tmpfs          : Read/Write               :     0.425 ms
tmpfs          : POPULATE_READ            :     0.346 ms
tmpfs          : POPULATE_WRITE           :     0.295 ms
tmpfs          : FALLOCATE                :     0.139 ms
tmpfs          : FALLOCATE+Read           :     0.447 ms
tmpfs          : FALLOCATE+Write          :     0.333 ms
tmpfs          : FALLOCATE+Read/Write     :     0.454 ms
tmpfs          : FALLOCATE+POPULATE_READ  :     0.380 ms
tmpfs          : FALLOCATE+POPULATE_WRITE :     0.272 ms
file           : Read                     :     0.191 ms
file           : Write                    :     0.511 ms
file           : Read/Write               :     0.524 ms
file           : POPULATE_READ            :     0.196 ms
file           : POPULATE_WRITE           :     0.434 ms
file           : FALLOCATE                :     0.004 ms
file           : FALLOCATE+Read           :     0.197 ms
file           : FALLOCATE+Write          :     0.554 ms
file           : FALLOCATE+Read/Write     :     0.480 ms
file           : FALLOCATE+POPULATE_READ  :     0.201 ms
file           : FALLOCATE+POPULATE_WRITE :     0.381 ms
hugetlbfs      : Read                     :     0.030 ms
hugetlbfs      : Write                    :     0.030 ms
hugetlbfs      : Read/Write               :     0.030 ms
hugetlbfs      : POPULATE_READ            :     0.030 ms
hugetlbfs      : POPULATE_WRITE           :     0.030 ms
hugetlbfs      : FALLOCATE                :     0.030 ms
hugetlbfs      : FALLOCATE+Read           :     0.031 ms
hugetlbfs      : FALLOCATE+Write          :     0.031 ms
hugetlbfs      : FALLOCATE+Read/Write     :     0.030 ms
hugetlbfs      : FALLOCATE+POPULATE_READ  :     0.030 ms
hugetlbfs      : FALLOCATE+POPULATE_WRITE :     0.030 ms
**************************************************
4096 MiB MAP_SHARED:
**************************************************
Anon 4 KiB     : Read                     :  1053.090 ms
Anon 4 KiB     : Write                    :   913.642 ms
Anon 4 KiB     : Read/Write               :  1060.350 ms
Anon 4 KiB     : POPULATE_READ            :   893.691 ms
Anon 4 KiB     : POPULATE_WRITE           :   782.885 ms
Anon 2 MiB     : Read                     :   358.553 ms
Anon 2 MiB     : Write                    :   358.419 ms
Anon 2 MiB     : Read/Write               :   357.992 ms
Anon 2 MiB     : POPULATE_READ            :   357.533 ms
Anon 2 MiB     : POPULATE_WRITE           :   357.808 ms
Memfd 4 KiB    : Read                     :  1078.144 ms
Memfd 4 KiB    : Write                    :   942.036 ms
Memfd 4 KiB    : Read/Write               :  1100.391 ms
Memfd 4 KiB    : POPULATE_READ            :   925.829 ms
Memfd 4 KiB    : POPULATE_WRITE           :   804.394 ms
Memfd 4 KiB    : FALLOCATE                :   304.632 ms
Memfd 4 KiB    : FALLOCATE+Read           :  1163.359 ms
Memfd 4 KiB    : FALLOCATE+Write          :   933.186 ms
Memfd 4 KiB    : FALLOCATE+Read/Write     :  1187.304 ms
Memfd 4 KiB    : FALLOCATE+POPULATE_READ  :  1013.660 ms
Memfd 4 KiB    : FALLOCATE+POPULATE_WRITE :   794.560 ms
Memfd 2 MiB    : Read                     :   358.131 ms
Memfd 2 MiB    : Write                    :   358.099 ms
Memfd 2 MiB    : Read/Write               :   358.250 ms
Memfd 2 MiB    : POPULATE_READ            :   357.563 ms
Memfd 2 MiB    : POPULATE_WRITE           :   357.334 ms
Memfd 2 MiB    : FALLOCATE                :   356.735 ms
Memfd 2 MiB    : FALLOCATE+Read           :   358.152 ms
Memfd 2 MiB    : FALLOCATE+Write          :   358.331 ms
Memfd 2 MiB    : FALLOCATE+Read/Write     :   358.018 ms
Memfd 2 MiB    : FALLOCATE+POPULATE_READ  :   357.286 ms
Memfd 2 MiB    : FALLOCATE+POPULATE_WRITE :   357.523 ms
tmpfs          : Read                     :  1087.265 ms
tmpfs          : Write                    :   950.840 ms
tmpfs          : Read/Write               :  1107.567 ms
tmpfs          : POPULATE_READ            :   922.605 ms
tmpfs          : POPULATE_WRITE           :   810.094 ms
tmpfs          : FALLOCATE                :   306.320 ms
tmpfs          : FALLOCATE+Read           :  1169.796 ms
tmpfs          : FALLOCATE+Write          :   933.730 ms
tmpfs          : FALLOCATE+Read/Write     :  1191.610 ms
tmpfs          : FALLOCATE+POPULATE_READ  :  1020.474 ms
tmpfs          : FALLOCATE+POPULATE_WRITE :   798.945 ms
file           : Read                     :   654.101 ms
file           : Write                    :  1259.142 ms
file           : Read/Write               :  1289.509 ms
file           : POPULATE_READ            :   661.642 ms
file           : POPULATE_WRITE           :  1106.816 ms
file           : FALLOCATE                :     1.864 ms
file           : FALLOCATE+Read           :   656.328 ms
file           : FALLOCATE+Write          :  1153.300 ms
file           : FALLOCATE+Read/Write     :  1180.613 ms
file           : FALLOCATE+POPULATE_READ  :   668.347 ms
file           : FALLOCATE+POPULATE_WRITE :   996.143 ms
hugetlbfs      : Read                     :   357.245 ms
hugetlbfs      : Write                    :   357.413 ms
hugetlbfs      : Read/Write               :   357.120 ms
hugetlbfs      : POPULATE_READ            :   356.321 ms
hugetlbfs      : POPULATE_WRITE           :   356.693 ms
hugetlbfs      : FALLOCATE                :   355.927 ms
hugetlbfs      : FALLOCATE+Read           :   357.074 ms
hugetlbfs      : FALLOCATE+Write          :   357.120 ms
hugetlbfs      : FALLOCATE+Read/Write     :   356.983 ms
hugetlbfs      : FALLOCATE+POPULATE_READ  :   356.413 ms
hugetlbfs      : FALLOCATE+POPULATE_WRITE :   356.266 ms
**************************************************

[1] https://lkml.org/lkml/2013/6/27/698

[akpm@linux-foundation.org: coding style fixes]

Link: https://lkml.kernel.org/r/20210419135443.12822-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rolf Eike Beer <eike-kernel@sf-tec.de>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-30 20:47:30 -07:00
Linus Torvalds
dbe69e4337 Networking changes for 5.14.
Core:
 
  - BPF:
    - add syscall program type and libbpf support for generating
      instructions and bindings for in-kernel BPF loaders (BPF loaders
      for BPF), this is a stepping stone for signed BPF programs
    - infrastructure to migrate TCP child sockets from one listener
      to another in the same reuseport group/map to improve flexibility
      of service hand-off/restart
    - add broadcast support to XDP redirect
 
  - allow bypass of the lockless qdisc to improving performance
    (for pktgen: +23% with one thread, +44% with 2 threads)
 
  - add a simpler version of "DO_ONCE()" which does not require
    jump labels, intended for slow-path usage
 
  - virtio/vsock: introduce SOCK_SEQPACKET support
 
  - add getsocketopt to retrieve netns cookie
 
  - ip: treat lowest address of a IPv4 subnet as ordinary unicast address
        allowing reclaiming of precious IPv4 addresses
 
  - ipv6: use prandom_u32() for ID generation
 
  - ip: add support for more flexible field selection for hashing
        across multi-path routes (w/ offload to mlxsw)
 
  - icmp: add support for extended RFC 8335 PROBE (ping)
 
  - seg6: add support for SRv6 End.DT46 behavior
 
  - mptcp:
     - DSS checksum support (RFC 8684) to detect middlebox meddling
     - support Connection-time 'C' flag
     - time stamping support
 
  - sctp: packetization Layer Path MTU Discovery (RFC 8899)
 
  - xfrm: speed up state addition with seq set
 
  - WiFi:
     - hidden AP discovery on 6 GHz and other HE 6 GHz improvements
     - aggregation handling improvements for some drivers
     - minstrel improvements for no-ack frames
     - deferred rate control for TXQs to improve reaction times
     - switch from round robin to virtual time-based airtime scheduler
 
  - add trace points:
     - tcp checksum errors
     - openvswitch - action execution, upcalls
     - socket errors via sk_error_report
 
 Device APIs:
 
  - devlink: add rate API for hierarchical control of max egress rate
             of virtual devices (VFs, SFs etc.)
 
  - don't require RCU read lock to be held around BPF hooks
    in NAPI context
 
  - page_pool: generic buffer recycling
 
 New hardware/drivers:
 
  - mobile:
     - iosm: PCIe Driver for Intel M.2 Modem
     - support for Qualcomm MSM8998 (ipa)
 
  - WiFi: Qualcomm QCN9074 and WCN6855 PCI devices
 
  - sparx5: Microchip SparX-5 family of Enterprise Ethernet switches
 
  - Mellanox BlueField Gigabit Ethernet (control NIC of the DPU)
 
  - NXP SJA1110 Automotive Ethernet 10-port switch
 
  - Qualcomm QCA8327 switch support (qca8k)
 
  - Mikrotik 10/25G NIC (atl1c)
 
 Driver changes:
 
  - ACPI support for some MDIO, MAC and PHY devices from Marvell and NXP
    (our first foray into MAC/PHY description via ACPI)
 
  - HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx
 
  - Mellanox/Nvidia NIC (mlx5)
    - NIC VF offload of L2 bridging
    - support IRQ distribution to Sub-functions
 
  - Marvell (prestera):
     - add flower and match all
     - devlink trap
     - link aggregation
 
  - Netronome (nfp): connection tracking offload
 
  - Intel 1GE (igc): add AF_XDP support
 
  - Marvell DPU (octeontx2): ingress ratelimit offload
 
  - Google vNIC (gve): new ring/descriptor format support
 
  - Qualcomm mobile (rmnet & ipa): inline checksum offload support
 
  - MediaTek WiFi (mt76)
     - mt7915 MSI support
     - mt7915 Tx status reporting
     - mt7915 thermal sensors support
     - mt7921 decapsulation offload
     - mt7921 enable runtime pm and deep sleep
 
  - Realtek WiFi (rtw88)
     - beacon filter support
     - Tx antenna path diversity support
     - firmware crash information via devcoredump
 
  - Qualcomm 60GHz WiFi (wcn36xx)
     - Wake-on-WLAN support with magic packets and GTK rekeying
 
  - Micrel PHY (ksz886x/ksz8081): add cable test support
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmDb+fUACgkQMUZtbf5S
 Irs2Jg//aqN0Q8CgIvYCVhPxQw1tY7pTAbgyqgBZ01vwjyvtIOgJiWzSfFEU84mX
 M8fcpFX5eTKrOyJ9S6UFfQ/JG114n3hjAxFFT4Hxk2gC1Tg0vHuFQTDHcUl28bUE
 mTm61e1YpdorILnv2k5JVQ/wu0vs5QKDrjcYcrcPnh+j93wvnPOgAfDBV95nZzjS
 OTt4q2fR8GzLcSYWWsclMbDNkzyTG50RW/0Yd6aGjr5QGvXfrMeXfUJNz533PMf/
 w5lNyjRKv+x9mdTZJzU0+msNUrZgUdRz7W8Ey8lD3hJZRE+D6/uU7FtsE8Mi3+uc
 HWxeZUyzA3YF1MfVl/eesbxyPT7S/OkLzk4O5B35FbqP0YltaP+bOjq1/nM3ce1/
 io9Dx9pIl/2JANUgRCAtLi8Z2dkvRoqTaBxZ/nPudCCljFwDwl6joTMJ7Ow22i5Y
 5aIkcXFmZq4LbJDiHvbTlqT7yiuaEvu2UK/23bSIg/K3nF4eAmkY9Y1EgiMf60OF
 78Ttw0wk2tUegwaS5MZnCniKBKDyl9gM2F6rbZ/IxQRR2LTXFc1B6gC+ynUxgXfh
 Ub8O++6qGYGYZ0XvQH4pzco79p3qQWBTK5beIp2eu6BOAjBVIXq4AibUfoQLACsu
 hX7jMPYd0kc3WFgUnKgQP8EnjFSwbf4XiaE7fIXvWBY8hzCw2h4=
 =LvtX
 -----END PGP SIGNATURE-----

Merge tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core:

   - BPF:
      - add syscall program type and libbpf support for generating
        instructions and bindings for in-kernel BPF loaders (BPF loaders
        for BPF), this is a stepping stone for signed BPF programs
      - infrastructure to migrate TCP child sockets from one listener to
        another in the same reuseport group/map to improve flexibility
        of service hand-off/restart
      - add broadcast support to XDP redirect

   - allow bypass of the lockless qdisc to improving performance (for
     pktgen: +23% with one thread, +44% with 2 threads)

   - add a simpler version of "DO_ONCE()" which does not require jump
     labels, intended for slow-path usage

   - virtio/vsock: introduce SOCK_SEQPACKET support

   - add getsocketopt to retrieve netns cookie

   - ip: treat lowest address of a IPv4 subnet as ordinary unicast
     address allowing reclaiming of precious IPv4 addresses

   - ipv6: use prandom_u32() for ID generation

   - ip: add support for more flexible field selection for hashing
     across multi-path routes (w/ offload to mlxsw)

   - icmp: add support for extended RFC 8335 PROBE (ping)

   - seg6: add support for SRv6 End.DT46 behavior

   - mptcp:
      - DSS checksum support (RFC 8684) to detect middlebox meddling
      - support Connection-time 'C' flag
      - time stamping support

   - sctp: packetization Layer Path MTU Discovery (RFC 8899)

   - xfrm: speed up state addition with seq set

   - WiFi:
      - hidden AP discovery on 6 GHz and other HE 6 GHz improvements
      - aggregation handling improvements for some drivers
      - minstrel improvements for no-ack frames
      - deferred rate control for TXQs to improve reaction times
      - switch from round robin to virtual time-based airtime scheduler

   - add trace points:
      - tcp checksum errors
      - openvswitch - action execution, upcalls
      - socket errors via sk_error_report

  Device APIs:

   - devlink: add rate API for hierarchical control of max egress rate
     of virtual devices (VFs, SFs etc.)

   - don't require RCU read lock to be held around BPF hooks in NAPI
     context

   - page_pool: generic buffer recycling

  New hardware/drivers:

   - mobile:
      - iosm: PCIe Driver for Intel M.2 Modem
      - support for Qualcomm MSM8998 (ipa)

   - WiFi: Qualcomm QCN9074 and WCN6855 PCI devices

   - sparx5: Microchip SparX-5 family of Enterprise Ethernet switches

   - Mellanox BlueField Gigabit Ethernet (control NIC of the DPU)

   - NXP SJA1110 Automotive Ethernet 10-port switch

   - Qualcomm QCA8327 switch support (qca8k)

   - Mikrotik 10/25G NIC (atl1c)

  Driver changes:

   - ACPI support for some MDIO, MAC and PHY devices from Marvell and
     NXP (our first foray into MAC/PHY description via ACPI)

   - HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx

   - Mellanox/Nvidia NIC (mlx5)
      - NIC VF offload of L2 bridging
      - support IRQ distribution to Sub-functions

   - Marvell (prestera):
      - add flower and match all
      - devlink trap
      - link aggregation

   - Netronome (nfp): connection tracking offload

   - Intel 1GE (igc): add AF_XDP support

   - Marvell DPU (octeontx2): ingress ratelimit offload

   - Google vNIC (gve): new ring/descriptor format support

   - Qualcomm mobile (rmnet & ipa): inline checksum offload support

   - MediaTek WiFi (mt76)
      - mt7915 MSI support
      - mt7915 Tx status reporting
      - mt7915 thermal sensors support
      - mt7921 decapsulation offload
      - mt7921 enable runtime pm and deep sleep

   - Realtek WiFi (rtw88)
      - beacon filter support
      - Tx antenna path diversity support
      - firmware crash information via devcoredump

   - Qualcomm WiFi (wcn36xx)
      - Wake-on-WLAN support with magic packets and GTK rekeying

   - Micrel PHY (ksz886x/ksz8081): add cable test support"

* tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2168 commits)
  tcp: change ICSK_CA_PRIV_SIZE definition
  tcp_yeah: check struct yeah size at compile time
  gve: DQO: Fix off by one in gve_rx_dqo()
  stmmac: intel: set PCI_D3hot in suspend
  stmmac: intel: Enable PHY WOL option in EHL
  net: stmmac: option to enable PHY WOL with PMT enabled
  net: say "local" instead of "static" addresses in ndo_dflt_fdb_{add,del}
  net: use netdev_info in ndo_dflt_fdb_{add,del}
  ptp: Set lookup cookie when creating a PTP PPS source.
  net: sock: add trace for socket errors
  net: sock: introduce sk_error_report
  net: dsa: replay the local bridge FDB entries pointing to the bridge dev too
  net: dsa: ensure during dsa_fdb_offload_notify that dev_hold and dev_put are on the same dev
  net: dsa: include fdb entries pointing to bridge in the host fdb list
  net: dsa: include bridge addresses which are local in the host fdb list
  net: dsa: sync static FDB entries on foreign interfaces to hardware
  net: dsa: install the host MDB and FDB entries in the master's RX filter
  net: dsa: reference count the FDB addresses at the cross-chip notifier level
  net: dsa: introduce a separate cross-chip notifier type for host FDBs
  net: dsa: reference count the MDB entries at the cross-chip notifier level
  ...
2021-06-30 15:51:09 -07:00
Martynas Pumputis
e8b9eab992 net: retrieve netns cookie via getsocketopt
It's getting more common to run nested container environments for
testing cloud software. One of such examples is Kind [1] which runs a
Kubernetes cluster in Docker containers on a single host. Each container
acts as a Kubernetes node, and thus can run any Pod (aka container)
inside the former. This approach simplifies testing a lot, as it
eliminates complicated VM setups.

Unfortunately, such a setup breaks some functionality when cgroupv2 BPF
programs are used for load-balancing. The load-balancer BPF program
needs to detect whether a request originates from the host netns or a
container netns in order to allow some access, e.g. to a service via a
loopback IP address. Typically, the programs detect this by comparing
netns cookies with the one of the init ns via a call to
bpf_get_netns_cookie(NULL). However, in nested environments the latter
cannot be used given the Kubernetes node's netns is outside the init ns.
To fix this, we need to pass the Kubernetes node netns cookie to the
program in a different way: by extending getsockopt() with a
SO_NETNS_COOKIE option, the orchestrator which runs in the Kubernetes
node netns can retrieve the cookie and pass it to the program instead.

Thus, this is following up on Eric's commit 3d368ab87c ("net:
initialize net->net_cookie at netns setup") to allow retrieval via
SO_NETNS_COOKIE.  This is also in line in how we retrieve socket cookie
via SO_COOKIE.

  [1] https://kind.sigs.k8s.io/

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Martynas Pumputis <m@lambda.lt>
Cc: Eric Dumazet <edumazet@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 11:13:05 -07:00
Mark Rutland
329c161b8b locking/atomic: parisc: move to ARCH_ATOMIC
We'd like all architectures to convert to ARCH_ATOMIC, as once all
architectures are converted it will be possible to make significant
cleanups to the atomics headers, and this will make it much easier to
generically enable atomic functionality (e.g. debug logic in the
instrumented wrappers).

As a step towards that, this patch migrates parisc to ARCH_ATOMIC. The
arch code provides arch_{atomic,atomic64,xchg,cmpxchg}*(), and common
code wraps these with optional instrumentation to provide the regular
functions.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210525140232.53872-27-mark.rutland@arm.com
2021-05-26 13:20:51 +02:00
Mark Rutland
6988631bdf locking/atomic: cmpxchg: make generic a prefix
The asm-generic implementations of cmpxchg_local() and cmpxchg64_local()
use a `_generic` suffix to distinguish themselves from arch code or
wrappers used elsewhere.

Subsequent patches will add ARCH_ATOMIC support to these
implementations, and will distinguish more functions with a `generic`
portion. To align with how ARCH_ATOMIC uses an `arch_` prefix, it would
be helpful to use a `generic_` prefix rather than a `_generic` suffix.

In preparation for this, this patch renames the existing functions to
make `generic` a prefix rather than a suffix. There should be no
functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210525140232.53872-12-mark.rutland@arm.com
2021-05-26 13:20:50 +02:00
Arnd Bergmann
637be9183e asm-generic: use asm-generic/unaligned.h for most architectures
There are several architectures that just duplicate the contents
of asm-generic/unaligned.h, so change those over to use the
file directly, to make future modifications easier.

The exceptions are:

- arm32 sets HAVE_EFFICIENT_UNALIGNED_ACCESS, but wants the
  unaligned-struct version

- ppc64le disables HAVE_EFFICIENT_UNALIGNED_ACCESS but includes
  the access-ok version

- most m68k also uses the access-ok version without setting
  HAVE_EFFICIENT_UNALIGNED_ACCESS.

- sh4a has a custom inline asm version

- openrisc is the only one using the memmove version that
  generally leads to worse code.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
2021-05-10 17:43:15 +02:00
Linus Torvalds
a48b0872e6 Merge branch 'akpm' (patches from Andrew)
Merge yet more updates from Andrew Morton:
 "This is everything else from -mm for this merge window.

  90 patches.

  Subsystems affected by this patch series: mm (cleanups and slub),
  alpha, procfs, sysctl, misc, core-kernel, bitmap, lib, compat,
  checkpatch, epoll, isofs, nilfs2, hpfs, exit, fork, kexec, gcov,
  panic, delayacct, gdb, resource, selftests, async, initramfs, ipc,
  drivers/char, and spelling"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (90 commits)
  mm: fix typos in comments
  mm: fix typos in comments
  treewide: remove editor modelines and cruft
  ipc/sem.c: spelling fix
  fs: fat: fix spelling typo of values
  kernel/sys.c: fix typo
  kernel/up.c: fix typo
  kernel/user_namespace.c: fix typos
  kernel/umh.c: fix some spelling mistakes
  include/linux/pgtable.h: few spelling fixes
  mm/slab.c: fix spelling mistake "disired" -> "desired"
  scripts/spelling.txt: add "overflw"
  scripts/spelling.txt: Add "diabled" typo
  scripts/spelling.txt: add "overlfow"
  arm: print alloc free paths for address in registers
  mm/vmalloc: remove vwrite()
  mm: remove xlate_dev_kmem_ptr()
  drivers/char: remove /dev/kmem for good
  mm: fix some typos and code style problems
  ipc/sem.c: mundane typo fixes
  ...
2021-05-07 00:34:51 -07:00
Masahiro Yamada
fa60ce2cb4 treewide: remove editor modelines and cruft
The section "19) Editor modelines and other cruft" in
Documentation/process/coding-style.rst clearly says, "Do not include any
of these in source files."

I recently receive a patch to explicitly add a new one.

Let's do treewide cleanups, otherwise some people follow the existing code
and attempt to upstream their favoriate editor setups.

It is even nicer if scripts/checkpatch.pl can check it.

If we like to impose coding style in an editor-independent manner, I think
editorconfig (patch [1]) is a saner solution.

[1] https://lore.kernel.org/lkml/20200703073143.423557-1-danny@kdrag0n.dev/

Link: https://lkml.kernel.org/r/20210324054457.1477489-1-masahiroy@kernel.org
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>	[auxdisplay]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-07 00:26:34 -07:00
David Hildenbrand
f2e762bab9 mm: remove xlate_dev_kmem_ptr()
Since /dev/kmem has been removed, let's remove the xlate_dev_kmem_ptr()
leftovers.

Link: https://lkml.kernel.org/r/20210324102351.6932-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Palmer Dabbelt <palmerdabbelt@google.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Niklas Schnelle <schnelle@linux.ibm.com>
Cc: Pierre Morel <pmorel@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-07 00:26:34 -07:00
Masahiro Yamada
df86ddbb91 parisc: syscalls: switch to generic syscalltbl.sh
Many architectures duplicate similar shell scripts.

This commit converts parisc to use scripts/syscalltbl.sh. This also
unifies syscall_table_64.h and syscall_table_c32.h.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Helge Deller <deller@gmx.de>
2021-04-27 09:43:56 +02:00
Gao Xiang
4d752e5af6 parisc: avoid a warning on u8 cast for cmpxchg on u8 pointers
commit b344d6a83d ("parisc: add support for cmpxchg on u8 pointers")
can generate a sparse warning ("cast truncates bits from constant
value"), which has been reported several times [1] [2] [3].

The original code worked as expected, but anyway, let silence such
sparse warning as what others did [4].

[1] https://lore.kernel.org/r/202104061220.nRMBwCXw-lkp@intel.com
[2] https://lore.kernel.org/r/202012291914.T5Agcn99-lkp@intel.com
[3] https://lore.kernel.org/r/202008210829.KVwn7Xeh%25lkp@intel.com
[4] https://lore.kernel.org/r/20210315131512.133720-2-jacopo+renesas@jmondi.org
Cc: Liam Beguin <liambeguin@gmail.com>
Cc: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v5.8+
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2021-04-06 14:32:12 +02:00
Wan Jiabing
4033e76a7a parisc: Remove duplicate struct task_struct declaration
struct task_struct is declared twice. One has been declared
at 154th line. Remove the duplicate.

Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2021-04-06 11:46:26 +02:00
Linus Torvalds
29c395c77a Rework of the X86 irq stack handling:
The irq stack switching was moved out of the ASM entry code in course of
   the entry code consolidation. It ended up being suboptimal in various
   ways.
 
   - Make the stack switching inline so the stackpointer manipulation is not
     longer at an easy to find place.
 
   - Get rid of the unnecessary indirect call.
 
   - Avoid the double stack switching in interrupt return and reuse the
     interrupt stack for softirq handling.
 
   - A objtool fix for CONFIG_FRAME_POINTER=y builds where it got confused
     about the stack pointer manipulation.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmA21OcTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoaX0D/9S0ud6oqbsIvI8LwhvYub63a2cjKP9
 liHAJ7xwMYYVwzf0skwsPb/QE6+onCzdq0upJkgG/gEYm2KbiaMWZ4GgHdj0O7ER
 qXKJONDd36AGxSEdaVzLY5kPuD/mkomGk5QdaZaTmjruthkNzg4y/N2wXUBIMZR0
 FdpSpp5fGspSZCn/DXDx6FjClwpLI53VclvDs6DcZ2DIBA0K+F/cSLb1UQoDLE1U
 hxGeuNa+GhKeeZ5C+q5giho1+ukbwtjMW9WnKHAVNiStjm0uzdqq7ERGi/REvkcB
 LY62u5uOSW1zIBMmzUjDDQEqvypB0iFxFCpN8g9sieZjA0zkaUioRTQyR+YIQ8Cp
 l8LLir0dVQivR1bHghHDKQJUpdw/4zvDj4mMH10XHqbcOtIxJDOJHC5D00ridsAz
 OK0RlbAJBl9FTdLNfdVReBCoehYAO8oefeyMAG12nZeSh5XVUWl238rvzmzIYNhG
 cEtkSx2wIUNEA+uSuI+xvfmwpxL7voTGvqmiRDCAFxyO7Bl/GBu9OEBFA1eOvHB+
 +wTmPDMswRetQNh4QCRXzk1JzP1Wk5CobUL9iinCWFoTJmnsPPSOWlosN6ewaNXt
 kYFpRLy5xt9EP7dlfgBSjiRlthDhTdMrFjD5bsy1vdm1w7HKUo82lHa4O8Hq3PHS
 tinKICUqRsbjig==
 =Sqr1
 -----END PGP SIGNATURE-----

Merge tag 'x86-entry-2021-02-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 irq entry updates from Thomas Gleixner:
 "The irq stack switching was moved out of the ASM entry code in course
  of the entry code consolidation. It ended up being suboptimal in
  various ways.

  This reworks the X86 irq stack handling:

   - Make the stack switching inline so the stackpointer manipulation is
     not longer at an easy to find place.

   - Get rid of the unnecessary indirect call.

   - Avoid the double stack switching in interrupt return and reuse the
     interrupt stack for softirq handling.

   - A objtool fix for CONFIG_FRAME_POINTER=y builds where it got
     confused about the stack pointer manipulation"

* tag 'x86-entry-2021-02-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Fix stack-swizzle for FRAME_POINTER=y
  um: Enforce the usage of asm-generic/softirq_stack.h
  x86/softirq/64: Inline do_softirq_own_stack()
  softirq: Move do_softirq_own_stack() to generic asm header
  softirq: Move __ARCH_HAS_DO_SOFTIRQ to Kconfig
  x86: Select CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
  x86/softirq: Remove indirection in do_softirq_own_stack()
  x86/entry: Use run_sysvec_on_irqstack_cond() for XEN upcall
  x86/entry: Convert device interrupts to inline stack switching
  x86/entry: Convert system vectors to irq stack macro
  x86/irq: Provide macro for inlining irq stack switching
  x86/apic: Split out spurious handling code
  x86/irq/64: Adjust the per CPU irq stack pointer by 8
  x86/irq: Sanitize irq stack tracking
  x86/entry: Fix instrumentation annotation
2021-02-24 16:32:23 -08:00
Helge Deller
b7795074a0 parisc: Optimize per-pagetable spinlocks
On parisc a spinlock is stored in the next page behind the pgd which
protects against parallel accesses to the pgd. That's why one additional
page (PGD_ALLOC_ORDER) is allocated for the pgd.

Matthew Wilcox suggested that we instead should use a pointer in the
struct page table for this spinlock and noted, that the comments for the
PGD_ORDER and PMD_ORDER defines were wrong.

Both suggestions are addressed with this patch. Instead of having an own
spinlock to protect the pgd, we now switch to use the existing
page_table_lock.  Additionally, beside loading the pgd into cr25 in
switch_mm_irqs_off(), the physical address of this lock is loaded into
cr28 (tr4), so that we can avoid implementing a complicated lookup in
assembly for this lock in the TLB fault handlers.

The existing Hybrid L2/L3 page table scheme (where the pmd is adjacent
to the pgd) has been dropped with this patch.

Remove the locking in set_pte() and the huge-page pte functions too.
They trigger a spinlock recursion on 32bit machines and seem unnecessary.

Suggested-by: Matthew Wilcox <willy@infradead.org>
Fixes: b37d1c1898 ("parisc: Use per-pagetable spinlock")
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2021-02-12 16:39:42 +01:00
Tiezhu Yang
ae3c4761c1 parisc: Replace test_ti_thread_flag() with test_tsk_thread_flag()
Use test_tsk_thread_flag() directly instead of test_ti_thread_flag() to
improve readability when the argument type is struct task_struct, it is
similar with commit 5afc78551b ("arm64: Use test_tsk_thread_flag() for
checking TIF_SINGLESTEP").

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Helge Deller <deller@gmx.de>
2021-02-12 16:33:05 +01:00
Helge Deller
61c439439c parisc: Use the generic devmem_is_allowed()
Signed-off-by: Helge Deller <deller@gmx.de>
2021-02-12 16:30:14 +01:00
Ingo Molnar
a3251c1a36 Merge branch 'x86/paravirt' into x86/entry
Merge in the recent paravirt changes to resolve conflicts caused
by objtool annotations.

Conflicts:
	arch/x86/xen/xen-asm.S

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2021-02-12 13:36:43 +01:00
Thomas Gleixner
cd1a41ceba softirq: Move __ARCH_HAS_DO_SOFTIRQ to Kconfig
To prepare for inlining do_softirq_own_stack() replace
__ARCH_HAS_DO_SOFTIRQ with a Kconfig switch and select it in the affected
architectures.

This allows in the next step to move the function prototype and the inline
stub into a seperate asm-generic header file which is required to avoid
include recursion.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210210002513.181713427@linutronix.de
2021-02-10 23:34:16 +01:00
Davidlohr Bueso
0aa91f84b1 parisc: Remove leftover reference to the power_tasklet
This was removed long ago, back in:

     6e16d9409e ([PARISC] Convert soft power switch driver to kthread)

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Helge Deller <deller@gmx.de>
2021-01-26 19:57:26 +01:00
Randy Dunlap
87dbc209ea local64.h: make <asm/local64.h> mandatory
Make <asm-generic/local64.h> mandatory in include/asm-generic/Kbuild and
remove all arch/*/include/asm/local64.h arch-specific files since they
only #include <asm-generic/local64.h>.

This fixes build errors on arch/c6x/ and arch/nios2/ for
block/blk-iocost.c.

Build-tested on 21 of 25 arch-es.  (tools problems on the others)

Yes, we could even rename <asm-generic/local64.h> to
<linux/local64.h> and change all #includes to use
<linux/local64.h> instead.

Link: https://lkml.kernel.org/r/20201227024446.17018-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <jacquiot.aurelien@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-29 15:36:49 -08:00
Linus Torvalds
005b2a9dc8 tif-task_work.arch-2020-12-14
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl/YJxsQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpjpyEACBdW+YjenjTbkUPeEXzQgkBkTZUYw3g007
 DPcUT1g8PQZXYXlQvBKCvGhhIr7/KVcjepKoowiNQfBNGcIPJTVopW58nzpqAfTQ
 goI2WYGn5EKFFKBPvtH04cJD/Wo8muXdxynKtqyZbnGGgZjQxPrE259b8dpHjBSR
 6L7HHkk0D1oU/5b6h6Ocpg9mc/0iIUCZylySAYY3eGO0JaVPJaXgZSJZYgHxCHll
 Lb+/y/fXdtm/0PmQ3ko0ev54g3yEWqZIX0NsZW1asrButIy+KLzQ2Mz1xFLFDMag
 prtIfwb8tzgc4dFPY090C/azjCh5CPpxqYS6FkRwS0p86n6OhkyXrqfily5Hs4/B
 NC7CBPBSH/j+NKUK7CYZcpTzTpxPjUr9p0anUdlvMJz8FhTb/3YEEZ1UTeWOeHmk
 Yo5SxnFghLeZZeZ1ok6rdymnVa7WEX12SCLGQX31BB2mld0tNbKb4b+FsBF6OUMk
 IUaX6OjwDFVRaysC88BQ4hjcIP1HxsViG4/VZDX15gjAAH2Pvb+7tev+lcDcOhjz
 TCD4GNFspTFzRhh9nT7oxQ679qCh9G9zHbzuIRewnrS6iqvo5SJQB3dR2yrWZRRH
 ySkQFiHpYOlnLJYv0jg9COlGwo2FUdcvKhCvkjQKKBz48rzW/IC0LwKdRQWZDFk3
 FKGzP/NBig==
 =cadT
 -----END PGP SIGNATURE-----

Merge tag 'tif-task_work.arch-2020-12-14' of git://git.kernel.dk/linux-block

Pull TIF_NOTIFY_SIGNAL updates from Jens Axboe:
 "This sits on top of of the core entry/exit and x86 entry branch from
  the tip tree, which contains the generic and x86 parts of this work.

  Here we convert the rest of the archs to support TIF_NOTIFY_SIGNAL.

  With that done, we can get rid of JOBCTL_TASK_WORK from task_work and
  signal.c, and also remove a deadlock work-around in io_uring around
  knowing that signal based task_work waking is invoked with the sighand
  wait queue head lock.

  The motivation for this work is to decouple signal notify based
  task_work, of which io_uring is a heavy user of, from sighand. The
  sighand lock becomes a huge contention point, particularly for
  threaded workloads where it's shared between threads. Even outside of
  threaded applications it's slower than it needs to be.

  Roman Gershman <romger@amazon.com> reported that his networked
  workload dropped from 1.6M QPS at 80% CPU to 1.0M QPS at 100% CPU
  after io_uring was changed to use TIF_NOTIFY_SIGNAL. The time was all
  spent hammering on the sighand lock, showing 57% of the CPU time there
  [1].

  There are further cleanups possible on top of this. One example is
  TIF_PATCH_PENDING, where a patch already exists to use
  TIF_NOTIFY_SIGNAL instead. Hopefully this will also lead to more
  consolidation, but the work stands on its own as well"

[1] https://github.com/axboe/liburing/issues/215

* tag 'tif-task_work.arch-2020-12-14' of git://git.kernel.dk/linux-block: (28 commits)
  io_uring: remove 'twa_signal_ok' deadlock work-around
  kernel: remove checking for TIF_NOTIFY_SIGNAL
  signal: kill JOBCTL_TASK_WORK
  io_uring: JOBCTL_TASK_WORK is no longer used by task_work
  task_work: remove legacy TWA_SIGNAL path
  sparc: add support for TIF_NOTIFY_SIGNAL
  riscv: add support for TIF_NOTIFY_SIGNAL
  nds32: add support for TIF_NOTIFY_SIGNAL
  ia64: add support for TIF_NOTIFY_SIGNAL
  h8300: add support for TIF_NOTIFY_SIGNAL
  c6x: add support for TIF_NOTIFY_SIGNAL
  alpha: add support for TIF_NOTIFY_SIGNAL
  xtensa: add support for TIF_NOTIFY_SIGNAL
  arm: add support for TIF_NOTIFY_SIGNAL
  microblaze: add support for TIF_NOTIFY_SIGNAL
  hexagon: add support for TIF_NOTIFY_SIGNAL
  csky: add support for TIF_NOTIFY_SIGNAL
  openrisc: add support for TIF_NOTIFY_SIGNAL
  sh: add support for TIF_NOTIFY_SIGNAL
  um: add support for TIF_NOTIFY_SIGNAL
  ...
2020-12-16 12:33:35 -08:00
Linus Torvalds
5ee863bec7 Merge branch 'parisc-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
 "A change to increase the default maximum stack size on parisc to 100MB
  and the ability to further increase the stack hard limit size at
  runtime with ulimit for newly started processes.

  The other patches fix compile warnings, utilize the Kbuild logic and
  cleanups the parisc arch code"

* 'parisc-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: pci-dma: fix warning unused-function
  parisc/uapi: Use Kbuild logic to provide <asm/types.h>
  parisc: Make user stack size configurable
  parisc: Use _TIF_USER_WORK_MASK in entry.S
  parisc: Drop loops_per_jiffy from per_cpu struct
2020-12-16 12:10:40 -08:00
Linus Torvalds
e994cc240a seccomp updates for v5.11-rc1
- Improve seccomp performance via constant-action bitmaps (YiFei Zhu & Kees Cook)
 
 - Fix bogus __user annotations (Jann Horn)
 
 - Add missed CONFIG for improved selftest coverage (Mickaël Salaün)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAl/ZG5IACgkQiXL039xt
 wCbhuw/+P77jwT/p1DRnKp5vG7TXTqqXrdhQZYNyBUxRaKSGCEMydvJn/h3KscyW
 4eEy9vZKTAhIQg5oI5OXZ9jxzFdpxEg8lMPSKReNEga3d0//H9gOJHYc782D/bf1
 +6x6I4qWv+LMM/52P60gznBH+3WFVtyM5Jw+LF5igOCEVSERoZ3ChsmdSZgkALG0
 DJXKL+Dy1Wj9ESeBtuh1UsKoh4ADTAoPC+LvfGuxn2T+VtnxX/sOSDkkrpHfX+2J
 UKkIgWJHeNmq74nwWjpNuDz24ARTiVWOVQX01nOHRohtu39TZcpU774Pdp4Dsj2W
 oDDwOzIWp4/27aQxkOKv6NXMwd29XbrpH1gweyuvQh9cohSbzx6qZlXujqyd9izs
 6Nh74mvC3cns6sQWSWz5ddU4dMQ4rNjpD2CK1P8A7ZVTfH+5baaPmF8CRp126E6f
 /MAUk7Rfbe6YfYdfMwhXXhTvus0e5yenGFXr46gasJDfGnyy4cLS/MO7AZ+mR0CB
 d9DnrsIJVggL5cZ2LZmivIng18JWnbkgnenmHSXahdLstmYVkdpo4ckBl1G/dXK0
 lDmi9j9FoTxB6OrztEKA0RZB+C1e6q7X7euwsHjgF9XKgD5S+DdeYwqd2lypjyvb
 d9VNLFdngD0CRY7wcJZKRma+yPemlPNurdMjF9LrqaAu232G1UA=
 =jJwG
 -----END PGP SIGNATURE-----

Merge tag 'seccomp-v5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull seccomp updates from Kees Cook:
 "The major change here is finally gaining seccomp constant-action
  bitmaps, which internally reduces the seccomp overhead for many
  real-world syscall filters to O(1), as discussed at Plumbers this
  year.

   - Improve seccomp performance via constant-action bitmaps (YiFei Zhu
     & Kees Cook)

   - Fix bogus __user annotations (Jann Horn)

   - Add missed CONFIG for improved selftest coverage (Mickaël Salaün)"

* tag 'seccomp-v5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  selftests/seccomp: Update kernel config
  seccomp: Remove bogus __user annotations
  seccomp/cache: Report cache data through /proc/pid/seccomp_cache
  xtensa: Enable seccomp architecture tracking
  sh: Enable seccomp architecture tracking
  s390: Enable seccomp architecture tracking
  riscv: Enable seccomp architecture tracking
  powerpc: Enable seccomp architecture tracking
  parisc: Enable seccomp architecture tracking
  csky: Enable seccomp architecture tracking
  arm: Enable seccomp architecture tracking
  arm64: Enable seccomp architecture tracking
  selftests/seccomp: Compare bitmap vs filter overhead
  x86: Enable seccomp architecture tracking
  seccomp/cache: Add "emulator" to check if filter is constant allow
  seccomp/cache: Lookup syscall allowlist bitmap for fast path
2020-12-16 11:30:10 -08:00
Linus Torvalds
157807123c asm-generic: mmu-context cleanup
This is a cleanup series from Nicholas Piggin, preparing for
 later changes. The asm/mmu_context.h header are generalized
 and common code moved to asm-gneneric/mmu_context.h.
 
 This saves a bit of code and makes it easier to change in
 the future.
 
 Signed-off-by: Arnd Bergmann <arnd@arndb.de>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAl/Y1LsACgkQmmx57+YA
 GNm6kBAAq4/n6nuNnh6b9LhjXaZRG75gEyW7JvHl8KE5wmZHwDHqbwiQgU1b3lUs
 JJGbfKqi5ASKxNg6MpfYodmCOqeTUUYG0FUCb6lMhcxxMdfLTLYBvkNd6Y143M+T
 boi5b/iz+OUQdNPzlVeSsUEVsD59FIXmP/GhscWZN9VAyf/aLV2MDBIOhrDSJlPo
 ObexnP0Iw1E1NRQYDQ6L2dKTHa6XmHyUtw40ABPmd/6MSd1S+D+j3FGg+CYmvnzG
 k9g8FbNby8xtUfc0pZV4W/322WN8cDFF9bc04eTDZiAv1bk9lmfvWJ2bWjs3s2qt
 RO/suiZEOAta/WUX9vVLgYn2td00ef+AyjNUgffiUfvQfl++fiCDFTGl+MoCLjbh
 xQUPcRuRdED7bMKNrC0CcDOSwWEBWVXvkU/szBLDeE1sPjXzGQ80q1Y72k9y961I
 mqg7FrHqjZsxT9luXMAzClHNhXAtvehkJZBIdHlFok83EFoTQp48Da4jaDuOOhlq
 p/lkPJWOHegIQMWtGwRyGmG1qzil7b/QBNAPLgu9pF4TA+ySRBEB2BOr2jRSkj6N
 mNTHQbSYxBoktdt+VhtrSsxR+i8lwlegx+RNRFmKK3VH5da2nfiBaOY7zBQQHxCK
 yxQvXvsljSVpfkFKLc/S2nLQL1zTkRfFKV1Xmd3+3owR+EoqM60=
 =NpMX
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-mmu-context-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic mmu-context cleanup from Arnd Bergmann:
 "This is a cleanup series from Nicholas Piggin, preparing for later
  changes. The asm/mmu_context.h header are generalized and common code
  moved to asm-gneneric/mmu_context.h.

  This saves a bit of code and makes it easier to change in the future"

* tag 'asm-generic-mmu-context-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (25 commits)
  h8300: Fix generic mmu_context build
  m68k: mmu_context: Fix Sun-3 build
  xtensa: use asm-generic/mmu_context.h for no-op implementations
  x86: use asm-generic/mmu_context.h for no-op implementations
  um: use asm-generic/mmu_context.h for no-op implementations
  sparc: use asm-generic/mmu_context.h for no-op implementations
  sh: use asm-generic/mmu_context.h for no-op implementations
  s390: use asm-generic/mmu_context.h for no-op implementations
  riscv: use asm-generic/mmu_context.h for no-op implementations
  powerpc: use asm-generic/mmu_context.h for no-op implementations
  parisc: use asm-generic/mmu_context.h for no-op implementations
  openrisc: use asm-generic/mmu_context.h for no-op implementations
  nios2: use asm-generic/mmu_context.h for no-op implementations
  nds32: use asm-generic/mmu_context.h for no-op implementations
  mips: use asm-generic/mmu_context.h for no-op implementations
  microblaze: use asm-generic/mmu_context.h for no-op implementations
  m68k: use asm-generic/mmu_context.h for no-op implementations
  ia64: use asm-generic/mmu_context.h for no-op implementations
  hexagon: use asm-generic/mmu_context.h for no-op implementations
  csky: use asm-generic/mmu_context.h for no-op implementations
  ...
2020-12-15 23:58:04 -08:00
Linus Torvalds
6febd8bef3 Merge branch 'signal-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull signal cleanup from Eric Biederman:
 "Remove a never used HP-UX compatibility from parisc headers and
  consolidating the SA_* flags definitions into a generic header as much
  as possible.

  We only have 32 SA_* flag bits total, so we need to be careful. But as
  this is the first addition in a decade or so I think we are fine for
  the forseeable future"

* 'signal-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  signal/parisc: Remove parisc specific definition of __ARCH_UAPI_SA_FLAGS
2020-12-15 19:14:27 -08:00
Linus Torvalds
2cffa11e2a Generic interrupt and irqchips subsystem:
Core:
 
      - Consolidation and robustness changes for irq time accounting
 
      - Cleanup and consolidation of irq stats
 
      - Remove the fasteoi IPI flow which has been proved useless
 
      - Provide an interface for converting legacy interrupt mechanism into
        irqdomains
 
  Drivers:
 
      The rare event of not having completely new chip driver code, just new
      DT bindings and extensions of existing drivers to accomodate new
      variants!
 
      - Preliminary support for managed interrupts on platform devices
 
      - Correctly identify allocation of MSIs proxyied by another device
 
      - Generalise the Ocelot support to new SoCs
 
      - Improve GICv4.1 vcpu entry, matching the corresponding KVM optimisation
 
      - Work around spurious interrupts on Qualcomm PDC
 
      - Random fixes and cleanups
 
 Thanks,
 
 	tglx
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/YwZgTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoW4CD/90rTi1OQrMe3nb5okVjUZmktz/K3BN
 Cl5+evFiXiNoH+yJSMIVP+8eMAtBH6RgoaD0EUtSYmgzb9h/JRRQYwtPxobXcMb2
 2xcWyLPJkVJL431JKNM8BBRYjLA2VnQ6Ia+Kx3BxqpgKXn5+cEMh1dwIy27Ll2rj
 +2NHAQe1sHL7o/KcCDhYqbVIDjw5K/d7YPwjEuPeEoNv1DOxrOCdCEfgFN0jBtRE
 CoaRTBskeAaHIzHNp47Mxyz43g4tA/D8kB68X0OjpEykVkPUbgNK1FHSwaPbIsFT
 FTSPU3zg8Q6DZ+RGyjNJykIFgUbirlJxARk2c6Ct8Kc3DN6K1jQt4EsU7CXRCc98
 BTBjUNeFeNj3irZ4GHhyMKOQJCA1Z5nCRfBUGiW6gK8183us3BLfH5DM1zEsAYUh
 DCp+UKsLuXhbB80EWq7kl82/2mNGZ8En8EerE6XJA7Z3JN8FplOHEuLezYYzwzbb
 RIes971Vc50J2u2Wf/M2c3PDz3D/4FzfwUeA4LJfTnmOL09RYZ8CsqSckpx4ku/F
 XiBnjwtGEpDXWJ8z13DC7yONrxFGByV19+sqHTBlub5DmIs0gXjhC0dKAPAruUIS
 iCC+Vx6xLgOpTDu8shFsjibbi9Hb6vuZrF2Te+WR5Rf7d80C0J4b5K5PS4daUjr6
 IuD2tz+3CtPjHw==
 =iytv
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2020-12-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq updates from Thomas Gleixner:
 "Generic interrupt and irqchips subsystem updates. Unusually, there is
  not a single completely new irq chip driver, just new DT bindings and
  extensions of existing drivers to accomodate new variants!

  Core:

   - Consolidation and robustness changes for irq time accounting

   - Cleanup and consolidation of irq stats

   - Remove the fasteoi IPI flow which has been proved useless

   - Provide an interface for converting legacy interrupt mechanism into
     irqdomains

  Drivers:

   - Preliminary support for managed interrupts on platform devices

   - Correctly identify allocation of MSIs proxyied by another device

   - Generalise the Ocelot support to new SoCs

   - Improve GICv4.1 vcpu entry, matching the corresponding KVM
     optimisation

   - Work around spurious interrupts on Qualcomm PDC

   - Random fixes and cleanups"

* tag 'irq-core-2020-12-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
  irqchip/qcom-pdc: Fix phantom irq when changing between rising/falling
  driver core: platform: Add devm_platform_get_irqs_affinity()
  ACPI: Drop acpi_dev_irqresource_disabled()
  resource: Add irqresource_disabled()
  genirq/affinity: Add irq_update_affinity_desc()
  irqchip/gic-v3-its: Flag device allocation as proxied if behind a PCI bridge
  irqchip/gic-v3-its: Tag ITS device as shared if allocating for a proxy device
  platform-msi: Track shared domain allocation
  irqchip/ti-sci-intr: Fix freeing of irqs
  irqchip/ti-sci-inta: Fix printing of inta id on probe success
  drivers/irqchip: Remove EZChip NPS interrupt controller
  Revert "genirq: Add fasteoi IPI flow"
  irqchip/hip04: Make IPIs use handle_percpu_devid_irq()
  irqchip/bcm2836: Make IPIs use handle_percpu_devid_irq()
  irqchip/armada-370-xp: Make IPIs use handle_percpu_devid_irq()
  irqchip/gic, gic-v3: Make SGIs use handle_percpu_devid_irq()
  irqchip/ocelot: Add support for Jaguar2 platforms
  irqchip/ocelot: Add support for Serval platforms
  irqchip/ocelot: Add support for Luton platforms
  irqchip/ocelot: prepare to support more SoC
  ...
2020-12-15 15:03:31 -08:00
Linus Torvalds
d635a69dd4 Networking updates for 5.11
Core:
 
  - support "prefer busy polling" NAPI operation mode, where we defer softirq
    for some time expecting applications to periodically busy poll
 
  - AF_XDP: improve efficiency by more batching and hindering
            the adjacency cache prefetcher
 
  - af_packet: make packet_fanout.arr size configurable up to 64K
 
  - tcp: optimize TCP zero copy receive in presence of partial or unaligned
         reads making zero copy a performance win for much smaller messages
 
  - XDP: add bulk APIs for returning / freeing frames
 
  - sched: support fragmenting IP packets as they come out of conntrack
 
  - net: allow virtual netdevs to forward UDP L4 and fraglist GSO skbs
 
 BPF:
 
  - BPF switch from crude rlimit-based to memcg-based memory accounting
 
  - BPF type format information for kernel modules and related tracing
    enhancements
 
  - BPF implement task local storage for BPF LSM
 
  - allow the FENTRY/FEXIT/RAW_TP tracing programs to use bpf_sk_storage
 
 Protocols:
 
  - mptcp: improve multiple xmit streams support, memory accounting and
           many smaller improvements
 
  - TLS: support CHACHA20-POLY1305 cipher
 
  - seg6: add support for SRv6 End.DT4/DT6 behavior
 
  - sctp: Implement RFC 6951: UDP Encapsulation of SCTP
 
  - ppp_generic: add ability to bridge channels directly
 
  - bridge: Connectivity Fault Management (CFM) support as is defined in
            IEEE 802.1Q section 12.14.
 
 Drivers:
 
  - mlx5: make use of the new auxiliary bus to organize the driver internals
 
  - mlx5: more accurate port TX timestamping support
 
  - mlxsw:
    - improve the efficiency of offloaded next hop updates by using
      the new nexthop object API
    - support blackhole nexthops
    - support IEEE 802.1ad (Q-in-Q) bridging
 
  - rtw88: major bluetooth co-existance improvements
 
  - iwlwifi: support new 6 GHz frequency band
 
  - ath11k: Fast Initial Link Setup (FILS)
 
  - mt7915: dual band concurrent (DBDC) support
 
  - net: ipa: add basic support for IPA v4.5
 
 Refactor:
 
  - a few pieces of in_interrupt() cleanup work from Sebastian Andrzej Siewior
 
  - phy: add support for shared interrupts; get rid of multiple driver
         APIs and have the drivers write a full IRQ handler, slight growth
 	of driver code should be compensated by the simpler API which
 	also allows shared IRQs
 
  - add common code for handling netdev per-cpu counters
 
  - move TX packet re-allocation from Ethernet switch tag drivers to
    a central place
 
  - improve efficiency and rename nla_strlcpy
 
  - number of W=1 warning cleanups as we now catch those in a patchwork
    build bot
 
 Old code removal:
 
  - wan: delete the DLCI / SDLA drivers
 
  - wimax: move to staging
 
  - wifi: remove old WDS wifi bridging support
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAl/YXmUACgkQMUZtbf5S
 IrvSQBAAgOrt4EFopEvVqlTHZbqI45IEqgtXS+YWmlgnjZCgshyMj8q1yK1zzane
 qYxr/NNJ9kV3FdtaynmmHPgEEEfR5kJ/D3B2BsxYDkaDDrD0vbNsBGw+L+/Gbhxl
 N/5l/9FjLyLY1D+EErknuwR5XGuQ6BSDVaKQMhYOiK2hgdnAAI4hszo8Chf6wdD0
 XDBslQ7vpD/05r+eMj0IkS5dSAoGOIFXUxhJ5dqrDbRHiKsIyWqA3PLbYemfAhxI
 s2XckjfmSgGE3FKL8PSFu+EcfHbJQQjLcULJUnqgVcdwEEtRuE9ggEi52nZRXMWM
 4e8sQJAR9Fx7pZy0G1xfS149j6iPU5LjRlU9TNSpVABz14Vvvo3gEL6gyIdsz+xh
 hMN7UBdp0FEaP028CXoIYpaBesvQqj0BSndmee8qsYAtN6j+QKcM2AOSr7JN1uMH
 C/86EDoGAATiEQIVWJvnX5MPmlAoblyLA+RuVhmxkIBx2InGXkFmWqRkXT5l4jtk
 LVl8/TArR4alSQqLXictXCjYlCm9j5N4zFFtEVasSYi7/ZoPfgRNWT+lJ2R8Y+Zv
 +htzGaFuyj6RJTVeFQMrkl3whAtBamo2a0kwg45NnxmmXcspN6kJX1WOIy82+MhD
 Yht7uplSs7MGKA78q/CDU0XBeGjpABUvmplUQBIfrR/jKLW2730=
 =GXs1
 -----END PGP SIGNATURE-----

Merge tag 'net-next-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core:

   - support "prefer busy polling" NAPI operation mode, where we defer
     softirq for some time expecting applications to periodically busy
     poll

   - AF_XDP: improve efficiency by more batching and hindering the
     adjacency cache prefetcher

   - af_packet: make packet_fanout.arr size configurable up to 64K

   - tcp: optimize TCP zero copy receive in presence of partial or
     unaligned reads making zero copy a performance win for much smaller
     messages

   - XDP: add bulk APIs for returning / freeing frames

   - sched: support fragmenting IP packets as they come out of conntrack

   - net: allow virtual netdevs to forward UDP L4 and fraglist GSO skbs

  BPF:

   - BPF switch from crude rlimit-based to memcg-based memory accounting

   - BPF type format information for kernel modules and related tracing
     enhancements

   - BPF implement task local storage for BPF LSM

   - allow the FENTRY/FEXIT/RAW_TP tracing programs to use
     bpf_sk_storage

  Protocols:

   - mptcp: improve multiple xmit streams support, memory accounting and
     many smaller improvements

   - TLS: support CHACHA20-POLY1305 cipher

   - seg6: add support for SRv6 End.DT4/DT6 behavior

   - sctp: Implement RFC 6951: UDP Encapsulation of SCTP

   - ppp_generic: add ability to bridge channels directly

   - bridge: Connectivity Fault Management (CFM) support as is defined
     in IEEE 802.1Q section 12.14.

  Drivers:

   - mlx5: make use of the new auxiliary bus to organize the driver
     internals

   - mlx5: more accurate port TX timestamping support

   - mlxsw:
      - improve the efficiency of offloaded next hop updates by using
        the new nexthop object API
      - support blackhole nexthops
      - support IEEE 802.1ad (Q-in-Q) bridging

   - rtw88: major bluetooth co-existance improvements

   - iwlwifi: support new 6 GHz frequency band

   - ath11k: Fast Initial Link Setup (FILS)

   - mt7915: dual band concurrent (DBDC) support

   - net: ipa: add basic support for IPA v4.5

  Refactor:

   - a few pieces of in_interrupt() cleanup work from Sebastian Andrzej
     Siewior

   - phy: add support for shared interrupts; get rid of multiple driver
     APIs and have the drivers write a full IRQ handler, slight growth
     of driver code should be compensated by the simpler API which also
     allows shared IRQs

   - add common code for handling netdev per-cpu counters

   - move TX packet re-allocation from Ethernet switch tag drivers to a
     central place

   - improve efficiency and rename nla_strlcpy

   - number of W=1 warning cleanups as we now catch those in a patchwork
     build bot

  Old code removal:

   - wan: delete the DLCI / SDLA drivers

   - wimax: move to staging

   - wifi: remove old WDS wifi bridging support"

* tag 'net-next-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1922 commits)
  net: hns3: fix expression that is currently always true
  net: fix proc_fs init handling in af_packet and tls
  nfc: pn533: convert comma to semicolon
  af_vsock: Assign the vsock transport considering the vsock address flags
  af_vsock: Set VMADDR_FLAG_TO_HOST flag on the receive path
  vsock_addr: Check for supported flag values
  vm_sockets: Add VMADDR_FLAG_TO_HOST vsock flag
  vm_sockets: Add flags field in the vsock address data structure
  net: Disable NETIF_F_HW_TLS_TX when HW_CSUM is disabled
  tcp: Add logic to check for SYN w/ data in tcp_simple_retransmit
  net: mscc: ocelot: install MAC addresses in .ndo_set_rx_mode from process context
  nfc: s3fwrn5: Release the nfc firmware
  net: vxget: clean up sparse warnings
  mlxsw: spectrum_router: Use eXtended mezzanine to offload IPv4 router
  mlxsw: spectrum: Set KVH XLT cache mode for Spectrum2/3
  mlxsw: spectrum_router_xm: Introduce basic XM cache flushing
  mlxsw: reg: Add Router LPM Cache Enable Register
  mlxsw: reg: Add Router LPM Cache ML Delete Register
  mlxsw: spectrum_router_xm: Implement L-value tracking for M-index
  mlxsw: reg: Add XM Router M Table Register
  ...
2020-12-15 13:22:29 -08:00
Linus Torvalds
edd7ab7684 The new preemtible kmap_local() implementation:
- Consolidate all kmap_atomic() internals into a generic implementation
     which builds the base for the kmap_local() API and make the
     kmap_atomic() interface wrappers which handle the disabling/enabling of
     preemption and pagefaults.
 
   - Switch the storage from per-CPU to per task and provide scheduler
     support for clearing mapping when scheduling out and restoring them
     when scheduling back in.
 
   - Merge the migrate_disable/enable() code, which is also part of the
     scheduler pull request. This was required to make the kmap_local()
     interface available which does not disable preemption when a mapping
     is established. It has to disable migration instead to guarantee that
     the virtual address of the mapped slot is the same accross preemption.
 
   - Provide better debug facilities: guard pages and enforced utilization
     of the mapping mechanics on 64bit systems when the architecture allows
     it.
 
   - Provide the new kmap_local() API which can now be used to cleanup the
     kmap_atomic() usage sites all over the place. Most of the usage sites
     do not require the implicit disabling of preemption and pagefaults so
     the penalty on 64bit and 32bit non-highmem systems is removed and quite
     some of the code can be simplified. A wholesale conversion is not
     possible because some usage depends on the implicit side effects and
     some need to be cleaned up because they work around these side effects.
 
     The migrate disable side effect is only effective on highmem systems
     and when enforced debugging is enabled. On 64bit and 32bit non-highmem
     systems the overhead is completely avoided.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/XyQwTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoUolD/9+R+BX96fGir+I8rG9dc3cbLw5meSi
 0I/Nq3PToZMs2Iqv50DsoaPYHHz/M6fcAO9LRIgsE9jRbnY93GnsBM0wU9Y8yQaT
 4wUzOG5WHaLDfqIkx/CN9coUl458oEiwOEbn79A2FmPXFzr7IpkufnV3ybGDwzwP
 p73bjMJMPPFrsa9ig87YiYfV/5IAZHi82PN8Cq1v4yNzgXRP3Tg6QoAuCO84ZnWF
 RYlrfKjcJ2xPdn+RuYyXolPtxr1hJQ0bOUpe4xu/UfeZjxZ7i1wtwLN9kWZe8CKH
 +x4Lz8HZZ5QMTQ9sCHOLtKzu2MceMcpISzoQH4/aFQCNMgLn1zLbS790XkYiQCuR
 ne9Cua+IqgYfGMG8cq8+bkU9HCNKaXqIBgPEKE/iHYVmqzCOqhW5Cogu4KFekf6V
 Wi7pyyUdX2en8BAWpk5NHc8de9cGcc+HXMq2NIcgXjVWvPaqRP6DeITERTZLJOmz
 XPxq5oPLGl7wdm7z+ICIaNApy8zuxpzb6sPLNcn7l5OeorViORlUu08AN8587wAj
 FiVjp6ZYomg+gyMkiNkDqFOGDH5TMENpOFoB0hNNEyJwwS0xh6CgWuwZcv+N8aPO
 HuS/P+tNANbD8ggT4UparXYce7YCtgOf3IG4GA3JJYvYmJ6pU+AZOWRoDScWq4o+
 +jlfoJhMbtx5Gg==
 =n71I
 -----END PGP SIGNATURE-----

Merge tag 'core-mm-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull kmap updates from Thomas Gleixner:
 "The new preemtible kmap_local() implementation:

   - Consolidate all kmap_atomic() internals into a generic
     implementation which builds the base for the kmap_local() API and
     make the kmap_atomic() interface wrappers which handle the
     disabling/enabling of preemption and pagefaults.

   - Switch the storage from per-CPU to per task and provide scheduler
     support for clearing mapping when scheduling out and restoring them
     when scheduling back in.

   - Merge the migrate_disable/enable() code, which is also part of the
     scheduler pull request. This was required to make the kmap_local()
     interface available which does not disable preemption when a
     mapping is established. It has to disable migration instead to
     guarantee that the virtual address of the mapped slot is the same
     across preemption.

   - Provide better debug facilities: guard pages and enforced
     utilization of the mapping mechanics on 64bit systems when the
     architecture allows it.

   - Provide the new kmap_local() API which can now be used to cleanup
     the kmap_atomic() usage sites all over the place. Most of the usage
     sites do not require the implicit disabling of preemption and
     pagefaults so the penalty on 64bit and 32bit non-highmem systems is
     removed and quite some of the code can be simplified. A wholesale
     conversion is not possible because some usage depends on the
     implicit side effects and some need to be cleaned up because they
     work around these side effects.

     The migrate disable side effect is only effective on highmem
     systems and when enforced debugging is enabled. On 64bit and 32bit
     non-highmem systems the overhead is completely avoided"

* tag 'core-mm-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits)
  ARM: highmem: Fix cache_is_vivt() reference
  x86/crashdump/32: Simplify copy_oldmem_page()
  io-mapping: Provide iomap_local variant
  mm/highmem: Provide kmap_local*
  sched: highmem: Store local kmaps in task struct
  x86: Support kmap_local() forced debugging
  mm/highmem: Provide CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP
  mm/highmem: Provide and use CONFIG_DEBUG_KMAP_LOCAL
  microblaze/mm/highmem: Add dropped #ifdef back
  xtensa/mm/highmem: Make generic kmap_atomic() work correctly
  mm/highmem: Take kmap_high_get() properly into account
  highmem: High implementation details and document API
  Documentation/io-mapping: Remove outdated blurb
  io-mapping: Cleanup atomic iomap
  mm/highmem: Remove the old kmap_atomic cruft
  highmem: Get rid of kmap_types.h
  xtensa/mm/highmem: Switch to generic kmap atomic
  sparc/mm/highmem: Switch to generic kmap atomic
  powerpc/mm/highmem: Switch to generic kmap atomic
  nds32/mm/highmem: Switch to generic kmap atomic
  ...
2020-12-14 18:35:53 -08:00
Björn Töpel
7c951cafc0 net: Add SO_BUSY_POLL_BUDGET socket option
This option lets a user set a per socket NAPI budget for
busy-polling. If the options is not set, it will use the default of 8.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/bpf/20201130185205.196029-3-bjorn.topel@gmail.com
2020-12-01 00:09:25 +01:00
Björn Töpel
7fd3253a7d net: Introduce preferred busy-polling
The existing busy-polling mode, enabled by the SO_BUSY_POLL socket
option or system-wide using the /proc/sys/net/core/busy_read knob, is
an opportunistic. That means that if the NAPI context is not
scheduled, it will poll it. If, after busy-polling, the budget is
exceeded the busy-polling logic will schedule the NAPI onto the
regular softirq handling.

One implication of the behavior above is that a busy/heavy loaded NAPI
context will never enter/allow for busy-polling. Some applications
prefer that most NAPI processing would be done by busy-polling.

This series adds a new socket option, SO_PREFER_BUSY_POLL, that works
in concert with the napi_defer_hard_irqs and gro_flush_timeout
knobs. The napi_defer_hard_irqs and gro_flush_timeout knobs were
introduced in commit 6f8b12d661 ("net: napi: add hard irqs deferral
feature"), and allows for a user to defer interrupts to be enabled and
instead schedule the NAPI context from a watchdog timer. When a user
enables the SO_PREFER_BUSY_POLL, again with the other knobs enabled,
and the NAPI context is being processed by a softirq, the softirq NAPI
processing will exit early to allow the busy-polling to be performed.

If the application stops performing busy-polling via a system call,
the watchdog timer defined by gro_flush_timeout will timeout, and
regular softirq handling will resume.

In summary; Heavy traffic applications that prefer busy-polling over
softirq processing should use this option.

Example usage:

  $ echo 2 | sudo tee /sys/class/net/ens785f1/napi_defer_hard_irqs
  $ echo 200000 | sudo tee /sys/class/net/ens785f1/gro_flush_timeout

Note that the timeout should be larger than the userspace processing
window, otherwise the watchdog will timeout and fall back to regular
softirq processing.

Enable the SO_BUSY_POLL/SO_PREFER_BUSY_POLL options on your socket.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/bpf/20201130185205.196029-2-bjorn.topel@gmail.com
2020-12-01 00:09:25 +01:00
Eric W. Biederman
bc54a5cda9 signal/parisc: Remove parisc specific definition of __ARCH_UAPI_SA_FLAGS
Randy Dunlap wrote:
> On 11/27/20 10:43 AM, Randy Dunlap wrote:
>
> > on parisc, _SA_SIGGFAULT is undefined and causing build errors.
> >
> > commit 23acdc76f1
> > Author: Peter Collingbourne <pcc@google.com>
> > Date:   Thu Nov 12 18:53:34 2020 -0800
> >
> >     signal: clear non-uapi flag bits when passing/returning sa_flags
> >
> >
> >
> > _SA_SIGGFAULT is not used or defined anywhere else in the
> > kernel source tree.
>
>
> Here is the build error (although it should be obvious):
>
> ../kernel/signal.c: In function 'do_sigaction':
> ../arch/parisc/include/asm/signal.h:24:30: error: '_SA_SIGGFAULT' undeclared (first use in this function)
>    24 | #define __ARCH_UAPI_SA_FLAGS _SA_SIGGFAULT
>       |                              ^~~~~~~~~~~~~

Stephen Rothwell pointed out:
> _SA_SIGGFAULT was removed by commit
>
>   41f5a81c07 ("parisc: Drop HP-UX specific fcntl and signal flags")
>
> which was added to Linus' tree in v5.10-rc1.

Solve this by removing the the parisc specific definition of
__ARCH_UAPI_SA_FLAGS that was just added.

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Fixes: 23acdc76f1 ("signal: clear non-uapi flag bits when passing/returning sa_flags")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-11-30 13:43:35 -06:00
Peter Collingbourne
23acdc76f1 signal: clear non-uapi flag bits when passing/returning sa_flags
Previously we were not clearing non-uapi flag bits in
sigaction.sa_flags when storing the userspace-provided sa_flags or
when returning them via oldact. Start doing so.

This allows userspace to detect missing support for flag bits and
allows the kernel to use non-uapi bits internally, as we are already
doing in arch/x86 for two flag bits. Now that this change is in
place, we no longer need the code in arch/x86 that was hiding these
bits from userspace, so remove it.

This is technically a userspace-visible behavior change for sigaction, as
the unknown bits returned via oldact.sa_flags are no longer set. However,
we are free to define the behavior for unknown bits exactly because
their behavior is currently undefined, so for now we can define the
meaning of each of them to be "clear the bit in oldact.sa_flags unless
the bit becomes known in the future". Furthermore, this behavior is
consistent with OpenBSD [1], illumos [2] and XNU [3] (FreeBSD [4] and
NetBSD [5] fail the syscall if unknown bits are set). So there is some
precedent for this behavior in other kernels, and in particular in XNU,
which is probably the most popular kernel among those that I looked at,
which means that this change is less likely to be a compatibility issue.

Link: [1] f634a6a4b5/sys/kern/kern_sig.c (L278)
Link: [2] 76f19f5fdc/usr/src/uts/common/syscall/sigaction.c (L86)
Link: [3] a449c6a3b8/bsd/kern/kern_sig.c (L480)
Link: [4] eded70c370/sys/kern/kern_sig.c (L699)
Link: [5] 3365779bec/sys/kern/sys_sig.c (L473)
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Link: https://linux-review.googlesource.com/id/I35aab6f5be932505d90f3b3450c083b4db1eca86
Link: https://lkml.kernel.org/r/878dbcb5f47bc9b11881c81f745c0bef5c23f97f.1605235762.git.pcc@google.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2020-11-23 10:31:05 -06:00
Peter Collingbourne
1d82b7898f arch: move SA_* definitions to generic headers
Most architectures with the exception of alpha, mips, parisc and
sparc use the same values for these flags. Move their definitions into
asm-generic/signal-defs.h and allow the architectures with non-standard
values to override them. Also, document the non-standard flag values
in order to make it easier to add new generic flags in the future.

A consequence of this change is that on powerpc and x86, the constants'
values aside from SA_RESETHAND change signedness from unsigned
to signed. This is not expected to impact realistic use of these
constants. In particular the typical use of the constants where they
are or'ed together and assigned to sa_flags (or another int variable)
would not be affected.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Link: https://linux-review.googlesource.com/id/Ia3849f18b8009bf41faca374e701cdca36974528
Link: https://lkml.kernel.org/r/b6d0d1ec34f9ee93e1105f14f288fba5f89d1f24.1605235762.git.pcc@google.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2020-11-23 10:31:05 -06:00
Peter Collingbourne
161d36dfc7 parisc: start using signal-defs.h
We currently include signal-defs.h on all architectures except parisc.
Make parisc fall in line. This will make maintenance easier once the
flag bits are moved here.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Acked-by: Helge Deller <deller@gmx.de>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Link: https://linux-review.googlesource.com/id/If03a5135fb514fe96548fb74610e6c3586a04064
Link: https://lkml.kernel.org/r/be8f3680ef2d0a1a120994e3ae0b11d82f373279.1605235762.git.pcc@google.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2020-11-23 10:31:05 -06:00
Helge Deller
8663daeac7 parisc: Drop parisc special case for __sighandler_t
I believe we can and *should* drop this parisc-specific typedef for
__sighandler_t when compiling a 64-bit kernel. The reasons:

1. We don't have a 64-bit userspace yet, so nothing (on userspace side)
can break.

2. Inside the Linux kernel, this is only used in kernel/signal.c, in
function kernel_sigaction() where the signal handler is compared against
SIG_IGN.  SIG_IGN is defined as (__sighandler_t)1), so only the pointers
are compared.

3. Even when a 64-bit userspace gets added at some point, I think
__sighandler_t should be defined what it is: a function pointer struct.

I compiled kernel/signal.c with and without the patch, and the produced code
is identical in both cases.

Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Peter Collingbourne <pcc@google.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Reviewed-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/I21c43f21b264f339e3aa395626af838646f62d97
Link: https://lkml.kernel.org/r/a75b8eb7bb9eac1cf73fb119eb53e5892d6e9656.1605235762.git.pcc@google.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2020-11-23 10:31:04 -06:00
Thomas Gleixner
9f112156f8 parisc: Remove bogus __IRQ_STAT macro
This is a leftover from a historical array based implementation and unused.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20201113141732.680780121@linutronix.de
2020-11-23 10:31:05 +01:00
YiFei Zhu
6aa7923c87 parisc: Enable seccomp architecture tracking
To enable seccomp constant action bitmaps, we need to have a static
mapping to the audit architecture and system call table size. Add these
for parisc.

Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/9bb86c546eda753adf5270425e7353202dbce87c.1605101222.git.yifeifz2@illinois.edu
2020-11-20 11:16:34 -08:00
Geert Uytterhoeven
6ca753a3a7 parisc/uapi: Use Kbuild logic to provide <asm/types.h>
Uapi <asm-generic/types.h> just includes <asm-generic/int-ll64.h>

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Helge Deller <deller@gmx.de>
2020-11-11 20:05:19 +01:00
Helge Deller
22ee3ea588 parisc: Make user stack size configurable
On parisc we need to initialize the memory layout for the user stack at
process start time to a fixed size, which up until now was limited to
the size as given by CONFIG_MAX_STACK_SIZE_MB at compile time.

This hard limit was too small and showed problems when compiling
ruby2.7, qmlcachegen and some Qt packages.

This patch changes two things:
a) It increases the default maximum stack size to 100MB.
b) Users can modify the stack hard limit size with ulimit and then newly
   forked processes will use the given stack size which can even be bigger
   than the default 100MB.

Reported-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2020-11-11 14:59:08 +01:00
Helge Deller
93346da8ff parisc: Drop loops_per_jiffy from per_cpu struct
There is no need to keep a loops_per_jiffy value per cpu. Drop it.

Signed-off-by: Helge Deller <deller@gmx.de>
2020-11-11 14:57:30 +01:00
Jens Axboe
18cb328128 parisc: add support for TIF_NOTIFY_SIGNAL
Wire up TIF_NOTIFY_SIGNAL handling for parisc.

Cc: linux-parisc@vger.kernel.org
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-11-09 08:16:55 -07:00
Thomas Gleixner
d7029e4549 highmem: Get rid of kmap_types.h
The header is not longer used and on alpha, ia64, openrisc, parisc and um
it was completely unused anyway as these architectures have no highmem
support.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20201103095858.422094352@linutronix.de
2020-11-06 23:14:58 +01:00
Nicholas Piggin
4146bdab31 parisc: use asm-generic/mmu_context.h for no-op implementations
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Helge Deller <deller@gmx.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: linux-parisc@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-10-27 16:02:37 +01:00
Joe Perches
33def8498f treewide: Convert macro and uses of __section(foo) to __section("foo")
Use a more generic form for __section that requires quotes to avoid
complications with clang and gcc differences.

Remove the quote operator # from compiler_attributes.h __section macro.

Convert all unquoted __section(foo) uses to quoted __section("foo").
Also convert __attribute__((section("foo"))) uses to __section("foo")
even if the __attribute__ has multiple list entry forms.

Conversion done using the script at:

    https://lore.kernel.org/lkml/75393e5ddc272dc7403de74d645e6c6e0f4e70eb.camel@perches.com/2-convert_section.pl

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@gooogle.com>
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-25 14:51:49 -07:00
Helge Deller
4a770b413f parisc: Add MAP_UNINITIALIZED define
We will not allow unitialized anon mmaps, but we need this define
to prevent build errors, e.g. the debian foot package.

Suggested-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2020-10-15 08:10:39 +02:00
John David Anglin
f173e3a751 parisc: Improve spinlock handling
Use READ_ONCE() to check if spinlock is locked.
The other changes are cleanups.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2020-10-15 08:10:39 +02:00
John David Anglin
53a42b6324 parisc: Switch to more fine grained lws locks
Increase the number of lws locks to 256 entries (instead of 16) and
choose lock entry based on bits 3-11 (instead of 4-7) of the relevant
address.  With this change we archieve more fine-grained locking in
futex syscalls and thus reduce the number of possible stalls.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2020-10-15 08:10:39 +02:00
John David Anglin
2a7d4eed57 parisc: Mark pointers volatile in __xchg8(), __xchg32() and __xchg64()
Let the complier treat the pointers volatile to ensure that they get
accessed atomicly.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2020-10-15 08:10:39 +02:00
John David Anglin
6caf55e570 parisc: Add alternative patching to synchronize_caches define
This change allows the sync barrier instruction to be patched to a nop.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2020-10-15 08:10:38 +02:00
Helge Deller
cd760704ee parisc: Drop useless comments in uapi/asm/signal.h
Signed-off-by: Helge Deller <deller@gmx.de>
2020-10-15 08:10:38 +02:00
Helge Deller
75ae04206a parisc: Define O_NONBLOCK to become 000200000
HPUX has separate NDELAY & NONBLOCK values. In the past we wanted to
be able to run HP-UX binaries natively on parisc Linux which is why
we defined O_NONBLOCK to 000200004 to distinguish NDELAY & NONBLOCK
bits.
But with 2 bits set in this bitmask we often ran into compatibility
issues with other Linux applications which often only test one bit (or
even compare the values).

To avoid such issues in the future, this patch changes O_NONBLOCK to
become 000200000. That way old programs will still be functional, and
for new programs we now have only one bit set.

Update the comment about SOCK_NONBLOCK too.

Signed-off-by: Helge Deller <deller@gmx.de>
2020-10-15 08:10:38 +02:00
Helge Deller
41f5a81c07 parisc: Drop HP-UX specific fcntl and signal flags
Those flags are nowhere used in the Linux kernel and were added when we
still wanted to support HP-UX in a compat mode. Since we never will
support HP-UX, drop those flags.

Signed-off-by: Helge Deller <deller@gmx.de>
2020-10-15 08:10:37 +02:00
Linus Torvalds
e18afa5bfa Merge branch 'work.quota-compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull compat quotactl cleanups from Al Viro:
 "More Christoph's compat cleanups: quotactl(2)"

* 'work.quota-compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  quota: simplify the quotactl compat handling
  compat: add a compat_need_64bit_alignment_fixup() helper
  compat: lift compat_s64 and compat_u64 to <asm-generic/compat.h>
2020-10-12 16:37:13 -07:00
Christoph Hellwig
cc7886d25b compat: lift compat_s64 and compat_u64 to <asm-generic/compat.h>
lift the compat_s64 and compat_u64 definitions into common code using the
COMPAT_FOR_U64_ALIGNMENT symbol for the x86 special case.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-17 13:00:46 -04:00
Al Viro
c693cc4676 saner calling conventions for csum_and_copy_..._user()
All callers of these primitives will
	* discard anything we might've copied in case of error
	* ignore the csum value in case of error
	* always pass 0xffffffff as the initial sum, so the
resulting csum value (in case of success, that is) will never be 0.

That suggest the following calling conventions:
	* don't pass err_ptr - just return 0 on error.
	* don't bother with zeroing destination, etc. in case of error
	* don't pass the initial sum - just use 0xffffffff.

This commit does the minimal conversion in the instances of csum_and_copy_...();
the changes of actual asm code behind them are done later in the series.
Note that this asm code is often shared with csum_partial_copy_nocheck();
the difference is that csum_partial_copy_nocheck() passes 0 for initial
sum while csum_and_copy_..._user() pass 0xffffffff.  Fortunately, we are
free to pass 0xffffffff in all cases and subsequent patches will use that
freedom without any special comments.

A part that could be split off: parisc and uml/i386 claimed to have
csum_and_copy_to_user() instances of their own, but those were identical
to the generic one, so we simply drop them.  Not sure if it's worth
a separate commit...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-08-20 15:45:15 -04:00
Al Viro
6e41c585e3 unify generic instances of csum_partial_copy_nocheck()
quite a few architectures have the same csum_partial_copy_nocheck() -
simply memcpy() the data and then return the csum of the copy.

hexagon, parisc, ia64, s390, um: explicitly spelled out that way.

arc, arm64, csky, h8300, m68k/nommu, microblaze, mips/GENERIC_CSUM, nds32,
nios2, openrisc, riscv, unicore32: end up picking the same thing spelled
out in lib/checksum.h (with varying amounts of perversions along the way).

everybody else (alpha, arm, c6x, m68k/mmu, mips/!GENERIC_CSUM, powerpc,
sh, sparc, x86, xtensa) have non-generic variants.  For all except c6x
the declaration is in their asm/checksum.h.  c6x uses the wrapper
from asm-generic/checksum.h that would normally lead to the lib/checksum.h
instance, but in case of c6x we end up using an asm function from arch/c6x
instead.

Screw that mess - have architectures with private instances define
_HAVE_ARCH_CSUM_AND_COPY in their asm/checksum.h and have the default
one right in net/checksum.h conditional on _HAVE_ARCH_CSUM_AND_COPY
*not* defined.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-08-20 15:45:14 -04:00
Mike Rapoport
6f6aea7e96 parisc: fix PMD pages allocation by restoring pmd_alloc_one()
Commit 1355c31eeb ("asm-generic: pgalloc: provide generic pmd_alloc_one()
and pmd_free_one()") converted parisc to use generic version of
pmd_alloc_one() but it missed the fact that parisc uses order-1 pages for
PMD.

Restore the original version of pmd_alloc_one() for parisc, just use
GFP_PGTABLE_KERNEL that implies __GFP_ZERO instead of GFP_KERNEL and
memset.

Fixes: 1355c31eeb ("asm-generic: pgalloc: provide generic pmd_alloc_one() and pmd_free_one()")
Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Tested-by: Meelis Roos <mroos@linux.ee>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Link: https://lkml.kernel.org/r/9f2b5ebd-e4a4-0fa1-6cd3-4b9f6892d1ad@linux.ee
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-16 10:53:13 -07:00
Krzysztof Kozlowski
8f28ca6bd8 iomap: constify ioreadX() iomem argument (as in generic implementation)
Patch series "iomap: Constify ioreadX() iomem argument", v3.

The ioread8/16/32() and others have inconsistent interface among the
architectures: some taking address as const, some not.

It seems there is nothing really stopping all of them to take pointer to
const.

This patch (of 4):

The ioreadX() and ioreadX_rep() helpers have inconsistent interface.  On
some architectures void *__iomem address argument is a pointer to const,
on some not.

Implementations of ioreadX() do not modify the memory under the address so
they can be converted to a "const" version for const-safety and
consistency among architectures.

[krzk@kernel.org: sh: clk: fix assignment from incompatible pointer type for ioreadX()]
  Link: http://lkml.kernel.org/r/20200723082017.24053-1-krzk@kernel.org
[akpm@linux-foundation.org: fix drivers/mailbox/bcm-pdc-mailbox.c]
  Link: http://lkml.kernel.org/r/202007132209.Rxmv4QyS%25lkp@intel.com

Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: Allen Hubbe <allenbh@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Link: http://lkml.kernel.org/r/20200709072837.5869-1-krzk@kernel.org
Link: http://lkml.kernel.org/r/20200709072837.5869-2-krzk@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-14 19:56:57 -07:00
Linus Torvalds
7a02c8d45b Merge branch 'parisc-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull more parisc updates from Helge Deller:

 - Oscar Carter contributed a patch which fixes parisc's usage of
   dereference_function_descriptor() and thus will allow using the
   -Wcast-function-type compiler option in the top-level Makefile

 - Sven Schnelle fixed a bug in the SBA code to prevent crashes during
   kexec

 - John David Anglin provided implementations for __smp_store_release()
   and __smp_load_acquire barriers() which avoids using the sync
   assembler instruction and thus speeds up barrier paths

 - Some whitespace cleanups in parisc's atomic.h header file

* 'parisc-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Implement __smp_store_release and __smp_load_acquire barriers
  parisc: mask out enable and reserved bits from sba imask
  parisc: Whitespace cleanups in atomic.h
  parisc/kernel/ftrace: Remove function callback casts
  sections.h: dereference_function_descriptor() returns void pointer
2020-08-12 12:41:15 -07:00
Christoph Hellwig
428e2976a5 uaccess: remove segment_eq
segment_eq is only used to implement uaccess_kernel.  Just open code
uaccess_kernel in the arch uaccess headers and remove one layer of
indirection.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Greentime Hu <green.hu@gmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Link: http://lkml.kernel.org/r/20200710135706.537715-5-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:57:58 -07:00
John David Anglin
e96ebd589d parisc: Implement __smp_store_release and __smp_load_acquire barriers
This patch implements the __smp_store_release and __smp_load_acquire barriers
using ordered stores and loads.  This avoids the sync instruction present in
the generic implementation.

Cc: <stable@vger.kernel.org> # 4.14+
Signed-off-by: Dave Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2020-08-12 15:13:42 +02:00
Helge Deller
3bc6e3dc5a parisc: Whitespace cleanups in atomic.h
Fix whitespace indenting and drop trailing backslashes.

Cc: <stable@vger.kernel.org> # 4.19+
Signed-off-by: Helge Deller <deller@gmx.de>
2020-08-11 18:07:34 +02:00
Linus Torvalds
97d052ea3f A set of locking fixes and updates:
- Untangle the header spaghetti which causes build failures in various
     situations caused by the lockdep additions to seqcount to validate that
     the write side critical sections are non-preemptible.
 
   - The seqcount associated lock debug addons which were blocked by the
     above fallout.
 
     seqcount writers contrary to seqlock writers must be externally
     serialized, which usually happens via locking - except for strict per
     CPU seqcounts. As the lock is not part of the seqcount, lockdep cannot
     validate that the lock is held.
 
     This new debug mechanism adds the concept of associated locks.
     sequence count has now lock type variants and corresponding
     initializers which take a pointer to the associated lock used for
     writer serialization. If lockdep is enabled the pointer is stored and
     write_seqcount_begin() has a lockdep assertion to validate that the
     lock is held.
 
     Aside of the type and the initializer no other code changes are
     required at the seqcount usage sites. The rest of the seqcount API is
     unchanged and determines the type at compile time with the help of
     _Generic which is possible now that the minimal GCC version has been
     moved up.
 
     Adding this lockdep coverage unearthed a handful of seqcount bugs which
     have been addressed already independent of this.
 
     While generaly useful this comes with a Trojan Horse twist: On RT
     kernels the write side critical section can become preemtible if the
     writers are serialized by an associated lock, which leads to the well
     known reader preempts writer livelock. RT prevents this by storing the
     associated lock pointer independent of lockdep in the seqcount and
     changing the reader side to block on the lock when a reader detects
     that a writer is in the write side critical section.
 
  - Conversion of seqcount usage sites to associated types and initializers.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl8xmPYTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoTuQEACyzQCjU8PgehPp9oMqWzaX2fcVyuZO
 QU2yw6gmz2oTz3ZHUNwdW8UnzGh2OWosK3kDruoD9FtSS51lER1/ISfSPCGfyqxC
 KTjOcB1Kvxwq/3LcCx7Zi3ZxWApat74qs3EhYhKtEiQ2Y9xv9rLq8VV1UWAwyxq0
 eHpjlIJ6b6rbt+ARslaB7drnccOsdK+W/roNj4kfyt+gezjBfojGRdMGQNMFcpnv
 shuTC+vYurAVIiVA/0IuizgHfwZiXOtVpjVoEWaxg6bBH6HNuYMYzdSa/YrlDkZs
 n/aBI/Xkvx+Eacu8b1Zwmbzs5EnikUK/2dMqbzXKUZK61eV4hX5c2xrnr1yGWKTs
 F/juh69Squ7X6VZyKVgJ9RIccVueqwR2EprXWgH3+RMice5kjnXH4zURp0GHALxa
 DFPfB6fawcH3Ps87kcRFvjgm6FBo0hJ1AxmsW1dY4ACFB9azFa2euW+AARDzHOy2
 VRsUdhL9CGwtPjXcZ/9Rhej6fZLGBXKr8uq5QiMuvttp4b6+j9FEfBgD4S6h8csl
 AT2c2I9LcbWqyUM9P4S7zY/YgOZw88vHRuDH7tEBdIeoiHfrbSBU7EQ9jlAKq/59
 f+Htu2Io281c005g7DEeuCYvpzSYnJnAitj5Lmp/kzk2Wn3utY1uIAVszqwf95Ul
 81ppn2KlvzUK8g==
 =7Gj+
 -----END PGP SIGNATURE-----

Merge tag 'locking-urgent-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Thomas Gleixner:
 "A set of locking fixes and updates:

   - Untangle the header spaghetti which causes build failures in
     various situations caused by the lockdep additions to seqcount to
     validate that the write side critical sections are non-preemptible.

   - The seqcount associated lock debug addons which were blocked by the
     above fallout.

     seqcount writers contrary to seqlock writers must be externally
     serialized, which usually happens via locking - except for strict
     per CPU seqcounts. As the lock is not part of the seqcount, lockdep
     cannot validate that the lock is held.

     This new debug mechanism adds the concept of associated locks.
     sequence count has now lock type variants and corresponding
     initializers which take a pointer to the associated lock used for
     writer serialization. If lockdep is enabled the pointer is stored
     and write_seqcount_begin() has a lockdep assertion to validate that
     the lock is held.

     Aside of the type and the initializer no other code changes are
     required at the seqcount usage sites. The rest of the seqcount API
     is unchanged and determines the type at compile time with the help
     of _Generic which is possible now that the minimal GCC version has
     been moved up.

     Adding this lockdep coverage unearthed a handful of seqcount bugs
     which have been addressed already independent of this.

     While generally useful this comes with a Trojan Horse twist: On RT
     kernels the write side critical section can become preemtible if
     the writers are serialized by an associated lock, which leads to
     the well known reader preempts writer livelock. RT prevents this by
     storing the associated lock pointer independent of lockdep in the
     seqcount and changing the reader side to block on the lock when a
     reader detects that a writer is in the write side critical section.

   - Conversion of seqcount usage sites to associated types and
     initializers"

* tag 'locking-urgent-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
  locking/seqlock, headers: Untangle the spaghetti monster
  locking, arch/ia64: Reduce <asm/smp.h> header dependencies by moving XTP bits into the new <asm/xtp.h> header
  x86/headers: Remove APIC headers from <asm/smp.h>
  seqcount: More consistent seqprop names
  seqcount: Compress SEQCNT_LOCKNAME_ZERO()
  seqlock: Fold seqcount_LOCKNAME_init() definition
  seqlock: Fold seqcount_LOCKNAME_t definition
  seqlock: s/__SEQ_LOCKDEP/__SEQ_LOCK/g
  hrtimer: Use sequence counter with associated raw spinlock
  kvm/eventfd: Use sequence counter with associated spinlock
  userfaultfd: Use sequence counter with associated spinlock
  NFSv4: Use sequence counter with associated spinlock
  iocost: Use sequence counter with associated spinlock
  raid5: Use sequence counter with associated spinlock
  vfs: Use sequence counter with associated spinlock
  timekeeping: Use sequence counter with associated raw spinlock
  xfrm: policy: Use sequence counters with associated lock
  netfilter: nft_set_rbtree: Use sequence counter with associated rwlock
  netfilter: conntrack: Use sequence counter with associated spinlock
  sched: tasks: Use sequence counter with associated spinlock
  ...
2020-08-10 19:07:44 -07:00
Linus Torvalds
8d3e09b433 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull regset conversion fix from Al Viro:
 "Fix a regression from an unnoticed bisect hazard in the regset series.

  A bunch of old (aout, originally) primitives used by coredumps became
  dead code after fdpic conversion to regsets. Removal of that dead code
  had been the first commit in the followups to regset series;
  unfortunately, it happened to hide the bisect hazard on sh (extern for
  fpregs_get() had not been updated in the main series when it should
  have been; followup simply made fpregs_get() static). And without that
  followup commit this bisect hazard became breakage in the mainline"

Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  kill unused dump_fpu() instances
2020-08-09 13:33:54 -07:00
Mike Rapoport
f9cb654cb5 asm-generic: pgalloc: provide generic pgd_free()
Most architectures define pgd_free() as a wrapper for free_page().

Provide a generic version in asm-generic/pgalloc.h and enable its use for
most architectures.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Matthew Wilcox <willy@infradead.org>
Link: http://lkml.kernel.org/r/20200627143453.31835-7-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:26 -07:00
Mike Rapoport
1355c31eeb asm-generic: pgalloc: provide generic pmd_alloc_one() and pmd_free_one()
For most architectures that support >2 levels of page tables,
pmd_alloc_one() is a wrapper for __get_free_pages(), sometimes with
__GFP_ZERO and sometimes followed by memset(0) instead.

More elaborate versions on arm64 and x86 account memory for the user page
tables and call to pgtable_pmd_page_ctor() as the part of PMD page
initialization.

Move the arm64 version to include/asm-generic/pgalloc.h and use the
generic version on several architectures.

The pgtable_pmd_page_ctor() is a NOP when ARCH_ENABLE_SPLIT_PMD_PTLOCK is
not enabled, so there is no functional change for most architectures
except of the addition of __GFP_ACCOUNT for allocation of user page
tables.

The pmd_free() is a wrapper for free_page() in all the cases, so no
functional change here.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Link: http://lkml.kernel.org/r/20200627143453.31835-5-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:26 -07:00
Mike Rapoport
ca15ca406f mm: remove unneeded includes of <asm/pgalloc.h>
Patch series "mm: cleanup usage of <asm/pgalloc.h>"

Most architectures have very similar versions of pXd_alloc_one() and
pXd_free_one() for intermediate levels of page table.  These patches add
generic versions of these functions in <asm-generic/pgalloc.h> and enable
use of the generic functions where appropriate.

In addition, functions declared and defined in <asm/pgalloc.h> headers are
used mostly by core mm and early mm initialization in arch and there is no
actual reason to have the <asm/pgalloc.h> included all over the place.
The first patch in this series removes unneeded includes of
<asm/pgalloc.h>

In the end it didn't work out as neatly as I hoped and moving
pXd_alloc_track() definitions to <asm-generic/pgalloc.h> would require
unnecessary changes to arches that have custom page table allocations, so
I've decided to move lib/ioremap.c to mm/ and make pgalloc-track.h local
to mm/.

This patch (of 8):

In most cases <asm/pgalloc.h> header is required only for allocations of
page table memory.  Most of the .c files that include that header do not
use symbols declared in <asm/pgalloc.h> and do not require that header.

As for the other header files that used to include <asm/pgalloc.h>, it is
possible to move that include into the .c file that actually uses symbols
from <asm/pgalloc.h> and drop the include from the header file.

The process was somewhat automated using

	sed -i -E '/[<"]asm\/pgalloc\.h/d' \
                $(grep -L -w -f /tmp/xx \
                        $(git grep -E -l '[<"]asm/pgalloc\.h'))

where /tmp/xx contains all the symbols defined in
arch/*/include/asm/pgalloc.h.

[rppt@linux.ibm.com: fix powerpc warning]

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Link: http://lkml.kernel.org/r/20200627143453.31835-1-rppt@kernel.org
Link: http://lkml.kernel.org/r/20200627143453.31835-2-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:26 -07:00
Peter Zijlstra
b3545192e2 locking, arch/ia64: Reduce <asm/smp.h> header dependencies by moving XTP bits into the new <asm/xtp.h> header
We want to remove the #include <asm/io.h> from <asm/smp.h>, but for this
we have to move the XTP bits into a separate header first (as these bits
rely on <asm/io.h> definitions), and include them in the .c files that rely
on those APIs.

Co-developed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-08-06 16:13:13 +02:00
Linus Torvalds
95ffa67658 Merge branch 'parisc-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
 "The majority of the patches are reverts of previous commits regarding
  the parisc-specific low level spinlocking code and barrier handling,
  with which we tried to fix CPU stalls on our build servers. In the end
  John David Anglin found the culprit: We missed a define for
  atomic64_set_release(). This seems to have fixed our issues, so now
  it's good to remove the unnecessary code again.

  Other than that it's trivial stuff: Spelling fixes, constifications
  and such"

* 'parisc-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: make the log level string for register dumps const
  parisc: Do not use an ordered store in pa_tlb_lock()
  Revert "parisc: Revert "Release spinlocks using ordered store""
  Revert "parisc: Use ldcw instruction for SMP spinlock release barrier"
  Revert "parisc: Drop LDCW barrier in CAS code when running UP"
  Revert "parisc: Improve interrupt handling in arch_spin_lock_flags()"
  parisc: Replace HTTP links with HTTPS ones
  parisc: elf.h: delete a duplicated word
  parisc: Report bad pages as HardwareCorrupted
  parisc: Convert to BIT_MASK() and BIT_WORD()
2020-08-04 22:02:47 -07:00
Ingo Molnar
28cff52eae Merge branch 'linus' into locking/core, to resolve conflict
Conflicts:
	arch/arm/include/asm/percpu.h

As Stephen Rothwell noted, there's a conflict between this commit
in locking/core:

  a21ee6055c ("lockdep: Change hardirq{s_enabled,_context} to per-cpu variables")

and this fresh upstream commit:

  aa54ea903a ("ARM: percpu.h: fix build error")

a21ee6055c is a simpler solution to the dependency problem and doesn't
further increase header hell - so this conflict resolution effectively
reverts aa54ea903a and uses the a21ee6055c solution.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-07-31 12:16:09 +02:00
Herbert Xu
7ca8cf5347 locking/atomic: Move ATOMIC_INIT into linux/types.h
This patch moves ATOMIC_INIT from asm/atomic.h into linux/types.h.
This allows users of atomic_t to use ATOMIC_INIT without having to
include atomic.h as that way may lead to header loops.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Waiman Long <longman@redhat.com>
Link: https://lkml.kernel.org/r/20200729123105.GB7047@gondor.apana.org.au
2020-07-29 16:14:18 +02:00
Helge Deller
157e9afcc4 Revert "parisc: Revert "Release spinlocks using ordered store""
This reverts commit 86d4d068df.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v5.0+
2020-07-28 18:56:14 +02:00
Helge Deller
6e9f06ee6c Revert "parisc: Use ldcw instruction for SMP spinlock release barrier"
This reverts commit 9e5c602186.
No need to use the ldcw instruction as SMP spinlock release barrier.
Revert it to gain back speed again.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v5.2+
2020-07-28 18:54:40 +02:00
Helge Deller
3d05b8aebc Revert "parisc: Improve interrupt handling in arch_spin_lock_flags()"
This reverts commit 2772f0efd5.
It turns out that we want to implement the spinlock code differently.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v5.7+
2020-07-28 18:49:45 +02:00
Randy Dunlap
961b658fea parisc: elf.h: delete a duplicated word
Drop the repeated word "the".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: linux-parisc@vger.kernel.org
Signed-off-by: Helge Deller <deller@gmx.de>
2020-07-28 11:22:05 +02:00
Al Viro
bb1a773d5b kill unused dump_fpu() instances
dump_fpu() is used only on the architectures that support elf
and have neither CORE_DUMP_USE_REGSET nor ELF_CORE_COPY_FPREGS
defined.

Currently that's csky, m68k, microblaze, nds32 and unicore32.  The rest
of the instances are dead code.

NB: THIS MUST GO AFTER ELF_FDPIC CONVERSION

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-07-27 14:33:10 -04:00
John David Anglin
be6577af0c parisc: Add atomic64_set_release() define to avoid CPU soft lockups
Stalls are quite frequent with recent kernels. I enabled
CONFIG_SOFTLOCKUP_DETECTOR and I caught the following stall:

watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [cc1:22803]
CPU: 0 PID: 22803 Comm: cc1 Not tainted 5.6.17+ #3
Hardware name: 9000/800/rp3440
 IAOQ[0]: d_alloc_parallel+0x384/0x688
 IAOQ[1]: d_alloc_parallel+0x388/0x688
 RP(r2): d_alloc_parallel+0x134/0x688
Backtrace:
 [<000000004036974c>] __lookup_slow+0xa4/0x200
 [<0000000040369fc8>] walk_component+0x288/0x458
 [<000000004036a9a0>] path_lookupat+0x88/0x198
 [<000000004036e748>] filename_lookup+0xa0/0x168
 [<000000004036e95c>] user_path_at_empty+0x64/0x80
 [<000000004035d93c>] vfs_statx+0x104/0x158
 [<000000004035dfcc>] __do_sys_lstat64+0x44/0x80
 [<000000004035e5a0>] sys_lstat64+0x20/0x38
 [<0000000040180054>] syscall_exit+0x0/0x14

The code was stuck in this loop in d_alloc_parallel:

    4037d414:   0e 00 10 dc     ldd 0(r16),ret0
    4037d418:   c7 fc 5f ed     bb,< ret0,1f,4037d414 <d_alloc_parallel+0x384>
    4037d41c:   08 00 02 40     nop

This is the inner loop of bit_spin_lock which is called by hlist_bl_unlock in
d_alloc_parallel:

static inline void bit_spin_lock(int bitnum, unsigned long *addr)
{
        /*
         * Assuming the lock is uncontended, this never enters
         * the body of the outer loop. If it is contended, then
         * within the inner loop a non-atomic test is used to
         * busywait with less bus contention for a good time to
         * attempt to acquire the lock bit.
         */
        preempt_disable();
#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
        while (unlikely(test_and_set_bit_lock(bitnum, addr))) {
                preempt_enable();
                do {
                        cpu_relax();
                } while (test_bit(bitnum, addr));
                preempt_disable();
        }
#endif
        __acquire(bitlock);
}

After consideration, I realized that we must be losing bit unlocks.
Then, I noticed that we missed defining atomic64_set_release().
Adding this define fixes the stalls in bit operations.

Signed-off-by: Dave Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org
Signed-off-by: Helge Deller <deller@gmx.de>
2020-07-21 17:16:37 +02:00
Liam Beguin
b344d6a83d parisc: add support for cmpxchg on u8 pointers
The kernel test bot reported[1] that using set_mask_bits on a u8 causes
the following issue on parisc:

	hppa-linux-ld: drivers/phy/ti/phy-tusb1210.o: in function `tusb1210_probe':
	>> (.text+0x2f4): undefined reference to `__cmpxchg_called_with_bad_pointer'
	>> hppa-linux-ld: (.text+0x324): undefined reference to `__cmpxchg_called_with_bad_pointer'
	hppa-linux-ld: (.text+0x354): undefined reference to `__cmpxchg_called_with_bad_pointer'

Add support for cmpxchg on u8 pointers.

[1] https://lore.kernel.org/patchwork/patch/1272617/#1468946

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Liam Beguin <liambeguin@gmail.com>
Tested-by: Dave Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2020-07-21 08:11:12 +02:00
Helge Deller
208151bfb7 parisc: Convert to BIT_MASK() and BIT_WORD()
Drop own open-coded implementation to set bits and use the kernel
provided BIT_MASK() and BIT_WORD() macros.

Signed-off-by: Helge Deller <deller@gmx.de>
2020-07-21 08:04:39 +02:00
Mike Rapoport
974b9b2c68 mm: consolidate pte_index() and pte_offset_*() definitions
All architectures define pte_index() as

	(address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)

and all architectures define pte_offset_kernel() as an entry in the array
of PTEs indexed by the pte_index().

For the most architectures the pte_offset_kernel() implementation relies
on the availability of pmd_page_vaddr() that converts a PMD entry value to
the virtual address of the page containing PTEs array.

Let's move x86 definitions of the PTE accessors to the generic place in
<linux/pgtable.h> and then simply drop the respective definitions from the
other architectures.

The architectures that didn't provide pmd_page_vaddr() are updated to have
that defined.

The generic implementation of pte_offset_kernel() can be overridden by an
architecture and alpha makes use of this because it has special ordering
requirements for its version of pte_offset_kernel().

[rppt@linux.ibm.com: v2]
  Link: http://lkml.kernel.org/r/20200514170327.31389-11-rppt@kernel.org
[rppt@linux.ibm.com: update]
  Link: http://lkml.kernel.org/r/20200514170327.31389-12-rppt@kernel.org
[rppt@linux.ibm.com: update]
  Link: http://lkml.kernel.org/r/20200514170327.31389-13-rppt@kernel.org
[akpm@linux-foundation.org: fix x86 warning]
[sfr@canb.auug.org.au: fix powerpc build]
  Link: http://lkml.kernel.org/r/20200607153443.GB738695@linux.ibm.com

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200514170327.31389-10-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09 09:39:14 -07:00
Mike Rapoport
ca5999fde0 mm: introduce include/linux/pgtable.h
The include/linux/pgtable.h is going to be the home of generic page table
manipulation functions.

Start with moving asm-generic/pgtable.h to include/linux/pgtable.h and
make the latter include asm/pgtable.h.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200514170327.31389-3-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09 09:39:13 -07:00
Mike Rapoport
e31cf2f4ca mm: don't include asm/pgtable.h if linux/mm.h is already included
Patch series "mm: consolidate definitions of page table accessors", v2.

The low level page table accessors (pXY_index(), pXY_offset()) are
duplicated across all architectures and sometimes more than once.  For
instance, we have 31 definition of pgd_offset() for 25 supported
architectures.

Most of these definitions are actually identical and typically it boils
down to, e.g.

static inline unsigned long pmd_index(unsigned long address)
{
        return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
}

static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
{
        return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(address);
}

These definitions can be shared among 90% of the arches provided
XYZ_SHIFT, PTRS_PER_XYZ and xyz_page_vaddr() are defined.

For architectures that really need a custom version there is always
possibility to override the generic version with the usual ifdefs magic.

These patches introduce include/linux/pgtable.h that replaces
include/asm-generic/pgtable.h and add the definitions of the page table
accessors to the new header.

This patch (of 12):

The linux/mm.h header includes <asm/pgtable.h> to allow inlining of the
functions involving page table manipulations, e.g.  pte_alloc() and
pmd_alloc().  So, there is no point to explicitly include <asm/pgtable.h>
in the files that include <linux/mm.h>.

The include statements in such cases are remove with a simple loop:

	for f in $(git grep -l "include <linux/mm.h>") ; do
		sed -i -e '/include <asm\/pgtable.h>/ d' $f
	done

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200514170327.31389-1-rppt@kernel.org
Link: http://lkml.kernel.org/r/20200514170327.31389-2-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09 09:39:13 -07:00
Ira Weiny
7438f36310 parisc/kmap: remove duplicate kmap code
parisc reimplements the kmap calls except to flush its dcache.  This is
arguably an abuse of kmap but regardless it is messy and confusing.

Remove the duplicate code and have parisc define ARCH_HAS_FLUSH_ON_KUNMAP
for a kunmap_flush_on_unmap() architecture specific call to flush the
cache.

Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christian König <christian.koenig@amd.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chris Zankel <chris@zankel.net>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Helge Deller <deller@gmx.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20200507150004.1423069-14-ira.weiny@intel.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-04 19:06:22 -07:00
Ira Weiny
abca2500c0 arch/kunmap_atomic: consolidate duplicate code
Every single architecture (including !CONFIG_HIGHMEM) calls...

	pagefault_enable();
	preempt_enable();

... before returning from __kunmap_atomic().  Lift this code into the
kunmap_atomic() macro.

While we are at it rename __kunmap_atomic() to kunmap_atomic_high() to
be consistent.

[ira.weiny@intel.com: don't enable pagefault/preempt twice]
  Link: http://lkml.kernel.org/r/20200518184843.3029640-1-ira.weiny@intel.com
[akpm@linux-foundation.org: coding style fixes]
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christian König <christian.koenig@amd.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Helge Deller <deller@gmx.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Guenter Roeck <linux@roeck-us.net>
Link: http://lkml.kernel.org/r/20200507150004.1423069-8-ira.weiny@intel.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-04 19:06:22 -07:00
Anshuman Khandual
5be9934328 mm/hugetlb: define a generic fallback for arch_clear_hugepage_flags()
There are multiple similar definitions for arch_clear_hugepage_flags() on
various platforms.  Lets just add it's generic fallback definition for
platforms that do not override.  This help reduce code duplication.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: http://lkml.kernel.org/r/1588907271-11920-4-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-03 20:09:46 -07:00
Anshuman Khandual
b0eae98c66 mm/hugetlb: define a generic fallback for is_hugepage_only_range()
There are multiple similar definitions for is_hugepage_only_range() on
various platforms.  Lets just add it's generic fallback definition for
platforms that do not override.  This help reduce code duplication.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: http://lkml.kernel.org/r/1588907271-11920-3-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-03 20:09:46 -07:00
Linus Torvalds
bce159d734 for-5.8/drivers-2020-06-01
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl7VPc4QHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpgQkEACnQlzWOfNQMz1AzgUAv/S8IYDJCLrkbjLZ
 JK4pJv8Hjhss/7sS+fd8kyKe9VtaZz2IjmrXcC66RMMwtpx4iHnkRffoNAgEdGOl
 /M5TCZGhs+F/mp3Lc0WdR5DFHkM6yy2Tkk9wCFLreB4bW67janAWnd7nbU4INqJj
 +WqIgpzNMc/kfUhpBYTeQLORhL4e2TG9ADTi/zeUITlpnEsA65LOgXKEpeIFYnSX
 KTl4GIZ9tjazG3Y1Eva7DYHDIErNNAtX67KBqf+WBgMV98eB0O6xIPN1WlmhDTqj
 FGMLkb8msH1HHntvxDAuc4/ortnUy8vPI4o6zKP89HJJNjIM5p5eHEuVF5JnBw42
 Rtu9Om6JqWx51nhAhJNBj9bUStYbhEl0vVQCwbkfPbDJhzTy3RR8z709q9+ZwOrL
 xbp4aJBzqrzscjBEiSQbNCf2PyuOAdU0r1x81UN81ZN41d5qUcumcinjw4Y7vru8
 z5zMlo1Iy/AWQYyu7jgHmnpI7ZyA/1Qclo5dV7aa72bLFaJa35e7QxgfQOFBA5dY
 UZl6QPJRlnB80uGRzD5jCh2O2sQ3XZqYnpaKsUAka1GgbceCp9IC4A5mfZvpACsh
 Xk8VXjlhvY/iPJsKLqrh4Oedg4Dj5M3PLL9C3MDfYeIP2qgXpbnk87UV1TPNSpY0
 QcTxsXXXIw==
 =H+/Z
 -----END PGP SIGNATURE-----

Merge tag 'for-5.8/drivers-2020-06-01' of git://git.kernel.dk/linux-block

Pull block driver updates from Jens Axboe:
 "On top of the core changes, here are the block driver changes for this
  merge window:

   - NVMe changes:
        - NVMe over Fibre Channel protocol updates, which also reach
          over to drivers/scsi/lpfc (James Smart)
        - namespace revalidation support on the target (Anthony
          Iliopoulos)
        - gcc zero length array fix (Arnd Bergmann)
        - nvmet cleanups (Chaitanya Kulkarni)
        - misc cleanups and fixes (me, Keith Busch, Sagi Grimberg)
        - use a SRQ per completion vector (Max Gurtovoy)
        - fix handling of runtime changes to the queue count (Weiping
          Zhang)
        - t10 protection information support for nvme-rdma and
          nvmet-rdma (Israel Rukshin and Max Gurtovoy)
        - target side AEN improvements (Chaitanya Kulkarni)
        - various fixes and minor improvements all over, icluding the
          nvme part of the lpfc driver"

   - Floppy code cleanup series (Willy, Denis)

   - Floppy contention fix (Jiri)

   - Loop CONFIGURE support (Martijn)

   - bcache fixes/improvements (Coly, Joe, Colin)

   - q->queuedata cleanups (Christoph)

   - Get rid of ioctl_by_bdev (Christoph, Stefan)

   - md/raid5 allocation fixes (Coly)

   - zero length array fixes (Gustavo)

   - swim3 task state fix (Xu)"

* tag 'for-5.8/drivers-2020-06-01' of git://git.kernel.dk/linux-block: (166 commits)
  bcache: configure the asynchronous registertion to be experimental
  bcache: asynchronous devices registration
  bcache: fix refcount underflow in bcache_device_free()
  bcache: Convert pr_<level> uses to a more typical style
  bcache: remove redundant variables i and n
  lpfc: Fix return value in __lpfc_nvme_ls_abort
  lpfc: fix axchg pointer reference after free and double frees
  lpfc: Fix pointer checks and comments in LS receive refactoring
  nvme: set dma alignment to qword
  nvmet: cleanups the loop in nvmet_async_events_process
  nvmet: fix memory leak when removing namespaces and controllers concurrently
  nvmet-rdma: add metadata/T10-PI support
  nvmet: add metadata support for block devices
  nvmet: add metadata/T10-PI support
  nvme: add Metadata Capabilities enumerations
  nvmet: rename nvmet_check_data_len to nvmet_check_transfer_len
  nvmet: rename nvmet_rw_len to nvmet_rw_data_len
  nvmet: add metadata characteristics for a namespace
  nvme-rdma: add metadata/T10-PI support
  nvme-rdma: introduce nvme_rdma_sgl structure
  ...
2020-06-02 15:37:03 -07:00
Linus Torvalds
94709049fb Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton:
 "A few little subsystems and a start of a lot of MM patches.

  Subsystems affected by this patch series: squashfs, ocfs2, parisc,
  vfs. With mm subsystems: slab-generic, slub, debug, pagecache, gup,
  swap, memcg, pagemap, memory-failure, vmalloc, kasan"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (128 commits)
  kasan: move kasan_report() into report.c
  mm/mm_init.c: report kasan-tag information stored in page->flags
  ubsan: entirely disable alignment checks under UBSAN_TRAP
  kasan: fix clang compilation warning due to stack protector
  x86/mm: remove vmalloc faulting
  mm: remove vmalloc_sync_(un)mappings()
  x86/mm/32: implement arch_sync_kernel_mappings()
  x86/mm/64: implement arch_sync_kernel_mappings()
  mm/ioremap: track which page-table levels were modified
  mm/vmalloc: track which page-table levels were modified
  mm: add functions to track page directory modifications
  s390: use __vmalloc_node in stack_alloc
  powerpc: use __vmalloc_node in alloc_vm_stack
  arm64: use __vmalloc_node in arch_alloc_vmap_stack
  mm: remove vmalloc_user_node_flags
  mm: switch the test_vmalloc module to use __vmalloc_node
  mm: remove __vmalloc_node_flags_caller
  mm: remove both instances of __vmalloc_node_flags
  mm: remove the prot argument to __vmalloc_node
  mm: remove the pgprot argument to __vmalloc
  ...
2020-06-02 12:21:36 -07:00
Andrew Morton
78128fabd0 arch/parisc/include/asm/pgtable.h: remove unused `old_pte'
parisc's set_pte_at() macro has set-but-not-used variable:

  include/linux/pgtable.h: In function 'pte_clear_not_present_full':
  arch/parisc/include/asm/pgtable.h:96:9: warning: variable 'old_pte' set but not used [-Wunused-but-set-variable]

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02 10:59:05 -07:00
Al Viro
5904122c46 take the dummy csum_and_copy_from_user() into net/checksum.h
now that can be done conveniently - all non-trivial cases have
_HAVE_ARCH_COPY_AND_CSUM_FROM_USER defined, so the fallback in
net/checksum.h is used only for dummy (copy_from_user, then
csum_partial) implementation.  Allowing us to get rid of all
dummy instances, both of csum_and_copy_from_user() and
csum_partial_copy_from_user().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-29 16:11:50 -04:00
Al Viro
77a8710ba7 parisc: turn csum_partial_copy_from_user() into csum_and_copy_from_user()
Already has the right semantics.  Incidentally. failing copy_from_user()
zeroes the tail of destination - no need to repeat that manually

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-29 16:11:49 -04:00
Willy Tarreau
40b7d1b690 floppy: use symbolic register names in the parisc port
Now we can use FD_STATUS and FD_DATA instead of 4 or 5, let's do
this, and also use STATUS_DMA and STATUS_READY for the status bits.

Link: https://lore.kernel.org/r/20200331094054.24441-5-w@1wt.eu
Cc: Helge Deller <deller@gmx.de>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Denis Efremov <efremov@linux.com>
2020-05-12 19:34:53 +03:00
Willy Tarreau
e72e8bf1c9 floppy: split the base port from the register in I/O accesses
Currently we have architecture-specific fd_inb() and fd_outb() functions
or macros, taking just a port which is in fact made of a base address and
a register. The base address is FDC-specific and derived from the local or
global "fdc" variable through the FD_IOPORT macro used in the base address
calculation.

This change splits this by explicitly passing the FDC's base address and
the register separately to fd_outb() and fd_inb(). It affects the
following archs:
  - x86, alpha, mips, powerpc, parisc, arm, m68k:
    simple remap of port -> base+reg

  - sparc32: use of reg only, since the base address was already masked
    out and the FDC controller is known from a static struct.

  - sparc64: like x86 for PCI, like sparc32 for 82077

Some archs use inline functions and others macros. This was not
unified in order to minimize the number of changes to review. For the
same reason checkpatch still spews a few warnings about things that
were already there before.

The parisc still uses hard-coded register values and could be cleaned up
by taking the register definitions.

The sparc per-controller inb/outb functions could further be refined
to explicitly take an FDC register instead of a port in argument but it
was not needed yet and may be cleaned later.

Link: https://lore.kernel.org/r/20200331094054.24441-2-w@1wt.eu
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Ian Molton <spyro@f2s.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Helge Deller <deller@gmx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: x86@kernel.org
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Denis Efremov <efremov@linux.com>
2020-05-12 19:34:52 +03:00
Anshuman Khandual
78e7c5af08 mm/special: create generic fallbacks for pte_special() and pte_mkspecial()
Currently there are many platforms that dont enable ARCH_HAS_PTE_SPECIAL
but required to define quite similar fallback stubs for special page
table entry helpers such as pte_special() and pte_mkspecial(), as they
get build in generic MM without a config check.  This creates two
generic fallback stub definitions for these helpers, eliminating much
code duplication.

mips platform has a special case where pte_special() and pte_mkspecial()
visibility is wider than what ARCH_HAS_PTE_SPECIAL enablement requires.
This restricts those symbol visibility in order to avoid redefinitions
which is now exposed through this new generic stubs and subsequent build
failure.  arm platform set_pte_at() definition needs to be moved into a
C file just to prevent a build failure.

[anshuman.khandual@arm.com: use defined(CONFIG_ARCH_HAS_PTE_SPECIAL) in mips per Thomas]
  Link: http://lkml.kernel.org/r/1583851924-21603-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Guo Ren <guoren@kernel.org>			[csky]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
Acked-by: Stafford Horne <shorne@gmail.com>		[openrisc]
Acked-by: Helge Deller <deller@gmx.de>			[parisc]
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Sam Creasey <sammy@sammy.net>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Link: http://lkml.kernel.org/r/1583802551-15406-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-10 15:36:21 -07:00
Anshuman Khandual
c62da0c35d mm/vma: define a default value for VM_DATA_DEFAULT_FLAGS
There are many platforms with exact same value for VM_DATA_DEFAULT_FLAGS
This creates a default value for VM_DATA_DEFAULT_FLAGS in line with the
existing VM_STACK_DEFAULT_FLAGS.  While here, also define some more
macros with standard VMA access flag combinations that are used
frequently across many platforms.  Apart from simplification, this
reduces code duplication as well.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Rich Felker <dalias@libc.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Chris Zankel <chris@zankel.net>
Link: http://lkml.kernel.org/r/1583391014-8170-2-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-10 15:36:21 -07:00
Helge Deller
fbdc8f0f48 parisc: Rework arch_rw locking functions
Clean up the arch read/write locking functions based on the arc
implemenation. This improves readability of those functions.

Signed-off-by: Helge Deller <deller@gmx.de>
2020-04-05 22:29:55 +02:00
Helge Deller
2772f0efd5 parisc: Improve interrupt handling in arch_spin_lock_flags()
Rewrite arch_spin_lock() and arch_spin_lock_flags() to not re-enable and
disable the PSW_SM_I interrupt flag too often.

Signed-off-by: Helge Deller <deller@gmx.de>
2020-04-05 22:16:38 +02:00
Masahiro Yamada
630f289b71 asm-generic: make more kernel-space headers mandatory
Change a header to mandatory-y if both of the following are met:

[1] At least one architecture (except um) specifies it as generic-y in
    arch/*/include/asm/Kbuild

[2] Every architecture (except um) either has its own implementation
    (arch/*/include/asm/*.h) or specifies it as generic-y in
    arch/*/include/asm/Kbuild

This commit was generated by the following shell script.

----------------------------------->8-----------------------------------

arches=$(cd arch; ls -1 | sed -e '/Kconfig/d' -e '/um/d')

tmpfile=$(mktemp)

grep "^mandatory-y +=" include/asm-generic/Kbuild > $tmpfile

find arch -path 'arch/*/include/asm/Kbuild' |
	xargs sed -n 's/^generic-y += \(.*\)/\1/p' | sort -u |
while read header
do
	mandatory=yes

	for arch in $arches
	do
		if ! grep -q "generic-y += $header" arch/$arch/include/asm/Kbuild &&
			! [ -f arch/$arch/include/asm/$header ]; then
			mandatory=no
			break
		fi
	done

	if [ "$mandatory" = yes ]; then
		echo "mandatory-y += $header" >> $tmpfile

		for arch in $arches
		do
			sed -i "/generic-y += $header/d" arch/$arch/include/asm/Kbuild
		done
	fi

done

sed -i '/^mandatory-y +=/d' include/asm-generic/Kbuild

LANG=C sort $tmpfile >> include/asm-generic/Kbuild

----------------------------------->8-----------------------------------

One obvious benefit is the diff stat:

 25 files changed, 52 insertions(+), 557 deletions(-)

It is tedious to list generic-y for each arch that needs it.

So, mandatory-y works like a fallback default (by just wrapping
asm-generic one) when arch does not have a specific header
implementation.

See the following commits:

def3f7cefe
a1b39bae16

It is tedious to convert headers one by one, so I processed by a shell
script.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: http://lkml.kernel.org/r/20200210175452.5030-1-masahiroy@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02 09:35:25 -07:00
Al Viro
dc88588990 [parisc, s390, sparc64] no need for access_ok() in futex handling
access_ok() is always true on those

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-03-27 23:58:52 -04:00
Al Viro
a08971e948 futex: arch_futex_atomic_op_inuser() calling conventions change
Move access_ok() in and pagefault_enable()/pagefault_disable() out.
Mechanical conversion only - some instances don't really need
a separate access_ok() at all (e.g. the ones only using
get_user()/put_user(), or architectures where access_ok()
is always true); we'll deal with that in followups.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-03-27 23:58:51 -04:00
Linus Torvalds
33c84e89ab SCSI misc on 20200129
This series is slightly unusual because it includes Arnd's compat
 ioctl tree here:
 
 1c46a2cf2d Merge tag 'block-ioctl-cleanup-5.6' into 5.6/scsi-queue
 
 Excluding Arnd's changes, this is mostly an update of the usual
 drivers: megaraid_sas, mpt3sas, qla2xxx, ufs, lpfc, hisi_sas.  There
 are a couple of core and base updates around error propagation and
 atomicity in the attribute container base we use for the SCSI
 transport classes.  The rest is minor changes and updates.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXjHQJyYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishZZ8AQC02N+v
 iUnTl1YxGPjIWBbnHuUxN2Qbb9D3C6gAT1LkigEArlk163K3A1XEQHF/VNCdAz/f
 01XYTd3p1VHuegIBHlk=
 =Cn52
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This series is slightly unusual because it includes Arnd's compat
  ioctl tree here:

    1c46a2cf2d Merge tag 'block-ioctl-cleanup-5.6' into 5.6/scsi-queue

  Excluding Arnd's changes, this is mostly an update of the usual
  drivers: megaraid_sas, mpt3sas, qla2xxx, ufs, lpfc, hisi_sas.

  There are a couple of core and base updates around error propagation
  and atomicity in the attribute container base we use for the SCSI
  transport classes.

  The rest is minor changes and updates"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (149 commits)
  scsi: hisi_sas: Rename hisi_sas_cq.pci_irq_mask
  scsi: hisi_sas: Add prints for v3 hw interrupt converge and automatic affinity
  scsi: hisi_sas: Modify the file permissions of trigger_dump to write only
  scsi: hisi_sas: Replace magic number when handle channel interrupt
  scsi: hisi_sas: replace spin_lock_irqsave/spin_unlock_restore with spin_lock/spin_unlock
  scsi: hisi_sas: use threaded irq to process CQ interrupts
  scsi: ufs: Use UFS device indicated maximum LU number
  scsi: ufs: Add max_lu_supported in struct ufs_dev_info
  scsi: ufs: Delete is_init_prefetch from struct ufs_hba
  scsi: ufs: Inline two functions into their callers
  scsi: ufs: Move ufshcd_get_max_pwr_mode() to ufshcd_device_params_init()
  scsi: ufs: Split ufshcd_probe_hba() based on its called flow
  scsi: ufs: Delete struct ufs_dev_desc
  scsi: ufs: Fix ufshcd_probe_hba() reture value in case ufshcd_scsi_add_wlus() fails
  scsi: ufs-mediatek: enable low-power mode for hibern8 state
  scsi: ufs: export some functions for vendor usage
  scsi: ufs-mediatek: add dbg_register_dump implementation
  scsi: qla2xxx: Fix a NULL pointer dereference in an error path
  scsi: qla1280: Make checking for 64bit support consistent
  scsi: megaraid_sas: Update driver version to 07.713.01.00-rc1
  ...
2020-01-29 18:16:16 -08:00
Linus Torvalds
634cd4b6af Merge branch 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI updates from Ingo Molnar:
 "The main changes in this cycle were:

   - Cleanup of the GOP [graphics output] handling code in the EFI stub

   - Complete refactoring of the mixed mode handling in the x86 EFI stub

   - Overhaul of the x86 EFI boot/runtime code

   - Increase robustness for mixed mode code

   - Add the ability to disable DMA at the root port level in the EFI
     stub

   - Get rid of RWX mappings in the EFI memory map and page tables,
     where possible

   - Move the support code for the old EFI memory mapping style into its
     only user, the SGI UV1+ support code.

   - plus misc fixes, updates, smaller cleanups.

  ... and due to interactions with the RWX changes, another round of PAT
  cleanups make a guest appearance via the EFI tree - with no side
  effects intended"

* 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (75 commits)
  efi/x86: Disable instrumentation in the EFI runtime handling code
  efi/libstub/x86: Fix EFI server boot failure
  efi/x86: Disallow efi=old_map in mixed mode
  x86/boot/compressed: Relax sed symbol type regex for LLVM ld.lld
  efi/x86: avoid KASAN false positives when accessing the 1: 1 mapping
  efi: Fix handling of multiple efi_fake_mem= entries
  efi: Fix efi_memmap_alloc() leaks
  efi: Add tracking for dynamically allocated memmaps
  efi: Add a flags parameter to efi_memory_map
  efi: Fix comment for efi_mem_type() wrt absent physical addresses
  efi/arm: Defer probe of PCIe backed efifb on DT systems
  efi/x86: Limit EFI old memory map to SGI UV machines
  efi/x86: Avoid RWX mappings for all of DRAM
  efi/x86: Don't map the entire kernel text RW for mixed mode
  x86/mm: Fix NX bit clearing issue in kernel_map_pages_in_pgd
  efi/libstub/x86: Fix unused-variable warning
  efi/libstub/x86: Use mandatory 16-byte stack alignment in mixed mode
  efi/libstub/x86: Use const attribute for efi_is_64bit()
  efi: Allow disabling PCI busmastering on bridges during boot
  efi/x86: Allow translating 64-bit arguments for mixed mode calls
  ...
2020-01-28 09:03:40 -08:00
Ingo Molnar
57ad87ddce Merge branch 'x86/mm' into efi/core, to pick up dependencies
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-01-10 18:53:14 +01:00
Christoph Hellwig
4bdc0d676a remove ioremap_nocache and devm_ioremap_nocache
ioremap has provided non-cached semantics by default since the Linux 2.6
days, so remove the additional ioremap_nocache interface.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
2020-01-06 09:45:59 +01:00
Arnd Bergmann
202bf8d758 compat: provide compat_ptr() on all architectures
In order to avoid needless #ifdef CONFIG_COMPAT checks,
move the compat_ptr() definition to linux/compat.h
where it can be seen by any file regardless of the
architecture.

Only s390 needs a special definition, this can use the
self-#define trick we have elsewhere.

Reviewed-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-01-03 09:32:51 +01:00
Helge Deller
75cf979700 parisc: Fix compiler warnings in debug_core.c
Fix this compiler warning:
kernel/debug/debug_core.c: In function ‘kgdb_cpu_enter’:
arch/parisc/include/asm/cmpxchg.h:48:3: warning: value computed is not used [-Wunused-value]
   48 |  ((__typeof__(*(ptr)))__xchg((unsigned long)(x), (ptr), sizeof(*(ptr))))
arch/parisc/include/asm/atomic.h:78:30: note: in expansion of macro ‘xchg’
   78 | #define atomic_xchg(v, new) (xchg(&((v)->counter), new))
      |                              ^~~~
kernel/debug/debug_core.c:596:4: note: in expansion of macro ‘atomic_xchg’
  596 |    atomic_xchg(&kgdb_active, cpu);
      |    ^~~~~~~~~~~

Signed-off-by: Helge Deller <deller@gmx.de>
2019-12-20 21:01:42 +01:00
Sven Schnelle
e16260c21f parisc: fix compilation when KEXEC=n and KEXEC_FILE=y
Fix compilation when the CONFIG_KEXEC_FILE=y and
CONFIG_KEXEC=n.

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Helge Deller <deller@gmx.de>
2019-12-15 21:05:38 +01:00
Ingo Molnar
1f059dfdf5 mm/vmalloc: Add empty <asm/vmalloc.h> headers and use them from <linux/vmalloc.h>
In the x86 MM code we'd like to untangle various types of historic
header dependency spaghetti, but for this we'd need to pass to
the generic vmalloc code various vmalloc related defines that
customarily come via the <asm/page.h> low level arch header.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-12-10 10:12:55 +01:00
Mike Rapoport
d96885e277 parisc: use pgtable-nopXd instead of 4level-fixup
parisc has two or three levels of page tables and can use appropriate
pgtable-nopXd and folding of the upper layers.

Replace usage of include/asm-generic/4level-fixup.h and explicit
definitions of __PAGETABLE_PxD_FOLDED in parisc with
include/asm-generic/pgtable-nopmd.h for two-level configurations and
with include/asm-generic/pgtable-nopud.h for three-lelve configurations
and adjust page table manipulation macros and functions accordingly.

Link: http://lkml.kernel.org/r/1572938135-31886-9-git-send-email-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Helge Deller <deller@gmx.de>
Cc: Anatoly Pugachev <matorola@gmail.com>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Peter Rosin <peda@axentia.se>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rolf Eike Beer <eike-kernel@sf-tec.de>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Sam Creasey <sammy@sammy.net>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-12-04 19:44:15 -08:00
Masahiro Yamada
0fb9dc2867 arch: sembuf.h: make uapi asm/sembuf.h self-contained
Userspace cannot compile <asm/sembuf.h> due to some missing type
definitions.  For example, building it for x86 fails as follows:

    CC      usr/include/asm/sembuf.h.s
  In file included from <command-line>:32:0:
  usr/include/asm/sembuf.h:17:20: error: field `sem_perm' has incomplete type
    struct ipc64_perm sem_perm; /* permissions .. see ipc.h */
                      ^~~~~~~~
  usr/include/asm/sembuf.h:24:2: error: unknown type name `__kernel_time_t'
    __kernel_time_t sem_otime; /* last semop time */
    ^~~~~~~~~~~~~~~
  usr/include/asm/sembuf.h:25:2: error: unknown type name `__kernel_ulong_t'
    __kernel_ulong_t __unused1;
    ^~~~~~~~~~~~~~~~
  usr/include/asm/sembuf.h:26:2: error: unknown type name `__kernel_time_t'
    __kernel_time_t sem_ctime; /* last change time */
    ^~~~~~~~~~~~~~~
  usr/include/asm/sembuf.h:27:2: error: unknown type name `__kernel_ulong_t'
    __kernel_ulong_t __unused2;
    ^~~~~~~~~~~~~~~~
  usr/include/asm/sembuf.h:29:2: error: unknown type name `__kernel_ulong_t'
    __kernel_ulong_t sem_nsems; /* no. of semaphores in array */
    ^~~~~~~~~~~~~~~~
  usr/include/asm/sembuf.h:30:2: error: unknown type name `__kernel_ulong_t'
    __kernel_ulong_t __unused3;
    ^~~~~~~~~~~~~~~~
  usr/include/asm/sembuf.h:31:2: error: unknown type name `__kernel_ulong_t'
    __kernel_ulong_t __unused4;
    ^~~~~~~~~~~~~~~~

It is just a matter of missing include directive.

Include <asm/ipcbuf.h> to make it self-contained, and add it to
the compile-test coverage.

Link: http://lkml.kernel.org/r/20191030063855.9989-3-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-12-04 19:44:14 -08:00
Masahiro Yamada
9ef0e00418 arch: msgbuf.h: make uapi asm/msgbuf.h self-contained
Userspace cannot compile <asm/msgbuf.h> due to some missing type
definitions.  For example, building it for x86 fails as follows:

    CC      usr/include/asm/msgbuf.h.s
  In file included from usr/include/asm/msgbuf.h:6:0,
                   from <command-line>:32:
  usr/include/asm-generic/msgbuf.h:25:20: error: field `msg_perm' has incomplete type
    struct ipc64_perm msg_perm;
                      ^~~~~~~~
  usr/include/asm-generic/msgbuf.h:27:2: error: unknown type name `__kernel_time_t'
    __kernel_time_t msg_stime; /* last msgsnd time */
    ^~~~~~~~~~~~~~~
  usr/include/asm-generic/msgbuf.h:28:2: error: unknown type name `__kernel_time_t'
    __kernel_time_t msg_rtime; /* last msgrcv time */
    ^~~~~~~~~~~~~~~
  usr/include/asm-generic/msgbuf.h:29:2: error: unknown type name `__kernel_time_t'
    __kernel_time_t msg_ctime; /* last change time */
    ^~~~~~~~~~~~~~~
  usr/include/asm-generic/msgbuf.h:41:2: error: unknown type name `__kernel_pid_t'
    __kernel_pid_t msg_lspid; /* pid of last msgsnd */
    ^~~~~~~~~~~~~~
  usr/include/asm-generic/msgbuf.h:42:2: error: unknown type name `__kernel_pid_t'
    __kernel_pid_t msg_lrpid; /* last receive pid */
    ^~~~~~~~~~~~~~

It is just a matter of missing include directive.

Include <asm/ipcbuf.h> to make it self-contained, and add it to
the compile-test coverage.

Link: http://lkml.kernel.org/r/20191030063855.9989-2-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-12-04 19:44:14 -08:00
Linus Torvalds
ceb3074745 y2038: syscall implementation cleanups
This is a series of cleanups for the y2038 work, mostly intended
 for namespace cleaning: the kernel defines the traditional
 time_t, timeval and timespec types that often lead to y2038-unsafe
 code. Even though the unsafe usage is mostly gone from the kernel,
 having the types and associated functions around means that we
 can still grow new users, and that we may be missing conversions
 to safe types that actually matter.
 
 There are still a number of driver specific patches needed to
 get the last users of these types removed, those have been
 submitted to the respective maintainers.
 
 Link: https://lore.kernel.org/lkml/20191108210236.1296047-1-arnd@arndb.de/
 Signed-off-by: Arnd Bergmann <arnd@arndb.de>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJd3D+wAAoJEJpsee/mABjZfdcQAJvl6e+4ddKoDMIVJqVCE25N
 meFRgA7S8jy6BefEVeUgI8TxK+amGO36szMBUEnZxSSxq9u+gd13m5bEK6Xq/ov7
 4KTAiA3Irm/W5FBTktu1zc5ROIra1Xj7jLdubf8wEC3viSXIXB3+68Y28iBN7D2O
 k9kSpwINC5lWeC8guZy2I+2yc4ywUEXao9nVh8C/J+FQtU02TcdLtZop9OhpAa8u
 U19VVH3WHkQI7ZfLvBTUiYK6tlYTiYCnpr8l6sm850CnVv1fzBW+DzmVhPJ6FdFd
 4m5staC0sQ6gVqtjVMBOtT5CdzREse6hpwbKo2GRWFroO5W9tljMOJJXHvv/f6kz
 DxrpUmj37JuRbqAbr8KDmQqPo6M2CRkxFxjol1yh5ER63u1xMwLm/PQITZIMDvPO
 jrFc2C2SdM2E9bKP/RMCVoKSoRwxCJ5IwJ2AF237rrU0sx/zB2xsrOGssx5CWEgc
 3bbk6tDQujJJubnCfgRy1tTxpLZOHEEKw8YhFLLbR2LCtA9pA/0rfLLad16cjA5e
 5jIHxfsFc23zgpzrJeB7kAF/9xgu1tlA5BotOs3VBE89LtWOA9nK5dbPXng6qlUe
 er3xLCfS38ovhUw6DusQpaYLuaYuLM7DKO4iav9kuTMcY9GkbPk7vDD3KPGh2goy
 hY5cSM8+kT1q/THLnUBH
 =Bdbv
 -----END PGP SIGNATURE-----

Merge tag 'y2038-cleanups-5.5' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground

Pull y2038 cleanups from Arnd Bergmann:
 "y2038 syscall implementation cleanups

  This is a series of cleanups for the y2038 work, mostly intended for
  namespace cleaning: the kernel defines the traditional time_t, timeval
  and timespec types that often lead to y2038-unsafe code. Even though
  the unsafe usage is mostly gone from the kernel, having the types and
  associated functions around means that we can still grow new users,
  and that we may be missing conversions to safe types that actually
  matter.

  There are still a number of driver specific patches needed to get the
  last users of these types removed, those have been submitted to the
  respective maintainers"

Link: https://lore.kernel.org/lkml/20191108210236.1296047-1-arnd@arndb.de/

* tag 'y2038-cleanups-5.5' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground: (26 commits)
  y2038: alarm: fix half-second cut-off
  y2038: ipc: fix x32 ABI breakage
  y2038: fix typo in powerpc vdso "LOPART"
  y2038: allow disabling time32 system calls
  y2038: itimer: change implementation to timespec64
  y2038: move itimer reset into itimer.c
  y2038: use compat_{get,set}_itimer on alpha
  y2038: itimer: compat handling to itimer.c
  y2038: time: avoid timespec usage in settimeofday()
  y2038: timerfd: Use timespec64 internally
  y2038: elfcore: Use __kernel_old_timeval for process times
  y2038: make ns_to_compat_timeval use __kernel_old_timeval
  y2038: socket: use __kernel_old_timespec instead of timespec
  y2038: socket: remove timespec reference in timestamping
  y2038: syscalls: change remaining timeval to __kernel_old_timeval
  y2038: rusage: use __kernel_old_timeval
  y2038: uapi: change __kernel_time_t to __kernel_old_time_t
  y2038: stat: avoid 'time_t' in 'struct stat'
  y2038: ipc: remove __kernel_time_t reference from headers
  y2038: vdso: powerpc: avoid timespec references
  ...
2019-12-01 14:00:59 -08:00
Linus Torvalds
738d5fabff Merge branch 'parisc-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
 "Just trivial small updates: An assembler register optimization in the
  inlined networking checksum functions, a compiler warning fix and
  don't unneccesary print a runtime warning on machines which wouldn't
  be affected anyway"

* 'parisc-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Avoid spurious inequivalent alias kernel error messages
  kexec: Fix pointer-to-int-cast warnings
  parisc: Do not hardcode registers in checksum functions
2019-11-30 14:45:32 -08:00
Arnd Bergmann
caf5e32d4e y2038: ipc: remove __kernel_time_t reference from headers
There are two structures based on time_t that conflict between libc and
kernel: timeval and timespec. Both are now renamed to __kernel_old_timeval
and __kernel_old_timespec.

For time_t, the old typedef is still __kernel_time_t. There is nothing
wrong with that name, but it would be nice to not use that going forward
as this type is used almost only in deprecated interfaces because of
the y2038 overflow.

In the IPC headers (msgbuf.h, sembuf.h, shmbuf.h), __kernel_time_t is only
used for the 64-bit variants, which are not deprecated.

Change these to a plain 'long', which is the same type as __kernel_time_t
on all 64-bit architectures anyway, to reduce the number of users of the
old type.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-11-15 14:38:28 +01:00
Christoph Hellwig
a1fd79ad0d parisc: remove __ioremap
__ioremap is always called with the _PAGE_NO_CACHE, so fold the whole
thing and rename it to ioremap.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Helge Deller <deller@gmx.de>
2019-11-11 17:19:49 +01:00
Nick Desaulniers
0703ad217e parisc: prefer __section from compiler_attributes.h
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2019-10-14 21:44:07 +02:00
Helge Deller
52b2d91752 parisc: Do not hardcode registers in checksum functions
Do not hardcode processor registers r19 to r22 as scratch registers.
Instead let the compiler decide, which may give better optimization
results when the functions get inlined.

Signed-off-by: Helge Deller <deller@gmx.de>
2019-10-14 21:40:14 +02:00