Commit Graph

2813 Commits

Author SHA1 Message Date
Linus Torvalds
8c94ccc7cd USB / Thunderbolt changes for 6.8-rc1
Here is the big set of USB and Thunderbolt changes for 6.8-rc1.
 Included in here are the following:
   - Thunderbolt subsystem and driver updates for USB 4 hardware and
     issues reported by real devices
   - xhci driver updates
   - dwc3 driver updates
   - uvc_video gadget driver updates
   - typec driver updates
   - gadget string functions cleaned up
   - other small changes
 
 All of these have been in the linux-next tree for a while with no
 reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZaedng8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yndHACfX3SA2ipK5umpMsWOoLMCBV6VyrwAn3t+FPd/
 z4mNiCuNUhbEnU7RinK0
 =k/E9
 -----END PGP SIGNATURE-----

Merge tag 'usb-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB / Thunderbolt updates from Greg KH:
 "Here is the big set of USB and Thunderbolt changes for 6.8-rc1.
  Included in here are the following:

   - Thunderbolt subsystem and driver updates for USB 4 hardware and
     issues reported by real devices

   - xhci driver updates

   - dwc3 driver updates

   - uvc_video gadget driver updates

   - typec driver updates

   - gadget string functions cleaned up

   - other small changes

  All of these have been in the linux-next tree for a while with no
  reported issues"

* tag 'usb-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (169 commits)
  usb: typec: tipd: fix use of device-specific init function
  usb: typec: tipd: Separate reset for TPS6598x
  usb: mon: Fix atomicity violation in mon_bin_vma_fault
  usb: gadget: uvc: Remove nested locking
  usb: gadget: uvc: Fix use are free during STREAMOFF
  usb: typec: class: fix typec_altmode_put_partner to put plugs
  dt-bindings: usb: dwc3: Limit num-hc-interrupters definition
  dt-bindings: usb: xhci: Add num-hc-interrupters definition
  xhci: add support to allocate several interrupters
  USB: core: Use device_driver directly in struct usb_driver and usb_device_driver
  arm64: dts: mediatek: mt8195: Add 'rx-fifo-depth' for cherry
  usb: xhci-mtk: fix a short packet issue of gen1 isoc-in transfer
  dt-bindings: usb: mtk-xhci: add a property for Gen1 isoc-in transfer issue
  arm64: dts: qcom: msm8996: Remove PNoC clock from MSS
  arm64: dts: qcom: msm8996: Remove AGGRE2 clock from SLPI
  arm64: dts: qcom: msm8998: Remove AGGRE2 clock from SLPI
  arm64: dts: qcom: msm8939: Drop RPM bus clocks
  arm64: dts: qcom: sdm630: Drop RPM bus clocks
  arm64: dts: qcom: qcs404: Drop RPM bus clocks
  arm64: dts: qcom: msm8996: Drop RPM bus clocks
  ...
2024-01-18 11:43:55 -08:00
Linus Torvalds
bd736f38c0 TTY/Serial changes for 6.8-rc1
Here is the big set of tty and serial driver changes for 6.8-rc1.
 
 As usual, Jiri has a bunch of refactoring and cleanups for the tty core
 and drivers in here, along with the usual set of rs485 updates (someday
 this might work properly...)  Along with those, in here are changes for:
   - sc16is7xx serial driver updates
   - platform driver removal api updates
   - amba-pl011 driver updates
   - tty driver binding updates
   - other small tty/serial driver updates and changes
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZaeUaw8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykyOgCgp1uhP/b9iW6qM7qL6OYEG6idI0kAnj0VASNm
 vSI69HmdKKwo69YLOSBp
 =14n1
 -----END PGP SIGNATURE-----

Merge tag 'tty-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty / serial updates from Greg KH:
 "Here is the big set of tty and serial driver changes for 6.8-rc1.

  As usual, Jiri has a bunch of refactoring and cleanups for the tty
  core and drivers in here, along with the usual set of rs485 updates
  (someday this might work properly...)

  Along with those, in here are changes for:

   - sc16is7xx serial driver updates

   - platform driver removal api updates

   - amba-pl011 driver updates

   - tty driver binding updates

   - other small tty/serial driver updates and changes

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'tty-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (197 commits)
  serial: sc16is7xx: refactor EFR lock
  serial: sc16is7xx: reorder code to remove prototype declarations
  serial: sc16is7xx: refactor FIFO access functions to increase commonality
  serial: sc16is7xx: drop unneeded MODULE_ALIAS
  serial: sc16is7xx: replace hardcoded divisor value with BIT() macro
  serial: sc16is7xx: add explicit return for some switch default cases
  serial: sc16is7xx: add macro for max number of UART ports
  serial: sc16is7xx: add driver name to struct uart_driver
  serial: sc16is7xx: use i2c_get_match_data()
  serial: sc16is7xx: use spi_get_device_match_data()
  serial: sc16is7xx: use DECLARE_BITMAP for sc16is7xx_lines bitfield
  serial: sc16is7xx: improve do/while loop in sc16is7xx_irq()
  serial: sc16is7xx: remove obsolete loop in sc16is7xx_port_irq()
  serial: sc16is7xx: set safe default SPI clock frequency
  serial: sc16is7xx: add check for unsupported SPI modes during probe
  serial: sc16is7xx: fix invalid sc16is7xx_lines bitfield in case of probe error
  serial: 8250_exar: Set missing rs485_supported flag
  serial: omap: do not override settings for RS485 support
  serial: core, imx: do not set RS485 enabled if it is not supported
  serial: core: make sure RS485 cannot be enabled when it is not supported
  ...
2024-01-18 11:37:24 -08:00
Linus Torvalds
80955ae955 Driver core changes for 6.8-rc1
Here are the set of driver core and kernfs changes for 6.8-rc1.  Nothing
 major in here this release cycle, just lots of small cleanups and some
 tweaks on kernfs that in the very end, got reverted and will come back
 in a safer way next release cycle.
 
 Included in here are:
   - more driver core 'const' cleanups and fixes
   - fw_devlink=rpm is now the default behavior
   - kernfs tiny changes to remove some string functions
   - cpu handling in the driver core is updated to work better on many
     systems that add topologies and cpus after booting
   - other minor changes and cleanups
 
 All of the cpu handling patches have been acked by the respective
 maintainers and are coming in here in one series.  Everything has been
 in linux-next for a while with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZaeOrg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymtcwCffzvKKkSY9qAp6+0v2WQNkZm1JWoAoJCPYUwF
 If6wEoPLWvRfKx4gIoq9
 =D96r
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here are the set of driver core and kernfs changes for 6.8-rc1.
  Nothing major in here this release cycle, just lots of small cleanups
  and some tweaks on kernfs that in the very end, got reverted and will
  come back in a safer way next release cycle.

  Included in here are:

   - more driver core 'const' cleanups and fixes

   - fw_devlink=rpm is now the default behavior

   - kernfs tiny changes to remove some string functions

   - cpu handling in the driver core is updated to work better on many
     systems that add topologies and cpus after booting

   - other minor changes and cleanups

  All of the cpu handling patches have been acked by the respective
  maintainers and are coming in here in one series. Everything has been
  in linux-next for a while with no reported issues"

* tag 'driver-core-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (51 commits)
  Revert "kernfs: convert kernfs_idr_lock to an irq safe raw spinlock"
  kernfs: convert kernfs_idr_lock to an irq safe raw spinlock
  class: fix use-after-free in class_register()
  PM: clk: make pm_clk_add_notifier() take a const pointer
  EDAC: constantify the struct bus_type usage
  kernfs: fix reference to renamed function
  driver core: device.h: fix Excess kernel-doc description warning
  driver core: class: fix Excess kernel-doc description warning
  driver core: mark remaining local bus_type variables as const
  driver core: container: make container_subsys const
  driver core: bus: constantify subsys_register() calls
  driver core: bus: make bus_sort_breadthfirst() take a const pointer
  kernfs: d_obtain_alias(NULL) will do the right thing...
  driver core: Better advertise dev_err_probe()
  kernfs: Convert kernfs_path_from_node_locked() from strlcpy() to strscpy()
  kernfs: Convert kernfs_name_locked() from strlcpy() to strscpy()
  kernfs: Convert kernfs_walk_ns() from strlcpy() to strscpy()
  initramfs: Expose retained initrd as sysfs file
  fs/kernfs/dir: obey S_ISGID
  kernel/cgroup: use kernfs_create_dir_ns()
  ...
2024-01-18 09:48:40 -08:00
Linus Torvalds
7b5bcf9b84 More power management updates for 6.8-rc1
- Restore the system-wide asynchronous device resume optimization
    removed by a recent concurrency fix (Rafael J. Wysocki).
 
  - Make the intel_pstate cpufreq driver allow Meteor Lake systems to run
    at somewhat higher frequencies (Srinivas Pandruvada).
 
  - Make the PM QoS core code use kcalloc() for array allocation (Erick
    Archer).
 
  - Fix two PM-related typos in admin-guide (Erwan Velu).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmWmbv8SHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRx1WAQAIhCk6dhAy24waU8yIq+0vIxQgHNYhzq
 ueo7Z2t+FcJbDmw4iesKVYvIUjMPntimzZb1lprUEqPS7HtRWmYEvh8WGSYPSKLH
 PVS0F/AyIjyKP8+TKpstAKuRAwyeugWeUC1Gtnl3tKXIO4QgGL4NLC1makxy+gWq
 jOz56wy7eyUiTblZBHcUjDHN0IUYLMkjZqmcYYAZ3IgwdPsXK4EdrjRqFBLwuvp4
 Wm7Jbl6dM143vPWGAj8EtWo/e0bcIvhTOIghfR+dasTHtE9JABx94EicwQDzP1xl
 7ZmoML2M6z6CO6cKt2w4udIIjSQDqnxfNIcmcqo3o5jb9IGiXQQad25mlWNdjD2d
 iJDUUYo5VZHj6pIs+cI3dHLO1lELjylGY4hJ1DEMYnzd7GgfJlYEHiNJUS1QBjtF
 nRXbv4gioKjsbrEhyBpV/fV2pM+o5gcz8iSGCT8njoq728XDsDJVspCeC0MLbkp4
 GjNs18TxdyC5IO0sY1WC+ahWGN9igqAAlCrZeKfL+AG6dDOYV3rE4Ga9bIlI6EeM
 1aymAzoEp4F1hTzBsuz3G+p+tLUjFIaRWao6K20Y8MsInnZmzogWCwH7pndXYPCu
 AeiqfP6m4ChqdtwwrdRLVWT+xjllF9k7CcTMC0odxqxTq7tcvSyO/iOFQRMdRxRR
 FlGklog4pI8z
 =N3eJ
 -----END PGP SIGNATURE-----

Merge tag 'pm-6.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more power management updates from Rafael Wysocki:
 "These restore the asynchronous device resume optimization removed by
  the previous PM merge, make the intel_pstate driver work better on
  Meteor Lake systems, optimize the PM QoS core code slightly and fix up
  typos in admin-guide.

  Specifics:

   - Restore the system-wide asynchronous device resume optimization
     removed by a recent concurrency fix (Rafael J. Wysocki)

   - Make the intel_pstate cpufreq driver allow Meteor Lake systems to
     run at somewhat higher frequencies (Srinivas Pandruvada)

   - Make the PM QoS core code use kcalloc() for array allocation (Erick
     Archer)

   - Fix two PM-related typos in admin-guide (Erwan Velu)"

* tag 'pm-6.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: sleep: Restore asynchronous device resume optimization
  Documentation: admin-guide: PM: Fix two typos
  cpufreq: intel_pstate: Update hybrid scaling factor for Meteor Lake
  PM: QoS: Use kcalloc() instead of kzalloc()
2024-01-17 14:07:14 -08:00
Linus Torvalds
09d1c6a80f Generic:
- Use memdup_array_user() to harden against overflow.
 
 - Unconditionally advertise KVM_CAP_DEVICE_CTRL for all architectures.
 
 - Clean up Kconfigs that all KVM architectures were selecting
 
 - New functionality around "guest_memfd", a new userspace API that
   creates an anonymous file and returns a file descriptor that refers
   to it.  guest_memfd files are bound to their owning virtual machine,
   cannot be mapped, read, or written by userspace, and cannot be resized.
   guest_memfd files do however support PUNCH_HOLE, which can be used to
   switch a memory area between guest_memfd and regular anonymous memory.
 
 - New ioctl KVM_SET_MEMORY_ATTRIBUTES allowing userspace to specify
   per-page attributes for a given page of guest memory; right now the
   only attribute is whether the guest expects to access memory via
   guest_memfd or not, which in Confidential SVMs backed by SEV-SNP,
   TDX or ARM64 pKVM is checked by firmware or hypervisor that guarantees
   confidentiality (AMD PSP, Intel TDX module, or EL2 in the case of pKVM).
 
 x86:
 
 - Support for "software-protected VMs" that can use the new guest_memfd
   and page attributes infrastructure.  This is mostly useful for testing,
   since there is no pKVM-like infrastructure to provide a meaningfully
   reduced TCB.
 
 - Fix a relatively benign off-by-one error when splitting huge pages during
   CLEAR_DIRTY_LOG.
 
 - Fix a bug where KVM could incorrectly test-and-clear dirty bits in non-leaf
   TDP MMU SPTEs if a racing thread replaces a huge SPTE with a non-huge SPTE.
 
 - Use more generic lockdep assertions in paths that don't actually care
   about whether the caller is a reader or a writer.
 
 - let Xen guests opt out of having PV clock reported as "based on a stable TSC",
   because some of them don't expect the "TSC stable" bit (added to the pvclock
   ABI by KVM, but never set by Xen) to be set.
 
 - Revert a bogus, made-up nested SVM consistency check for TLB_CONTROL.
 
 - Advertise flush-by-ASID support for nSVM unconditionally, as KVM always
   flushes on nested transitions, i.e. always satisfies flush requests.  This
   allows running bleeding edge versions of VMware Workstation on top of KVM.
 
 - Sanity check that the CPU supports flush-by-ASID when enabling SEV support.
 
 - On AMD machines with vNMI, always rely on hardware instead of intercepting
   IRET in some cases to detect unmasking of NMIs
 
 - Support for virtualizing Linear Address Masking (LAM)
 
 - Fix a variety of vPMU bugs where KVM fail to stop/reset counters and other state
   prior to refreshing the vPMU model.
 
 - Fix a double-overflow PMU bug by tracking emulated counter events using a
   dedicated field instead of snapshotting the "previous" counter.  If the
   hardware PMC count triggers overflow that is recognized in the same VM-Exit
   that KVM manually bumps an event count, KVM would pend PMIs for both the
   hardware-triggered overflow and for KVM-triggered overflow.
 
 - Turn off KVM_WERROR by default for all configs so that it's not
   inadvertantly enabled by non-KVM developers, which can be problematic for
   subsystems that require no regressions for W=1 builds.
 
 - Advertise all of the host-supported CPUID bits that enumerate IA32_SPEC_CTRL
   "features".
 
 - Don't force a masterclock update when a vCPU synchronizes to the current TSC
   generation, as updating the masterclock can cause kvmclock's time to "jump"
   unexpectedly, e.g. when userspace hotplugs a pre-created vCPU.
 
 - Use RIP-relative address to read kvm_rebooting in the VM-Enter fault paths,
   partly as a super minor optimization, but mostly to make KVM play nice with
   position independent executable builds.
 
 - Guard KVM-on-HyperV's range-based TLB flush hooks with an #ifdef on
   CONFIG_HYPERV as a minor optimization, and to self-document the code.
 
 - Add CONFIG_KVM_HYPERV to allow disabling KVM support for HyperV "emulation"
   at build time.
 
 ARM64:
 
 - LPA2 support, adding 52bit IPA/PA capability for 4kB and 16kB
   base granule sizes. Branch shared with the arm64 tree.
 
 - Large Fine-Grained Trap rework, bringing some sanity to the
   feature, although there is more to come. This comes with
   a prefix branch shared with the arm64 tree.
 
 - Some additional Nested Virtualization groundwork, mostly
   introducing the NV2 VNCR support and retargetting the NV
   support to that version of the architecture.
 
 - A small set of vgic fixes and associated cleanups.
 
 Loongarch:
 
 - Optimization for memslot hugepage checking
 
 - Cleanup and fix some HW/SW timer issues
 
 - Add LSX/LASX (128bit/256bit SIMD) support
 
 RISC-V:
 
 - KVM_GET_REG_LIST improvement for vector registers
 
 - Generate ISA extension reg_list using macros in get-reg-list selftest
 
 - Support for reporting steal time along with selftest
 
 s390:
 
 - Bugfixes
 
 Selftests:
 
 - Fix an annoying goof where the NX hugepage test prints out garbage
   instead of the magic token needed to run the test.
 
 - Fix build errors when a header is delete/moved due to a missing flag
   in the Makefile.
 
 - Detect if KVM bugged/killed a selftest's VM and print out a helpful
   message instead of complaining that a random ioctl() failed.
 
 - Annotate the guest printf/assert helpers with __printf(), and fix the
   various bugs that were lurking due to lack of said annotation.
 
 There are two non-KVM patches buried in the middle of guest_memfd support:
 
   fs: Rename anon_inode_getfile_secure() and anon_inode_getfd_secure()
   mm: Add AS_UNMOVABLE to mark mapping as completely unmovable
 
 The first is small and mostly suggested-by Christian Brauner; the second
 a bit less so but it was written by an mm person (Vlastimil Babka).
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmWcMWkUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroO15gf/WLmmg3SET6Uzw9iEq2xo28831ZA+
 6kpILfIDGKozV5safDmMvcInlc/PTnqOFrsKyyN4kDZ+rIJiafJdg/loE0kPXBML
 wdR+2ix5kYI1FucCDaGTahskBDz8Lb/xTpwGg9BFLYFNmuUeHc74o6GoNvr1uliE
 4kLZL2K6w0cSMPybUD+HqGaET80ZqPwecv+s1JL+Ia0kYZJONJifoHnvOUJ7DpEi
 rgudVdgzt3EPjG0y1z6MjvDBXTCOLDjXajErlYuZD3Ej8N8s59Dh2TxOiDNTLdP4
 a4zjRvDmgyr6H6sz+upvwc7f4M4p+DBvf+TkWF54mbeObHUYliStqURIoA==
 =66Ws
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "Generic:

   - Use memdup_array_user() to harden against overflow.

   - Unconditionally advertise KVM_CAP_DEVICE_CTRL for all
     architectures.

   - Clean up Kconfigs that all KVM architectures were selecting

   - New functionality around "guest_memfd", a new userspace API that
     creates an anonymous file and returns a file descriptor that refers
     to it. guest_memfd files are bound to their owning virtual machine,
     cannot be mapped, read, or written by userspace, and cannot be
     resized. guest_memfd files do however support PUNCH_HOLE, which can
     be used to switch a memory area between guest_memfd and regular
     anonymous memory.

   - New ioctl KVM_SET_MEMORY_ATTRIBUTES allowing userspace to specify
     per-page attributes for a given page of guest memory; right now the
     only attribute is whether the guest expects to access memory via
     guest_memfd or not, which in Confidential SVMs backed by SEV-SNP,
     TDX or ARM64 pKVM is checked by firmware or hypervisor that
     guarantees confidentiality (AMD PSP, Intel TDX module, or EL2 in
     the case of pKVM).

  x86:

   - Support for "software-protected VMs" that can use the new
     guest_memfd and page attributes infrastructure. This is mostly
     useful for testing, since there is no pKVM-like infrastructure to
     provide a meaningfully reduced TCB.

   - Fix a relatively benign off-by-one error when splitting huge pages
     during CLEAR_DIRTY_LOG.

   - Fix a bug where KVM could incorrectly test-and-clear dirty bits in
     non-leaf TDP MMU SPTEs if a racing thread replaces a huge SPTE with
     a non-huge SPTE.

   - Use more generic lockdep assertions in paths that don't actually
     care about whether the caller is a reader or a writer.

   - let Xen guests opt out of having PV clock reported as "based on a
     stable TSC", because some of them don't expect the "TSC stable" bit
     (added to the pvclock ABI by KVM, but never set by Xen) to be set.

   - Revert a bogus, made-up nested SVM consistency check for
     TLB_CONTROL.

   - Advertise flush-by-ASID support for nSVM unconditionally, as KVM
     always flushes on nested transitions, i.e. always satisfies flush
     requests. This allows running bleeding edge versions of VMware
     Workstation on top of KVM.

   - Sanity check that the CPU supports flush-by-ASID when enabling SEV
     support.

   - On AMD machines with vNMI, always rely on hardware instead of
     intercepting IRET in some cases to detect unmasking of NMIs

   - Support for virtualizing Linear Address Masking (LAM)

   - Fix a variety of vPMU bugs where KVM fail to stop/reset counters
     and other state prior to refreshing the vPMU model.

   - Fix a double-overflow PMU bug by tracking emulated counter events
     using a dedicated field instead of snapshotting the "previous"
     counter. If the hardware PMC count triggers overflow that is
     recognized in the same VM-Exit that KVM manually bumps an event
     count, KVM would pend PMIs for both the hardware-triggered overflow
     and for KVM-triggered overflow.

   - Turn off KVM_WERROR by default for all configs so that it's not
     inadvertantly enabled by non-KVM developers, which can be
     problematic for subsystems that require no regressions for W=1
     builds.

   - Advertise all of the host-supported CPUID bits that enumerate
     IA32_SPEC_CTRL "features".

   - Don't force a masterclock update when a vCPU synchronizes to the
     current TSC generation, as updating the masterclock can cause
     kvmclock's time to "jump" unexpectedly, e.g. when userspace
     hotplugs a pre-created vCPU.

   - Use RIP-relative address to read kvm_rebooting in the VM-Enter
     fault paths, partly as a super minor optimization, but mostly to
     make KVM play nice with position independent executable builds.

   - Guard KVM-on-HyperV's range-based TLB flush hooks with an #ifdef on
     CONFIG_HYPERV as a minor optimization, and to self-document the
     code.

   - Add CONFIG_KVM_HYPERV to allow disabling KVM support for HyperV
     "emulation" at build time.

  ARM64:

   - LPA2 support, adding 52bit IPA/PA capability for 4kB and 16kB base
     granule sizes. Branch shared with the arm64 tree.

   - Large Fine-Grained Trap rework, bringing some sanity to the
     feature, although there is more to come. This comes with a prefix
     branch shared with the arm64 tree.

   - Some additional Nested Virtualization groundwork, mostly
     introducing the NV2 VNCR support and retargetting the NV support to
     that version of the architecture.

   - A small set of vgic fixes and associated cleanups.

  Loongarch:

   - Optimization for memslot hugepage checking

   - Cleanup and fix some HW/SW timer issues

   - Add LSX/LASX (128bit/256bit SIMD) support

  RISC-V:

   - KVM_GET_REG_LIST improvement for vector registers

   - Generate ISA extension reg_list using macros in get-reg-list
     selftest

   - Support for reporting steal time along with selftest

  s390:

   - Bugfixes

  Selftests:

   - Fix an annoying goof where the NX hugepage test prints out garbage
     instead of the magic token needed to run the test.

   - Fix build errors when a header is delete/moved due to a missing
     flag in the Makefile.

   - Detect if KVM bugged/killed a selftest's VM and print out a helpful
     message instead of complaining that a random ioctl() failed.

   - Annotate the guest printf/assert helpers with __printf(), and fix
     the various bugs that were lurking due to lack of said annotation"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (185 commits)
  x86/kvm: Do not try to disable kvmclock if it was not enabled
  KVM: x86: add missing "depends on KVM"
  KVM: fix direction of dependency on MMU notifiers
  KVM: introduce CONFIG_KVM_COMMON
  KVM: arm64: Add missing memory barriers when switching to pKVM's hyp pgd
  KVM: arm64: vgic-its: Avoid potential UAF in LPI translation cache
  RISC-V: KVM: selftests: Add get-reg-list test for STA registers
  RISC-V: KVM: selftests: Add steal_time test support
  RISC-V: KVM: selftests: Add guest_sbi_probe_extension
  RISC-V: KVM: selftests: Move sbi_ecall to processor.c
  RISC-V: KVM: Implement SBI STA extension
  RISC-V: KVM: Add support for SBI STA registers
  RISC-V: KVM: Add support for SBI extension registers
  RISC-V: KVM: Add SBI STA info to vcpu_arch
  RISC-V: KVM: Add steal-update vcpu request
  RISC-V: KVM: Add SBI STA extension skeleton
  RISC-V: paravirt: Implement steal-time support
  RISC-V: Add SBI STA extension definitions
  RISC-V: paravirt: Add skeleton for pv-time support
  RISC-V: KVM: Fix indentation in kvm_riscv_vcpu_set_reg_csr()
  ...
2024-01-17 13:03:37 -08:00
Linus Torvalds
1b1934dbbd A handful of late-arriving documentation fixes.
-----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmWoABAPHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5Y+yAH/2YPZFKa+QzzYE6xbQnjPErPnGl5Ubdaem3q
 PODmp5DdIqnVRz8eEHY0h4Y9676RCzXg8aH6H+C5zkKJSof/Z7KKpQjmWTBnr30z
 QUXgcyxG+rTdZezZG8PKZVhZl7j8YX5ln3i4zR4g0MeaFpxiROrfX22jrnT2fqG4
 qkoenoZPwCZsrRP4qo7kDKPyfV8yupgjJ8uDcua7e5/5lSGT5siGVitVD13lcMXo
 bO/Tdhr2w09S898nZJSEZIP8SvTA1Rjhd0xmHRSaiNjQV/qMU5ZAtaukuBkQGJpY
 FYP4enQGefBk2hJ92gm5yg0Dv8GSeC3i0aKjhomrvnpu4cVvhxc=
 =DxUH
 -----END PGP SIGNATURE-----

Merge tag 'docs-6.8-2' of git://git.lwn.net/linux

Pull documentation fixes from Jonathan Corbet:
 "A handful of late-arriving documentation fixes"

* tag 'docs-6.8-2' of git://git.lwn.net/linux:
  docs, kprobes: Add loongarch as supported architecture
  docs, kprobes: Update email address of Masami Hiramatsu
  docs: admin-guide: hw_random: update rng-tools website
  Documentation/core-api: fix spelling mistake in workqueue
  docs: kernel_feat.py: fix potential command injection
  Documentation: constrain alabaster package to older versions
2024-01-17 11:49:11 -08:00
Youling Tang
78de91b458 LoongArch: Use generic interface to support crashkernel=X,[high,low]
LoongArch already supports two crashkernel regions in kexec-tools, so we
can directly use the common interface to support crashkernel=X,[high,low]
after commit 0ab97169aa ("crash_core: add generic function to do
reservation").

With the help of newly changed function parse_crashkernel() and generic
reserve_crashkernel_generic(), crashkernel reservation can be simplified
by steps:

1) Add a new header file <asm/crash_core.h>, then define CRASH_ALIGN,
   CRASH_ADDR_LOW_MAX and CRASH_ADDR_HIGH_MAX and in <asm/crash_core.h>;

2) Add arch_reserve_crashkernel() to call parse_crashkernel() and
   reserve_crashkernel_generic();

3) Add ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION Kconfig in
   arch/loongarch/Kconfig.

One can reserve the crash kernel from high memory above DMA zone range
by explicitly passing "crashkernel=X,high"; or reserve a memory range
below 4G with "crashkernel=X,low". Besides, there are few rules need to
take notice:

1) "crashkernel=X,[high,low]" will be ignored if "crashkernel=size" is
   specified.
2) "crashkernel=X,low" is valid only when "crashkernel=X,high" is passed
   and there is enough memory to be allocated under 4G.
3) When allocating crashkernel above 4G and no "crashkernel=X,low" is
   specified, a 128M low memory will be allocated automatically for
   swiotlb bounce buffer.
See Documentation/admin-guide/kernel-parameters.txt for more information.

Following test cases have been performed as expected:
1) crashkernel=256M                          //low=256M
2) crashkernel=1G                            //low=1G
3) crashkernel=4G                            //high=4G, low=128M(default)
4) crashkernel=4G crashkernel=256M,high      //high=4G, low=128M(default), high is ignored
5) crashkernel=4G crashkernel=256M,low       //high=4G, low=128M(default), low is ignored
6) crashkernel=4G,high                       //high=4G, low=128M(default)
7) crashkernel=256M,low                      //low=0M, invalid
8) crashkernel=4G,high crashkernel=256M,low  //high=4G, low=256M
9) crashkernel=4G,high crashkernel=4G,low    //high=0M, low=0M, invalid
10) crashkernel=512M@2560M                   //low=512M
11) crashkernel=1G,high crashkernel=0M,low   //high=1G, low=0M

Recommended usage in general:
1) In the case of small memory: crashkernel=512M
2) In the case of large memory: crashkernel=1024M,high crashkernel=128M,low

Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-01-17 12:43:08 +08:00
Rafael J. Wysocki
9223614ea7 Merge branches 'pm-sleep', 'pm-cpufreq' and 'pm-qos' into pm
* pm-sleep:
  PM: sleep: Restore asynchronous device resume optimization

* pm-cpufreq:
  Documentation: admin-guide: PM: Fix two typos
  cpufreq: intel_pstate: Update hybrid scaling factor for Meteor Lake

* pm-qos:
  PM: QoS: Use kcalloc() instead of kzalloc()
2024-01-16 12:23:24 +01:00
Linus Torvalds
23a80d462c RCU pull request for v6.8
This pull request contains the following branches:
 
 doc.2023.12.13a: Documentation and comment updates.
 
 torture.2023.11.23a: RCU torture, locktorture updates that include
         cleanups; nolibc init build support for mips, ppc and rv64;
         testing of mid stall duration scenario and fixing fqs task
         creation conditions.
 
 fixes.2023.12.13a: Misc fixes, most notably restricting usage of
         RCU CPU stall notifiers, to confine their usage primarily
         to debug kernels.
 
 rcu-tasks.2023.12.12b: RCU tasks minor fixes.
 
 srcu.2023.12.13a: lockdep annotation fix for NMI-safe accesses,
         callback advancing/acceleration cleanup and documentation
         improvements.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSi2tPIQIc2VEtjarIAHS7/6Z0wpQUCZYUS0AAKCRAAHS7/6Z0w
 pRXgAQD+k8oqjvKL6la61ppWm5Y7NLjdj/IbV+cOd42jKnM6PAEAyavNhX0n7zGx
 o9cDlvIDxJfHnFrOTc5WLH9yEs3IiQQ=
 =8rdu
 -----END PGP SIGNATURE-----

Merge tag 'rcu.release.v6.8' of https://github.com/neeraju/linux

Pull RCU updates from Neeraj Upadhyay:

 - Documentation and comment updates

 - RCU torture, locktorture updates that include cleanups; nolibc init
   build support for mips, ppc and rv64; testing of mid stall duration
   scenario and fixing fqs task creation conditions

 - Misc fixes, most notably restricting usage of RCU CPU stall
   notifiers, to confine their usage primarily to debug kernels

 - RCU tasks minor fixes

 - lockdep annotation fix for NMI-safe accesses, callback
   advancing/acceleration cleanup and documentation improvements

* tag 'rcu.release.v6.8' of https://github.com/neeraju/linux:
  rcu: Force quiescent states only for ongoing grace period
  doc: Clarify historical disclaimers in memory-barriers.txt
  doc: Mention address and data dependencies in rcu_dereference.rst
  doc: Clarify RCU Tasks reader/updater checklist
  rculist.h: docs: Fix wrong function summary
  Documentation: RCU: Remove repeated word in comments
  srcu: Use try-lock lockdep annotation for NMI-safe access.
  srcu: Explain why callbacks invocations can't run concurrently
  srcu: No need to advance/accelerate if no callback enqueued
  srcu: Remove superfluous callbacks advancing from srcu_gp_start()
  rcu: Remove unused macros from rcupdate.h
  rcu: Restrict access to RCU CPU stall notifiers
  rcu-tasks: Mark RCU Tasks accesses to current->rcu_tasks_idle_cpu
  rcutorture: Add fqs_holdoff check before fqs_task is created
  rcutorture: Add mid-sized stall to TREE07
  rcutorture: add nolibc init support for mips, ppc and rv64
  locktorture: Increase Hamming distance between call_rcu_chain and rcu_call_chains
2024-01-12 16:35:58 -08:00
Linus Torvalds
61da593f44 media updates for v6.8-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+QmuaPwR3wnBdVwACF8+vY7k4RUFAmWhMN0ACgkQCF8+vY7k
 4RUNmw/9FNwS6woEDPUgoaqBqES5N1R/QzqqWgALQSiywUSWsAnHuALR7Aio4oH4
 IMRrXn3nQpg1wj+f2OtVcXreuf1OMdD0WmQvcj/hcjGIx3mZUh2dBXJpR/iRrCRA
 jbGUDZqsiT9gPf6IcqspFccoShVjQDM7NKLa6xQMpwyITBAXhY0d5LPOs70VJO3f
 EH6wdJbhbQoocylSFCOl61un4Ebk/Zou2TDyzAAnIMQxmanVTWnRE9QL6G59t8PA
 XgTcSRiyQGdNScR8c/xgyHDj4KchvkwFHQfOAH+HPot6CIy+UsC02ofcBralTuN6
 HTiJ8upj9U/N2ArX7L9w4xOFaRlXSO4MBTbQsroCgl7vkSfjMx/Ik5gHNVSoP25y
 MdLVg9ReW6D22tWnHvqpwbv4YPjT1uFX9LzuNtINwjJVC5Hn7EMPiE7hdsBGdKZE
 98dVJKNMUHQT5KfEUqjDkdrcAYm5Pd46trEa1aMDnoxtshxlmwHYwUj0QNKVX2gT
 Wa5lw9twFvmvYPGFwIT11HrkFukj4zR7N7Eh0IOK93zjJl6gtr1UNmujj8KFXJS3
 ZB4yTlcSgFunsmkyfcZ+02bgM9fq7SC1/rL9bXBYxdo2EcuduJ93Gr5riaQtVu44
 6t6Gi8Ucf9Uh7B68Alf0GQAd+AhXPzbqtisHdYJb0ijGjRMhq7w=
 =KPPn
 -----END PGP SIGNATURE-----

Merge tag 'media/v6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media updates from Mauro Carvalho Chehab:

 - v4l core: subdev frame interval now supports which field

 - v4l kapi: moves and renames the init_cfg pad op to init_state as an
   internal op.

 - new sensor drivers: gc0308, gc2145, Avnet Alvium, ov64a40, tw9900

 - new camera driver: STM32 DCMIPP

 - s5p-mfc has gained MFC v12 support

 - new ISP driver added to staging: Starfive

 - new stateful encoder/decoded: Wave5 codec It is found on the J721S2
   SoC, JH7100 SoC, ssd202d SoC. Etc.

 - fwnode gained support for MIPI "DisCo for Imaging"
   (https://www.mipi.org/specifications/mipi-disco-imaging)

 - as usual, lots of cleanups, fixups and driver improvements.

* tag 'media/v6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (309 commits)
  media: i2c: thp7312: select CONFIG_FW_LOADER
  media: i2c: mt9m114: use fsleep() in place of udelay()
  media: videobuf2: core: Rename min_buffers_needed field in vb2_queue
  media: i2c: thp7312: Store frame interval in subdev state
  media: docs: uAPI: Fix documentation of 'which' field for routing ioctls
  media: docs: uAPI: Expand error documentation for invalid 'which' value
  media: docs: uAPI: Clarify error documentation for invalid 'which' value
  media: v4l2-subdev: Store frame interval in subdev state
  media: v4l2-subdev: Add which field to struct v4l2_subdev_frame_interval
  media: v4l2-subdev: Turn .[gs]_frame_interval into pad operations
  media: v4l: subdev: Move out subdev state lock macros outside CONFIG_MEDIA_CONTROLLER
  media: s5p-mfc: DPB Count Independent of VIDIOC_REQBUF
  media: s5p-mfc: Load firmware for each run in MFCv12.
  media: s5p-mfc: Set context for valid case before calling try_run
  media: s5p-mfc: Add support for DMABUF for encoder
  media: s5p-mfc: Add support for UHD encoding.
  media: s5p-mfc: Add support for rate controls in MFCv12
  media: s5p-mfc: Add YV12 and I420 multiplanar format support
  media: s5p-mfc: Add initial support for MFCv12
  media: s5p-mfc: Rename IS_MFCV10 macro
  ...
2024-01-12 14:29:48 -08:00
Linus Torvalds
5b9b41617b Another moderately busy cycle for documentation, including:
- The minimum Sphinx requirement has been raised to 2.4.4, following a
   warning that was added in 6.2.
 
 - Some reworking of the Documentation/process front page to, hopefully,
   make it more useful.
 
 - Various kernel-doc tweaks to, for example, make it deal properly with
   __counted_by annotations.
 
 - We have also restored a warning for documentation of nonexistent
   structure members that disappeared a while back.  That had the delightful
   consequence of adding some 600 warnings to the docs build.  A sustained
   effort by Randy, Vegard, and myself has addressed almost all of those,
   bringing the documentation back into sync with the code.  The fixes are
   going through the appropriate maintainer trees.
 
 - Various improvements to the HTML rendered docs, including automatic links
   to Git revisions and a nice new pulldown to make translations easy to
   access.
 
 - Speaking of translations, more of those for Spanish and Chinese.
 
 ...plus the usual stream of documentation updates and typo fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmWcRKMPHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5YTKIH/AxBt/3iWt40dPf18arZHLU6tdUbmg01ttef
 CNKWkniCmABGKc//KYDXvjZMRDt0YlrS0KgUzrb8nIQTBlZG40D+88EwjXE0HeGP
 xt1Fk7OPOiJEqBZ3HEe0PDVfOiA+4yR6CmDKklCJuKg77X9atklneBwPUw/cOASk
 CWj+BdbwPBiSNQv48Lp87rGusKwnH/g0MN2uS0z9MPr1DYjM1K8+ngZjGW24lZHt
 qs5yhP43mlZGBF/lwNJXQp/xhnKAqJ9XwylBX9Wmaoxaz9yyzNVsADGvROMudgzi
 9YB+Jdy7Z0JSrVoLIRhUuDOv7aW8vk+8qLmGJt2aTIsqehbQ6pk=
 =fCtT
 -----END PGP SIGNATURE-----

Merge tag 'docs-6.8' of git://git.lwn.net/linux

Pull documentation update from Jonathan Corbet:
 "Another moderately busy cycle for documentation, including:

   - The minimum Sphinx requirement has been raised to 2.4.4, following
     a warning that was added in 6.2

   - Some reworking of the Documentation/process front page to,
     hopefully, make it more useful

   - Various kernel-doc tweaks to, for example, make it deal properly
     with __counted_by annotations

   - We have also restored a warning for documentation of nonexistent
     structure members that disappeared a while back. That had the
     delightful consequence of adding some 600 warnings to the docs
     build. A sustained effort by Randy, Vegard, and myself has
     addressed almost all of those, bringing the documentation back into
     sync with the code. The fixes are going through the appropriate
     maintainer trees

   - Various improvements to the HTML rendered docs, including automatic
     links to Git revisions and a nice new pulldown to make translations
     easy to access

   - Speaking of translations, more of those for Spanish and Chinese

  ... plus the usual stream of documentation updates and typo fixes"

* tag 'docs-6.8' of git://git.lwn.net/linux: (57 commits)
  MAINTAINERS: use tabs for indent of CONFIDENTIAL COMPUTING THREAT MODEL
  A reworked process/index.rst
  ring-buffer/Documentation: Add documentation on buffer_percent file
  Translated the RISC-V architecture boot documentation.
  Docs: remove mentions of fdformat from util-linux
  Docs/zh_CN: Fix the meaning of DEBUG to pr_debug()
  Documentation: move driver-api/dcdbas to userspace-api/
  Documentation: move driver-api/isapnp to userspace-api/
  Documentation/core-api : fix typo in workqueue
  Documentation/trace: Fixed typos in the ftrace FLAGS section
  kernel-doc: handle a void function without producing a warning
  scripts/get_abi.pl: ignore some temp files
  docs: kernel_abi.py: fix command injection
  scripts/get_abi: fix source path leak
  CREDITS, MAINTAINERS, docs/process/howto: Update man-pages' maintainer
  docs: translations: add translations links when they exist
  kernel-doc: Align quick help and the code
  MAINTAINERS: add reviewer for Spanish translations
  docs: ignore __counted_by attribute in structure definitions
  scripts: kernel-doc: Clarify missing struct member description
  ..
2024-01-11 19:46:52 -08:00
Linus Torvalds
3e7aeb78ab Networking changes for 6.8.
Core & protocols
 ----------------
 
  - Analyze and reorganize core networking structs (socks, netdev,
    netns, mibs) to optimize cacheline consumption and set up
    build time warnings to safeguard against future header changes.
    This improves TCP performances with many concurrent connections
    up to 40%.
 
  - Add page-pool netlink-based introspection, exposing the
    memory usage and recycling stats. This helps indentify
    bad PP users and possible leaks.
 
  - Refine TCP/DCCP source port selection to no longer favor even
    source port at connect() time when IP_LOCAL_PORT_RANGE is set.
    This lowers the time taken by connect() for hosts having
    many active connections to the same destination.
 
  - Refactor the TCP bind conflict code, shrinking related socket
    structs.
 
  - Refactor TCP SYN-Cookie handling, as a preparation step to
    allow arbitrary SYN-Cookie processing via eBPF.
 
  - Tune optmem_max for 0-copy usage, increasing the default value
    to 128KB and namespecifying it.
 
  - Allow coalescing for cloned skbs coming from page pools, improving
    RX performances with some common configurations.
 
  - Reduce extension header parsing overhead at GRO time.
 
  - Add bridge MDB bulk deletion support, allowing user-space to
    request the deletion of matching entries.
 
  - Reorder nftables struct members, to keep data accessed by the
    datapath first.
 
  - Introduce TC block ports tracking and use. This allows supporting
    multicast-like behavior at the TC layer.
 
  - Remove UAPI support for retired TC qdiscs (dsmark, CBQ and ATM) and
    classifiers (RSVP and tcindex).
 
  - More data-race annotations.
 
  - Extend the diag interface to dump TCP bound-only sockets.
 
  - Conditional notification of events for TC qdisc class and actions.
 
  - Support for WPAN dynamic associations with nearby devices, to form
    a sub-network using a specific PAN ID.
 
  - Implement SMCv2.1 virtual ISM device support.
 
  - Add support for Batman-avd mulicast packet type.
 
 BPF
 ---
 
  - Tons of verifier improvements:
    - BPF register bounds logic and range support along with a large
      test suite
    - log improvements
    - complete precision tracking support for register spills
    - track aligned STACK_ZERO cases as imprecise spilled registers. It
      improves the verifier "instructions processed" metric from single
      digit to 50-60% for some programs
    - support for user's global BPF subprogram arguments with few
      commonly requested annotations for a better developer experience
    - support tracking of BPF_JNE which helps cases when the compiler
      transforms (unsigned) "a > 0" into "if a == 0 goto xxx" and the
      like
    - several fixes
 
  - Add initial TX metadata implementation for AF_XDP with support in
    mlx5 and stmmac drivers. Two types of offloads are supported right
    now, that is, TX timestamp and TX checksum offload.
 
  - Fix kCFI bugs in BPF all forms of indirect calls from BPF into
    kernel and from kernel into BPF work with CFI enabled. This allows
    BPF to work with CONFIG_FINEIBT=y.
 
  - Change BPF verifier logic to validate global subprograms lazily
    instead of unconditionally before the main program, so they can be
    guarded using BPF CO-RE techniques.
 
  - Support uid/gid options when mounting bpffs.
 
  - Add a new kfunc which acquires the associated cgroup of a task
    within a specific cgroup v1 hierarchy where the latter is identified
    by its id.
 
  - Extend verifier to allow bpf_refcount_acquire() of a map value field
    obtained via direct load which is a use-case needed in sched_ext.
 
  - Add BPF link_info support for uprobe multi link along with bpftool
    integration for the latter.
 
  - Support for VLAN tag in XDP hints.
 
  - Remove deprecated bpfilter kernel leftovers given the project
    is developed in user-space (https://github.com/facebook/bpfilter).
 
 Misc
 ----
 
  - Support for parellel TC self-tests execution.
 
  - Increase MPTCP self-tests coverage.
 
  - Updated the bridge documentation, including several so-far
    undocumented features.
 
  - Convert all the net self-tests to run in unique netns, to
    avoid random failures due to conflict and allow concurrent
    runs.
 
  - Add TCP-AO self-tests.
 
  - Add kunit tests for both cfg80211 and mac80211.
 
  - Autogenerate Netlink families documentation from YAML spec.
 
  - Add yml-gen support for fixed headers and recursive nests, the
    tool can now generate user-space code for all genetlink families
    for which we have specs.
 
  - A bunch of additional module descriptions fixes.
 
  - Catch incorrect freeing of pages belonging to a page pool.
 
 Driver API
 ----------
 
  - Rust abstractions for network PHY drivers; do not cover yet the
    full C API, but already allow implementing functional PHY drivers
    in rust.
 
  - Introduce queue and NAPI support in the netdev Netlink interface,
    allowing complete access to the device <> NAPIs <> queues
    relationship.
 
  - Introduce notifications filtering for devlink to allow control
    application scale to thousands of instances.
 
  - Improve PHY validation, requesting rate matching information for
    each ethtool link mode supported by both the PHY and host.
 
  - Add support for ethtool symmetric-xor RSS hash.
 
  - ACPI based Wifi band RFI (WBRF) mitigation feature for the AMD
    platform.
 
  - Expose pin fractional frequency offset value over new DPLL generic
    netlink attribute.
 
  - Convert older drivers to platform remove callback returning void.
 
  - Add support for PHY package MMD read/write.
 
 New hardware / drivers
 ----------------------
 
  - Ethernet:
    - Octeon CN10K devices
    - Broadcom 5760X P7
    - Qualcomm SM8550 SoC
    - Texas Instrument DP83TG720S PHY
 
  - Bluetooth:
    - IMC Networks Bluetooth radio
 
 Removed
 -------
 
  - WiFi:
    - libertas 16-bit PCMCIA support
    - Atmel at76c50x drivers
    - HostAP ISA/PCMCIA style 802.11b driver
    - zd1201 802.11b USB dongles
    - Orinoco ISA/PCMCIA 802.11b driver
    - Aviator/Raytheon driver
    - Planet WL3501 driver
    - RNDIS USB 802.11b driver
 
 Drivers
 -------
 
  - Ethernet high-speed NICs:
    - Intel (100G, ice, idpf):
      - allow one by one port representors creation and removal
      - add temperature and clock information reporting
      - add get/set for ethtool's header split ringparam
      - add again FW logging
      - adds support switchdev hardware packet mirroring
      - iavf: implement symmetric-xor RSS hash
      - igc: add support for concurrent physical and free-running timers
      - i40e: increase the allowable descriptors
    - nVidia/Mellanox:
      - Preparation for Socket-Direct multi-dev netdev. That will allow
        in future releases combining multiple PFs devices attached to
        different NUMA nodes under the same netdev
    - Broadcom (bnxt):
      - TX completion handling improvements
      - add basic ntuple filter support
      - reduce MSIX vectors usage for MQPRIO offload
      - add VXLAN support, USO offload and TX coalesce completion for P7
    - Marvell Octeon EP:
      - xmit-more support
      - add PF-VF mailbox support and use it for FW notifications for VFs
    - Wangxun (ngbe/txgbe):
      - implement ethtool functions to operate pause param, ring param,
        coalesce channel number and msglevel
    - Netronome/Corigine (nfp):
      - add flow-steering support
      - support UDP segmentation offload
 
  - Ethernet NICs embedded, slower, virtual:
    - Xilinx AXI: remove duplicate DMA code adopting the dma engine driver
    - stmmac: add support for HW-accelerated VLAN stripping
    - TI AM654x sw: add mqprio, frame preemption & coalescing
    - gve: add support for non-4k page sizes.
    - virtio-net: support dynamic coalescing moderation
 
  - nVidia/Mellanox Ethernet datacenter switches:
    - allow firmware upgrade without a reboot
    - more flexible support for bridge flooding via the compressed
      FID flooding mode
 
  - Ethernet embedded switches:
    - Microchip:
      - fine-tune flow control and speed configurations in KSZ8xxx
      - KSZ88X3: enable setting rmii reference
    - Renesas:
      - add jumbo frames support
    - Marvell:
      - 88E6xxx: add "eth-mac" and "rmon" stats support
 
  - Ethernet PHYs:
    - aquantia: add firmware load support
    - at803x: refactor the driver to simplify adding support for more
      chip variants
    - NXP C45 TJA11xx: Add MACsec offload support
 
  - Wifi:
    - MediaTek (mt76):
      - NVMEM EEPROM improvements
      - mt7996 Extremely High Throughput (EHT) improvements
      - mt7996 Wireless Ethernet Dispatcher (WED) support
      - mt7996 36-bit DMA support
    - Qualcomm (ath12k):
      - support for a single MSI vector
      - WCN7850: support AP mode
    - Intel (iwlwifi):
      - new debugfs file fw_dbg_clear
      - allow concurrent P2P operation on DFS channels
 
  - Bluetooth:
    - QCA2066: support HFP offload
    - ISO: more broadcast-related improvements
    - NXP: better recovery in case receiver/transmitter get out of sync
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmWdamsSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkGC4P/2xjLzdw22ckSssuE9ORbGko9SNjnqHk
 PQh1E+26BHiCg5KB8VvzMsL78E79MRNXEattSW+1g7dhCvln3oi+Vd0WkdRkgt35
 98Iv18zLbbwFAJeyKvmLAPAkQkMLtVj19QILBBRrugF+egEZgVSE3JBcTAiKv2ZQ
 HzkabA171Ri6LpCcEEtY5XuaKvimGnGzF8YMFf8rX0wtqd2p5kbY9aMe47WAGxvU
 Vf9548XvH+A5yVH2/4/gujtUOpA/RHuhuCMb+oo0cZ+VCC1x9MGzoXzj6r87OTkf
 k2W1whNzcGoin92f+9Lk1JYMuiGKBH4QVaDdNXJnYFSJWPTE7RvRsPzYTSD4/GzK
 yEZbzSJXpy/2vDQm16NoAxl7evRs8Sorzkw4LQRviZHI/5SAkK2ZQiCK5CO8QSYy
 C1LELcV5kn6Foe24xWnrWLjAGug9oJnYoGPMU5gvPmFJMvUMXqm5rmbBgUWL5Rxw
 q1M6gVzabCyWUy6z2G2vaqW2ZntNVvCkdsLtIX0XZkcTzNoP0MA+TuhyGz4wbiuo
 PeyQp/mbGnDgCYggqKIA0YWrTVxkhFrKN520cbO8qXBQytV9oFbM/0/+C0/r/5WX
 pL1JVzLrh6l5ME7EIQfha8UOF9j8q4ueSwb40P3AR2NaZiDABM0zfUZ6+sx+91WF
 ucqPEcZB5cRE
 =1bW6
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Paolo Abeni:
 "The most interesting thing is probably the networking structs
  reorganization and a significant amount of changes is around
  self-tests.

  Core & protocols:

   - Analyze and reorganize core networking structs (socks, netdev,
     netns, mibs) to optimize cacheline consumption and set up build
     time warnings to safeguard against future header changes

     This improves TCP performances with many concurrent connections up
     to 40%

   - Add page-pool netlink-based introspection, exposing the memory
     usage and recycling stats. This helps indentify bad PP users and
     possible leaks

   - Refine TCP/DCCP source port selection to no longer favor even
     source port at connect() time when IP_LOCAL_PORT_RANGE is set. This
     lowers the time taken by connect() for hosts having many active
     connections to the same destination

   - Refactor the TCP bind conflict code, shrinking related socket
     structs

   - Refactor TCP SYN-Cookie handling, as a preparation step to allow
     arbitrary SYN-Cookie processing via eBPF

   - Tune optmem_max for 0-copy usage, increasing the default value to
     128KB and namespecifying it

   - Allow coalescing for cloned skbs coming from page pools, improving
     RX performances with some common configurations

   - Reduce extension header parsing overhead at GRO time

   - Add bridge MDB bulk deletion support, allowing user-space to
     request the deletion of matching entries

   - Reorder nftables struct members, to keep data accessed by the
     datapath first

   - Introduce TC block ports tracking and use. This allows supporting
     multicast-like behavior at the TC layer

   - Remove UAPI support for retired TC qdiscs (dsmark, CBQ and ATM) and
     classifiers (RSVP and tcindex)

   - More data-race annotations

   - Extend the diag interface to dump TCP bound-only sockets

   - Conditional notification of events for TC qdisc class and actions

   - Support for WPAN dynamic associations with nearby devices, to form
     a sub-network using a specific PAN ID

   - Implement SMCv2.1 virtual ISM device support

   - Add support for Batman-avd mulicast packet type

  BPF:

   - Tons of verifier improvements:
       - BPF register bounds logic and range support along with a large
         test suite
       - log improvements
       - complete precision tracking support for register spills
       - track aligned STACK_ZERO cases as imprecise spilled registers.
         This improves the verifier "instructions processed" metric from
         single digit to 50-60% for some programs
       - support for user's global BPF subprogram arguments with few
         commonly requested annotations for a better developer
         experience
       - support tracking of BPF_JNE which helps cases when the compiler
         transforms (unsigned) "a > 0" into "if a == 0 goto xxx" and the
         like
       - several fixes

   - Add initial TX metadata implementation for AF_XDP with support in
     mlx5 and stmmac drivers. Two types of offloads are supported right
     now, that is, TX timestamp and TX checksum offload

   - Fix kCFI bugs in BPF all forms of indirect calls from BPF into
     kernel and from kernel into BPF work with CFI enabled. This allows
     BPF to work with CONFIG_FINEIBT=y

   - Change BPF verifier logic to validate global subprograms lazily
     instead of unconditionally before the main program, so they can be
     guarded using BPF CO-RE techniques

   - Support uid/gid options when mounting bpffs

   - Add a new kfunc which acquires the associated cgroup of a task
     within a specific cgroup v1 hierarchy where the latter is
     identified by its id

   - Extend verifier to allow bpf_refcount_acquire() of a map value
     field obtained via direct load which is a use-case needed in
     sched_ext

   - Add BPF link_info support for uprobe multi link along with bpftool
     integration for the latter

   - Support for VLAN tag in XDP hints

   - Remove deprecated bpfilter kernel leftovers given the project is
     developed in user-space (https://github.com/facebook/bpfilter)

  Misc:

   - Support for parellel TC self-tests execution

   - Increase MPTCP self-tests coverage

   - Updated the bridge documentation, including several so-far
     undocumented features

   - Convert all the net self-tests to run in unique netns, to avoid
     random failures due to conflict and allow concurrent runs

   - Add TCP-AO self-tests

   - Add kunit tests for both cfg80211 and mac80211

   - Autogenerate Netlink families documentation from YAML spec

   - Add yml-gen support for fixed headers and recursive nests, the tool
     can now generate user-space code for all genetlink families for
     which we have specs

   - A bunch of additional module descriptions fixes

   - Catch incorrect freeing of pages belonging to a page pool

  Driver API:

   - Rust abstractions for network PHY drivers; do not cover yet the
     full C API, but already allow implementing functional PHY drivers
     in rust

   - Introduce queue and NAPI support in the netdev Netlink interface,
     allowing complete access to the device <> NAPIs <> queues
     relationship

   - Introduce notifications filtering for devlink to allow control
     application scale to thousands of instances

   - Improve PHY validation, requesting rate matching information for
     each ethtool link mode supported by both the PHY and host

   - Add support for ethtool symmetric-xor RSS hash

   - ACPI based Wifi band RFI (WBRF) mitigation feature for the AMD
     platform

   - Expose pin fractional frequency offset value over new DPLL generic
     netlink attribute

   - Convert older drivers to platform remove callback returning void

   - Add support for PHY package MMD read/write

  New hardware / drivers:

   - Ethernet:
       - Octeon CN10K devices
       - Broadcom 5760X P7
       - Qualcomm SM8550 SoC
       - Texas Instrument DP83TG720S PHY

   - Bluetooth:
       - IMC Networks Bluetooth radio

  Removed:

   - WiFi:
       - libertas 16-bit PCMCIA support
       - Atmel at76c50x drivers
       - HostAP ISA/PCMCIA style 802.11b driver
       - zd1201 802.11b USB dongles
       - Orinoco ISA/PCMCIA 802.11b driver
       - Aviator/Raytheon driver
       - Planet WL3501 driver
       - RNDIS USB 802.11b driver

  Driver updates:

   - Ethernet high-speed NICs:
       - Intel (100G, ice, idpf):
          - allow one by one port representors creation and removal
          - add temperature and clock information reporting
          - add get/set for ethtool's header split ringparam
          - add again FW logging
          - adds support switchdev hardware packet mirroring
          - iavf: implement symmetric-xor RSS hash
          - igc: add support for concurrent physical and free-running
            timers
          - i40e: increase the allowable descriptors
       - nVidia/Mellanox:
          - Preparation for Socket-Direct multi-dev netdev. That will
            allow in future releases combining multiple PFs devices
            attached to different NUMA nodes under the same netdev
       - Broadcom (bnxt):
          - TX completion handling improvements
          - add basic ntuple filter support
          - reduce MSIX vectors usage for MQPRIO offload
          - add VXLAN support, USO offload and TX coalesce completion
            for P7
       - Marvell Octeon EP:
          - xmit-more support
          - add PF-VF mailbox support and use it for FW notifications
            for VFs
       - Wangxun (ngbe/txgbe):
          - implement ethtool functions to operate pause param, ring
            param, coalesce channel number and msglevel
       - Netronome/Corigine (nfp):
          - add flow-steering support
          - support UDP segmentation offload

   - Ethernet NICs embedded, slower, virtual:
       - Xilinx AXI: remove duplicate DMA code adopting the dma engine
         driver
       - stmmac: add support for HW-accelerated VLAN stripping
       - TI AM654x sw: add mqprio, frame preemption & coalescing
       - gve: add support for non-4k page sizes.
       - virtio-net: support dynamic coalescing moderation

   - nVidia/Mellanox Ethernet datacenter switches:
       - allow firmware upgrade without a reboot
       - more flexible support for bridge flooding via the compressed
         FID flooding mode

   - Ethernet embedded switches:
       - Microchip:
          - fine-tune flow control and speed configurations in KSZ8xxx
          - KSZ88X3: enable setting rmii reference
       - Renesas:
          - add jumbo frames support
       - Marvell:
          - 88E6xxx: add "eth-mac" and "rmon" stats support

   - Ethernet PHYs:
       - aquantia: add firmware load support
       - at803x: refactor the driver to simplify adding support for more
         chip variants
       - NXP C45 TJA11xx: Add MACsec offload support

   - Wifi:
       - MediaTek (mt76):
          - NVMEM EEPROM improvements
          - mt7996 Extremely High Throughput (EHT) improvements
          - mt7996 Wireless Ethernet Dispatcher (WED) support
          - mt7996 36-bit DMA support
       - Qualcomm (ath12k):
          - support for a single MSI vector
          - WCN7850: support AP mode
       - Intel (iwlwifi):
          - new debugfs file fw_dbg_clear
          - allow concurrent P2P operation on DFS channels

   - Bluetooth:
       - QCA2066: support HFP offload
       - ISO: more broadcast-related improvements
       - NXP: better recovery in case receiver/transmitter get out of sync"

* tag 'net-next-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1714 commits)
  lan78xx: remove redundant statement in lan78xx_get_eee
  lan743x: remove redundant statement in lan743x_ethtool_get_eee
  bnxt_en: Fix RCU locking for ntuple filters in bnxt_rx_flow_steer()
  bnxt_en: Fix RCU locking for ntuple filters in bnxt_srxclsrldel()
  bnxt_en: Remove unneeded variable in bnxt_hwrm_clear_vnic_filter()
  tcp: Revert no longer abort SYN_SENT when receiving some ICMP
  Revert "mlx5 updates 2023-12-20"
  Revert "net: stmmac: Enable Per DMA Channel interrupt"
  ipvlan: Remove usage of the deprecated ida_simple_xx() API
  ipvlan: Fix a typo in a comment
  net/sched: Remove ipt action tests
  net: stmmac: Use interrupt mode INTM=1 for per channel irq
  net: stmmac: Add support for TX/RX channel interrupt
  net: stmmac: Make MSI interrupt routine generic
  dt-bindings: net: snps,dwmac: per channel irq
  net: phy: at803x: make read_status more generic
  net: phy: at803x: add support for cdt cross short test for qca808x
  net: phy: at803x: refactor qca808x cable test get status function
  net: phy: at803x: generalize cdt fault length function
  net: ethernet: cortina: Drop TSO support
  ...
2024-01-11 10:07:29 -08:00
Baruch Siach
54a2ffe952 docs: admin-guide: hw_random: update rng-tools website
rng-tools upstream moved to github. New upstream does not appear to
consider itself official website for hw_random. Drop that part.

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/ef52ace5008fa934084442149f64f5f9ddbba465.1704720105.git.baruch@tkos.co.il
2024-01-11 09:35:18 -07:00
Vegard Nossum
c48a7c44a1 docs: kernel_feat.py: fix potential command injection
The kernel-feat directive passes its argument straight to the shell.
This is unfortunate and unnecessary.

Let's always use paths relative to $srctree/Documentation/ and use
subprocess.check_call() instead of subprocess.Popen(shell=True).

This also makes the code shorter.

This is analogous to commit 3231dd5862 ("docs: kernel_abi.py: fix
command injection") where we did exactly the same thing for
kernel_abi.py, somehow I completely missed this one.

Link: https://fosstodon.org/@jani/111676532203641247
Reported-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20240110174758.3680506-1-vegard.nossum@oracle.com
2024-01-11 09:21:01 -07:00
Erwan Velu
03c305861c Documentation: admin-guide: PM: Fix two typos
Fix two typos in the admin-guide:

 - a missing e in "reference_perf" in cppc_sysfs.rst.
 - the amd_pstate sysfs path uses a dash instead of an underscore.

Signed-off-by: Erwan Velu <e.velu@criteo.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-01-10 15:10:44 +01:00
Breno Leitao
aefb2f2e61 x86/bugs: Rename CONFIG_RETPOLINE => CONFIG_MITIGATION_RETPOLINE
Step 5/10 of the namespace unification of CPU mitigations related Kconfig options.

[ mingo: Converted a few more uses in comments/messages as well. ]

Suggested-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Ariel Miculas <amiculas@cisco.com>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20231121160740.1249350-6-leitao@debian.org
2024-01-10 10:52:28 +01:00
Linus Torvalds
5fda5698c2 platform-drivers-x86 for v6.8-1
Highlights:
  -  Intel PMC / PMT / TPMI / uncore-freq / vsec improvements and
     new platform support
  -  AMD PMC / PMF improvements and new platform support
  -  AMD ACPI based Wifi band RFI mitigation feature (WBRF)
  -  WMI bus driver cleanups and improvements (Armin Wolf)
  -  acer-wmi Predator PHN16-71 support
  -  New Silicom network appliance EC LEDs / GPIOs driver
 
 The following is an automated git shortlog grouped by driver:
 
 ACPI:
  -  scan: Add LNXVIDEO HID to ignore_serial_bus_ids[]
 
 Add Silicom Platform Driver:
  - Add Silicom Platform Driver
 
 Documentation/driver-api:
  -  Add document about WBRF mechanism
 
 ISST:
  -  Process read/write blocked feature status
 
 Merge tag 'platform-drivers-x86-amd-wbrf-v6.8-1' into review-hans:
  - Merge tag 'platform-drivers-x86-amd-wbrf-v6.8-1' into review-hans
 
 Merge tag 'platform-drivers-x86-v6.7-3' into pdx86/for-next:
  - Merge tag 'platform-drivers-x86-v6.7-3' into pdx86/for-next
 
 Merge tag 'platform-drivers-x86-v6.7-6' into pdx86/for-next:
  - Merge tag 'platform-drivers-x86-v6.7-6' into pdx86/for-next
 
 Remove "X86 PLATFORM DRIVERS - ARCH" from MAINTAINERS:
  - Remove "X86 PLATFORM DRIVERS - ARCH" from MAINTAINERS
 
 acer-wmi:
  -  add fan speed monitoring for Predator PHN16-71
  -  Depend on ACPI_VIDEO instead of selecting it
  -  Add platform profile and mode key support for Predator PHN16-71
 
 asus-laptop:
  -  remove redundant braces in if statements
 
 asus-wmi:
  -  Convert to platform remove callback returning void
 
 clk:
  -  x86: lpss-atom: Drop unneeded 'extern' in the header
 
 dell-smbios-wmi:
  -  Stop using WMI chardev
  -  Use devm_get_free_pages()
 
 hp-bioscfg:
  -  Removed needless asm-generic
 
 hp-wmi:
  -  Convert to platform remove callback returning void
 
 intel-uncore-freq:
  -  Add additional client processors
 
 intel-wmi-sbl-fw-update:
  -  Use bus-based WMI interface
 
 intel/pmc:
  -  Call pmc_get_low_power_modes from platform init
 
 ips:
  -  Remove unused debug code
 
 platform/mellanox:
  -  mlxbf-tmfifo: Remove unnecessary bool conversion
 
 platform/x86/amd:
  -  Add support for AMD ACPI based Wifi band RFI mitigation feature
 
 platform/x86/amd/pmc:
  -  Modify SMU message port for latest AMD platform
  -  Add 1Ah family series to STB support list
  -  Add idlemask support for 1Ah family
  -  call amd_pmc_get_ip_info() during driver probe
  -  Add VPE information for AMDI000A platform
  -  Send OS_HINT command for AMDI000A platform
 
 platform/x86/amd/pmf:
  -  Return a status code only as a constant in two functions
  -  Return directly after a failed apmf_if_call() in apmf_sbios_heartbeat_notify()
  -  dump policy binary data
  -  Add capability to sideload of policy binary
  -  Add facility to dump TA inputs
  -  Make source_as_str() as non-static
  -  Add support to update system state
  -  Add support update p3t limit
  -  Add support to get inputs from other subsystems
  -  change amd_pmf_init_features() call sequence
  -  Add support for PMF Policy Binary
  -  Change return type of amd_pmf_set_dram_addr()
  -  Add support for PMF-TA interaction
  -  Add PMF TEE interface
 
 platform/x86/dell:
  -  alienware-wmi: Use kasprintf()
 
 platform/x86/intel-uncore-freq:
  -  Process read/write blocked feature status
 
 platform/x86/intel/pmc:
  -  Add missing extern
  -  Add Lunar Lake M support to intel_pmc_core driver
  -  Add Arrow Lake S support to intel_pmc_core driver
  -  Add ssram_init flag in PMC discovery in Meteor Lake
  -  Move common code to core.c
  -  Add PSON residency counter for Alder Lake
  -  Add regmap for Tiger Lake H PCH
  -  Add PSON residency counter
  -  Fix in mtl_punit_pmt_init()
  -  Fix in pmc_core_ssram_get_pmc()
  -  Show Die C6 counter on Meteor Lake
  -  Add debug attribute for Die C6 counter
  -  Read low power mode requirements for MTL-M and MTL-P
  -  Retrieve LPM information using Intel PMT
  -  Display LPM requirements for multiple PMCs
  -  Find and register PMC telemetry entries
  -  Cleanup SSRAM discovery
  -  Allow pmc_core_ssram_init to fail
 
 platform/x86/intel/pmc/arl:
  -  Add GBE LTR ignore during suspend
 
 platform/x86/intel/pmc/lnl:
  -  Add GBE LTR ignore during suspend
 
 platform/x86/intel/pmc/mtl:
  -  Use return value from pmc_core_ssram_init()
 
 platform/x86/intel/pmt:
  -  telemetry: Export API to read telemetry
  -  Add header to struct intel_pmt_entry
 
 platform/x86/intel/tpmi:
  -  Move TPMI ID definition
  -  Modify external interface to get read/write state
  -  Don't create devices for disabled features
 
 platform/x86/intel/vsec:
  -  Add support for Lunar Lake M
  -  Add base address field
  -  Add intel_vsec_register
  -  Assign auxdev parent by argument
  -  Use cleanup.h
  -  remove platform_info from vsec device structure
  -  Move structures to header
  -  Remove unnecessary return
  -  Fix xa_alloc memory leak
 
 platform/x86/intel/wmi:
  -  thunderbolt: Use bus-based WMI interface
 
 silicom-platform:
  -  Fix spelling mistake "platfomr" -> "platform"
 
 wmi:
  -  linux/wmi.h: fix Excess kernel-doc description warning
  -  Simplify get_subobj_info()
  -  Decouple ACPI notify handler from wmi_block_list
  -  Create WMI bus device first
  -  Use devres for resource handling
  -  Remove ACPI handlers after WMI devices
  -  Remove unused variable in address space handler
  -  Remove chardev interface
  -  Remove debug_event module param
  -  Remove debug_dump_wdg module param
  -  Add to_wmi_device() helper macro
  -  Add wmidev_block_set()
 
 x86-android-tablets:
  -  Fix an IS_ERR() vs NULL check in probe
  -  Fix backlight ctrl for Lenovo Yoga Tab 3 Pro YT3-X90F
  -  Add audio codec info for Lenovo Yoga Tab 3 Pro YT3-X90F
  -  Add support for SPI device instantiation
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEEuvA7XScYQRpenhd+kuxHeUQDJ9wFAmWb29kUHGhkZWdvZWRl
 QHJlZGhhdC5jb20ACgkQkuxHeUQDJ9wliAf/bjJw8xGjzKYzipUhhDhM/tPdmuCs
 9zXCYXSeioSDCmhQNq6E4R+aBWJvcKAq5tNyf45x6tvWJVY2hk1HVnnJeu7qyVNk
 9fA7pGpqsUJg9w9nHW6YLU0AMSSl2a3k/E3lDwhDjYXwZiwQ2Y8atQ702HjiV5US
 cZ8/ZblIYqv/WKmhHpBf6dGlGzOUiwnXNIVQDIhXbde7yQPt3gGisftiu6bGyYvA
 FuBxm3o0UREX6yHAtII4TQfmcM45BrQa7/7FO3c0ZkpRmGCwiJJjRewUi8Rd2epJ
 mP9bNGkGdPriGMdlqcVDMyWvB5XJCaZTlbHHHSyEAaCRTEqeUMKWxp8OMg==
 =bmmu
 -----END PGP SIGNATURE-----

Merge tag 'platform-drivers-x86-v6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver updates from Hans de Goede:

 - Intel PMC / PMT / TPMI / uncore-freq / vsec improvements and new
   platform support

 - AMD PMC / PMF improvements and new platform support

 - AMD ACPI based Wifi band RFI mitigation feature (WBRF)

 - WMI bus driver cleanups and improvements (Armin Wolf)

 - acer-wmi Predator PHN16-71 support

 - New Silicom network appliance EC LEDs / GPIOs driver

* tag 'platform-drivers-x86-v6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (96 commits)
  platform/x86/amd/pmc: Modify SMU message port for latest AMD platform
  platform/x86/amd/pmc: Add 1Ah family series to STB support list
  platform/x86/amd/pmc: Add idlemask support for 1Ah family
  platform/x86/amd/pmc: call amd_pmc_get_ip_info() during driver probe
  platform/x86/amd/pmc: Add VPE information for AMDI000A platform
  platform/x86/amd/pmc: Send OS_HINT command for AMDI000A platform
  platform/x86/amd/pmf: Return a status code only as a constant in two functions
  platform/x86/amd/pmf: Return directly after a failed apmf_if_call() in apmf_sbios_heartbeat_notify()
  platform/x86: wmi: linux/wmi.h: fix Excess kernel-doc description warning
  platform/x86/intel/pmc: Add missing extern
  platform/x86/intel/pmc/lnl: Add GBE LTR ignore during suspend
  platform/x86/intel/pmc/arl: Add GBE LTR ignore during suspend
  platform/x86: intel-uncore-freq: Add additional client processors
  platform/x86: Remove "X86 PLATFORM DRIVERS - ARCH" from MAINTAINERS
  platform/x86: hp-bioscfg: Removed needless asm-generic
  platform/x86/intel/pmc: Add Lunar Lake M support to intel_pmc_core driver
  platform/x86/intel/pmc: Add Arrow Lake S support to intel_pmc_core driver
  platform/x86/intel/pmc: Add ssram_init flag in PMC discovery in Meteor Lake
  platform/x86/intel/pmc: Move common code to core.c
  platform/x86/intel/pmc: Add PSON residency counter for Alder Lake
  ...
2024-01-09 17:07:12 -08:00
Linus Torvalds
da96801729 regulator: Updates for v6.8
The main updates for this release are around monitoring of regulators,
 largely for error handling purposes.  We allow the stream of regulator
 events to be seen by userspace as netlink events and allow system
 integrators to describe individual regulators as system critical with
 information on how long the system is expected to last on error.  The
 system level error handling is very much about best effort problem
 mitigation rather than providing something fully robust, the initial
 drive was to provide a mechanism for trying to avoid initiating any new
 writes to flash once we notice the power going out.
 
 Otherwise it's very quiet, mainly several new Qualcomm devices.
 
  - Support for marking regulators as system critical and providing
    information on how long the system might last with those regulators
    in a failure state, hooked into the existing critical shutdown error
    handling.
  - Optional support for generating netlink events for events, there are
    use cases for system monitoring UIs and error handling.
  - A command line option to leave unused controllable regulators
    enabled, useful for debugging.  We already only disable regulators we
    were explicitly given permission to control.
  - Support for Quacomm MP5496, PM8010 and PM8937.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmWbJAkACgkQJNaLcl1U
 h9AmPwf/SXOxx0sp8xfmt1iJU30dg0L/0MNETf76dPFmCR8Oy1G9PLUqyzNQkTRf
 bvDrLf9amRRhY4FDCT74VoEiGo7fcduHmjDfYbK/A8bwY1l1UDn0d7hLwgqoyydf
 p07JbJzCHXAc1PhhMMdgOfdcpYs1Tah91CXOIdbe36pwgGJ8jwodJFD55uhXTsUZ
 R4PcNs/M2A8rW8SaggopOEzDExdne/ZogpGwclTTWau0OIze2SuPVSsQfrOtAabY
 BIxaMYKU5tSRdAJOSBNaL9NssUYzyO4q4hXs3Cms1p8XQlzZOVfMZznefdNHoVnw
 VXlJyEvMREpg8ilwlz7KOyvyF7rshg==
 =DADk
 -----END PGP SIGNATURE-----

Merge tag 'regulator-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator updates from Mark Brown:
 "The main updates for this release are around monitoring of regulators,
  largely for error handling purposes. We allow the stream of regulator
  events to be seen by userspace as netlink events and allow system
  integrators to describe individual regulators as system critical with
  information on how long the system is expected to last on error. The
  system level error handling is very much about best effort problem
  mitigation rather than providing something fully robust, the initial
  drive was to provide a mechanism for trying to avoid initiating any
  new writes to flash once we notice the power going out.

  Otherwise it's very quiet, mainly several new Qualcomm devices.

   - Support for marking regulators as system critical and providing
     information on how long the system might last with those regulators
     in a failure state, hooked into the existing critical shutdown
     error handling.

   - Optional support for generating netlink events for events, there
     are use cases for system monitoring UIs and error handling.

   - A command line option to leave unused controllable regulators
     enabled, useful for debugging. We already only disable regulators
     we were explicitly given permission to control.

   - Support for Quacomm MP5496, PM8010 and PM8937"

* tag 'regulator-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (31 commits)
  regulator: event: Ensure atomicity for sequence number
  uapi: regulator: Fix typo
  regulator: Reuse LINEAR_RANGE() in REGULATOR_LINEAR_RANGE()
  dt-bindings: regulator: qcom,usb-vbus-regulator: clean up example
  regulator: qcom_smd: Add LDO5 MP5496 regulator
  regulator: qcom-rpmh: add support for pm8010 regulators
  regulator: dt-bindings: qcom,rpmh: add compatible for pm8010
  regulator: qcom-rpmh: extend to support multiple linear voltage ranges
  regulator: wm8350: Convert to platform remove callback returning void
  regulator: virtual: Convert to platform remove callback returning void
  regulator: userspace-consumer: Convert to platform remove callback returning void
  regulator: uniphier: Convert to platform remove callback returning void
  regulator: stm32-vrefbuf: Convert to platform remove callback returning void
  regulator: db8500-prcmu: Convert to platform remove callback returning void
  regulator: bd9571mwv: Convert to platform remove callback returning void
  regulator: arizona-ldo1: Convert to platform remove callback returning void
  regulator: event: Add regulator netlink event support
  regulator: event: Add regulator netlink event support
  regulator: stpmic1: Fix kernel-doc notation warnings
  regulator: palmas: remove redundant initialization of pointer pdata
  ...
2024-01-09 14:41:21 -08:00
Linus Torvalds
fb46e22a9e Many singleton patches against the MM code. The patch series which
are included in this merge do the following:
 
 - Peng Zhang has done some mapletree maintainance work in the
   series
 
 	"maple_tree: add mt_free_one() and mt_attr() helpers"
 	"Some cleanups of maple tree"
 
 - In the series "mm: use memmap_on_memory semantics for dax/kmem"
   Vishal Verma has altered the interworking between memory-hotplug
   and dax/kmem so that newly added 'device memory' can more easily
   have its memmap placed within that newly added memory.
 
 - Matthew Wilcox continues folio-related work (including a few
   fixes) in the patch series
 
 	"Add folio_zero_tail() and folio_fill_tail()"
 	"Make folio_start_writeback return void"
 	"Fix fault handler's handling of poisoned tail pages"
 	"Convert aops->error_remove_page to ->error_remove_folio"
 	"Finish two folio conversions"
 	"More swap folio conversions"
 
 - Kefeng Wang has also contributed folio-related work in the series
 
 	"mm: cleanup and use more folio in page fault"
 
 - Jim Cromie has improved the kmemleak reporting output in the
   series "tweak kmemleak report format".
 
 - In the series "stackdepot: allow evicting stack traces" Andrey
   Konovalov to permits clients (in this case KASAN) to cause
   eviction of no longer needed stack traces.
 
 - Charan Teja Kalla has fixed some accounting issues in the page
   allocator's atomic reserve calculations in the series "mm:
   page_alloc: fixes for high atomic reserve caluculations".
 
 - Dmitry Rokosov has added to the samples/ dorectory some sample
   code for a userspace memcg event listener application.  See the
   series "samples: introduce cgroup events listeners".
 
 - Some mapletree maintanance work from Liam Howlett in the series
   "maple_tree: iterator state changes".
 
 - Nhat Pham has improved zswap's approach to writeback in the
   series "workload-specific and memory pressure-driven zswap
   writeback".
 
 - DAMON/DAMOS feature and maintenance work from SeongJae Park in
   the series
 
 	"mm/damon: let users feed and tame/auto-tune DAMOS"
 	"selftests/damon: add Python-written DAMON functionality tests"
 	"mm/damon: misc updates for 6.8"
 
 - Yosry Ahmed has improved memcg's stats flushing in the series
   "mm: memcg: subtree stats flushing and thresholds".
 
 - In the series "Multi-size THP for anonymous memory" Ryan Roberts
   has added a runtime opt-in feature to transparent hugepages which
   improves performance by allocating larger chunks of memory during
   anonymous page faults.
 
 - Matthew Wilcox has also contributed some cleanup and maintenance
   work against eh buffer_head code int he series "More buffer_head
   cleanups".
 
 - Suren Baghdasaryan has done work on Andrea Arcangeli's series
   "userfaultfd move option".  UFFDIO_MOVE permits userspace heap
   compaction algorithms to move userspace's pages around rather than
   UFFDIO_COPY'a alloc/copy/free.
 
 - Stefan Roesch has developed a "KSM Advisor", in the series
   "mm/ksm: Add ksm advisor".  This is a governor which tunes KSM's
   scanning aggressiveness in response to userspace's current needs.
 
 - Chengming Zhou has optimized zswap's temporary working memory
   use in the series "mm/zswap: dstmem reuse optimizations and
   cleanups".
 
 - Matthew Wilcox has performed some maintenance work on the
   writeback code, both code and within filesystems.  The series is
   "Clean up the writeback paths".
 
 - Andrey Konovalov has optimized KASAN's handling of alloc and
   free stack traces for secondary-level allocators, in the series
   "kasan: save mempool stack traces".
 
 - Andrey also performed some KASAN maintenance work in the series
   "kasan: assorted clean-ups".
 
 - David Hildenbrand has gone to town on the rmap code.  Cleanups,
   more pte batching, folio conversions and more.  See the series
   "mm/rmap: interface overhaul".
 
 - Kinsey Ho has contributed some maintenance work on the MGLRU
   code in the series "mm/mglru: Kconfig cleanup".
 
 - Matthew Wilcox has contributed lruvec page accounting code
   cleanups in the series "Remove some lruvec page accounting
   functions".
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZZyF2wAKCRDdBJ7gKXxA
 jjWjAP42LHvGSjp5M+Rs2rKFL0daBQsrlvy6/jCHUequSdWjSgEAmOx7bc5fbF27
 Oa8+DxGM9C+fwqZ/7YxU2w/WuUmLPgU=
 =0NHs
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2024-01-08-15-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Many singleton patches against the MM code. The patch series which are
  included in this merge do the following:

   - Peng Zhang has done some mapletree maintainance work in the series

	'maple_tree: add mt_free_one() and mt_attr() helpers'
	'Some cleanups of maple tree'

   - In the series 'mm: use memmap_on_memory semantics for dax/kmem'
     Vishal Verma has altered the interworking between memory-hotplug
     and dax/kmem so that newly added 'device memory' can more easily
     have its memmap placed within that newly added memory.

   - Matthew Wilcox continues folio-related work (including a few fixes)
     in the patch series

	'Add folio_zero_tail() and folio_fill_tail()'
	'Make folio_start_writeback return void'
	'Fix fault handler's handling of poisoned tail pages'
	'Convert aops->error_remove_page to ->error_remove_folio'
	'Finish two folio conversions'
	'More swap folio conversions'

   - Kefeng Wang has also contributed folio-related work in the series

	'mm: cleanup and use more folio in page fault'

   - Jim Cromie has improved the kmemleak reporting output in the series
     'tweak kmemleak report format'.

   - In the series 'stackdepot: allow evicting stack traces' Andrey
     Konovalov to permits clients (in this case KASAN) to cause eviction
     of no longer needed stack traces.

   - Charan Teja Kalla has fixed some accounting issues in the page
     allocator's atomic reserve calculations in the series 'mm:
     page_alloc: fixes for high atomic reserve caluculations'.

   - Dmitry Rokosov has added to the samples/ dorectory some sample code
     for a userspace memcg event listener application. See the series
     'samples: introduce cgroup events listeners'.

   - Some mapletree maintanance work from Liam Howlett in the series
     'maple_tree: iterator state changes'.

   - Nhat Pham has improved zswap's approach to writeback in the series
     'workload-specific and memory pressure-driven zswap writeback'.

   - DAMON/DAMOS feature and maintenance work from SeongJae Park in the
     series

	'mm/damon: let users feed and tame/auto-tune DAMOS'
	'selftests/damon: add Python-written DAMON functionality tests'
	'mm/damon: misc updates for 6.8'

   - Yosry Ahmed has improved memcg's stats flushing in the series 'mm:
     memcg: subtree stats flushing and thresholds'.

   - In the series 'Multi-size THP for anonymous memory' Ryan Roberts
     has added a runtime opt-in feature to transparent hugepages which
     improves performance by allocating larger chunks of memory during
     anonymous page faults.

   - Matthew Wilcox has also contributed some cleanup and maintenance
     work against eh buffer_head code int he series 'More buffer_head
     cleanups'.

   - Suren Baghdasaryan has done work on Andrea Arcangeli's series
     'userfaultfd move option'. UFFDIO_MOVE permits userspace heap
     compaction algorithms to move userspace's pages around rather than
     UFFDIO_COPY'a alloc/copy/free.

   - Stefan Roesch has developed a 'KSM Advisor', in the series 'mm/ksm:
     Add ksm advisor'. This is a governor which tunes KSM's scanning
     aggressiveness in response to userspace's current needs.

   - Chengming Zhou has optimized zswap's temporary working memory use
     in the series 'mm/zswap: dstmem reuse optimizations and cleanups'.

   - Matthew Wilcox has performed some maintenance work on the writeback
     code, both code and within filesystems. The series is 'Clean up the
     writeback paths'.

   - Andrey Konovalov has optimized KASAN's handling of alloc and free
     stack traces for secondary-level allocators, in the series 'kasan:
     save mempool stack traces'.

   - Andrey also performed some KASAN maintenance work in the series
     'kasan: assorted clean-ups'.

   - David Hildenbrand has gone to town on the rmap code. Cleanups, more
     pte batching, folio conversions and more. See the series 'mm/rmap:
     interface overhaul'.

   - Kinsey Ho has contributed some maintenance work on the MGLRU code
     in the series 'mm/mglru: Kconfig cleanup'.

   - Matthew Wilcox has contributed lruvec page accounting code cleanups
     in the series 'Remove some lruvec page accounting functions'"

* tag 'mm-stable-2024-01-08-15-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (361 commits)
  mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER
  mm, treewide: introduce NR_PAGE_ORDERS
  selftests/mm: add separate UFFDIO_MOVE test for PMD splitting
  selftests/mm: skip test if application doesn't has root privileges
  selftests/mm: conform test to TAP format output
  selftests: mm: hugepage-mmap: conform to TAP format output
  selftests/mm: gup_test: conform test to TAP format output
  mm/selftests: hugepage-mremap: conform test to TAP format output
  mm/vmstat: move pgdemote_* out of CONFIG_NUMA_BALANCING
  mm: zsmalloc: return -ENOSPC rather than -EINVAL in zs_malloc while size is too large
  mm/memcontrol: remove __mod_lruvec_page_state()
  mm/khugepaged: use a folio more in collapse_file()
  slub: use a folio in __kmalloc_large_node
  slub: use folio APIs in free_large_kmalloc()
  slub: use alloc_pages_node() in alloc_slab_page()
  mm: remove inc/dec lruvec page state functions
  mm: ratelimit stat flush from workingset shrinker
  kasan: stop leaking stack trace handles
  mm/mglru: remove CONFIG_TRANSPARENT_HUGEPAGE
  mm/mglru: add dummy pmd_dirty()
  ...
2024-01-09 11:18:47 -08:00
Linus Torvalds
9f8413c4a6 cgroup: Changes for v6.8
- Yafang Shao added task_get_cgroup1() helper to enable a similar BPF helper
   so that BPF progs can be more useful on cgroup1 hierarchies. While cgroup1
   is mostly in maintenance mode, this addition is very small while having an
   outsized usefulness for users who are still on cgroup1. Yafang also
   optimized root cgroup list access by making it RCU protected in the
   process.
 
 - Waiman Long optimized rstat operation leading to substantially lower and
   more consistent lock hold time while flushing the hierarchical statistics.
   As the lock can be acquired briefly in various hot paths, this reduction
   has cascading benefits.
 
 - Waiman also improved the quality of isolation for cpuset's isolated
   partitions. CPUs which are allocated to isolated partitions are now
   excluded from running unbound work items and cpu_is_isolated() test which
   is used by vmstat and memcg to reduce interference now includes cpuset
   isolated CPUs. While it isn't there yet, the hope is eventually reaching
   parity with the isolation level provided by the `isolcpus` boot param but
   in a dynamic manner.
 
   This involved a couple workqueue patches which were applied directly to
   cgroup/for-6.8 rather than ping-ponged through the wq tree. This was
   because the wq code change was small and the area is usually very static
   and unlikely to cause conflicts. However, luck had it that there was a wq
   bug fix in the area during the 6.7 cycle which caused a conflict. The
   conflict is contextual but can be a bit confusing to resolve, so there is
   one merge from wq/for-6.7-fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZYnuJg4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGQ5kAP9nMMWqi+R1HeG7+hWROTVjQZ0OM9KRcpZ1TmjF
 FNbkJgEAzt+sPnoWwYDTSI7pkNeZ/IM7x1qkkKGvENNtUXrz0Ac=
 =PyYN
 -----END PGP SIGNATURE-----

Merge tag 'cgroup-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup updates from Tejun Heo:

 - Yafang Shao added task_get_cgroup1() helper to enable a similar BPF
   helper so that BPF progs can be more useful on cgroup1 hierarchies.
   While cgroup1 is mostly in maintenance mode, this addition is very
   small while having an outsized usefulness for users who are still on
   cgroup1. Yafang also optimized root cgroup list access by making it
   RCU protected in the process.

 - Waiman Long optimized rstat operation leading to substantially lower
   and more consistent lock hold time while flushing the hierarchical
   statistics. As the lock can be acquired briefly in various hot paths,
   this reduction has cascading benefits.

 - Waiman also improved the quality of isolation for cpuset's isolated
   partitions. CPUs which are allocated to isolated partitions are now
   excluded from running unbound work items and cpu_is_isolated() test
   which is used by vmstat and memcg to reduce interference now includes
   cpuset isolated CPUs. While it isn't there yet, the hope is
   eventually reaching parity with the isolation level provided by the
   `isolcpus` boot param but in a dynamic manner.

* tag 'cgroup-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: Move rcu_head up near the top of cgroup_root
  cgroup/cpuset: Include isolated cpuset CPUs in cpu_is_isolated() check
  cgroup: Avoid false cacheline sharing of read mostly rstat_cpu
  cgroup/rstat: Optimize cgroup_rstat_updated_list()
  cgroup: Fix documentation for cpu.idle
  cgroup/cpuset: Expose cpuset.cpus.isolated
  workqueue: Move workqueue_set_unbound_cpumask() and its helpers inside CONFIG_SYSFS
  cgroup/rstat: Reduce cpu_lock hold time in cgroup_rstat_flush_locked()
  cgroup/cpuset: Take isolated CPUs out of workqueue unbound cpumask
  cgroup/cpuset: Keep track of CPUs in isolated partitions
  selftests/cgroup: Minor code cleanup and reorganization of test_cpuset_prs.sh
  workqueue: Add workqueue_unbound_exclude_cpumask() to exclude CPUs from wq_unbound_cpumask
  selftests: cgroup: Fixes a typo in a comment
  cgroup: Add a new helper for cgroup1 hierarchy
  cgroup: Add annotation for holding namespace_sem in current_cgns_cgroup_from_root()
  cgroup: Eliminate the need for cgroup_mutex in proc_cgroup_show()
  cgroup: Make operations on the cgroup root_list RCU safe
  cgroup: Remove unnecessary list_empty()
2024-01-08 20:04:02 -08:00
Kirill A. Shutemov
5e0a760b44 mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER
commit 23baf831a3 ("mm, treewide: redefine MAX_ORDER sanely") has
changed the definition of MAX_ORDER to be inclusive.  This has caused
issues with code that was not yet upstream and depended on the previous
definition.

To draw attention to the altered meaning of the define, rename MAX_ORDER
to MAX_PAGE_ORDER.

Link: https://lkml.kernel.org/r/20231228144704.14033-2-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-08 15:27:15 -08:00
Kirill A. Shutemov
fd37721803 mm, treewide: introduce NR_PAGE_ORDERS
NR_PAGE_ORDERS defines the number of page orders supported by the page
allocator, ranging from 0 to MAX_ORDER, MAX_ORDER + 1 in total.

NR_PAGE_ORDERS assists in defining arrays of page orders and allows for
more natural iteration over them.

[kirill.shutemov@linux.intel.com: fixup for kerneldoc warning]
  Link: https://lkml.kernel.org/r/20240101111512.7empzyifq7kxtzk3@box
Link: https://lkml.kernel.org/r/20231228144704.14033-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-08 15:27:15 -08:00
Vegard Nossum
3231dd5862 docs: kernel_abi.py: fix command injection
The kernel-abi directive passes its argument straight to the shell.
This is unfortunate and unnecessary.

Let's always use paths relative to $srctree/Documentation/ and use
subprocess.check_call() instead of subprocess.Popen(shell=True).

This also makes the code shorter.

Link: https://fosstodon.org/@jani/111676532203641247
Reported-by: Jani Nikula <jani.nikula@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20231231235959.3342928-2-vegard.nossum@oracle.com
2024-01-03 13:44:11 -07:00
Andrew Jones
323925ed6d RISC-V: paravirt: Add skeleton for pv-time support
Add the files and functions needed to support paravirt time on
RISC-V. Also include the common code needed for the first
application of pv-time, which is steal-time. In the next
patches we'll complete the functions to fully enable steal-time
support.

Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2023-12-30 11:25:03 +05:30
Nhat Pham
501a06fe8e zswap: memcontrol: implement zswap writeback disabling
During our experiment with zswap, we sometimes observe swap IOs due to
occasional zswap store failures and writebacks-to-swap.  These swapping
IOs prevent many users who cannot tolerate swapping from adopting zswap to
save memory and improve performance where possible.

This patch adds the option to disable this behavior entirely: do not
writeback to backing swapping device when a zswap store attempt fail, and
do not write pages in the zswap pool back to the backing swap device (both
when the pool is full, and when the new zswap shrinker is called).

This new behavior can be opted-in/out on a per-cgroup basis via a new
cgroup file.  By default, writebacks to swap device is enabled, which is
the previous behavior.  Initially, writeback is enabled for the root
cgroup, and a newly created cgroup will inherit the current setting of its
parent.

Note that this is subtly different from setting memory.swap.max to 0, as
it still allows for pages to be stored in the zswap pool (which itself
consumes swap space in its current form).

This patch should be applied on top of the zswap shrinker series:

https://lore.kernel.org/linux-mm/20231130194023.4102148-1-nphamcs@gmail.com/

as it also disables the zswap shrinker, a major source of zswap
writebacks.

For the most part, this feature is motivated by internal parties who
have already established their opinions regarding swapping - the
workloads that are highly sensitive to IO, and especially those who are
using servers with really slow disk performance (for instance, massive
but slow HDDs).  For these folks, it's impossible to convince them to
even entertain zswap if swapping also comes as a packaged deal. 
Writeback disabling is quite a useful feature in these situations - on
a mixed workloads deployment, they can disable writeback for the more
IO-sensitive workloads, and enable writeback for other background
workloads.

For instance, on a server with HDD, I allocate memories and populate
them with random values (so that zswap store will always fail), and
specify memory.high low enough to trigger reclaim.  The time it takes
to allocate the memories and just read through it a couple of times
(doing silly things like computing the values' average etc.):

zswap.writeback disabled:
real 0m30.537s
user 0m23.687s
sys 0m6.637s
0 pages swapped in
0 pages swapped out

zswap.writeback enabled:
real 0m45.061s
user 0m24.310s
sys 0m8.892s
712686 pages swapped in
461093 pages swapped out

(the last two lines are from vmstat -s).

[nphamcs@gmail.com: add a comment about recurring zswap store failures leading to reclaim inefficiency]
  Link: https://lkml.kernel.org/r/20231221005725.3446672-1-nphamcs@gmail.com
Link: https://lkml.kernel.org/r/20231207192406.3809579-1-nphamcs@gmail.com
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Yosry Ahmed <yosryahmed@google.com>
Acked-by: Chris Li <chrisl@kernel.org>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: David Heidelberg <david@ixit.cz>
Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-29 20:22:11 -08:00
Stefan Roesch
0710f38ad2 mm/ksm: document ksm advisor and its sysfs knobs
This documents the KSM advisor and its new knobs in /sys/fs/kernel/mm.

Link: https://lkml.kernel.org/r/20231218231054.1625219-5-shr@devkernel.io
Signed-off-by: Stefan Roesch <shr@devkernel.io>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-29 11:58:28 -08:00
Andrea Arcangeli
adef440691 userfaultfd: UFFDIO_MOVE uABI
Implement the uABI of UFFDIO_MOVE ioctl.
UFFDIO_COPY performs ~20% better than UFFDIO_MOVE when the application
needs pages to be allocated [1]. However, with UFFDIO_MOVE, if pages are
available (in userspace) for recycling, as is usually the case in heap
compaction algorithms, then we can avoid the page allocation and memcpy
(done by UFFDIO_COPY). Also, since the pages are recycled in the
userspace, we avoid the need to release (via madvise) the pages back to
the kernel [2].

We see over 40% reduction (on a Google pixel 6 device) in the compacting
thread's completion time by using UFFDIO_MOVE vs.  UFFDIO_COPY.  This was
measured using a benchmark that emulates a heap compaction implementation
using userfaultfd (to allow concurrent accesses by application threads). 
More details of the usecase are explained in [2].  Furthermore,
UFFDIO_MOVE enables moving swapped-out pages without touching them within
the same vma.  Today, it can only be done by mremap, however it forces
splitting the vma.

[1] https://lore.kernel.org/all/1425575884-2574-1-git-send-email-aarcange@redhat.com/
[2] https://lore.kernel.org/linux-mm/CA+EESO4uO84SSnBhArH4HvLNhaUQ5nZKNKXqxRCyjniNVjp0Aw@mail.gmail.com/

Update for the ioctl_userfaultfd(2)  manpage:

   UFFDIO_MOVE
       (Since Linux xxx)  Move a continuous memory chunk into the
       userfault registered range and optionally wake up the blocked
       thread. The source and destination addresses and the number of
       bytes to move are specified by the src, dst, and len fields of
       the uffdio_move structure pointed to by argp:

           struct uffdio_move {
               __u64 dst;    /* Destination of move */
               __u64 src;    /* Source of move */
               __u64 len;    /* Number of bytes to move */
               __u64 mode;   /* Flags controlling behavior of move */
               __s64 move;   /* Number of bytes moved, or negated error */
           };

       The following value may be bitwise ORed in mode to change the
       behavior of the UFFDIO_MOVE operation:

       UFFDIO_MOVE_MODE_DONTWAKE
              Do not wake up the thread that waits for page-fault
              resolution

       UFFDIO_MOVE_MODE_ALLOW_SRC_HOLES
              Allow holes in the source virtual range that is being moved.
              When not specified, the holes will result in ENOENT error.
              When specified, the holes will be accounted as successfully
              moved memory. This is mostly useful to move hugepage aligned
              virtual regions without knowing if there are transparent
              hugepages in the regions or not, but preventing the risk of
              having to split the hugepage during the operation.

       The move field is used by the kernel to return the number of
       bytes that was actually moved, or an error (a negated errno-
       style value).  If the value returned in move doesn't match the
       value that was specified in len, the operation fails with the
       error EAGAIN.  The move field is output-only; it is not read by
       the UFFDIO_MOVE operation.

       The operation may fail for various reasons. Usually, remapping of
       pages that are not exclusive to the given process fail; once KSM
       might deduplicate pages or fork() COW-shares pages during fork()
       with child processes, they are no longer exclusive. Further, the
       kernel might only perform lightweight checks for detecting whether
       the pages are exclusive, and return -EBUSY in case that check fails.
       To make the operation more likely to succeed, KSM should be
       disabled, fork() should be avoided or MADV_DONTFORK should be
       configured for the source VMA before fork().

       This ioctl(2) operation returns 0 on success.  In this case, the
       entire area was moved.  On error, -1 is returned and errno is
       set to indicate the error.  Possible errors include:

       EAGAIN The number of bytes moved (i.e., the value returned in
              the move field) does not equal the value that was
              specified in the len field.

       EINVAL Either dst or len was not a multiple of the system page
              size, or the range specified by src and len or dst and len
              was invalid.

       EINVAL An invalid bit was specified in the mode field.

       ENOENT
              The source virtual memory range has unmapped holes and
              UFFDIO_MOVE_MODE_ALLOW_SRC_HOLES is not set.

       EEXIST
              The destination virtual memory range is fully or partially
              mapped.

       EBUSY
              The pages in the source virtual memory range are either
              pinned or not exclusive to the process. The kernel might
              only perform lightweight checks for detecting whether the
              pages are exclusive. To make the operation more likely to
              succeed, KSM should be disabled, fork() should be avoided
              or MADV_DONTFORK should be configured for the source virtual
              memory area before fork().

       ENOMEM Allocating memory needed for the operation failed.

       ESRCH
              The target process has exited at the time of a UFFDIO_MOVE
              operation.

Link: https://lkml.kernel.org/r/20231206103702.3873743-3-surenb@google.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Nicolas Geoffray <ngeoffray@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: ZhangPeng <zhangpeng362@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-29 11:58:24 -08:00
SeongJae Park
e93b81a3fc Docs/admin-guide/mm/damon/usage: use a list for 'state' sysfs file input commands
There are eight command inputs for 'state' DAMON sysfs file, and those are
verbosely explained in multiple paragraphs.  It is not easy to find
explanation of specific command, and getting whole picture of supported
commands.  Replace the paragraphs with a list.

Link: https://lkml.kernel.org/r/20231213190338.54146-7-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-20 14:48:13 -08:00
SeongJae Park
9c8c315da2 Docs/admin-guide/mm/damon/usage: add links to sysfs files hierarchy
'Sysfs Files Hierarchy' section of DAMON usage document shows whole
picture of the interface.  Then sections for detailed explanation of the
files follow.  Due to the amount of the files, navigating between the
whole picture and the section for specific files sometimes require no
subtle amount of scrolling.  Add links from the whole picture to the
dedicated sections for making the navigation easier.

Link: https://lkml.kernel.org/r/20231213190338.54146-6-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-20 14:48:13 -08:00
SeongJae Park
c7ae9634a4 Docs/admin-guide/mm/damon/usage: update context directory section label
The label for context DAMON sysfs directory section is having name
sysfs_contexts.  The name would be better to be used for the contexts
directory.  Rename it to represent a single context.

Link: https://lkml.kernel.org/r/20231213190338.54146-5-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-20 14:48:13 -08:00
Ryan Roberts
3485b88390 mm: thp: introduce multi-size THP sysfs interface
In preparation for adding support for anonymous multi-size THP, introduce
new sysfs structure that will be used to control the new behaviours.  A
new directory is added under transparent_hugepage for each supported THP
size, and contains an `enabled` file, which can be set to "inherit" (to
inherit the global setting), "always", "madvise" or "never".  For now, the
kernel still only supports PMD-sized anonymous THP, so only 1 directory is
populated.

The first half of the change converts transhuge_vma_suitable() and
hugepage_vma_check() so that they take a bitfield of orders for which the
user wants to determine support, and the functions filter out all the
orders that can't be supported, given the current sysfs configuration and
the VMA dimensions.  The resulting functions are renamed to
thp_vma_suitable_orders() and thp_vma_allowable_orders() respectively. 
Convenience functions that take a single, unencoded order and return a
boolean are also defined as thp_vma_suitable_order() and
thp_vma_allowable_order().

The second half of the change implements the new sysfs interface.  It has
been done so that each supported THP size has a `struct thpsize`, which
describes the relevant metadata and is itself a kobject.  This is pretty
minimal for now, but should make it easy to add new per-thpsize files to
the interface if needed in future (e.g.  per-size defrag).  Rather than
keep the `enabled` state directly in the struct thpsize, I've elected to
directly encode it into huge_anon_orders_[always|madvise|inherit]
bitfields since this reduces the amount of work required in
thp_vma_allowable_orders() which is called for every page fault.

See Documentation/admin-guide/mm/transhuge.rst, as modified by this
commit, for details of how the new sysfs interface works.

[ryan.roberts@arm.com: fix build warning when CONFIG_SYSFS is disabled]
  Link: https://lkml.kernel.org/r/20231211125320.3997543-1-ryan.roberts@arm.com
Link: https://lkml.kernel.org/r/20231207161211.2374093-4-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Barry Song <v-songbaohua@oppo.com>
Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Tested-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Itaru Kitayama <itaru.kitayama@gmail.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Yin Fengwei <fengwei.yin@intel.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-20 14:48:12 -08:00
Shyam Sundar S K
d0ba7ad438 platform/x86/amd/pmf: Add support to update system state
PMF driver based on the output actions from the TA can request to update
the system states like entering s0i3, lock screen etc. by generating
an uevent. Based on the udev rules set in the userspace the event id
matching the uevent shall get updated accordingly using the systemctl.

Sample udev rules under Documentation/admin-guide/pmf.rst.

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Link: https://lore.kernel.org/r/20231212014705.2017474-9-Shyam-sundar.S-k@amd.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-12-18 12:47:46 +01:00
Vlastimil Babka
a3a2782745 Documentation, mm/unaccepted: document accept_memory kernel parameter
The accept_memory kernel parameter was added in commit dcdfdd40fa
("mm: Add support for unaccepted memory") but not listed in the
kernel-parameters doc. Add it there.

Acked-by: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20231214-accept_memory_param-v2-1-f38cd20a0247@suse.cz
2023-12-15 09:29:34 -07:00
Alexander Graf
2678fd2fe9 initramfs: Expose retained initrd as sysfs file
When the kernel command line option "retain_initrd" is set, we do not
free the initrd memory. However, we also don't expose it to anyone for
consumption. That leaves us in a weird situation where the only user of
this feature is ppc64 and arm64 specific kexec tooling.

To make it more generally useful, this patch adds a kobject to the
firmware object that contains the initrd context when "retain_initrd"
is set. That way, we can access the initrd any time after boot from
user space and for example hand it into kexec as --initrd parameter
if we want to reboot the same initrd. Or inspect it directly locally.

With this patch applied, there is a new /sys/firmware/initrd file when
the kernel was booted with an initrd and "retain_initrd" command line
option is set.

Signed-off-by: Alexander Graf <graf@amazon.com>
Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://lore.kernel.org/r/20231207235654.16622-1-graf@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-15 17:23:00 +01:00
Rex Nie
121d0ba224 Documentation: Remove redundant file names from examples
Since most examples use the ddcmd alias, remove the redundant file names

Signed-off-by: Rex Nie <rex.nie@jaguarmicro.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20231213073735.2850-1-rex.nie@jaguarmicro.com
2023-12-15 09:15:05 -07:00
Eric Dumazet
4944566706 net: increase optmem_max default value
For many years, /proc/sys/net/core/optmem_max default value
on a 64bit kernel has been 20 KB.

Regular usage of TCP tx zerocopy needs a bit more.

Google has used 128KB as the default value for 7 years without
any problem.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-15 11:01:26 +00:00
Shuai Xue
cae40614cd docs: perf: Add description for Synopsys DesignWare PCIe PMU driver
Alibaba's T-Head Yitan 710 SoC includes Synopsys' DesignWare Core PCIe
controller which implements PMU for performance and functional debugging to
facilitate system maintenance.

Document it to provide guidance on how to use it.

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20231208025652.87192-2-xueshuai@linux.alibaba.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-12-13 13:35:41 +00:00
SeongJae Park
6140edeea8 Docs/admin-guide/mm/damon/usage: document for quota goals
Update DAMON sysfs usage for newly added DAMOS quota goals interface.

Link: https://lkml.kernel.org/r/20231130023652.50284-10-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:57:05 -08:00
Nhat Pham
b5ba474f3f zswap: shrink zswap pool based on memory pressure
Currently, we only shrink the zswap pool when the user-defined limit is
hit.  This means that if we set the limit too high, cold data that are
unlikely to be used again will reside in the pool, wasting precious
memory.  It is hard to predict how much zswap space will be needed ahead
of time, as this depends on the workload (specifically, on factors such as
memory access patterns and compressibility of the memory pages).

This patch implements a memcg- and NUMA-aware shrinker for zswap, that is
initiated when there is memory pressure.  The shrinker does not have any
parameter that must be tuned by the user, and can be opted in or out on a
per-memcg basis.

Furthermore, to make it more robust for many workloads and prevent
overshrinking (i.e evicting warm pages that might be refaulted into
memory), we build in the following heuristics:

* Estimate the number of warm pages residing in zswap, and attempt to
  protect this region of the zswap LRU.
* Scale the number of freeable objects by an estimate of the memory
  saving factor. The better zswap compresses the data, the fewer pages
  we will evict to swap (as we will otherwise incur IO for relatively
  small memory saving).
* During reclaim, if the shrinker encounters a page that is also being
  brought into memory, the shrinker will cautiously terminate its
  shrinking action, as this is a sign that it is touching the warmer
  region of the zswap LRU.

As a proof of concept, we ran the following synthetic benchmark: build the
linux kernel in a memory-limited cgroup, and allocate some cold data in
tmpfs to see if the shrinker could write them out and improved the overall
performance.  Depending on the amount of cold data generated, we observe
from 14% to 35% reduction in kernel CPU time used in the kernel builds.

[nphamcs@gmail.com: check shrinker enablement early, use less costly stat flushing]
  Link: https://lkml.kernel.org/r/20231206194456.3234203-1-nphamcs@gmail.com
Link: https://lkml.kernel.org/r/20231130194023.4102148-7-nphamcs@gmail.com
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: Chris Li <chrisl@kernel.org>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-12 10:57:02 -08:00
Paul E. McKenney
4e58aaeebb rcu: Restrict access to RCU CPU stall notifiers
Although the RCU CPU stall notifiers can be useful for dumping state when
tracking down delicate forward-progress bugs where NUMA effects cause
cache lines to be delivered to a given CPU regularly, but always in a
state that prevents that CPU from making forward progress.  These bugs can
be detected by the RCU CPU stall-warning mechanism, but in some cases,
the stall-warnings printk()s disrupt the forward-progress bug before
any useful state can be obtained.

Unfortunately, the notifier mechanism added by commit 5b404fdaba ("rcu:
Add RCU CPU stall notifier") can make matters worse if used at all
carelessly. For example, if the stall warning was caused by a lock not
being released, then any attempt to acquire that lock in the notifier
will hang. This will prevent not only the notifier from producing any
useful output, but it will also prevent the stall-warning message from
ever appearing.

This commit therefore hides this new RCU CPU stall notifier
mechanism under a new RCU_CPU_STALL_NOTIFIER Kconfig option that
depends on both DEBUG_KERNEL and RCU_EXPERT.  In addition, the
rcupdate.rcu_cpu_stall_notifiers=1 kernel boot parameter must also
be specified.  The RCU_CPU_STALL_NOTIFIER Kconfig option's help text
contains a warning and explains the dangers of careless use, recommending
lockless notifier code.  In addition, a WARN() is triggered each time
that an attempt is made to register a stall-warning notifier in kernels
built with CONFIG_RCU_CPU_STALL_NOTIFIER=y.

This combination of measures will keep use of this mechanism confined to
debug kernels and away from routine deployments.

[ paulmck: Apply Dan Carpenter feedback. ]

Fixes: 5b404fdaba ("rcu: Add RCU CPU stall notifier")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Neeraj Upadhyay (AMD) <neeraj.iitr10@gmail.com>
2023-12-12 02:31:22 +05:30
Sergey Senozhatsky
a7a0350583 zram: split memory-tracking and ac-time tracking
ZRAM_MEMORY_TRACKING enables two features:
- per-entry ac-time tracking
- debugfs interface

The latter one is the reason why memory-tracking depends on DEBUG_FS,
while the former one is used far beyond debugging these days.  Namely
ac-time is used for fine grained writeback of idle entries (pages).

Move ac-time tracking under its own config option so that it can be
enabled (along with writeback) on systems without DEBUG_FS.

[senozhatsky@chromium.org: ifdef fixup, per Dmytro]
  Link: https://lkml.kernel.org/r/20231117013543.540280-1-senozhatsky@chromium.org
Link: https://lkml.kernel.org/r/20231115024223.4133148-1-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Dmytro Maluka <dmaluka@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:40 -08:00
Andrei Vagin
e6a9a2cbc1 fs/proc/task_mmu: report SOFT_DIRTY bits through the PAGEMAP_SCAN ioctl
The PAGEMAP_SCAN ioctl returns information regarding page table entries. 
It is more efficient compared to reading pagemap files.  CRIU can start to
utilize this ioctl, but it needs info about soft-dirty bits to track
memory changes.

We are aware of a new method for tracking memory changes implemented in
the PAGEMAP_SCAN ioctl.  For CRIU, the primary advantage of this method is
its usability by unprivileged users.  However, it is not feasible to
transparently replace the soft-dirty tracker with the new one.  The main
problem here is userfault descriptors that have to be preserved between
pre-dump iterations.  It means criu continues supporting the soft-dirty
method to avoid breakage for current users.  The new method will be
implemented as a separate feature.

[avagin@google.com: update tools/include/uapi/linux/fs.h]
  Link: https://lkml.kernel.org/r/20231107164139.576046-1-avagin@google.com
Link: https://lkml.kernel.org/r/20231106220959.296568-1-avagin@google.com
Signed-off-by: Andrei Vagin <avagin@google.com>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-10 16:51:35 -08:00
Detlev Casanova
357547b876 doc: media: visl: Add AV1 support
Add AV1 information in visl documentation.

Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Signed-off-by: Detlev Casanova <detlev.casanova@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
2023-12-07 08:31:14 +01:00
Xu Yang
9745295358 docs/perf: Add explanation for DDR_CAP_AXI_ID_PORT_CHANNEL_FILTER quirk
Add explanation for DDR_CAP_AXI_ID_PORT_CHANNEL_FILTER quirk.

Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20231120093317.2652866-2-xu.yang_2@nxp.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-12-05 14:12:07 +00:00
Josh Don
7b91eb6000 cgroup: Fix documentation for cpu.idle
Two problems:
	- cpu.idle cgroups show up with 0 weight, correct the
	  documentation to indicate this.
	- cpu.idle has no entry describing it.

Signed-off-by: Josh Don <joshdon@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2023-12-01 06:49:32 -10:00
Waiman Long
877c737db9 cgroup/cpuset: Expose cpuset.cpus.isolated
The root-only cpuset.cpus.isolated control file shows the current set
of isolated CPUs in isolated partitions. This control file is currently
exposed only with the cgroup_debug boot command line option which also
adds the ".__DEBUG__." prefix. This is actually a useful control file if
users want to find out which CPUs are currently in an isolated state by
the cpuset controller. Remove CFTYPE_DEBUG flag for this control file and
make it available by default without any prefix.

The test_cpuset_prs.sh test script and the cgroup-v2.rst documentation
file are also updated accordingly. Minor code change is also made in
test_cpuset_prs.sh to avoid false test failure when running on debug
kernel.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2023-11-28 06:45:11 -10:00
Tomas Mudrunka
39ff20f5fd /proc/sysrq-trigger: accept multiple keys at once
This way we can do:
`echo _reisub > /proc/sysrq-trigger`
Instead of:
`for i in r e i s u b; do echo "$i" > /proc/sysrq-trigger; done;`

This can be very useful when trying to execute sysrq combo remotely
or from userspace. When sending keys in multiple separate writes,
userspace (eg. bash or ssh) can be killed before whole combo is completed.
Therefore putting all keys in single write is more robust approach.

Signed-off-by: Tomas Mudrunka <tomas.mudrunka@gmail.com>
Reviewed-by: Jiri Slaby <jirislaby@kernel.org>
Link: https://lore.kernel.org/r/20231120111451.527952-1-tomas.mudrunka@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-25 07:23:16 +00:00
Manikanta Guntupalli
0be916a68c Documentation: devices.txt: Update ttyUL major number allocation details
Describe when uartlite driver uses static/dynamic allocation for major
number based on maximum number of uartlite serial ports.

Signed-off-by: Manikanta Guntupalli <manikanta.guntupalli@amd.com>
Link: https://lore.kernel.org/r/20231116134003.3762725-2-manikanta.guntupalli@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-25 07:23:16 +00:00
Hardik Gajjar
5a1ccf0c72 usb: new quirk to reduce the SET_ADDRESS request timeout
This patch introduces a new USB quirk,
USB_QUIRK_SHORT_SET_ADDRESS_REQ_TIMEOUT, which modifies the timeout value
for the SET_ADDRESS request. The standard timeout for USB request/command
is 5000 ms, as recommended in the USB 3.2 specification (section 9.2.6.1).

However, certain scenarios, such as connecting devices through an APTIV
hub, can lead to timeout errors when the device enumerates as full speed
initially and later switches to high speed during chirp negotiation.

In such cases, USB analyzer logs reveal that the bus suspends for
5 seconds due to incorrect chirp parsing and resumes only after two
consecutive timeout errors trigger a hub driver reset.

Packet(54) Dir(?) Full Speed J(997.100 us) Idle(  2.850 us)
_______| Time Stamp(28 . 105 910 682)
_______|_____________________________________________________________Ch0
Packet(55) Dir(?) Full Speed J(997.118 us) Idle(  2.850 us)
_______| Time Stamp(28 . 106 910 632)
_______|_____________________________________________________________Ch0
Packet(56) Dir(?) Full Speed J(399.650 us) Idle(222.582 us)
_______| Time Stamp(28 . 107 910 600)
_______|_____________________________________________________________Ch0
Packet(57) Dir Chirp J( 23.955 ms) Idle(115.169 ms)
_______| Time Stamp(28 . 108 532 832)
_______|_____________________________________________________________Ch0
Packet(58) Dir(?) Full Speed J (Suspend)( 5.347 sec) Idle(  5.366 us)
_______| Time Stamp(28 . 247 657 600)
_______|_____________________________________________________________Ch0

This 5-second delay in device enumeration is undesirable, particularly
in automotive applications where quick enumeration is crucial
(ideally within 3 seconds).

The newly introduced quirks provide the flexibility to align with a
3-second time limit, as required in specific contexts like automotive
applications.

By reducing the SET_ADDRESS request timeout to 500 ms, the
system can respond more swiftly to errors, initiate rapid recovery, and
ensure efficient device enumeration. This change is vital for scenarios
where rapid smartphone enumeration and screen projection are essential.

To use the quirk, please write "vendor_id:product_id:p" to
/sys/bus/usb/drivers/hub/module/parameter/quirks

For example,
echo "0x2c48:0x0132:p" > /sys/bus/usb/drivers/hub/module/parameters/quirks"

Signed-off-by: Hardik Gajjar <hgajjar@de.adit-jv.com>
Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://lore.kernel.org/r/20231027152029.104363-2-hgajjar@de.adit-jv.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-23 12:32:44 +00:00
Jonathan Corbet
d591aefc66 Merge branch 'vegard' into docs-mw
Vegard Nossum writes:

  This patch series replaces some instances of 'class:: toc-title' with
  toctree's :caption: attribute, see the last patch in the series for some
  more rationale/explanation.
2023-11-17 13:07:51 -07:00
Vegard Nossum
a37a44572f media: admin-guide: properly format ToC heading
"class:: toc-title" was a workaround for older Sphinx versions that are
no longer supported.

The canonical way to add a heading to the ToC is to use :caption:.
Do that.

Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20231027081830.195056-4-vegard.nossum@oracle.com
2023-11-17 13:05:26 -07:00
Jack Zhu
f72f80550d media: admin-guide: Add starfive_camss.rst for Starfive Camera Subsystem
Add starfive_camss.rst file that documents the Starfive Camera
Subsystem driver which is used for handing image sensor data.

Signed-off-by: Jack Zhu <jack.zhu@starfivetech.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
2023-11-16 12:13:25 +01:00
Javier Martinez Canillas
c986968fe9
regulator: core: Add option to prevent disabling unused regulators
This may be useful for debugging and develompent purposes, when there are
drivers that depend on regulators to be enabled but do not request them.

It is inspired from the clk_ignore_unused and pd_ignore_unused parameters,
that are used to keep firmware-enabled clocks and power domains on even if
these are not used by drivers.

The parameter is not expected to be used in normal cases and should not be
needed on a platform with proper driver support.

Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Brian Masney <bmasney@redhat.com>
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20231107190926.1185326-1-javierm@redhat.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2023-11-13 01:26:29 +00:00
Waiman Long
72c6303acf cgroup/cpuset: Take isolated CPUs out of workqueue unbound cpumask
To make CPUs in isolated cpuset partition closer in isolation to
the boot time isolated CPUs specified in the "isolcpus" boot command
line option, we need to take those CPUs out of the workqueue unbound
cpumask so that work functions from the unbound workqueues won't run
on those CPUs.  Otherwise, they will interfere the user tasks running
on those isolated CPUs.

With the introduction of the workqueue_unbound_exclude_cpumask() helper
function in an earlier commit, those isolated CPUs can now be taken
out from the workqueue unbound cpumask.

This patch also updates cgroup-v2.rst to mention that isolated
CPUs will be excluded from unbound workqueue cpumask as well as
updating test_cpuset_prs.sh to verify the correctness of the new
*cpuset.cpus.isolated file, if available via cgroup_debug option.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2023-11-12 15:07:41 -06:00
Linus Torvalds
4bbdb725a3 IOMMU Updates for Linux v6.7
Including:
 
 	- Core changes:
 	  - Make default-domains mandatory for all IOMMU drivers
 	  - Remove group refcounting
 	  - Add generic_single_device_group() helper and consolidate
 	    drivers
 	  - Cleanup map/unmap ops
 	  - Scaling improvements for the IOVA rcache depot
 	  - Convert dart & iommufd to the new domain_alloc_paging()
 
 	- ARM-SMMU:
 	  - Device-tree binding update:
 	    - Add qcom,sm7150-smmu-v2 for Adreno on SM7150 SoC
 	  - SMMUv2:
 	    - Support for Qualcomm SDM670 (MDSS) and SM7150 SoCs
 	  - SMMUv3:
 	    - Large refactoring of the context descriptor code to
 	      move the CD table into the master, paving the way
 	      for '->set_dev_pasid()' support on non-SVA domains
 	  - Minor cleanups to the SVA code
 
 	- Intel VT-d:
 	  - Enable debugfs to dump domain attached to a pasid
 	  - Remove an unnecessary inline function.
 
 	- AMD IOMMU:
 	  - Initial patches for SVA support (not complete yet)
 
 	- S390 IOMMU:
 	  - DMA-API conversion and optimized IOTLB flushing
 
 	- Some smaller fixes and improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmVJFcEACgkQK/BELZcB
 GuMgDxAAsnYVQjQ7wRkwR0rHARuEaJ+Lz2vkLNH+uYXjBzhFe2bT+ykMcZysAkdK
 A5PMLOFT5Etf+PAqOM0CoIGQFOefAId6uGl7S61Fp9ZWDKhMrOBFWhxGOaufA1Du
 tNvt3i66hwPSDZa82kY3wRCluYtj0aBBzmM6ZTwBwFZdQ7LABMtE8OxisqncVvq0
 H6vhV213fqvhCFSQJ6PnTAEiv70WvWBWygA+Z/gwYf9hypZQae91PNXdK9313a9z
 OvCzGBkL/R5/3KkJd88UhFwyYzyNGxq/DmH1etawYR5gYZ8UT/Z/sYpcx9hlO7qr
 eENPqeQc+YHZXpKqkaq66HBA1FSnXUqRZLl4cVaZahRRMe/yArsBM6R0W1AfkMAR
 rZxwHKoHUWeuHQLMVvmSDNL57h/GJJpTXjRc8HMxLZkVp+ScvnT5XCYHWWzRdCdx
 TcC/pJ1tet0FQ8rw09ovlwpGVA6eojWvcpVbLVLfGN8ZWViSVfvNFoPNb7HsGK6M
 iRi+L41Y7s63cyogC/Gsae2RAvYv29ZpvE91lmon2u+VBlTpMdOFX9EhWS6RqOBF
 cV30bhsw0dyCB7v5jDPtABYEOaR6l1mPLhn1gX3u0Ue/tmPhLX69k4bVWBY6wP3p
 gmmJD9ub8FuPQtFCGPE7/8ZINjGGrfiKO24DNI2Ty3XEeq21hU4=
 =UyWC
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:
 "Core changes:
   - Make default-domains mandatory for all IOMMU drivers
   - Remove group refcounting
   - Add generic_single_device_group() helper and consolidate drivers
   - Cleanup map/unmap ops
   - Scaling improvements for the IOVA rcache depot
   - Convert dart & iommufd to the new domain_alloc_paging()

  ARM-SMMU:
   - Device-tree binding update:
       - Add qcom,sm7150-smmu-v2 for Adreno on SM7150 SoC
   - SMMUv2:
       - Support for Qualcomm SDM670 (MDSS) and SM7150 SoCs
   - SMMUv3:
       - Large refactoring of the context descriptor code to move the CD
         table into the master, paving the way for '->set_dev_pasid()'
         support on non-SVA domains
   - Minor cleanups to the SVA code

  Intel VT-d:
   - Enable debugfs to dump domain attached to a pasid
   - Remove an unnecessary inline function

  AMD IOMMU:
   - Initial patches for SVA support (not complete yet)

  S390 IOMMU:
   - DMA-API conversion and optimized IOTLB flushing

  And some smaller fixes and improvements"

* tag 'iommu-updates-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (102 commits)
  iommu/dart: Remove the force_bypass variable
  iommu/dart: Call apple_dart_finalize_domain() as part of alloc_paging()
  iommu/dart: Convert to domain_alloc_paging()
  iommu/dart: Move the blocked domain support to a global static
  iommu/dart: Use static global identity domains
  iommufd: Convert to alloc_domain_paging()
  iommu/vt-d: Use ops->blocked_domain
  iommu/vt-d: Update the definition of the blocking domain
  iommu: Move IOMMU_DOMAIN_BLOCKED global statics to ops->blocked_domain
  Revert "iommu/vt-d: Remove unused function"
  iommu/amd: Remove DMA_FQ type from domain allocation path
  iommu: change iommu_map_sgtable to return signed values
  iommu/virtio: Add __counted_by for struct viommu_request and use struct_size()
  iommu/vt-d: debugfs: Support dumping a specified page table
  iommu/vt-d: debugfs: Create/remove debugfs file per {device, pasid}
  iommu/vt-d: debugfs: Dump entry pointing to huge page
  iommu/vt-d: Remove unused function
  iommu/arm-smmu-v3-sva: Remove bond refcount
  iommu/arm-smmu-v3-sva: Remove unused iommu_sva handle
  iommu/arm-smmu-v3: Rename cdcfg to cd_table
  ...
2023-11-09 13:37:28 -08:00
Linus Torvalds
6bc986ab83 NFS client updates for Linux 6.7
Highlights include:
 
 Bugfixes:
  - SUNRPC: A fix to re-probe the target RPC port after an ECONNRESET error
  - SUNRPC: Handle allocation errors from rpcb_call_async()
  - SUNRPC: Fix a use-after-free condition in rpc_pipefs
  - SUNRPC: fix up various checks for timeouts
  - NFSv4.1: Handle NFS4ERR_DELAY errors during session trunking
  - NFSv4.1: fix SP4_MACH_CRED protection for pnfs IO
  - NFSv4: Ensure that we test all delegations when the server notifies
    us that it may have revoked some of them
 
 Features:
  - Allow knfsd processes to break out of NFS4ERR_DELAY loops when
    re-exporting NFSv4.x by setting appropriate values for the
    'delay_retrans' module parameter.
  - nfs: Convert nfs_symlink() to use a folio
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESQctxSBg8JpV8KqEZwvnipYKAPIFAmVL62IACgkQZwvnipYK
 API4Ww/+I75hmEdI4i/6v8WnrLWLWzmCgybez2AfrKYtuYmEcDZTf2K4pNVEJxFG
 PJ4TYRpaSgwxCXqoup5INOdL5gS8g3JbAlTqrDQ8nYGeBXETN9tN5n2Xj8liHk2l
 OrnYzgTjC2pwpWDZq/W1LCQMatCbs9XhpmyBvqZH5r7tAOVGkrk1ICZ/r+/nGiN8
 LAbm/I0M1Jp0iJisN/i/0CsEgLQMCfeQVrtEMCZGsoVS79Mr/W1hF3KiGognI/xz
 FvEXnZKauw9npu7U7ckhHZcHd8oQxby0Q0Xny/IpgiO2Z1YqfKCvJSK1sjBt/lFu
 7fe2HGMFfcTMx/bn/qdUJR351607rBi4h3t1OfK4KIxV1gSLUyS/RR7ayFx5A0gM
 pFJcC0ZnKoiVr2vSNMMguennbyjScqNOC5ECb+bpMAyDV7suF9e/khK12CdYwqFm
 cpeUUD35GZT+RjP3htI92Pj0bqBOHx+o7qEi+MEel3t90us8PPBrWywHk7Tw40vJ
 eUxH0XQtZVswH6TvekNdoMXx8TXSAbx4I6Hw3fmNGWLRp494p3GVSmWQmctcvCi7
 Y6E1P3YT8G1OeI9+fIQr5Wp2F9sOdFPrb3BrAojj9ndJ3ZqQAcx+gY5z2RVnvvur
 PTsLInFxS8+WvvqPtQCZZ5UxnrcPFCk1js0rg6EqdjZmq6+OqJU=
 =TNRl
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-6.7-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates from Trond Myklebust:
 "Bugfixes:

   - SUNRPC:
       - re-probe the target RPC port after an ECONNRESET error
       - handle allocation errors from rpcb_call_async()
       - fix a use-after-free condition in rpc_pipefs
       - fix up various checks for timeouts

   - NFSv4.1:
       - Handle NFS4ERR_DELAY errors during session trunking
       - fix SP4_MACH_CRED protection for pnfs IO

   - NFSv4:
       - Ensure that we test all delegations when the server notifies
         us that it may have revoked some of them

  Features:

   - Allow knfsd processes to break out of NFS4ERR_DELAY loops when
     re-exporting NFSv4.x by setting appropriate values for the
     'delay_retrans' module parameter

   - nfs: Convert nfs_symlink() to use a folio"

* tag 'nfs-for-6.7-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  nfs: Convert nfs_symlink() to use a folio
  SUNRPC: Fix RPC client cleaned up the freed pipefs dentries
  NFSv4.1: fix SP4_MACH_CRED protection for pnfs IO
  SUNRPC: Add an IS_ERR() check back to where it was
  NFSv4.1: fix handling NFS4ERR_DELAY when testing for session trunking
  nfs41: drop dependency between flexfiles layout driver and NFSv3 modules
  NFSv4: fairly test all delegations on a SEQ4_ revocation
  SUNRPC: SOFTCONN tasks should time out when on the sending list
  SUNRPC: Force close the socket when a hard error is reported
  SUNRPC: Don't skip timeout checks in call_connect_status()
  SUNRPC: ECONNRESET might require a rebind
  NFSv4/pnfs: Allow layoutget to return EAGAIN for softerr mounts
  NFSv4: Add a parameter to limit the number of retries after NFS4ERR_DELAY
2023-11-08 13:39:16 -08:00
Linus Torvalds
be3ca57cfb media updates for v6.7-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+QmuaPwR3wnBdVwACF8+vY7k4RUFAmVF2z0ACgkQCF8+vY7k
 4RUyHBAAhO7ArWtie5SZZ2lYzeoQ2KWZJsiRUdl7ER+lXeKr5HIa23CqVG5+D3hA
 2VQAn/+2wJHMhfSZUcgS889iKGJMhdEj77JBehakTA0122wq/0NNMfbwN0ebHoIZ
 B5FqhXkU8NvQn+8MVyRSnmC7lzlZq7lUlDxbpjCkqOqm5t1TXuMCD81briZxuKWR
 N+STu3rsQ1Vq+HudAqLHcuQKCJjzqo5x2/MOk7DlI+FHtKPLn50CfizmZNiMIn/2
 lVfp6PoZhtBCJAlQFx3VHjYIir5ENvcmdj0ehsocVe4vYFFfBh0NrN8/ixcWyl0i
 z4BSC9/AQTJuAt2mxj2g/OE9ipFqGkhHspSy87GWqCSzIKIKuZYFRHB55e6h6/kA
 11MceDQ+VNmO6dkU4G6/dChaeXt+5omU9mlEaugzmtb/G0HbvYW5jJJuvVMmmGde
 Gy2F2SazGJsfLLBS+I7yKJRDhn5+m+9Q0gCsiKDbEcDoRLrwsi5zraRRVrsKI9q7
 CAFMrU5MCzniMh1UpJxdETPbuxjc54/Uwp3k3ieg7klIyx2rxvL6MzED9O67qkay
 1m0A8hRNpvi+gqS5Zd+V9YafgOEHDziL/KDypp1je+6KbLBvlBsH3YFBps2APJou
 +VgVElxZnwzas4kAkUC+RbCEmRXc/T96oL1mH8w1q4O3iDoaMVc=
 =RO+Y
 -----END PGP SIGNATURE-----

Merge tag 'media/v6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media updates from Mauro Carvalho Chehab:

 - the old V4L2 core videobuf kAPI was finally removed. All media
   drivers should now be using VB2 kAPI

 - new automotive driver: mgb4

 - new platform video driver: npcm-video

 - new sensor driver: mt9m114

 - new TI driver used in conjunction with Cadence CSI2RX IP to bridge
   TI-specific parts

 - ir-rx51 was removed and the N900 DT binding was moved to the
   pwm-ir-tx generic driver

 - drop atomisp-specific ov5693, using the upstream driver instead

 - the camss driver has gained RDI3 support for VFE 17x

 - the atomisp driver now detects ISP2400 or ISP2401 at run time. No
   need to set it up at build time anymore

 - lots of driver fixes, cleanups and improvements

* tag 'media/v6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (377 commits)
  media: nuvoton: VIDEO_NPCM_VCD_ECE should depend on ARCH_NPCM
  media: venus: Fix firmware path for resources
  media: venus: hfi_cmds: Replace one-element array with flex-array member and use __counted_by
  media: venus: hfi_parser: Add check to keep the number of codecs within range
  media: venus: hfi: add checks to handle capabilities from firmware
  media: venus: hfi: fix the check to handle session buffer requirement
  media: venus: hfi: add checks to perform sanity on queue pointers
  media: platform: cadence: select MIPI_DPHY dependency
  media: MAINTAINERS: Fix path for J721E CSI2RX bindings
  media: cec: meson: always include meson sub-directory in Makefile
  media: videobuf2: Fix IS_ERR checking in vb2_dc_put_userptr()
  media: platform: mtk-mdp3: fix uninitialized variable in mdp_path_config()
  media: mediatek: vcodec: using encoder device to alloc/free encoder memory
  media: imx-jpeg: notify source chagne event when the first picture parsed
  media: cx231xx: Use EP5_BUF_SIZE macro
  media: siano: Drop unnecessary error check for debugfs_create_dir/file()
  media: mediatek: vcodec: Handle invalid encoder vsi
  media: aspeed: Drop unnecessary error check for debugfs_create_file()
  Documentation: media: buffer.rst: fix V4L2_BUF_FLAG_PREPARED
  Documentation: media: gen-errors.rst: fix confusing ENOTTY description
  ...
2023-11-06 15:06:06 -08:00
Linus Torvalds
0a23fb262d Major microcode loader restructuring, cleanup and improvements by Thomas
Gleixner:
 
 - Restructure the code needed for it and add a temporary initrd mapping
   on 32-bit so that the loader can access the microcode blobs. This in
   itself is a preparation for the next major improvement:
 
 - Do not load microcode on 32-bit before paging has been enabled.
   Handling this has caused an endless stream of headaches, issues, ugly
   code and unnecessary hacks in the past. And there really wasn't any
   sensible reason to do that in the first place. So switch the 32-bit
   loading to happen after paging has been enabled and turn the loader
   code "real purrty" again
 
 - Drop mixed microcode steppings loading on Intel - there, a single patch
   loaded on the whole system is sufficient
 
 - Rework late loading to track which CPUs have updated microcode
   successfully and which haven't, act accordingly
 
 - Move late microcode loading on Intel in NMI context in order to
   guarantee concurrent loading on all threads
 
 - Make the late loading CPU-hotplug-safe and have the offlined threads
   be woken up for the purpose of the update
 
 - Add support for a minimum revision which determines whether late
   microcode loading is safe on a machine and the microcode does not
   change software visible features which the machine cannot use anyway
   since feature detection has happened already. Roughly, the minimum
   revision is the smallest revision number which must be loaded
   currently on the system so that late updates can be allowed
 
 - Other nice leanups, fixess, etc all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmVE0xkACgkQEsHwGGHe
 VUrCuBAAhOqqwkYPiGXPWd2hvdn1zGtD5fvEdXn3Orzd+Lwc6YaQTsCxCjIO/0ws
 8inpPFuOeGz4TZcplzipi3G5oatPVc7ORDuW+/BvQQQljZOsSKfhiaC29t6dvS6z
 UG3sbCXKVwlJ5Kwv3Qe4eWur4Ex6GeFDZkIvBCmbaAdGPFlfu1i2uO1yBooNP1Rs
 GiUkp+dP1/KREWwR/dOIsHYL2QjWIWfHQEWit/9Bj46rxE9ERx/TWt3AeKPfKriO
 Wp0JKp6QY78jg6a0a2/JVmbT1BKz69Z9aPp6hl4P2MfbBYOnqijRhdezFW0NyqV2
 pn6nsuiLIiXbnSOEw0+Wdnw5Q0qhICs5B5eaBfQrwgfZ8pxPHv2Ir777GvUTV01E
 Dv0ZpYsHa+mHe17nlK8V3+4eajt0PetExcXAYNiIE+pCb7pLjjKkl8e+lcEvEsO0
 QSL3zG5i5RWUMPYUvaFRgepWy3k/GPIoDQjRcUD3P+1T0GmnogNN10MMNhmOzfWU
 pyafe4tJUOVsq0HJ7V/bxIHk2p+Q+5JLKh5xBm9janE4BpabmSQnvFWNblVfK4ig
 M9ohjI/yMtgXROC4xkNXgi8wE5jfDKBghT6FjTqKWSV45vknF1mNEjvuaY+aRZ3H
 MB4P3HCj+PKWJimWHRYnDshcytkgcgVcYDiim8va/4UDrw8O2ks=
 =JOZu
 -----END PGP SIGNATURE-----

Merge tag 'x86_microcode_for_v6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 microcode loading updates from Borislac Petkov:
 "Major microcode loader restructuring, cleanup and improvements by
  Thomas Gleixner:

   - Restructure the code needed for it and add a temporary initrd
     mapping on 32-bit so that the loader can access the microcode
     blobs. This in itself is a preparation for the next major
     improvement:

   - Do not load microcode on 32-bit before paging has been enabled.

     Handling this has caused an endless stream of headaches, issues,
     ugly code and unnecessary hacks in the past. And there really
     wasn't any sensible reason to do that in the first place. So switch
     the 32-bit loading to happen after paging has been enabled and turn
     the loader code "real purrty" again

   - Drop mixed microcode steppings loading on Intel - there, a single
     patch loaded on the whole system is sufficient

   - Rework late loading to track which CPUs have updated microcode
     successfully and which haven't, act accordingly

   - Move late microcode loading on Intel in NMI context in order to
     guarantee concurrent loading on all threads

   - Make the late loading CPU-hotplug-safe and have the offlined
     threads be woken up for the purpose of the update

   - Add support for a minimum revision which determines whether late
     microcode loading is safe on a machine and the microcode does not
     change software visible features which the machine cannot use
     anyway since feature detection has happened already. Roughly, the
     minimum revision is the smallest revision number which must be
     loaded currently on the system so that late updates can be allowed

   - Other nice leanups, fixess, etc all over the place"

* tag 'x86_microcode_for_v6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
  x86/microcode/intel: Add a minimum required revision for late loading
  x86/microcode: Prepare for minimal revision check
  x86/microcode: Handle "offline" CPUs correctly
  x86/apic: Provide apic_force_nmi_on_cpu()
  x86/microcode: Protect against instrumentation
  x86/microcode: Rendezvous and load in NMI
  x86/microcode: Replace the all-in-one rendevous handler
  x86/microcode: Provide new control functions
  x86/microcode: Add per CPU control field
  x86/microcode: Add per CPU result state
  x86/microcode: Sanitize __wait_for_cpus()
  x86/microcode: Clarify the late load logic
  x86/microcode: Handle "nosmt" correctly
  x86/microcode: Clean up mc_cpu_down_prep()
  x86/microcode: Get rid of the schedule work indirection
  x86/microcode: Mop up early loading leftovers
  x86/microcode/amd: Use cached microcode for AP load
  x86/microcode/amd: Cache builtin/initrd microcode early
  x86/microcode/amd: Cache builtin microcode too
  x86/microcode/amd: Use correct per CPU ucode_cpu_info
  ...
2023-11-04 08:46:37 -10:00
Linus Torvalds
ecae0bd517 Many singleton patches against the MM code. The patch series which are
included in this merge do the following:
 
 - Kemeng Shi has contributed some compation maintenance work in the
   series "Fixes and cleanups to compaction".
 
 - Joel Fernandes has a patchset ("Optimize mremap during mutual
   alignment within PMD") which fixes an obscure issue with mremap()'s
   pagetable handling during a subsequent exec(), based upon an
   implementation which Linus suggested.
 
 - More DAMON/DAMOS maintenance and feature work from SeongJae Park i the
   following patch series:
 
 	mm/damon: misc fixups for documents, comments and its tracepoint
 	mm/damon: add a tracepoint for damos apply target regions
 	mm/damon: provide pseudo-moving sum based access rate
 	mm/damon: implement DAMOS apply intervals
 	mm/damon/core-test: Fix memory leaks in core-test
 	mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval
 
 - In the series "Do not try to access unaccepted memory" Adrian Hunter
   provides some fixups for the recently-added "unaccepted memory' feature.
   To increase the feature's checking coverage.  "Plug a few gaps where
   RAM is exposed without checking if it is unaccepted memory".
 
 - In the series "cleanups for lockless slab shrink" Qi Zheng has done
   some maintenance work which is preparation for the lockless slab
   shrinking code.
 
 - Qi Zheng has redone the earlier (and reverted) attempt to make slab
   shrinking lockless in the series "use refcount+RCU method to implement
   lockless slab shrink".
 
 - David Hildenbrand contributes some maintenance work for the rmap code
   in the series "Anon rmap cleanups".
 
 - Kefeng Wang does more folio conversions and some maintenance work in
   the migration code.  Series "mm: migrate: more folio conversion and
   unification".
 
 - Matthew Wilcox has fixed an issue in the buffer_head code which was
   causing long stalls under some heavy memory/IO loads.  Some cleanups
   were added on the way.  Series "Add and use bdev_getblk()".
 
 - In the series "Use nth_page() in place of direct struct page
   manipulation" Zi Yan has fixed a potential issue with the direct
   manipulation of hugetlb page frames.
 
 - In the series "mm: hugetlb: Skip initialization of gigantic tail
   struct pages if freed by HVO" has improved our handling of gigantic
   pages in the hugetlb vmmemmep optimizaton code.  This provides
   significant boot time improvements when significant amounts of gigantic
   pages are in use.
 
 - Matthew Wilcox has sent the series "Small hugetlb cleanups" - code
   rationalization and folio conversions in the hugetlb code.
 
 - Yin Fengwei has improved mlock()'s handling of large folios in the
   series "support large folio for mlock"
 
 - In the series "Expose swapcache stat for memcg v1" Liu Shixin has
   added statistics for memcg v1 users which are available (and useful)
   under memcg v2.
 
 - Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable)
   prctl so that userspace may direct the kernel to not automatically
   propagate the denial to child processes.  The series is named "MDWE
   without inheritance".
 
 - Kefeng Wang has provided the series "mm: convert numa balancing
   functions to use a folio" which does what it says.
 
 - In the series "mm/ksm: add fork-exec support for prctl" Stefan Roesch
   makes is possible for a process to propagate KSM treatment across
   exec().
 
 - Huang Ying has enhanced memory tiering's calculation of memory
   distances.  This is used to permit the dax/kmem driver to use "high
   bandwidth memory" in addition to Optane Data Center Persistent Memory
   Modules (DCPMM).  The series is named "memory tiering: calculate
   abstract distance based on ACPI HMAT"
 
 - In the series "Smart scanning mode for KSM" Stefan Roesch has
   optimized KSM by teaching it to retain and use some historical
   information from previous scans.
 
 - Yosry Ahmed has fixed some inconsistencies in memcg statistics in the
   series "mm: memcg: fix tracking of pending stats updates values".
 
 - In the series "Implement IOCTL to get and optionally clear info about
   PTEs" Peter Xu has added an ioctl to /proc/<pid>/pagemap which permits
   us to atomically read-then-clear page softdirty state.  This is mainly
   used by CRIU.
 
 - Hugh Dickins contributed the series "shmem,tmpfs: general maintenance"
   - a bunch of relatively minor maintenance tweaks to this code.
 
 - Matthew Wilcox has increased the use of the VMA lock over file-backed
   page faults in the series "Handle more faults under the VMA lock".  Some
   rationalizations of the fault path became possible as a result.
 
 - In the series "mm/rmap: convert page_move_anon_rmap() to
   folio_move_anon_rmap()" David Hildenbrand has implemented some cleanups
   and folio conversions.
 
 - In the series "various improvements to the GUP interface" Lorenzo
   Stoakes has simplified and improved the GUP interface with an eye to
   providing groundwork for future improvements.
 
 - Andrey Konovalov has sent along the series "kasan: assorted fixes and
   improvements" which does those things.
 
 - Some page allocator maintenance work from Kemeng Shi in the series
   "Two minor cleanups to break_down_buddy_pages".
 
 - In thes series "New selftest for mm" Breno Leitao has developed
   another MM self test which tickles a race we had between madvise() and
   page faults.
 
 - In the series "Add folio_end_read" Matthew Wilcox provides cleanups
   and an optimization to the core pagecache code.
 
 - Nhat Pham has added memcg accounting for hugetlb memory in the series
   "hugetlb memcg accounting".
 
 - Cleanups and rationalizations to the pagemap code from Lorenzo
   Stoakes, in the series "Abstract vma_merge() and split_vma()".
 
 - Audra Mitchell has fixed issues in the procfs page_owner code's new
   timestamping feature which was causing some misbehaviours.  In the
   series "Fix page_owner's use of free timestamps".
 
 - Lorenzo Stoakes has fixed the handling of new mappings of sealed files
   in the series "permit write-sealed memfd read-only shared mappings".
 
 - Mike Kravetz has optimized the hugetlb vmemmap optimization in the
   series "Batch hugetlb vmemmap modification operations".
 
 - Some buffer_head folio conversions and cleanups from Matthew Wilcox in
   the series "Finish the create_empty_buffers() transition".
 
 - As a page allocator performance optimization Huang Ying has added
   automatic tuning to the allocator's per-cpu-pages feature, in the series
   "mm: PCP high auto-tuning".
 
 - Roman Gushchin has contributed the patchset "mm: improve performance
   of accounted kernel memory allocations" which improves their performance
   by ~30% as measured by a micro-benchmark.
 
 - folio conversions from Kefeng Wang in the series "mm: convert page
   cpupid functions to folios".
 
 - Some kmemleak fixups in Liu Shixin's series "Some bugfix about
   kmemleak".
 
 - Qi Zheng has improved our handling of memoryless nodes by keeping them
   off the allocation fallback list.  This is done in the series "handle
   memoryless nodes more appropriately".
 
 - khugepaged conversions from Vishal Moola in the series "Some
   khugepaged folio conversions".
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZULEMwAKCRDdBJ7gKXxA
 jhQHAQCYpD3g849x69DmHnHWHm/EHQLvQmRMDeYZI+nx/sCJOwEAw4AKg0Oemv9y
 FgeUPAD1oasg6CP+INZvCj34waNxwAc=
 =E+Y4
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Many singleton patches against the MM code. The patch series which are
  included in this merge do the following:

   - Kemeng Shi has contributed some compation maintenance work in the
     series 'Fixes and cleanups to compaction'

   - Joel Fernandes has a patchset ('Optimize mremap during mutual
     alignment within PMD') which fixes an obscure issue with mremap()'s
     pagetable handling during a subsequent exec(), based upon an
     implementation which Linus suggested

   - More DAMON/DAMOS maintenance and feature work from SeongJae Park i
     the following patch series:

	mm/damon: misc fixups for documents, comments and its tracepoint
	mm/damon: add a tracepoint for damos apply target regions
	mm/damon: provide pseudo-moving sum based access rate
	mm/damon: implement DAMOS apply intervals
	mm/damon/core-test: Fix memory leaks in core-test
	mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval

   - In the series 'Do not try to access unaccepted memory' Adrian
     Hunter provides some fixups for the recently-added 'unaccepted
     memory' feature. To increase the feature's checking coverage. 'Plug
     a few gaps where RAM is exposed without checking if it is
     unaccepted memory'

   - In the series 'cleanups for lockless slab shrink' Qi Zheng has done
     some maintenance work which is preparation for the lockless slab
     shrinking code

   - Qi Zheng has redone the earlier (and reverted) attempt to make slab
     shrinking lockless in the series 'use refcount+RCU method to
     implement lockless slab shrink'

   - David Hildenbrand contributes some maintenance work for the rmap
     code in the series 'Anon rmap cleanups'

   - Kefeng Wang does more folio conversions and some maintenance work
     in the migration code. Series 'mm: migrate: more folio conversion
     and unification'

   - Matthew Wilcox has fixed an issue in the buffer_head code which was
     causing long stalls under some heavy memory/IO loads. Some cleanups
     were added on the way. Series 'Add and use bdev_getblk()'

   - In the series 'Use nth_page() in place of direct struct page
     manipulation' Zi Yan has fixed a potential issue with the direct
     manipulation of hugetlb page frames

   - In the series 'mm: hugetlb: Skip initialization of gigantic tail
     struct pages if freed by HVO' has improved our handling of gigantic
     pages in the hugetlb vmmemmep optimizaton code. This provides
     significant boot time improvements when significant amounts of
     gigantic pages are in use

   - Matthew Wilcox has sent the series 'Small hugetlb cleanups' - code
     rationalization and folio conversions in the hugetlb code

   - Yin Fengwei has improved mlock()'s handling of large folios in the
     series 'support large folio for mlock'

   - In the series 'Expose swapcache stat for memcg v1' Liu Shixin has
     added statistics for memcg v1 users which are available (and
     useful) under memcg v2

   - Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable)
     prctl so that userspace may direct the kernel to not automatically
     propagate the denial to child processes. The series is named 'MDWE
     without inheritance'

   - Kefeng Wang has provided the series 'mm: convert numa balancing
     functions to use a folio' which does what it says

   - In the series 'mm/ksm: add fork-exec support for prctl' Stefan
     Roesch makes is possible for a process to propagate KSM treatment
     across exec()

   - Huang Ying has enhanced memory tiering's calculation of memory
     distances. This is used to permit the dax/kmem driver to use 'high
     bandwidth memory' in addition to Optane Data Center Persistent
     Memory Modules (DCPMM). The series is named 'memory tiering:
     calculate abstract distance based on ACPI HMAT'

   - In the series 'Smart scanning mode for KSM' Stefan Roesch has
     optimized KSM by teaching it to retain and use some historical
     information from previous scans

   - Yosry Ahmed has fixed some inconsistencies in memcg statistics in
     the series 'mm: memcg: fix tracking of pending stats updates
     values'

   - In the series 'Implement IOCTL to get and optionally clear info
     about PTEs' Peter Xu has added an ioctl to /proc/<pid>/pagemap
     which permits us to atomically read-then-clear page softdirty
     state. This is mainly used by CRIU

   - Hugh Dickins contributed the series 'shmem,tmpfs: general
     maintenance', a bunch of relatively minor maintenance tweaks to
     this code

   - Matthew Wilcox has increased the use of the VMA lock over
     file-backed page faults in the series 'Handle more faults under the
     VMA lock'. Some rationalizations of the fault path became possible
     as a result

   - In the series 'mm/rmap: convert page_move_anon_rmap() to
     folio_move_anon_rmap()' David Hildenbrand has implemented some
     cleanups and folio conversions

   - In the series 'various improvements to the GUP interface' Lorenzo
     Stoakes has simplified and improved the GUP interface with an eye
     to providing groundwork for future improvements

   - Andrey Konovalov has sent along the series 'kasan: assorted fixes
     and improvements' which does those things

   - Some page allocator maintenance work from Kemeng Shi in the series
     'Two minor cleanups to break_down_buddy_pages'

   - In thes series 'New selftest for mm' Breno Leitao has developed
     another MM self test which tickles a race we had between madvise()
     and page faults

   - In the series 'Add folio_end_read' Matthew Wilcox provides cleanups
     and an optimization to the core pagecache code

   - Nhat Pham has added memcg accounting for hugetlb memory in the
     series 'hugetlb memcg accounting'

   - Cleanups and rationalizations to the pagemap code from Lorenzo
     Stoakes, in the series 'Abstract vma_merge() and split_vma()'

   - Audra Mitchell has fixed issues in the procfs page_owner code's new
     timestamping feature which was causing some misbehaviours. In the
     series 'Fix page_owner's use of free timestamps'

   - Lorenzo Stoakes has fixed the handling of new mappings of sealed
     files in the series 'permit write-sealed memfd read-only shared
     mappings'

   - Mike Kravetz has optimized the hugetlb vmemmap optimization in the
     series 'Batch hugetlb vmemmap modification operations'

   - Some buffer_head folio conversions and cleanups from Matthew Wilcox
     in the series 'Finish the create_empty_buffers() transition'

   - As a page allocator performance optimization Huang Ying has added
     automatic tuning to the allocator's per-cpu-pages feature, in the
     series 'mm: PCP high auto-tuning'

   - Roman Gushchin has contributed the patchset 'mm: improve
     performance of accounted kernel memory allocations' which improves
     their performance by ~30% as measured by a micro-benchmark

   - folio conversions from Kefeng Wang in the series 'mm: convert page
     cpupid functions to folios'

   - Some kmemleak fixups in Liu Shixin's series 'Some bugfix about
     kmemleak'

   - Qi Zheng has improved our handling of memoryless nodes by keeping
     them off the allocation fallback list. This is done in the series
     'handle memoryless nodes more appropriately'

   - khugepaged conversions from Vishal Moola in the series 'Some
     khugepaged folio conversions'"

[ bcachefs conflicts with the dynamically allocated shrinkers have been
  resolved as per Stephen Rothwell in

     https://lore.kernel.org/all/20230913093553.4290421e@canb.auug.org.au/

  with help from Qi Zheng.

  The clone3 test filtering conflict was half-arsed by yours truly ]

* tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (406 commits)
  mm/damon/sysfs: update monitoring target regions for online input commit
  mm/damon/sysfs: remove requested targets when online-commit inputs
  selftests: add a sanity check for zswap
  Documentation: maple_tree: fix word spelling error
  mm/vmalloc: fix the unchecked dereference warning in vread_iter()
  zswap: export compression failure stats
  Documentation: ubsan: drop "the" from article title
  mempolicy: migration attempt to match interleave nodes
  mempolicy: mmap_lock is not needed while migrating folios
  mempolicy: alloc_pages_mpol() for NUMA policy without vma
  mm: add page_rmappable_folio() wrapper
  mempolicy: remove confusing MPOL_MF_LAZY dead code
  mempolicy: mpol_shared_policy_init() without pseudo-vma
  mempolicy trivia: use pgoff_t in shared mempolicy tree
  mempolicy trivia: slightly more consistent naming
  mempolicy trivia: delete those ancient pr_debug()s
  mempolicy: fix migrate_pages(2) syscall return nr_failed
  kernfs: drop shared NUMA mempolicy hooks
  hugetlbfs: drop shared NUMA mempolicy pretence
  mm/damon/sysfs-test: add a unit test for damon_sysfs_set_targets()
  ...
2023-11-02 19:38:47 -10:00
Linus Torvalds
bc3012f4e3 This update includes the following changes:
API:
 
 - Add virtual-address based lskcipher interface.
 - Optimise ahash/shash performance in light of costly indirect calls.
 - Remove ahash alignmask attribute.
 
 Algorithms:
 
 - Improve AES/XTS performance of 6-way unrolling for ppc.
 - Remove some uses of obsolete algorithms (md4, md5, sha1).
 - Add FIPS 202 SHA-3 support in pkcs1pad.
 - Add fast path for single-page messages in adiantum.
 - Remove zlib-deflate.
 
 Drivers:
 
 - Add support for S4 in meson RNG driver.
 - Add STM32MP13x support in stm32.
 - Add hwrng interface support in qcom-rng.
 - Add support for deflate algorithm in hisilicon/zip.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmVB3vgACgkQxycdCkmx
 i6dsOBAAykbnX8BpnpnOXYywE9ZWrl98rAk51MK0N9olZNfg78zRPIv7fFxFdC20
 SDJrDSNPmn0Qvaa5e0EfoAdklsm0k2GkXL/BwPKMKWUsyIoJVYI3WrBMnjBy9xMp
 yfME+h0bKoXJCZKnYkIUSGUejmUPSyRlEylrXoFlH/VWYwAaii/x9zwreQoF+0LR
 KI24A1q8AYs6Dw9HSfndaAub9GOzrqKYs6fSaMG+77Y4UC5aoi5J9Bp2G3uVyHay
 x/0bZtIxKXS9wn+LeG/3GspX23x/I5VwBOdAoMigrYmAIaIg5qgyMszudltTAs4R
 zF1Kh7WsnM5+vpnBSeigzo+/GGOU3QTz8y3tBTg+3ZR7GWGOwQLiizhOYqCyOfAH
 pIm6c++sZw/OOHiL69Nt4HeLKzGNYYWk3s4X/B/6cqoouPfOsfBaQobZNx9zfy7q
 ZNEvSVBjrFX/L6wDSotny1LTWLUNjHbmLaMV5uQZ/SQKEtv19fp2Dl7SsLkHH+3v
 ldOAwfoJR6QcSwz3Ez02TUAvQhtP172Hnxi7u44eiZu2aUboLhCFr7aEU6kVdBCx
 1rIRVHD1oqlOEDRwPRXzhF3I8R4QDORJIxZ6UUhg7yueuI+XCGDsBNC+LqBrBmSR
 IbdjqmSDUBhJyM5yMnt1VFYhqKQ/ZzwZ3JQviwW76Es9pwEIolM=
 =IZmR
 -----END PGP SIGNATURE-----

Merge tag 'v6.7-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:
   - Add virtual-address based lskcipher interface
   - Optimise ahash/shash performance in light of costly indirect calls
   - Remove ahash alignmask attribute

  Algorithms:
   - Improve AES/XTS performance of 6-way unrolling for ppc
   - Remove some uses of obsolete algorithms (md4, md5, sha1)
   - Add FIPS 202 SHA-3 support in pkcs1pad
   - Add fast path for single-page messages in adiantum
   - Remove zlib-deflate

  Drivers:
   - Add support for S4 in meson RNG driver
   - Add STM32MP13x support in stm32
   - Add hwrng interface support in qcom-rng
   - Add support for deflate algorithm in hisilicon/zip"

* tag 'v6.7-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (283 commits)
  crypto: adiantum - flush destination page before unmapping
  crypto: testmgr - move pkcs1pad(rsa,sha3-*) to correct place
  Documentation/module-signing.txt: bring up to date
  module: enable automatic module signing with FIPS 202 SHA-3
  crypto: asymmetric_keys - allow FIPS 202 SHA-3 signatures
  crypto: rsa-pkcs1pad - Add FIPS 202 SHA-3 support
  crypto: FIPS 202 SHA-3 register in hash info for IMA
  x509: Add OIDs for FIPS 202 SHA-3 hash and signatures
  crypto: ahash - optimize performance when wrapping shash
  crypto: ahash - check for shash type instead of not ahash type
  crypto: hash - move "ahash wrapping shash" functions to ahash.c
  crypto: talitos - stop using crypto_ahash::init
  crypto: chelsio - stop using crypto_ahash::init
  crypto: ahash - improve file comment
  crypto: ahash - remove struct ahash_request_priv
  crypto: ahash - remove crypto_ahash_alignmask
  crypto: gcm - stop using alignmask of ahash
  crypto: chacha20poly1305 - stop using alignmask of ahash
  crypto: ccm - stop using alignmask of ahash
  net: ipv6: stop checking crypto_ahash_alignmask
  ...
2023-11-02 16:15:30 -10:00
Linus Torvalds
5be9911406 sh updates for v6.7
- locking/atomic: sh: Use generic_cmpxchg_local for arch_cmpxchg_local()
 - Documentation: kernel-parameters: Add earlyprintk=bios on SH
 - sh: bios: Revive earlyprintk support
 - sh: machvec: Remove custom ioport_{un,}map()
 - sh: Remove superhyway bus support
 - sh: Remove unused SH4-202 support
 - sh: Remove stale microdev board
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEYv+KdYTgKVaVRgAGdCY7N/W1+RMFAmVDaQsACgkQdCY7N/W1
 +RMoghAAsOOOCSdxGFe6d7FWW+NS981NaBn/5gkJXKAoPqZELGzHLczaob/LBHB9
 iDUjWxrn1qURQFRvJLLlEW+1s9IThPfx62lsSAS1tkoTcDNfI/xUDw+NiCGy1AZP
 M2XVOwPI8ucPMCIj2whNsnwsfu4wka1jIHSM6I03PWRdH1HSmLmyJRnYEzS/QftV
 RI8ABy/BNaB85CBfUGjlohxwD1dYHCKd0WKO1Sd3/YrOIN4Ls0Th4Vn5+dphFbA8
 xM5K9vTzNjEGZrjTLiE9OIkXIOn8dUS8WN9fdDsQlbAP+lmes+SeIDhbYCKH3UV6
 CBHhm47qASV/sNi3QsS5JfkjXj/bwmj+XSoecVqYk+8oujx3ERzplRxbLKEOODru
 iKP1vA12OdaYdGlcK1erDAdddrdu0WTvilM6/XqBHyIsjsi5ZB2zui9OUwNgLjaN
 udLeJm9hGSqWObxWhJIiF9omf+8VostSDrL+zb3VVWvQsj3KRTPro4tkHwhS7rTd
 1nHR3XijRBG5XkVnX4117Q5JR+07amxq6KQakXeljZfmy6qexXeC3UPEvoynV6N6
 xba9TkaiG3XyJhg6CT6KTVVzxfnge3NlcmTYnQZcxIHvbWyeBr4b1IA/eyNnJSdE
 8evAtLXPUMqFjUZFaHtuQdCPddskd9Cg3vlud234rsVmpFfKF2Q=
 =KwXl
 -----END PGP SIGNATURE-----

Merge tag 'sh-for-v6.7-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux

Pull sh updates from John Paul Adrian Glaubitz:
 "While the previously announced patch series for converting arch/sh to
  device trees is not yet ready for inclusion to mainline and therefore
  didn't make it for this pull request, there are still a small number
  changes for v6.7 which include one platform (board plus CPU and driver
  code) removal plus two fixes.

  The removal sent in by Arnd Bergmann concerns the microdev board which
  was an early SuperH prototype board that was never used in production.
  With the board removed, we were able to drop the now unused code for
  the SH4-202 CPU and well as the driver code for the superhyway bus and
  a custom implementation for ioport_map() and ioport_unmap() which will
  allow us to simplify ioport handling in the future.

  Another patch set by Geert Uytterhoeven revives SuperH BIOS
  earlyprintk support which got accidentally disabled in
  e76fe57447 ("sh: Remove old early serial console code V2"), the
  second patch in the series updates the documentation.

  Finally, a patch by Masami Hiramatsu fixes a regression reported by
  the kernel test robot which uncovered that arch/sh is not implementing
  arch_cmpxchg_local() and therefore needs use __generic_cmpxchg_local()
  instead"

* tag 'sh-for-v6.7-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux:
  locking/atomic: sh: Use generic_cmpxchg_local for arch_cmpxchg_local()
  Documentation: kernel-parameters: Add earlyprintk=bios on SH
  sh: bios: Revive earlyprintk support
  sh: machvec: Remove custom ioport_{un,}map()
  sh: Remove superhyway bus support
  sh: Remove unused SH4-202 support
  sh: Remove stale microdev board
2023-11-02 15:34:59 -10:00
Linus Torvalds
babe393974 The number of commits for documentation is not huge this time around, but
there are some significant changes nonetheless:
 
 - Some more Spanish-language and Chinese translations.
 
 - The much-discussed documentation of the confidential-computing threat
   model.
 
 - Powerpc and RISCV documentation move under Documentation/arch - these
   complete this particular bit of documentation churn.
 
 - A large traditional-Chinese documentation update.
 
 - A new document on backporting and conflict resolution.
 
 - Some kernel-doc and Sphinx fixes.
 
 Plus the usual smattering of smaller updates and typo fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmVBNv8PHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5Y0JkH/36MOpkaDnsY69/dMRKSuD4mAAP2H6LS8V63
 SsMgH5VCj8lcy/Tz1+J89t14pbcX8l0viKxSo4UxvzoJ5snrz8A8gZ9oqY7NCcNs
 nMtolnN5IwdbgGnEGqASSLsl07lnabhRK0VYv9ZO7lHjYQp97VsJ/qrjJn385HFE
 vYW8iRcxcKdwtuuwOtbPcdAMjP54saJdNC5wMLsfMR0csKcGbzaSNpqpiGovzT7l
 phG2DSxrJH0gUZyeGPryroNppaf+mVKSDSiwRdI8mzm0J67p6dZYYwBS1Iw6Awbf
 8iYoj6W63/FVQbXffPx5d6ffOSQh4JkAskxgBUOzluSGusSDc+4=
 =9HU5
 -----END PGP SIGNATURE-----

Merge tag 'docs-6.7' of git://git.lwn.net/linux

Pull documentation updates from Jonathan Corbet:
 "The number of commits for documentation is not huge this time around,
  but there are some significant changes nonetheless:

   - Some more Spanish-language and Chinese translations

   - The much-discussed documentation of the confidential-computing
     threat model

   - Powerpc and RISCV documentation move under Documentation/arch -
     these complete this particular bit of documentation churn

   - A large traditional-Chinese documentation update

   - A new document on backporting and conflict resolution

   - Some kernel-doc and Sphinx fixes

  Plus the usual smattering of smaller updates and typo fixes"

* tag 'docs-6.7' of git://git.lwn.net/linux: (40 commits)
  scripts/kernel-doc: Fix the regex for matching -Werror flag
  docs: backporting: address feedback
  Documentation: driver-api: pps: Update PPS generator documentation
  speakup: Document USB support
  doc: blk-ioprio: Bring the doc in line with the implementation
  docs: usb: fix reference to nonexistent file in UVC Gadget
  docs: doc-guide: mention 'make refcheckdocs'
  Documentation: fix typo in dynamic-debug howto
  scripts/kernel-doc: match -Werror flag strictly
  Documentation/sphinx: Remove the repeated word "the" in comments.
  docs: sparse: add SPDX-License-Identifier
  docs/zh_CN: Add subsystem-apis Chinese translation
  docs/zh_TW: update contents for zh_TW
  docs: submitting-patches: encourage direct notifications to commenters
  docs: add backporting and conflict resolution document
  docs: move riscv under arch
  docs: update link to powerpc/vmemmap_dedup.rst
  mm/memory-hotplug: fix typo in documentation
  docs: move powerpc under arch
  PCI: Update the devres documentation regarding to pcim_*()
  ...
2023-11-01 17:11:41 -10:00
Linus Torvalds
1e0c505e13 asm-generic updates for v6.7
The ia64 architecture gets its well-earned retirement as planned,
 now that there is one last (mostly) working release that will
 be maintained as an LTS kernel.
 
 The architecture specific system call tables are updated for
 the added map_shadow_stack() syscall and to remove references
 to the long-gone sys_lookup_dcookie() syscall.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmVC40IACgkQYKtH/8kJ
 Uidhmw/9EX+aWSXGoObJ3fngaNSMw+PmrEuP8qEKBHxfKHcCdX3hc451Oh4GlhaQ
 tru91pPwgNvN2/rfoKusxT+V4PemGIzfNni/04rp+P0kvmdw5otQ2yNhsQNsfVmq
 XGWvkxF4P2GO6bkjjfR/1dDq7GtlyXtwwPDKeLbYb6TnJOZjtx+EAN27kkfSn1Ms
 R4Sa3zJ+DfHUmHL5S9g+7UD/CZ5GfKNmIskI4Mz5GsfoUz/0iiU+Bge/9sdcdSJQ
 kmbLy5YnVzfooLZ3TQmBFsO3iAMWb0s/mDdtyhqhTVmTUshLolkPYyKnPFvdupyv
 shXcpEST2XJNeaDRnL2K4zSCdxdbnCZHDpjfl9wfioBg7I8NfhXKpf1jYZHH1de4
 LXq8ndEFEOVQw/zSpYWfQq1sux8Jiqr+UK/ukbVeFWiGGIUs91gEWtPAf8T0AZo9
 ujkJvaWGl98O1g5wmBu0/dAR6QcFJMDfVwbmlIFpU8O+MEaz6X8mM+O5/T0IyTcD
 eMbAUjj4uYcU7ihKzHEv/0SS9Of38kzff67CLN5k8wOP/9NlaGZ78o1bVle9b52A
 BdhrsAefFiWHp1jT6Y9Rg4HOO/TguQ9e6EWSKOYFulsiLH9LEFaB9RwZLeLytV0W
 vlAgY9rUW77g1OJcb7DoNv33nRFuxsKqsnz3DEIXtgozo9CzbYI=
 =H1vH
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull ia64 removal and asm-generic updates from Arnd Bergmann:

 - The ia64 architecture gets its well-earned retirement as planned,
   now that there is one last (mostly) working release that will be
   maintained as an LTS kernel.

 - The architecture specific system call tables are updated for the
   added map_shadow_stack() syscall and to remove references to the
   long-gone sys_lookup_dcookie() syscall.

* tag 'asm-generic-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  hexagon: Remove unusable symbols from the ptrace.h uapi
  asm-generic: Fix spelling of architecture
  arch: Reserve map_shadow_stack() syscall number for all architectures
  syscalls: Cleanup references to sys_lookup_dcookie()
  Documentation: Drop or replace remaining mentions of IA64
  lib/raid6: Drop IA64 support
  Documentation: Drop IA64 from feature descriptions
  kernel: Drop IA64 support from sig_fault handlers
  arch: Remove Itanium (IA-64) architecture
2023-11-01 15:28:33 -10:00
Linus Torvalds
56ec8e4cd8 arm64 updates for 6.7:
* Major refactoring of the CPU capability detection logic resulting in
   the removal of the cpus_have_const_cap() function and migrating the
   code to "alternative" branches where possible
 
 * Backtrace/kgdb: use IPIs and pseudo-NMI
 
 * Perf and PMU:
 
   - Add support for Ampere SoC PMUs
 
   - Multi-DTC improvements for larger CMN configurations with multiple
     Debug & Trace Controllers
 
   - Rework the Arm CoreSight PMU driver to allow separate registration of
     vendor backend modules
 
   - Fixes: add missing MODULE_DEVICE_TABLE to the amlogic perf
     driver; use device_get_match_data() in the xgene driver; fix NULL
     pointer dereference in the hisi driver caused by calling
     cpuhp_state_remove_instance(); use-after-free in the hisi driver
 
 * HWCAP updates:
 
   - FEAT_SVE_B16B16 (BFloat16)
 
   - FEAT_LRCPC3 (release consistency model)
 
   - FEAT_LSE128 (128-bit atomic instructions)
 
 * SVE: remove a couple of pseudo registers from the cpufeature code.
   There is logic in place already to detect mismatched SVE features
 
 * Miscellaneous:
 
   - Reduce the default swiotlb size (currently 64MB) if no ZONE_DMA
     bouncing is needed. The buffer is still required for small kmalloc()
     buffers
 
   - Fix module PLT counting with !RANDOMIZE_BASE
 
   - Restrict CPU_BIG_ENDIAN to LLVM IAS 15.x or newer move
     synchronisation code out of the set_ptes() loop
 
   - More compact cpufeature displaying enabled cores
 
   - Kselftest updates for the new CPU features
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmU7/QUACgkQa9axLQDI
 XvEx3xAAjICmHm+ryKJxS1IGXLYu2DXMcHUjeW6w1SxkK/vKhTMlHRx/CIWDze2l
 eENu7TcDLtTw+Gv9kqg30TSwzLfJhP9oFpX2T5TKkh5qlJlbz8fBtm+as14DTLCZ
 p2sra3J0w4B5JwTVqnj2RHOlEftMKvbyLGRkz3ve6wIUbsp5pXMkxAd/k3wOf0lC
 m6d9w1OMA2sOsw9YCgjcCNQGEzFMJk+13w7K+4w6A8Djn/Jxkt4fAFVn2ZlCiZzD
 NA2lTDWJqGmeGHo3iFdCTensWXmWTqjzxsNEf7PyBk5mBOdzDVxlTfEL7vnJg7gf
 BlTQ/nhIpra7rHQ9q2rwqEzbF+4Tn3uWlQfdDb7+/4goPjDh7tlBhEOYyOwTCEIT
 0t9cCSvBmSCKeXC3lKWWtJ+QJKhZHSmXN84EotTs65KyyfIsi4RuSezvV/+aIL86
 06sHYlYxETuujZP1cgOjf69Wsdsgizx0mqXJXf/xOjp22HFDcL4Bki6Rgi6t5OZj
 GEHG15kSE+eJ+RIpxpuAN8fdrlxYubsVLIksCqK7cZf9zXbQGIlifKAIrYiEx6kz
 FD+o+j/5niRWR6yJZCtCcGxqpSlwnYWPqc1Ds0GES8A/BphWMPozXUAZ0ll4Fnp1
 yyR2/Due/eBsCNESn579kP8989rashubB8vxvdx2fcWVtLC7VgE=
 =QaEo
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
 "No major architecture features this time around, just some new HWCAP
  definitions, support for the Ampere SoC PMUs and a few fixes/cleanups.

  The bulk of the changes is reworking of the CPU capability checking
  code (cpus_have_cap() etc).

   - Major refactoring of the CPU capability detection logic resulting
     in the removal of the cpus_have_const_cap() function and migrating
     the code to "alternative" branches where possible

   - Backtrace/kgdb: use IPIs and pseudo-NMI

   - Perf and PMU:

      - Add support for Ampere SoC PMUs

      - Multi-DTC improvements for larger CMN configurations with
        multiple Debug & Trace Controllers

      - Rework the Arm CoreSight PMU driver to allow separate
        registration of vendor backend modules

      - Fixes: add missing MODULE_DEVICE_TABLE to the amlogic perf
        driver; use device_get_match_data() in the xgene driver; fix
        NULL pointer dereference in the hisi driver caused by calling
        cpuhp_state_remove_instance(); use-after-free in the hisi driver

   - HWCAP updates:

      - FEAT_SVE_B16B16 (BFloat16)

      - FEAT_LRCPC3 (release consistency model)

      - FEAT_LSE128 (128-bit atomic instructions)

   - SVE: remove a couple of pseudo registers from the cpufeature code.
     There is logic in place already to detect mismatched SVE features

   - Miscellaneous:

      - Reduce the default swiotlb size (currently 64MB) if no ZONE_DMA
        bouncing is needed. The buffer is still required for small
        kmalloc() buffers

      - Fix module PLT counting with !RANDOMIZE_BASE

      - Restrict CPU_BIG_ENDIAN to LLVM IAS 15.x or newer move
        synchronisation code out of the set_ptes() loop

      - More compact cpufeature displaying enabled cores

      - Kselftest updates for the new CPU features"

 * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (83 commits)
  arm64: Restrict CPU_BIG_ENDIAN to GNU as or LLVM IAS 15.x or newer
  arm64: module: Fix PLT counting when CONFIG_RANDOMIZE_BASE=n
  arm64, irqchip/gic-v3, ACPI: Move MADT GICC enabled check into a helper
  perf: hisi: Fix use-after-free when register pmu fails
  drivers/perf: hisi_pcie: Initialize event->cpu only on success
  drivers/perf: hisi_pcie: Check the type first in pmu::event_init()
  arm64: cpufeature: Change DBM to display enabled cores
  arm64: cpufeature: Display the set of cores with a feature
  perf/arm-cmn: Enable per-DTC counter allocation
  perf/arm-cmn: Rework DTC counters (again)
  perf/arm-cmn: Fix DTC domain detection
  drivers: perf: arm_pmuv3: Drop some unused arguments from armv8_pmu_init()
  drivers: perf: arm_pmuv3: Read PMMIR_EL1 unconditionally
  drivers/perf: hisi: use cpuhp_state_remove_instance_nocalls() for hisi_hns3_pmu uninit process
  clocksource/drivers/arm_arch_timer: limit XGene-1 workaround
  arm64: Remove system_uses_lse_atomics()
  arm64: Mark the 'addr' argument to set_ptes() and __set_pte_at() as unused
  drivers/perf: xgene: Use device_get_match_data()
  perf/amlogic: add missing MODULE_DEVICE_TABLE
  arm64/mm: Hoist synchronization out of set_ptes() loop
  ...
2023-11-01 09:34:55 -10:00
Linus Torvalds
59fff63cc2 platform-drivers-x86 for v6.7-1
Highlights:
  - asus-wmi:		Support for screenpad and solve brightness key
 			press duplication
  - int3472:		Eliminate the last use of deprecated GPIO functions
  - mlxbf-pmc:		New HW support
  - msi-ec:		Support new EC configurations
  - thinkpad_acpi:	Support reading aux MAC address during passthrough
  - wmi: 		Fixes & improvements
  - x86-android-tablets:	Detection fix and avoid use of GPIO private APIs
  - Debug & metrics interface improvements
  - Miscellaneous cleanups / fixes / improvements
 
 The following is an automated shortlog grouped by driver:
 
 acer-wmi:
  -  Remove void function return
 
 amd/hsmp:
  -  add support for metrics tbl
  -  create plat specific struct
  -  Fix iomem handling
  -  improve the error log
 
 amd/pmc:
  -  Add dump_custom_stb module parameter
  -  Add PMFW command id to support S2D force flush
  -  Handle overflow cases where the num_samples range is higher
  -  Use flex array when calling amd_pmc_stb_debugfs_open_v2()
 
 asus-wireless:
  -  Replace open coded acpi_match_acpi_device()
 
 asus-wmi:
  -  add support for ASUS screenpad
  -  Do not report brightness up/down keys when also reported by acpi_video
 
 gpiolib: acpi:
  -  Add a ignore interrupt quirk for Peaq C1010
  -  Check if a GPIO is listed in ignore_interrupt earlier
 
 hp-bioscfg:
  -  Annotate struct bios_args with __counted_by
 
 inspur-platform-profile:
  -  Add platform profile support
 
 int3472:
  -  Add new skl_int3472_fill_gpiod_lookup() helper
  -  Add new skl_int3472_gpiod_get_from_temp_lookup() helper
  -  Stop using gpiod_toggle_active_low()
  -  Switch to devm_get_gpiod()
 
 intel: bytcrc_pwrsrc:
  -  Convert to platform remove callback returning void
 
 intel/ifs:
  -  Add new CPU support
  -  Add new error code
  -  ARRAY BIST for Sierra Forest
  -  Gen2 scan image loading
  -  Gen2 Scan test support
  -  Metadata validation for start_chunk
  -  Refactor image loading code
  -  Store IFS generation number
  -  Validate image size
 
 intel_speed_select_if:
  -  Remove hardcoded map size
  -  Use devm_ioremap_resource
 
 intel/tpmi:
  -  Add debugfs support for read/write blocked
  -  Add defines to get version information
 
 intel-uncore-freq:
  -  Ignore minor version change
 
 ISST:
  -  Allow level 0 to be not present
  -  Ignore minor version change
  -  Use fuse enabled mask instead of allowed levels
 
 mellanox:
  -  Fix misspelling error in routine name
  -  Rename some init()/exit() functions for consistent naming
 
 mlxbf-bootctl:
  -  Convert to platform remove callback returning void
 
 mlxbf-pmc:
  -  Add support for BlueField-3
 
 mlxbf-tmfifo:
  -  Convert to platform remove callback returning void
 
 mlx-Convert to platform remove callback returning void:
  - mlx-Convert to platform remove callback returning void
 
 mlxreg-hotplug:
  -  Convert to platform remove callback returning void
 
 mlxreg-io:
  -  Convert to platform remove callback returning void
 
 mlxreg-lc:
  -  Convert to platform remove callback returning void
 
 msi-ec:
  -  Add more EC configs
  -  rename fn_super_swap
 
 nvsw-sn2201:
  -  Convert to platform remove callback returning void
 
 sel3350-Convert to platform remove callback returning void:
  - sel3350-Convert to platform remove callback returning void
 
 siemens: simatic-ipc-batt-apollolake:
  -  Convert to platform remove callback returning void
 
 siemens: simatic-ipc-batt:
  -  Convert to platform remove callback returning void
 
 siemens: simatic-ipc-batt-elkhartlake:
  -  Convert to platform remove callback returning void
 
 siemens: simatic-ipc-batt-f7188x:
  -  Convert to platform remove callback returning void
 
 siemens: simatic-ipc-batt:
  -  Simplify simatic_ipc_batt_remove()
 
 surface: acpi-notify:
  -  Convert to platform remove callback returning void
 
 surface: aggregator:
  -  Annotate struct ssam_event with __counted_by
 
 surface: aggregator-cdev:
  -  Convert to platform remove callback returning void
 
 surface: aggregator-registry:
  -  Convert to platform remove callback returning void
 
 surface: dtx:
  -  Convert to platform remove callback returning void
 
 surface: gpe:
  -  Convert to platform remove callback returning void
 
 surface: hotplug:
  -  Convert to platform remove callback returning void
 
 surface: surface3-wmi:
  -  Convert to platform remove callback returning void
 
 think-lmi:
  -  Add bulk save feature
  -  Replace kstrdup() + strreplace() with kstrdup_and_replace()
  -  Use strreplace() to replace a character by nul
 
 thinkpad_acpi:
  -  Add battery quirk for Thinkpad X120e
  -  replace deprecated strncpy with memcpy
  -  sysfs interface to auxmac
 
 tools/power/x86/intel-speed-select:
  -  Display error for core-power support
  -  Increase max CPUs in one request
  -  No TRL for non compute domains
  -  Sanitize integer arguments
  -  turbo-mode enable disable swapped
  -  Update help for TRL
  -  Use cgroup isolate for CPU 0
  -  v1.18 release
 
 wmi:
  -  Decouple probe deferring from wmi_block_list
  -  Decouple WMI device removal from wmi_block_list
  -  Fix opening of char device
  -  Fix probe failure when failing to register WMI devices
  -  Fix refcounting of WMI devices in legacy functions
 
 x86-android-tablets:
  -  Add a comment about x86_android_tablet_get_gpiod()
  -  Create a platform_device from module_init()
  -  Drop "linux,power-supply-name" from lenovo_yt3_bq25892_0_props[]
  -  Fix Lenovo Yoga Tablet 2 830F/L vs 1050F/L detection
  -  Remove invalid_aei_gpiochip from Peaq C1010
  -  Remove invalid_aei_gpiochip support
  -  Stop using gpiolib private APIs
  -  Use platform-device as gpio-keys parent
 
 xo15-ebook:
  -  Replace open coded acpi_match_acpi_device()
 
 Merges:
  -  Merge branch 'pdx86/platform-drivers-x86-int3472' into review-ilpo
  -  Merge branch 'pdx86/platform-drivers-x86-mellanox-init' into review-ilpo
  -  Merge remote-tracking branch 'intel-speed-select/intel-sst' into review-ilpo
  -  Merge remote-tracking branch 'pdx86/platform-drivers-x86-android-tablets' into review-hans
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSCSUwRdwTNL2MhaBlZrE9hU+XOMQUCZT+lBwAKCRBZrE9hU+XO
 Mck0AQCFU7dYLCF4d1CXtHf1eZhSXLpYdhcO+C08JGGoM+MqSgD+Jyb9KJHk4pxE
 FvKG51I9neyAne9lvNrLodHRzxCYgAo=
 =duM8
 -----END PGP SIGNATURE-----

Merge tag 'platform-drivers-x86-v6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver updates from Ilpo Järvinen:

 - asus-wmi: Support for screenpad and solve brightness key press
   duplication

 - int3472: Eliminate the last use of deprecated GPIO functions

 - mlxbf-pmc: New HW support

 - msi-ec: Support new EC configurations

 - thinkpad_acpi: Support reading aux MAC address during passthrough

 - wmi: Fixes & improvements

 - x86-android-tablets: Detection fix and avoid use of GPIO private APIs

 - Debug & metrics interface improvements

 - Miscellaneous cleanups / fixes / improvements

* tag 'platform-drivers-x86-v6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (80 commits)
  platform/x86: inspur-platform-profile: Add platform profile support
  platform/x86: thinkpad_acpi: Add battery quirk for Thinkpad X120e
  platform/x86: wmi: Decouple WMI device removal from wmi_block_list
  platform/x86: wmi: Fix opening of char device
  platform/x86: wmi: Fix probe failure when failing to register WMI devices
  platform/x86: wmi: Fix refcounting of WMI devices in legacy functions
  platform/x86: wmi: Decouple probe deferring from wmi_block_list
  platform/x86/amd/hsmp: Fix iomem handling
  platform/x86: asus-wmi: Do not report brightness up/down keys when also reported by acpi_video
  platform/x86: thinkpad_acpi: replace deprecated strncpy with memcpy
  tools/power/x86/intel-speed-select: v1.18 release
  tools/power/x86/intel-speed-select: Use cgroup isolate for CPU 0
  tools/power/x86/intel-speed-select: Increase max CPUs in one request
  tools/power/x86/intel-speed-select: Display error for core-power support
  tools/power/x86/intel-speed-select: No TRL for non compute domains
  tools/power/x86/intel-speed-select: turbo-mode enable disable swapped
  tools/power/x86/intel-speed-select: Update help for TRL
  tools/power/x86/intel-speed-select: Sanitize integer arguments
  platform/x86: acer-wmi: Remove void function return
  platform/x86/amd/pmc: Add dump_custom_stb module parameter
  ...
2023-10-31 17:53:00 -10:00
Linus Torvalds
89ed67ef12 Networking changes for 6.7.
Core & protocols
 ----------------
 
  - Support usec resolution of TCP timestamps, enabled selectively by
    a route attribute.
 
  - Defer regular TCP ACK while processing socket backlog, try to send
    a cumulative ACK at the end. Increase single TCP flow performance
    on a 200Gbit NIC by 20% (100Gbit -> 120Gbit).
 
  - The Fair Queuing (FQ) packet scheduler:
    - add built-in 3 band prio / WRR scheduling
    - support bypass if the qdisc is mostly idle (5% speed up for TCP RR)
    - improve inactive flow reporting
    - optimize the layout of structures for better cache locality
 
  - Support TCP Authentication Option (RFC 5925, TCP-AO), a more modern
    replacement for the old MD5 option.
 
  - Add more retransmission timeout (RTO) related statistics to TCP_INFO.
 
  - Support sending fragmented skbs over vsock sockets.
 
  - Make sure we send SIGPIPE for vsock sockets if socket was shutdown().
 
  - Add sysctl for ignoring lower limit on lifetime in Router
    Advertisement PIO, based on an in-progress IETF draft.
 
  - Add sysctl to control activation of TCP ping-pong mode.
 
  - Add sysctl to make connection timeout in MPTCP configurable.
 
  - Support rcvlowat and notsent_lowat on MPTCP sockets, to help apps
    limit the number of wakeups.
 
  - Support netlink GET for MDB (multicast forwarding), allowing user
    space to request a single MDB entry instead of dumping the entire
    table.
 
  - Support selective FDB flushing in the VXLAN tunnel driver.
 
  - Allow limiting learned FDB entries in bridges, prevent OOM attacks.
 
  - Allow controlling via configfs netconsole targets which were created
    via the kernel cmdline at boot, rather than via configfs at runtime.
 
  - Support multiple PTP timestamp event queue readers with different
    filters.
 
  - MCTP over I3C.
 
 BPF
 ---
 
  - Add new veth-like netdevice where BPF program defines the logic
    of the xmit routine. It can operate in L3 and L2 mode.
 
  - Support exceptions - allow asserting conditions which should
    never be true but are hard for the verifier to infer.
    With some extra flexibility around handling of the exit / failure.
    https://lwn.net/Articles/938435/
 
  - Add support for local per-cpu kptr, allow allocating and storing
    per-cpu objects in maps. Access to those objects operates on
    the value for the current CPU. This allows to deprecate local
    one-off implementations of per-CPU storage like
    BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE maps.
 
  - Extend cgroup BPF sockaddr hooks for UNIX sockets. The use case is
    for systemd to re-implement the LogNamespace feature which allows
    running multiple instances of systemd-journald to process the logs
    of different services.
 
  - Enable open-coded task_vma iteration, after maple tree conversion
    made it hard to directly walk VMAs in tracing programs.
 
  - Add open-coded task, css_task and css iterator support.
    One of the use cases is customizable OOM victim selection via BPF.
 
  - Allow source address selection with bpf_*_fib_lookup().
 
  - Add ability to pin BPF timer to the current CPU.
 
  - Prevent creation of infinite loops by combining tail calls and
    fentry/fexit programs.
 
  - Add missed stats for kprobes to retrieve the number of missed kprobe
    executions and subsequent executions of BPF programs.
 
  - Inherit system settings for CPU security mitigations.
 
  - Add BPF v4 CPU instruction support for arm32 and s390x.
 
 Changes to common code
 ----------------------
 
  - overflow: add DEFINE_FLEX() for on-stack definition of structs
    with flexible array members.
 
  - Process doc update with more guidance for reviewers.
 
 Driver API
 ----------
 
  - Simplify locking in WiFi (cfg80211 and mac80211 layers), use wiphy
    mutex in most places and remove a lot of smaller locks.
 
  - Create a common DPLL configuration API. Allow configuring
    and querying state of PLL circuits used for clock syntonization,
    in network time distribution.
 
  - Unify fragmented and full page allocation APIs in page pool code.
    Let drivers be ignorant of PAGE_SIZE.
 
  - Rework PHY state machine to avoid races with calls to phy_stop().
 
  - Notify DSA drivers of MAC address changes on user ports, improve
    correctness of offloads which depend on matching port MAC addresses.
 
  - Allow antenna control on injected WiFi frames.
 
  - Reduce the number of variants of napi_schedule().
 
  - Simplify error handling when composing devlink health messages.
 
 Misc
 ----
 
  - A lot of KCSAN data race "fixes", from Eric.
 
  - A lot of __counted_by() annotations, from Kees.
 
  - A lot of strncpy -> strscpy and printf format fixes.
 
  - Replace master/slave terminology with conduit/user in DSA drivers.
 
  - Handful of KUnit tests for netdev and WiFi core.
 
 Removed
 -------
 
  - AppleTalk COPS.
 
  - AppleTalk ipddp.
 
  - TI AR7 CPMAC Ethernet driver.
 
 Drivers
 -------
 
  - Ethernet high-speed NICs:
    - Intel (100G, ice, idpf):
      - add a driver for the Intel E2000 IPUs
      - make CRC/FCS stripping configurable
      - cross-timestamping for E823 devices
      - basic support for E830 devices
      - use aux-bus for managing client drivers
      - i40e: report firmware versions via devlink
    - nVidia/Mellanox:
      - support 4-port NICs
      - increase max number of channels to 256
      - optimize / parallelize SF creation flow
    - Broadcom (bnxt):
      - enhance NIC temperature reporting
      - support PAM4 speeds and lane configuration
    - Marvell OcteonTX2:
      - PTP pulse-per-second output support
      - enable hardware timestamping for VFs
    - Solarflare/AMD:
      - conntrack NAT offload and offload for tunnels
    - Wangxun (ngbe/txgbe):
      - expose HW statistics
    - Pensando/AMD:
      - support PCI level reset
      - narrow down the condition under which skbs are linearized
    - Netronome/Corigine (nfp):
      - support CHACHA20-POLY1305 crypto in IPsec offload
 
  - Ethernet NICs embedded, slower, virtual:
    - Synopsys (stmmac):
      - add Loongson-1 SoC support
      - enable use of HW queues with no offload capabilities
      - enable PPS input support on all 5 channels
      - increase TX coalesce timer to 5ms
    - RealTek USB (r8152): improve efficiency of Rx by using GRO frags
    - xen: support SW packet timestamping
    - add drivers for implementations based on TI's PRUSS (AM64x EVM)
 
  - nVidia/Mellanox Ethernet datacenter switches:
    - avoid poor HW resource use on Spectrum-4 by better block selection
      for IPv6 multicast forwarding and ordering of blocks in ACL region
 
  - Ethernet embedded switches:
    - Microchip:
      - support configuring the drive strength for EMI compliance
      - ksz9477: partial ACL support
      - ksz9477: HSR offload
      - ksz9477: Wake on LAN
    - Realtek:
      - rtl8366rb: respect device tree config of the CPU port
 
  - Ethernet PHYs:
    - support Broadcom BCM5221 PHYs
    - TI dp83867: support hardware LED blinking
 
  - CAN:
    - add support for Linux-PHY based CAN transceivers
    - at91_can: clean up and use rx-offload helpers
 
  - WiFi:
    - MediaTek (mt76):
      - new sub-driver for mt7925 USB/PCIe devices
      - HW wireless <> Ethernet bridging in MT7988 chips
      - mt7603/mt7628 stability improvements
    - Qualcomm (ath12k):
      - WCN7850:
        - enable 320 MHz channels in 6 GHz band
        - hardware rfkill support
        - enable IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS
          to make scan faster
        - read board data variant name from SMBIOS
      - QCN9274: mesh support
    - RealTek (rtw89):
      - TDMA-based multi-channel concurrency (MCC)
    - Silicon Labs (wfx):
      - Remain-On-Channel (ROC) support
 
  - Bluetooth:
    - ISO: many improvements for broadcast support
    - mark BCM4378/BCM4387 as BROKEN_LE_CODED
    - add support for QCA2066
    - btmtksdio: enable Bluetooth wakeup from suspend
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmU8XsYACgkQMUZtbf5S
 Irv19RAAnud/24OOF5XMEJkIcYlnfqximh4XO6PujRSYkSkOUJdZTF6iJPgf3pSP
 YpwoHYbYKHYfeOf8+3bTNESiQNSnoVmvmvwiS6/7lZ3behHUrGLQzW9Htc3EZyWH
 2h6QkDZ5OOjfg0bwYSfp3vXkmMH2k8WE9Y0NvCkhcohqZi13Rmp14RnyPmNb2d1V
 yZRYDMSM133KqE6gnBr1Ct65IEvnKeGlCUN2mTGqOJgdn6DZMsyxvtt0y4rmN7Ab
 41+CgPU5SfxfbYpW+Dl2HJpgfte3WrC57KC6AM0PAPJzPmQWgeB/m9mjz/apj6Bg
 bhsEIo7FdvbCnQm3yWPhK2OgCAcSwLr8jfGMU+Q+W4VnL5SRRR3Rm0zjsze+kHNP
 OfqJgxzl3DpvoJqVBy1h5FGcZt0XHwhksm4cTxWqIahsF+veY0ECBXbuBBQx9XTF
 Y7INfI8ulg7wISJs+CJfIClYkgOibTw2u8taBS5ikbtgxNqp5D4QqODn7UefQap1
 PR/IDYODF+zRgmMJLeBqSa6fij6BkfOEDiOWak5kggBoZdtbtmeKI6tzze06CNdW
 lWv1WEhRufxnwK+IuWsEkjhiMbs2WGLvkJ5JbgQV9BfqHfIfiqBCrcWtT/WbQnGt
 lmU46CXh1t/FZEqbmK9h+8vsIIfrcDl6jb5npEiKPRG00vDKRTM=
 =46nS
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core & protocols:

   - Support usec resolution of TCP timestamps, enabled selectively by a
     route attribute.

   - Defer regular TCP ACK while processing socket backlog, try to send
     a cumulative ACK at the end. Increase single TCP flow performance
     on a 200Gbit NIC by 20% (100Gbit -> 120Gbit).

   - The Fair Queuing (FQ) packet scheduler:
       - add built-in 3 band prio / WRR scheduling
       - support bypass if the qdisc is mostly idle (5% speed up for TCP RR)
       - improve inactive flow reporting
       - optimize the layout of structures for better cache locality

   - Support TCP Authentication Option (RFC 5925, TCP-AO), a more modern
     replacement for the old MD5 option.

   - Add more retransmission timeout (RTO) related statistics to
     TCP_INFO.

   - Support sending fragmented skbs over vsock sockets.

   - Make sure we send SIGPIPE for vsock sockets if socket was
     shutdown().

   - Add sysctl for ignoring lower limit on lifetime in Router
     Advertisement PIO, based on an in-progress IETF draft.

   - Add sysctl to control activation of TCP ping-pong mode.

   - Add sysctl to make connection timeout in MPTCP configurable.

   - Support rcvlowat and notsent_lowat on MPTCP sockets, to help apps
     limit the number of wakeups.

   - Support netlink GET for MDB (multicast forwarding), allowing user
     space to request a single MDB entry instead of dumping the entire
     table.

   - Support selective FDB flushing in the VXLAN tunnel driver.

   - Allow limiting learned FDB entries in bridges, prevent OOM attacks.

   - Allow controlling via configfs netconsole targets which were
     created via the kernel cmdline at boot, rather than via configfs at
     runtime.

   - Support multiple PTP timestamp event queue readers with different
     filters.

   - MCTP over I3C.

  BPF:

   - Add new veth-like netdevice where BPF program defines the logic of
     the xmit routine. It can operate in L3 and L2 mode.

   - Support exceptions - allow asserting conditions which should never
     be true but are hard for the verifier to infer. With some extra
     flexibility around handling of the exit / failure:

          https://lwn.net/Articles/938435/

   - Add support for local per-cpu kptr, allow allocating and storing
     per-cpu objects in maps. Access to those objects operates on the
     value for the current CPU.

     This allows to deprecate local one-off implementations of per-CPU
     storage like BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE maps.

   - Extend cgroup BPF sockaddr hooks for UNIX sockets. The use case is
     for systemd to re-implement the LogNamespace feature which allows
     running multiple instances of systemd-journald to process the logs
     of different services.

   - Enable open-coded task_vma iteration, after maple tree conversion
     made it hard to directly walk VMAs in tracing programs.

   - Add open-coded task, css_task and css iterator support. One of the
     use cases is customizable OOM victim selection via BPF.

   - Allow source address selection with bpf_*_fib_lookup().

   - Add ability to pin BPF timer to the current CPU.

   - Prevent creation of infinite loops by combining tail calls and
     fentry/fexit programs.

   - Add missed stats for kprobes to retrieve the number of missed
     kprobe executions and subsequent executions of BPF programs.

   - Inherit system settings for CPU security mitigations.

   - Add BPF v4 CPU instruction support for arm32 and s390x.

  Changes to common code:

   - overflow: add DEFINE_FLEX() for on-stack definition of structs with
     flexible array members.

   - Process doc update with more guidance for reviewers.

  Driver API:

   - Simplify locking in WiFi (cfg80211 and mac80211 layers), use wiphy
     mutex in most places and remove a lot of smaller locks.

   - Create a common DPLL configuration API. Allow configuring and
     querying state of PLL circuits used for clock syntonization, in
     network time distribution.

   - Unify fragmented and full page allocation APIs in page pool code.
     Let drivers be ignorant of PAGE_SIZE.

   - Rework PHY state machine to avoid races with calls to phy_stop().

   - Notify DSA drivers of MAC address changes on user ports, improve
     correctness of offloads which depend on matching port MAC
     addresses.

   - Allow antenna control on injected WiFi frames.

   - Reduce the number of variants of napi_schedule().

   - Simplify error handling when composing devlink health messages.

  Misc:

   - A lot of KCSAN data race "fixes", from Eric.

   - A lot of __counted_by() annotations, from Kees.

   - A lot of strncpy -> strscpy and printf format fixes.

   - Replace master/slave terminology with conduit/user in DSA drivers.

   - Handful of KUnit tests for netdev and WiFi core.

  Removed:

   - AppleTalk COPS.

   - AppleTalk ipddp.

   - TI AR7 CPMAC Ethernet driver.

  Drivers:

   - Ethernet high-speed NICs:
      - Intel (100G, ice, idpf):
         - add a driver for the Intel E2000 IPUs
         - make CRC/FCS stripping configurable
         - cross-timestamping for E823 devices
         - basic support for E830 devices
         - use aux-bus for managing client drivers
         - i40e: report firmware versions via devlink
      - nVidia/Mellanox:
         - support 4-port NICs
         - increase max number of channels to 256
         - optimize / parallelize SF creation flow
      - Broadcom (bnxt):
         - enhance NIC temperature reporting
         - support PAM4 speeds and lane configuration
      - Marvell OcteonTX2:
         - PTP pulse-per-second output support
         - enable hardware timestamping for VFs
      - Solarflare/AMD:
         - conntrack NAT offload and offload for tunnels
      - Wangxun (ngbe/txgbe):
         - expose HW statistics
      - Pensando/AMD:
         - support PCI level reset
         - narrow down the condition under which skbs are linearized
      - Netronome/Corigine (nfp):
         - support CHACHA20-POLY1305 crypto in IPsec offload

   - Ethernet NICs embedded, slower, virtual:
      - Synopsys (stmmac):
         - add Loongson-1 SoC support
         - enable use of HW queues with no offload capabilities
         - enable PPS input support on all 5 channels
         - increase TX coalesce timer to 5ms
      - RealTek USB (r8152): improve efficiency of Rx by using GRO frags
      - xen: support SW packet timestamping
      - add drivers for implementations based on TI's PRUSS (AM64x EVM)

   - nVidia/Mellanox Ethernet datacenter switches:
      - avoid poor HW resource use on Spectrum-4 by better block
        selection for IPv6 multicast forwarding and ordering of blocks
        in ACL region

   - Ethernet embedded switches:
      - Microchip:
         - support configuring the drive strength for EMI compliance
         - ksz9477: partial ACL support
         - ksz9477: HSR offload
         - ksz9477: Wake on LAN
      - Realtek:
         - rtl8366rb: respect device tree config of the CPU port

   - Ethernet PHYs:
      - support Broadcom BCM5221 PHYs
      - TI dp83867: support hardware LED blinking

   - CAN:
      - add support for Linux-PHY based CAN transceivers
      - at91_can: clean up and use rx-offload helpers

   - WiFi:
      - MediaTek (mt76):
         - new sub-driver for mt7925 USB/PCIe devices
         - HW wireless <> Ethernet bridging in MT7988 chips
         - mt7603/mt7628 stability improvements
      - Qualcomm (ath12k):
         - WCN7850:
            - enable 320 MHz channels in 6 GHz band
            - hardware rfkill support
            - enable IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS to
              make scan faster
            - read board data variant name from SMBIOS
        - QCN9274: mesh support
      - RealTek (rtw89):
         - TDMA-based multi-channel concurrency (MCC)
      - Silicon Labs (wfx):
         - Remain-On-Channel (ROC) support

   - Bluetooth:
      - ISO: many improvements for broadcast support
      - mark BCM4378/BCM4387 as BROKEN_LE_CODED
      - add support for QCA2066
      - btmtksdio: enable Bluetooth wakeup from suspend"

* tag 'net-next-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1816 commits)
  net: pcs: xpcs: Add 2500BASE-X case in get state for XPCS drivers
  net: bpf: Use sockopt_lock_sock() in ip_sock_set_tos()
  net: mana: Use xdp_set_features_flag instead of direct assignment
  vxlan: Cleanup IFLA_VXLAN_PORT_RANGE entry in vxlan_get_size()
  iavf: delete the iavf client interface
  iavf: add a common function for undoing the interrupt scheme
  iavf: use unregister_netdev
  iavf: rely on netdev's own registered state
  iavf: fix the waiting time for initial reset
  iavf: in iavf_down, don't queue watchdog_task if comms failed
  iavf: simplify mutex_trylock+sleep loops
  iavf: fix comments about old bit locks
  doc/netlink: Update schema to support cmd-cnt-name and cmd-max-name
  tools: ynl: introduce option to process unknown attributes or types
  ipvlan: properly track tx_errors
  netdevsim: Block until all devices are released
  nfp: using napi_build_skb() to replace build_skb()
  net: dsa: microchip: ksz9477: Fix spelling mistake "Enery" -> "Energy"
  net: dsa: microchip: Ensure Stable PME Pin State for Wake-on-LAN
  net: dsa: microchip: Refactor switch shutdown routine for WoL preparation
  ...
2023-10-31 05:10:11 -10:00
Linus Torvalds
5a6a09e971 cgroup: Changes for v6.7
* cpuset now supports remote partitions where CPUs can be reserved for
   exclusive use down the tree without requiring all the intermediate nodes
   to be partitions. This makes it easier to use partitions without modifying
   existing cgroup hierarchy.
 
 * cpuset partition configuration behavior improvement.
 
 * cgroup_favordynmods= boot param added to allow setting the flag on boot on
   cgroup1.
 
 * Misc code and doc updates.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYIACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZUBUKA4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGWfMAP9WP+Z21qzzL2bY5I5kOxu+rD2fF9ORk7azILrI
 c3gXSQD/bdxNWcdtrCQMvs+ToNEJvqDFxrmgG5uRUF8Ea+FCtQ8=
 =gZj2
 -----END PGP SIGNATURE-----

Merge tag 'cgroup-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup updates from Tejun Heo:

 - cpuset now supports remote partitions where CPUs can be reserved for
   exclusive use down the tree without requiring all the intermediate
   nodes to be partitions. This makes it easier to use partitions
   without modifying existing cgroup hierarchy.

 - cpuset partition configuration behavior improvement

 - cgroup_favordynmods= boot param added to allow setting the flag on
   boot on cgroup1

 - Misc code and doc updates

* tag 'cgroup-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  docs/cgroup: Add the list of threaded controllers to cgroup-v2.rst
  cgroup: use legacy_name for cgroup v1 disable info
  cgroup/cpuset: Cleanup signedness issue in cpu_exclusive_check()
  cgroup/cpuset: Enable invalid to valid local partition transition
  cgroup: add cgroup_favordynmods= command-line option
  cgroup/cpuset: Extend test_cpuset_prs.sh to test remote partition
  cgroup/cpuset: Documentation update for partition
  cgroup/cpuset: Check partition conflict with housekeeping setup
  cgroup/cpuset: Introduce remote partition
  cgroup/cpuset: Add cpuset.cpus.exclusive for v2
  cgroup/cpuset: Add cpuset.cpus.exclusive.effective for v2
  cgroup/cpuset: Fix load balance state in update_partition_sd_lb()
  cgroup: Avoid extra dereference in css_populate_dir()
  cgroup: Check for ret during cgroup1_base_files cft addition
2023-10-30 20:58:48 -10:00
Linus Torvalds
5e37269945 pstore updates for v6.7-rc1
- Check for out-of-memory condition during initialization (Jiasheng Jiang)
 
 - Fix documentation typos (Tudor Ambarus)
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmU/4ioWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJuPsD/kBcJPVzn4zx3sx6LJZJL4d9/4M
 xQ22mJM5H/z1L9YY7PrxD8lDcpue7EJmr8/Q7tdcUHNjzSxgecfuSR9Eq1Apcmra
 RbQGv7B/VmbU8aPMr5k+XiIzGU9Evoj6aZLGB3bL+3uj9Xmadkrz6SvctJy0zcRg
 D5VpLkXLx1JzShVfplYQtmSIRUg3gcn9BbwbZTWgD4lVZD7UNx624xUBH7ywYxCB
 +qXKW+DvFiZAUszAGXSBWcPdx4pue0Ff0Q52ZnHjEdYOrlPC6dVQ1ABMDnWrNCVh
 58i7dtREEX+IX50xtjsNkVU+DaunAP7VI0+/+fgDk8KLCYfh7dTXN0fOgSlHILnt
 zJx5qTkBh/A/g+3X21L5srq1qAwR7V3tBIQPQAlyBcJCneYHLaQSynsMynydrJaH
 nYbpWeINhXP/ZfDFA7bVum+uBMrz1stIuEWeL1pELTk7NUVkHtkmr14Af1+zNz0z
 GeKBrPebBhg9lbCPAmV8VT08vJJ+8tD/WA7fIfNLH7/LdYgl+tRFVs/c6yKDfmu9
 p6wt5eCKVszg1jFYj/QMOBfOR5g2koectdO8qu4LI6PcW2hOTneCw1RLtwdUpFol
 2HNXl+L69KVnhLNlKSAnRhPTfpcDb16ssFMQuVNWTXYS8rKW7J/+Mh9RYUbEzYaH
 Ftzs99LnZ+orxNvy+Q==
 =e8WJ
 -----END PGP SIGNATURE-----

Merge tag 'pstore-v6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull pstore updates from Kees Cook:

 - Check for out-of-memory condition during initialization (Jiasheng
   Jiang)

 - Fix documentation typos (Tudor Ambarus)

* tag 'pstore-v6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  pstore/platform: Add check for kstrdup
  docs: pstore-blk.rst: fix typo, s/console/ftrace
  docs: pstore-blk.rst: use "about" as a preposition after "care"
2023-10-30 19:26:39 -10:00
Linus Torvalds
2656821f1f RCU pull request for v6.7
This pull request contains the following branches:
 
 rcu/torture: RCU torture, locktorture and generic torture infrastructure
 	updates that include various fixes, cleanups and consolidations.
 	Among the user visible things, ftrace dumps can now be found into
 	their own file, and module parameters get better documented and
 	reported on dumps.
 
 rcu/fixes: Generic and misc fixes all over the place. Some highlights:
 
 	* Hotplug handling has seen some light cleanups and comments.
 
 	* An RCU barrier can now be triggered through sysfs to serialize
 	memory stress testing and avoid OOM.
 
 	* Object information is now dumped in case of invalid callback
 	invocation.
 
 	* Also various SRCU issues, too hard to trigger to deserve urgent
 	pull requests, have been fixed.
 
 rcu/docs: RCU documentation updates
 
 rcu/refscale: RCU reference scalability test minor fixes and doc
 	improvements.
 
 rcu/tasks: RCU tasks minor fixes
 
 rcu/stall: Stall detection updates. Introduce RCU CPU Stall notifiers
 	that allows a subsystem to provide informations to help debugging.
 	Also cure some false positive stalls.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEd76+gtGM8MbftQlOhSRUR1COjHcFAmU21h0ACgkQhSRUR1CO
 jHdUgA/+Myy5K5OxNrqlF/gIK+flOSg635RyZ0DBx8OMXZ/fAg9qRI+PKt5I4Lha
 eXAg6EtmwSgHmIbjcg8WzsvwniEsqqjOF+n1qil447fHUI2Qqw6c7fIm/MXQkeHJ
 qA7CODDRtsAnwnjmTteasmMeGV0bmXDENxhNrAZBFnVkRgTqfyDbFcn+nxOaPK6b
 fmbKvnB07WUg1KOV8/MbEtAZPb8QgHo58bXSZRKjKkiqRQWB/D3On+tShFK7SYJi
 wIqQ96MLyUXLaIWQ47v6xEO4PZO+3o1wAryvP1DRdb5UrPjO6yKFfQaoo5Mza92G
 zhBJhnXkVvCoNoCU7GKJIDV54SgDHaB6Sf1GN5cjwfujOkLuGCyg0CpKktCGm7uH
 n3X66PVep608Uj2Y/pAo/hv3Hbv7lCu4nfrERvVLG9YoxUvTJDsKmBv+SF/g2mxF
 rHqFa39HUPr1yHA5WjqOQS3lLdqCXEGKvNi6zXCvOceiDbHbiJFkBo6p8TVrbSMX
 FCOWZ3LoE+6uiLu/lLOEroTjeBd8GhDh1LgWgyVK7o0LhP1018DSBolrpcSwnmOo
 Q/E4G2x+aPWs+5NTOmMGOIPY70khKQIM3c8YZelSRffJBo6O3yV68h6X45NQxYvx
 keLvrDaza8h4hKwaof/QaX4ZJgTOZ0xjpawr1vR0hbK8LNtPrUw=
 =cVD7
 -----END PGP SIGNATURE-----

Merge tag 'rcu-next-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks

Pull RCU updates from Frederic Weisbecker:

 - RCU torture, locktorture and generic torture infrastructure updates
   that include various fixes, cleanups and consolidations.

   Among the user visible things, ftrace dumps can now be found into
   their own file, and module parameters get better documented and
   reported on dumps.

 - Generic and misc fixes all over the place. Some highlights:

     * Hotplug handling has seen some light cleanups and comments

     * An RCU barrier can now be triggered through sysfs to serialize
       memory stress testing and avoid OOM

     * Object information is now dumped in case of invalid callback
       invocation

     * Also various SRCU issues, too hard to trigger to deserve urgent
       pull requests, have been fixed

 - RCU documentation updates

 - RCU reference scalability test minor fixes and doc improvements.

 - RCU tasks minor fixes

 - Stall detection updates. Introduce RCU CPU Stall notifiers that
   allows a subsystem to provide informations to help debugging. Also
   cure some false positive stalls.

* tag 'rcu-next-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks: (56 commits)
  srcu: Only accelerate on enqueue time
  locktorture: Check the correct variable for allocation failure
  srcu: Fix callbacks acceleration mishandling
  rcu: Comment why callbacks migration can't wait for CPUHP_RCUTREE_PREP
  rcu: Standardize explicit CPU-hotplug calls
  rcu: Conditionally build CPU-hotplug teardown callbacks
  rcu: Remove references to rcu_migrate_callbacks() from diagrams
  rcu: Assume rcu_report_dead() is always called locally
  rcu: Assume IRQS disabled from rcu_report_dead()
  rcu: Use rcu_segcblist_segempty() instead of open coding it
  rcu: kmemleak: Ignore kmemleak false positives when RCU-freeing objects
  srcu: Fix srcu_struct node grpmask overflow on 64-bit systems
  torture: Convert parse-console.sh to mktemp
  rcutorture: Traverse possible cpu to set maxcpu in rcu_nocb_toggle()
  rcutorture: Replace schedule_timeout*() 1-jiffy waits with HZ/20
  torture: Add kvm.sh --debug-info argument
  locktorture: Rename readers_bind/writers_bind to bind_readers/bind_writers
  doc: Catch-up update for locktorture module parameters
  locktorture: Add call_rcu_chains module parameter
  locktorture: Add new module parameters to lock_torture_print_module_parms()
  ...
2023-10-30 18:01:41 -10:00
Linus Torvalds
9a0f53e0cf CSD lock commits for v6.7
This series adds a kernel boot parameter that causes the kernel to
 panic if one of the call_smp_function() APIs is stalled for more than
 the specified duration.  This is useful in deployments in which a clean
 panic is preferable to an indefinite stall.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmU9plITHHBhdWxtY2tA
 a2VybmVsLm9yZwAKCRCevxLzctn7jB91D/9OiOMtV03TXN2K+zGmJMjTFgLuVnug
 OqG4mrCv7jTJ3k6fTpu7hih/BCmG1Mu7byyPV6BUSfKsYony7L4yTPFJsjK8lNmq
 MHh847DErGieuURCDnsvqBVpYIRfXnvW9ptlf+BMCjbzz4FuUu1XhJTm+U2nab3i
 BEIEMORxDCyghh7yluAVG6sULRXOqjv5VcypwOXUbavDf0JyJTns4QlXFD85yiBr
 nvfHvLUrzu5EblA3m09lTnCaKrAlz5pwD7fWQq7bS3rz2gndR/DcVcZknQ68FMsj
 Mcf0Zf45TzyWWfMcE8LCulhMlZ2GYsIm2YkIbgwlsOAndjBrV55rsgaVbSZDke37
 QMHPGUZ7m7AjuDqpWZITrJjQHkWCtn/5tFSazHSlMwWg44pgOqvc3OgZn04tzn01
 L/guq3yBIKBJiAjdgzdx8/H9S7cSH8TLEWFY2utEAMIjef9Xzi6qwQ4X5p4K9QYJ
 Sm1hTTCbF9NeyMk0o2rokR58f13+9ewxE4RIYcwP9loo6nbBcTVN/fvdHN6jQywa
 7UIui508AUf55zQgO+5z8LsmjyNiDmDIeE8CLeVDicllWN+ne7F//AXAuA5V8m+y
 G0Mphn/lZpnddDSFlR/QFuMvGqtpIu2y8w93awq8g/VzirbkiiCia3iFcFtss4ip
 n0Y50CCu/W70fw==
 =ZD0S
 -----END PGP SIGNATURE-----

Merge tag 'csd-lock.2023.10.23a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull CSD lock update from Paul McKenney:
 "This adds a kernel boot parameter that causes the kernel to panic if
  one of the call_smp_function() APIs is stalled for more than the
  specified duration.

  This is useful in deployments in which a clean panic is preferable to
  an indefinite stall"

* tag 'csd-lock.2023.10.23a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
  smp,csd: Throw an error if a CSD lock is stuck for too long
2023-10-30 17:56:53 -10:00
Linus Torvalds
ed766c2611 Changes to the x86 entry code in v6.7:
- Make IA32_EMULATION boot time configurable with
    the new ia32_emulation=<bool> boot option.
 
  - Clean up fast syscall return validation code: convert
    it to C and refactor the code.
 
  - As part of this, optimize the canonical RIP test code.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmU9DiARHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1iNAw//cLn9gBXMVPDiCDVUOTqjkZ+OwIF11Y9v
 WatksSe5hrw0Bzl5CiSvtrWpTkKPnhyM8Lc1WD8l0YSMKprdkQfNAvQOPv0IMLjk
 XP1pgQhAiXwB87XL/G2sA6RunuK56zlnl7KJiDrQThrS/WOfrq3UkB2vyYEP4GtP
 69WZ/WM++u74uEml0+HZ0Z9HVvzwYl1VQPdTYfl52S4H3U8MXL89YEsPr13Ttq88
 FMKdXJ/VvItuVM/ZHHqFkGvRJjUtDWePLu29b684Ap6onDJ7uMMw86Gj5UxXtdpB
 Axsjuwlca8sCPotcqohay6IdyxIth6lMdvjPv0KhA+/QMrHbDaluv88YQs4k7Add
 1GPULH6oeDTHxMPOcJmFuSTpMY8HP6O9ZIXB6ogQRkLaDJKaWr5UQU7L2VBQ/WUy
 NRa6mba0XHYrz6U7DmtsdL0idWBJeJokHmaIcGJ/pp6gMznvufm2+SoJ6w6wcYva
 VTSTyrAAj/N9/TzJ5i8S2+yDPI9GanFpZJfYbW/rT9XGutvXWVKe3AmUNgR8O+hE
 JiEMfpR0TtXXlrik74jur/RPZhaFIE8MeCvJrkJ3oxQlPThYSTMBAlUOtD7kOfNT
 onjPrumREX4hOIBU+nnC9VrJMqxX9lz4xDzqw3jvX99Ma0o8Wx/UndWELX8tAYwd
 j8M8NWAbv90=
 =YkaP
 -----END PGP SIGNATURE-----

Merge tag 'x86-entry-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 entry updates from Ingo Molnar:

 - Make IA32_EMULATION boot time configurable with
   the new ia32_emulation=<bool> boot option

 - Clean up fast syscall return validation code: convert
   it to C and refactor the code

 - As part of this, optimize the canonical RIP test code

* tag 'x86-entry-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/entry/32: Clean up syscall fast exit tests
  x86/entry/64: Use TASK_SIZE_MAX for canonical RIP test
  x86/entry/64: Convert SYSRET validation tests to C
  x86/entry/32: Remove SEP test for SYSEXIT
  x86/entry/32: Convert do_fast_syscall_32() to bool return type
  x86/entry/compat: Combine return value test from syscall handler
  x86/entry/64: Remove obsolete comment on tracing vs. SYSRET
  x86: Make IA32_EMULATION boot time configurable
  x86/entry: Make IA32 syscalls' availability depend on ia32_enabled()
  x86/elf: Make loading of 32bit processes depend on ia32_enabled()
  x86/entry: Compile entry_SYSCALL32_ignore() unconditionally
  x86/entry: Rename ignore_sysret()
  x86: Introduce ia32_enabled()
2023-10-30 15:27:27 -10:00
Linus Torvalds
63ce50fff9 Scheduler changes for v6.7 are:
- Fair scheduler (SCHED_OTHER) improvements:
 
     - Remove the old and now unused SIS_PROP code & option
     - Scan cluster before LLC in the wake-up path
     - Use candidate prev/recent_used CPU if scanning failed for cluster wakeup
 
  - NUMA scheduling improvements:
 
     - Improve the VMA access-PID code to better skip/scan VMAs
     - Extend tracing to cover VMA-skipping decisions
     - Improve/fix the recently introduced sched_numa_find_nth_cpu() code
     - Generalize numa_map_to_online_node()
 
  - Energy scheduling improvements:
 
     - Remove the EM_MAX_COMPLEXITY limit
     - Add tracepoints to track energy computation
     - Make the behavior of the 'sched_energy_aware' sysctl more consistent
     - Consolidate and clean up access to a CPU's max compute capacity
     - Fix uclamp code corner cases
 
  - RT scheduling improvements:
 
     - Drive dl_rq->overloaded with dl_rq->pushable_dl_tasks updates
     - Drive the ->rto_mask with rt_rq->pushable_tasks updates
 
  - Scheduler scalability improvements:
 
     - Rate-limit updates to tg->load_avg
     - On x86 disable IBRS when CPU is offline to improve single-threaded performance
     - Micro-optimize in_task() and in_interrupt()
     - Micro-optimize the PSI code
     - Avoid updating PSI triggers and ->rtpoll_total when there are no state changes
 
  - Core scheduler infrastructure improvements:
 
     - Use saved_state to reduce some spurious freezer wakeups
     - Bring in a handful of fast-headers improvements to scheduler headers
     - Make the scheduler UAPI headers more widely usable by user-space
     - Simplify the control flow of scheduler syscalls by using lock guards
     - Fix sched_setaffinity() vs. CPU hotplug race
 
  - Scheduler debuggability improvements:
     - Disallow writing invalid values to sched_rt_period_us
     - Fix a race in the rq-clock debugging code triggering warnings
     - Fix a warning in the bandwidth distribution code
     - Micro-optimize in_atomic_preempt_off() checks
     - Enforce that the tasklist_lock is held in for_each_thread()
     - Print the TGID in sched_show_task()
     - Remove the /proc/sys/kernel/sched_child_runs_first sysctl
 
  - Misc cleanups & fixes
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmU8/NoRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gN+xAAvKGYNZBCBG4jowxccgqAbCx81KOhhsy/
 KUaOmdLPg9WaXuqjZ5sggXQCMT0wUqBYAmqV7ts53VhWcma2I1ap4dCM6Jj+RLrc
 vNwkeNetsikiZtarMoCJs5NahL8ULh3liBaoAkkToPjQ5r43aZ/eKwDovEdIKc+g
 +Vgn7jUY8ssIrAOKT1midSwY1y8kAU2AzWOSFDTgedkJP4PgOu9/lBl9jSJ2sYaX
 N4XqONYPXTwOHUtvmzkYILxLz0k0GgJ7hmt78E8Xy2rC4taGCRwCfCMBYxREuwiP
 huo3O1P/iIe5svm4/EBUvcpvf44eAWTV+CD0dnJPwOc9IvFhpSzqSZZAsyy/JQKt
 Lnzmc/xmyc1PnXCYJfHuXrw2/m+MyUHaegPzh5iLJFrlqa79GavOElj0jNTAMzbZ
 39fybzPtuFP+64faRfu0BBlQZfORPBNc/oWMpPKqgP58YGuveKTWaUF5rl5lM7Ne
 nm07uOmq02JVR8YzPl/FcfhU2dPMawWuMwUjEr2eU+lAunY3PF88vu0FALj7iOBd
 66F8qrtpDHJanOxrdEUwSJ7hgw79qY1iw66Db7cQYjMazFKZONxArQPqFUZ0ngLI
 n9hVa7brg1bAQKrQflqjcIAIbpVu3SjPEl15cKpAJTB/gn5H66TQgw8uQ6HfG+h2
 GtOsn1nlvuk=
 =GDqb
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Ingo Molnar:
 "Fair scheduler (SCHED_OTHER) improvements:
   - Remove the old and now unused SIS_PROP code & option
   - Scan cluster before LLC in the wake-up path
   - Use candidate prev/recent_used CPU if scanning failed for cluster
     wakeup

  NUMA scheduling improvements:
   - Improve the VMA access-PID code to better skip/scan VMAs
   - Extend tracing to cover VMA-skipping decisions
   - Improve/fix the recently introduced sched_numa_find_nth_cpu() code
   - Generalize numa_map_to_online_node()

  Energy scheduling improvements:
   - Remove the EM_MAX_COMPLEXITY limit
   - Add tracepoints to track energy computation
   - Make the behavior of the 'sched_energy_aware' sysctl more
     consistent
   - Consolidate and clean up access to a CPU's max compute capacity
   - Fix uclamp code corner cases

  RT scheduling improvements:
   - Drive dl_rq->overloaded with dl_rq->pushable_dl_tasks updates
   - Drive the ->rto_mask with rt_rq->pushable_tasks updates

  Scheduler scalability improvements:
   - Rate-limit updates to tg->load_avg
   - On x86 disable IBRS when CPU is offline to improve single-threaded
     performance
   - Micro-optimize in_task() and in_interrupt()
   - Micro-optimize the PSI code
   - Avoid updating PSI triggers and ->rtpoll_total when there are no
     state changes

  Core scheduler infrastructure improvements:
   - Use saved_state to reduce some spurious freezer wakeups
   - Bring in a handful of fast-headers improvements to scheduler
     headers
   - Make the scheduler UAPI headers more widely usable by user-space
   - Simplify the control flow of scheduler syscalls by using lock
     guards
   - Fix sched_setaffinity() vs. CPU hotplug race

  Scheduler debuggability improvements:
   - Disallow writing invalid values to sched_rt_period_us
   - Fix a race in the rq-clock debugging code triggering warnings
   - Fix a warning in the bandwidth distribution code
   - Micro-optimize in_atomic_preempt_off() checks
   - Enforce that the tasklist_lock is held in for_each_thread()
   - Print the TGID in sched_show_task()
   - Remove the /proc/sys/kernel/sched_child_runs_first sysctl

  ... and misc cleanups & fixes"

* tag 'sched-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (82 commits)
  sched/fair: Remove SIS_PROP
  sched/fair: Use candidate prev/recent_used CPU if scanning failed for cluster wakeup
  sched/fair: Scan cluster before scanning LLC in wake-up path
  sched: Add cpus_share_resources API
  sched/core: Fix RQCF_ACT_SKIP leak
  sched/fair: Remove unused 'curr' argument from pick_next_entity()
  sched/nohz: Update comments about NEWILB_KICK
  sched/fair: Remove duplicate #include
  sched/psi: Update poll => rtpoll in relevant comments
  sched: Make PELT acronym definition searchable
  sched: Fix stop_one_cpu_nowait() vs hotplug
  sched/psi: Bail out early from irq time accounting
  sched/topology: Rename 'DIE' domain to 'PKG'
  sched/psi: Delete the 'update_total' function parameter from update_triggers()
  sched/psi: Avoid updating PSI triggers and ->rtpoll_total when there are no state changes
  sched/headers: Remove comment referring to rq::cpu_load, since this has been removed
  sched/numa: Complete scanning of inactive VMAs when there is no alternative
  sched/numa: Complete scanning of partial VMAs regardless of PID activity
  sched/numa: Move up the access pid reset logic
  sched/numa: Trace decisions related to skipping VMAs
  ...
2023-10-30 13:12:15 -10:00
Dimitri John Ledkov
f2b88bab69 Documentation/module-signing.txt: bring up to date
Update the documentation to mention that ECC NIST P-384 automatic
keypair generation is available to use ECDSA signature type, in
addition to the RSA.

Drop mentions of the now removed SHA-1 and SHA-224 options.

Add the just added FIPS 202 SHA-3 module signature hashes.

Signed-off-by: Dimitri John Ledkov <dimitri.ledkov@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-10-27 18:04:30 +08:00
Joerg Roedel
e8cca466a8 Merge branches 'iommu/fixes', 'arm/tegra', 'arm/smmu', 'virtio', 'x86/vt-d', 'x86/amd', 'core' and 's390' into next 2023-10-27 09:13:40 +02:00
Samuel Thibault
07d87ceaec speakup: Document USB support
Speakup has been supporting USB for a while already.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20231020181059.7rtj2csi7t6vorrm@begin
2023-10-26 11:35:21 -06:00
Tang Yizhou
c1081a7b16 doc: blk-ioprio: Bring the doc in line with the implementation
Our system administrator have noted that the names 'rt-to-be' and
'all-to-idle' in the I/O priority policies table appeared without
explanations, leading to confusion. Let's bring these names in line
with the naming in the 'attribute' section.

Additionally,
1. Correct the interface name to 'io.prio.class'.
2. Add a table entry of 'promote-to-rt' for consistency.
3. Fix a typo of 'priority'.

Suggested-by: Yingfu Zhou <yingfu.zhou@shopee.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Tang Yizhou <yizhou.tang@shopee.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20231012024228.2161283-1-yizhou.tang@shopee.com
2023-10-26 11:33:39 -06:00
Geert Uytterhoeven
78a96c86a0 Documentation: kernel-parameters: Add earlyprintk=bios on SH
Document the use of the "earlyprintk=bios[,keep]" kernel parameter
option on SuperH systems with a SuperH standard BIOS.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Link: https://lore.kernel.org/r/febc920964f2f0919d21775132e84c5cc270177e.1697708489.git.geert+renesas@glider.be
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
2023-10-25 16:50:30 +02:00
Thomas Gleixner
9407bda845 x86/microcode: Prepare for minimal revision check
Applying microcode late can be fatal for the running kernel when the
update changes functionality which is in use already in a non-compatible
way, e.g. by removing a CPUID bit.

There is no way for admins which do not have access to the vendors deep
technical support to decide whether late loading of such a microcode is
safe or not.

Intel has added a new field to the microcode header which tells the
minimal microcode revision which is required to be active in the CPU in
order to be safe.

Provide infrastructure for handling this in the core code and a command
line switch which allows to enforce it.

If the update is considered safe the kernel is not tainted and the annoying
warning message not emitted. If it's enforced and the currently loaded
microcode revision is not safe for late loading then the load is aborted.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20231017211724.079611170@linutronix.de
2023-10-24 15:05:55 +02:00
Frederic Weisbecker
d97ae6474c Merge branches 'rcu/torture', 'rcu/fixes', 'rcu/docs', 'rcu/refscale', 'rcu/tasks' and 'rcu/stall' into rcu/next
rcu/torture: RCU torture, locktorture and generic torture infrastructure
rcu/fixes: Generic and misc fixes
rcu/docs: RCU documentation updates
rcu/refscale: RCU reference scalability test updates
rcu/tasks: RCU tasks updates
rcu/stall: Stall detection updates
2023-10-23 15:24:11 +02:00
Jade Lovelace
a10874e8db Documentation: fix typo in dynamic-debug howto
Signed-off-by: Jade Lovelace <lists@jade.fyi>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20231019231655.3162225-1-lists@jade.fyi>
2023-10-22 20:37:17 -06:00
Trond Myklebust
5b9d31ae1c NFSv4: Add a parameter to limit the number of retries after NFS4ERR_DELAY
When using a 'softerr' mount, the NFSv4 client can get stuck waiting
forever while the server just returns NFS4ERR_DELAY. Among other things,
this causes the knfsd server threads to busy wait.
Add a parameter that tells the NFSv4 client how many times to retry
before giving up.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2023-10-22 19:47:56 -04:00
Josh Poimboeuf
dc6306ad5b x86/srso: Fix vulnerability reporting for missing microcode
The SRSO default safe-ret mitigation is reported as "mitigated" even if
microcode hasn't been updated.  That's wrong because userspace may still
be vulnerable to SRSO attacks due to IBPB not flushing branch type
predictions.

Report the safe-ret + !microcode case as vulnerable.

Also report the microcode-only case as vulnerable as it leaves the
kernel open to attacks.

Fixes: fb3bd914b3 ("x86/srso: Add a Speculative RAS Overflow mitigation")
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/a8a14f97d1b0e03ec255c81637afdf4cf0ae9c99.1693889988.git.jpoimboe@kernel.org
2023-10-20 11:46:09 +02:00
SeongJae Park
bc17ea26a8 Docs/admin-guide/mm/damon/usage: update for tried regions update time interval
The documentation says DAMOS tried regions update feature of DAMON sysfs
interface is doing the update for one aggregation interval after the
request is made.  Since the introduction of the per-scheme apply interval,
that behavior makes no much sense.  Hence the implementation has changed
to update the regions for each scheme for only its apply interval. 
Further update the document to reflect the real behavior.

Link: https://lkml.kernel.org/r/20231012192256.33556-4-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-18 14:34:19 -07:00
Nhat Pham
8cba9576df hugetlb: memcg: account hugetlb-backed memory in memory controller
Currently, hugetlb memory usage is not acounted for in the memory
controller, which could lead to memory overprotection for cgroups with
hugetlb-backed memory.  This has been observed in our production system.

For instance, here is one of our usecases: suppose there are two 32G
containers.  The machine is booted with hugetlb_cma=6G, and each container
may or may not use up to 3 gigantic page, depending on the workload within
it.  The rest is anon, cache, slab, etc.  We can set the hugetlb cgroup
limit of each cgroup to 3G to enforce hugetlb fairness.  But it is very
difficult to configure memory.max to keep overall consumption, including
anon, cache, slab etc.  fair.

What we have had to resort to is to constantly poll hugetlb usage and
readjust memory.max.  Similar procedure is done to other memory limits
(memory.low for e.g).  However, this is rather cumbersome and buggy. 
Furthermore, when there is a delay in memory limits correction, (for e.g
when hugetlb usage changes within consecutive runs of the userspace
agent), the system could be in an over/underprotected state.

This patch rectifies this issue by charging the memcg when the hugetlb
folio is utilized, and uncharging when the folio is freed (analogous to
the hugetlb controller).  Note that we do not charge when the folio is
allocated to the hugetlb pool, because at this point it is not owned by
any memcg.

Some caveats to consider:
  * This feature is only available on cgroup v2.
  * There is no hugetlb pool management involved in the memory
    controller. As stated above, hugetlb folios are only charged towards
    the memory controller when it is used. Host overcommit management
    has to consider it when configuring hard limits.
  * Failure to charge towards the memcg results in SIGBUS. This could
    happen even if the hugetlb pool still has pages (but the cgroup
    limit is hit and reclaim attempt fails).
  * When this feature is enabled, hugetlb pages contribute to memory
    reclaim protection. low, min limits tuning must take into account
    hugetlb memory.
  * Hugetlb pages utilized while this option is not selected will not
    be tracked by the memory controller (even if cgroup v2 is remounted
    later on).

Link: https://lkml.kernel.org/r/20231006184629.155543-4-nphamcs@gmail.com
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Frank van der Linden <fvdl@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Rik van Riel <riel@surriel.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tejun heo <tj@kernel.org>
Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-18 14:34:17 -07:00
Muhammad Usama Anjum
18825b8ae9 mm/pagemap: add documentation of PAGEMAP_SCAN IOCTL
Add some explanation and method to use write-protection and written-to
on memory range.

Link: https://lkml.kernel.org/r/20230821141518.870589-6-usama.anjum@collabora.com
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Alex Sierra <alex.sierra@amd.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Miroslaw <emmir@google.com>
Cc: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Nadav Amit <namit@vmware.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Paul Gofman <pgofman@codeweavers.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Yun Zhou <yun.zhou@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-18 14:34:13 -07:00
Peter Xu
d61ea1cb00 userfaultfd: UFFD_FEATURE_WP_ASYNC
Patch series "Implement IOCTL to get and optionally clear info about
PTEs", v33.

*Motivation*
The real motivation for adding PAGEMAP_SCAN IOCTL is to emulate Windows
GetWriteWatch() and ResetWriteWatch() syscalls [1].  The GetWriteWatch()
retrieves the addresses of the pages that are written to in a region of
virtual memory.

This syscall is used in Windows applications and games etc.  This syscall
is being emulated in pretty slow manner in userspace.  Our purpose is to
enhance the kernel such that we translate it efficiently in a better way. 
Currently some out of tree hack patches are being used to efficiently
emulate it in some kernels.  We intend to replace those with these
patches.  So the whole gaming on Linux can effectively get benefit from
this.  It means there would be tons of users of this code.

CRIU use case [2] was mentioned by Andrei and Danylo:
> Use cases for migrating sparse VMAs are binaries sanitized with ASAN,
> MSAN or TSAN [3]. All of these sanitizers produce sparse mappings of
> shadow memory [4]. Being able to migrate such binaries allows to highly
> reduce the amount of work needed to identify and fix post-migration
> crashes, which happen constantly.

Andrei defines the following uses of this code:
* it is more granular and allows us to track changed pages more
  effectively. The current interface can clear dirty bits for the entire
  process only. In addition, reading info about pages is a separate
  operation. It means we must freeze the process to read information
  about all its pages, reset dirty bits, only then we can start dumping
  pages. The information about pages becomes more and more outdated,
  while we are processing pages. The new interface solves both these
  downsides. First, it allows us to read pte bits and clear the
  soft-dirty bit atomically. It means that CRIU will not need to freeze
  processes to pre-dump their memory. Second, it clears soft-dirty bits
  for a specified region of memory. It means CRIU will have actual info
  about pages to the moment of dumping them.
* The new interface has to be much faster because basic page filtering
  is happening in the kernel. With the old interface, we have to read
  pagemap for each page.

*Implementation Evolution (Short Summary)*
From the definition of GetWriteWatch(), we feel like kernel's soft-dirty
feature can be used under the hood with some additions like:
* reset soft-dirty flag for only a specific region of memory instead of
clearing the flag for the entire process
* get and clear soft-dirty flag for a specific region atomically

So we decided to use ioctl on pagemap file to read or/and reset soft-dirty
flag. But using soft-dirty flag, sometimes we get extra pages which weren't
even written. They had become soft-dirty because of VMA merging and
VM_SOFTDIRTY flag. This breaks the definition of GetWriteWatch(). We were
able to by-pass this short coming by ignoring VM_SOFTDIRTY until David
reported that mprotect etc messes up the soft-dirty flag while ignoring
VM_SOFTDIRTY [5]. This wasn't happening until [6] got introduced. We
discussed if we can revert these patches. But we could not reach to any
conclusion. So at this point, I made couple of tries to solve this whole
VM_SOFTDIRTY issue by correcting the soft-dirty implementation:
* [7] Correct the bug fixed wrongly back in 2014. It had potential to cause
regression. We left it behind.
* [8] Keep a list of soft-dirty part of a VMA across splits and merges. I
got the reply don't increase the size of the VMA by 8 bytes.

At this point, we left soft-dirty considering it is too much delicate and
userfaultfd [9] seemed like the only way forward. From there onward, we
have been basing soft-dirty emulation on userfaultfd wp feature where
kernel resolves the faults itself when WP_ASYNC feature is used. It was
straight forward to add WP_ASYNC feature in userfautlfd. Now we get only
those pages dirty or written-to which are really written in reality. (PS
There is another WP_UNPOPULATED userfautfd feature is required which is
needed to avoid pre-faulting memory before write-protecting [9].)

All the different masks were added on the request of CRIU devs to create
interface more generic and better.

[1] https://learn.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-getwritewatch
[2] https://lore.kernel.org/all/20221014134802.1361436-1-mdanylo@google.com
[3] https://github.com/google/sanitizers
[4] https://github.com/google/sanitizers/wiki/AddressSanitizerAlgorithm#64-bit
[5] https://lore.kernel.org/all/bfcae708-db21-04b4-0bbe-712badd03071@redhat.com
[6] https://lore.kernel.org/all/20220725142048.30450-1-peterx@redhat.com/
[7] https://lore.kernel.org/all/20221122115007.2787017-1-usama.anjum@collabora.com
[8] https://lore.kernel.org/all/20221220162606.1595355-1-usama.anjum@collabora.com
[9] https://lore.kernel.org/all/20230306213925.617814-1-peterx@redhat.com
[10] https://lore.kernel.org/all/20230125144529.1630917-1-mdanylo@google.com


This patch (of 6):

Add a new userfaultfd-wp feature UFFD_FEATURE_WP_ASYNC, that allows
userfaultfd wr-protect faults to be resolved by the kernel directly.

It can be used like a high accuracy version of soft-dirty, without vma
modifications during tracking, and also with ranged support by default
rather than for a whole mm when reset the protections due to existence of
ioctl(UFFDIO_WRITEPROTECT).

Several goals of such a dirty tracking interface:

1. All types of memory should be supported and tracable. This is nature
   for soft-dirty but should mention when the context is userfaultfd,
   because it used to only support anon/shmem/hugetlb. The problem is for
   a dirty tracking purpose these three types may not be enough, and it's
   legal to track anything e.g. any page cache writes from mmap.

2. Protections can be applied to partial of a memory range, without vma
   split/merge fuss.  The hope is that the tracking itself should not
   affect any vma layout change.  It also helps when reset happens because
   the reset will not need mmap write lock which can block the tracee.

3. Accuracy needs to be maintained.  This means we need pte markers to work
   on any type of VMA.

One could question that, the whole concept of async dirty tracking is not
really close to fundamentally what userfaultfd used to be: it's not "a
fault to be serviced by userspace" anymore. However, using userfaultfd-wp
here as a framework is convenient for us in at least:

1. VM_UFFD_WP vma flag, which has a very good name to suite something like
   this, so we don't need VM_YET_ANOTHER_SOFT_DIRTY. Just use a new
   feature bit to identify from a sync version of uffd-wp registration.

2. PTE markers logic can be leveraged across the whole kernel to maintain
   the uffd-wp bit as long as an arch supports, this also applies to this
   case where uffd-wp bit will be a hint to dirty information and it will
   not go lost easily (e.g. when some page cache ptes got zapped).

3. Reuse ioctl(UFFDIO_WRITEPROTECT) interface for either starting or
   resetting a range of memory, while there's no counterpart in the old
   soft-dirty world, hence if this is wanted in a new design we'll need a
   new interface otherwise.

We can somehow understand that commonality because uffd-wp was
fundamentally a similar idea of write-protecting pages just like
soft-dirty.

This implementation allows WP_ASYNC to imply WP_UNPOPULATED, because so
far WP_ASYNC seems to not usable if without WP_UNPOPULATE.  This also
gives us chance to modify impl of WP_ASYNC just in case it could be not
depending on WP_UNPOPULATED anymore in the future kernels.  It's also fine
to imply that because both features will rely on PTE_MARKER_UFFD_WP config
option, so they'll show up together (or both missing) in an UFFDIO_API
probe.

vma_can_userfault() now allows any VMA if the userfaultfd registration is
only about async uffd-wp.  So we can track dirty for all kinds of memory
including generic file systems (like XFS, EXT4 or BTRFS).

One trick worth mention in do_wp_page() is that we need to manually update
vmf->orig_pte here because it can be used later with a pte_same() check -
this path always has FAULT_FLAG_ORIG_PTE_VALID set in the flags.

The major defect of this approach of dirty tracking is we need to populate
the pgtables when tracking starts.  Soft-dirty doesn't do it like that. 
It's unwanted in the case where the range of memory to track is huge and
unpopulated (e.g., tracking updates on a 10G file with mmap() on top,
without having any page cache installed yet).  One way to improve this is
to allow pte markers exist for larger than PTE level for PMD+.  That will
not change the interface if to implemented, so we can leave that for
later.

Link: https://lkml.kernel.org/r/20230821141518.870589-1-usama.anjum@collabora.com
Link: https://lkml.kernel.org/r/20230821141518.870589-2-usama.anjum@collabora.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Alex Sierra <alex.sierra@amd.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Miroslaw <emmir@google.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Nadav Amit <namit@vmware.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Paul Gofman <pgofman@codeweavers.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Yun Zhou <yun.zhou@windriver.com>
Cc: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-18 14:34:12 -07:00
Waiman Long
a41796b553 docs/cgroup: Add the list of threaded controllers to cgroup-v2.rst
The cgroup-v2 file mentions the concept of threaded controllers which can
be used in a threaded cgroup. However, it doesn't mention clearly which
controllers are threaded leading to some confusion about what controller
can be used requiring some experimentation. Clear this up by explicitly
listing the controllers that can be used currently in a threaded cgroup.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2023-10-17 22:45:44 -10:00
Jakub Kicinski
a3c2dd9648 bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZS1d4wAKCRDbK58LschI
 g4DSAP441CdKh8fd+wNKUSKHFbpCQ6EvocR6Nf+Sj2DFUx/w/QEA7mfju7Abqjc3
 xwDEx0BuhrjMrjV5MmEpxc7lYl9XcQU=
 =vuWk
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2023-10-16

We've added 90 non-merge commits during the last 25 day(s) which contain
a total of 120 files changed, 3519 insertions(+), 895 deletions(-).

The main changes are:

1) Add missed stats for kprobes to retrieve the number of missed kprobe
   executions and subsequent executions of BPF programs, from Jiri Olsa.

2) Add cgroup BPF sockaddr hooks for unix sockets. The use case is
   for systemd to reimplement the LogNamespace feature which allows
   running multiple instances of systemd-journald to process the logs
   of different services, from Daan De Meyer.

3) Implement BPF CPUv4 support for s390x BPF JIT, from Ilya Leoshkevich.

4) Improve BPF verifier log output for scalar registers to better
   disambiguate their internal state wrt defaults vs min/max values
   matching, from Andrii Nakryiko.

5) Extend the BPF fib lookup helpers for IPv4/IPv6 to support retrieving
   the source IP address with a new BPF_FIB_LOOKUP_SRC flag,
   from Martynas Pumputis.

6) Add support for open-coded task_vma iterator to help with symbolization
   for BPF-collected user stacks, from Dave Marchevsky.

7) Add libbpf getters for accessing individual BPF ring buffers which
   is useful for polling them individually, for example, from Martin Kelly.

8) Extend AF_XDP selftests to validate the SHARED_UMEM feature,
   from Tushar Vyavahare.

9) Improve BPF selftests cross-building support for riscv arch,
   from Björn Töpel.

10) Add the ability to pin a BPF timer to the same calling CPU,
   from David Vernet.

11) Fix libbpf's bpf_tracing.h macros for riscv to use the generic
   implementation of PT_REGS_SYSCALL_REGS() to access syscall arguments,
   from Alexandre Ghiti.

12) Extend libbpf to support symbol versioning for uprobes, from Hengqi Chen.

13) Fix bpftool's skeleton code generation to guarantee that ELF data
    is 8 byte aligned, from Ian Rogers.

14) Inherit system-wide cpu_mitigations_off() setting for Spectre v1/v4
    security mitigations in BPF verifier, from Yafang Shao.

15) Annotate struct bpf_stack_map with __counted_by attribute to prepare
    BPF side for upcoming __counted_by compiler support, from Kees Cook.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (90 commits)
  bpf: Ensure proper register state printing for cond jumps
  bpf: Disambiguate SCALAR register state output in verifier logs
  selftests/bpf: Make align selftests more robust
  selftests/bpf: Improve missed_kprobe_recursion test robustness
  selftests/bpf: Improve percpu_alloc test robustness
  selftests/bpf: Add tests for open-coded task_vma iter
  bpf: Introduce task_vma open-coded iterator kfuncs
  selftests/bpf: Rename bpf_iter_task_vma.c to bpf_iter_task_vmas.c
  bpf: Don't explicitly emit BTF for struct btf_iter_num
  bpf: Change syscall_nr type to int in struct syscall_tp_t
  net/bpf: Avoid unused "sin_addr_len" warning when CONFIG_CGROUP_BPF is not set
  bpf: Avoid unnecessary audit log for CPU security mitigations
  selftests/bpf: Add tests for cgroup unix socket address hooks
  selftests/bpf: Make sure mount directory exists
  documentation/bpf: Document cgroup unix socket address hooks
  bpftool: Add support for cgroup unix socket address hooks
  libbpf: Add support for cgroup unix socket address hooks
  bpf: Implement cgroup sockaddr hooks for unix sockets
  bpf: Add bpf_sock_addr_set_sun_path() to allow writing unix sockaddr from bpf
  bpf: Propagate modified uaddrlen from cgroup sockaddr programs
  ...
====================

Link: https://lore.kernel.org/r/20231016204803.30153-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-16 21:05:33 -07:00
Rik van Riel
94b3f0b5af smp,csd: Throw an error if a CSD lock is stuck for too long
The CSD lock seems to get stuck in 2 "modes". When it gets stuck
temporarily, it usually gets released in a few seconds, and sometimes
up to one or two minutes.

If the CSD lock stays stuck for more than several minutes, it never
seems to get unstuck, and gradually more and more things in the system
end up also getting stuck.

In the latter case, we should just give up, so the system can dump out
a little more information about what went wrong, and, with panic_on_oops
and a kdump kernel loaded, dump a whole bunch more information about what
might have gone wrong.  In addition, there is an smp.panic_on_ipistall
kernel boot parameter that by default retains the old behavior, but when
set enables the panic after the CSD lock has been stuck for more than
the specified number of milliseconds, as in 300,000 for five minutes.

[ paulmck: Apply Imran Khan feedback. ]
[ paulmck: Apply Leonardo Bras feedback. ]

Link: https://lore.kernel.org/lkml/bc7cc8b0-f587-4451-8bcd-0daae627bcc7@paulmck-laptop/
Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Imran Khan <imran.f.khan@oracle.com>
Reviewed-by: Leonardo Bras <leobras@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Randy Dunlap <rdunlap@infradead.org>
2023-10-16 16:06:37 -07:00
Stefan Roesch
b0540208a5 mm/ksm: document pages_skipped sysfs knob
This adds documentation for the new metric pages_skipped.

Link: https://lkml.kernel.org/r/20230926040939.516161-5-shr@devkernel.io
Signed-off-by: Stefan Roesch <shr@devkernel.io>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-16 15:44:39 -07:00
Stefan Roesch
75d7dd4138 mm/ksm: document smart scan mode
This adds documentation for the smart scan mode of KSM.

[akpm@linux-foundation.org: fix typo]
[akpm@linux-foundation.org: document that smart_scan defaults to on]
Link: https://lkml.kernel.org/r/20230926040939.516161-4-shr@devkernel.io
Signed-off-by: Stefan Roesch <shr@devkernel.io>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-16 15:44:39 -07:00
Ilkka Koskinen
0abe7f61c2 docs/perf: Add ampere_cspmu to toctree to fix a build warning
Add ampere_cspmu to toctree in order to address the following warning
produced when building documents:

	Documentation/admin-guide/perf/ampere_cspmu.rst: WARNING: document isn't included in any toctree

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/all/20231011172250.5a6498e5@canb.auug.org.au/
Fixes: 53a810ad3c ("perf: arm_cspmu: ampere_cspmu: Add support for Ampere SoC PMU")
Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20231012074103.3772114-1-ilkka@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-10-12 12:32:36 +01:00
Amos Wenger
2087f270be mm/memory-hotplug: fix typo in documentation
I'm 90% sure memory hotunplugging doesn't involve a "fist" phase

Signed-off-by: Amos Wenger <amos@bearcove.net>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20231006112636.97128-1-amos@bearcove.net
2023-10-10 13:35:55 -06:00
Vratislav Bendel
d17ff438a0 docs: mm: fix vm overcommit documentation for OVERCOMMIT_GUESS
Commit 8c7829b04c "mm: fix false-positive OVERCOMMIT_GUESS failures"
changed the behavior of the default OVERCOMMIT_GUESS setting.
Reflect the change also in the Documentation, namely files:
    Documentation/admin-guide/sysctl/vm.rst
    Documentation/mm/overcommit-accounting.rst

Reported-by: Jozef Bacik <jobacik@redhat.com>
Signed-off-by: Vratislav Bendel <vbendel@redhat.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20220829124638.63748-1-vbendel@redhat.com
2023-10-10 13:35:55 -06:00
Takahiro Itazuri
a3c12cf3a6 docs/hw-vuln: Update desc of best effort mode
Moves the description of the best effort mitigation mode to the table of
the possible values in the mds and tsx_async_abort docs, and adds the
same one to the mmio_stale_data doc.

Signed-off-by: Takahiro Itazuri <itazur@amazon.com>
Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230901082959.28310-1-itazur@amazon.com
2023-10-10 13:35:55 -06:00
Ilkka Koskinen
53a810ad3c perf: arm_cspmu: ampere_cspmu: Add support for Ampere SoC PMU
Ampere SoC PMU follows CoreSight PMU architecture. It uses implementation
specific registers to filter events rather than PMEVFILTnR registers.

Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20230913233941.9814-5-ilkka@os.amperecomputing.com
[will: Include linux/io.h in ampere_cspmu.c for writel()]
Signed-off-by: Will Deacon <will@kernel.org>
2023-10-10 19:10:54 +01:00
Shrikanth Hegde
8f833c82cd sched/topology: Change behaviour of the 'sched_energy_aware' sysctl, based on the platform
The 'sched_energy_aware' sysctl is available for the admin to disable/enable
energy aware scheduling(EAS). EAS is enabled only if few conditions are
met by the platform. They are, asymmetric CPU capacity, no SMT,
schedutil CPUfreq governor, frequency invariant load tracking etc.
A platform may boot without EAS capability, but could gain such
capability at runtime. For example, changing/registering the cpufreq
governor to schedutil.

At present, though platform doesn't support EAS, this sysctl returns 1
and it ends up calling build_perf_domains on write to 1 and
NOP when writing to 0. That is confusing and un-necessary.

Desired behavior would be to have this sysctl to enable/disable the EAS
on supported platform. On non-supported platform write to the sysctl
would return not supported error and read of the sysctl would return
empty. So sched_energy_aware returns empty - EAS is not possible at this moment
This will include EAS capable platforms which have at least one EAS
condition false during startup, e.g. not using the schedutil cpufreq governor
sched_energy_aware returns 0 - EAS is supported but disabled by admin.
sched_energy_aware returns 1 - EAS is supported and enabled.

User can find out the reason why EAS is not possible by checking
info messages. sched_is_eas_possible returns true if the platform
can do EAS at this moment.

Signed-off-by: Shrikanth Hegde <sshegde@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Valentin Schneider <vschneid@redhat.com>
Link: https://lore.kernel.org/r/20231009060037.170765-3-sshegde@linux.vnet.ibm.com
2023-10-09 17:24:44 +02:00
Waiman Long
aa1567a7e6 intel_idle: Add ibrs_off module parameter to force-disable IBRS
Commit bf5835bcdb ("intel_idle: Disable IBRS during long idle")
disables IBRS when the cstate is 6 or lower. However, there are
some use cases where a customer may want to use max_cstate=1 to
lower latency. Such use cases will suffer from the performance
degradation caused by the enabling of IBRS in the sibling idle thread.
Add a "ibrs_off" module parameter to force disable IBRS and the
CPUIDLE_FLAG_IRQ_ENABLE flag if set.

In the case of a Skylake server with max_cstate=1, this new ibrs_off
option will likely increase the IRQ response latency as IRQ will now
be disabled.

When running SPECjbb2015 with cstates set to C1 on a Skylake system.

First test when the kernel is booted with: "intel_idle.ibrs_off":

  max-jOPS = 117828, critical-jOPS = 66047

Then retest when the kernel is booted without the "intel_idle.ibrs_off"
added:

  max-jOPS = 116408, critical-jOPS = 58958

That means booting with "intel_idle.ibrs_off" improves performance by:

  max-jOPS:      +1.2%, which could be considered noise range.
  critical-jOPS: +12%,  which is definitely a solid improvement.

The admin-guide/pm/intel_idle.rst file is updated to add a description
about the new "ibrs_off" module parameter.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20230727184600.26768-5-longman@redhat.com
2023-10-07 11:33:28 +02:00
Ross Zwisler
85b901e600 media: visl: use canonical ftrace path
The canonical location for the tracefs filesystem is at
/sys/kernel/tracing.

But, from Documentation/trace/ftrace.rst:

  Before 4.1, all ftrace tracing control files were within the debugfs
  file system, which is typically located at /sys/kernel/debug/tracing.
  For backward compatibility, when mounting the debugfs file system,
  the tracefs file system will be automatically mounted at:

  /sys/kernel/debug/tracing

Update the visl decoder driver documentation to use this tracefs path.

Signed-off-by: Ross Zwisler <zwisler@google.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
2023-10-07 10:55:45 +02:00
Martin Tůma
bd7e2477d7 media: Documentation: Added Digiteq Automotive MGB4 driver documentation
The "admin-guide" documentation for the Digiteq Automotive MGB4 driver.

Signed-off-by: Martin Tůma <martin.tuma@digiteqautomotive.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
2023-10-07 10:52:16 +02:00
Liu Shixin
72a14e821c memcg: expose swapcache stat for memcg v1
Patch series "Expose swapcache stat for memcg v1", v2.

Since commit b603894248 ("mm: memcg: add swapcache stat for memcg v2")
adds swapcache stat for the cgroup v2, it seems there is no reason to hide
it in memcg v1.  Conversely, with swapcached it is more accurate to
evaluate the available memory for memcg.

Link: https://lkml.kernel.org/r/20230915105845.3199656-1-liushixin2@huawei.com
Link: https://lkml.kernel.org/r/20230915105845.3199656-2-liushixin2@huawei.com
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Suggested-by: Yosry Ahmed <yosryahmed@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Zefan Li <lizefan.x@bytedance.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-06 14:44:10 -07:00
Ingo Molnar
3fc18b06b8 Linux 6.6-rc4
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmUZ4WEeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGnIYH/07zef2U1nlqI+ro
 HRL2GlWGIs9yE70Oax+A3eYUYsjJIPu0yiDhFHUgOV3VyAALo44ZX/WNwKCGsI3e
 zhuOeItyyVcLGZXVC/jxSu0uveyfEiEYIWRYGyQ6Sna8Ksdk/qwhNgQNotdWdQG5
 7xt8z32couglu0uOkxcGqjTxmbjO6WSM5qi7Ts+xLsgrcS5cRuNhAg/vezp9bfeL
 1IUieCih4RJFgar/6LPOiB8uoVXEBonVbtlTRRqYdnqcsSIC+ACR9ZFk/+X88b5z
 S+Ta5VTcOAPu+2M/lSGe+PlUECvoBNK0SIYnaVCP2paPmDxfDXOFvSy/qJE87/7L
 9BeonFw=
 =8FTr
 -----END PGP SIGNATURE-----

Merge tag 'v6.6-rc4' into x86/entry, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2023-10-05 10:05:51 +02:00
Luiz Capitulino
9b81d3a5be cgroup: add cgroup_favordynmods= command-line option
We have a need of using favordynmods with cgroup v1, which doesn't support
changing mount flags during remount. Enabling CONFIG_CGROUP_FAVOR_DYNMODS at
build-time is not an option because we want to be able to selectively
enable it for certain systems.

This commit addresses this by introducing the cgroup_favordynmods=
command-line option. This option works for both cgroup v1 and v2 and also
allows for disabling favorynmods when the kernel built with
CONFIG_CGROUP_FAVOR_DYNMODS=y.

Also, note that when cgroup_favordynmods=true favordynmods is never
disabled in cgroup_destroy_root().

Signed-off-by: Luiz Capitulino <luizcap@amazon.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2023-10-04 08:47:55 -10:00
SeongJae Park
033343d5c5 Docs/admin-guide/mm/damon/usage: update for DAMOS apply intervals
Update DAMON usage document's DAMON sysfs interface section for the newly
added DAMOS apply intervals support (apply_interval_us file).

Link: https://lkml.kernel.org/r/20230916020945.47296-9-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-04 10:32:31 -07:00
SeongJae Park
1b2b7a17ab Docs/admin-guide/mm/damon/usage: document damos_before_apply tracepoint
Document damos_before_apply tracepoint on the usage document.

Link: https://lkml.kernel.org/r/20230913022050.2109-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-04 10:32:28 -07:00
Xin Hao
811244a501 mm: memcg: add THP swap out info for anonymous reclaim
At present, we support per-memcg reclaim strategy, however we do not know
the number of transparent huge pages being reclaimed, as we know the
transparent huge pages need to be splited before reclaim them, and they
will bring some performance bottleneck effect.  for example, when two
memcg (A & B) are doing reclaim for anonymous pages at same time, and 'A'
memcg is reclaiming a large number of transparent huge pages, we can
better analyze that the performance bottleneck will be caused by 'A'
memcg.  therefore, in order to better analyze such problems, there add THP
swap out info for per-memcg.

[akpm@linux-foundation.orgL fix swap_writepage_fs(), per Johannes]
  Link: https://lkml.kernel.org/r/20230913213343.GB48476@cmpxchg.org
Link: https://lkml.kernel.org/r/20230913164938.16918-1-vernhao@tencent.com
Signed-off-by: Xin Hao <vernhao@tencent.com>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-04 10:32:27 -07:00
SeongJae Park
46158bf211 Docs/admin-guide/mm/damon/usage: link design doc for details of kdamond and context
The explanation of kdamond and context is duplicated in the design and
the usage documents.  Replace that in the usage with links to those in
the design document.

Link: https://lkml.kernel.org/r/20230907022929.91361-8-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-04 10:32:21 -07:00
SeongJae Park
4f554ca15a Docs/admin-guide/mm/damon/usage: explain the format of damon_aggregate tracepoint
The example of the section for damon_aggregated tracepoint is not
explaining how the output looks like, and how it can be interpreted.
Add it.

Link: https://lkml.kernel.org/r/20230907022929.91361-6-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-04 10:32:21 -07:00
SeongJae Park
4f7112786e Docs/admin-guide/mm/damon/usage: move debugfs intro to the bottom of the section
On the DAMON usage introduction section, the introduction of DAMON
debugfs interface, which is deprecated, is above kernel API, which is
actively supported.  Move the DAMON debugfs intro to bottom, so that
readers have less chances to read it.

Link: https://lkml.kernel.org/r/20230907022929.91361-4-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-04 10:32:21 -07:00
SeongJae Park
75999724ba Docs/admin-guide/mm/damon/usage: place debugfs usage at the bottom
debugfs interface is deprecated.  Put it at the bottom of the document
so that readers have less chances to read it.

Link: https://lkml.kernel.org/r/20230907022929.91361-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-04 10:32:21 -07:00
SeongJae Park
b4c078004a Docs/admin-guide/mm/damon/usage: fixup missed :ref: keyword
Patch series "mm/damon: misc fixups for documents, comments and its
tracepoint".

This patchset contains miscellaneous simple fixups for documents, comments and
tracepoint of DAMON.


This patch (of 11):

A cross-link reference in DAMON usage document is missing ':ref:' Sphynx
keyword.  Fix it.

Link: https://lkml.kernel.org/r/20230907022929.91361-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20230907022929.91361-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-04 10:32:21 -07:00
Randy Dunlap
92224e806f docs: admin-guide: sysctl: fix details of struct dentry_stat_t
Commit c8c0c239d5 moved struct dentry_stat_t to fs/dcache.c but
did not update its location in Documentation, so update that now.
Also change each struct member from int to long as done in
commit 3942c07ccf.

Fixes: c8c0c239d5 ("fs: move dcache sysctls to its own file")
Fixes: 3942c07ccf ("fs: bump inode and dentry counters to long")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: linux-fsdevel@vger.kernel.org
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Glauber Costa <glommer@openvz.org>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230923195144.26043-1-rdunlap@infradead.org
2023-10-03 09:25:55 -06:00
Niklas Schnelle
c76c067e48 s390/pci: Use dma-iommu layer
While s390 already has a standard IOMMU driver and previous changes have
added I/O TLB flushing operations this driver is currently only used for
user-space PCI access such as vfio-pci. For the DMA API s390 instead
utilizes its own implementation in arch/s390/pci/pci_dma.c which drives
the same hardware and shares some code but requires a complex and
fragile hand over between DMA API and IOMMU API use of a device and
despite code sharing still leads to significant duplication and
maintenance effort. Let's utilize the common code DMAP API
implementation from drivers/iommu/dma-iommu.c instead allowing us to
get rid of arch/s390/pci/pci_dma.c.

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Link: https://lore.kernel.org/r/20230928-dma_iommu-v13-3-9e5fc4dacc36@linux.ibm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-10-02 08:43:00 +02:00
Michal Hocko
4597648fdd mm, memcg: reconsider kmem.limit_in_bytes deprecation
This reverts commits 86327e8eb9 ("memcg: drop kmem.limit_in_bytes") and
partially reverts 58056f7750 ("memcg, kmem: further deprecate
kmem.limit_in_bytes") which have incrementally removed support for the
kernel memory accounting hard limit.  Unfortunately it has turned out that
there is still userspace depending on the existence of
memory.kmem.limit_in_bytes [1].  The underlying functionality is not
really required but the non-existent file just confuses the userspace
which fails in the result.  The patch to fix this on the userspace side
has been submitted but it is hard to predict how it will propagate through
the maze of 3rd party consumers of the software.

Now, reverting alone 86327e8eb9 is not an option because there is
another set of userspace which cannot cope with ENOTSUPP returned when
writing to the file.  Therefore we have to go and revisit 58056f7750 as
well.  There are two ways to go ahead.  Either we give up on the
deprecation and fully revert 58056f7750 as well or we can keep
kmem.limit_in_bytes but make the write a noop and warn about the fact. 
This should work for both known breaking workloads which depend on the
existence but do not depend on the hard limit enforcement.

Note to backporters to stable trees.  a8c49af3be ("memcg: add per-memcg
total kernel memory stat") introduced in 4.18 has added memcg_account_kmem
so the accounting is not done by obj_cgroup_charge_pages directly for v1
anymore.  Prior kernels need to add it explicitly (thanks to Johannes for
pointing this out).

[akpm@linux-foundation.org: fix build - remove unused local]
Link: http://lkml.kernel.org/r/20230920081101.GA12096@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net [1]
Link: https://lkml.kernel.org/r/ZRE5VJozPZt9bRPy@dhcp22.suse.cz
Fixes: 86327e8eb9 ("memcg: drop kmem.limit_in_bytes")
Fixes: 58056f7750 ("memcg, kmem: further deprecate kmem.limit_in_bytes")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Tejun heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-09-29 17:20:47 -07:00
Fernando Eckhardt Valle
18801efed7
platform/x86: thinkpad_acpi: sysfs interface to auxmac
Newer Thinkpads have a feature called MAC Address Pass-through.
This patch provides a sysfs interface that userspace can use
to get this auxiliary mac address.

Signed-off-by: Fernando Eckhardt Valle <fevalle@ipt.br>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20230926202144.5906-1-fevalle@ipt.br
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-09-28 13:12:36 +03:00
Paul E. McKenney
2273799c29 locktorture: Rename readers_bind/writers_bind to bind_readers/bind_writers
This commit renames the readers_bind and writers_bind module parameters
to bind_readers and bind_writers, respectively.  This provides added
clarity via the imperative mode and better organizes the documentation.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2023-09-24 17:24:02 +02:00
Paul E. McKenney
b1326d766b doc: Catch-up update for locktorture module parameters
This commit documents recently added locktorture module parameters.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Joel Fernandes <joel@joelfernandes.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2023-09-24 17:24:02 +02:00
Paul E. McKenney
7f993623e9 locktorture: Add call_rcu_chains module parameter
When running locktorture on large systems, there will normally be
enough RCU activity to ensure that there is a grace period in flight
at all times.  However, on smaller systems, RCU might well be idle the
majority of the time.  This situation can be inconvenient in cases where
the RCU CPU stall warning is part of the debugging process.

This commit therefore adds an call_rcu_chains module parameter to
locktorture, allowing the user to specify the desired number of
self-propagating call_rcu() chains.  For good measure, immediately
before invoking call_rcu(), the self-propagating RCU callback invokes
start_poll_synchronize_rcu() to force the immediate start of a grace
period, with the call_rcu() forcing another to start shortly thereafter.

Booting with locktorture.call_rcu_chains=2 increases the probability
of a stuck locking primitive resulting in an RCU CPU stall warning from
about 25% to nearly 100%.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2023-09-24 17:24:02 +02:00
Tudor Ambarus
5ee1a43047 docs: pstore-blk.rst: fix typo, s/console/ftrace
The author referred to the ftrace log, but mentioned console log
instead. Fix the typo.

Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Link: https://lore.kernel.org/r/20230914133729.1956907-2-tudor.ambarus@linaro.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2023-09-23 20:45:26 -07:00
Tudor Ambarus
4bc028de97 docs: pstore-blk.rst: use "about" as a preposition after "care"
Reword the sentence to "care about the {oops/panic, pmsg, console} log."

Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Link: https://lore.kernel.org/r/20230914133729.1956907-1-tudor.ambarus@linaro.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2023-09-23 20:45:26 -07:00
Mikko Rapeli
e4c0138ab3 Documentation efi-stub.rst: fix arm64 EFI source location
arch/arm64/kernel/efi-entry.S has been moved to
drivers/firmware/efi/libstub/arm64-entry.S by commit
v6.1-rc4-6-g4ef806096bdb and to
drivers/firmware/efi/libstub/arm64.c by commit
v6.2-rc3-6-g617861703830

Signed-off-by: Mikko Rapeli <mikko.rapeli@linaro.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20230904113214.4147021-1-mikko.rapeli@linaro.org>
2023-09-22 05:29:19 -06:00
Wang Jinchao
d25e92d2ae memory-hotplug.rst: fix wrong /sys/device/ path
Actually, it should be /sys/devices/

Signed-off-by: Wang Jinchao <wangjinchao@xfusion.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <ZQz1NUATBMOb3RT+@fedora>
2023-09-22 05:19:14 -06:00
Tiezhu Yang
ac0691c75a bpf, docs: Add loongarch64 as arch supporting BPF JIT
As BPF JIT support for loongarch64 was added about one year ago
with commit 5dc615520c ("LoongArch: Add BPF JIT support"), it
is appropriate to add loongarch64 as arch supporting BPF JIT in
bpf and sysctl docs as well.

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Link: https://lore.kernel.org/r/1695111937-19697-1-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-09-21 15:30:09 -07:00
Waiman Long
efdf7532bd cgroup/cpuset: Documentation update for partition
This patch updates the cgroup-v2.rst file to include information about
the new "cpuset.cpus.exclusive" and "cpuset.cpus.excluisve.effective"
control files as well as the new remote partition type.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2023-09-18 10:32:31 -10:00
Nikolay Borisov
a11e097504 x86: Make IA32_EMULATION boot time configurable
Distributions would like to reduce their attack surface as much as
possible but at the same time they'd want to retain flexibility to cater
to a variety of legacy software. This stems from the conjecture that
compat layer is likely rarely tested and could have latent security
bugs. Ideally distributions will set their default policy and also
give users the ability to override it as appropriate.

To enable this use case, introduce CONFIG_IA32_EMULATION_DEFAULT_DISABLED
compile time option, which controls whether 32bit processes/syscalls
should be allowed or not. This option is aimed mainly at distributions
to set their preferred default behavior in their kernels.

To allow users to override the distro's policy, introduce the 'ia32_emulation'
parameter which allows overriding CONFIG_IA32_EMULATION_DEFAULT_DISABLED
state at boot time.

Signed-off-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230623111409.3047467-7-nik.borisov@suse.com
2023-09-14 13:19:53 +02:00
Paul E. McKenney
16128b1f8c rcu: Add sysfs to provide throttled access to rcu_barrier()
When running a series of stress tests all making heavy use of RCU,
it is all too possible to OOM the system when the prior test's RCU
callbacks don't get invoked until after the subsequent test starts.
One way of handling this is just a timed wait, but this fails when a
given CPU has so many callbacks queued that they take longer to invoke
than allowed for by that timed wait.

This commit therefore adds an rcutree.do_rcu_barrier module parameter that
is accessible from sysfs.  Writing one of the many synonyms for boolean
"true" will cause an rcu_barrier() to be invoked, but will guarantee that
no more than one rcu_barrier() will be invoked per sixteenth of a second
via this mechanism.  The flip side is that a given request might wait a
second or three longer than absolutely necessary, but only when there are
multiple uses of rcutree.do_rcu_barrier within a one-second time interval.

This commit unnecessarily serializes the rcu_barrier() machinery, given
that serialization is already provided by procfs.  This has the advantage
of allowing throttled rcu_barrier() from other sources within the kernel.

Reported-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2023-09-13 22:28:49 +02:00
Paul E. McKenney
8a4c0c90f2 doc: Add refscale.lookup_instances to kernel-parameters.txt
This commit adds refscale.lookup_instances to kernel-parameters.txt.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2023-09-11 23:02:41 +02:00
Ard Biesheuvel
944834901a Documentation: Drop or replace remaining mentions of IA64
Drop or update mentions of IA64, as appropriate.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-09-11 08:13:18 +00:00
Linus Torvalds
535a265d7f perf tools changes for v6.6:
perf tools maintainership:
 
 - Add git information for perf-tools and perf-tools-next trees/branches to the
   MAINTAINERS file. That is where development now takes place and myself and
   Namhyung Kim have write access, more people to come as we emulate other
   maintainer groups.
 
 perf record:
 
 - Record kernel data maps when 'perf record --data' is used, so that global variables can
   be resolved and used in tools that do data profiling.
 
 perf trace:
 
 - Remove the old, experimental support for BPF events in which a .c file was passed as
   an event: "perf trace -e hello.c" to then get compiled and loaded.
 
   The only known usage for that, that shipped with the kernel as an example for such events,
   augmented the raw_syscalls tracepoints and was converted to a libbpf skeleton, reusing all
   the user space components and the BPF code connected to the syscalls.
 
   In the end just the way to glue the BPF part and the user space type beautifiers changed,
   now being performed by libbpf skeletons.
 
   The next step is to use BTF to do pretty printing of all syscall types, as discussed with
   Alan Maguire and others.
 
   Now, on a perf built with BUILD_BPF_SKEL=1 we get most if not all path/filenames/strings,
   some of the networking data structures, perf_event_attr, etc, i.e. systemwide tracing of
   nanosleep calls and perf_event_open syscalls while 'perf stat' runs 'sleep' for 5 seconds:
 
   # perf trace -a -e *nanosleep,perf* perf stat -e cycles,instructions sleep 5
      0.000 (   9.034 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
      9.039 (   0.006 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x1 (PERF_COUNT_HW_INSTRUCTIONS), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf-exec), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
          ? (           ): gpm/991  ... [continued]: clock_nanosleep())               = 0
     10.133 (           ): sleep/327642 clock_nanosleep(rqtp: { .tv_sec: 5, .tv_nsec: 0 }, rmtp: 0x7ffd36f83ed0) ...
          ? (           ): pool-gsd-smart/3051  ... [continued]: clock_nanosleep())   = 0
     30.276 (           ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
    223.215 (1000.430 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0
     30.276 (2000.394 ms): gpm/991  ... [continued]: clock_nanosleep())               = 0
   1230.814 (           ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ...
   1230.814 (1000.404 ms): pool-gsd-smart/3051  ... [continued]: clock_nanosleep())   = 0
   2030.886 (           ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
   2237.709 (1000.153 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0
          ? (           ): crond/1172  ... [continued]: clock_nanosleep())            = 0
   3242.699 (           ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ...
   2030.886 (2000.385 ms): gpm/991  ... [continued]: clock_nanosleep())               = 0
   3728.078 (           ): crond/1172 clock_nanosleep(rqtp: { .tv_sec: 60, .tv_nsec: 0 }, rmtp: 0x7ffe0971dcf0) ...
   3242.699 (1000.158 ms): pool-gsd-smart/3051  ... [continued]: clock_nanosleep())   = 0
   4031.409 (           ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
     10.133 (5000.375 ms): sleep/327642  ... [continued]: clock_nanosleep())          = 0
 
  Performance counter stats for 'sleep 5':
 
          2,617,347      cycles
          1,855,997      instructions                     #    0.71  insn per cycle
 
        5.002282128 seconds time elapsed
 
        0.000855000 seconds user
        0.000852000 seconds sys
   #
 
 perf annotate:
 
 - Building with binutils' libopcode now is opt-in (BUILD_NONDISTRO=1) for
   licensing reasons, and we missed a build test on tools/perf/tests makefile.
 
   Since we now default to NDEBUG=1, we ended up segfaulting when building with
   BUILD_NONDISTRO=1 because a needed initialization routine was being "error
   checked" via an assert.
 
   Fix it by explicitly checking the result and aborting instead if it fails.
 
   We better back propagate the error, but at least 'perf annotate' on samples
   collected for a BPF program is back working when perf is built with
   BUILD_NONDISTRO=1.
 
 perf report/top:
 
 - Add back TUI hierarchy mode header, that is seen when using 'perf report/top --hierarchy'.
 
 - Fix the number of entries for 'e' key in the TUI that was preventing navigation of
   lines when expanding an entry.
 
 perf report/script:
 
 - Support cross platform register handling, allowing a perf.data file collected
   on one architecture to have registers sampled correctly displayed when
   analysis tools such as 'perf report' and 'perf script' are used on a different
   architecture.
 
 - Fix handling of event attributes in pipe mode, i.e. when one uses:
 
 	perf record -o - | perf report -i -
 
   When no perf.data files are used.
 
 - Handle files generated via pipe mode with a version of perf and then read
   also via pipe mode with a different version of perf, where the event attr
   record may have changed, use the record size field to properly support this
   version mismatch.
 
 perf probe:
 
 - Accessing global variables from uprobes isn't supported, make the error
   message state that instead of stating that some minimal kernel version is
   needed to have that feature. This seems just a tool limitation, the kernel
   probably has all that is needed.
 
 perf tests:
 
 - Fix a reference count related leak in the dlfilter v0 API where the result
   of a thread__find_symbol_fb() is not matched with an addr_location__exit()
   to drop the reference counts of the resolved components (machine, thread, map,
   symbol, etc). Add a dlfilter test to make sure that doesn't regresses.
 
 - Lots of fixes for the 'perf test' written in shell script related to problems
   found with the shellcheck utility.
 
 - Fixes for 'perf test' shell scripts testing features enabled when perf is
   built with BUILD_BPF_SKEL=1, such as 'perf stat' bpf counters.
 
 - Add perf record sample filtering test, things like the following example, that gets
   implemented as a BPF filter attached to the event:
 
    # perf record -e task-clock -c 10000 --filter 'ip < 0xffffffff00000000'
 
 - Improve the way the task_analyzer test checks if libtraceevent is linked,
   using 'perf version --build-options' instead of the more expensinve
   'perf record -e "sched:sched_switch"'.
 
 - Add support for riscv in the mmap-basic test. (This went as well via the RiscV tree, same contents).
 
 libperf:
 
 - Implement riscv mmap support (This went as well via the RiscV tree, same contents).
 
 perf script:
 
 - New tool that converts perf.data files to the firefox profiler format so that one can use
   the visualizer at https://profiler.firefox.com/. Done by Anup Sharma as part of this year's
   Google Summer of Code.
 
   One can generate the output and upload it to the web interface but Anup also automated
   everything:
 
      perf script gecko -F 99 -a sleep 60
 
 - Support syscall name parsing on arm64.
 
 - Print "cgroup" field on the same line as "comm".
 
 perf bench:
 
 - Add new 'uprobe' benchmark to measure the overhead of uprobes with/without
   BPF programs attached to it.
 
 - breakpoints are not available on power9, skip that test.
 
 perf stat:
 
 - Add #num_cpus_online literal to be used in 'perf stat' metrics, and add this extra
   'perf test' check that exemplifies its purpose:
 
 	TEST_ASSERT_VAL("#num_cpus_online",
                        expr__parse(&num_cpus_online, ctx, "#num_cpus_online") == 0);
 	TEST_ASSERT_VAL("#num_cpus", expr__parse(&num_cpus, ctx, "#num_cpus") == 0);
 	TEST_ASSERT_VAL("#num_cpus >= #num_cpus_online", num_cpus >= num_cpus_online);
 
 Miscellaneous:
 
 - Improve tool startup time by lazily reading PMU, JSON, sysfs data.
 
 - Improve error reporting in the parsing of events, passing YYLTYPE to error routines,
   so that the output can show were the parsing error was found.
 
 - Add 'perf test' entries to check the parsing of events improvements.
 
 - Fix various leak for things detected by -fsanitize=address, mostly things that would
   be freed at tool exit, including:
 
   - Free evsel->filter on the destructor.
 
   - Allow tools to register a thread->priv destructor and use it in 'perf trace'.
 
   - Free evsel->priv in 'perf trace'.
 
   - Free string returned by synthesize_perf_probe_point() when the caller fails
     to do all it needs.
 
 - Adjust various compiler options to not consider errors some warnings when
   building with broken headers found in things like python, flex, bison, as we
   otherwise build with -Werror. Some for gcc, some for clang, some for some
   specific version of those, some for some specific version of flex or bison, or
   some specific combination of these components, bah.
 
 - Allow customization of clang options for BPF target, this helps building on
   gentoo where there are other oddities where BPF targets gets passed some compiler
   options intended for the native build, so building with WERROR=0 helps while
   these oddities are fixed.
 
 - Dont pass ERR_PTR() values to perf_session__delete() in 'perf top' and 'perf lock',
   fixing some segfaults when handling some odd failures.
 
 - Add LTO build option.
 
 - Fix format of unordered lists in the perf docs (tools/perf/Documentation).
 
 - Overhaul the bison files, using constructs such as YYNOMEM.
 
 - Remove unused tokens from the bison .y files.
 
 - Add more comments to various structs.
 
 - A few LoongArch enablement patches.
 
 Vendor events (JSON):
 
 - Add JSON metrics for Yitian 710 DDR (aarch64). Things like:
 
 	EventName, BriefDescription
 	visible_window_limit_reached_rd, "At least one entry in read queue reaches the visible window limit.",
 	visible_window_limit_reached_wr, "At least one entry in write queue reaches the visible window limit.",
 	op_is_dqsosc_mpc	       , "A DQS Oscillator MPC command to DRAM.",
 	op_is_dqsosc_mrr	       , "A DQS Oscillator MRR command to DRAM.",
 	op_is_tcr_mrr		       , "A Temperature Compensated Refresh(TCR) MRR command to DRAM.",
 
 - Add AmpereOne metrics (aarch64).
 
 - Update N2 and V2 metrics (aarch64) and events using Arm telemetry repo.
 
 - Update scale units and descriptions of common topdown metrics on aarch64. Things like:
 
   - "MetricExpr": "stall_slot_frontend / (#slots * cpu_cycles)",
   - "BriefDescription": "Frontend bound L1 topdown metric",
   + "MetricExpr": "100 * (stall_slot_frontend / (#slots * cpu_cycles))",
   + "BriefDescription": "This metric is the percentage of total slots that were stalled due to resource constraints in the frontend of the processor.",
 
 - Update events for intel: meteorlake to 1.04, sapphirerapids to 1.15, Icelake+ metric constraints.
 
 - Update files for the power10 platform.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCZPfJZgAKCRCyPKLppCJ+
 J1/eAP9lgtavD0V75wy1p5zyotkceOmPTkk1DYFVx2Euhxa/lAD/YW/JvuVSo0Gr
 HqJP52XaV0tF8gG+YxL+Lay/Ke0P5AQ=
 =d12c
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v6.6-1-2023-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools updates from Arnaldo Carvalho de Melo:
 "perf tools maintainership:

   - Add git information for perf-tools and perf-tools-next trees and
     branches to the MAINTAINERS file. That is where development now
     takes place and myself and Namhyung Kim have write access, more
     people to come as we emulate other maintainer groups.

  perf record:

   - Record kernel data maps when 'perf record --data' is used, so that
     global variables can be resolved and used in tools that do data
     profiling.

  perf trace:

   - Remove the old, experimental support for BPF events in which a .c
     file was passed as an event: "perf trace -e hello.c" to then get
     compiled and loaded.

     The only known usage for that, that shipped with the kernel as an
     example for such events, augmented the raw_syscalls tracepoints and
     was converted to a libbpf skeleton, reusing all the user space
     components and the BPF code connected to the syscalls.

     In the end just the way to glue the BPF part and the user space
     type beautifiers changed, now being performed by libbpf skeletons.

     The next step is to use BTF to do pretty printing of all syscall
     types, as discussed with Alan Maguire and others.

     Now, on a perf built with BUILD_BPF_SKEL=1 we get most if not all
     path/filenames/strings, some of the networking data structures,
     perf_event_attr, etc, i.e. systemwide tracing of nanosleep calls
     and perf_event_open syscalls while 'perf stat' runs 'sleep' for 5
     seconds:

      # perf trace -a -e *nanosleep,perf* perf stat -e cycles,instructions sleep 5
         0.000 (   9.034 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
         9.039 (   0.006 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x1 (PERF_COUNT_HW_INSTRUCTIONS), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf-exec), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
             ? (           ): gpm/991  ... [continued]: clock_nanosleep())               = 0
        10.133 (           ): sleep/327642 clock_nanosleep(rqtp: { .tv_sec: 5, .tv_nsec: 0 }, rmtp: 0x7ffd36f83ed0) ...
             ? (           ): pool-gsd-smart/3051  ... [continued]: clock_nanosleep())   = 0
        30.276 (           ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
       223.215 (1000.430 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0
        30.276 (2000.394 ms): gpm/991  ... [continued]: clock_nanosleep())               = 0
      1230.814 (           ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ...
      1230.814 (1000.404 ms): pool-gsd-smart/3051  ... [continued]: clock_nanosleep())   = 0
      2030.886 (           ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
      2237.709 (1000.153 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0
             ? (           ): crond/1172  ... [continued]: clock_nanosleep())            = 0
      3242.699 (           ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ...
      2030.886 (2000.385 ms): gpm/991  ... [continued]: clock_nanosleep())               = 0
      3728.078 (           ): crond/1172 clock_nanosleep(rqtp: { .tv_sec: 60, .tv_nsec: 0 }, rmtp: 0x7ffe0971dcf0) ...
      3242.699 (1000.158 ms): pool-gsd-smart/3051  ... [continued]: clock_nanosleep())   = 0
      4031.409 (           ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
        10.133 (5000.375 ms): sleep/327642  ... [continued]: clock_nanosleep())          = 0

      Performance counter stats for 'sleep 5':

             2,617,347      cycles
             1,855,997      instructions                     #    0.71  insn per cycle

           5.002282128 seconds time elapsed

           0.000855000 seconds user
           0.000852000 seconds sys

  perf annotate:

   - Building with binutils' libopcode now is opt-in (BUILD_NONDISTRO=1)
     for licensing reasons, and we missed a build test on
     tools/perf/tests makefile.

     Since we now default to NDEBUG=1, we ended up segfaulting when
     building with BUILD_NONDISTRO=1 because a needed initialization
     routine was being "error checked" via an assert.

     Fix it by explicitly checking the result and aborting instead if it
     fails.

     We better back propagate the error, but at least 'perf annotate' on
     samples collected for a BPF program is back working when perf is
     built with BUILD_NONDISTRO=1.

  perf report/top:

   - Add back TUI hierarchy mode header, that is seen when using 'perf
     report/top --hierarchy'.

   - Fix the number of entries for 'e' key in the TUI that was
     preventing navigation of lines when expanding an entry.

  perf report/script:

   - Support cross platform register handling, allowing a perf.data file
     collected on one architecture to have registers sampled correctly
     displayed when analysis tools such as 'perf report' and 'perf
     script' are used on a different architecture.

   - Fix handling of event attributes in pipe mode, i.e. when one uses:

  	perf record -o - | perf report -i -

     When no perf.data files are used.

   - Handle files generated via pipe mode with a version of perf and
     then read also via pipe mode with a different version of perf,
     where the event attr record may have changed, use the record size
     field to properly support this version mismatch.

  perf probe:

   - Accessing global variables from uprobes isn't supported, make the
     error message state that instead of stating that some minimal
     kernel version is needed to have that feature. This seems just a
     tool limitation, the kernel probably has all that is needed.

  perf tests:

   - Fix a reference count related leak in the dlfilter v0 API where the
     result of a thread__find_symbol_fb() is not matched with an
     addr_location__exit() to drop the reference counts of the resolved
     components (machine, thread, map, symbol, etc). Add a dlfilter test
     to make sure that doesn't regresses.

   - Lots of fixes for the 'perf test' written in shell script related
     to problems found with the shellcheck utility.

   - Fixes for 'perf test' shell scripts testing features enabled when
     perf is built with BUILD_BPF_SKEL=1, such as 'perf stat' bpf
     counters.

   - Add perf record sample filtering test, things like the following
     example, that gets implemented as a BPF filter attached to the
     event:

       # perf record -e task-clock -c 10000 --filter 'ip < 0xffffffff00000000'

   - Improve the way the task_analyzer test checks if libtraceevent is
     linked, using 'perf version --build-options' instead of the more
     expensinve 'perf record -e "sched:sched_switch"'.

   - Add support for riscv in the mmap-basic test. (This went as well
     via the RiscV tree, same contents).

  libperf:

   - Implement riscv mmap support (This went as well via the RiscV tree,
     same contents).

  perf script:

   - New tool that converts perf.data files to the firefox profiler
     format so that one can use the visualizer at
     https://profiler.firefox.com/. Done by Anup Sharma as part of this
     year's Google Summer of Code.

     One can generate the output and upload it to the web interface but
     Anup also automated everything:

       perf script gecko -F 99 -a sleep 60

   - Support syscall name parsing on arm64.

   - Print "cgroup" field on the same line as "comm".

  perf bench:

   - Add new 'uprobe' benchmark to measure the overhead of uprobes
     with/without BPF programs attached to it.

   - breakpoints are not available on power9, skip that test.

  perf stat:

   - Add #num_cpus_online literal to be used in 'perf stat' metrics, and
     add this extra 'perf test' check that exemplifies its purpose:

  	TEST_ASSERT_VAL("#num_cpus_online",
                         expr__parse(&num_cpus_online, ctx, "#num_cpus_online") == 0);
  	TEST_ASSERT_VAL("#num_cpus", expr__parse(&num_cpus, ctx, "#num_cpus") == 0);
  	TEST_ASSERT_VAL("#num_cpus >= #num_cpus_online", num_cpus >= num_cpus_online);

  Miscellaneous:

   - Improve tool startup time by lazily reading PMU, JSON, sysfs data.

   - Improve error reporting in the parsing of events, passing YYLTYPE
     to error routines, so that the output can show were the parsing
     error was found.

   - Add 'perf test' entries to check the parsing of events
     improvements.

   - Fix various leak for things detected by -fsanitize=address, mostly
     things that would be freed at tool exit, including:

       - Free evsel->filter on the destructor.

       - Allow tools to register a thread->priv destructor and use it in
         'perf trace'.

       - Free evsel->priv in 'perf trace'.

       - Free string returned by synthesize_perf_probe_point() when the
         caller fails to do all it needs.

   - Adjust various compiler options to not consider errors some
     warnings when building with broken headers found in things like
     python, flex, bison, as we otherwise build with -Werror. Some for
     gcc, some for clang, some for some specific version of those, some
     for some specific version of flex or bison, or some specific
     combination of these components, bah.

   - Allow customization of clang options for BPF target, this helps
     building on gentoo where there are other oddities where BPF targets
     gets passed some compiler options intended for the native build, so
     building with WERROR=0 helps while these oddities are fixed.

   - Dont pass ERR_PTR() values to perf_session__delete() in 'perf top'
     and 'perf lock', fixing some segfaults when handling some odd
     failures.

   - Add LTO build option.

   - Fix format of unordered lists in the perf docs
     (tools/perf/Documentation)

   - Overhaul the bison files, using constructs such as YYNOMEM.

   - Remove unused tokens from the bison .y files.

   - Add more comments to various structs.

   - A few LoongArch enablement patches.

  Vendor events (JSON):

   - Add JSON metrics for Yitian 710 DDR (aarch64). Things like:

  	EventName, BriefDescription
  	visible_window_limit_reached_rd, "At least one entry in read queue reaches the visible window limit.",
  	visible_window_limit_reached_wr, "At least one entry in write queue reaches the visible window limit.",
  	op_is_dqsosc_mpc	       , "A DQS Oscillator MPC command to DRAM.",
  	op_is_dqsosc_mrr	       , "A DQS Oscillator MRR command to DRAM.",
  	op_is_tcr_mrr		       , "A Temperature Compensated Refresh(TCR) MRR command to DRAM.",

   - Add AmpereOne metrics (aarch64).

   - Update N2 and V2 metrics (aarch64) and events using Arm telemetry
     repo.

   - Update scale units and descriptions of common topdown metrics on
     aarch64. Things like:
       - "MetricExpr": "stall_slot_frontend / (#slots * cpu_cycles)",
       - "BriefDescription": "Frontend bound L1 topdown metric",
       + "MetricExpr": "100 * (stall_slot_frontend / (#slots * cpu_cycles))",
       + "BriefDescription": "This metric is the percentage of total slots that were stalled due to resource constraints in the frontend of the processor.",

   - Update events for intel: meteorlake to 1.04, sapphirerapids to
     1.15, Icelake+ metric constraints.

   - Update files for the power10 platform"

* tag 'perf-tools-for-v6.6-1-2023-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (217 commits)
  perf parse-events: Fix driver config term
  perf parse-events: Fixes relating to no_value terms
  perf parse-events: Fix propagation of term's no_value when cloning
  perf parse-events: Name the two term enums
  perf list: Don't print Unit for "default_core"
  perf vendor events intel: Fix modifier in tma_info_system_mem_parallel_reads for skylake
  perf dlfilter: Avoid leak in v0 API test use of resolve_address()
  perf metric: Add #num_cpus_online literal
  perf pmu: Remove str from perf_pmu_alias
  perf parse-events: Make common term list to strbuf helper
  perf parse-events: Minor help message improvements
  perf pmu: Avoid uninitialized use of alias->str
  perf jevents: Use "default_core" for events with no Unit
  perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test
  perf test shell stat_bpf_counters: Fix test on Intel
  perf test shell record_bpf_filter: Skip 6.2 kernel
  libperf: Get rid of attr.id field
  perf tools: Convert to perf_record_header_attr_id()
  libperf: Add perf_record_header_attr_id()
  perf tools: Handle old data in PERF_RECORD_ATTR
  ...
2023-09-09 20:06:17 -07:00
Linus Torvalds
7ccc3ebf0c io_uring-6.6-2023-09-08
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmT7NXsQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgphevEACwpgak3KV+zE+u/2g8rsjRu9FDkgi9hW+7
 RKbwPtgTN3R3/z+eGq57gWW3aSJx0Mqaxf/mtEqf+K+yxK3FbDDKAkkwMG0S/TGT
 FT+beCeJHiA9tTruoW1ipGh7gxS0D3JWwPRl5i7+rtagzdbOSYWmsrNPVXl3uVee
 x6sORlZ3Ye2VRd0m+3fQ7nWw+epMKebMrlpmWC7og0mdt/zhF+q6zgAV8tdEYe7E
 VQ8yrWa9LyYw07a+M2kra2+IZzSWYZE7qZE1OZcwbI9ARvqNkXyslqXfW0gVrDml
 lfoUnwqsRI45RfTOTao6lAzc5QZHTlrl1Mqg79139grDaEM/bCx/kQq2ZUG03Tcy
 FUgKNrz6E/te5u1ar8E8ppXEAmjjAsKiYZsT+d5nHv7xPV9F15Ryu+Mr9sFAiHXw
 hMyInmBJ6JcQx9BYpvxTMi7N4Mfiw2Z9GTE2yhhVPvEAG418+YYa7CSNdXVSy/M2
 QuAQEc+K1CcVhmgNZbSe2FnSCve1khHURT0C3PYI0u1VHpZmkzcSY5utdKuqeRDw
 SqHk/XO3bBWW13UuLBnIB8+jk1AGKDc0vJQrdzxxxSYHc7uTx30QHYQMdA90Bqyx
 2z/lg7IwfITgFkRO30AGXbe2NN4YnbseJ5wyfHvlN/WcHrDo9zbZUt6gReVCcOWA
 /NBftGe8WA==
 =eHn2
 -----END PGP SIGNATURE-----

Merge tag 'io_uring-6.6-2023-09-08' of git://git.kernel.dk/linux

Pull io_uring fixes from Jens Axboe:
 "A few fixes that should go into the 6.6-rc merge window:

   - Fix for a regression this merge window caused by the SQPOLL
     affinity patch, where we can race with SQPOLL thread shutdown and
     cause an oops when trying to set affinity (Gabriel)

   - Fix for a regression this merge window where fdinfo reading with
     for a ring setup with IORING_SETUP_NO_SQARRAY will attempt to
     deference the non-existing SQ ring array (me)

   - Add the patch that allows more finegrained control over who can use
     io_uring (Matteo)

   - Locking fix for a regression added this merge window for IOPOLL
     overflow (Pavel)

   - IOPOLL fix for stable, breaking our loop if helper threads are
     exiting (Pavel)

  Also had a fix for unreaped iopoll requests from io-wq from Ming, but
  we found an issue with that and hence it got reverted. Will get this
  sorted for a future rc"

* tag 'io_uring-6.6-2023-09-08' of git://git.kernel.dk/linux:
  Revert "io_uring: fix IO hang in io_wq_put_and_exit from do_exit()"
  io_uring: fix unprotected iopoll overflow
  io_uring: break out of iowq iopoll on teardown
  io_uring: add a sysctl to disable io_uring system-wide
  io_uring/fdinfo: only print ->sq_array[] if it's there
  io_uring: fix IO hang in io_wq_put_and_exit from do_exit()
  io_uring: Don't set affinity on a dying sqpoll thread
2023-09-08 21:32:28 -07:00
Matteo Rizzo
76d3ccecfa io_uring: add a sysctl to disable io_uring system-wide
Introduce a new sysctl (io_uring_disabled) which can be either 0, 1, or
2. When 0 (the default), all processes are allowed to create io_uring
instances, which is the current behavior.  When 1, io_uring creation is
disabled (io_uring_setup() will fail with -EPERM) for unprivileged
processes not in the kernel.io_uring_group group.  When 2, calls to
io_uring_setup() fail with -EPERM regardless of privilege.

Signed-off-by: Matteo Rizzo <matteorizzo@google.com>
[JEM: modified to add io_uring_group]
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Link: https://lore.kernel.org/r/x49y1i42j1z.fsf@segfault.boston.devel.redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-09-05 08:34:07 -06:00
Linus Torvalds
bd30fe6a7d workqueue: Changes for v6.6
* Unbound workqueues now support more flexible affinity scopes. The default
   behavior is to soft-affine according to last level cache boundaries. A
   work item queued from a given LLC is executed by a worker running on the
   same LLC but the worker may be moved across cache boundaries as the
   scheduler sees fit. On machines which multiple L3 caches, which are
   becoming more popular along with chiplet designs, this improves cache
   locality while not harming work conservation too much.
 
   Unbound workqueues are now also a lot more flexible in terms of execution
   affinity. Differeing levels of affinity scopes are supported and both the
   default and per-workqueue affinity settings can be modified dynamically.
   This should help working around amny of sub-optimal behaviors observed
   recently with asymmetric ARM CPUs.
 
   This involved signficant restructuring of workqueue code. Nothing was
   reported yet but there's some risk of subtle regressions. Should keep an
   eye out.
 
 * Rescuer workers now has more identifiable comms.
 
 * workqueue.unbound_cpus added so that CPUs which can be used by workqueue
   can be constrained early during boot.
 
 * Now that all the in-tree users have been flushed out, trigger warning if
   system-wide workqueues are flushed.
 
 * One pull commit from for-6.5-fixes to avoid cascading conflicts in the
   affinity scope patchset.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYIACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZPERlQ4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGVqQAPwIOy9tWY5jFAmMuIyH6wV50hbmfxCc2n5xhQNr
 5HoyGgEA8lw1W7afDCIPiQVA7AYsu8dhwuNSOcRCJxhrrn4XsA0=
 =g/Uu
 -----END PGP SIGNATURE-----

Merge tag 'wq-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

Pull workqueue updates from Tejun Heo:

 - Unbound workqueues now support more flexible affinity scopes.

   The default behavior is to soft-affine according to last level cache
   boundaries. A work item queued from a given LLC is executed by a
   worker running on the same LLC but the worker may be moved across
   cache boundaries as the scheduler sees fit. On machines which
   multiple L3 caches, which are becoming more popular along with
   chiplet designs, this improves cache locality while not harming work
   conservation too much.

   Unbound workqueues are now also a lot more flexible in terms of
   execution affinity. Differeing levels of affinity scopes are
   supported and both the default and per-workqueue affinity settings
   can be modified dynamically. This should help working around amny of
   sub-optimal behaviors observed recently with asymmetric ARM CPUs.

   This involved signficant restructuring of workqueue code. Nothing was
   reported yet but there's some risk of subtle regressions. Should keep
   an eye out.

 - Rescuer workers now has more identifiable comms.

 - workqueue.unbound_cpus added so that CPUs which can be used by
   workqueue can be constrained early during boot.

 - Now that all the in-tree users have been flushed out, trigger warning
   if system-wide workqueues are flushed.

* tag 'wq-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (31 commits)
  workqueue: fix data race with the pwq->stats[] increment
  workqueue: Rename rescuer kworker
  workqueue: Make default affinity_scope dynamically updatable
  workqueue: Add "Affinity Scopes and Performance" section to documentation
  workqueue: Implement non-strict affinity scope for unbound workqueues
  workqueue: Add workqueue_attrs->__pod_cpumask
  workqueue: Factor out need_more_worker() check and worker wake-up
  workqueue: Factor out work to worker assignment and collision handling
  workqueue: Add multiple affinity scopes and interface to select them
  workqueue: Modularize wq_pod_type initialization
  workqueue: Add tools/workqueue/wq_dump.py which prints out workqueue configuration
  workqueue: Generalize unbound CPU pods
  workqueue: Factor out clearing of workqueue-only attrs fields
  workqueue: Factor out actual cpumask calculation to reduce subtlety in wq_update_pod()
  workqueue: Initialize unbound CPU pods later in the boot
  workqueue: Move wq_pod_init() below workqueue_init()
  workqueue: Rename NUMA related names to use pod instead
  workqueue: Rename workqueue_attrs->no_numa to ->ordered
  workqueue: Make unbound workqueues to use per-cpu pool_workqueues
  workqueue: Call wq_update_unbound_numa() on all CPUs in NUMA node on CPU hotplug
  ...
2023-09-01 16:06:32 -07:00
Linus Torvalds
7716f383a5 cgroup: Changes for v6.6
* Per-cpu cpu usage stats are now tracked. This currently isn't printed out
   in the cgroupfs interface and can only be accessed through e.g. BPF.
   Should decide on a not-too-ugly way to show per-cpu stats in cgroupfs.
 
 * cpuset received some cleanups and prepatory patches for the pending
   cpus.exclusive patchset which will allow cpuset partitions to be created
   below non-partition parents, which should ease the management of partition
   cpusets.
 
 * A lot of code and documentation cleanup patches.
 
 * tools/testing/selftests/cgroup/test_cpuset.c is added. This causes trivial
   conflicts in .gitignore and Makefile under the directory against
   fe3b1bf19b ("selftests: cgroup: add test_zswap program"). They can be
   resolved by keeping lines from both branches.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYIACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZPENTg4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGcyBAP44cHwpSFxXe3cehxAzb1l/2BZXtzU5l48OqUQd
 MwHyrwEAm7+MTVAR2xOF4f+oVM9KWmKj7oV7Clpixl1S7hHyjwE=
 =FCc9
 -----END PGP SIGNATURE-----

Merge tag 'cgroup-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup updates from Tejun Heo:

 - Per-cpu cpu usage stats are now tracked

   This currently isn't printed out in the cgroupfs interface and can
   only be accessed through e.g. BPF. Should decide on a not-too-ugly
   way to show per-cpu stats in cgroupfs

 - cpuset received some cleanups and prepatory patches for the pending
   cpus.exclusive patchset which will allow cpuset partitions to be
   created below non-partition parents, which should ease the management
   of partition cpusets

 - A lot of code and documentation cleanup patches

 - tools/testing/selftests/cgroup/test_cpuset.c added

* tag 'cgroup-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (32 commits)
  cgroup: Avoid -Wstringop-overflow warnings
  cgroup:namespace: Remove unused cgroup_namespaces_init()
  cgroup/rstat: Record the cumulative per-cpu time of cgroup and its descendants
  cgroup: clean up if condition in cgroup_pidlist_start()
  cgroup: fix obsolete function name in cgroup_destroy_locked()
  Documentation: cgroup-v2.rst: Correct number of stats entries
  cgroup: fix obsolete function name above css_free_rwork_fn()
  cgroup/cpuset: fix kernel-doc
  cgroup: clean up printk()
  cgroup: fix obsolete comment above cgroup_create()
  docs: cgroup-v1: fix typo
  docs: cgroup-v1: correct the term of Page Cache organization in inode
  cgroup/misc: Store atomic64_t reads to u64
  cgroup/misc: Change counters to be explicit 64bit types
  cgroup/misc: update struct members descriptions
  cgroup: remove cgrp->kn check in css_populate_dir()
  cgroup: fix obsolete function name
  cgroup: use cached local variable parent in for loop
  cgroup: remove obsolete comment above struct cgroupstats
  cgroup: put cgroup_tryget_css() inside CONFIG_CGROUP_SCHED
  ...
2023-09-01 15:58:21 -07:00
Linus Torvalds
307d59039f media updates for v6.6-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+QmuaPwR3wnBdVwACF8+vY7k4RUFAmTxzmEACgkQCF8+vY7k
 4RWP6A/+Ljbwdoq92qOcaKAG2h2HzJa/H+xKMQwqIYjpbXnjNuFD2S9FCRfhNa9b
 Pt4K2g4lH2IJvYiJ3qhBbMxV8GPmovnHFX5LvyTFpRmrtZBAKp+TPXpbPt+a2/WL
 IPfQ0I52/c/JNqhm3fnmKgpXorp0wHYNbfY/LXztslimZj95+t0qjW62BoBmsJ3s
 hR+j/Xlgnd+9gld1OqX6OndH3mpeqDzBl4KZatQzw6yuIo8SK0ASEpu/vzgZoVy+
 WiBtbzMuta2ZghnEHbnCkurwBSU/oLXhBmXsgp+Zdy0gglSk1RBdxM+3O65OVQt3
 CCWSXMS0vGOk6JiogMpcPzO5piaUePcHEIjgAaaepTOzbKaf6PbEd9dj73LT9qcx
 4TYFtGaDDhyDU4nzKTngfNiwmYrL1h+NuG119ZLHfrdH3MT7itIaydwFJRqLC+6D
 7K6/1H2LKq25i+hRp5ZK2pgv0dAJw/nSdwFGFVgWM3Tuyt5dGdL/4SlZO4nIFKF2
 pPWJUTJJP/0t9GUtwWmCh1fdgDr0A6Zg5M2OduyhC/YkqyLuD/02Bb4aR8hzloPj
 pym+94/PFaT5S7zvKywpvyIc8U+87/M2tw+mAPN2r3i4c0RFJa7CkyKqlKTKFw13
 jw7NLLlrRbZ3a3zlhpJVqGLKgF2FlWudLUo4Y4kddWvxTMbwYXs=
 =yuz5
 -----END PGP SIGNATURE-----

Merge tag 'media/v6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media updates from Mauro Carvalho Chehab:

 - new i2c drivers: ds90ub913, ds90ub953, ds90ub960, dw9719, ds90ub913

 - new Intel IVSC MEI drivers

 - some Mediatek platform drivers were moved to a common location

 - Intel atomisp2 driver is now working with the main ov2680 driver. Due
   to that, the atomisp2 ov2680 staging one was removed

 - the bttv driver was finally converted to videobuf2 framework. This
   was the last one upstream using videobuf version 1 core. We'll likely
   remove the old videobuf framework on 6.7

 - lots of improvements at atomisp driver: it now works with normal I2C
   sensors. Several compile-mode dependecies to select between ISP2400
   and ISP2401 are now solved in runtime

 - a new ipu-bridge logic was added to work with IVSC MEI drivers

 - venus driver gained better support for new VPU versions

 - the v4l core async framework has gained lots of improvements and
   cleanups

 - lots of other cleanups, improvements and driver fixes

* tag 'media/v6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (358 commits)
  media: ivsc: Add ACPI dependency
  media: bttv: convert to vb2
  media: bttv: use audio defaults for winfast2000
  media: bttv: refactor bttv_set_dma()
  media: bttv: move vbi_skip/vbi_count out of buffer
  media: bttv: remove crop info from bttv_buffer
  media: bttv: remove tvnorm field from bttv_buffer
  media: bttv: remove format field from bttv_buffer
  media: bttv: move do_crop flag out of bttv_fh
  media: bttv: copy vbi_fmt from bttv_fh
  media: bttv: copy vid fmt/width/height from fh
  media: bttv: radio use v4l2_fh instead of bttv_fh
  media: bttv: replace BUG with WARN_ON
  media: bttv: use video_drvdata to get bttv
  media: i2c: rdacm21: Fix uninitialized value
  media: coda: Remove duplicated include
  media: vivid: fix the racy dev->radio_tx_rds_owner
  media: i2c: ccs: Check rules is non-NULL
  media: i2c: ds90ub960: Fix PLL config for 1200 MHz CSI rate
  media: i2c: ds90ub953: Fix use of uninitialized variables
  ...
2023-09-01 12:21:32 -07:00
Linus Torvalds
1c9f8dff62 Char/Misc driver changes for 6.6-rc1
Here is the big set of char/misc and other small driver subsystem
 changes for 6.6-rc1.
 
 Stuff all over the place here, lots of driver updates and changes and
 new additions.  Short summary is:
   - new IIO drivers and updates
   - Interconnect driver updates
   - fpga driver updates and additions
   - fsi driver updates
   - mei driver updates
   - coresight driver updates
   - nvmem driver updates
   - counter driver updates
   - lots of smaller misc and char driver updates and additions
 
 All of these have been in linux-next for a long time with no reported
 problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZPH64g8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynr2QCfd3RKeR+WnGzyEOFhksl30UJJhiIAoNZtYT5+
 t9KG0iMDXRuTsOqeEQbd
 =tVnk
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver updates from Greg KH:
 "Here is the big set of char/misc and other small driver subsystem
  changes for 6.6-rc1.

  Stuff all over the place here, lots of driver updates and changes and
  new additions. Short summary is:

   - new IIO drivers and updates

   - Interconnect driver updates

   - fpga driver updates and additions

   - fsi driver updates

   - mei driver updates

   - coresight driver updates

   - nvmem driver updates

   - counter driver updates

   - lots of smaller misc and char driver updates and additions

  All of these have been in linux-next for a long time with no reported
  problems"

* tag 'char-misc-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (267 commits)
  nvmem: core: Notify when a new layout is registered
  nvmem: core: Do not open-code existing functions
  nvmem: core: Return NULL when no nvmem layout is found
  nvmem: core: Create all cells before adding the nvmem device
  nvmem: u-boot-env:: Replace zero-length array with DECLARE_FLEX_ARRAY() helper
  nvmem: sec-qfprom: Add Qualcomm secure QFPROM support
  dt-bindings: nvmem: sec-qfprom: Add bindings for secure qfprom
  dt-bindings: nvmem: Add compatible for QCM2290
  nvmem: Kconfig: Fix typo "drive" -> "driver"
  nvmem: Explicitly include correct DT includes
  nvmem: add new NXP QorIQ eFuse driver
  dt-bindings: nvmem: Add t1023-sfp efuse support
  dt-bindings: nvmem: qfprom: Add compatible for MSM8226
  nvmem: uniphier: Use devm_platform_get_and_ioremap_resource()
  nvmem: qfprom: do some cleanup
  nvmem: stm32-romem: Use devm_platform_get_and_ioremap_resource()
  nvmem: rockchip-efuse: Use devm_platform_get_and_ioremap_resource()
  nvmem: meson-mx-efuse: Convert to devm_platform_ioremap_resource()
  nvmem: lpc18xx_otp: Convert to devm_platform_ioremap_resource()
  nvmem: brcm_nvram: Use devm_platform_get_and_ioremap_resource()
  ...
2023-09-01 09:53:54 -07:00
Linus Torvalds
8e1e49550d TTY/Serial driver changes for 6.6-rc1
Here is the big set of tty and serial driver changes for 6.6-rc1.
 
 Lots of cleanups in here this cycle, and some driver updates.  Short
 summary is:
   - Jiri's continued work to make the tty code and apis be a bit more
     sane with regards to modern kernel coding style and types
   - cpm_uart driver updates
   - n_gsm updates and fixes
   - meson driver updates
   - sc16is7xx driver updates
   - 8250 driver updates for different hardware types
   - qcom-geni driver fixes
   - tegra serial driver change
   - stm32 driver updates
   - synclink_gt driver cleanups
   - tty structure size reduction
 
 All of these have been in linux-next this week with no reported issues.
 The last bit of cleanups from Jiri and the tty structure size reduction
 came in last week, a bit late but as they were just style changes and
 size reductions, I figured they should get into this merge cycle so that
 others can work on top of them with no merge conflicts.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZPH+jA8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykKyACgldt6QeenTN+6dXIHS/eQHtTKZwMAn3arSeXI
 QrUUnLFjOWyoX87tbMBQ
 =LVw0
 -----END PGP SIGNATURE-----

Merge tag 'tty-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial driver updates from Greg KH:
 "Here is the big set of tty and serial driver changes for 6.6-rc1.

  Lots of cleanups in here this cycle, and some driver updates. Short
  summary is:

   - Jiri's continued work to make the tty code and apis be a bit more
     sane with regards to modern kernel coding style and types

   - cpm_uart driver updates

   - n_gsm updates and fixes

   - meson driver updates

   - sc16is7xx driver updates

   - 8250 driver updates for different hardware types

   - qcom-geni driver fixes

   - tegra serial driver change

   - stm32 driver updates

   - synclink_gt driver cleanups

   - tty structure size reduction

  All of these have been in linux-next this week with no reported
  issues. The last bit of cleanups from Jiri and the tty structure size
  reduction came in last week, a bit late but as they were just style
  changes and size reductions, I figured they should get into this merge
  cycle so that others can work on top of them with no merge conflicts"

* tag 'tty-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (199 commits)
  tty: shrink the size of struct tty_struct by 40 bytes
  tty: n_tty: deduplicate copy code in n_tty_receive_buf_real_raw()
  tty: n_tty: extract ECHO_OP processing to a separate function
  tty: n_tty: unify counts to size_t
  tty: n_tty: use u8 for chars and flags
  tty: n_tty: simplify chars_in_buffer()
  tty: n_tty: remove unsigned char casts from character constants
  tty: n_tty: move newline handling to a separate function
  tty: n_tty: move canon handling to a separate function
  tty: n_tty: use MASK() for masking out size bits
  tty: n_tty: make n_tty_data::num_overrun unsigned
  tty: n_tty: use time_is_before_jiffies() in n_tty_receive_overrun()
  tty: n_tty: use 'num' for writes' counts
  tty: n_tty: use output character directly
  tty: n_tty: make flow of n_tty_receive_buf_common() a bool
  Revert "tty: serial: meson: Add a earlycon for the T7 SoC"
  Documentation: devices.txt: Fix minors for ttyCPM*
  Documentation: devices.txt: Remove ttySIOC*
  Documentation: devices.txt: Remove ttyIOC*
  serial: 8250_bcm7271: improve bcm7271 8250 port
  ...
2023-09-01 09:38:00 -07:00
Linus Torvalds
51e7accbe8 USB / Thunderbolt / PHY driver update for 6.6-rc1
Here is the big set of USB, Thunderbolt, and PHY driver updates for
 6.6-rc1.  Included in here are:
   - PHY driver additions and cleanups
   - Thunderbolt minor additions and fixes
   - USB MIDI 2 gadget support added
   - dwc3 driver updates and additions
   - Removal of some old USB wireless code that was missed when that
     codebase was originally removed a few years ago, cleaning up some
     core USB code paths
   - USB core potential use-after-free fixes that syzbot from different
     people/groups keeps tripping over
   - typec updates and additions
   - gadget fixes and cleanups
   - loads of smaller USB core and driver cleanups all over the place
 
 Full details are in the shortlog.  All of these have been in linux-next
 for a while with no reported problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZPIAOQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yn80gCgybzMp0YnSildFetSC8lUJTnzjQcAn3KWzb75
 Zt72jxGl4ZOXHEpozG4O
 =FLrK
 -----END PGP SIGNATURE-----

Merge tag 'usb-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB / Thunderbolt / PHY driver updates from Greg KH:
 "Here is the big set of USB, Thunderbolt, and PHY driver updates for
  6.6-rc1. Included in here are:

   - PHY driver additions and cleanups

   - Thunderbolt minor additions and fixes

   - USB MIDI 2 gadget support added

   - dwc3 driver updates and additions

   - Removal of some old USB wireless code that was missed when that
     codebase was originally removed a few years ago, cleaning up some
     core USB code paths

   - USB core potential use-after-free fixes that syzbot from different
     people/groups keeps tripping over

   - typec updates and additions

   - gadget fixes and cleanups

   - loads of smaller USB core and driver cleanups all over the place

  Full details are in the shortlog. All of these have been in linux-next
  for a while with no reported problems"

* tag 'usb-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (154 commits)
  platform/chrome: cros_ec_typec: Configure Retimer cable type
  tcpm: Avoid soft reset when partner does not support get_status
  usb: typec: tcpm: reset counter when enter into unattached state after try role
  usb: typec: tcpm: set initial svdm version based on pd revision
  USB: serial: option: add FOXCONN T99W368/T99W373 product
  USB: serial: option: add Quectel EM05G variant (0x030e)
  usb: dwc2: add pci_device_id driver_data parse support
  usb: gadget: remove max support speed info in bind operation
  usb: gadget: composite: cleanup function config_ep_by_speed_and_alt()
  usb: gadget: config: remove max speed check in usb_assign_descriptors()
  usb: gadget: unconditionally allocate hs/ss descriptor in bind operation
  usb: gadget: f_uvc: change endpoint allocation in uvc_function_bind()
  usb: gadget: add a inline function gether_bitrate()
  usb: gadget: use working speed to calcaulate network bitrate and qlen
  dt-bindings: usb: samsung,exynos-dwc3: Add Exynos850 support
  usb: dwc3: exynos: Add support for Exynos850 variant
  usb: gadget: udc-xilinx: fix incorrect type in assignment warning
  usb: gadget: udc-xilinx: fix cast from restricted __le16 warning
  usb: gadget: udc-xilinx: fix restricted __le16 degrades to integer warning
  USB: dwc2: hande irq on dead controller correctly
  ...
2023-09-01 09:23:34 -07:00
Linus Torvalds
e0152e7481 RISC-V Patches for the 6.6 Merge Window, Part 1
* Support for the new "riscv,isa-extensions" and "riscv,isa-base" device
   tree interfaces for probing extensions.
 * Support for userspace access to the performance counters.
 * Support for more instructions in kprobes.
 * Crash kernels can be allocated above 4GiB.
 * Support for KCFI.
 * Support for ELFs in !MMU configurations.
 * ARCH_KMALLOC_MINALIGN has been reduced to 8.
 * mmap() defaults to sv48-sized addresses, with longer addresses hidden
   behind a hint (similar to Arm and Intel).
 * Also various fixes and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmTx96kTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiVjRD/9DYVLlkQ/OEDJjPaEcYCP49xgIVUUU
 lhs3XbSs2VNHBaiG114f6Q0AaT/uNi+uqSej3CeTmEot2kZkBk/f2yu+UNIriPZ9
 GQiZsdyXhu921C+5VFtiI47KDWOVZ+Jpy3M1ll61IWt3yPSQHr1xOP0AOiyHHqe3
 cmqpNnzjajlfVDoXPc2mGGzUJt/7ar4thcwnMNi98raXR5Qh7SP6rrHjoQhE1oFk
 LMP3CHqEAcHE2tE4CxZVpc6HOQ5m0LpQIOK7ypufGMyoIYESm5dt/JOT4MlhTtDw
 6JzyVKtiM7lartUnUaW3ZoX4trQYT5gbXxWrJ2gCnUGy3VulikoXr1Rpz0qfdeOR
 XN8OLkVAqHfTGFI7oKk24f9Adw96R5NPZcdCay90h4J/kMfCiC7ZyUUI1XIa5iy1
 np5pZCkf8HNcdywML7qcFd5n2O0wchyFnRLFZo6kJP9Ls5cEi6kBx/1jSdTcNgx/
 fUKXyoEcriGoQiiwn29+4RZnU69gJV3zqQNLPpuwDQ5F/Q1zHTlrr+dqzezKkzcO
 dRTV2d2Q4A5vIDXPptzNNLlRQdrc8qxPJ1lxQVkPIU4/mtqczmZBwlyY2u9zwPyS
 sehJgJZnoAf+jm71NgQAKLck4MUBsMnMogOWunhXkVRCoZlbbkUWX4ECZYwPKsVk
 W7zVPmLvSM0l5g==
 =/tXb
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.6-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for the new "riscv,isa-extensions" and "riscv,isa-base"
   device tree interfaces for probing extensions

 - Support for userspace access to the performance counters

 - Support for more instructions in kprobes

 - Crash kernels can be allocated above 4GiB

 - Support for KCFI

 - Support for ELFs in !MMU configurations

 - ARCH_KMALLOC_MINALIGN has been reduced to 8

 - mmap() defaults to sv48-sized addresses, with longer addresses hidden
   behind a hint (similar to Arm and Intel)

 - Also various fixes and cleanups

* tag 'riscv-for-linus-6.6-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (51 commits)
  lib/Kconfig.debug: Restrict DEBUG_INFO_SPLIT for RISC-V
  riscv: support PREEMPT_DYNAMIC with static keys
  riscv: Move create_tmp_mapping() to init sections
  riscv: Mark KASAN tmp* page tables variables as static
  riscv: mm: use bitmap_zero() API
  riscv: enable DEBUG_FORCE_FUNCTION_ALIGN_64B
  riscv: remove redundant mv instructions
  RISC-V: mm: Document mmap changes
  RISC-V: mm: Update pgtable comment documentation
  RISC-V: mm: Add tests for RISC-V mm
  RISC-V: mm: Restrict address space for sv39,sv48,sv57
  riscv: enable DMA_BOUNCE_UNALIGNED_KMALLOC for !dma_coherent
  riscv: allow kmalloc() caches aligned to the smallest value
  riscv: support the elf-fdpic binfmt loader
  binfmt_elf_fdpic: support 64-bit systems
  riscv: Allow CONFIG_CFI_CLANG to be selected
  riscv/purgatory: Disable CFI
  riscv: Add CFI error handling
  riscv: Add ftrace_stub_graph
  riscv: Add types to indirectly called assembly functions
  ...
2023-09-01 08:09:48 -07:00
Linus Torvalds
4ad0a4c234 powerpc updates for 6.6
- Add HOTPLUG_SMT support (/sys/devices/system/cpu/smt) and honour the
    configured SMT state when hotplugging CPUs into the system.
 
  - Combine final TLB flush and lazy TLB mm shootdown IPIs when using the Radix
    MMU to avoid a broadcast TLBIE flush on exit.
 
  - Drop the exclusion between ptrace/perf watchpoints, and drop the now unused
    associated arch hooks.
 
  - Add support for the "nohlt" command line option to disable CPU idle.
 
  - Add support for -fpatchable-function-entry for ftrace, with GCC >= 13.1.
 
  - Rework memory block size determination, and support 256MB size on systems
    with GPUs that have hotpluggable memory.
 
  - Various other small features and fixes.
 
 Thanks to: Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Athira Rajeev,
 Benjamin Gray, Christophe Leroy, Frederic Barrat, Gautam Menghani, Geoff Levand,
 Hari Bathini, Immad Mir, Jialin Zhang, Joel Stanley, Jordan Niethe, Justin
 Stitt, Kajol Jain, Kees Cook, Krzysztof Kozlowski, Laurent Dufour, Liang He,
 Linus Walleij, Mahesh Salgaonkar, Masahiro Yamada, Michal Suchanek, Nageswara
 R Sastry, Nathan Chancellor, Nathan Lynch, Naveen N Rao, Nicholas Piggin, Nick
 Desaulniers, Omar Sandoval, Randy Dunlap, Reza Arbab, Rob Herring, Russell
 Currey, Sourabh Jain, Thomas Gleixner, Trevor Woerner, Uwe Kleine-König, Vaibhav
 Jain, Xiongfeng Wang, Yuan Tan, Zhang Rui, Zheng Zengkai.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmTwgbwTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgFmpD/432vipeoqvkAYsyK0xi/Y3GcY0wcyd
 WJApLXXadEbtKQrgXQ6sowWqalg5thYnQCRarg/tXKK/po3KfgwkPjGDpOL+cIdr
 12QVN2XJm9VmJ1wYJxzk+yXx4F43AdmMdr94qWAGufbTHezwb4UpzVR1NxtFrOE/
 X5TNsC2+2mdZY/ZaNHS5vsTIFv3EhQfqgjZPlIAdLn6CGc8xWT514Q/uHA8+ytM/
 HL7Hqs33DoPSvgTa5TT/2E0d0k5nO3P5KObzAjpYlireTPaBi51mpKGewcrtm0o2
 v3cBlbfx3C7pe9ZhKBK9BH8cjynfiqsVZ9/lCw/7eBNdm9tHuzG0jeS7Db9tCZXS
 fM7G2R7SoIusPTqxlBmkU5DpYslwrHiVgCyy3ijxkoA/fakVwh/GgTcMsRt73IY6
 n6DsUvWwuYHCIeIiHmHQJqCqCRtV+aMzU3AbbBHOjtdIanhlW16M686dEsgCirh7
 akRVRD5VqKaqXs34PpkRL89Xv3wZRjl6XZ3hZFfCjSYXfpXDXhgSToIskpHYhKL8
 gpY7WtG9YQP05Xz5HRCx6EluaZVeKe0lZi6fezX7Mi9AygJQO8FfXqP1mHBlEq40
 ThWtvL9D89RV6lADqqFN20XepgvKNOyAXcE4szvsnIZYUSPmZQZSPxx+DHtROaLP
 jX3ifxtxJp92pQ==
 =5g7K
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Add HOTPLUG_SMT support (/sys/devices/system/cpu/smt) and honour the
   configured SMT state when hotplugging CPUs into the system

 - Combine final TLB flush and lazy TLB mm shootdown IPIs when using the
   Radix MMU to avoid a broadcast TLBIE flush on exit

 - Drop the exclusion between ptrace/perf watchpoints, and drop the now
   unused associated arch hooks

 - Add support for the "nohlt" command line option to disable CPU idle

 - Add support for -fpatchable-function-entry for ftrace, with GCC >=
   13.1

 - Rework memory block size determination, and support 256MB size on
   systems with GPUs that have hotpluggable memory

 - Various other small features and fixes

Thanks to Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Athira
Rajeev, Benjamin Gray, Christophe Leroy, Frederic Barrat, Gautam
Menghani, Geoff Levand, Hari Bathini, Immad Mir, Jialin Zhang, Joel
Stanley, Jordan Niethe, Justin Stitt, Kajol Jain, Kees Cook, Krzysztof
Kozlowski, Laurent Dufour, Liang He, Linus Walleij, Mahesh Salgaonkar,
Masahiro Yamada, Michal Suchanek, Nageswara R Sastry, Nathan Chancellor,
Nathan Lynch, Naveen N Rao, Nicholas Piggin, Nick Desaulniers, Omar
Sandoval, Randy Dunlap, Reza Arbab, Rob Herring, Russell Currey, Sourabh
Jain, Thomas Gleixner, Trevor Woerner, Uwe Kleine-König, Vaibhav Jain,
Xiongfeng Wang, Yuan Tan, Zhang Rui, and Zheng Zengkai.

* tag 'powerpc-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (135 commits)
  macintosh/ams: linux/platform_device.h is needed
  powerpc/xmon: Reapply "Relax frame size for clang"
  powerpc/mm/book3s64: Use 256M as the upper limit with coherent device memory attached
  powerpc/mm/book3s64: Fix build error with SPARSEMEM disabled
  powerpc/iommu: Fix notifiers being shared by PCI and VIO buses
  powerpc/mpc5xxx: Add missing fwnode_handle_put()
  powerpc/config: Disable SLAB_DEBUG_ON in skiroot
  powerpc/pseries: Remove unused hcall tracing instruction
  powerpc/pseries: Fix hcall tracepoints with JUMP_LABEL=n
  powerpc: dts: add missing space before {
  powerpc/eeh: Use pci_dev_id() to simplify the code
  powerpc/64s: Move CPU -mtune options into Kconfig
  powerpc/powermac: Fix unused function warning
  powerpc/pseries: Rework lppaca_shared_proc() to avoid DEBUG_PREEMPT
  powerpc: Don't include lppaca.h in paca.h
  powerpc/pseries: Move hcall_vphn() prototype into vphn.h
  powerpc/pseries: Move VPHN constants into vphn.h
  cxl: Drop unused detach_spa()
  powerpc: Drop zalloc_maybe_bootmem()
  powerpc/powernv: Use struct opal_prd_msg in more places
  ...
2023-08-31 12:43:10 -07:00
Palmer Dabbelt
9389e6715f
Merge patch series "support allocating crashkernel above 4G explicitly on riscv"
Chen Jiahao <chenjiahao16@huawei.com> says:

On riscv, the current crash kernel allocation logic is trying to
allocate within 32bit addressible memory region by default, if
failed, try to allocate without 4G restriction.

In need of saving DMA zone memory while allocating a relatively large
crash kernel region, allocating the reserved memory top down in
high memory, without overlapping the DMA zone, is a mature solution.
Hence this patchset introduces the parameter option crashkernel=X,[high,low].

One can reserve the crash kernel from high memory above DMA zone range
by explicitly passing "crashkernel=X,high"; or reserve a memory range
below 4G with "crashkernel=X,low". Besides, there are few rules need
to take notice:
1. "crashkernel=X,[high,low]" will be ignored if "crashkernel=size"
   is specified.
2. "crashkernel=X,low" is valid only when "crashkernel=X,high" is passed
   and there is enough memory to be allocated under 4G.
3. When allocating crashkernel above 4G and no "crashkernel=X,low" is
   specified, a 128M low memory will be allocated automatically for
   swiotlb bounce buffer.
See Documentation/admin-guide/kernel-parameters.txt for more information.

To verify loading the crashkernel, adapted kexec-tools is attached below:
https://github.com/chenjh005/kexec-tools/tree/build-test-riscv-v2

Following test cases have been performed as expected:
1) crashkernel=256M                          //low=256M
2) crashkernel=1G                            //low=1G
3) crashkernel=4G                            //high=4G, low=128M(default)
4) crashkernel=4G crashkernel=256M,high      //high=4G, low=128M(default), high is ignored
5) crashkernel=4G crashkernel=256M,low       //high=4G, low=128M(default), low is ignored
6) crashkernel=4G,high                       //high=4G, low=128M(default)
7) crashkernel=256M,low                      //low=0M, invalid
8) crashkernel=4G,high crashkernel=256M,low  //high=4G, low=256M
9) crashkernel=4G,high crashkernel=4G,low    //high=0M, low=0M, invalid
10) crashkernel=512M@0xd0000000              //low=512M
11) crashkernel=1G,high crashkernel=0M,low   //high=1G, low=0M

* b4-shazam-merge:
  docs: kdump: Update the crashkernel description for riscv
  riscv: kdump: Implement crashkernel=X,[high,low]

Link: https://lore.kernel.org/r/20230726175000.2536220-1-chenjiahao16@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-08-31 00:18:28 -07:00
Linus Torvalds
cd99b9eb4b Documentation work keeps chugging along; stuff for 6.6 includes:
- Work from Carlos Bilbao to integrate rustdoc output into the generated
   HTML documentation.  This took some work to figure out how to do it
   without slowing the docs build and without creating people who don't have
   Rust installed, but Carlos got there.
 
 - Move the loongarch and mips architecture documentation under
   Documentation/arch/.
 
 - Some more maintainer documentation from Jakub
 
 ...plus the usual assortment of updates, translations, and fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmTvqNkPHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5YgIgH/3drfLtlFtzLqDOzrzDXS8yGnE3pPdxw796b
 /ZFzAK16wYKaKevYoIz8bVGGKaE1sEUW0mhlq4KGdfZuxLG8YnWS8URyCW4FDU2E
 6qNL+8oJ8LZfID46f9Q8ZgfEz7yF/mhCqPk7MEswYtwbscs2ZTGCTGYB/5BHlBuT
 LR+M89uLmHgr8S1o24v30OgiX+VvQFyu0xoxIhbiqUZvBd/XdfX2pgYd9BGzMj5q
 C2ZP+V14g36c5pV0EO9TwhCXOF/WVrp7DbjbfWAsqBSLxvpXPydH2q1DUzGeQtP1
 exujrBD1O8q3pPdaNA5R+h6cWlHmUZug9mE4BRLp9ErGrozwJsQ=
 =C3Uv
 -----END PGP SIGNATURE-----

Merge tag 'docs-6.6' of git://git.lwn.net/linux

Pull documentation updates from Jonathan Corbet:
 "Documentation work keeps chugging along; this includes:

   - Work from Carlos Bilbao to integrate rustdoc output into the
     generated HTML documentation. This took some work to figure out how
     to do it without slowing the docs build and without creating people
     who don't have Rust installed, but Carlos got there

   - Move the loongarch and mips architecture documentation under
     Documentation/arch/

   - Some more maintainer documentation from Jakub

  ... plus the usual assortment of updates, translations, and fixes"

* tag 'docs-6.6' of git://git.lwn.net/linux: (56 commits)
  Docu: genericirq.rst: fix irq-example
  input: docs: pxrc: remove reference to phoenix-sim
  Documentation: serial-console: Fix literal block marker
  docs/mm: remove references to hmm_mirror ops and clean typos
  docs/zh_CN: correct regi_chg(),regi_add() to region_chg(),region_add()
  Documentation: Fix typos
  Documentation/ABI: Fix typos
  scripts: kernel-doc: fix macro handling in enums
  scripts: kernel-doc: parse DEFINE_DMA_UNMAP_[ADDR|LEN]
  Documentation: riscv: Update boot image header since EFI stub is supported
  Documentation: riscv: Add early boot document
  Documentation: arm: Add bootargs to the table of added DT parameters
  docs: kernel-parameters: Refer to the correct bitmap function
  doc: update params of memhp_default_state=
  docs: Add book to process/kernel-docs.rst
  docs: sparse: fix invalid link addresses
  docs: vfs: clean up after the iterate() removal
  docs: Add a section on surveys to the researcher guidelines
  docs: move mips under arch
  docs: move loongarch under arch
  ...
2023-08-30 20:05:42 -07:00
Linus Torvalds
6c1b980a7e dma-maping updates for Linux 6.6
- allow dynamic sizing of the swiotlb buffer, to cater for secure
    virtualization workloads that require all I/O to be bounce buffered
    (Petr Tesarik)
  - move a declaration to a header (Arnd Bergmann)
  - check for memory region overlap in dma-contiguous (Binglei Wang)
  - remove the somewhat dangerous runtime swiotlb-xen enablement and
    unexport is_swiotlb_active (Christoph Hellwig, Juergen Gross)
  - per-node CMA improvements (Yajun Deng)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmTuDHkLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYOqvhAApMk2/ceTgVH17sXaKE822+xKvgv377O6TlggMeGG
 W4zA0KD69DNz0AfaaCc5U5f7n8Ld/YY1RsvkHW4b3jgw+KRTeQr0jjitBgP5kP2M
 A1+qxdyJpCTwiPt9s2+JFVPeyZ0s52V6OJODKRG3s0ore55R+U09VySKtASON+q3
 GMKfWqQteKC+thg7NkrQ7JUixuo84oICws+rZn4K9ifsX2O0HYW6aMW0feRfZjJH
 r0TgqZc4RdPTSaF22oapR9Ls39+7hp/pBvoLm5sBNA3cl5C3X4VWo9ERMU1jW9h+
 VYQv39NycUspgskWJmpbU06/+ooYqQlwHSR/vdNusmFIvxo4tf6/UX72YO5F8Dar
 ap0wYGauiEwTjSnhVxPTXk3obWyWEsgFAeRnPdTlH2CNmv38QZU2HLb8eU1pcXxX
 j+WI2Ewy9z22uBVYiPOKpdW1jkSfmlmfPp/8SbAdua7I3YQ90rQN6AvU06zAi/cL
 NQTgO81E4jPkygqAVgS/LeYziWAQ73yM7m9ExThtTgqFtHortwhJ4Fd8XKtvtvEb
 viXAZ/WZtQBv/CIKAW98NhgIDP/SPOT8ym6V35WK+kkNFMS6LMSQUfl9GgbHGyFa
 n9icMm7BmbDtT1+AKNafG9En4DtAf9M9QNidAVOyfrsIk6S0gZoZwvIStkA7on8a
 cNY=
 =kVVr
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-6.6-2023-08-29' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-maping updates from Christoph Hellwig:

 - allow dynamic sizing of the swiotlb buffer, to cater for secure
   virtualization workloads that require all I/O to be bounce buffered
   (Petr Tesarik)

 - move a declaration to a header (Arnd Bergmann)

 - check for memory region overlap in dma-contiguous (Binglei Wang)

 - remove the somewhat dangerous runtime swiotlb-xen enablement and
   unexport is_swiotlb_active (Christoph Hellwig, Juergen Gross)

 - per-node CMA improvements (Yajun Deng)

* tag 'dma-mapping-6.6-2023-08-29' of git://git.infradead.org/users/hch/dma-mapping:
  swiotlb: optimize get_max_slots()
  swiotlb: move slot allocation explanation comment where it belongs
  swiotlb: search the software IO TLB only if the device makes use of it
  swiotlb: allocate a new memory pool when existing pools are full
  swiotlb: determine potential physical address limit
  swiotlb: if swiotlb is full, fall back to a transient memory pool
  swiotlb: add a flag whether SWIOTLB is allowed to grow
  swiotlb: separate memory pool data from other allocator data
  swiotlb: add documentation and rename swiotlb_do_find_slots()
  swiotlb: make io_tlb_default_mem local to swiotlb.c
  swiotlb: bail out of swiotlb_init_late() if swiotlb is already allocated
  dma-contiguous: check for memory region overlap
  dma-contiguous: support numa CMA for specified node
  dma-contiguous: support per-numa CMA for all architectures
  dma-mapping: move arch_dma_set_mask() declaration to header
  swiotlb: unexport is_swiotlb_active
  x86: always initialize xen-swiotlb when xen-pcifront is enabling
  xen/pci: add flag for PCI passthrough being possible
2023-08-29 20:32:10 -07:00
Linus Torvalds
d68b4b6f30 - An extensive rework of kexec and crash Kconfig from Eric DeVolder
("refactor Kconfig to consolidate KEXEC and CRASH options").
 
 - kernel.h slimming work from Andy Shevchenko ("kernel.h: Split out a
   couple of macros to args.h").
 
 - gdb feature work from Kuan-Ying Lee ("Add GDB memory helper
   commands").
 
 - vsprintf inclusion rationalization from Andy Shevchenko
   ("lib/vsprintf: Rework header inclusions").
 
 - Switch the handling of kdump from a udev scheme to in-kernel handling,
   by Eric DeVolder ("crash: Kernel handling of CPU and memory hot
   un/plug").
 
 - Many singleton patches to various parts of the tree
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZO2GpAAKCRDdBJ7gKXxA
 juW3AQD1moHzlSN6x9I3tjm5TWWNYFoFL8af7wXDJspp/DWH/AD/TO0XlWWhhbYy
 QHy7lL0Syha38kKLMXTM+bN6YQHi9AU=
 =WJQa
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - An extensive rework of kexec and crash Kconfig from Eric DeVolder
   ("refactor Kconfig to consolidate KEXEC and CRASH options")

 - kernel.h slimming work from Andy Shevchenko ("kernel.h: Split out a
   couple of macros to args.h")

 - gdb feature work from Kuan-Ying Lee ("Add GDB memory helper
   commands")

 - vsprintf inclusion rationalization from Andy Shevchenko
   ("lib/vsprintf: Rework header inclusions")

 - Switch the handling of kdump from a udev scheme to in-kernel
   handling, by Eric DeVolder ("crash: Kernel handling of CPU and memory
   hot un/plug")

 - Many singleton patches to various parts of the tree

* tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (81 commits)
  document while_each_thread(), change first_tid() to use for_each_thread()
  drivers/char/mem.c: shrink character device's devlist[] array
  x86/crash: optimize CPU changes
  crash: change crash_prepare_elf64_headers() to for_each_possible_cpu()
  crash: hotplug support for kexec_load()
  x86/crash: add x86 crash hotplug support
  crash: memory and CPU hotplug sysfs attributes
  kexec: exclude elfcorehdr from the segment digest
  crash: add generic infrastructure for crash hotplug support
  crash: move a few code bits to setup support of crash hotplug
  kstrtox: consistently use _tolower()
  kill do_each_thread()
  nilfs2: fix WARNING in mark_buffer_dirty due to discarded buffer reuse
  scripts/bloat-o-meter: count weak symbol sizes
  treewide: drop CONFIG_EMBEDDED
  lockdep: fix static memory detection even more
  lib/vsprintf: declare no_hash_pointers in sprintf.h
  lib/vsprintf: split out sprintf() and friends
  kernel/fork: stop playing lockless games for exe_file replacement
  adfs: delete unused "union adfs_dirtail" definition
  ...
2023-08-29 14:53:51 -07:00
Linus Torvalds
b96a3e9142 - Some swap cleanups from Ma Wupeng ("fix WARN_ON in add_to_avail_list")
- Peter Xu has a series (mm/gup: Unify hugetlb, speed up thp") which
   reduces the special-case code for handling hugetlb pages in GUP.  It
   also speeds up GUP handling of transparent hugepages.
 
 - Peng Zhang provides some maple tree speedups ("Optimize the fast path
   of mas_store()").
 
 - Sergey Senozhatsky has improved te performance of zsmalloc during
   compaction (zsmalloc: small compaction improvements").
 
 - Domenico Cerasuolo has developed additional selftest code for zswap
   ("selftests: cgroup: add zswap test program").
 
 - xu xin has doe some work on KSM's handling of zero pages.  These
   changes are mainly to enable the user to better understand the
   effectiveness of KSM's treatment of zero pages ("ksm: support tracking
   KSM-placed zero-pages").
 
 - Jeff Xu has fixes the behaviour of memfd's
   MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED sysctl ("mm/memfd: fix sysctl
   MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED").
 
 - David Howells has fixed an fscache optimization ("mm, netfs, fscache:
   Stop read optimisation when folio removed from pagecache").
 
 - Axel Rasmussen has given userfaultfd the ability to simulate memory
   poisoning ("add UFFDIO_POISON to simulate memory poisoning with UFFD").
 
 - Miaohe Lin has contributed some routine maintenance work on the
   memory-failure code ("mm: memory-failure: remove unneeded PageHuge()
   check").
 
 - Peng Zhang has contributed some maintenance work on the maple tree
   code ("Improve the validation for maple tree and some cleanup").
 
 - Hugh Dickins has optimized the collapsing of shmem or file pages into
   THPs ("mm: free retracted page table by RCU").
 
 - Jiaqi Yan has a patch series which permits us to use the healthy
   subpages within a hardware poisoned huge page for general purposes
   ("Improve hugetlbfs read on HWPOISON hugepages").
 
 - Kemeng Shi has done some maintenance work on the pagetable-check code
   ("Remove unused parameters in page_table_check").
 
 - More folioification work from Matthew Wilcox ("More filesystem folio
   conversions for 6.6"), ("Followup folio conversions for zswap").  And
   from ZhangPeng ("Convert several functions in page_io.c to use a
   folio").
 
 - page_ext cleanups from Kemeng Shi ("minor cleanups for page_ext").
 
 - Baoquan He has converted some architectures to use the GENERIC_IOREMAP
   ioremap()/iounmap() code ("mm: ioremap: Convert architectures to take
   GENERIC_IOREMAP way").
 
 - Anshuman Khandual has optimized arm64 tlb shootdown ("arm64: support
   batched/deferred tlb shootdown during page reclamation/migration").
 
 - Better maple tree lockdep checking from Liam Howlett ("More strict
   maple tree lockdep").  Liam also developed some efficiency improvements
   ("Reduce preallocations for maple tree").
 
 - Cleanup and optimization to the secondary IOMMU TLB invalidation, from
   Alistair Popple ("Invalidate secondary IOMMU TLB on permission
   upgrade").
 
 - Ryan Roberts fixes some arm64 MM selftest issues ("selftests/mm fixes
   for arm64").
 
 - Kemeng Shi provides some maintenance work on the compaction code ("Two
   minor cleanups for compaction").
 
 - Some reduction in mmap_lock pressure from Matthew Wilcox ("Handle most
   file-backed faults under the VMA lock").
 
 - Aneesh Kumar contributes code to use the vmemmap optimization for DAX
   on ppc64, under some circumstances ("Add support for DAX vmemmap
   optimization for ppc64").
 
 - page-ext cleanups from Kemeng Shi ("add page_ext_data to get client
   data in page_ext"), ("minor cleanups to page_ext header").
 
 - Some zswap cleanups from Johannes Weiner ("mm: zswap: three
   cleanups").
 
 - kmsan cleanups from ZhangPeng ("minor cleanups for kmsan").
 
 - VMA handling cleanups from Kefeng Wang ("mm: convert to
   vma_is_initial_heap/stack()").
 
 - DAMON feature work from SeongJae Park ("mm/damon/sysfs-schemes:
   implement DAMOS tried total bytes file"), ("Extend DAMOS filters for
   address ranges and DAMON monitoring targets").
 
 - Compaction work from Kemeng Shi ("Fixes and cleanups to compaction").
 
 - Liam Howlett has improved the maple tree node replacement code
   ("maple_tree: Change replacement strategy").
 
 - ZhangPeng has a general code cleanup - use the K() macro more widely
   ("cleanup with helper macro K()").
 
 - Aneesh Kumar brings memmap-on-memory to ppc64 ("Add support for memmap
   on memory feature on ppc64").
 
 - pagealloc cleanups from Kemeng Shi ("Two minor cleanups for pcp list
   in page_alloc"), ("Two minor cleanups for get pageblock migratetype").
 
 - Vishal Moola introduces a memory descriptor for page table tracking,
   "struct ptdesc" ("Split ptdesc from struct page").
 
 - memfd selftest maintenance work from Aleksa Sarai ("memfd: cleanups
   for vm.memfd_noexec").
 
 - MM include file rationalization from Hugh Dickins ("arch: include
   asm/cacheflush.h in asm/hugetlb.h").
 
 - THP debug output fixes from Hugh Dickins ("mm,thp: fix sloppy text
   output").
 
 - kmemleak improvements from Xiaolei Wang ("mm/kmemleak: use
   object_cache instead of kmemleak_initialized").
 
 - More folio-related cleanups from Matthew Wilcox ("Remove _folio_dtor
   and _folio_order").
 
 - A VMA locking scalability improvement from Suren Baghdasaryan
   ("Per-VMA lock support for swap and userfaults").
 
 - pagetable handling cleanups from Matthew Wilcox ("New page table range
   API").
 
 - A batch of swap/thp cleanups from David Hildenbrand ("mm/swap: stop
   using page->private on tail pages for THP_SWAP + cleanups").
 
 - Cleanups and speedups to the hugetlb fault handling from Matthew
   Wilcox ("Change calling convention for ->huge_fault").
 
 - Matthew Wilcox has also done some maintenance work on the MM subsystem
   documentation ("Improve mm documentation").
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZO1JUQAKCRDdBJ7gKXxA
 jrMwAP47r/fS8vAVT3zp/7fXmxaJYTK27CTAM881Gw1SDhFM/wEAv8o84mDenCg6
 Nfio7afS1ncD+hPYT8947UnLxTgn+ww=
 =Afws
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2023-08-28-18-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - Some swap cleanups from Ma Wupeng ("fix WARN_ON in
   add_to_avail_list")

 - Peter Xu has a series (mm/gup: Unify hugetlb, speed up thp") which
   reduces the special-case code for handling hugetlb pages in GUP. It
   also speeds up GUP handling of transparent hugepages.

 - Peng Zhang provides some maple tree speedups ("Optimize the fast path
   of mas_store()").

 - Sergey Senozhatsky has improved te performance of zsmalloc during
   compaction (zsmalloc: small compaction improvements").

 - Domenico Cerasuolo has developed additional selftest code for zswap
   ("selftests: cgroup: add zswap test program").

 - xu xin has doe some work on KSM's handling of zero pages. These
   changes are mainly to enable the user to better understand the
   effectiveness of KSM's treatment of zero pages ("ksm: support
   tracking KSM-placed zero-pages").

 - Jeff Xu has fixes the behaviour of memfd's
   MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED sysctl ("mm/memfd: fix sysctl
   MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED").

 - David Howells has fixed an fscache optimization ("mm, netfs, fscache:
   Stop read optimisation when folio removed from pagecache").

 - Axel Rasmussen has given userfaultfd the ability to simulate memory
   poisoning ("add UFFDIO_POISON to simulate memory poisoning with
   UFFD").

 - Miaohe Lin has contributed some routine maintenance work on the
   memory-failure code ("mm: memory-failure: remove unneeded PageHuge()
   check").

 - Peng Zhang has contributed some maintenance work on the maple tree
   code ("Improve the validation for maple tree and some cleanup").

 - Hugh Dickins has optimized the collapsing of shmem or file pages into
   THPs ("mm: free retracted page table by RCU").

 - Jiaqi Yan has a patch series which permits us to use the healthy
   subpages within a hardware poisoned huge page for general purposes
   ("Improve hugetlbfs read on HWPOISON hugepages").

 - Kemeng Shi has done some maintenance work on the pagetable-check code
   ("Remove unused parameters in page_table_check").

 - More folioification work from Matthew Wilcox ("More filesystem folio
   conversions for 6.6"), ("Followup folio conversions for zswap"). And
   from ZhangPeng ("Convert several functions in page_io.c to use a
   folio").

 - page_ext cleanups from Kemeng Shi ("minor cleanups for page_ext").

 - Baoquan He has converted some architectures to use the
   GENERIC_IOREMAP ioremap()/iounmap() code ("mm: ioremap: Convert
   architectures to take GENERIC_IOREMAP way").

 - Anshuman Khandual has optimized arm64 tlb shootdown ("arm64: support
   batched/deferred tlb shootdown during page reclamation/migration").

 - Better maple tree lockdep checking from Liam Howlett ("More strict
   maple tree lockdep"). Liam also developed some efficiency
   improvements ("Reduce preallocations for maple tree").

 - Cleanup and optimization to the secondary IOMMU TLB invalidation,
   from Alistair Popple ("Invalidate secondary IOMMU TLB on permission
   upgrade").

 - Ryan Roberts fixes some arm64 MM selftest issues ("selftests/mm fixes
   for arm64").

 - Kemeng Shi provides some maintenance work on the compaction code
   ("Two minor cleanups for compaction").

 - Some reduction in mmap_lock pressure from Matthew Wilcox ("Handle
   most file-backed faults under the VMA lock").

 - Aneesh Kumar contributes code to use the vmemmap optimization for DAX
   on ppc64, under some circumstances ("Add support for DAX vmemmap
   optimization for ppc64").

 - page-ext cleanups from Kemeng Shi ("add page_ext_data to get client
   data in page_ext"), ("minor cleanups to page_ext header").

 - Some zswap cleanups from Johannes Weiner ("mm: zswap: three
   cleanups").

 - kmsan cleanups from ZhangPeng ("minor cleanups for kmsan").

 - VMA handling cleanups from Kefeng Wang ("mm: convert to
   vma_is_initial_heap/stack()").

 - DAMON feature work from SeongJae Park ("mm/damon/sysfs-schemes:
   implement DAMOS tried total bytes file"), ("Extend DAMOS filters for
   address ranges and DAMON monitoring targets").

 - Compaction work from Kemeng Shi ("Fixes and cleanups to compaction").

 - Liam Howlett has improved the maple tree node replacement code
   ("maple_tree: Change replacement strategy").

 - ZhangPeng has a general code cleanup - use the K() macro more widely
   ("cleanup with helper macro K()").

 - Aneesh Kumar brings memmap-on-memory to ppc64 ("Add support for
   memmap on memory feature on ppc64").

 - pagealloc cleanups from Kemeng Shi ("Two minor cleanups for pcp list
   in page_alloc"), ("Two minor cleanups for get pageblock
   migratetype").

 - Vishal Moola introduces a memory descriptor for page table tracking,
   "struct ptdesc" ("Split ptdesc from struct page").

 - memfd selftest maintenance work from Aleksa Sarai ("memfd: cleanups
   for vm.memfd_noexec").

 - MM include file rationalization from Hugh Dickins ("arch: include
   asm/cacheflush.h in asm/hugetlb.h").

 - THP debug output fixes from Hugh Dickins ("mm,thp: fix sloppy text
   output").

 - kmemleak improvements from Xiaolei Wang ("mm/kmemleak: use
   object_cache instead of kmemleak_initialized").

 - More folio-related cleanups from Matthew Wilcox ("Remove _folio_dtor
   and _folio_order").

 - A VMA locking scalability improvement from Suren Baghdasaryan
   ("Per-VMA lock support for swap and userfaults").

 - pagetable handling cleanups from Matthew Wilcox ("New page table
   range API").

 - A batch of swap/thp cleanups from David Hildenbrand ("mm/swap: stop
   using page->private on tail pages for THP_SWAP + cleanups").

 - Cleanups and speedups to the hugetlb fault handling from Matthew
   Wilcox ("Change calling convention for ->huge_fault").

 - Matthew Wilcox has also done some maintenance work on the MM
   subsystem documentation ("Improve mm documentation").

* tag 'mm-stable-2023-08-28-18-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (489 commits)
  maple_tree: shrink struct maple_tree
  maple_tree: clean up mas_wr_append()
  secretmem: convert page_is_secretmem() to folio_is_secretmem()
  nios2: fix flush_dcache_page() for usage from irq context
  hugetlb: add documentation for vma_kernel_pagesize()
  mm: add orphaned kernel-doc to the rst files.
  mm: fix clean_record_shared_mapping_range kernel-doc
  mm: fix get_mctgt_type() kernel-doc
  mm: fix kernel-doc warning from tlb_flush_rmaps()
  mm: remove enum page_entry_size
  mm: allow ->huge_fault() to be called without the mmap_lock held
  mm: move PMD_ORDER to pgtable.h
  mm: remove checks for pte_index
  memcg: remove duplication detection for mem_cgroup_uncharge_swap
  mm/huge_memory: work on folio->swap instead of page->private when splitting folio
  mm/swap: inline folio_set_swap_entry() and folio_swap_entry()
  mm/swap: use dedicated entry for swap in folio
  mm/swap: stop using page->private on tail pages for THP_SWAP
  selftests/mm: fix WARNING comparing pointer to 0
  selftests: cgroup: fix test_kmem_memcg_deletion kernel mem check
  ...
2023-08-29 14:25:26 -07:00
Linus Torvalds
f2586d921c Hi,
Contents:
 
 - Restrict linking of keys to .ima and .evm keyrings based on
   digitalSignature attribute in the certificate.
 - PowerVM: load machine owner keys into the .machine [1] keyring.
 - PowerVM: load module signing keys into the secondary trusted keyring
   (keys blessed by the vendor).
 - tpm_tis_spi: half-duplex transfer mode
 - tpm_tis: retry corrupted transfers
 - Apply revocation list (.mokx) to an all system keyrings (e.g. .machine
   keyring).
 
 [1] https://blogs.oracle.com/linux/post/the-machine-keyring
 
 BR, Jarkko
 -----BEGIN PGP SIGNATURE-----
 
 iIgEABYIADAWIQRE6pSOnaBC00OEHEIaerohdGur0gUCZN5/qBIcamFya2tvQGtl
 cm5lbC5vcmcACgkQGnq6IXRrq9J4GQEAstTtQfGGrx5KInOTMWOvaq/Cum5iW4AD
 NefVfbUtCCQBANvFtxoPYQS5u6+rIdxzIwFiNUlOyt2uR2bkk4UUiPML
 =Vvs8
 -----END PGP SIGNATURE-----

Merge tag 'tpmdd-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd

Pull tpm updates from Jarkko Sakkinen:

 - Restrict linking of keys to .ima and .evm keyrings based on
   digitalSignature attribute in the certificate

 - PowerVM: load machine owner keys into the .machine [1] keyring

 - PowerVM: load module signing keys into the secondary trusted keyring
   (keys blessed by the vendor)

 - tpm_tis_spi: half-duplex transfer mode

 - tpm_tis: retry corrupted transfers

 - Apply revocation list (.mokx) to an all system keyrings (e.g.
   .machine keyring)

Link: https://blogs.oracle.com/linux/post/the-machine-keyring [1]

* tag 'tpmdd-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
  certs: Reference revocation list for all keyrings
  tpm/tpm_tis_synquacer: Use module_platform_driver macro to simplify the code
  tpm: remove redundant variable len
  tpm_tis: Resend command to recover from data transfer errors
  tpm_tis: Use responseRetry to recover from data transfer errors
  tpm_tis: Move CRC check to generic send routine
  tpm_tis_spi: Add hardware wait polling
  KEYS: Replace all non-returning strlcpy with strscpy
  integrity: PowerVM support for loading third party code signing keys
  integrity: PowerVM machine keyring enablement
  integrity: check whether imputed trust is enabled
  integrity: remove global variable from machine_keyring.c
  integrity: ignore keys failing CA restrictions on non-UEFI platform
  integrity: PowerVM support for loading CA keys on machine keyring
  integrity: Enforce digitalSignature usage in the ima and evm keyrings
  KEYS: DigitalSignature link restriction
  tpm_tis: Revert "tpm_tis: Disable interrupts on ThinkPad T490s"
2023-08-29 08:05:18 -07:00
Linus Torvalds
330235e874 ACPI updates for 6.6-rc1
- Update the ACPICA code in the kernel to upstream revision 20230628
    including the following changes:
    * Suppress a GCC 12 dangling-pointer warning (Philip Prindeville).
    * Reformat the ACPI_STATE_COMMON macro and its users (George Guo).
    * Replace the ternary operator with ACPI_MIN() (Jiangshan Yi).
    * Add support for _DSC as per ACPI 6.5 (Saket Dumbre).
    * Remove a duplicate macro from zephyr header (Najumon B.A).
    * Add data structures for GED and _EVT tracking (Jose Marinho).
    * Fix misspelled CDAT DSMAS define (Dave Jiang).
    * Simplify an error message in acpi_ds_result_push() (Christophe
      Jaillet).
    * Add a struct size macro related to SRAT (Dave Jiang).
    * Add AML_NO_OPERAND_RESOLVE flag to Timer (Abhishek Mainkar).
    * Add support for RISC-V external interrupt controllers in MADT (Sunil
      V L).
    * Add RHCT flags, CMO and MMU nodes (Sunil V L).
    * Change ACPICA version to 20230628 (Bob Moore).
 
  - Introduce new wrappers for ACPICA notify handler install/remove and
    convert multiple drivers to using their own Notify() handlers instead
    of the ACPI bus type .notify() slated for removal (Michal Wilczynski).
 
  - Add backlight=native DMI quirk for Apple iMac12,1 and iMac12,2 (Hans
    de Goede).
 
  - Put ACPI video and its child devices explicitly into D0 on boot to
    avoid platform firmware confusion (Kai-Heng Feng).
 
  - Add backlight=native DMI quirk for Lenovo Ideapad Z470 (Jiri Slaby).
 
  - Support obtaining physical CPU ID from MADT on LoongArch (Bibo Mao).
 
  - Convert ACPI CPU initialization to using _OSC instead of _PDC that
    has been depreceted since 2018 and dropped from the specification in
    ACPI 6.5 (Michal Wilczynski, Rafael Wysocki).
 
  - Drop non-functional nocrt parameter from ACPI thermal (Mario
    Limonciello).
 
  - Clean up the ACPI thermal driver, rework the handling of firmware
    notifications in it and make it provide a table of generic trip point
    structures to the core during initialization (Rafael Wysocki).
 
  - Defer enumeration of devices with _DEP pointing to IVSC (Wentong Wu).
 
  - Install SystemCMOS address space handler for ACPI000E (TAD) to meet
    platform firmware expectations on some platforms (Zhang Rui).
 
  - Fix finding the generic error data in the ACPi extlog driver for
    compatibility with old and new firmware interface versions (Xiaochun
    Lee).
 
  - Remove assorted unused declarations of functions (Yue Haibing).
 
  - Move AMBA bus scan handling into arm64 specific directory (Sudeep
    Holla).
 
  - Fix and clean up suspend-to-idle interface for AMD systems (Mario
    Limonciello, Andy Shevchenko).
 
  - Fix string truncation warning in pnpacpi_add_device() (Sunil V L).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmTslAkSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxtvgQAIsYAHXgalMFCLTqxeRZ8IMMLh/oxVfp
 RDbZ7lsBNzhYKXNv0jczjDApvfOctplh+qVuRIyFPrVrD+xzPgI9eFaZx41B3UPy
 3UG/KphjtF+sAu8rSFM2qcL2P+NGWe1Okr2G8Mbvu6bkGjfHShX6HGR7YgIyrZVO
 VMVLWqLvjvvg/jkraozArTtlvMwTrpBFxgWQVPK9TLRwx+JfB7WV6Z1XkgbllpR5
 8Q+O9wyVWfAIrLk/HZJdYNkdwREi46d226ePs6tJkwzE8KCUna0U4WbVVPRjbSfI
 Mkpy7Bnn91EZdN43/3ZzPBNO6NJi7UsD46EwRMzf1v7KyfsttnKw/v/SMke6gKQW
 43gi0trdbVxEdnN33CmBb2k82YQ5tjVuNOwKNy4xvFZ4vTE+WTz2imXMdW28biLT
 sU1eYJXPzBUQ4Ja2x56a1DOp2C1uOcSHbTKNzmtFsmT2FIWOKN+9PrOjaFEn7cyU
 FwWSzcONLE/QC0cSr7cWYym06aY8INtAlCm0hrlpwBDyiVDACz/QZ3NQAmbXKr5z
 5JIDQ+8v6YvpIx48yDlTYaQPMEskkZ2udkmwZCh6Vs0fMGyo3DDWvAIltPmHuIoj
 KekfDK5pkJjftK67IlHioAkS3KGHy4VTybi4Sx8KjYy9Q4GPQ4TL2h18kInXGu4f
 tv7U9J6zx9mJ
 =aXc6
 -----END PGP SIGNATURE-----

Merge tag 'acpi-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI updates from Rafael Wysocki:
 "These include new ACPICA material, a rework of the ACPI thermal
  driver, a switch-over of the ACPI processor driver to using _OSC
  instead of (long deprecated) _PDC for CPU initialization, a rework of
  firmware notifications handling in several drivers, fixes and cleanups
  for suspend-to-idle handling on AMD systems, ACPI backlight driver
  updates and more.

  Specifics:

   - Update the ACPICA code in the kernel to upstream revision 20230628
     including the following changes:
      - Suppress a GCC 12 dangling-pointer warning (Philip Prindeville)
      - Reformat the ACPI_STATE_COMMON macro and its users (George Guo)
      - Replace the ternary operator with ACPI_MIN() (Jiangshan Yi)
      - Add support for _DSC as per ACPI 6.5 (Saket Dumbre)
      - Remove a duplicate macro from zephyr header (Najumon B.A)
      - Add data structures for GED and _EVT tracking (Jose Marinho)
      - Fix misspelled CDAT DSMAS define (Dave Jiang)
      - Simplify an error message in acpi_ds_result_push() (Christophe
        Jaillet)
      - Add a struct size macro related to SRAT (Dave Jiang)
      - Add AML_NO_OPERAND_RESOLVE flag to Timer (Abhishek Mainkar)
      - Add support for RISC-V external interrupt controllers in MADT
        (Sunil V L)
      - Add RHCT flags, CMO and MMU nodes (Sunil V L)
      - Change ACPICA version to 20230628 (Bob Moore)

   - Introduce new wrappers for ACPICA notify handler install/remove and
     convert multiple drivers to using their own Notify() handlers
     instead of the ACPI bus type .notify() slated for removal (Michal
     Wilczynski)

   - Add backlight=native DMI quirk for Apple iMac12,1 and iMac12,2
     (Hans de Goede)

   - Put ACPI video and its child devices explicitly into D0 on boot to
     avoid platform firmware confusion (Kai-Heng Feng)

   - Add backlight=native DMI quirk for Lenovo Ideapad Z470 (Jiri Slaby)

   - Support obtaining physical CPU ID from MADT on LoongArch (Bibo Mao)

   - Convert ACPI CPU initialization to using _OSC instead of _PDC that
     has been depreceted since 2018 and dropped from the specification
     in ACPI 6.5 (Michal Wilczynski, Rafael Wysocki)

   - Drop non-functional nocrt parameter from ACPI thermal (Mario
     Limonciello)

   - Clean up the ACPI thermal driver, rework the handling of firmware
     notifications in it and make it provide a table of generic trip
     point structures to the core during initialization (Rafael Wysocki)

   - Defer enumeration of devices with _DEP pointing to IVSC (Wentong
     Wu)

   - Install SystemCMOS address space handler for ACPI000E (TAD) to meet
     platform firmware expectations on some platforms (Zhang Rui)

   - Fix finding the generic error data in the ACPi extlog driver for
     compatibility with old and new firmware interface versions
     (Xiaochun Lee)

   - Remove assorted unused declarations of functions (Yue Haibing)

   - Move AMBA bus scan handling into arm64 specific directory (Sudeep
     Holla)

   - Fix and clean up suspend-to-idle interface for AMD systems (Mario
     Limonciello, Andy Shevchenko)

   - Fix string truncation warning in pnpacpi_add_device() (Sunil V L)"

* tag 'acpi-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (66 commits)
  ACPI: x86: s2idle: Add a function to get LPS0 constraint for a device
  ACPI: x86: s2idle: Add for_each_lpi_constraint() helper
  ACPI: x86: s2idle: Add more debugging for AMD constraints parsing
  ACPI: x86: s2idle: Fix a logic error parsing AMD constraints table
  ACPI: x86: s2idle: Catch multiple ACPI_TYPE_PACKAGE objects
  ACPI: x86: s2idle: Post-increment variables when getting constraints
  ACPI: Adjust #ifdef for *_lps0_dev use
  ACPI: TAD: Install SystemCMOS address space handler for ACPI000E
  ACPI: Remove assorted unused declarations of functions
  ACPI: extlog: Fix finding the generic error data for v3 structure
  PNP: ACPI: Fix string truncation warning
  ACPI: Remove unused extern declaration acpi_paddr_to_node()
  ACPI: video: Add backlight=native DMI quirk for Apple iMac12,1 and iMac12,2
  ACPI: video: Put ACPI video and its child devices into D0 on boot
  ACPI: processor: LoongArch: Get physical ID from MADT
  ACPI: scan: Defer enumeration of devices with a _DEP pointing to IVSC device
  ACPI: thermal: Eliminate code duplication from acpi_thermal_notify()
  ACPI: thermal: Drop unnecessary thermal zone callbacks
  ACPI: thermal: Rework thermal_get_trend()
  ACPI: thermal: Use trip point table to register thermal zones
  ...
2023-08-28 17:58:39 -07:00
Linus Torvalds
e5b7ca09e9 s390 updates for 6.6 merge window
- Add vfio-ap support to pass-through crypto devices to secure execution
   guests
 
 - Add API ordinal 6 support to zcrypt_ep11misc device drive, which is
   required to handle key generate and key derive (e.g. secure key to
   protected key) correctly
 
 - Add missing secure/has_secure sysfs files for the case where it is not
   possible to figure where a system has been booted from. Existing user
   space relies on that these files are always present
 
 - Fix DCSS block device driver list corruption, caused by incorrect
   error handling
 
 - Convert virt_to_pfn() and pfn_to_virt() from defines to static inline
   functions to enforce type checking
 
 - Cleanups, improvements, and minor fixes to the kernel mapping setup
 
 - Fix various virtual vs physical address confusions
 
 - Move pfault code to separate file, since it has nothing to do with
   regular fault handling
 
 - Move s390 documentation to Documentation/arch/ like it has been done
   for other architectures already
 
 - Add HAVE_FUNCTION_GRAPH_RETVAL support
 
 - Factor out the s390_hypfs filesystem and add a new config option for
   it. The filesystem is deprecated and as soon as all users are gone it
   can be removed some time in the not so near future
 
 - Remove support for old CEX2 and CEX3 crypto cards from zcrypt device
   driver
 
 - Add support for user-defined certificates: receive user-defined
   certificates with a diagnose call and provide them via 'cert_store'
   keyring to user space
 
 - Couple of other small fixes and improvements all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEECMNfWEw3SLnmiLkZIg7DeRspbsIFAmTrqNYACgkQIg7DeRsp
 bsKkUBAApWXr3WCJA2tige34AnFwmskx4sBxl/fgwcwJrC55fED1jKWaiXOM6isv
 P+hqavZnks3gXZdYcD3kxXkNMh+fPNWw7BAL35J5Gu1VShA/jlbTC6ZrvUO3t+Fy
 NsdLvBDbNDdyUzQF7w0Xb0jyIxqhJTRyhLfR5oXES63FHomv2F/vofu4jWR/q+cc
 F9mcnoDeN4zLdssdvl6WtPX4nEY9RpG0QOh67drnxuq+8v7sL8gKN4ti94Rp6vhs
 g4NhNs9xgRIPoOcX2KlSIdFqO9P12jSXZq0G4HcOp8UGQvgU/mS+UG3pQwV3ZJLS
 3/kUJZ4/CwQa1xUFtPGP1/4AngGNOnhT9FCD4KrqjDkRZmLsd5RvURe6L1zQ3vbZ
 KnX7q0Otx4xRVYPlbHb9aP+tC7f3Q10ytBAps616qZoA/2SMss2BLZiiPBpCCvDp
 L+9dRhBGYCP2PSe6H/qGQFfMW+uY7QF+NDcDAT5mX1lS8OVrGJxqM7Q+sY2pMLGo
 5nR16LvM9g6W/ZnsVn0+BWg4CgaPMi+PMfMPxs/o9RG+/0d1AJx1aLSiHdP1pXog
 8/Wg4GaaJ27S4Ers0JUmH7VDO+QkkLvAArstjk8l59r1XslWiBP5USebkxtgu6EQ
 ehAh0+oa432ALq8Rn1FK/X+pWFumbTVf8OPwR8YEjDbeTPIBCqg=
 =ewd9
 -----END PGP SIGNATURE-----

Merge tag 's390-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Heiko Carstens:

 - Add vfio-ap support to pass-through crypto devices to secure
   execution guests

 - Add API ordinal 6 support to zcrypt_ep11misc device drive, which is
   required to handle key generate and key derive (e.g. secure key to
   protected key) correctly

 - Add missing secure/has_secure sysfs files for the case where it is
   not possible to figure where a system has been booted from. Existing
   user space relies on that these files are always present

 - Fix DCSS block device driver list corruption, caused by incorrect
   error handling

 - Convert virt_to_pfn() and pfn_to_virt() from defines to static inline
   functions to enforce type checking

 - Cleanups, improvements, and minor fixes to the kernel mapping setup

 - Fix various virtual vs physical address confusions

 - Move pfault code to separate file, since it has nothing to do with
   regular fault handling

 - Move s390 documentation to Documentation/arch/ like it has been done
   for other architectures already

 - Add HAVE_FUNCTION_GRAPH_RETVAL support

 - Factor out the s390_hypfs filesystem and add a new config option for
   it. The filesystem is deprecated and as soon as all users are gone it
   can be removed some time in the not so near future

 - Remove support for old CEX2 and CEX3 crypto cards from zcrypt device
   driver

 - Add support for user-defined certificates: receive user-defined
   certificates with a diagnose call and provide them via 'cert_store'
   keyring to user space

 - Couple of other small fixes and improvements all over the place

* tag 's390-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (66 commits)
  s390/pci: use builtin_misc_device macro to simplify the code
  s390/vfio-ap: make sure nib is shared
  KVM: s390: export kvm_s390_pv*_is_protected functions
  s390/uv: export uv_pin_shared for direct usage
  s390/vfio-ap: check for TAPQ response codes 0x35 and 0x36
  s390/vfio-ap: handle queue state change in progress on reset
  s390/vfio-ap: use work struct to verify queue reset
  s390/vfio-ap: store entire AP queue status word with the queue object
  s390/vfio-ap: remove upper limit on wait for queue reset to complete
  s390/vfio-ap: allow deconfigured queue to be passed through to a guest
  s390/vfio-ap: wait for response code 05 to clear on queue reset
  s390/vfio-ap: clean up irq resources if possible
  s390/vfio-ap: no need to check the 'E' and 'I' bits in APQSW after TAPQ
  s390/ipl: refactor deprecated strncpy
  s390/ipl: fix virtual vs physical address confusion
  s390/zcrypt_ep11misc: support API ordinal 6 with empty pin-blob
  s390/paes: fix PKEY_TYPE_EP11_AES handling for secure keyblobs
  s390/pkey: fix PKEY_TYPE_EP11_AES handling for sysfs attributes
  s390/pkey: fix PKEY_TYPE_EP11_AES handling in PKEY_VERIFYKEY2 IOCTL
  s390/pkey: fix PKEY_TYPE_EP11_AES handling in PKEY_KBLOB2PROTK[23]
  ...
2023-08-28 17:22:39 -07:00
Linus Torvalds
68cadad11f RCU pull request for v6.6
doc.2023.07.14b: Documentation updates.
 
 fixes.2023.08.16a: Miscellaneous fixes, perhaps most notably simplifying
 	SRCU_NOTIFIER_INIT() as suggested.
 
 rcu-tasks.2023.07.24a:  RCU Tasks updates, most notably treating
 	Tasks RCU callbacks as lazy while still treating synchronous
 	grace periods as urgent.  Also fixes one bug that restores the
 	ability to apply debug-objects to RCU Tasks and another that
 	fixes a race condition that could result in false-positive
 	failures of the boot-time self-test code.
 
 rcuscale.2023.07.14b: RCU-scalability performance-test updates,
 	most notably adding the ability to measure the RCU-Tasks's
 	grace-period kthread's CPU consumption.  This proved
 	quite useful for the rcu-tasks.2023.07.24a work.
 
 refscale.2023.07.14b: Reference-acquisition/release performance-test
 	updates, including a fix for an uninitialized wait_queue_head_t.
 
 torture.2023.08.14a: Miscellaneous torture-test updates.
 
 torturescripts.2023.07.20a: Torture-test scripting updates, including
 	removal of the non-longer-functional formal-verification scripts,
 	test builds of individual RCU Tasks flavors, better diagnostics
 	for loss of connectivity for distributed rcutorture tests,
 	disabling of reboot loops in qemu/KVM-based rcutorture testing,
 	and passing of init parameters to rcutorture's init program.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmTjkssTHHBhdWxtY2tA
 a2VybmVsLm9yZwAKCRCevxLzctn7jITND/9zEqYNbeFrcBs/YaHdoAjsNgOt1IYN
 csfF/KArVgdvmrwlV/nEaQMLaJcw9X7DVU5+7E2JbbDaB/2FSacseNyKk6mfgSVK
 /0rnTOXpqI9/T1HiJObWZvDQFuKL12bfteXWGJg1sMt2JUGZ4nAWhdZ3xRjp2XkO
 89qB5r0fF8gyGwvQ3M29ss8T9Oy0uUNJmDY/QyVxHM6dhkpSAezFffKzD7C4zkSV
 WucRTpYJ7bs6otBGtVmwz3x60UAuLwcVfQyB+CTbnGLsps9yAYU+1DDVdm7olcr3
 ARXMeboeodMvy9jWXhtbWRVAAob4lVUDXQN27kb4sBgroRQBfQXMuByRAU6s0VtX
 frOl6rlbORuAetsC8wFL0IFVn4yTpvXKbYw7h1MXTs7gVVbl33O9FieGvWu0r79/
 VR4Xw+JbmYWtyvFV8Zaq4iIEcOe+PeNH6u0bPx+htsHYd1+DUG2UY0MVmJQ3a4sb
 ygejA6mguCk7KBzWab8wdDpgAfhNwg0T9a+LQYcaskuD5SSWjYqqg6i1ulqqqyiE
 bOfRKDX4mWmAobWKHLssqUrjiLbxfygIaHjCrt7rWJKPIs1bK/WfWa4JbrE0NRwK
 9IDd1lWc9C+zoUpjyZWSG3ahK5lWo2u4sPNoRtMQjowjobIz1cBhaEwmFe72bG7C
 FCKb7Da2oUaLOw==
 =EujZ
 -----END PGP SIGNATURE-----

Merge tag 'rcu.2023.08.21a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull RCU updates from Paul McKenney:

 - Documentation updates

 - Miscellaneous fixes, perhaps most notably simplifying
   SRCU_NOTIFIER_INIT() as suggested

 - RCU Tasks updates, most notably treating Tasks RCU callbacks as lazy
   while still treating synchronous grace periods as urgent. Also fixes
   one bug that restores the ability to apply debug-objects to RCU Tasks
   and another that fixes a race condition that could result in
   false-positive failures of the boot-time self-test code

 - RCU-scalability performance-test updates, most notably adding the
   ability to measure the RCU-Tasks's grace-period kthread's CPU
   consumption. This proved quite useful for the RCU Tasks work

 - Reference-acquisition/release performance-test updates, including a
   fix for an uninitialized wait_queue_head_t

 - Miscellaneous torture-test updates

 - Torture-test scripting updates, including removal of the
   non-longer-functional formal-verification scripts, test builds of
   individual RCU Tasks flavors, better diagnostics for loss of
   connectivity for distributed rcutorture tests, disabling of reboot
   loops in qemu/KVM-based rcutorture testing, and passing of init
   parameters to rcutorture's init program

* tag 'rcu.2023.08.21a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (64 commits)
  rcu: Use WRITE_ONCE() for assignments to ->next for rculist_nulls
  rcu: Make the rcu_nocb_poll boot parameter usable via boot config
  rcu: Mark __rcu_irq_enter_check_tick() ->rcu_urgent_qs load
  srcu,notifier: Remove #ifdefs in favor of SRCU Tiny srcu_usage
  rcutorture: Stop right-shifting torture_random() return values
  torture: Stop right-shifting torture_random() return values
  torture: Move stutter_wait() timeouts to hrtimers
  torture: Move torture_shuffle() timeouts to hrtimers
  torture: Move torture_onoff() timeouts to hrtimers
  torture: Make torture_hrtimeout_*() use TASK_IDLE
  torture: Add lock_torture writer_fifo module parameter
  torture: Add a kthread-creation callback to _torture_create_kthread()
  rcu-tasks: Fix boot-time RCU tasks debug-only deadlock
  rcu-tasks: Permit use of debug-objects with RCU Tasks flavors
  checkpatch: Complain about unexpected uses of RCU Tasks Trace
  torture: Cause mkinitrd.sh to indicate failure on compile errors
  torture: Make init program dump command-line arguments
  torture: Switch qemu from -nographic to -display none
  torture: Add init-program support for loongarch
  torture: Avoid torture-test reboot loops
  ...
2023-08-28 13:19:28 -07:00
Andrei Emeltchenko
ac6804fbf4 Documentation: serial-console: Fix literal block marker
Make rendered text readable by fixing literal block marker, changing
":" to "::".

Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20230825091626.354352-1-Andrei.Emeltchenko.news@gmail.com
2023-08-28 12:42:03 -06:00
Linus Torvalds
de16588a77 v6.6-vfs.misc
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZOXTxQAKCRCRxhvAZXjc
 okaVAP94WAlItvDRt/z2Wtzf0+RqPZeTXEdGTxua8+RxqCyYIQD+OO5nRfKQPHlV
 AqqGJMKItQMSMIYgB5ftqVhNWZfnHgM=
 =pSEW
 -----END PGP SIGNATURE-----

Merge tag 'v6.6-vfs.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull misc vfs updates from Christian Brauner:
 "This contains the usual miscellaneous features, cleanups, and fixes
  for vfs and individual filesystems.

  Features:

   - Block mode changes on symlinks and rectify our broken semantics

   - Report file modifications via fsnotify() for splice

   - Allow specifying an explicit timeout for the "rootwait" kernel
     command line option. This allows to timeout and reboot instead of
     always waiting indefinitely for the root device to show up

   - Use synchronous fput for the close system call

  Cleanups:

   - Get rid of open-coded lockdep workarounds for async io submitters
     and replace it all with a single consolidated helper

   - Simplify epoll allocation helper

   - Convert simple_write_begin and simple_write_end to use a folio

   - Convert page_cache_pipe_buf_confirm() to use a folio

   - Simplify __range_close to avoid pointless locking

   - Disable per-cpu buffer head cache for isolated cpus

   - Port ecryptfs to kmap_local_page() api

   - Remove redundant initialization of pointer buf in pipe code

   - Unexport the d_genocide() function which is only used within core
     vfs

   - Replace printk(KERN_ERR) and WARN_ON() with WARN()

  Fixes:

   - Fix various kernel-doc issues

   - Fix refcount underflow for eventfds when used as EFD_SEMAPHORE

   - Fix a mainly theoretical issue in devpts

   - Check the return value of __getblk() in reiserfs

   - Fix a racy assert in i_readcount_dec

   - Fix integer conversion issues in various functions

   - Fix LSM security context handling during automounts that prevented
     NFS superblock sharing"

* tag 'v6.6-vfs.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (39 commits)
  cachefiles: use kiocb_{start,end}_write() helpers
  ovl: use kiocb_{start,end}_write() helpers
  aio: use kiocb_{start,end}_write() helpers
  io_uring: use kiocb_{start,end}_write() helpers
  fs: create kiocb_{start,end}_write() helpers
  fs: add kerneldoc to file_{start,end}_write() helpers
  io_uring: rename kiocb_end_write() local helper
  splice: Convert page_cache_pipe_buf_confirm() to use a folio
  libfs: Convert simple_write_begin and simple_write_end to use a folio
  fs/dcache: Replace printk and WARN_ON by WARN
  fs/pipe: remove redundant initialization of pointer buf
  fs: Fix kernel-doc warnings
  devpts: Fix kernel-doc warnings
  doc: idmappings: fix an error and rephrase a paragraph
  init: Add support for rootwait timeout parameter
  vfs: fix up the assert in i_readcount_dec
  fs: Fix one kernel-doc comment
  docs: filesystems: idmappings: clarify from where idmappings are taken
  fs/buffer.c: disable per-CPU buffer_head cache for isolated CPUs
  vfs, security: Fix automount superblock LSM init problem, preventing NFS sb sharing
  ...
2023-08-28 10:17:14 -07:00