The conversion of all GPIO drivers to using the .set_rv() and
.set_multiple_rv() callbacks from struct gpio_chip (which - unlike their
predecessors - return an integer and allow the controller drivers to
indicate failures to users) is now complete and the legacy ones have
been removed. Rename the new callbacks back to their original names in
one sweeping change.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
- The 2 patch series "squashfs: Remove page->mapping references" from
Matthew Wilcox gets us closer to being able to remove page->mapping.
- The 5 patch series "relayfs: misc changes" from Jason Xing does some
maintenance and minor feature addition work in relayfs.
- The 5 patch series "kdump: crashkernel reservation from CMA" from Jiri
Bohac switches us from static preallocation of the kdump crashkernel's
working memory over to dynamic allocation. So the difficulty of
a-priori estimation of the second kernel's needs is removed and the
first kernel obtains extra memory.
- The 5 patch series "generalize panic_print's dump function to be used
by other kernel parts" from Feng Tang implements some consolidation and
rationalizatio of the various ways in which a faiing kernel splats
information at the operator.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaI+82gAKCRDdBJ7gKXxA
jj4JAP9xb+w9DrBY6sa+7KTPIb+aTqQ7Zw3o9O2m+riKQJv6jAEA6aEwRnDA0451
fDT5IqVlCWGvnVikdZHSnvhdD7TGsQ0=
=rT71
-----END PGP SIGNATURE-----
Merge tag 'mm-nonmm-stable-2025-08-03-12-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
"Significant patch series in this pull request:
- "squashfs: Remove page->mapping references" (Matthew Wilcox) gets
us closer to being able to remove page->mapping
- "relayfs: misc changes" (Jason Xing) does some maintenance and
minor feature addition work in relayfs
- "kdump: crashkernel reservation from CMA" (Jiri Bohac) switches
us from static preallocation of the kdump crashkernel's working
memory over to dynamic allocation. So the difficulty of a-priori
estimation of the second kernel's needs is removed and the first
kernel obtains extra memory
- "generalize panic_print's dump function to be used by other
kernel parts" (Feng Tang) implements some consolidation and
rationalization of the various ways in which a failing kernel
splats information at the operator
* tag 'mm-nonmm-stable-2025-08-03-12-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (80 commits)
tools/getdelays: add backward compatibility for taskstats version
kho: add test for kexec handover
delaytop: enhance error logging and add PSI feature description
samples: Kconfig: fix spelling mistake "instancess" -> "instances"
fat: fix too many log in fat_chain_add()
scripts/spelling.txt: add notifer||notifier to spelling.txt
xen/xenbus: fix typo "notifer"
net: mvneta: fix typo "notifer"
drm/xe: fix typo "notifer"
cxl: mce: fix typo "notifer"
KVM: x86: fix typo "notifer"
MAINTAINERS: add maintainers for delaytop
ucount: use atomic_long_try_cmpxchg() in atomic_long_inc_below()
ucount: fix atomic_long_inc_below() argument type
kexec: enable CMA based contiguous allocation
stackdepot: make max number of pools boot-time configurable
lib/xxhash: remove unused functions
init/Kconfig: restore CONFIG_BROKEN help text
lib/raid6: update recov_rvv.c zero page usage
docs: update docs after introducing delaytop
...
- Introduce regular REGSET note macros arch-wide (Dave Martin)
- Remove arbitrary 4K limitation of program header size (Yin Fengwei)
- Reorder function qualifiers for copy_clone_args_from_user() (Dishank Jogi)
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCaIVKiAAKCRA2KwveOeQk
u4zBAP4zUNj2+XyixVPXCzv+Hkle6zWs7yrzdA2yLxe8Qtwj5AD+N2I6MUGcCFGW
W+uWxlWTtGLDqh1CplIUqTlxMi39Og4=
=vYnE
-----END PGP SIGNATURE-----
Merge tag 'execve-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull execve updates from Kees Cook:
- Introduce regular REGSET note macros arch-wide (Dave Martin)
- Remove arbitrary 4K limitation of program header size (Yin Fengwei)
- Reorder function qualifiers for copy_clone_args_from_user() (Dishank Jogi)
* tag 'execve-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (25 commits)
fork: reorder function qualifiers for copy_clone_args_from_user
binfmt_elf: remove the 4k limitation of program header size
binfmt_elf: Warn on missing or suspicious regset note names
xtensa: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
um: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
x86/ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
sparc: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
sh: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
s390/ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
riscv: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
powerpc/ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
parisc: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
openrisc: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
nios2: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
MIPS: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
m68k: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
LoongArch: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
hexagon: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
csky: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
arm64: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
...
Restricted pointers ("%pK") are not meant to be used through printk().
It can unintentionally expose security sensitive, raw pointer values.
Use regular pointer formatting instead.
Link: https://lore.kernel.org/lkml/20250113171731-dc10e3c1-da64-4af0-b767-7c7070468023@linutronix.de/
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Patch series "kdump: crashkernel reservation from CMA", v5.
This series implements a way to reserve additional crash kernel memory
using CMA.
Currently, all the memory for the crash kernel is not usable by the 1st
(production) kernel. It is also unmapped so that it can't be corrupted by
the fault that will eventually trigger the crash. This makes sense for
the memory actually used by the kexec-loaded crash kernel image and initrd
and the data prepared during the load (vmcoreinfo, ...). However, the
reserved space needs to be much larger than that to provide enough
run-time memory for the crash kernel and the kdump userspace. Estimating
the amount of memory to reserve is difficult. Being too careful makes
kdump likely to end in OOM, being too generous takes even more memory from
the production system. Also, the reservation only allows reserving a
single contiguous block (or two with the "low" suffix). I've seen systems
where this fails because the physical memory is fragmented.
By reserving additional crashkernel memory from CMA, the main crashkernel
reservation can be just large enough to fit the kernel and initrd image,
minimizing the memory taken away from the production system. Most of the
run-time memory for the crash kernel will be memory previously available
to userspace in the production system. As this memory is no longer
wasted, the reservation can be done with a generous margin, making kdump
more reliable. Kernel memory that we need to preserve for dumping is
normally not allocated from CMA, unless it is explicitly allocated as
movable. Currently this is only the case for memory ballooning and zswap.
Such movable memory will be missing from the vmcore. User data is
typically not dumped by makedumpfile. When dumping of user data is
intended this new CMA reservation cannot be used.
There are five patches in this series:
The first adds a new ",cma" suffix to the recenly introduced generic
crashkernel parsing code. parse_crashkernel() takes one more argument to
store the cma reservation size.
The second patch implements reserve_crashkernel_cma() which performs the
reservation. If the requested size is not available in a single range,
multiple smaller ranges will be reserved.
The third patch updates Documentation/, explicitly mentioning the
potential DMA corruption of the CMA-reserved memory.
The fourth patch adds a short delay before booting the kdump kernel,
allowing pending DMA transfers to finish.
The fifth patch enables the functionality for x86 as a proof of
concept. There are just three things every arch needs to do:
- call reserve_crashkernel_cma()
- include the CMA-reserved ranges in the physical memory map
- exclude the CMA-reserved ranges from the memory available
through /proc/vmcore by excluding them from the vmcoreinfo
PT_LOAD ranges.
Adding other architectures is easy and I can do that as soon as this
series is merged.
With this series applied, specifying
crashkernel=100M craskhernel=1G,cma
on the command line will make a standard crashkernel reservation
of 100M, where kexec will load the kernel and initrd.
An additional 1G will be reserved from CMA, still usable by the production
system. The crash kernel will have 1.1G memory available. The 100M can
be reliably predicted based on the size of the kernel and initrd.
The new cma suffix is completely optional. When no
crashkernel=size,cma is specified, everything works as before.
This patch (of 5):
Add a new cma_size parameter to parse_crashkernel(). When not NULL, call
__parse_crashkernel to parse the CMA reservation size from
"crashkernel=size,cma" and store it in cma_size.
Set cma_size to NULL in all calls to parse_crashkernel().
Link: https://lkml.kernel.org/r/aEqnxxfLZMllMC8I@dwarf.suse.cz
Link: https://lkml.kernel.org/r/aEqoQckgoTQNULnh@dwarf.suse.cz
Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Cc: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Donald Dutile <ddutile@redhat.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Philipp Rudo <prudo@redhat.com>
Cc: Pingfan Liu <piliu@redhat.com>
Cc: Tao Liu <ltao@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
On MIPS architecture with CPS-based SMP support, all CPU cores in the
same cluster run at the same frequency since they share the same L2
cache, requiring a fixed CPU/L2 cache ratio.
This allows to implement calibrate_delay_is_known(), which will return
0 (triggering calibration) only for the primary CPU of each
cluster. For other CPUs, we can simply reuse the value from their
cluster's primary CPU core.
With the introduction of this patch, a configuration running 32 cores
spread across two clusters sees a significant reduction in boot time
by approximately 600 milliseconds.
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
The initial implementation of this function goes through all the CPUs
in a cluster to determine if the current CPU is the only one
running. This process occurs every time the function is called.
However, during boot, we already perform this task, so let's take
advantage of this opportunity to create and fill a CPU bitmask that
can be easily and efficiently used later.
This patch modifies the function to allow providing the first
available online CPU when one already exists, which is necessary for
delay CPU calibration optimization.
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
It is possible that MMID is supported at the CPU level, but its
integration in a SoC prevents its usage. For instance, if the
System-level Interconnect (also known as Network on Chip) does not
support global invalidation, then the MMID feature is not usable. The
current implementation of MMID relies on the GINV* instructions.
This patch allows the disabling of MMID based on a device tree
property, as this issue cannot be detected at runtime.
MMID is set up very early during the boot process, even before device
tree data can be accessed. Therefore, when we determine whether MMID
needs to be disabled, some MMID setup has already been performed for
the boot CPU. Consequently, we must revert the MMID setup on the first
CPU before disabling the feature for the subsequent CPUs that will be
initialized later.
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Instead of having the core code guess the note name for each regset,
use USER_REGSET_NOTE_TYPE() to pick the correct name from elf.h.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Akihiko Odaki <akihiko.odaki@daynix.com>
Cc: linux-mips@vger.kernel.org
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250701135616.29630-12-Dave.Martin@arm.com
Signed-off-by: Kees Cook <kees@kernel.org>
Not all tasks have an ABI associated or vDSO mapped,
for example kthreads never do.
If such a task ever ends up calling stack_top(), it will derefence the
NULL ABI pointer and crash.
This can for example happen when using kunit:
mips_stack_top+0x28/0xc0
arch_pick_mmap_layout+0x190/0x220
kunit_vm_mmap_init+0xf8/0x138
__kunit_add_resource+0x40/0xa8
kunit_vm_mmap+0x88/0xd8
usercopy_test_init+0xb8/0x240
kunit_try_run_case+0x5c/0x1a8
kunit_generic_run_threadfn_adapter+0x28/0x50
kthread+0x118/0x240
ret_from_kernel_thread+0x14/0x1c
Only dereference the ABI point if it is set.
The GIC page is also included as it is specific to the vDSO.
Also move the randomization adjustment into the same conditional.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Introduce file_getattr() and file_setattr() syscalls to manipulate inode
extended attributes. The syscalls takes pair of file descriptor and
pathname. Then it operates on inode opened accroding to openat()
semantics. The struct file_attr is passed to obtain/change extended
attributes.
This is an alternative to FS_IOC_FSSETXATTR ioctl with a difference
that file don't need to be open as we can reference it with a path
instead of fd. By having this we can manipulated inode extended
attributes not only on regular files but also on special ones. This
is not possible with FS_IOC_FSSETXATTR ioctl as with special files
we can not call ioctl() directly on the filesystem inode using fd.
This patch adds two new syscalls which allows userspace to get/set
extended inode attributes on special files by using parent directory
and a path - *at() like syscall.
CC: linux-api@vger.kernel.org
CC: linux-fsdevel@vger.kernel.org
CC: linux-xfs@vger.kernel.org
Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
Link: https://lore.kernel.org/20250630-xattrat-syscall-v6-6-c4e3bc35227b@kernel.org
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
- Add support for the EXPORT_SYMBOL_GPL_FOR_MODULES() macro, which exports a
symbol only to specified modules
- Improve ABI handling in gendwarfksyms
- Forcibly link lib-y objects to vmlinux even if CONFIG_MODULES=n
- Add checkers for redundant or missing <linux/export.h> inclusion
- Deprecate the extra-y syntax
- Fix a genksyms bug when including enum constants from *.symref files
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmhEZc4VHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsGVAgQAKLRdBGga1kBJJFIkUOHWC5+g/je
U/dO5rGnuOLviWDexC6QT8AQV2N+dQXhB11x+KacSu1bwowsEvwuegtA6VqwbETs
tyWmB0PftEzVyPfc+Rjfy0LDfKkiKkm4RhXiMwcem/rlw45gvJXrVU7jJin9fI3A
So8glpOAX+mEizUHkjZkS51nkYCZFDsn7hVo0X43vqjeFrrFGLEQ5xas4Ci+dkY3
9g8Q5bFL8CC5PHjSO8wFftCcAWwTukAht6CSSb522MKGnCVZ9RxTmRwEPXrBmXtS
5eWa8yg6y0tFVmot8iwZGBYleAWDNsj0a2j2oVjUN+EF91sk3WQApJVNBok/nQFb
4MgO3N3UXZdy4tYkBX8tMgOcGkfjZAFoNxSUm5oVouh9NyT0dpqYHhJHBNVbVJoF
igQWeVOYcioDjeU1iXnP2cw64q44ROfxmOpDxOSRz9PTM6CCya1R0m/zzBLV6Lwk
rzlXk1LLf+jIfgmS5RLlkCgrXS1U0vNGXxQH9Ui9dZSEtzdU7qt5WQ/Rz44bEBhS
OeIlJfMMx6QYJztJc/BaUjkKsutTkII52QctRbRCj/nKswHd8SnHV+xk1c2WPxrg
yKq10rPpdg1BcvmODY6cmcndt7ogDRfkogm2gvGQIBZEglRimpmpg51sZQRD0ueE
0rt12TmktsLbglB4
=Dy49
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Add support for the EXPORT_SYMBOL_GPL_FOR_MODULES() macro, which
exports a symbol only to specified modules
- Improve ABI handling in gendwarfksyms
- Forcibly link lib-y objects to vmlinux even if CONFIG_MODULES=n
- Add checkers for redundant or missing <linux/export.h> inclusion
- Deprecate the extra-y syntax
- Fix a genksyms bug when including enum constants from *.symref files
* tag 'kbuild-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (28 commits)
genksyms: Fix enum consts from a reference affecting new values
arch: use always-$(KBUILD_BUILTIN) for vmlinux.lds
kbuild: set y instead of 1 to KBUILD_{BUILTIN,MODULES}
efi/libstub: use 'targets' instead of extra-y in Makefile
module: make __mod_device_table__* symbols static
scripts/misc-check: check unnecessary #include <linux/export.h> when W=1
scripts/misc-check: check missing #include <linux/export.h> when W=1
scripts/misc-check: add double-quotes to satisfy shellcheck
kbuild: move W=1 check for scripts/misc-check to top-level Makefile
scripts/tags.sh: allow to use alternative ctags implementation
kconfig: introduce menu type enum
docs: symbol-namespaces: fix reST warning with literal block
kbuild: link lib-y objects to vmlinux forcibly even when CONFIG_MODULES=n
tinyconfig: enable CONFIG_LD_DEAD_CODE_DATA_ELIMINATION
docs/core-api/symbol-namespaces: drop table of contents and section numbering
modpost: check forbidden MODULE_IMPORT_NS("module:") at compile time
kbuild: move kbuild syntax processing to scripts/Makefile.build
Makefile: remove dependency on archscripts for header installation
Documentation/kbuild: Add new gendwarfksyms kABI rules
Documentation/kbuild: Drop section numbers
...
The extra-y syntax is deprecated. Instead, use always-$(KBUILD_BUILTIN),
which behaves equivalently.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
Added support for parallel CPU bring up on EyeQ
Other cleanups and fixes
-----BEGIN PGP SIGNATURE-----
iQJOBAABCAA4FiEEbt46xwy6kEcDOXoUeZbBVTGwZHAFAmhCkBcaHHRzYm9nZW5k
QGFscGhhLmZyYW5rZW4uZGUACgkQeZbBVTGwZHDdzQ//RUuUrNQ2ItcrnoBqdjGt
AgQr6N1h7FkKKxhW7dR8CTFNskrH9p1E8sQXaVLaeEh22U7DVdVnqIwrGPEKwsr4
UDuYawat+KEo076IHAlzbT0Uphk6n8lucReVDDPe8Crv2ySJsMNEJtmqWSgecoi2
vCDuJokMnwrrTJg/zLHziJZMqKpvBrp/9OJIa8RsKZC+CdVZtfwp/I6uxF/gZRc1
gLfaqOaUftq6bk0bFccng2Mmb2EjROy8AF9z4aFa3WzZRyn7dZjlY6moeS/IZJJi
DsnXJnuagHKZE65kU4kQXEHM4IPeyQ1Q1uvARElluUJMqKXGeD0zu7wc67oMlAiu
LABsLn+P935N11uLQ68BzMfEHhcWIrTMOz+dJY//ViuPW2C9Ix7x0zYZAKDdWDuI
utOhdsiSakYYfoZcG6HCkaT5w2w+ag4/Cl0Z5+CFbcc1w8Jk+uHGeEjcuK+pbsNR
pYzSEi+Svx3q6k4+w8bDGLEKpJVviH2LBsBMZLcvQCTwo2dF6JQ9pAOnRQAu94H4
VyKMSHVqdoy2yBJzmsPlQCtquc7yaVGx1ykiAokEULT8eM0Sj8qY9Y95XFNTr7PM
wheOkShQGupBB7EtitNoazcWfBIAilKJPzC4Xhf/hysKQDF/sQLGs1fs3F0L+AP+
J0d1ZUiHZRYEahNpsO3d+10=
=Ag5K
-----END PGP SIGNATURE-----
Merge tag 'mips_6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS updates from Thomas Bogendoerfer:
- Added support for EcoNet platform
- Added support for parallel CPU bring up on EyeQ
- Other cleanups and fixes
* tag 'mips_6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (23 commits)
MIPS: loongson2ef: lemote-2f: add missing function prototypes
MIPS: loongson2ef: cs5536: add missing function prototypes
MIPS: SMP: Move the AP sync point before the calibration delay
mips: econet: Fix incorrect Kconfig dependencies
MAINTAINERS: Add entry for newly added EcoNet platform.
mips: dts: Add EcoNet DTS with EN751221 and SmartFiber XP8421-B board
dt-bindings: vendor-prefixes: Add SmartFiber
mips: Add EcoNet MIPS platform support
dt-bindings: mips: Add EcoNet platform binding
MIPS: bcm63xx: nvram: avoid inefficient use of crc32_le_combine()
mips: dts: pic32: pic32mzda: Rename the sdhci nodename to match with common mmc-controller binding
MIPS: SMP: Move the AP sync point before the non-parallel aware functions
MIPS: Replace strcpy() with strscpy() in vpe_elfload()
MIPS: BCM63XX: Replace strcpy() with strscpy() in board_prom_init()
mips: ptrace: Improve code formatting and indentation
MIPS: SMP: Implement parallel CPU bring up for EyeQ
mips: Add -std= flag specified in KBUILD_CFLAGS to vdso CFLAGS
MIPS: Loongson64: Add missing '#interrupt-cells' for loongson64c_ls7a
mips: dts: realtek: Add MDIO controller
MIPS: txx9: gpio: use new line value setter callbacks
...
Core & generic-arch updates:
- Add support for dynamic constraints and propagate it to
the Intel driver (Kan Liang)
- Fix & enhance driver-specific throttling support (Kan Liang)
- Record sample last_period before updating on the
x86 and PowerPC platforms (Mark Barnett)
- Make perf_pmu_unregister() usable (Peter Zijlstra)
- Unify perf_event_free_task() / perf_event_exit_task_context()
(Peter Zijlstra)
- Simplify perf_event_release_kernel() and perf_event_free_task()
(Peter Zijlstra)
- Allocate non-contiguous AUX pages by default (Yabin Cui)
Uprobes updates:
- Add support to emulate NOP instructions (Jiri Olsa)
- selftests/bpf: Add 5-byte NOP uprobe trigger benchmark (Jiri Olsa)
x86 Intel PMU enhancements:
- Support Intel Auto Counter Reload [ACR] (Kan Liang)
- Add PMU support for Clearwater Forest (Dapeng Mi)
- Arch-PEBS preparatory changes: (Dapeng Mi)
- Parse CPUID archPerfmonExt leaves for non-hybrid CPUs
- Decouple BTS initialization from PEBS initialization
- Introduce pairs of PEBS static calls
x86 AMD PMU enhancements:
- Use hrtimer for handling overflows in the AMD uncore driver
(Sandipan Das)
- Prevent UMC counters from saturating (Sandipan Das)
Fixes and cleanups:
- Fix put_ctx() ordering (Frederic Weisbecker)
- Fix irq work dereferencing garbage (Frederic Weisbecker)
- Misc fixes and cleanups (Changbin Du, Frederic Weisbecker,
Ian Rogers, Ingo Molnar, Kan Liang, Peter Zijlstra, Qing Wang,
Sandipan Das, Thorsten Blum)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmgy4zoRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1j6QRAAvQ4GBPrdJLb8oXkLjCmWSp9PfM1h2IW0
reUrcV0BPRAwz4T60QEU2KyiEjvKxNghR6bNw4i3slAZ8EFwP9eWE/0ZYOo5+W/N
wv8vsopv/oZd2L2G5TgxDJf+tLPkqnTvp651LmGAbquPFONN1lsya9UHVPnt2qtv
fvFhjW6D828VoevRcUCsdoEUNlFDkUYQ2c3M1y5H2AI6ILDVxLsp5uYtuVUP+2lQ
7UI/elqRIIblTGT7G9LvTGiXZMm8T58fe1OOLekT6NdweJ3XEt1kMdFo/SCRYfzU
eDVVVLSextZfzBXNPtAEAlM3aSgd8+4m5sACiD1EeOUNjo5J9Sj1OOCa+bZGF/Rl
XNv5Kcp6Kh1T4N5lio8DE/NabmHDqDMbUGfud+VTS8uLLku4kuOWNMxJTD1nQ2Zz
BMfJhP89G9Vk07F9fOGuG1N6mKhIKNOgXh0S92tB7XDHcdJegueu2xh4ZszBL1QK
JVXa4DbnDj+y0LvnV+A5Z6VILr5RiCAipDb9ascByPja6BbN10Nf9Aj4nWwRTwbO
ut5OK/fDKmSjEHn1+a42d4iRxdIXIWhXCyxEhH+hJXEFx9htbQ3oAbXAEedeJTlT
g9QYGAjL96QEd0CqviorV8KyU59nVkEPoLVCumXBZ0WWhNwU6GdAmsW1hLfxQdLN
sp+XHhfxf8M=
=tPRs
-----END PGP SIGNATURE-----
Merge tag 'perf-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf events updates from Ingo Molnar:
"Core & generic-arch updates:
- Add support for dynamic constraints and propagate it to the Intel
driver (Kan Liang)
- Fix & enhance driver-specific throttling support (Kan Liang)
- Record sample last_period before updating on the x86 and PowerPC
platforms (Mark Barnett)
- Make perf_pmu_unregister() usable (Peter Zijlstra)
- Unify perf_event_free_task() / perf_event_exit_task_context()
(Peter Zijlstra)
- Simplify perf_event_release_kernel() and perf_event_free_task()
(Peter Zijlstra)
- Allocate non-contiguous AUX pages by default (Yabin Cui)
Uprobes updates:
- Add support to emulate NOP instructions (Jiri Olsa)
- selftests/bpf: Add 5-byte NOP uprobe trigger benchmark (Jiri Olsa)
x86 Intel PMU enhancements:
- Support Intel Auto Counter Reload [ACR] (Kan Liang)
- Add PMU support for Clearwater Forest (Dapeng Mi)
- Arch-PEBS preparatory changes: (Dapeng Mi)
- Parse CPUID archPerfmonExt leaves for non-hybrid CPUs
- Decouple BTS initialization from PEBS initialization
- Introduce pairs of PEBS static calls
x86 AMD PMU enhancements:
- Use hrtimer for handling overflows in the AMD uncore driver
(Sandipan Das)
- Prevent UMC counters from saturating (Sandipan Das)
Fixes and cleanups:
- Fix put_ctx() ordering (Frederic Weisbecker)
- Fix irq work dereferencing garbage (Frederic Weisbecker)
- Misc fixes and cleanups (Changbin Du, Frederic Weisbecker, Ian
Rogers, Ingo Molnar, Kan Liang, Peter Zijlstra, Qing Wang, Sandipan
Das, Thorsten Blum)"
* tag 'perf-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
perf/headers: Clean up <linux/perf_event.h> a bit
perf/uapi: Clean up <uapi/linux/perf_event.h> a bit
perf/uapi: Fix PERF_RECORD_SAMPLE comments in <uapi/linux/perf_event.h>
mips/perf: Remove driver-specific throttle support
xtensa/perf: Remove driver-specific throttle support
sparc/perf: Remove driver-specific throttle support
loongarch/perf: Remove driver-specific throttle support
csky/perf: Remove driver-specific throttle support
arc/perf: Remove driver-specific throttle support
alpha/perf: Remove driver-specific throttle support
perf/apple_m1: Remove driver-specific throttle support
perf/arm: Remove driver-specific throttle support
s390/perf: Remove driver-specific throttle support
powerpc/perf: Remove driver-specific throttle support
perf/x86/zhaoxin: Remove driver-specific throttle support
perf/x86/amd: Remove driver-specific throttle support
perf/x86/intel: Remove driver-specific throttle support
perf: Only dump the throttle log for the leader
perf: Fix the throttle logic for a group
perf/core: Add the is_event_in_freq_mode() helper to simplify the code
...
In the calibration delay process, some resources are shared, so it's
better to move it after the parallel execution part. Thanks to the
patch optimizing CPU delay calibration, this change has no impact on
the boot time improvements gained from CPU parallel boot.
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
The throttle support has been added in the generic code. Remove
the driver-specific throttle support.
Besides the throttle, perf_event_overflow may return true because of
event_limit. It already does an inatomic event disable. The pmu->stop
is not required either.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250520181644.2673067-17-kan.liang@linux.intel.com
When CONFIG_HOTPLUG_PARALLEL is enabled, the code executing before
cpuhp_ap_sync_alive() is executed in parallel, while after it is
serialized. The functions set_cpu_sibling_map() and set_cpu_core_map()
were not designed to be executed in parallel, so by moving the
cpuhp_ap_sync_alive() before cpuhp_ap_sync_alive(), we then ensure
they will be called serialized.
The measurement done on EyeQ5 did not show any relevant boot time
increase after applying this patch.
Fixes: 76c43eb507 ("MIPS: SMP: Implement parallel CPU bring up for EyeQ")
Reported-by: Huacai Chen <chenhuacai@kernel.org>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Recently the rollback region has been changed into an
idle interrupt region [1]. This patch make the appropriate
changes renaming functions and macro, to reflect the change.
[1] https://lore.kernel.org/linux-mips/20250403161143.361461-2-marco.crivellari@suse.com/
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Fix missing .cpuidle.text section assignment for r4k_wait() to correct
backtracing with nmi_backtrace().
Fixes: 97c8580e85 ("MIPS: Annotate cpu_wait implementations with __cpuidle")
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
MIPS re-enables interrupts on its idle routine and performs
a TIF_NEED_RESCHED check afterwards before putting the CPU to sleep.
The IRQs firing between the check and the 'wait' instruction may set the
TIF_NEED_RESCHED flag. In order to deal with this possible race, IRQs
interrupting __r4k_wait() rollback their return address to the
beginning of __r4k_wait() so that TIF_NEED_RESCHED is checked
again before going back to sleep.
However idle IRQs can also queue timers that may require a tick
reprogramming through a new generic idle loop iteration but those timers
would go unnoticed here because __r4k_wait() only checks
TIF_NEED_RESCHED. It doesn't check for pending timers.
Fix this with fast-forwarding idle IRQs return address to the end of the
idle routine instead of the beginning, so that the generic idle loop
handles both TIF_NEED_RESCHED and pending timers.
CONFIG_CPU_MICROMIPS has been removed along with the nop instructions.
There, NOPs are 2 byte in size, so change the code with 3 _ssnop which are
always 4 byte and remove the ifdef. Added ehb to make sure the hazard
is always cleared.
Fixes: c65a5480ff ("[MIPS] Fix potential latency problem due to non-atomic cpu_wait.")
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Use tabs instead of spaces in regs_query_register_offset() and
syscall_trace_leave(), and properly indent multiple getters.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Added support for starting CPUs in parallel on EyeQ to speed up boot time.
On EyeQ5, booting 8 CPUs is now ~90ms faster.
On EyeQ6, booting 32 CPUs is now ~650ms faster.
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
struct gpio_chip now has callbacks for setting line values that return
an integer, allowing to indicate failures. Convert the drivers to using
them.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Added quirks for enabling multi-cluster mode on EyeQ6
Added DTS clocks for ralink
Cleanup realtek DTS
Other cleanups and fixes
-----BEGIN PGP SIGNATURE-----
iQJOBAABCAA4FiEEbt46xwy6kEcDOXoUeZbBVTGwZHAFAmfnxUoaHHRzYm9nZW5k
QGFscGhhLmZyYW5rZW4uZGUACgkQeZbBVTGwZHD/MA//aMzae6TL2S/vKVx/QCwL
I+Lww1O8CvCgRBiBVkVoS5eXeqt8NzyCwicFT5LmLxI+lw5wFrgBO6rbcq2NxLHY
q9I8dz5qTpGomT7TWoL8t97ZGxpLSYIqx4cqvvQrIhczWCGsvgAtep03sstlX6iT
RouZ6JIxRh0xSRuloT1947C6dH204t/kuFDJ4uCXrbkmstXhQ/+dip8s7/XMaXHk
sU1cjPV8Tpxlwv8X9SR39IKVk7LoHW/92w/J1se5CosHnSlJ7IjaUuO3Y7bp3xdw
SLYBxge/JfIcXh0YURS2hhlsdls9e55Zrg+ACn8DTDfFy6kq6v5ZsUEsnWpCKfQl
IymQ5246EJgtk8hNGe77qjiMnxbkNeM4v8DB7UVZHPxsGBrUlSNYZL2fSMdXkd2Y
JlobF9QgyXf+aZyLtn1g3B6yn8Ajd2L5rmq4fXKbRhMfMno/nnsi8nZ24OPp3lkY
KoWkXPwNZfxE7xOnBjk9BleijIHoC7w291j4h4/QMfzJqlkpRDJAKGOTBMOUYrER
B5PYGPjTWD8Os0xzp/PGkTMagoKcxzX7LYfAL0iEQ3ixgtwHtni/gUGJItBK6XTK
1KQZaXDcr5EAa6IcGjaNTrpaSQK5duZThPRiUE6fqboHFtB7JmLJZmsFE0BZbcEe
mJ21gTa4hFQOQ3J5YZoytEI=
=hUDw
-----END PGP SIGNATURE-----
Merge tag 'mips_6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS updates from Thomas Bogendoerfer:
- Add support for multi-cluster configuration
- Add quirks for enabling multi-cluster mode on EyeQ6
- Add DTS clocks for ralink
- Cleanup realtek DTS
- Other cleanups and fixes
* tag 'mips_6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (35 commits)
MIPS: config: omega2+, vocore2: enable CLK_MTMIPS
arch: mips: defconfig: Drop obsolete CONFIG_NET_CLS_TCINDEX
MIPS: cm: Fix warning if MIPS_CM is disabled
MIPS: Fix Macro name
MIPS: ds1287: Match ds1287_set_base_clock() function types
MIPS: cevt-ds1287: Add missing ds1287.h include
MIPS: dec: Declare which_prom() as static
MIPS: Loongson2ef: Replace deprecated strncpy() with strscpy()
mips: dts: ralink: mt7628a: update system controller node and its consumers
mips: dts: ralink: mt7620a: update system controller node and its consumers
mips: dts: ralink: rt3883: update system controller node and its consumers
mips: dts: ralink: rt3050: update system controller node and its consumers
mips: dts: ralink: rt2880: update system controller node and its consumers
dt-bindings: clock: add clock definitions for Ralink SoCs
MIPS: Use arch specific syscall name match function
mips: dts: realtek: Add restart to Cisco SG220-26P
mips: dts: realtek: Add RTL838x SoC peripherals
mips: dts: realtek: Replace uart clock property
mips: dts: realtek: Correct uart interrupt-parent
mips: dts: realtek: Add SoC IRQ node for RTL838x
...
- Consolidate the VDSO storage
The VDSO data storage and data layout has been largely architecture
specific for historical reasons. That increases the maintenance effort
and causes inconsistencies over and over.
There is no real technical reason for architecture specific layouts and
implementations. The architecture specific details can easily be
integrated into a generic layout, which also reduces the amount of
duplicated code for managing the mappings.
Convert all architectures over to a unified layout and common mapping
infrastructure. This splits the VDSO data layout into subsystem
specific blocks, timekeeping, random and architecture parts, which
provides a better structure and allows to improve and update the
functionalities without conflict and interaction.
- Rework the timekeeping data storage
The current implementation is designed for exposing system timekeeping
accessors, which was good enough at the time when it was designed.
PTP and Time Sensitive Networking (TSN) change that as there are
requirements to expose independent PTP clocks, which are not related to
system timekeeping.
Replace the monolithic data storage by a structured layout, which
allows to add support for independent PTP clocks on top while reusing
both the data structures and the time accessor implementations.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmfgSWUTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoYGED/0f/M8YyacAyErDYW4ufW+zh2sUidSf
GVlK0Jn5BMljOoye+y2XfTxuvvXxEDjJNYiJm2uKGPdV29tjNXreGK39XyNqXPu5
jwR4f/IN/QVSM2nCO6jyydMz8ympJ2k6M4RewwmxXBL2KsUzzJWSKTgRNqM5Tdjs
1RhJMjkQVTiiSYerBpHXYCeZLM7/VEfZ120uuzVAYPXo0/R6zuyF7IBgIao9hbfO
IQeCMLLfpDQHQhwquTA8ZbWqQusiEoSYHT+kTDa3eXDDbE/2UklAUs9gaatI979x
73zs0Yqxyx2iIGaghACWOAbKdcBWBeCYDw5fFwYVKn4VMQi1+wcxbtOYL767jp9o
vfkLXGilXcVkvDjv4fH+e1NoJXXBxq1Ug1silKdOeJzenQF8Q1i3tavkWUVCNfwH
qyOIM72NiCEWbYBDcz0lwBxEAyO4o0E6NP1bDc4y50VedEYIbXwSh0QGrdev1abn
rjY9vsuUR9oznmZ6BRPPxMTY87gOSHoKvqydgSZUACEgLV9346f5qZf341OReYai
MXUmXOM4+LdyaM1+Mec8ppvjMbLw+736NZyZtT2InusEBE+Ddp25L3hYiWnklJu8
2uwv0AoyrwaJ8y6ADOX4thcLZq0gND0Z/Ayz/XvpeI30eftsGUCt5KOVlqwfwOkI
4EQKvk2fAixPxg==
=rwei
-----END PGP SIGNATURE-----
Merge tag 'timers-vdso-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull VDSO infrastructure updates from Thomas Gleixner:
- Consolidate the VDSO storage
The VDSO data storage and data layout has been largely architecture
specific for historical reasons. That increases the maintenance
effort and causes inconsistencies over and over.
There is no real technical reason for architecture specific layouts
and implementations. The architecture specific details can easily be
integrated into a generic layout, which also reduces the amount of
duplicated code for managing the mappings.
Convert all architectures over to a unified layout and common mapping
infrastructure. This splits the VDSO data layout into subsystem
specific blocks, timekeeping, random and architecture parts, which
provides a better structure and allows to improve and update the
functionalities without conflict and interaction.
- Rework the timekeeping data storage
The current implementation is designed for exposing system
timekeeping accessors, which was good enough at the time when it was
designed.
PTP and Time Sensitive Networking (TSN) change that as there are
requirements to expose independent PTP clocks, which are not related
to system timekeeping.
Replace the monolithic data storage by a structured layout, which
allows to add support for independent PTP clocks on top while reusing
both the data structures and the time accessor implementations.
* tag 'timers-vdso-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (55 commits)
sparc/vdso: Always reject undefined references during linking
x86/vdso: Always reject undefined references during linking
vdso: Rework struct vdso_time_data and introduce struct vdso_clock
vdso: Move architecture related data before basetime data
powerpc/vdso: Prepare introduction of struct vdso_clock
arm64/vdso: Prepare introduction of struct vdso_clock
x86/vdso: Prepare introduction of struct vdso_clock
time/namespace: Prepare introduction of struct vdso_clock
vdso/namespace: Rename timens_setup_vdso_data() to reflect new vdso_clock struct
vdso/vsyscall: Prepare introduction of struct vdso_clock
vdso/gettimeofday: Prepare helper functions for introduction of struct vdso_clock
vdso/gettimeofday: Prepare do_coarse_timens() for introduction of struct vdso_clock
vdso/gettimeofday: Prepare do_coarse() for introduction of struct vdso_clock
vdso/gettimeofday: Prepare do_hres_timens() for introduction of struct vdso_clock
vdso/gettimeofday: Prepare do_hres() for introduction of struct vdso_clock
vdso/gettimeofday: Prepare introduction of struct vdso_clock
vdso/helpers: Prepare introduction of struct vdso_clock
vdso/datapage: Define vdso_clock to prepare for multiple PTP clocks
vdso: Make vdso_time_data cacheline aligned
arm64: Make asm/cache.h compatible with vDSO
...
- avoid the lock trip seccomp_filter_release in common case (Mateusz Guzik)
- remove unused 'sd' argument through-out (Oleg Nesterov)
- selftests/seccomp: Add hard-coded __NR_uretprobe for x86_64
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCZ9hJKgAKCRA2KwveOeQk
u3UDAQCUPSJ+iNGEsh9ErHJZWmg+LQJ999fOVrWWz+onjXFoQgD/cEHyLtSnz1Oa
RbdGEqkBkRTXgOdkcIW5pJ4lBtfIpQQ=
=z/8H
-----END PGP SIGNATURE-----
Merge tag 'seccomp-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull seccomp updates from Kees Cook:
- avoid the lock trip seccomp_filter_release in common case (Mateusz
Guzik)
- remove unused 'sd' argument through-out (Oleg Nesterov)
- selftests/seccomp: Add hard-coded __NR_uretprobe for x86_64
* tag 'seccomp-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
seccomp: avoid the lock trip seccomp_filter_release in common case
seccomp: remove the 'sd' argument from __seccomp_filter()
seccomp: remove the 'sd' argument from __secure_computing()
seccomp: fix the __secure_computing() stub for !HAVE_ARCH_SECCOMP_FILTER
seccomp/mips: change syscall_trace_enter() to use secure_computing()
selftests/seccomp: Add hard-coded __NR_uretprobe for x86_64
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ90qAwAKCRCRxhvAZXjc
on7lAP0akpIsJMWREg9tLwTNTySI1b82uKec0EAgM6T7n/PYhAD/T4zoY8UYU0Pr
qCxwTXHUVT6bkNhjREBkfqq9OkPP8w8=
=GxeN
-----END PGP SIGNATURE-----
Merge tag 'vfs-6.15-rc1.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs mount updates from Christian Brauner:
- Mount notifications
The day has come where we finally provide a new api to listen for
mount topology changes outside of /proc/<pid>/mountinfo. A mount
namespace file descriptor can be supplied and registered with
fanotify to listen for mount topology changes.
Currently notifications for mount, umount and moving mounts are
generated. The generated notification record contains the unique
mount id of the mount.
The listmount() and statmount() api can be used to query detailed
information about the mount using the received unique mount id.
This allows userspace to figure out exactly how the mount topology
changed without having to generating diffs of /proc/<pid>/mountinfo
in userspace.
- Support O_PATH file descriptors with FSCONFIG_SET_FD in the new mount
api
- Support detached mounts in overlayfs
Since last cycle we support specifying overlayfs layers via file
descriptors. However, we don't allow detached mounts which means
userspace cannot user file descriptors received via
open_tree(OPEN_TREE_CLONE) and fsmount() directly. They have to
attach them to a mount namespace via move_mount() first.
This is cumbersome and means they have to undo mounts via umount().
Allow them to directly use detached mounts.
- Allow to retrieve idmappings with statmount
Currently it isn't possible to figure out what idmapping has been
attached to an idmapped mount. Add an extension to statmount() which
allows to read the idmapping from the mount.
- Allow creating idmapped mounts from mounts that are already idmapped
So far it isn't possible to allow the creation of idmapped mounts
from already idmapped mounts as this has significant lifetime
implications. Make the creation of idmapped mounts atomic by allow to
pass struct mount_attr together with the open_tree_attr() system call
allowing to solve these issues without complicating VFS lookup in any
way.
The system call has in general the benefit that creating a detached
mount and applying mount attributes to it becomes an atomic operation
for userspace.
- Add a way to query statmount() for supported options
Allow userspace to query which mount information can be retrieved
through statmount().
- Allow superblock owners to force unmount
* tag 'vfs-6.15-rc1.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (21 commits)
umount: Allow superblock owners to force umount
selftests: add tests for mount notification
selinux: add FILE__WATCH_MOUNTNS
samples/vfs: fix printf format string for size_t
fs: allow changing idmappings
fs: add kflags member to struct mount_kattr
fs: add open_tree_attr()
fs: add copy_mount_setattr() helper
fs: add vfs_open_tree() helper
statmount: add a new supported_mask field
samples/vfs: add STATMOUNT_MNT_{G,U}IDMAP
selftests: add tests for using detached mount with overlayfs
samples/vfs: check whether flag was raised
statmount: allow to retrieve idmappings
uidgid: add map_id_range_up()
fs: allow detached mounts in clone_private_mount()
selftests/overlayfs: test specifying layers as O_PATH file descriptors
fs: support O_PATH fds with FSCONFIG_SET_FD
vfs: add notifications for mount attach and detach
fanotify: notify on mount attach and detach
...
Address the issue of cevt-ds1287.c not including the ds1287.h header
file.
Fix follow errors with gcc-14 when -Werror:
arch/mips/kernel/cevt-ds1287.c:15:5: error: no previous prototype for ‘ds1287_timer_state’ [-Werror=missing-prototypes]
15 | int ds1287_timer_state(void)
| ^~~~~~~~~~~~~~~~~~
arch/mips/kernel/cevt-ds1287.c:20:5: error: no previous prototype for ‘ds1287_set_base_clock’ [-Werror=missing-prototypes]
20 | int ds1287_set_base_clock(unsigned int hz)
| ^~~~~~~~~~~~~~~~~~~~~
arch/mips/kernel/cevt-ds1287.c:103:12: error: no previous prototype for ‘ds1287_clockevent_init’ [-Werror=missing-prototypes]
103 | int __init ds1287_clockevent_init(int irq)
| ^~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
make[7]: *** [scripts/Makefile.build:207: arch/mips/kernel/cevt-ds1287.o] Error 1
make[7]: *** Waiting for unfinished jobs....
make[6]: *** [scripts/Makefile.build:465: arch/mips/kernel] Error 2
make[6]: *** Waiting for unfinished jobs....
Signed-off-by: WangYuli <wangyuli@uniontech.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Some CM3.5 devices incorrectly report that hardware cache
initialization has completed, and also claim to support hardware cache
initialization when they don't actually do so. This commit fixes this
issue by retrieving the correct information from the device tree and
allowing the system to bypass the hardware cache initialization
step. Instead, it relies on manual operation. As a result, multi-user
support is now possible for these CPUs.
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Some information that should be retrieved at runtime for the Coherence
Manager can be either absent or wrong. This patch allows checking if
some of this information is available from the device tree and updates
the internal variable accordingly.
For now, only the compatible string associated with the broken HCI is
being retrieved.
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Probe for & boot CPUs (cores & VPs) in secondary clusters (ie. not the
cluster that began booting Linux) when they are present in systems with
CM 3.5 or higher.
Signed-off-by: Paul Burton <paulburton@kernel.org>
Signed-off-by: Chao-ying Fu <cfu@wavecomp.com>
Signed-off-by: Dragan Mladjenovic <dragan.mladjenovic@syrmia.com>
Signed-off-by: Aleksandar Rikalo <arikalo@gmail.com>
Tested-by: Serge Semin <fancer.lancer@gmail.com>
Tested-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
In preparation for supporting multi-cluster systems, introduce a struct
cluster_boot_config as an extra layer in the boot configuration
maintained by the MIPS Coherent Processing System (CPS) SMP
implementation. For now only one struct cluster_boot_config will be
allocated & we'll simply defererence its core_config field to find the
struct core_boot_config array which can be used to boot as usual.
Signed-off-by: Paul Burton <paulburton@kernel.org>
Signed-off-by: Dragan Mladjenovic <dragan.mladjenovic@syrmia.com>
Signed-off-by: Aleksandar Rikalo <arikalo@gmail.com>
Tested-by: Serge Semin <fancer.lancer@gmail.com>
Tested-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
The pm-cps code has up until now used per-CPU variables indexed by core,
rather than CPU number, in order to share data amongst sibling CPUs (ie.
VPs/threads in a core). This works fine for single cluster systems, but
with multi-cluster systems a core number is no longer unique in the
system, leading to sharing between CPUs that are not actually siblings.
Avoid this issue by using per-CPU variables as they are more generally
used - ie. access them using CPU numbers rather than core numbers.
Sharing between siblings is then accomplished by:
- Assigning the same pointer to entries for each sibling CPU for the
nc_asm_enter & ready_count variables, which allow this by virtue of
being per-CPU pointers.
- Indexing by the first CPU set in a CPUs cpu_sibling_map in the case
of pm_barrier, for which we can't use the previous approach because
the per-CPU variable is not a pointer.
Signed-off-by: Paul Burton <paulburton@kernel.org>
Signed-off-by: Dragan Mladjenovic <dragan.mladjenovic@syrmia.com>
Signed-off-by: Aleksandar Rikalo <arikalo@gmail.com>
Tested-by: Serge Semin <fancer.lancer@gmail.com>
Tested-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
The generic storage implementation provides the same features as the
custom one. However it can be shared between architectures, making
maintenance easier.
Co-developed-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-13-13a4669dfc8c@linutronix.de
We have several places across the kernel where we want to access another
task's syscall arguments, such as ptrace(2), seccomp(2), etc., by making
a call to syscall_get_arguments().
This works for register arguments right away by accessing the task's
`regs' member of `struct pt_regs', however for stack arguments seen with
32-bit/o32 kernels things are more complicated. Technically they ought
to be obtained from the user stack with calls to an access_remote_vm(),
but we have an easier way available already.
So as to be able to access syscall stack arguments as regular function
arguments following the MIPS calling convention we copy them over from
the user stack to the kernel stack in arch/mips/kernel/scall32-o32.S, in
handle_sys(), to the current stack frame's outgoing argument space at
the top of the stack, which is where the handler called expects to see
its incoming arguments. This area is also pointed at by the `pt_regs'
pointer obtained by task_pt_regs().
Make the o32 stack argument space a proper member of `struct pt_regs'
then, by renaming the existing member from `pad0' to `args' and using
generated offsets to access the space. No functional change though.
With the change in place the o32 kernel stack frame layout at the entry
to a syscall handler invoked by handle_sys() is therefore as follows:
$sp + 68 -> | ... | <- pt_regs.regs[9]
+---------------------+
$sp + 64 -> | $t0 | <- pt_regs.regs[8]
+---------------------+
$sp + 60 -> | $a3/argument #4 | <- pt_regs.regs[7]
+---------------------+
$sp + 56 -> | $a2/argument #3 | <- pt_regs.regs[6]
+---------------------+
$sp + 52 -> | $a1/argument #2 | <- pt_regs.regs[5]
+---------------------+
$sp + 48 -> | $a0/argument #1 | <- pt_regs.regs[4]
+---------------------+
$sp + 44 -> | $v1 | <- pt_regs.regs[3]
+---------------------+
$sp + 40 -> | $v0 | <- pt_regs.regs[2]
+---------------------+
$sp + 36 -> | $at | <- pt_regs.regs[1]
+---------------------+
$sp + 32 -> | $zero | <- pt_regs.regs[0]
+---------------------+
$sp + 28 -> | stack argument #8 | <- pt_regs.args[7]
+---------------------+
$sp + 24 -> | stack argument #7 | <- pt_regs.args[6]
+---------------------+
$sp + 20 -> | stack argument #6 | <- pt_regs.args[5]
+---------------------+
$sp + 16 -> | stack argument #5 | <- pt_regs.args[4]
+---------------------+
$sp + 12 -> | psABI space for $a3 | <- pt_regs.args[3]
+---------------------+
$sp + 8 -> | psABI space for $a2 | <- pt_regs.args[2]
+---------------------+
$sp + 4 -> | psABI space for $a1 | <- pt_regs.args[1]
+---------------------+
$sp + 0 -> | psABI space for $a0 | <- pt_regs.args[0]
+---------------------+
holding user data received and with the first 4 frame slots reserved by
the psABI for the compiler to spill the incoming arguments from $a0-$a3
registers (which it sometimes does according to its needs) and the next
4 frame slots designated by the psABI for any stack function arguments
that follow. This data is also available for other tasks to peek/poke
at as reqired and where permitted.
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Add open_tree_attr() which allow to atomically create a detached mount
tree and set mount options on it. If OPEN_TREE_CLONE is used this will
allow the creation of a detached mount with a new set of mount options
without it ever being exposed to userspace without that set of mount
options applied.
Link: https://lore.kernel.org/r/20250128-work-mnt_idmap-update-v2-v1-3-c25feb0d2eb3@kernel.org
Reviewed-by: "Seth Forshee (DigitalOcean)" <sforshee@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
arch/mips/Kconfig selects HAVE_ARCH_SECCOMP_FILTER so syscall_trace_enter()
can just use __secure_computing(NULL) and rely on populate_seccomp_data(sd)
and "sd == NULL" checks in __secure_computing(sd) paths.
With the change above syscall_trace_enter() can just use secure_computing()
and avoid #ifdef + test_thread_flag(TIF_SECCOMP). CONFIG_GENERIC_ENTRY is
not defined, so test_syscall_work(SECCOMP) will check TIF_SECCOMP.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20250128150300.GA15318@redhat.com
Signed-off-by: Kees Cook <kees@kernel.org>
This reverts commit bc7584e009.
The split IPC system calls for o32 have been introduced with modern
version only. Changing this breaks ABI.
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
-----BEGIN PGP SIGNATURE-----
iQJOBAABCAA4FiEEbt46xwy6kEcDOXoUeZbBVTGwZHAFAmeXoDUaHHRzYm9nZW5k
QGFscGhhLmZyYW5rZW4uZGUACgkQeZbBVTGwZHAorw/8CaBkXlyYNQzlSD6WK9Si
yfZAhKp3BdZWUfImlRJFQkAWaORz3LWtMg6T5odQpara3D5tvUTZ6leXDsVjwctA
pZXv1xiAg9SymQQiWLdHawGG6xWCG+Z0eI8vQrXjPp+6fI6c7pq6AfDjW+TQ9dnl
0cZEPUItgj0TBv8dSs2hKyHRotU7Fn2sagWVhMy5KeLnp8qK1rq/aZaFehNXIuCo
bslcsoT1Sex6YwCawST94f+kWAzDOigp/M32+7gQJXDQljbjTCXBtWkpoP0QRl55
bcdiTB1sSvLBlCurFI/0ktvxm5lfyjmMmvvWPYgNW50v3QROr9xWyjkRWhXtSBZk
M8MozS9+TH51QtPOf+uIkWY5KkIAJILw85eVtC5gP8L980qvUkBhREm36m1Vy3NT
xUTRKuaeBm7FNu2lD3wj16F5K4YdtE+Zwr+luRnxTWpdeauMoDkyzW8QIitO8CEw
2gedxeivvs+kAufkodmv+heC6Nk/fkoYGm5AfHeCIdpwFXLV/gtCP9PvhiUiE+Le
tKShGvFOB8hj0UlS7z6XPlInXvwuqZmKhRc214ymKTWmNLYDKutRo++GlNZALzCV
ts47pcw76ktkC8ptYkzxOXBWwpR1NOdGvAgV9Px6N/211Ww6wRn2BPb6oToSebE0
ZERE04cQGnm16i06qwD38Hc=
=IdU/
-----END PGP SIGNATURE-----
Merge tag 'mips_6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS updates from Thomas Bogendoerfer:
"Cleanups and fixes"
* tag 'mips_6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: pci-legacy: Override pci_address_to_pio
MIPS: Loongson64: env: Use str_on_off() helper in prom_lefi_init_env()
MIPS: migrate to generic rule for built-in DTBs
mips: fix shmctl/semctl/msgctl syscall for o32
mips/math-emu: fix emulation of the prefx instruction
MIPS: Loongson: Add comments for interface_info
MIPS: Loongson64: remove ROM Size unit in boardinfo
MIPS: traps: Use str_enabled_disabled() in parity_protection_init()
MIPS: ftrace: Declare ftrace_get_parent_ra_addr() as static
Revert "MIPS: csrc-r4k: Select HAVE_UNSTABLE_SCHED_CLOCK if SMP && 64BIT"
MIPS: Fix the wrong format specifier
MIPS: Add a blank line after __HEAD
MIPS: kernel: Rename read/write_c0_ecc to read/writec0_errctl
Before SLUB initialization, various subsystems used memblock_alloc to
allocate memory. In most cases, when memory allocation fails, an
immediate panic is required. To simplify this behavior and reduce
repetitive checks, introduce `memblock_alloc_or_panic`. This function
ensures that memory allocation failures result in a panic automatically,
improving code readability and consistency across subsystems that require
this behavior.
[guoweikang.kernel@gmail.com: arch/s390: save_area_alloc default failure behavior changed to panic]
Link: https://lkml.kernel.org/r/20250109033136.2845676-1-guoweikang.kernel@gmail.com
Link: https://lore.kernel.org/lkml/Z2fknmnNtiZbCc7x@kernel.org/
Link: https://lkml.kernel.org/r/20250102072528.650926-1-guoweikang.kernel@gmail.com
Signed-off-by: Guo Weikang <guoweikang.kernel@gmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> [s390]
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "mm: update mips to use do_mmap(), make mmap_region()
internal".
Currently the only user of mmap_region() outside of the memory management
code is the MIPS VDSO implementation.
This uses mmap_region() to map a 'delay slot emulation page' at the top of
the stack which is read-only and executable.
This mapping requires that an already-acquired mmap write lock is utilised
and that uffd and populate logic is ignored. This rules out vm_mmap(),
however do_mmap() fits the bill.
Adapt this code to use do_mmap() and then once done, make mmap_region()
internal and userland testable, and avoid any other uses of mmap_region(),
which is absolutely and strictly an internal mm function which bypasses a
great number of checks and logic.
This patch (of 2):
mmap_region() is an internal memory management implementation detail that
is not intended to be used outside of the memory management subsystem.
Map the delay slot emulation page using do_mmap() which makes use of the
already-held mmap write lock and bypasses unneeded populate and
userfaultfd logic.
This should have the precise same behaviour as the existing logic.
Link: https://lkml.kernel.org/r/cover.1735819274.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/ef076e381570f709e5c2c142dc030ec5b3309a0e.1735819274.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Jann Horn <jannh@google.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The commit 275f22148e ("ipc: rename old-style shmctl/semctl/msgctl
syscalls") switched various architectures to use sys_old_*ctl() with
ipc_parse_version, including mips n32/n64. However, for mips o32, commit
0d6040d468 ("arch: add split IPC system calls where needed") added
separate IPC syscalls without properly using the old-style handlers.
This causes applications using uClibc-ng to fail with -EINVAL when
calling semctl/shmctl/msgctl with IPC_64 flag, as uClibc-ng uses the
syscall numbers from kernel headers to determine whether to use the IPC
multiplexer or split syscalls. In contrast, glibc is unaffected as it
uses a unified feature test macro __ASSUME_DIRECT_SYSVIPC_SYSCALLS
(disabled for mips-o32) to make this decision.
Fix this by switching the o32 ABI entries for semctl, shmctl and msgctl
to use the old-style handlers, matching the behavior of other
architectures and fixing compatibility with uClibc-ng.
Signed-off-by: Ism Hong <ism.hong@gmail.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Remove hard-coded strings by using the str_enabled_disabled() helper
function.
Use pr_info() instead of printk(KERN_INFO) to silence multiple
checkpatch warnings.
Suggested-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Declare ftrace_get_parent_ra_addr() as static to suppress clang
compiler warning that 'no previous prototype'. This function is
not intended to be called from other parts.
Fix follow error with clang-19:
arch/mips/kernel/ftrace.c:251:15: error: no previous prototype for function 'ftrace_get_parent_ra_addr' [-Werror,-Wmissing-prototypes]
251 | unsigned long ftrace_get_parent_ra_addr(unsigned long self_ra, unsigned long
| ^
arch/mips/kernel/ftrace.c:251:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
251 | unsigned long ftrace_get_parent_ra_addr(unsigned long self_ra, unsigned long
| ^
| static
1 error generated.
Signed-off-by: WangYuli <wangyuli@uniontech.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Make a minor change to eliminate a static checker warning. The type
of cpu is unsigned int, so the correct format specifier should be
%u instead of %d.
Signed-off-by: liujing <liujing@cmss.chinamobile.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>