Commit Graph

388 Commits

Author SHA1 Message Date
Linus Torvalds
a530a36bb5 Kbuild updates for v6.17
- Fix a shortcut key issue in menuconfig
 
  - Fix missing rebuild of kheaders
 
  - Sort the symbol dump generated by gendwarfsyms
 
  - Support zboot extraction in scripts/extract-vmlinux
 
  - Migrate gconfig to GTK 3
 
  - Add TAR variable to allow overriding the default tar command
 
  - Hand over Kbuild maintainership
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmiSr38VHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGcTIP/RCVr/OEJgVXg8dOBNhNuhGoidnM
 2uqRcaza68tOSegpFGcfd9bhO1TCR/O5SL117TS8UEx4f9ge7gk/+XVZC8i1268m
 9+V6eJd8QI34nB/EezMnrhvFmn2kC0UMuSldZYQJ2cReLjtapBP2xBtWnxi+Zyyw
 +FjdHwQln7E8UaB/gMqh9KVVOytX4NIUUZEA/78nd4eJaJbLxJ/5ztAxGLB//bXI
 Rr6bjAeOmIfRWS9QWnGzNzHmzp4SSmU+/gdLXyaWlmoVjeut8O+BJXvQRNfswk8K
 JXmk6uZUx6CNheCca2RaM0i6XAArkpOQc/7v7Ul/rSriTxdxAVUghjk0fNrXJGvQ
 kBjewOTUXg8f4xhuPAL3nkWmCh0s0tV3Q0EneAaJuUck5yRkW0bxxKa74h6ji2q8
 8RsNS5Mq0yYiR1gmCqhEmTN6/wDamSpGkHT0k6ukipixjCr5DP2QFP/xT3d7qSwc
 W6msliXgBmY/ZrGoJXy4zjmu5vxKfAes+7JLBDxUKjdItC818qwSPf+nbvVIdJqb
 K4/hcNDuYPKd/8N8YapPjbGfTBk9Xqp74ez4xg5XNIBPS+cE5k/mePletbCzkzC5
 vzjoNgVUmpPGPdDaBk+S4jSEzWUi575aQx590OfdoXsBt4CQVHHk4PEM1Qh5cWnZ
 m6tx1oqfruovU6Gx
 =MtyJ
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v6.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:
 "This is the last pull request from me.

  I'm grateful to have been able to continue as a maintainer for eight
  years. From the next cycle, Nathan and Nicolas will maintain Kbuild.

   - Fix a shortcut key issue in menuconfig

   - Fix missing rebuild of kheaders

   - Sort the symbol dump generated by gendwarfsyms

   - Support zboot extraction in scripts/extract-vmlinux

   - Migrate gconfig to GTK 3

   - Add TAR variable to allow overriding the default tar command

   - Hand over Kbuild maintainership"

* tag 'kbuild-v6.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (92 commits)
  MAINTAINERS: hand over Kbuild maintenance
  kheaders: make it possible to override TAR
  kbuild: userprogs: use correct linker when mixing clang and GNU ld
  kconfig: lxdialog: replace strcpy() with strncpy() in inputbox.c
  kconfig: lxdialog: replace strcpy with snprintf in print_autowrap
  kconfig: gconf: refactor text_insert_help()
  kconfig: gconf: remove unneeded variable in text_insert_msg
  kconfig: gconf: use hyphens in signals
  kconfig: gconf: replace GtkImageMenuItem with GtkMenuItem
  kconfig: gconf: Fix Back button behavior
  kconfig: gconf: fix single view to display dependent symbols correctly
  scripts: add zboot support to extract-vmlinux
  gendwarfksyms: order -T symtypes output by name
  gendwarfksyms: use preferred form of sizeof for allocation
  kconfig: qconf: confine {begin,end}Group to constructor and destructor
  kconfig: qconf: fix ConfigList::updateListAllforAll()
  kconfig: add a function to dump all menu entries in a tree-like format
  kconfig: gconf: show GTK version in About dialog
  kconfig: gconf: replace GtkHPaned and GtkVPaned with GtkPaned
  kconfig: gconf: replace GdkColor with GdkRGBA
  ...
2025-08-06 07:32:52 +03:00
Linus Torvalds
c6439bfaab Deferred unwind changes for 6.17
This is the core infrastructure for the deferred unwinder that is required
 for sframes[1]. Several other patch series is based on this work although
 those patch series are not dependent on each other. In order to simplify the
 development, having this core series upstream will allow the other series to
 be worked on in parallel. The other series are:
 
 - The two patches to implement x86:
   https://lore.kernel.org/linux-trace-kernel/20250717004958.260781923@kernel.org/
   https://lore.kernel.org/linux-trace-kernel/20250717004958.432327787@kernel.org/
 
 - The s390 work:
   https://lore.kernel.org/linux-trace-kernel/20250710163522.3195293-1-jremus@linux.ibm.com/
 
 - The perf work:
   https://lore.kernel.org/linux-trace-kernel/20250718164119.089692174@kernel.org/
 
 - The ftrace work:
   https://lore.kernel.org/linux-trace-kernel/20250424192612.505622711@goodmis.org/
 
 - The sframe work:
   https://lore.kernel.org/linux-trace-kernel/20250717012848.927473176@kernel.org/
 
 And more is on the way.
 
 The core infrastructure adds the following in kernel APIs:
 
 - int unwind_user_faultable(struct unwind_stacktrace *trace);
 
     Performs a user space stack trace that may fault user pages in.
 
 - int unwind_deferred_init(struct unwind_work *work, unwind_callback_t func);
 
     Allows a tracer to register with the unwind deferred infrastructure.
 
 - int unwind_deferred_request(struct unwind_work *work, u64 *cookie);
 
     Used when a tracer request a deferred trace. Can be called from interrupt
     or NMI context.
 
 - void unwind_deferred_cancel(struct unwind_work *work);
 
     Called by a tracer to unregister from the deferred unwind infrastructure.
 
 - void unwind_deferred_task_exit(struct task_struct *task);
 
     Called by task exit code to flush any pending unwind requests.
 
 - void unwind_task_init(struct task_struct *task);
 
     Called by do_fork() to initialize the task struct for the deferred
     unwinder.
 
 - void unwind_task_free(struct task_struct *task);
 
     Called by do_exit() to free up any resources used by the deferred
     unwinder.
 
 None of the above is actually compiled unless an architecture enables it,
 which none currently do.
 
 [1] https://sourceware.org/binutils/wiki/sframe
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYKADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCaIt9IhQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qqqzAQCMT/6qmSq7O746JF0MuGC6fTZnSbAc
 XGz4JigEqLTRewEA2kaJmD7PBsSRzFdiK2gvyKn95l+PZyWtE9MjTsqeSAc=
 =Lsbm
 -----END PGP SIGNATURE-----

Merge tag 'trace-deferred-unwind-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull initial deferred unwind infrastructure from Steven Rostedt:
 "This is the core infrastructure for the deferred unwinder that is
  required for sframes[1]. Several other patch series are based on this
  work although those patch series are not dependent on each other. In
  order to simplify the development, having this core series upstream
  will allow the other series to be worked on in parallel. The other
  series are:

    - The two patches to implement x86 support [2] [3]

    - The s390 work [4]

    - The perf work [5]

    - The ftrace work [6]

    - The sframe work [7]

  And more is on the way.

  The core infrastructure adds the following in kernel APIs:

    - int unwind_user_faultable(struct unwind_stacktrace *trace);

        Performs a user space stack trace that may fault user pages in.

    - int unwind_deferred_init(struct unwind_work *work, unwind_callback_t func);

        Allows a tracer to register with the unwind deferred
        infrastructure.

    - int unwind_deferred_request(struct unwind_work *work, u64 *cookie);

        Used when a tracer request a deferred trace. Can be called from
        interrupt or NMI context.

    - void unwind_deferred_cancel(struct unwind_work *work);

        Called by a tracer to unregister from the deferred unwind
        infrastructure.

    - void unwind_deferred_task_exit(struct task_struct *task);

        Called by task exit code to flush any pending unwind requests.

    - void unwind_task_init(struct task_struct *task);

        Called by do_fork() to initialize the task struct for the
        deferred unwinder.

    - void unwind_task_free(struct task_struct *task);

        Called by do_exit() to free up any resources used by the
        deferred unwinder.

    None of the above is actually compiled unless an architecture enables it,
    which none currently do"

Link: https://sourceware.org/binutils/wiki/sframe [1]
Link: https://lore.kernel.org/linux-trace-kernel/20250717004958.260781923@kernel.org/ [2]
Link: https://lore.kernel.org/linux-trace-kernel/20250717004958.432327787@kernel.org/ [3]
Link: https://lore.kernel.org/linux-trace-kernel/20250710163522.3195293-1-jremus@linux.ibm.com/ [4]
Link: https://lore.kernel.org/linux-trace-kernel/20250718164119.089692174@kernel.org/ [5]
Link: https://lore.kernel.org/linux-trace-kernel/20250424192612.505622711@goodmis.org/ [6]
Link: https://lore.kernel.org/linux-trace-kernel/20250717012848.927473176@kernel.org/ [7]

* tag 'trace-deferred-unwind-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  unwind: Finish up unwind when a task exits
  unwind deferred: Use SRCU unwind_deferred_task_work()
  unwind: Add USED bit to only have one conditional on way back to user space
  unwind deferred: Add unwind_completed mask to stop spurious callbacks
  unwind deferred: Use bitmask to determine which callbacks to call
  unwind_user/deferred: Make unwind deferral requests NMI-safe
  unwind_user/deferred: Add deferred unwinding interface
  unwind_user/deferred: Add unwind cache
  unwind_user/deferred: Add unwind_user_faultable()
  unwind_user: Add user space unwinding API with frame pointer support
2025-08-01 09:46:24 -07:00
Linus Torvalds
d9104cec3e bpf-next-6.17
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmiINnEACgkQ6rmadz2v
 bToBnA/9F+A3R6rTwGk4HK3xpfc/nm2Tanl3oRN7S2ub/mskDOtWSIyG6cVFZ0UG
 1fK6IkByyRIpAF/5qhdlw8drRXHkQtGLA0lP2L9llm4X1mHLofB18y9OeLrDE1WN
 KwNP06+IGX9W802lCGSIXOY+VmRscVfXSMokyQt2ilHplKjOnDqJcYkWupi3T2rC
 mz79FY9aEl2YrIcpj9RXz+8nwP49pZBuW2P0IM5PAIj4BJBXShrUp8T1nz94okNe
 NFsnAyRxjWpUT0McEgtA9WvpD9lZqujYD8Qp0KlGZWmI3vNpV5d9S1+dBcEb1n7q
 dyNMkTF3oRrJhhg4VqoHc6fVpzSEoZ9ZxV5Hx4cs+ganH75D4YbdGqx/7mR3DUgH
 MZh6rHF1pGnK7TAm7h5gl3ZRAOkZOaahbe1i01NKo9CEe5fSh3AqMyzJYoyGHRKi
 xDN39eQdWBNA+hm1VkbK2Bv93Rbjrka2Kj+D3sSSO9Bo/u3ntcknr7LW39idKz62
 Q8dkKHcCEtun7gjk0YXPF013y81nEohj1C+52BmJ2l5JitM57xfr6YOaQpu7DPDE
 AJbHx6ASxKdyEETecd0b+cXUPQ349zmRXy0+CDMAGKpBicC0H0mHhL14cwOY1Hfu
 EIpIjmIJGI3JNF6T5kybcQGSBOYebdV0FFgwSllzPvuYt7YsHCs=
 =/O3j
 -----END PGP SIGNATURE-----

Merge tag 'bpf-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Pull bpf updates from Alexei Starovoitov:

 - Remove usermode driver (UMD) framework (Thomas Weißschuh)

 - Introduce Strongly Connected Component (SCC) in the verifier to
   detect loops and refine register liveness (Eduard Zingerman)

 - Allow 'void *' cast using bpf_rdonly_cast() and corresponding
   '__arg_untrusted' for global function parameters (Eduard Zingerman)

 - Improve precision for BPF_ADD and BPF_SUB operations in the verifier
   (Harishankar Vishwanathan)

 - Teach the verifier that constant pointer to a map cannot be NULL
   (Ihor Solodrai)

 - Introduce BPF streams for error reporting of various conditions
   detected by BPF runtime (Kumar Kartikeya Dwivedi)

 - Teach the verifier to insert runtime speculation barrier (lfence on
   x86) to mitigate speculative execution instead of rejecting the
   programs (Luis Gerhorst)

 - Various improvements for 'veristat' (Mykyta Yatsenko)

 - For CONFIG_DEBUG_KERNEL config warn on internal verifier errors to
   improve bug detection by syzbot (Paul Chaignon)

 - Support BPF private stack on arm64 (Puranjay Mohan)

 - Introduce bpf_cgroup_read_xattr() kfunc to read xattr of cgroup's
   node (Song Liu)

 - Introduce kfuncs for read-only string opreations (Viktor Malik)

 - Implement show_fdinfo() for bpf_links (Tao Chen)

 - Reduce verifier's stack consumption (Yonghong Song)

 - Implement mprog API for cgroup-bpf programs (Yonghong Song)

* tag 'bpf-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (192 commits)
  selftests/bpf: Migrate fexit_noreturns case into tracing_failure test suite
  selftests/bpf: Add selftest for attaching tracing programs to functions in deny list
  bpf: Add log for attaching tracing programs to functions in deny list
  bpf: Show precise rejected function when attaching fexit/fmod_ret to __noreturn functions
  bpf: Fix various typos in verifier.c comments
  bpf: Add third round of bounds deduction
  selftests/bpf: Test invariants on JSLT crossing sign
  selftests/bpf: Test cross-sign 64bits range refinement
  selftests/bpf: Update reg_bound range refinement logic
  bpf: Improve bounds when s64 crosses sign boundary
  bpf: Simplify bounds refinement from s32
  selftests/bpf: Enable private stack tests for arm64
  bpf, arm64: JIT support for private stack
  bpf: Move bpf_jit_get_prog_name() to core.c
  bpf, arm64: Fix fp initialization for exception boundary
  umd: Remove usermode driver framework
  bpf/preload: Don't select USERMODE_DRIVER
  selftests/bpf: Fix test dynptr/test_dynptr_memset_xdp_chunks failure
  selftests/bpf: Fix test dynptr/test_dynptr_copy_xdp failure
  selftests/bpf: Increase xdp data size for arm64 64K page size
  ...
2025-07-30 09:58:50 -07:00
Josh Poimboeuf
71753c6ed2 unwind_user: Add user space unwinding API with frame pointer support
Introduce a generic API for unwinding user stacks.

In order to expand user space unwinding to be able to handle more complex
scenarios, such as deferred unwinding and reading user space information,
create a generic interface that all architectures can use that support the
various unwinding methods.

This is an alternative method for handling user space stack traces from
the simple stack_trace_save_user() API. This does not replace that
interface, but this interface will be used to expand the functionality of
user space stack walking.

None of the structures introduced will be exposed to user space tooling.

Support for frame pointer unwinding is added. For an architecture to
support frame pointer unwinding it needs to enable
CONFIG_HAVE_UNWIND_USER_FP and define ARCH_INIT_USER_FP_FRAME.

By encoding the frame offsets in struct unwind_user_frame, much of this
code can also be reused for future unwinder implementations like sframe.

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Indu Bhagat <indu.bhagat@oracle.com>
Cc: "Jose E. Marchesi" <jemarch@gnu.org>
Cc: Beau Belgrave <beaub@linux.microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Sam James <sam@gentoo.org>
Link: https://lore.kernel.org/20250729182404.975790139@kernel.org
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Co-developed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/all/20250710164301.3094-2-mathieu.desnoyers@efficios.com/
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Co-developed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-29 14:46:07 -04:00
Kees Cook
6676fd3c99 kstack_erase: Add -mgeneral-regs-only to silence Clang warnings
Once CONFIG_KSTACK_ERASE is enabled with Clang on i386, the build warns:

  kernel/kstack_erase.c:168:2: warning: function with attribute 'no_caller_saved_registers' should only call a function with attribute 'no_caller_saved_registers' or be compiled with '-mgeneral-regs-only' [-Wexcessive-regsave]

Add -mgeneral-regs-only for the kstack_erase handler, to make Clang feel
better (it is effectively a no-op flag for the kernel). No binary
changes encountered.

Build & boot tested with Clang 21 on x86_64, and i386.
Build tested with GCC 14.2.0 on x86_64, i386, arm64, and arm.

Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: https://lore.kernel.org/all/20250726004313.GA3650901@ax162
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-26 14:28:35 -07:00
Thomas Weißschuh
b7b3500bd4 umd: Remove usermode driver framework
The code is unused since 98e20e5e13 ("bpfilter: remove bpfilter"),
therefore remove it.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Link: https://lore.kernel.org/bpf/20250721-remove-usermode-driver-v1-2-0d0083334382@linutronix.de
2025-07-26 21:03:04 +02:00
Kees Cook
57fbad15c2 stackleak: Rename STACKLEAK to KSTACK_ERASE
In preparation for adding Clang sanitizer coverage stack depth tracking
that can support stack depth callbacks:

- Add the new top-level CONFIG_KSTACK_ERASE option which will be
  implemented either with the stackleak GCC plugin, or with the Clang
  stack depth callback support.
- Rename CONFIG_GCC_PLUGIN_STACKLEAK as needed to CONFIG_KSTACK_ERASE,
  but keep it for anything specific to the GCC plugin itself.
- Rename all exposed "STACKLEAK" names and files to "KSTACK_ERASE" (named
  for what it does rather than what it protects against), but leave as
  many of the internals alone as possible to avoid even more churn.

While here, also split "prev_lowest_stack" into CONFIG_KSTACK_ERASE_METRICS,
since that's the only place it is referenced from.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250717232519.2984886-1-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-21 21:35:01 -07:00
Masahiro Yamada
1a0faff283 kheaders: rebuild kheaders_data.tar.xz when KBUILD_BUILD_TIMESTAMP is changed
This problem is similar to commit 7f8256ae0e ("initramfs: Encode
dependency on KBUILD_BUILD_TIMESTAMP"): kernel/gen_kheaders.sh has an
internal dependency on KBUILD_BUILD_TIMESTAMP that is not exposed to
make, so changing KBUILD_BUILD_TIMESTAMP will not trigger a rebuild
of the archive.

Move $(KBUILD_BUILD_TIMESTAMP) to the Makefile so that is is recorded
in the *.cmd file.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-06-24 20:30:37 +09:00
Masahiro Yamada
626c54af35 kheaders: rebuild kheaders_data.tar.xz when a file is modified within a minute
When a header file is changed, kernel/gen_kheaders.sh may fail to update
kernel/kheaders_data.tar.xz.

[steps to reproduce]

[1] Build kernel/kheaders_data.tar.xz

  $ make -j$(nproc) kernel/kheaders.o
    DESCEND objtool
    INSTALL libsubcmd_headers
    CALL    scripts/checksyscalls.sh
    CHK     kernel/kheaders_data.tar.xz
    GEN     kernel/kheaders_data.tar.xz
    CC      kernel/kheaders.o

[2] Modify a header without changing the file size

  $ sed -i s/0xdeadbeef/0xfeedbeef/ include/linux/elfnote.h

[3] Rebuild kernel/kheaders_data.tar.xz

  $ make -j$(nproc) kernel/kheaders.o
    DESCEND objtool
    INSTALL libsubcmd_headers
    CALL    scripts/checksyscalls.sh
    CHK     kernel/kheaders_data.tar.xz

kernel/kheaders_data.tar.xz is not updated if steps [1] - [3] are run
within the same minute.

The headers_md5 variable stores the MD5 hash of the 'ls -l' output
for all header files. This hash value is used to determine whether
kheaders_data.tar.xz needs to be rebuilt. However, 'ls -l' prints the
modification times with minute-level granularity. If a file is modified
within the same minute and its size remains the same, the MD5 hash does
not change.

To reliably detect file modifications, this commit rewrites
kernel/gen_kheaders.sh to output header dependencies to
kernel/.kheaders_data.tar.xz.cmd. Then, Make compares the timestamps
and reruns kernel/gen_kheaders.sh when necessary. This is the standard
mechanism used by Make and Kbuild.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-06-24 20:30:37 +09:00
Linus Torvalds
7d4e49a77d - The 3 patch series "hung_task: extend blocking task stacktrace dump to
semaphore" from Lance Yang enhances the hung task detector.  The
   detector presently dumps the blocking tasks's stack when it is blocked
   on a mutex.  Lance's series extends this to semaphores.
 
 - The 2 patch series "nilfs2: improve sanity checks in dirty state
   propagation" from Wentao Liang addresses a couple of minor flaws in
   nilfs2.
 
 - The 2 patch series "scripts/gdb: Fixes related to lx_per_cpu()" from
   Illia Ostapyshyn fixes a couple of issues in the gdb scripts.
 
 - The 9 patch series "Support kdump with LUKS encryption by reusing LUKS
   volume keys" from Coiby Xu addresses a usability problem with kdump.
   When the dump device is LUKS-encrypted, the kdump kernel may not have
   the keys to the encrypted filesystem.  A full writeup of this is in the
   series [0/N] cover letter.
 
 - The 2 patch series "sysfs: add counters for lockups and stalls" from
   Max Kellermann adds /sys/kernel/hardlockup_count and
   /sys/kernel/hardlockup_count and /sys/kernel/rcu_stall_count.
 
 - The 3 patch series "fork: Page operation cleanups in the fork code"
   from Pasha Tatashin implements a number of code cleanups in fork.c.
 
 - The 3 patch series "scripts/gdb/symbols: determine KASLR offset on
   s390 during early boot" from Ilya Leoshkevich fixes some s390 issues in
   the gdb scripts.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaDuCvQAKCRDdBJ7gKXxA
 jrkxAQCnFAp/uK9ckkbN4nfpJ0+OMY36C+A+dawSDtuRsIkXBAEAq3e6MNAUdg5W
 Ca0cXdgSIq1Op7ZKEA+66Km6Rfvfow8=
 =g45L
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2025-05-31-15-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - "hung_task: extend blocking task stacktrace dump to semaphore" from
   Lance Yang enhances the hung task detector.

   The detector presently dumps the blocking tasks's stack when it is
   blocked on a mutex. Lance's series extends this to semaphores

 - "nilfs2: improve sanity checks in dirty state propagation" from
   Wentao Liang addresses a couple of minor flaws in nilfs2

 - "scripts/gdb: Fixes related to lx_per_cpu()" from Illia Ostapyshyn
   fixes a couple of issues in the gdb scripts

 - "Support kdump with LUKS encryption by reusing LUKS volume keys" from
   Coiby Xu addresses a usability problem with kdump.

   When the dump device is LUKS-encrypted, the kdump kernel may not have
   the keys to the encrypted filesystem. A full writeup of this is in
   the series [0/N] cover letter

 - "sysfs: add counters for lockups and stalls" from Max Kellermann adds
   /sys/kernel/hardlockup_count and /sys/kernel/hardlockup_count and
   /sys/kernel/rcu_stall_count

 - "fork: Page operation cleanups in the fork code" from Pasha Tatashin
   implements a number of code cleanups in fork.c

 - "scripts/gdb/symbols: determine KASLR offset on s390 during early
   boot" from Ilya Leoshkevich fixes some s390 issues in the gdb
   scripts

* tag 'mm-nonmm-stable-2025-05-31-15-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (67 commits)
  llist: make llist_add_batch() a static inline
  delayacct: remove redundant code and adjust indentation
  squashfs: add optional full compressed block caching
  crash_dump, nvme: select CONFIGFS_FS as built-in
  scripts/gdb/symbols: determine KASLR offset on s390 during early boot
  scripts/gdb/symbols: factor out pagination_off()
  scripts/gdb/symbols: factor out get_vmlinux()
  kernel/panic.c: format kernel-doc comments
  mailmap: update and consolidate Casey Connolly's name and email
  nilfs2: remove wbc->for_reclaim handling
  fork: define a local GFP_VMAP_STACK
  fork: check charging success before zeroing stack
  fork: clean-up naming of vm_stack/vm_struct variables in vmap stacks code
  fork: clean-up ifdef logic around stack allocation
  kernel/rcu/tree_stall: add /sys/kernel/rcu_stall_count
  kernel/watchdog: add /sys/kernel/{hard,soft}lockup_count
  x86/crash: make the page that stores the dm crypt keys inaccessible
  x86/crash: pass dm crypt keys to kdump kernel
  Revert "x86/mm: Remove unused __set_memory_prot()"
  crash_dump: retrieve dm crypt keys in kdump kernel
  ...
2025-05-31 19:12:53 -07:00
Coiby Xu
180cf31af7 crash_dump: make dm crypt keys persist for the kdump kernel
A configfs /sys/kernel/config/crash_dm_crypt_keys is provided for user
space to make the dm crypt keys persist for the kdump kernel.  Take the
case of dumping to a LUKS-encrypted target as an example, here is the life
cycle of the kdump copies of LUKS volume keys,

 1. After the 1st kernel loads the initramfs during boot, systemd uses
    an user-input passphrase to de-crypt the LUKS volume keys or simply
    TPM-sealed volume keys and then save the volume keys to specified
    keyring (using the --link-vk-to-keyring API) and the keys will expire
    within specified time.

 2. A user space tool (kdump initramfs loader like kdump-utils) create
    key items inside /sys/kernel/config/crash_dm_crypt_keys to inform
    the 1st kernel which keys are needed.

 3. When the kdump initramfs is loaded by the kexec_file_load
    syscall, the 1st kernel will iterate created key items, save the
    keys to kdump reserved memory.

 4. When the 1st kernel crashes and the kdump initramfs is booted, the
    kdump initramfs asks the kdump kernel to create a user key using the
    key stored in kdump reserved memory by writing yes to
    /sys/kernel/crash_dm_crypt_keys/restore. Then the LUKS encrypted
    device is unlocked with libcryptsetup's --volume-key-keyring API.

 5. The system gets rebooted to the 1st kernel after dumping vmcore to
    the LUKS encrypted device is finished

Eventually the keys have to stay in the kdump reserved memory for the
kdump kernel to unlock encrypted volumes.  During this process, some
measures like letting the keys expire within specified time are desirable
to reduce security risk.

This patch assumes,
1) there are 128 LUKS devices at maximum to be unlocked thus
   MAX_KEY_NUM=128.

2) a key description won't exceed 128 bytes thus KEY_DESC_MAX_LEN=128.

And here is a demo on how to interact with
/sys/kernel/config/crash_dm_crypt_keys,

    # Add key #1
    mkdir /sys/kernel/config/crash_dm_crypt_keys/7d26b7b4-e342-4d2d-b660-7426b0996720
    # Add key #1's description
    echo cryptsetup:7d26b7b4-e342-4d2d-b660-7426b0996720 > /sys/kernel/config/crash_dm_crypt_keys/description

    # how many keys do we have now?
    cat /sys/kernel/config/crash_dm_crypt_keys/count
    1

    # Add key# 2 in the same way

    # how many keys do we have now?
    cat /sys/kernel/config/crash_dm_crypt_keys/count
    2

    # the tree structure of /crash_dm_crypt_keys configfs
    tree /sys/kernel/config/crash_dm_crypt_keys/
    /sys/kernel/config/crash_dm_crypt_keys/
    ├── 7d26b7b4-e342-4d2d-b660-7426b0996720
    │   └── description
    ├── count
    ├── fce2cd38-4d59-4317-8ce2-1fd24d52c46a
    │   └── description

Link: https://lkml.kernel.org/r/20250502011246.99238-3-coxu@redhat.com
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: "Daniel P. Berrange" <berrange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Jan Pazdziora <jpazdziora@redhat.com>
Cc: Liu Pingfan <kernelfans@gmail.com>
Cc: Milan Broz <gmazyland@gmail.com>
Cc: Ondrej Kozina <okozina@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-21 10:48:20 -07:00
Alexander Graf
3dc92c3114 kexec: add Kexec HandOver (KHO) generation helpers
Add the infrastructure to generate Kexec HandOver metadata.  Kexec
HandOver is a mechanism that allows Linux to preserve state - arbitrary
properties as well as memory locations - across kexec.

It does so using 2 concepts:

  1) KHO FDT - Every KHO kexec carries a KHO specific flattened device tree
     blob that describes preserved memory regions. Device drivers can
     register to KHO to serialize and preserve their states before kexec.

  2) Scratch Regions - CMA regions that we allocate in the first kernel.
     CMA gives us the guarantee that no handover pages land in those
     regions, because handover pages must be at a static physical memory
     location. We use these regions as the place to load future kexec
     images so that they won't collide with any handover data.

Link: https://lkml.kernel.org/r/20250509074635.3187114-5-changyuanl@google.com
Signed-off-by: Alexander Graf <graf@amazon.com>
Co-developed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Co-developed-by: Pratyush Yadav <ptyadav@amazon.de>
Signed-off-by: Pratyush Yadav <ptyadav@amazon.de>
Co-developed-by: Changyuan Lyu <changyuanl@google.com>
Signed-off-by: Changyuan Lyu <changyuanl@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Anthony Yznaga <anthony.yznaga@oracle.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ashish Kalra <ashish.kalra@amd.com>
Cc: Ben Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Betkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Gowans <jgowans@amazon.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Marc Rutland <mark.rutland@arm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Saravana Kannan <saravanak@google.com>
Cc: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Thomas Lendacky <thomas.lendacky@amd.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12 23:50:39 -07:00
Josh Poimboeuf
2cbb20b008 tracing: Disable branch profiling in noinstr code
CONFIG_TRACE_BRANCH_PROFILING inserts a call to ftrace_likely_update()
for each use of likely() or unlikely().  That breaks noinstr rules if
the affected function is annotated as noinstr.

Disable branch profiling for files with noinstr functions.  In addition
to some individual files, this also includes the entire arch/x86
subtree, as well as the kernel/entry, drivers/cpuidle, and drivers/idle
directories, all of which are noinstr-heavy.

Due to the nature of how sched binaries are built by combining multiple
.c files into one, branch profiling is disabled more broadly across the
sched code than would otherwise be needed.

This fixes many warnings like the following:

  vmlinux.o: warning: objtool: do_syscall_64+0x40: call to ftrace_likely_update() leaves .noinstr.text section
  vmlinux.o: warning: objtool: __rdgsbase_inactive+0x33: call to ftrace_likely_update() leaves .noinstr.text section
  vmlinux.o: warning: objtool: handle_bug.isra.0+0x198: call to ftrace_likely_update() leaves .noinstr.text section
  ...

Reported-by: Ingo Molnar <mingo@kernel.org>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/fb94fc9303d48a5ed370498f54500cc4c338eb6d.1742586676.git.jpoimboe@kernel.org
2025-03-22 09:49:26 +01:00
Mike Rapoport (Microsoft)
0e8b67982b mm: move kernel/numa.c to mm/
Patch series "mm: introduce numa_memblks", v4.

Following the discussion about handling of CXL fixed memory windows on
arm64 [1] I decided to bite the bullet and move numa_memblks from x86 to
the generic code so they will be available on arm64/riscv and maybe on
loongarch sometime later.

While it could be possible to use memblock to describe CXL memory windows,
it currently lacks notion of unpopulated memory ranges and numa_memblks
does implement this.

Another reason to make numa_memblks generic is that both arch_numa (arm64
and riscv) and loongarch use trimmed copy of x86 code although there is no
fundamental reason why the same code cannot be used on all these
platforms.  Having numa_memblks in mm/ will make it's interaction with
ACPI and FDT more consistent and I believe will reduce maintenance burden.

And with generic numa_memblks it is (almost) straightforward to enable
NUMA emulation on arm64 and riscv.

The first 9 commits in this series are cleanups that are not strictly
related to numa_memblks.
Commits 10-16 slightly reorder code in x86 to allow extracting numa_memblks
and NUMA emulation to the generic code.
Commits 17-19 actually move the code from arch/x86/ to mm/ and commits 20-22
does some aftermath cleanups.
Commit 23 updates of_numa_init() to return error of no NUMA nodes were
found in the device tree.
Commit 24 switches arch_numa to numa_memblks.
Commit 25 enables usage of phys_to_target_node() and
memory_add_physaddr_to_nid() with numa_memblks.
Commit 26 moves the description for numa=fake from x86 to admin-guide.

[1] https://lore.kernel.org/all/20240529171236.32002-1-Jonathan.Cameron@huawei.com/


This patch (of 26):

The stub functions in kernel/numa.c belong to mm/ rather than to kernel/

Link: https://lkml.kernel.org/r/20240807064110.1003856-1-rppt@kernel.org
Link: https://lkml.kernel.org/r/20240807064110.1003856-2-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Zi Yan <ziy@nvidia.com> # for x86_64 and arm64
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> [arm64 + CXL via QEMU]
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Rob Herring (Arm) <robh@kernel.org>
Cc: Samuel Holland <samuel.holland@sifive.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-03 21:15:26 -07:00
Baoquan He
02aff84805 crash: split crash dumping code out from kexec_core.c
Currently, KEXEC_CORE select CRASH_CORE automatically because crash codes
need be built in to avoid compiling error when building kexec code even
though the crash dumping functionality is not enabled. E.g
--------------------
CONFIG_CRASH_CORE=y
CONFIG_KEXEC_CORE=y
CONFIG_KEXEC=y
CONFIG_KEXEC_FILE=y
---------------------

After splitting out crashkernel reservation code and vmcoreinfo exporting
code, there's only crash related code left in kernel/crash_core.c. Now
move crash related codes from kexec_core.c to crash_core.c and only build it
in when CONFIG_CRASH_DUMP=y.

And also wrap up crash codes inside CONFIG_CRASH_DUMP ifdeffery scope,
or replace inappropriate CONFIG_KEXEC_CORE ifdef with CONFIG_CRASH_DUMP
ifdef in generic kernel files.

With these changes, crash_core codes are abstracted from kexec codes and
can be disabled at all if only kexec reboot feature is wanted.

Link: https://lkml.kernel.org/r/20240124051254.67105-5-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Pingfan Liu <piliu@redhat.com>
Cc: Klara Modin <klarasmodin@gmail.com>
Cc: Michael Kelley <mhklinux@outlook.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-23 17:48:22 -08:00
Baoquan He
2c44b67e2e crash: remove dependency of FA_DUMP on CRASH_DUMP
In kdump kernel, /proc/vmcore is an elf file mapping the crashed kernel's
old memory content. Its elf header is constructed in 1st kernel and passed
to kdump kernel via elfcorehdr_addr. Config CRASH_DUMP enables the code
of 1st kernel's old memory accessing in different architectures.

Currently, config FA_DUMP has dependency on CRASH_DUMP because fadump
needs access global variable 'elfcorehdr_addr' to judge if it's in
kdump kernel within function is_kdump_kernel(). In the current
kernel/crash_dump.c, variable 'elfcorehdr_addr' is defined, and function
setup_elfcorehdr() used to parse kernel parameter to fetch the passed
value of elfcorehdr_addr. Only for accessing elfcorehdr_addr, FA_DUMP
really doesn't have to depends on CRASH_DUMP.

To remove the dependency of FA_DUMP on CRASH_DUMP to avoid confusion,
rename kernel/crash_dump.c to kernel/elfcorehdr.c, and build it when
CONFIG_VMCORE_INFO is ebabled. With this, FA_DUMP doesn't need to depend
on CRASH_DUMP.

[bhe@redhat.com: power/fadump: make FA_DUMP select CRASH_DUMP]
  Link: https://lkml.kernel.org/r/Zb8D1ASrgX0qVm9z@MiWiFi-R3L-srv
Link: https://lkml.kernel.org/r/20240124051254.67105-4-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Pingfan Liu <piliu@redhat.com>
Cc: Klara Modin <klarasmodin@gmail.com>
Cc: Michael Kelley <mhklinux@outlook.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-23 17:48:22 -08:00
Baoquan He
443cbaf9e2 crash: split vmcoreinfo exporting code out from crash_core.c
Now move the relevant codes into separate files:
kernel/crash_reserve.c, include/linux/crash_reserve.h.

And add config item CRASH_RESERVE to control its enabling.

And also update the old ifdeffery of CONFIG_CRASH_CORE, including of
<linux/crash_core.h> and config item dependency on CRASH_CORE
accordingly.

And also do renaming as follows:
 - arch/xxx/kernel/{crash_core.c => vmcore_info.c}
because they are only related to vmcoreinfo exporting on x86, arm64,
riscv.

And also Remove config item CRASH_CORE, and rely on CONFIG_KEXEC_CORE to
decide if build in crash_core.c.

[yang.lee@linux.alibaba.com: remove duplicated include in vmcore_info.c]
  Link: https://lkml.kernel.org/r/20240126005744.16561-1-yang.lee@linux.alibaba.com
Link: https://lkml.kernel.org/r/20240124051254.67105-3-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Pingfan Liu <piliu@redhat.com>
Cc: Klara Modin <klarasmodin@gmail.com>
Cc: Michael Kelley <mhklinux@outlook.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-23 17:48:22 -08:00
Baoquan He
85fcde402d kexec: split crashkernel reservation code out from crash_core.c
Patch series "Split crash out from kexec and clean up related config
items", v3.

Motivation:
=============
Previously, LKP reported a building error. When investigating, it can't
be resolved reasonablly with the present messy kdump config items.

 https://lore.kernel.org/oe-kbuild-all/202312182200.Ka7MzifQ-lkp@intel.com/

The kdump (crash dumping) related config items could causes confusions:

Firstly,

CRASH_CORE enables codes including
 - crashkernel reservation;
 - elfcorehdr updating;
 - vmcoreinfo exporting;
 - crash hotplug handling;

Now fadump of powerpc, kcore dynamic debugging and kdump all selects
CRASH_CORE, while fadump
 - fadump needs crashkernel parsing, vmcoreinfo exporting, and accessing
   global variable 'elfcorehdr_addr';
 - kcore only needs vmcoreinfo exporting;
 - kdump needs all of the current kernel/crash_core.c.

So only enabling PROC_CORE or FA_DUMP will enable CRASH_CORE, this
mislead people that we enable crash dumping, actual it's not.

Secondly,

It's not reasonable to allow KEXEC_CORE select CRASH_CORE.

Because KEXEC_CORE enables codes which allocate control pages, copy
kexec/kdump segments, and prepare for switching. These codes are
shared by both kexec reboot and kdump. We could want kexec reboot,
but disable kdump. In that case, CRASH_CORE should not be selected.

 --------------------
 CONFIG_CRASH_CORE=y
 CONFIG_KEXEC_CORE=y
 CONFIG_KEXEC=y
 CONFIG_KEXEC_FILE=y
 ---------------------

Thirdly,

It's not reasonable to allow CRASH_DUMP select KEXEC_CORE.

That could make KEXEC_CORE, CRASH_DUMP are enabled independently from
KEXEC or KEXEC_FILE. However, w/o KEXEC or KEXEC_FILE, the KEXEC_CORE
code built in doesn't make any sense because no kernel loading or
switching will happen to utilize the KEXEC_CORE code.
 ---------------------
 CONFIG_CRASH_CORE=y
 CONFIG_KEXEC_CORE=y
 CONFIG_CRASH_DUMP=y
 ---------------------

In this case, what is worse, on arch sh and arm, KEXEC relies on MMU,
while CRASH_DUMP can still be enabled when !MMU, then compiling error is
seen as the lkp test robot reported in above link.

 ------arch/sh/Kconfig------
 config ARCH_SUPPORTS_KEXEC
         def_bool MMU

 config ARCH_SUPPORTS_CRASH_DUMP
         def_bool BROKEN_ON_SMP
 ---------------------------

Changes:
===========
1, split out crash_reserve.c from crash_core.c;
2, split out vmcore_infoc. from crash_core.c;
3, move crash related codes in kexec_core.c into crash_core.c;
4, remove dependency of FA_DUMP on CRASH_DUMP;
5, clean up kdump related config items;
6, wrap up crash codes in crash related ifdefs on all 8 arch-es
   which support crash dumping, except of ppc;

Achievement:
===========
With above changes, I can rearrange the config item logic as below (the right
item depends on or is selected by the left item):

    PROC_KCORE -----------> VMCORE_INFO

               |----------> VMCORE_INFO
    FA_DUMP----|
               |----------> CRASH_RESERVE

                                                    ---->VMCORE_INFO
                                                   /
                                                   |---->CRASH_RESERVE
    KEXEC      --|                                /|
                 |--> KEXEC_CORE--> CRASH_DUMP-->/-|---->PROC_VMCORE
    KEXEC_FILE --|                               \ |
                                                   \---->CRASH_HOTPLUG


    KEXEC      --|
                 |--> KEXEC_CORE (for kexec reboot only)
    KEXEC_FILE --|

Test
========
On all 8 architectures, including x86_64, arm64, s390x, sh, arm, mips,
riscv, loongarch, I did below three cases of config item setting and
building all passed. Take configs on x86_64 as exampmle here:

(1) Both CONFIG_KEXEC and KEXEC_FILE is unset, then all kexec/kdump
items are unset automatically:
# Kexec and crash features
# CONFIG_KEXEC is not set
# CONFIG_KEXEC_FILE is not set
# end of Kexec and crash features

(2) set CONFIG_KEXEC_FILE and 'make olddefconfig':
---------------
# Kexec and crash features
CONFIG_CRASH_RESERVE=y
CONFIG_VMCORE_INFO=y
CONFIG_KEXEC_CORE=y
CONFIG_KEXEC_FILE=y
CONFIG_CRASH_DUMP=y
CONFIG_CRASH_HOTPLUG=y
CONFIG_CRASH_MAX_MEMORY_RANGES=8192
# end of Kexec and crash features
---------------

(3) unset CONFIG_CRASH_DUMP in case 2 and execute 'make olddefconfig':
------------------------
# Kexec and crash features
CONFIG_KEXEC_CORE=y
CONFIG_KEXEC_FILE=y
# end of Kexec and crash features
------------------------

Note:
For ppc, it needs investigation to make clear how to split out crash
code in arch folder. Hope Hari and Pingfan can help have a look, see if
it's doable. Now, I make it either have both kexec and crash enabled, or
disable both of them altogether.


This patch (of 14):

Both kdump and fa_dump of ppc rely on crashkernel reservation.  Move the
relevant codes into separate files: crash_reserve.c,
include/linux/crash_reserve.h.

And also add config item CRASH_RESERVE to control its enabling of the
codes.  And update config items which has relationship with crashkernel
reservation.

And also change ifdeffery from CONFIG_CRASH_CORE to CONFIG_CRASH_RESERVE
when those scopes are only crashkernel reservation related.

And also rename arch/XXX/include/asm/{crash_core.h => crash_reserve.h} on
arm64, x86 and risc-v because those architectures' crash_core.h is only
related to crashkernel reservation.

[akpm@linux-foundation.org: s/CRASH_RESEERVE/CRASH_RESERVE/, per Klara Modin]
Link: https://lkml.kernel.org/r/20240124051254.67105-1-bhe@redhat.com
Link: https://lkml.kernel.org/r/20240124051254.67105-2-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Pingfan Liu <piliu@redhat.com>
Cc: Klara Modin <klarasmodin@gmail.com>
Cc: Michael Kelley <mhklinux@outlook.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-23 17:48:21 -08:00
Kent Overstreet
d7a73e3f08 kernel/numa.c: Move logging out of numa.h
Moving these stub functions to a .c file means we can kill a sched.h
dependency on printk.h.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-20 19:26:30 -05:00
Linus Torvalds
4e3c09e954 v6.5-rc1-modules-next
The changes queued up for v6.5-rc1 for modules are pretty tame, mostly
 code removal of moving of code. Only two minor functional changes are
 made, the only one which stands out is Sebastian Andrzej Siewior's
 simplification of module reference counting by removing preempt_disable()
 and that has been tested on linux-next for well over a month without
 no regressions. I'm now, I guess, also a kitchen sink for some kallsyms
 changes.
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmScfl0SHG1jZ3JvZkBr
 ZXJuZWwub3JnAAoJEM4jHQowkoin8oMP/0DQK+r3BZimknzz6rF0EBStNZ/dIK2W
 1Q/r/ER4VKQKYxklc1M74K+7IX8ZDCYxqlaDS9lvAkRDWNC+t69aNZEib2odJleC
 p6WB30P0JIwfZZC0DS/ct3vrWZTyUhw7aOtvABRmjBfiJ3lFlU092Glvk1w1aFbD
 UrNRomPu4CujzfmnGj3VGc+HVSOEK0F1/GLm9ClrsR8SzKEpQmH4ALI/ON69B0ea
 PmL+d1Wyt6WEoH0hlV1TOXNdHUb3ZO1riSSfDYQ7TiG2AM5w1t4n26YRusc16hYU
 6Bx4OGt52ZJYR3btsRQlcylF4R5DUo+boDkM0NqEDU/3ciGMg6DgKdHnYCBN1w+X
 ZO8aXK1MIgF7W6CqSz+8HCsu5CuCos55FgM22dPbpZr3OEFCWemqnV+cYCu1DA+M
 Gbnn883ZLtt+R+qikD3135s+LxYIvxSuQrj+B3ZoQeIKEtAlyxuhrUJbU0tOns0j
 05PrkI8J1FtIysdlNZeIFg752IPtjp/0QNB4R46m40mT16L0TSjEP7c+zcPDryMb
 84SdLqh1gis0QZRkoH6JbMBDeT2dtuxqtQ5dTPka4s1mtg3SvRYr53sCJg+gQ8e2
 CBW6jgrIf3F4RIMMiSfXpSf4yVVxXxJAEFnGLRXhQ2HkUnk3mdGEfsZc7ucrsnlK
 f/KwaEzmLD9c
 =gjKD
 -----END PGP SIGNATURE-----

Merge tag 'v6.5-rc1-modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull module updates from Luis Chamberlain:
 "The changes queued up for modules are pretty tame, mostly code removal
  of moving of code.

  Only two minor functional changes are made, the only one which stands
  out is Sebastian Andrzej Siewior's simplification of module reference
  counting by removing preempt_disable() and that has been tested on
  linux-next for well over a month without no regressions.

  I'm now, I guess, also a kitchen sink for some kallsyms changes"

[ There was a mis-communication about the concurrent module load changes
  that I had expected to come through Luis despite me authoring the
  patch. So some of the module updates were left hanging in the email
  ether, and I just committed them separately.

  It's my bad - I should have made it more clear that I expected my
  own patches to come through the module tree too. Now they missed
  linux-next, but hopefully that won't cause any issues    - Linus ]

* tag 'v6.5-rc1-modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
  kallsyms: make kallsyms_show_value() as generic function
  kallsyms: move kallsyms_show_value() out of kallsyms.c
  kallsyms: remove unsed API lookup_symbol_attrs
  kallsyms: remove unused arch_get_kallsym() helper
  module: Remove preempt_disable() from module reference counting.
2023-06-28 15:51:08 -07:00
Douglas Anderson
1f423c905a watchdog/hardlockup: detect hard lockups using secondary (buddy) CPUs
Implement a hardlockup detector that doesn't doesn't need any extra
arch-specific support code to detect lockups.  Instead of using something
arch-specific we will use the buddy system, where each CPU watches out for
another one.  Specifically, each CPU will use its softlockup hrtimer to
check that the next CPU is processing hrtimer interrupts by verifying that
a counter is increasing.

NOTE: unlike the other hard lockup detectors, the buddy one can't easily
show what's happening on the CPU that locked up just by doing a simple
backtrace.  It relies on some other mechanism in the system to get
information about the locked up CPUs.  This could be support for NMI
backtraces like [1], it could be a mechanism for printing the PC of locked
CPUs at panic time like [2] / [3], or it could be something else.  Even
though that means we still rely on arch-specific code, this arch-specific
code seems to often be implemented even on architectures that don't have a
hardlockup detector.

This style of hardlockup detector originated in some downstream Android
trees and has been rebased on / carried in ChromeOS trees for quite a long
time for use on arm and arm64 boards.  Historically on these boards we've
leveraged mechanism [2] / [3] to get information about hung CPUs, but we
could move to [1].

Although the original motivation for the buddy system was for use on
systems without an arch-specific hardlockup detector, it can still be
useful to use even on systems that _do_ have an arch-specific hardlockup
detector.  On x86, for instance, there is a 24-part patch series [4] in
progress switching the arch-specific hard lockup detector from a scarce
perf counter to a less-scarce hardware resource.  Potentially the buddy
system could be a simpler alternative to free up the perf counter but
still get hard lockup detection.

Overall, pros (+) and cons (-) of the buddy system compared to an
arch-specific hardlockup detector (which might be implemented using
perf):
+ The buddy system is usable on systems that don't have an
  arch-specific hardlockup detector, like arm32 and arm64 (though it's
  being worked on for arm64 [5]).
+ The buddy system may free up scarce hardware resources.
+ If a CPU totally goes out to lunch (can't process NMIs) the buddy
  system could still detect the problem (though it would be unlikely
  to be able to get a stack trace).
+ The buddy system uses the same timer function to pet the hardlockup
  detector on the running CPU as it uses to detect hardlockups on
  other CPUs. Compared to other hardlockup detectors, this means it
  generates fewer interrupts and thus is likely better able to let
  CPUs stay idle longer.
- If all CPUs are hard locked up at the same time the buddy system
  can't detect it.
- If we don't have SMP we can't use the buddy system.
- The buddy system needs an arch-specific mechanism (possibly NMI
  backtrace) to get info about the locked up CPU.

[1] https://lore.kernel.org/r/20230419225604.21204-1-dianders@chromium.org
[2] https://issuetracker.google.com/172213129
[3] https://docs.kernel.org/trace/coresight/coresight-cpu-debug.html
[4] https://lore.kernel.org/lkml/20230301234753.28582-1-ricardo.neri-calderon@linux.intel.com/
[5] https://lore.kernel.org/linux-arm-kernel/20220903093415.15850-1-lecopzer.chen@mediatek.com/

Link: https://lkml.kernel.org/r/20230519101840.v5.14.I6bf789d21d0c3d75d382e7e51a804a7a51315f2c@changeid
Signed-off-by: Colin Cross <ccross@android.com>
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
Signed-off-by: Tzung-Bi Shih <tzungbi@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen-Yu Tsai <wens@csie.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masayoshi Mizuma <msys.mizuma@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Pingfan Liu <kernelfans@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Ricardo Neri <ricardo.neri@intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Boyd <swboyd@chromium.org>
Cc: Sumit Garg <sumit.garg@linaro.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 17:44:21 -07:00
Douglas Anderson
6ea0d04211 watchdog/perf: rename watchdog_hld.c to watchdog_perf.c
The code currently in "watchdog_hld.c" is for detecting hardlockups using
perf, as evidenced by the line in the Makefile that only compiles this
file if CONFIG_HARDLOCKUP_DETECTOR_PERF is defined.  Rename the file to
prepare for the buddy hardlockup detector, which doesn't use perf.

It could be argued that the new name makes it less obvious that this is a
hardlockup detector.  While true, it's not hard to remember that the
"perf" detector is always a hardlockup detector and it's nice not to have
names that are too convoluted.

Link: https://lkml.kernel.org/r/20230519101840.v5.7.Ice803cb078d0e15fb2cbf49132f096ee2bd4199d@changeid
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen-Yu Tsai <wens@csie.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Colin Cross <ccross@android.com>
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Guenter Roeck <groeck@chromium.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Lecopzer Chen <lecopzer.chen@mediatek.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masayoshi Mizuma <msys.mizuma@gmail.com>
Cc: Matthias Kaehlcke <mka@chromium.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Pingfan Liu <kernelfans@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Ricardo Neri <ricardo.neri@intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Boyd <swboyd@chromium.org>
Cc: Sumit Garg <sumit.garg@linaro.org>
Cc: Tzung-Bi Shih <tzungbi@chromium.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 17:44:19 -07:00
Maninder Singh
b06e9318bf kallsyms: move kallsyms_show_value() out of kallsyms.c
function kallsyms_show_value() is used by other parts
like modules_open(), kprobes_read() etc. which can work in case of
!KALLSYMS also.

e.g. as of now lsmod do not show module address if KALLSYMS is disabled.
since kallsyms_show_value() defination is not present, it returns false
in !KALLSYMS.

/ # lsmod
test 12288 0 - Live 0x0000000000000000 (O)

So kallsyms_show_value() can be made generic
without dependency on KALLSYMS.

Thus moving out function to a new file ksyms_common.c.

With this patch code is just moved to new file
and no functional change.

Co-developed-by: Onkarnath <onkarnath.1@samsung.com>
Signed-off-by: Onkarnath <onkarnath.1@samsung.com>
Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-06-08 12:27:20 -07:00
Linus Torvalds
b6a7828502 modules-6.4-rc1
The summary of the changes for this pull requests is:
 
  * Song Liu's new struct module_memory replacement
  * Nick Alcock's MODULE_LICENSE() removal for non-modules
  * My cleanups and enhancements to reduce the areas where we vmalloc
    module memory for duplicates, and the respective debug code which
    proves the remaining vmalloc pressure comes from userspace.
 
 Most of the changes have been in linux-next for quite some time except
 the minor fixes I made to check if a module was already loaded
 prior to allocating the final module memory with vmalloc and the
 respective debug code it introduces to help clarify the issue. Although
 the functional change is small it is rather safe as it can only *help*
 reduce vmalloc space for duplicates and is confirmed to fix a bootup
 issue with over 400 CPUs with KASAN enabled. I don't expect stable
 kernels to pick up that fix as the cleanups would have also had to have
 been picked up. Folks on larger CPU systems with modules will want to
 just upgrade if vmalloc space has been an issue on bootup.
 
 Given the size of this request, here's some more elaborate details
 on this pull request.
 
 The functional change change in this pull request is the very first
 patch from Song Liu which replaces the struct module_layout with a new
 struct module memory. The old data structure tried to put together all
 types of supported module memory types in one data structure, the new
 one abstracts the differences in memory types in a module to allow each
 one to provide their own set of details. This paves the way in the
 future so we can deal with them in a cleaner way. If you look at changes
 they also provide a nice cleanup of how we handle these different memory
 areas in a module. This change has been in linux-next since before the
 merge window opened for v6.3 so to provide more than a full kernel cycle
 of testing. It's a good thing as quite a bit of fixes have been found
 for it.
 
 Jason Baron then made dynamic debug a first class citizen module user by
 using module notifier callbacks to allocate / remove module specific
 dynamic debug information.
 
 Nick Alcock has done quite a bit of work cross-tree to remove module
 license tags from things which cannot possibly be module at my request
 so to:
 
   a) help him with his longer term tooling goals which require a
      deterministic evaluation if a piece a symbol code could ever be
      part of a module or not. But quite recently it is has been made
      clear that tooling is not the only one that would benefit.
      Disambiguating symbols also helps efforts such as live patching,
      kprobes and BPF, but for other reasons and R&D on this area
      is active with no clear solution in sight.
 
   b) help us inch closer to the now generally accepted long term goal
      of automating all the MODULE_LICENSE() tags from SPDX license tags
 
 In so far as a) is concerned, although module license tags are a no-op
 for non-modules, tools which would want create a mapping of possible
 modules can only rely on the module license tag after the commit
 8b41fc4454 ("kbuild: create modules.builtin without Makefile.modbuiltin
 or tristate.conf").  Nick has been working on this *for years* and
 AFAICT I was the only one to suggest two alternatives to this approach
 for tooling. The complexity in one of my suggested approaches lies in
 that we'd need a possible-obj-m and a could-be-module which would check
 if the object being built is part of any kconfig build which could ever
 lead to it being part of a module, and if so define a new define
 -DPOSSIBLE_MODULE [0]. A more obvious yet theoretical approach I've
 suggested would be to have a tristate in kconfig imply the same new
 -DPOSSIBLE_MODULE as well but that means getting kconfig symbol names
 mapping to modules always, and I don't think that's the case today. I am
 not aware of Nick or anyone exploring either of these options. Quite
 recently Josh Poimboeuf has pointed out that live patching, kprobes and
 BPF would benefit from resolving some part of the disambiguation as
 well but for other reasons. The function granularity KASLR (fgkaslr)
 patches were mentioned but Joe Lawrence has clarified this effort has
 been dropped with no clear solution in sight [1].
 
 In the meantime removing module license tags from code which could never
 be modules is welcomed for both objectives mentioned above. Some
 developers have also welcomed these changes as it has helped clarify
 when a module was never possible and they forgot to clean this up,
 and so you'll see quite a bit of Nick's patches in other pull
 requests for this merge window. I just picked up the stragglers after
 rc3. LWN has good coverage on the motivation behind this work [2] and
 the typical cross-tree issues he ran into along the way. The only
 concrete blocker issue he ran into was that we should not remove the
 MODULE_LICENSE() tags from files which have no SPDX tags yet, even if
 they can never be modules. Nick ended up giving up on his efforts due
 to having to do this vetting and backlash he ran into from folks who
 really did *not understand* the core of the issue nor were providing
 any alternative / guidance. I've gone through his changes and dropped
 the patches which dropped the module license tags where an SPDX
 license tag was missing, it only consisted of 11 drivers.  To see
 if a pull request deals with a file which lacks SPDX tags you
 can just use:
 
   ./scripts/spdxcheck.py -f \
 	$(git diff --name-only commid-id | xargs echo)
 
 You'll see a core module file in this pull request for the above,
 but that's not related to his changes. WE just need to add the SPDX
 license tag for the kernel/module/kmod.c file in the future but
 it demonstrates the effectiveness of the script.
 
 Most of Nick's changes were spread out through different trees,
 and I just picked up the slack after rc3 for the last kernel was out.
 Those changes have been in linux-next for over two weeks.
 
 The cleanups, debug code I added and final fix I added for modules
 were motivated by David Hildenbrand's report of boot failing on
 a systems with over 400 CPUs when KASAN was enabled due to running
 out of virtual memory space. Although the functional change only
 consists of 3 lines in the patch "module: avoid allocation if module is
 already present and ready", proving that this was the best we can
 do on the modules side took quite a bit of effort and new debug code.
 
 The initial cleanups I did on the modules side of things has been
 in linux-next since around rc3 of the last kernel, the actual final
 fix for and debug code however have only been in linux-next for about a
 week or so but I think it is worth getting that code in for this merge
 window as it does help fix / prove / evaluate the issues reported
 with larger number of CPUs. Userspace is not yet fixed as it is taking
 a bit of time for folks to understand the crux of the issue and find a
 proper resolution. Worst come to worst, I have a kludge-of-concept [3]
 of how to make kernel_read*() calls for modules unique / converge them,
 but I'm currently inclined to just see if userspace can fix this
 instead.
 
 [0] https://lore.kernel.org/all/Y/kXDqW+7d71C4wz@bombadil.infradead.org/
 [1] https://lkml.kernel.org/r/025f2151-ce7c-5630-9b90-98742c97ac65@redhat.com
 [2] https://lwn.net/Articles/927569/
 [3] https://lkml.kernel.org/r/20230414052840.1994456-3-mcgrof@kernel.org
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmRG4m0SHG1jZ3JvZkBr
 ZXJuZWwub3JnAAoJEM4jHQowkoinQ2oP/0xlvKwJg6Ey8fHZF0qv8VOskE80zoLF
 hMazU3xfqLA+1TQvouW1YBxt3jwS3t1Ehs+NrV+nY9Yzcm0MzRX/n3fASJVe7nRr
 oqWWQU+voYl5Pw1xsfdp6C8IXpBQorpYby3Vp0MAMoZyl2W2YrNo36NV488wM9KC
 jD4HF5Z6xpnPSZTRR7AgW9mo7FdAtxPeKJ76Bch7lH8U6omT7n36WqTw+5B1eAYU
 YTOvrjRs294oqmWE+LeebyiOOXhH/yEYx4JNQgCwPdxwnRiGJWKsk5va0hRApqF/
 WW8dIqdEnjsa84lCuxnmWgbcPK8cgmlO0rT0DyneACCldNlldCW1LJ0HOwLk9pea
 p3JFAsBL7TKue4Tos6I7/4rx1ufyBGGIigqw9/VX5g0Iif+3BhWnqKRfz+p9wiMa
 Fl7cU6u7yC68CHu1HBSisK16cYMCPeOnTSd89upHj8JU/t74O6k/ARvjrQ9qmNUt
 c5U+OY+WpNJ1nXQydhY/yIDhFdYg8SSpNuIO90r4L8/8jRQYXNG80FDd1UtvVDuy
 eq0r2yZ8C0XHSlOT9QHaua/tWV/aaKtyC/c0hDRrigfUrq8UOlGujMXbUnrmrWJI
 tLJLAc7ePWAAoZXGSHrt0U27l029GzLwRdKqJ6kkDANVnTeOdV+mmBg9zGh3/Mp6
 agiwdHUMVN7X
 =56WK
 -----END PGP SIGNATURE-----

Merge tag 'modules-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull module updates from Luis Chamberlain:
 "The summary of the changes for this pull requests is:

   - Song Liu's new struct module_memory replacement

   - Nick Alcock's MODULE_LICENSE() removal for non-modules

   - My cleanups and enhancements to reduce the areas where we vmalloc
     module memory for duplicates, and the respective debug code which
     proves the remaining vmalloc pressure comes from userspace.

  Most of the changes have been in linux-next for quite some time except
  the minor fixes I made to check if a module was already loaded prior
  to allocating the final module memory with vmalloc and the respective
  debug code it introduces to help clarify the issue. Although the
  functional change is small it is rather safe as it can only *help*
  reduce vmalloc space for duplicates and is confirmed to fix a bootup
  issue with over 400 CPUs with KASAN enabled. I don't expect stable
  kernels to pick up that fix as the cleanups would have also had to
  have been picked up. Folks on larger CPU systems with modules will
  want to just upgrade if vmalloc space has been an issue on bootup.

  Given the size of this request, here's some more elaborate details:

  The functional change change in this pull request is the very first
  patch from Song Liu which replaces the 'struct module_layout' with a
  new 'struct module_memory'. The old data structure tried to put
  together all types of supported module memory types in one data
  structure, the new one abstracts the differences in memory types in a
  module to allow each one to provide their own set of details. This
  paves the way in the future so we can deal with them in a cleaner way.
  If you look at changes they also provide a nice cleanup of how we
  handle these different memory areas in a module. This change has been
  in linux-next since before the merge window opened for v6.3 so to
  provide more than a full kernel cycle of testing. It's a good thing as
  quite a bit of fixes have been found for it.

  Jason Baron then made dynamic debug a first class citizen module user
  by using module notifier callbacks to allocate / remove module
  specific dynamic debug information.

  Nick Alcock has done quite a bit of work cross-tree to remove module
  license tags from things which cannot possibly be module at my request
  so to:

   a) help him with his longer term tooling goals which require a
      deterministic evaluation if a piece a symbol code could ever be
      part of a module or not. But quite recently it is has been made
      clear that tooling is not the only one that would benefit.
      Disambiguating symbols also helps efforts such as live patching,
      kprobes and BPF, but for other reasons and R&D on this area is
      active with no clear solution in sight.

   b) help us inch closer to the now generally accepted long term goal
      of automating all the MODULE_LICENSE() tags from SPDX license tags

  In so far as a) is concerned, although module license tags are a no-op
  for non-modules, tools which would want create a mapping of possible
  modules can only rely on the module license tag after the commit
  8b41fc4454 ("kbuild: create modules.builtin without
  Makefile.modbuiltin or tristate.conf").

  Nick has been working on this *for years* and AFAICT I was the only
  one to suggest two alternatives to this approach for tooling. The
  complexity in one of my suggested approaches lies in that we'd need a
  possible-obj-m and a could-be-module which would check if the object
  being built is part of any kconfig build which could ever lead to it
  being part of a module, and if so define a new define
  -DPOSSIBLE_MODULE [0].

  A more obvious yet theoretical approach I've suggested would be to
  have a tristate in kconfig imply the same new -DPOSSIBLE_MODULE as
  well but that means getting kconfig symbol names mapping to modules
  always, and I don't think that's the case today. I am not aware of
  Nick or anyone exploring either of these options. Quite recently Josh
  Poimboeuf has pointed out that live patching, kprobes and BPF would
  benefit from resolving some part of the disambiguation as well but for
  other reasons. The function granularity KASLR (fgkaslr) patches were
  mentioned but Joe Lawrence has clarified this effort has been dropped
  with no clear solution in sight [1].

  In the meantime removing module license tags from code which could
  never be modules is welcomed for both objectives mentioned above. Some
  developers have also welcomed these changes as it has helped clarify
  when a module was never possible and they forgot to clean this up, and
  so you'll see quite a bit of Nick's patches in other pull requests for
  this merge window. I just picked up the stragglers after rc3. LWN has
  good coverage on the motivation behind this work [2] and the typical
  cross-tree issues he ran into along the way. The only concrete blocker
  issue he ran into was that we should not remove the MODULE_LICENSE()
  tags from files which have no SPDX tags yet, even if they can never be
  modules. Nick ended up giving up on his efforts due to having to do
  this vetting and backlash he ran into from folks who really did *not
  understand* the core of the issue nor were providing any alternative /
  guidance. I've gone through his changes and dropped the patches which
  dropped the module license tags where an SPDX license tag was missing,
  it only consisted of 11 drivers. To see if a pull request deals with a
  file which lacks SPDX tags you can just use:

    ./scripts/spdxcheck.py -f \
	$(git diff --name-only commid-id | xargs echo)

  You'll see a core module file in this pull request for the above, but
  that's not related to his changes. WE just need to add the SPDX
  license tag for the kernel/module/kmod.c file in the future but it
  demonstrates the effectiveness of the script.

  Most of Nick's changes were spread out through different trees, and I
  just picked up the slack after rc3 for the last kernel was out. Those
  changes have been in linux-next for over two weeks.

  The cleanups, debug code I added and final fix I added for modules
  were motivated by David Hildenbrand's report of boot failing on a
  systems with over 400 CPUs when KASAN was enabled due to running out
  of virtual memory space. Although the functional change only consists
  of 3 lines in the patch "module: avoid allocation if module is already
  present and ready", proving that this was the best we can do on the
  modules side took quite a bit of effort and new debug code.

  The initial cleanups I did on the modules side of things has been in
  linux-next since around rc3 of the last kernel, the actual final fix
  for and debug code however have only been in linux-next for about a
  week or so but I think it is worth getting that code in for this merge
  window as it does help fix / prove / evaluate the issues reported with
  larger number of CPUs. Userspace is not yet fixed as it is taking a
  bit of time for folks to understand the crux of the issue and find a
  proper resolution. Worst come to worst, I have a kludge-of-concept [3]
  of how to make kernel_read*() calls for modules unique / converge
  them, but I'm currently inclined to just see if userspace can fix this
  instead"

Link: https://lore.kernel.org/all/Y/kXDqW+7d71C4wz@bombadil.infradead.org/ [0]
Link: https://lkml.kernel.org/r/025f2151-ce7c-5630-9b90-98742c97ac65@redhat.com [1]
Link: https://lwn.net/Articles/927569/ [2]
Link: https://lkml.kernel.org/r/20230414052840.1994456-3-mcgrof@kernel.org [3]

* tag 'modules-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (121 commits)
  module: add debugging auto-load duplicate module support
  module: stats: fix invalid_mod_bytes typo
  module: remove use of uninitialized variable len
  module: fix building stats for 32-bit targets
  module: stats: include uapi/linux/module.h
  module: avoid allocation if module is already present and ready
  module: add debug stats to help identify memory pressure
  module: extract patient module check into helper
  modules/kmod: replace implementation with a semaphore
  Change DEFINE_SEMAPHORE() to take a number argument
  module: fix kmemleak annotations for non init ELF sections
  module: Ignore L0 and rename is_arm_mapping_symbol()
  module: Move is_arm_mapping_symbol() to module_symbol.h
  module: Sync code of is_arm_mapping_symbol()
  scripts/gdb: use mem instead of core_layout to get the module address
  interconnect: remove module-related code
  interconnect: remove MODULE_LICENSE in non-modules
  zswap: remove MODULE_LICENSE in non-modules
  zpool: remove MODULE_LICENSE in non-modules
  x86/mm/dump_pagetables: remove MODULE_LICENSE in non-modules
  ...
2023-04-27 16:36:55 -07:00
Luis Chamberlain
25be451aa4 module: fold usermode helper kmod into modules directory
The kernel/kmod.c is already only built if we enabled modules, so
just stuff it under kernel/module/kmod.c and unify the MAINTAINERS
file for it.

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-03-24 11:33:08 -07:00
Mike Christie
e297cd54b3
vhost_task: Allow vhost layer to use copy_process
Qemu will create vhost devices in the kernel which perform network, SCSI,
etc IO and management operations from worker threads created by the
kthread API. Because the kthread API does a copy_process on the kthreadd
thread, the vhost layer has to use kthread_use_mm to access the Qemu
thread's memory and cgroup_attach_task_all to add itself to the Qemu
thread's cgroups, and it bypasses the RLIMIT_NPROC limit which can result
in VMs creating more threads than the admin expected.

This patch adds a new struct vhost_task which can be used instead of
kthreads. They allow the vhost layer to use copy_process and inherit
the userspace process's mm and cgroups, the task is accounted for
under the userspace's nproc count and can be seen in its process tree,
and other features like namespaces work and are inherited by default.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-03-23 12:45:36 +01:00
Linus Torvalds
51094a24b8 kernel hardening fixes for v6.2-rc1
- Fix CFI failure with KASAN (Sami Tolvanen)
 
 - Fix LKDTM + CFI under GCC 7 and 8 (Kristina Martsenko)
 
 - Limit CONFIG_ZERO_CALL_USED_REGS to Clang > 15.0.6 (Nathan Chancellor)
 
 - Ignore "contents" argument in LoadPin's LSM hook handling
 
 - Fix paste-o in /sys/kernel/warn_count API docs
 
 - Use READ_ONCE() consistently for oops/warn limit reading
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmOl9ckWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJlOKEACX/N/ukox8Kg16wfeetiB/Gu2H
 AdFF1MqoLjVRxNfFDf2RNolVVW88QTlu3eQqi5n8wnBiMAlkIqM6vtQXidWW/MEz
 xfaZjIECFxalAdZDoQI/7QjPt0TmRyG2LGxMxpxE7ZczLUOn1nAIeBm6JvbGFCJY
 6Sj7vdhwuqMuPAw3obefLgKs1e7fkJW7Uow3YokSWrRuLYnzd6CsF/QVs9fI1KJ7
 GVnrpD+hzdeYbLKIvXYycks5M8WJfCkB8vGx5OVqUk9e9XghIc8YnIO/k+aHow+1
 UB/ZfhciwTae04gA8GKK6mM595N1arnJBUtopvl3laH8QmuOa6oxUsWrn8Sf0IHe
 7GZXYeJ6cMPeHz+AnkJDSx1nzzcmXGmuMrCJRic3WurVMZVHU4aP8amDqdhl1Taq
 O/SvwutoWZJ8OJ0JtEguBsKM8so8h8PSO9LAfj//h8rRWh4beMD9p/epEGNeX7RF
 +KpiMmQ0bCpcJiA4uSJpInzewtuNnsd+hTLOwdflC777odxnHwli9mhhStmz5rA+
 UeoHy4tGYCMO7TRkZdWSP9bu+c5SuKZSuFVSTFDOo9zQuw/SIWk5MJRWPzCcIBCH
 5T/s3JoneziFlyWSQ861hH560RdN9vC8cE+Se2bttOXFBF42Gm+LLa3vItbecxz+
 DamqIuIfnrAfNRo9Ww==
 =NHw1
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.2-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull kernel hardening fixes from Kees Cook:

 - Fix CFI failure with KASAN (Sami Tolvanen)

 - Fix LKDTM + CFI under GCC 7 and 8 (Kristina Martsenko)

 - Limit CONFIG_ZERO_CALL_USED_REGS to Clang > 15.0.6 (Nathan
   Chancellor)

 - Ignore "contents" argument in LoadPin's LSM hook handling

 - Fix paste-o in /sys/kernel/warn_count API docs

 - Use READ_ONCE() consistently for oops/warn limit reading

* tag 'hardening-v6.2-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  cfi: Fix CFI failure with KASAN
  exit: Use READ_ONCE() for all oops/warn limit reads
  security: Restrict CONFIG_ZERO_CALL_USED_REGS to gcc or clang > 15.0.6
  lkdtm: cfi: Make PAC test work with GCC 7 and 8
  docs: Fix path paste-o for /sys/kernel/warn_count
  LoadPin: Ignore the "contents" argument of the LSM hooks
2022-12-23 12:00:24 -08:00
Sami Tolvanen
cf8016408d cfi: Fix CFI failure with KASAN
When CFI_CLANG and KASAN are both enabled, LLVM doesn't generate a
CFI type hash for asan.module_ctor functions in translation units
where CFI is disabled, which leads to a CFI failure during boot when
do_ctors calls the affected constructors:

  CFI failure at do_basic_setup+0x64/0x90 (target:
  asan.module_ctor+0x0/0x28; expected type: 0xa540670c)

Specifically, this happens because CFI is disabled for
kernel/cfi.c. There's no reason to keep CFI disabled here anymore, so
fix the failure by not filtering out CC_FLAGS_CFI for the file.

Note that https://reviews.llvm.org/rG3b14862f0a96 fixed the issue
where LLVM didn't emit CFI type hashes for any sanitizer constructors,
but now type hashes are emitted correctly for TUs that use CFI.

Link: https://github.com/ClangBuiltLinux/linux/issues/1742
Fixes: 8924560094 ("cfi: Switch to -fsanitize=kcfi")
Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221222225747.3538676-1-samitolvanen@google.com
2022-12-23 10:04:31 -08:00
Zhen Lei
30f3bb0977 kallsyms: Add self-test facility
Added test cases for basic functions and performance of functions
kallsyms_lookup_name(), kallsyms_on_each_symbol() and
kallsyms_on_each_match_symbol(). It also calculates the compression rate
of the kallsyms compression algorithm for the current symbol set.

The basic functions test begins by testing a set of symbols whose address
values are known. Then, traverse all symbol addresses and find the
corresponding symbol name based on the address. It's impossible to
determine whether these addresses are correct, but we can use the above
three functions along with the addresses to test each other. Due to the
traversal operation of kallsyms_on_each_symbol() is too slow, only 60
symbols can be tested in one second, so let it test on average once
every 128 symbols. The other two functions validate all symbols.

If the basic functions test is passed, print only performance test
results. If the test fails, print error information, but do not perform
subsequent performance tests.

Start self-test automatically after system startup if
CONFIG_KALLSYMS_SELFTEST=y.

Example of output content: (prefix 'kallsyms_selftest:' is omitted
 start
  ---------------------------------------------------------
 | nr_symbols | compressed size | original size | ratio(%) |
 |---------------------------------------------------------|
 |     107543 |       1357912   |      2407433  |  56.40   |
  ---------------------------------------------------------
 kallsyms_lookup_name() looked up 107543 symbols
 The time spent on each symbol is (ns): min=630, max=35295, avg=7353
 kallsyms_on_each_symbol() traverse all: 11782628 ns
 kallsyms_on_each_match_symbol() traverse all: 9261 ns
 finish

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2022-11-15 00:42:02 -08:00
Alexander Potapenko
79dbd006a6 kmsan: disable instrumentation of unsupported common kernel code
EFI stub cannot be linked with KMSAN runtime, so we disable
instrumentation for it.

Instrumenting kcov, stackdepot or lockdep leads to infinite recursion
caused by instrumentation hooks calling instrumented code again.

Link: https://lkml.kernel.org/r/20220915150417.722975-13-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03 14:03:20 -07:00
Juergen Gross
a870544ca9 kernel: remove platform_has() infrastructure
The only use case of the platform_has() infrastructure has been
removed again, so remove the whole feature.

Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> # Arm64 guest using Xen
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/20220622063838.8854-3-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-08-01 07:42:56 +02:00
Juergen Gross
2130a790ca kernel: add platform_has() infrastructure
Add a simple infrastructure for setting, resetting and querying
platform feature flags.

Flags can be either global or architecture specific.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Tested-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> # Arm64 only
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-06-06 08:06:00 +02:00
Linus Torvalds
ef98f9cfe2 Modules updates for v5.19-rc1
As promised, for v5.19 I queued up quite a bit of work for modules, but
 still with a pretty conservative eye. These changes have been soaking on
 modules-next (and so linux-next) for quite some time, the code shift was
 merged onto modules-next on March 22, and the last patch was queued on May
 5th.
 
 The following are the highlights of what bells and whistles we will get for
 v5.19:
 
  1) It was time to tidy up kernel/module.c and one way of starting with
     that effort was to split it up into files. At my request Aaron Tomlin
     spearheaded that effort with the goal to not introduce any
     functional at all during that endeavour.  The penalty for the split
     is +1322 bytes total, +112 bytes in data, +1210 bytes in text while
     bss is unchanged. One of the benefits of this other than helping
     make the code easier to read and review is summoning more help on review
     for changes with livepatching so kernel/module/livepatch.c is now
     pegged as maintained by the live patching folks.
 
     The before and after with just the move on a defconfig on x86-64:
 
      $ size kernel/module.o
         text    data     bss     dec     hex filename
        38434    4540     104   43078    a846 kernel/module.o
 
      $ size -t kernel/module/*.o
         text    data     bss     dec     hex filename
        4785     120       0    4905    1329 kernel/module/kallsyms.o
       28577    4416     104   33097    8149 kernel/module/main.o
        1158       8       0    1166     48e kernel/module/procfs.o
         902     108       0    1010     3f2 kernel/module/strict_rwx.o
        3390       0       0    3390     d3e kernel/module/sysfs.o
         832       0       0     832     340 kernel/module/tree_lookup.o
       39644    4652     104   44400    ad70 (TOTALS)
 
  2) Aaron added module unload taint tracking (MODULE_UNLOAD_TAINT_TRACKING),
     so to enable tracking unloaded modules which did taint the kernel.
 
  3) Christophe Leroy added CONFIG_ARCH_WANTS_MODULES_DATA_IN_VMALLOC
     which lets architectures to request having modules data in vmalloc
     area instead of module area. There are three reasons why an
     architecture might want this:
 
     a) On some architectures (like book3s/32) it is not possible to protect
        against execution on a page basis. The exec stuff can be mapped by
        different arch segment sizes (on book3s/32 that is 256M segments). By
        default the module area is in an Exec segment while vmalloc area is in
        a NoExec segment. Using vmalloc lets you muck with module data as
        NoExec on those architectures whereas before you could not.
 
     b) By pushing more module data to vmalloc you also increase the
        probability of module text to remain within a closer distance
        from kernel core text and this reduces trampolines, this has been
        reported on arm first and powerpc folks are following that lead.
 
     c) Free'ing module_alloc() (Exec by default) area leaves this
        exposed as Exec by default, some architectures have some
        security enhancements to set this as NoExec on free, and splitting
        module data with text let's future generic special allocators
        be added to the kernel without having developers try to grok
        the tribal knowledge per arch. Work like Rick Edgecombe's
        permission vmalloc interface [0] becomes easier to address over
        time.
 
        [0] https://lore.kernel.org/lkml/20201120202426.18009-1-rick.p.edgecombe@intel.com/#r
 
  4) Masahiro Yamada's symbol search enhancements
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmKOnHkSHG1jZ3JvZkBr
 ZXJuZWwub3JnAAoJEM4jHQowkoinFw4P/1ADdvfj+b6wbAxou6tPa2ZKnx/ImEnE
 0T1P/n2guWg+2Q8oYjqifTpadGzr8td4c/PaGb5UpfdEOdBIyIGklrVZpQ+xkqfT
 X4KIvqsf4ajL24OKxOSNtvL8RXEIDUhJ4Veq6BImBk8CPrPjsUBlNyAIlvV0aom2
 BsFROQ2pMTSCiFY47gkMKLBlBny1l7zktoF0lhWTzHimw8VSDbTJFlu+fZvspd0o
 lCqiHTkpiBSJDSEEjqk0lT6wIb27fvdzjmjy+Ur71bBKiPIEPiL5XNUufkGe6oB3
 mnTOPow+wPTQc0dtkTpCHQYXE/a70Sbkwp1JfkbSYeHzJLlFru/tkmKiwN0RUo9l
 0mY7VPEKuQWmxsOkLqvwcPBGx5JOSWOJKrbgpFmH+RLgeEgEa8t7uQDURK2KeIj8
 P7ZzN5M2klKIHHA4vjfekYOJAb1Tii9Ibp7iGeiYxf93mPJBqwvRwbtBXBZpB4ce
 FoDrxwEq812KPW7P2O1kgOvq7Fn1KWh0wVeKc8iBGxFxJhzOQY86H1ZRWDLAxRss
 Rr1PMLt2TbTLUBt7MzR4vrg0NoQvpLYyf2jGFjWyZDRHU8nLeHkOlQot3xRDAtq9
 Bpx5mSlM9BGfPibd1Kw4BaxBha5vVCQ+AcleT+NWnCjw4I0wLoFi9RLUSyItn9No
 tlHLgdrM2a54
 =cxtr
 -----END PGP SIGNATURE-----

Merge tag 'modules-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull modules updates from  Luis Chamberlain:

 - It was time to tidy up kernel/module.c and one way of starting with
   that effort was to split it up into files. At my request Aaron Tomlin
   spearheaded that effort with the goal to not introduce any functional
   at all during that endeavour. The penalty for the split is +1322
   bytes total, +112 bytes in data, +1210 bytes in text while bss is
   unchanged. One of the benefits of this other than helping make the
   code easier to read and review is summoning more help on review for
   changes with livepatching so kernel/module/livepatch.c is now pegged
   as maintained by the live patching folks.

   The before and after with just the move on a defconfig on x86-64:

     $ size kernel/module.o
        text    data     bss     dec     hex filename
       38434    4540     104   43078    a846 kernel/module.o

     $ size -t kernel/module/*.o
        text    data     bss     dec     hex filename
       4785     120       0    4905    1329 kernel/module/kallsyms.o
      28577    4416     104   33097    8149 kernel/module/main.o
       1158       8       0    1166     48e kernel/module/procfs.o
        902     108       0    1010     3f2 kernel/module/strict_rwx.o
       3390       0       0    3390     d3e kernel/module/sysfs.o
        832       0       0     832     340 kernel/module/tree_lookup.o
      39644    4652     104   44400    ad70 (TOTALS)

 - Aaron added module unload taint tracking (MODULE_UNLOAD_TAINT_TRACKING),
   to enable tracking unloaded modules which did taint the kernel.

 - Christophe Leroy added CONFIG_ARCH_WANTS_MODULES_DATA_IN_VMALLOC
   which lets architectures to request having modules data in vmalloc
   area instead of module area. There are three reasons why an
   architecture might want this:

    a) On some architectures (like book3s/32) it is not possible to
       protect against execution on a page basis. The exec stuff can be
       mapped by different arch segment sizes (on book3s/32 that is 256M
       segments). By default the module area is in an Exec segment while
       vmalloc area is in a NoExec segment. Using vmalloc lets you muck
       with module data as NoExec on those architectures whereas before
       you could not.

    b) By pushing more module data to vmalloc you also increase the
       probability of module text to remain within a closer distance
       from kernel core text and this reduces trampolines, this has been
       reported on arm first and powerpc folks are following that lead.

    c) Free'ing module_alloc() (Exec by default) area leaves this
       exposed as Exec by default, some architectures have some security
       enhancements to set this as NoExec on free, and splitting module
       data with text let's future generic special allocators be added
       to the kernel without having developers try to grok the tribal
       knowledge per arch. Work like Rick Edgecombe's permission vmalloc
       interface [0] becomes easier to address over time.

       [0] https://lore.kernel.org/lkml/20201120202426.18009-1-rick.p.edgecombe@intel.com/#r

 - Masahiro Yamada's symbol search enhancements

* tag 'modules-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (33 commits)
  module: merge check_exported_symbol() into find_exported_symbol_in_section()
  module: do not binary-search in __ksymtab_gpl if fsa->gplok is false
  module: do not pass opaque pointer for symbol search
  module: show disallowed symbol name for inherit_taint()
  module: fix [e_shstrndx].sh_size=0 OOB access
  module: Introduce module unload taint tracking
  module: Move module_assert_mutex_or_preempt() to internal.h
  module: Make module_flags_taint() accept a module's taints bitmap and usable outside core code
  module.h: simplify MODULE_IMPORT_NS
  powerpc: Select ARCH_WANTS_MODULES_DATA_IN_VMALLOC on book3s/32 and 8xx
  module: Remove module_addr_min and module_addr_max
  module: Add CONFIG_ARCH_WANTS_MODULES_DATA_IN_VMALLOC
  module: Introduce data_layout
  module: Prepare for handling several RB trees
  module: Always have struct mod_tree_root
  module: Rename debug_align() as strict_align()
  module: Rework layout alignment to avoid BUG_ON()s
  module: Move module_enable_x() and frob_text() in strict_rwx.c
  module: Make module_enable_x() independent of CONFIG_ARCH_HAS_STRICT_MODULE_RWX
  module: Move version support into a separate file
  ...
2022-05-26 17:13:43 -07:00
Christophe Leroy
8fd4ddda2f static_call: Don't make __static_call_return0 static
System.map shows that vmlinux contains several instances of
__static_call_return0():

	c0004fc0 t __static_call_return0
	c0011518 t __static_call_return0
	c00d8160 t __static_call_return0

arch_static_call_transform() uses the middle one to check whether we are
setting a call to __static_call_return0 or not:

	c0011520 <arch_static_call_transform>:
	c0011520:       3d 20 c0 01     lis     r9,-16383	<== r9 =  0xc001 << 16
	c0011524:       39 29 15 18     addi    r9,r9,5400	<== r9 += 0x1518
	c0011528:       7c 05 48 00     cmpw    r5,r9		<== r9 has value 0xc0011518 here

So if static_call_update() is called with one of the other instances of
__static_call_return0(), arch_static_call_transform() won't recognise it.

In order to work properly, global single instance of __static_call_return0() is required.

Fixes: 3f2a8fc4b1 ("static_call/x86: Add __static_call_return0()")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/30821468a0e7d28251954b578e5051dc09300d04.1647258493.git.christophe.leroy@csgroup.eu
2022-04-05 09:59:38 +02:00
Aaron Tomlin
cfc1d27789 module: Move all into module/
No functional changes.

This patch moves all module related code into a separate directory,
modifies each file name and creates a new Makefile. Note: this effort
is in preparation to refactor core module code.

Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2022-04-04 12:57:54 -07:00
Masami Hiramatsu
73f9b911fa kprobes: Use rethook for kretprobe if possible
Use rethook for kretprobe function return hooking if the arch sets
CONFIG_HAVE_RETHOOK=y. In this case, CONFIG_KRETPROBE_ON_RETHOOK is
set to 'y' automatically, and the kretprobe internal data fields
switches to use rethook. If not, it continues to use kretprobe
specific function return hooks.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/164826162556.2455864.12255833167233452047.stgit@devnote2
2022-03-28 19:38:09 -07:00
Dmitry Torokhov
b1ae6dc41e module: add in-kernel support for decompressing
Current scheme of having userspace decompress kernel modules before
loading them into the kernel runs afoul of LoadPin security policy, as
it loses link between the source of kernel module on the disk and binary
blob that is being loaded into the kernel. To solve this issue let's
implement decompression in kernel, so that we can pass a file descriptor
of compressed module file into finit_module() which will keep LoadPin
happy.

To let userspace know what compression/decompression scheme kernel
supports it will create /sys/module/compression attribute. kmod can read
this attribute and decide if it can pass compressed file to
finit_module(). New MODULE_INIT_COMPRESSED_DATA flag indicates that the
kernel should attempt to decompress the data read from file descriptor
prior to trying load the module.

To simplify things kernel will only implement single decompression
method matching compression method selected when generating modules.
This patch implements gzip and xz; more can be added later,

Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2022-01-11 18:45:02 -08:00
Linus Torvalds
79ef0c0014 Tracing updates for 5.16:
- kprobes: Restructured stack unwinder to show properly on x86 when a stack
   dump happens from a kretprobe callback.
 
 - Fix to bootconfig parsing
 
 - Have tracefs allow owner and group permissions by default (only denying
   others). There's been pressure to allow non root to tracefs in a
   controlled fashion, and using groups is probably the safest.
 
 - Bootconfig memory managament updates.
 
 - Bootconfig clean up to have the tools directory be less dependent on
   changes in the kernel tree.
 
 - Allow perf to be traced by function tracer.
 
 - Rewrite of function graph tracer to be a callback from the function tracer
   instead of having its own trampoline (this change will happen on an arch
   by arch basis, and currently only x86_64 implements it).
 
 - Allow multiple direct trampolines (bpf hooks to functions) be batched
   together in one synchronization.
 
 - Allow histogram triggers to add variables that can perform calculations
   against the event's fields.
 
 - Use the linker to determine architecture callbacks from the ftrace
   trampoline to allow for proper parameter prototypes and prevent warnings
   from the compiler.
 
 - Extend histogram triggers to key off of variables.
 
 - Have trace recursion use bit magic to determine preempt context over if
   branches.
 
 - Have trace recursion disable preemption as all use cases do anyway.
 
 - Added testing for verification of tracing utilities.
 
 - Various small clean ups and fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYYBdxhQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qp1sAQD2oYFwaG3sx872gj/myBcHIBSKdiki
 Hry5csd8zYDBpgD+Poylopt5JIbeDuoYw/BedgEXmscZ8Qr7VzjAXdnv/Q4=
 =Loz8
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:

 - kprobes: Restructured stack unwinder to show properly on x86 when a
   stack dump happens from a kretprobe callback.

 - Fix to bootconfig parsing

 - Have tracefs allow owner and group permissions by default (only
   denying others). There's been pressure to allow non root to tracefs
   in a controlled fashion, and using groups is probably the safest.

 - Bootconfig memory managament updates.

 - Bootconfig clean up to have the tools directory be less dependent on
   changes in the kernel tree.

 - Allow perf to be traced by function tracer.

 - Rewrite of function graph tracer to be a callback from the function
   tracer instead of having its own trampoline (this change will happen
   on an arch by arch basis, and currently only x86_64 implements it).

 - Allow multiple direct trampolines (bpf hooks to functions) be batched
   together in one synchronization.

 - Allow histogram triggers to add variables that can perform
   calculations against the event's fields.

 - Use the linker to determine architecture callbacks from the ftrace
   trampoline to allow for proper parameter prototypes and prevent
   warnings from the compiler.

 - Extend histogram triggers to key off of variables.

 - Have trace recursion use bit magic to determine preempt context over
   if branches.

 - Have trace recursion disable preemption as all use cases do anyway.

 - Added testing for verification of tracing utilities.

 - Various small clean ups and fixes.

* tag 'trace-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (101 commits)
  tracing/histogram: Fix semicolon.cocci warnings
  tracing/histogram: Fix documentation inline emphasis warning
  tracing: Increase PERF_MAX_TRACE_SIZE to handle Sentinel1 and docker together
  tracing: Show size of requested perf buffer
  bootconfig: Initialize ret in xbc_parse_tree()
  ftrace: do CPU checking after preemption disabled
  ftrace: disable preemption when recursion locked
  tracing/histogram: Document expression arithmetic and constants
  tracing/histogram: Optimize division by a power of 2
  tracing/histogram: Covert expr to const if both operands are constants
  tracing/histogram: Simplify handling of .sym-offset in expressions
  tracing: Fix operator precedence for hist triggers expression
  tracing: Add division and multiplication support for hist triggers
  tracing: Add support for creating hist trigger variables from literal
  selftests/ftrace: Stop tracing while reading the trace file by default
  MAINTAINERS: Update KPROBES and TRACING entries
  test_kprobes: Move it from kernel/ to lib/
  docs, kprobes: Remove invalid URL and add new reference
  samples/kretprobes: Fix return value if register_kretprobe() failed
  lib/bootconfig: Fix the xbc_get_info kerneldoc
  ...
2021-11-01 20:05:19 -07:00
Tiezhu Yang
b9e94a7bb6 test_kprobes: Move it from kernel/ to lib/
Since config KPROBES_SANITY_TEST is in lib/Kconfig.debug, it is better to
let test_kprobes.c in lib/, just like other similar tests found in lib/.

Link: https://lkml.kernel.org/r/1635213091-24387-4-git-send-email-yangtiezhu@loongson.cn

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-26 17:23:46 -04:00
Peter Zijlstra
77e52ae354 futex: Move to kernel/futex/
In preparation for splitup..

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: André Almeida <andrealmeid@collabora.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: André Almeida <andrealmeid@collabora.com>
Link: https://lore.kernel.org/r/20210923171111.300673-2-andrealmeid@collabora.com
2021-10-07 13:51:07 +02:00
Masahiro Yamada
46b41d5dd8 kbuild: update config_data.gz only when the content of .config is changed
If the timestamp of the .config file is updated, config_data.gz is
regenerated, then vmlinux is re-linked. This occurs even if the content
of the .config has not changed at all.

This issue was mitigated by commit 67424f61f8 ("kconfig: do not write
.config if the content is the same"); Kconfig does not update the
.config when it ends up with the identical configuration.

The issue is remaining when the .config is created by *_defconfig with
some config fragment(s) applied on top.

This is typical for powerpc and mips, where several *_defconfig targets
are constructed by using merge_config.sh.

One workaround is to have the copy of the .config. The filechk rule
updates the copy, kernel/config_data, by checking the content instead
of the timestamp.

With this commit, the second run with the same configuration avoids
the needless rebuilds.

  $ make ARCH=mips defconfig all
   [ snip ]
  $ make ARCH=mips defconfig all
  *** Default configuration is based on target '32r2el_defconfig'
  Using ./arch/mips/configs/generic_defconfig as base
  Merging arch/mips/configs/generic/32r2.config
  Merging arch/mips/configs/generic/el.config
  Merging ./arch/mips/configs/generic/board-boston.config
  Merging ./arch/mips/configs/generic/board-ni169445.config
  Merging ./arch/mips/configs/generic/board-ocelot.config
  Merging ./arch/mips/configs/generic/board-ranchu.config
  Merging ./arch/mips/configs/generic/board-sead-3.config
  Merging ./arch/mips/configs/generic/board-xilfpga.config
  #
  # configuration written to .config
  #
    SYNC    include/config/auto.conf
    CALL    scripts/checksyscalls.sh
    CALL    scripts/atomic/check-atomics.sh
    CHK     include/generated/compile.h
    CHK     include/generated/autoksyms.h

Reported-by: Elliot Berman <eberman@codeaurora.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2021-05-02 00:43:35 +09:00
Sami Tolvanen
cf68fffb66 add support for Clang CFI
This change adds support for Clang’s forward-edge Control Flow
Integrity (CFI) checking. With CONFIG_CFI_CLANG, the compiler
injects a runtime check before each indirect function call to ensure
the target is a valid function with the correct static type. This
restricts possible call targets and makes it more difficult for
an attacker to exploit bugs that allow the modification of stored
function pointers. For more details, see:

  https://clang.llvm.org/docs/ControlFlowIntegrity.html

Clang requires CONFIG_LTO_CLANG to be enabled with CFI to gain
visibility to possible call targets. Kernel modules are supported
with Clang’s cross-DSO CFI mode, which allows checking between
independently compiled components.

With CFI enabled, the compiler injects a __cfi_check() function into
the kernel and each module for validating local call targets. For
cross-module calls that cannot be validated locally, the compiler
calls the global __cfi_slowpath_diag() function, which determines
the target module and calls the correct __cfi_check() function. This
patch includes a slowpath implementation that uses __module_address()
to resolve call targets, and with CONFIG_CFI_CLANG_SHADOW enabled, a
shadow map that speeds up module look-ups by ~3x.

Clang implements indirect call checking using jump tables and
offers two methods of generating them. With canonical jump tables,
the compiler renames each address-taken function to <function>.cfi
and points the original symbol to a jump table entry, which passes
__cfi_check() validation. This isn’t compatible with stand-alone
assembly code, which the compiler doesn’t instrument, and would
result in indirect calls to assembly code to fail. Therefore, we
default to using non-canonical jump tables instead, where the compiler
generates a local jump table entry <function>.cfi_jt for each
address-taken function, and replaces all references to the function
with the address of the jump table entry.

Note that because non-canonical jump table addresses are local
to each component, they break cross-module function address
equality. Specifically, the address of a global function will be
different in each module, as it's replaced with the address of a local
jump table entry. If this address is passed to a different module,
it won’t match the address of the same function taken there. This
may break code that relies on comparing addresses passed from other
components.

CFI checking can be disabled in a function with the __nocfi attribute.
Additionally, CFI can be disabled for an entire compilation unit by
filtering out CC_FLAGS_CFI.

By default, CFI failures result in a kernel panic to stop a potential
exploit. CONFIG_CFI_PERMISSIVE enables a permissive mode, where the
kernel prints out a rate-limited warning instead, and allows execution
to continue. This option is helpful for locating type mismatches, but
should only be enabled during development.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210408182843.1754385-2-samitolvanen@google.com
2021-04-08 16:04:20 -07:00
Chris Wilson
bfe3911a91 kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE
Userspace has discovered the functionality offered by SYS_kcmp and has
started to depend upon it. In particular, Mesa uses SYS_kcmp for
os_same_file_description() in order to identify when two fd (e.g. device
or dmabuf) point to the same struct file. Since they depend on it for
core functionality, lift SYS_kcmp out of the non-default
CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.

Rasmus Villemoes also pointed out that systemd uses SYS_kcmp to
deduplicate the per-service file descriptor store.

Note that some distributions such as Ubuntu are already enabling
CHECKPOINT_RESTORE in their configs and so, by extension, SYS_kcmp.

References: https://gitlab.freedesktop.org/drm/intel/-/issues/3046
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: stable@vger.kernel.org
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> # DRM depends on kcmp
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> # systemd uses kcmp
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210205220012.1983-1-chris@chris-wilson.co.uk
2021-02-16 09:59:41 +01:00
Dmitry Vyukov
c1cb05e77f kcov: don't instrument with UBSAN
Both KCOV and UBSAN use compiler instrumentation.  If UBSAN detects a bug
in KCOV, it may cause infinite recursion via printk and other common
functions.  We already don't instrument KCOV with KASAN/KCSAN for this
reason, don't instrument it with UBSAN as well.

As a side effect this also resolves the following gcc warning:

conflicting types for built-in function '__sanitizer_cov_trace_switch';
expected 'void(long unsigned int, void *)'
[-Wbuiltin-declaration-mismatch]

It's only reported when kcov.c is compiled with any of the sanitizers
enabled.  Size of the arguments is correct, it's just that gcc uses 'long'
on 64-bit arches and 'long long' on 32-bit arches, while kernel type is
always 'long long'.

Link: https://lkml.kernel.org/r/20201209100152.2492072-1-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Suggested-by: Marco Elver <elver@google.com>
Acked-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-15 22:46:19 -08:00
Linus Torvalds
aab7ce2b09 ACPI updates for 5.11-rc1
- Update ACPICA code in the kernel to upstream revision 20201113
    with changes as follows:
 
    * Add 5 new UUIDs to the known UUID table (Bob Moore).
    * Remove extreaneous "the" in comments (Colin Ian King).
    * Add function trace macros to improve debugging (Erik Kaneda).
    * Fix interpreter memory leak (Erik Kaneda).
    * Handle "orphan" _REG for GPIO OpRegions (Hans de Goede).
 
  - Introduce resource_union() and resource_intersection() helpers
    and clean up some resource-manipulation code with the help of
    them (Andy Shevchenko).
 
  - Revert problematic commit related to the handling of resources
    in the ACPI core (Daniel Scally).
 
  - Extend the ACPI device enumeration documentation and the
    gpio-line-names _DSD property documentation, clean up the
    latter (Flavio Suligoi).
 
  - Clean up _DEP handling during device enumeration, modify the list
    of _DEP exceptions and the handling of it and fix up terminology
    related to _DEP (Hans de Goede, Rafael Wysocki).
 
  - Eliminate in_interrupt() usage from the ACPI EC driver (Sebastian
    Andrzej Siewior).
 
  - Clean up the advance_transaction() routine and related code in
    the ACPI EC driver (Rafael Wysocki).
 
  - Add new backlight quirk for GIGABYTE GB-BXBT-2807 (Jasper
    St. Pierre).
 
  - Make assorted janitorial changes in several ACPI-related pieces
    of code (Hanjun Guo, Jason Yan, Punit Agrawal).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl/Y80sSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxLnsP/0B+6dSTqLp7S1UJbScta1gKnDwyFGAi
 eA3MqyMfF08DiwWocJoDDjttxo5d+0NV2g9LdcScdwGrKB9ddWuhA6/IT83B7YA8
 K4Yie4Gm1StLR+rs+zLJy9GBQE3P/Ya058WJJ+mvaTkn4SlI1qf8ZhOEzlCArWpM
 0g9+Y7n16A8/9dTTd8eVf80UrPLOuxPPIZuSsCdZCDEOLGPH+zJive0X0wEqsXvU
 hiq8DymH2ooIcscYXu3wEKtAfcxUjGhmrK8/M7dJ1j9Mh7JxQsIIcQIQogG2uE53
 Gm71E1LkQg3FYzFCYmtSNgOnjMMDfV3r/WH4f49kK1Hajn2erYvEv5kXpF+J+Q6X
 rXXRk17puszq5/gvrebDFpUou7MmV4smOeXPQIobJuXnc9c48QEBqx8tshuJXsVv
 5EzD6tITGl/iLu3bqH5M9G5BjIZYB9kjUPzxJ3ezl9rtJXsiMN21MkM4j3KR1rnW
 g3JYMuw1hwa9wVNFraMyCOtcfncCPJNzdi9TT3ZzMDLHTKbd0jmpyBKws+dI5G1z
 y7utKOi65utX6mTtOwhITVbSu7Gura7FRXkxo3K7kG8WViYXhDvB1mLylrpipNDl
 ZWgX6XMYEtRhSvvTwZM0Wcgs9p2JIgfQvbz1Q9pv5FLbf1nceRMaiiGrJTPiGYZD
 /maVvuddxEWM
 =5LZe
 -----END PGP SIGNATURE-----

Merge tag 'acpi-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI updates from Rafael Wysocki:
 "These update the ACPICA code in the kernel to upstream revision
  20201113, fix and clean up some resources manipulation code, extend
  the enumeration and gpio-line-names property documentation, clean up
  the handling of _DEP during device enumeration, add a new backlight
  DMI quirk, clean up transaction handling in the EC driver and make
  some assorted janitorial changes.

  Specifics:

   - Update ACPICA code in the kernel to upstream revision 20201113 with
     changes as follows:
       * Add 5 new UUIDs to the known UUID table (Bob Moore)
       * Remove extreaneous "the" in comments (Colin Ian King)
       * Add function trace macros to improve debugging (Erik Kaneda)
       * Fix interpreter memory leak (Erik Kaneda)
       * Handle "orphan" _REG for GPIO OpRegions (Hans de Goede)

   - Introduce resource_union() and resource_intersection() helpers and
     clean up some resource-manipulation code with the help of them
     (Andy Shevchenko)

   - Revert problematic commit related to the handling of resources in
     the ACPI core (Daniel Scally)

   - Extend the ACPI device enumeration documentation and the
     gpio-line-names _DSD property documentation, clean up the latter
     (Flavio Suligoi)

   - Clean up _DEP handling during device enumeration, modify the list
     of _DEP exceptions and the handling of it and fix up terminology
     related to _DEP (Hans de Goede, Rafael Wysocki)

   - Eliminate in_interrupt() usage from the ACPI EC driver (Sebastian
     Andrzej Siewior)

   - Clean up the advance_transaction() routine and related code in the
     ACPI EC driver (Rafael Wysocki)

   - Add new backlight quirk for GIGABYTE GB-BXBT-2807 (Jasper St
     Pierre)

   - Make assorted janitorial changes in several ACPI-related pieces of
     code (Hanjun Guo, Jason Yan, Punit Agrawal)"

* tag 'acpi-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (40 commits)
  ACPI: scan: Fix up _DEP-related terminology with supplier/consumer
  ACPI: scan: Drop INT3396 from acpi_ignore_dep_ids[]
  ACPI: video: Add DMI quirk for GIGABYTE GB-BXBT-2807
  Revert "ACPI / resources: Use AE_CTRL_TERMINATE to terminate resources walks"
  ACPI: scan: Add PNP0D80 to the _DEP exceptions list
  ACPI: scan: Call acpi_get_object_info() from acpi_add_single_object()
  ACPI: scan: Add acpi_info_matches_hids() helper
  ACPICA: Update version to 20201113
  ACPICA: Interpreter: fix memory leak by using existing buffer
  ACPICA: Add function trace macros to improve debugging
  ACPICA: Also handle "orphan" _REG methods for GPIO OpRegions
  ACPICA: Remove extreaneous "the" in comments
  ACPICA: Add 5 new UUIDs to the known UUID table
  resource: provide meaningful MODULE_LICENSE() in test suite
  ASoC: Intel: catpt: Replace open coded variant of resource_intersection()
  ACPI: processor: Drop duplicate setting of shared_cpu_map
  ACPI: EC: Clean up status flags checks in advance_transaction()
  ACPI: EC: Untangle error handling in advance_transaction()
  ACPI: EC: Simplify error handling in advance_transaction()
  ACPI: EC: Rename acpi_ec_is_gpe_raised()
  ...
2020-12-15 16:39:06 -08:00
Arnd Bergmann
6e7b64b9dd elfcore: fix building with clang
kernel/elfcore.c only contains weak symbols, which triggers a bug with
clang in combination with recordmcount:

  Cannot find symbol for section 2: .text.
  kernel/elfcore.o: failed

Move the empty stubs into linux/elfcore.h as inline functions.  As only
two architectures use these, just use the architecture specific Kconfig
symbols to key off the declaration.

Link: https://lkml.kernel.org/r/20201204165742.3815221-2-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Barret Rhoden <brho@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-11 14:02:14 -08:00
Andy Shevchenko
5df38ca6af resource: Add test cases for new resource API
Add test cases for newly added resource APIs.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-11-17 18:06:28 +01:00
Linus Torvalds
746b25b1aa Kbuild updates for v5.10
- Support 'make compile_commands.json' to generate the compilation
    database more easily, avoiding stale entries
 
  - Support 'make clang-analyzer' and 'make clang-tidy' for static checks
    using clang-tidy
 
  - Preprocess scripts/modules.lds.S to allow CONFIG options in the module
    linker script
 
  - Drop cc-option tests from compiler flags supported by our minimal
    GCC/Clang versions
 
  - Use always 12-digits commit hash for CONFIG_LOCALVERSION_AUTO=y
 
  - Use sha1 build id for both BFD linker and LLD
 
  - Improve deb-pkg for reproducible builds and rootless builds
 
  - Remove stale, useless scripts/namespace.pl
 
  - Turn -Wreturn-type warning into error
 
  - Fix build error of deb-pkg when CONFIG_MODULES=n
 
  - Replace 'hostname' command with more portable 'uname -n'
 
  - Various Makefile cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAl+RfS0VHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGG1QP/2hzoMzK1YXErPUhGrhYU1rxz7Nu
 HkLTIkyKF1HPwSJf5XyNW/FTBI4SDlkNoVg/weEDCS1yFxxpvQLIck8ChzA1kIIM
 P+1IfBWOTzqn91XsapU2zwSno3gylphVchVIvYAB3oLUotGeMSluy1cQtBRzyA5D
 rj2Q7H8fzkzk3YoBcBC/BOKDlfo/usqQ1X/gsfRFwN/BJxeZSYoujNBE7KtHaDsd
 8K/ggBIqmST4NBn+M8c11d8CxzvWbtG1gq3EkUL5nG8T13DsGn1EFC0SPt85bkvv
 f9YywfJi37HixhZzK6tXYjN/PWoiEY6z90mhd0NtZghQT7kQMiTQ3sWrM8dX3ssf
 phBzO94uFQDjhyxOaSSsCoI/TIciAPo4+G8PNjcaEtj63IEfhEz/dnlstYwY5Y9P
 Pp3aZtVjSGJwGW2u2EUYj6paFVqjf6DXQjQKPNHnsYCEidIvFTjjguRGvx9gl6mx
 yd8oseOsAtOEf0alRe9MMdvN17O3UrRAxgBdap7fktg02TLVRGxZIbuwKmBf29ho
 ORl9zeFkYBn6XQFyuItJoXy/kYFyHDaBEPYCRQcY4dwqcjZIiAc/FhYbqYthJ59L
 5vLN2etmDIVSuUv1J5nBqHHGCqJChykbqg7riQ651dCNKw4gZB8ctCay2lXhBXMg
 1mqOcoG5WWL7//F+
 =tZRN
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Support 'make compile_commands.json' to generate the compilation
   database more easily, avoiding stale entries

 - Support 'make clang-analyzer' and 'make clang-tidy' for static checks
   using clang-tidy

 - Preprocess scripts/modules.lds.S to allow CONFIG options in the
   module linker script

 - Drop cc-option tests from compiler flags supported by our minimal
   GCC/Clang versions

 - Use always 12-digits commit hash for CONFIG_LOCALVERSION_AUTO=y

 - Use sha1 build id for both BFD linker and LLD

 - Improve deb-pkg for reproducible builds and rootless builds

 - Remove stale, useless scripts/namespace.pl

 - Turn -Wreturn-type warning into error

 - Fix build error of deb-pkg when CONFIG_MODULES=n

 - Replace 'hostname' command with more portable 'uname -n'

 - Various Makefile cleanups

* tag 'kbuild-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (34 commits)
  kbuild: Use uname for LINUX_COMPILE_HOST detection
  kbuild: Only add -fno-var-tracking-assignments for old GCC versions
  kbuild: remove leftover comment for filechk utility
  treewide: remove DISABLE_LTO
  kbuild: deb-pkg: clean up package name variables
  kbuild: deb-pkg: do not build linux-headers package if CONFIG_MODULES=n
  kbuild: enforce -Werror=return-type
  scripts: remove namespace.pl
  builddeb: Add support for all required debian/rules targets
  builddeb: Enable rootless builds
  builddeb: Pass -n to gzip for reproducible packages
  kbuild: split the build log of kallsyms
  kbuild: explicitly specify the build id style
  scripts/setlocalversion: make git describe output more reliable
  kbuild: remove cc-option test of -Werror=date-time
  kbuild: remove cc-option test of -fno-stack-check
  kbuild: remove cc-option test of -fno-strict-overflow
  kbuild: move CFLAGS_{KASAN,UBSAN,KCSAN} exports to relevant Makefiles
  kbuild: remove redundant CONFIG_KASAN check from scripts/Makefile.kasan
  kbuild: do not create built-in objects for external module builds
  ...
2020-10-22 13:13:57 -07:00
Sami Tolvanen
0f6372e522 treewide: remove DISABLE_LTO
This change removes all instances of DISABLE_LTO from
Makefiles, as they are currently unused, and the preferred
method of disabling LTO is to filter out the flags instead.

Note added by Masahiro Yamada:
DISABLE_LTO was added as preparation for GCC LTO, but GCC LTO was
not pulled into the mainline. (https://lkml.org/lkml/2014/4/8/272)

Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-10-21 00:28:53 +09:00
Linus Torvalds
41eea65e2a Merge tag 'core-rcu-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU changes from Ingo Molnar:

 - Debugging for smp_call_function()

 - RT raw/non-raw lock ordering fixes

 - Strict grace periods for KASAN

 - New smp_call_function() torture test

 - Torture-test updates

 - Documentation updates

 - Miscellaneous fixes

[ This doesn't actually pull the tag - I've dropped the last merge from
  the RCU branch due to questions about the series.   - Linus ]

* tag 'core-rcu-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (77 commits)
  smp: Make symbol 'csd_bug_count' static
  kernel/smp: Provide CSD lock timeout diagnostics
  smp: Add source and destination CPUs to __call_single_data
  rcu: Shrink each possible cpu krcp
  rcu/segcblist: Prevent useless GP start if no CBs to accelerate
  torture: Add gdb support
  rcutorture: Allow pointer leaks to test diagnostic code
  rcutorture: Hoist OOM registry up one level
  refperf: Avoid null pointer dereference when buf fails to allocate
  rcutorture: Properly synchronize with OOM notifier
  rcutorture: Properly set rcu_fwds for OOM handling
  torture: Add kvm.sh --help and update help message
  rcutorture: Add CONFIG_PROVE_RCU_LIST to TREE05
  torture: Update initrd documentation
  rcutorture: Replace HTTP links with HTTPS ones
  locktorture: Make function torture_percpu_rwsem_init() static
  torture: document --allcpus argument added to the kvm.sh script
  rcutorture: Output number of elapsed grace periods
  rcutorture: Remove KCSAN stubs
  rcu: Remove unused "cpu" parameter from rcu_report_qs_rdp()
  ...
2020-10-18 14:34:50 -07:00