Commit Graph

4567 Commits

Author SHA1 Message Date
Arnd Bergmann
6053915252 RISC-V Sophgo Devicetree fixes for v6.15-rc1
Just one minor fix to correct DMA data-width
 configuration for CV18xx.
 
 Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEdoBX2jyDC9ZCTwZjDCzASqG0i0IFAmgRzJcACgkQDCzASqG0
 i0JxJgv7B6wAej6oTDhLuAfWINmET17+vsmLLvdGIw4fonIqO0CbCQAB6Zco8tW8
 63WikULIsLiaoreLlQ1k/ydENj2cL6VQj4VyficDMt8zxBbs6Hfm1l8w+iz6+K5Q
 5bOm8yGnR8AFnCeCu1eDsKFYe4ZxIzG1Ya07NPSDAu28f/AvwGd5YqCUos55GCOQ
 AuXvsFSr8EwTEVXGhkaYt7jqfwWgUAkQfzeOXpvPojqbnVMKp1+oH1PH1s3S251t
 XfBgssQjiCYAJtDoKeYaChzniisdwgyd1QovSeoeTeciZUc31TzNBt19z5aXuIpp
 CH3b1enGRVAvKIzEyvDSJWNs8NVQl5Q+0peiaYzQ3Ql0iuFjhFYLd4K5kbwbpMRv
 bduUXvTledr27KtQ7O+IjjpxcF55yj5bbKimk6FM+Pmb0+bNZzyXeihs3FGHm7tx
 gKsF0+MHonybB/47FGqjnKhq9wnpRvFKCJeWmAsIBtzMmrc209+VPaa3hKveS2AN
 AY8LGJTH
 =4v7N
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmgeJsMACgkQYKtH/8kJ
 Uieg+xAA2ckAQkuxTtGYVKimTgJAOhzBcg9QRBeYbFaQUSG/rF1/uov0/WlJWCsx
 4GjETNmmA4+C0JXtud6gyqld2gVZU7Fa9hFylv4s7DZJSfS8EGbdjTKGDmUNamQv
 LTiCmoRtODAQnWaB+pTBCjuyWzBwDl3PS//3Q58Kql/Gha+CrQBZTiMz6S3dnNV5
 2czwWGwCxvI29aI7mpx7MXKoDGlqiGbAFQDOjvdYT4bVBjHcHc/hZKwv/HTM3L9x
 ioUF6KYo8jVvjJQ8P1IlBqTmxMZwHKFiJHMkNELf/Xw9h7OIIsTaPnjW7Af5p9Z3
 bCXoZTH9blkDCs0xSCdrzweBLrRnZjN32qFYSk5rBJ1kP/Nri3bWlBtqykgEoo43
 SMkITWSNZjHCfbGmgGo/HcIPeaMLPyTqle7BS8w7Z/5yubCbdjvI/ItYOfggp7UU
 CYR0Nd5UDB8XwDI1NpN/U1dsovtahgedTIDK+t3furjGRDlsrXKwaPr4OYgR8To3
 BwNMtNlN5bOFe0C+UuBoWik5npNM8jjN/6N/ARbAecqnv8EV0svbIb0PPalEwLYA
 IqKcIiaxIgxmYvzOWynQ0xwYWFgIIJAI/Vw0EVJlm8tZAfA0hfQGH+LoWf6DuQ6e
 zmAFyEekNp4nHUZvz+duUIVce9ozbW78AW33046nlR4LiRKel9M=
 =3e4j
 -----END PGP SIGNATURE-----

Merge tag 'riscv-sophgo-dt-fixes-for-v6.15-rc1' of https://github.com/sophgo/linux into arm/fixes

RISC-V Sophgo Devicetree fixes for v6.15-rc1

Just one minor fix to correct DMA data-width
configuration for CV18xx.

Signed-off-by: Chen Wang <unicorn_wang@outlook.com>

* tag 'riscv-sophgo-dt-fixes-for-v6.15-rc1' of https://github.com/sophgo/linux:
  riscv: dts: sophgo: fix DMA data-width configuration for CV18xx

Link: https://lore.kernel.org/r/MA0P287MB2262454C19B8899BC1694D04FE832@MA0P287MB2262.INDP287.PROD.OUTLOOK.COM
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-05-09 18:01:07 +02:00
Palmer Dabbelt
1a3f698088
Merge patch series "riscv: Add vendor extensions support for SiFive"
Cyan Yang <cyan.yang@sifive.com> says:

This patch set adds four vendor-specific ISA extensions from SiFive:
"xsfvqmaccdod", "xsfvqmaccqoq", "xsfvfnrclipxfqf", and "xsfvfwmaccqqq".

Additionally, a new hwprobe key, RISCV_HWPROBE_KEY_VENDOR_EXT_SIFIVE_0,
has been added to query which SiFive vendor extensions are supported on
the current platform.

Signed-off-by: Cyan Yang <cyan.yang@sifive.com>
Link: https://lore.kernel.org/r/20250418053239.4351-1-cyan.yang@sifive.com

* b4-shazam-merge:
  riscv: hwprobe: Add SiFive xsfvfwmaccqqq vendor extension
  riscv: hwprobe: Document SiFive xsfvfwmaccqqq vendor extension
  riscv: Add SiFive xsfvfwmaccqqq vendor extension
  dt-bindings: riscv: Add xsfvfwmaccqqq ISA extension description
  riscv: hwprobe: Add SiFive xsfvfnrclipxfqf vendor extension
  riscv: hwprobe: Document SiFive xsfvfnrclipxfqf vendor extension
  riscv: Add SiFive xsfvfnrclipxfqf vendor extension
  dt-bindings: riscv: Add xsfvfnrclipxfqf ISA extension description
  riscv: hwprobe: Add SiFive vendor extension support and probe for xsfqmaccdod and xsfqmaccqoq
  riscv: hwprobe: Document SiFive xsfvqmaccdod and xsfvqmaccqoq vendor extensions
  riscv: Add SiFive xsfvqmaccdod and xsfvqmaccqoq vendor extensions
  dt-bindings: riscv: Add xsfvqmaccdod and xsfvqmaccqoq ISA extension description

Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-05-08 11:01:47 -07:00
Cyan Yang
d9669e33c8
riscv: hwprobe: Add SiFive xsfvfwmaccqqq vendor extension
Add hwprobe for SiFive "xsfvfwmaccqqq" vendor extension.

Signed-off-by: Cyan Yang <cyan.yang@sifive.com>
Link: https://lore.kernel.org/r/20250418053239.4351-13-cyan.yang@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-05-08 11:01:44 -07:00
Cyan Yang
34e9b16b4b
riscv: Add SiFive xsfvfwmaccqqq vendor extension
Add SiFive vendor extension "xsfvfwmaccqqq" support to the kernel.

Signed-off-by: Cyan Yang <cyan.yang@sifive.com>
Link: https://lore.kernel.org/r/20250418053239.4351-11-cyan.yang@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-05-08 11:01:44 -07:00
Cyan Yang
1d91224394
riscv: hwprobe: Add SiFive xsfvfnrclipxfqf vendor extension
Add hwprobe for SiFive "xsfvfnrclipxfqf" vendor extension.

Signed-off-by: Cyan Yang <cyan.yang@sifive.com>
Link: https://lore.kernel.org/r/20250418053239.4351-9-cyan.yang@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-05-08 11:01:44 -07:00
Cyan Yang
e84fffe21b
riscv: Add SiFive xsfvfnrclipxfqf vendor extension
Add SiFive vendor extension "xsfvfnrclipxfqf" support to the kernel.

Signed-off-by: Cyan Yang <cyan.yang@sifive.com>
Link: https://lore.kernel.org/r/20250418053239.4351-7-cyan.yang@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-05-08 11:01:43 -07:00
Cyan Yang
1a6274f035
riscv: hwprobe: Add SiFive vendor extension support and probe for xsfqmaccdod and xsfqmaccqoq
Add a new hwprobe key "RISCV_HWPROBE_KEY_VENDOR_EXT_SIFIVE_0" which allows
userspace to probe for the new vendor extensions from SiFive. Also, add
new hwprobe for SiFive "xsfvqmaccdod" and "xsfvqmaccqoq" vendor
extensions.

Signed-off-by: Cyan Yang <cyan.yang@sifive.com>
Link: https://lore.kernel.org/r/20250418053239.4351-5-cyan.yang@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-05-08 11:01:43 -07:00
Cyan Yang
e8fd215ed0
riscv: hwprobe: Document SiFive xsfvqmaccdod and xsfvqmaccqoq vendor extensions
Document the support for sifive vendor extensions using the key
RISCV_HWPROBE_KEY_VENDOR_EXT_SIFIVE_0 and two vendor extensions for SiFive
Int8 Matrix Multiplication Instructions using
RISCV_HWPROBE_VENDOR_EXT_XSFVQMACCDOD and
RISCV_HWPROBE_VENDOR_EXT_XSFVQMACCQOQ.

Signed-off-by: Cyan Yang <cyan.yang@sifive.com>
Link: https://lore.kernel.org/r/20250418053239.4351-4-cyan.yang@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-05-08 11:01:43 -07:00
Cyan Yang
2d147d77ae
riscv: Add SiFive xsfvqmaccdod and xsfvqmaccqoq vendor extensions
Add SiFive vendor extension support to the kernel with the target of
"xsfvqmaccdod" and "xsfvqmaccqoq".

Signed-off-by: Cyan Yang <cyan.yang@sifive.com>
Link: https://lore.kernel.org/r/20250418053239.4351-3-cyan.yang@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-05-08 11:01:43 -07:00
Xi Ruoyao
2940954c1a
riscv: vDSO: Remove --hash-style=both
When RISC-V borned, DT_GNU_HASH had already became the de-facto
standard so DT_HASH is just wasting storage space.  Remove the explicit
--hash-style=both setting and rely on the distro toolchain default,
which is most likely "gnu" (i.e. generating only DT_GNU_HASH, no
DT_HASH).

Following the logic of commit 48f6430505
("arm64/vdso: Remove --hash-style=sysv").

Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Link: https://lore.kernel.org/r/20250224112042.60282-2-xry111@xry111.site
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-05-08 10:46:45 -07:00
Palmer Dabbelt
259aaf03d7
Merge patch series "riscv: uaccess: optimisations"
Cyril Bur <cyrilbur@tenstorrent.com> says:

This series tries to optimize riscv uaccess by allowing the use of
user_access_begin() and user_access_end() which permits grouping user accesses
and avoiding the CSR write penalty for each access.

The error path can also be optimised using asm goto which patches 3 and 4
achieve. This will speed up jumping to labels by avoiding the need of an
intermediary error type variable within the uaccess macros

I did read the discussion this series generated. It isn't clear to me
which direction to take the patches, if any.

* b4-shazam-merge:
  riscv: uaccess: use 'asm_goto_output' for get_user()
  riscv: uaccess: use 'asm goto' for put_user()
  riscv: uaccess: use input constraints for ptr of __put_user()
  riscv: implement user_access_begin() and families
  riscv: save the SR_SUM status over switches

Link: https://lore.kernel.org/r/20250410070526.3160847-1-cyrilbur@tenstorrent.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-05-08 10:08:01 -07:00
Jisheng Zhang
f6bff7827a
riscv: uaccess: use 'asm_goto_output' for get_user()
With 'asm goto' we don't need to test the error etc, the exception just
jumps to the error handling directly.

Unlike put_user(), get_user() must work around GCC bugs [1] when using
output clobbers in an asm goto statement.

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921 # 1

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
[Cyril Bur: Rewritten commit message]
Signed-off-by: Cyril Bur <cyrilbur@tenstorrent.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250410070526.3160847-6-cyrilbur@tenstorrent.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-05-08 10:01:00 -07:00
Jisheng Zhang
cdf647e817
riscv: uaccess: use 'asm goto' for put_user()
With 'asm goto' we don't need to test the error etc, the exception just
jumps to the error handling directly.

Because there are no output clobbers which could trigger gcc bugs [1]
the use of asm_goto_output() macro is not necessary here. Not using
asm_goto_output() is desirable as the generated output asm will be
cleaner.

Use of the volatile keyword is redundant as per gcc 14.2.0 manual section
6.48.2.7 Goto Labels:
> Also note that an asm goto statement is always implicitly considered
  volatile.

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921 # 1

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
[Cyril Bur: Rewritten commit message]
Signed-off-by: Cyril Bur <cyrilbur@tenstorrent.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250410070526.3160847-5-cyrilbur@tenstorrent.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-05-08 10:01:00 -07:00
Jisheng Zhang
62135bf660
riscv: uaccess: use input constraints for ptr of __put_user()
Putting ptr in the inputs as opposed to output may seem incorrect but
this is done for a few reasons:
- Not having it in the output permits the use of asm goto in a
  subsequent patch. There are bugs in gcc [1] which would otherwise
  prevent it.
- Since the output memory is userspace there isn't any real benefit from
  telling the compiler about the memory clobber.
- x86, arm and powerpc all use this technique.

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921 # 1

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
[Cyril Bur: Rewritten commit message]
Signed-off-by: Cyril Bur <cyrilbur@tenstorrent.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250410070526.3160847-4-cyrilbur@tenstorrent.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-05-08 10:01:00 -07:00
Jisheng Zhang
19500c6dbc
riscv: implement user_access_begin() and families
Currently, when a function like strncpy_from_user() is called,
the userspace access protection is disabled and enabled
for every word read.

By implementing user_access_begin() and families, the protection
is disabled at the beginning of the copy and enabled at the end.

The __inttype macro is borrowed from x86 implementation.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Cyril Bur <cyrilbur@tenstorrent.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250410070526.3160847-3-cyrilbur@tenstorrent.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-05-08 10:01:00 -07:00
Ben Dooks
788aa64c01
riscv: save the SR_SUM status over switches
When threads/tasks are switched we need to ensure the old execution's
SR_SUM state is saved and the new thread has the old SR_SUM state
restored.

The issue was seen under heavy load especially with the syz-stress tool
running, with crashes as follows in schedule_tail:

Unable to handle kernel access to user memory without uaccess routines
at virtual address 000000002749f0d0
Oops [#1]
Modules linked in:
CPU: 1 PID: 4875 Comm: syz-executor.0 Not tainted
5.12.0-rc2-syzkaller-00467-g0d7588ab9ef9 #0
Hardware name: riscv-virtio,qemu (DT)
epc : schedule_tail+0x72/0xb2 kernel/sched/core.c:4264
 ra : task_pid_vnr include/linux/sched.h:1421 [inline]
 ra : schedule_tail+0x70/0xb2 kernel/sched/core.c:4264
epc : ffffffe00008c8b0 ra : ffffffe00008c8ae sp : ffffffe025d17ec0
 gp : ffffffe005d25378 tp : ffffffe00f0d0000 t0 : 0000000000000000
 t1 : 0000000000000001 t2 : 00000000000f4240 s0 : ffffffe025d17ee0
 s1 : 000000002749f0d0 a0 : 000000000000002a a1 : 0000000000000003
 a2 : 1ffffffc0cfac500 a3 : ffffffe0000c80cc a4 : 5ae9db91c19bbe00
 a5 : 0000000000000000 a6 : 0000000000f00000 a7 : ffffffe000082eba
 s2 : 0000000000040000 s3 : ffffffe00eef96c0 s4 : ffffffe022c77fe0
 s5 : 0000000000004000 s6 : ffffffe067d74e00 s7 : ffffffe067d74850
 s8 : ffffffe067d73e18 s9 : ffffffe067d74e00 s10: ffffffe00eef96e8
 s11: 000000ae6cdf8368 t3 : 5ae9db91c19bbe00 t4 : ffffffc4043cafb2
 t5 : ffffffc4043cafba t6 : 0000000000040000
status: 0000000000000120 badaddr: 000000002749f0d0 cause:
000000000000000f
Call Trace:
[<ffffffe00008c8b0>] schedule_tail+0x72/0xb2 kernel/sched/core.c:4264
[<ffffffe000005570>] ret_from_exception+0x0/0x14
Dumping ftrace buffer:
   (ftrace buffer empty)
---[ end trace b5f8f9231dc87dda ]---

The issue comes from the put_user() in schedule_tail
(kernel/sched/core.c) doing the following:

asmlinkage __visible void schedule_tail(struct task_struct *prev)
{
...
        if (current->set_child_tid)
                put_user(task_pid_vnr(current), current->set_child_tid);
...
}

the put_user() macro causes the code sequence to come out as follows:

1:	__enable_user_access()
2:	reg = task_pid_vnr(current);
3:	*current->set_child_tid = reg;
4:	__disable_user_access()

The problem is that we may have a sleeping function as argument which
could clear SR_SUM causing the panic above. This was fixed by
evaluating the argument of the put_user() macro outside the user-enabled
section in commit 285a76bb2c ("riscv: evaluate put_user() arg before
enabling user access")"

In order for riscv to take advantage of unsafe_get/put_XXX() macros and
to avoid the same issue we had with put_user() and sleeping functions we
must ensure code flow can go through switch_to() from within a region of
code with SR_SUM enabled and come back with SR_SUM still enabled. This
patch addresses the problem allowing future work to enable full use of
unsafe_get/put_XXX() macros without needing to take a CSR bit flip cost
on every access. Make switch_to() save and restore SR_SUM.

Reported-by: syzbot+e74b94fe601ab9552d69@syzkaller.appspotmail.com
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Cyril Bur <cyrilbur@tenstorrent.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Deepak Gupta <debug@rivosinc.com>
Link: https://lore.kernel.org/r/20250410070526.3160847-2-cyrilbur@tenstorrent.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-05-08 10:01:00 -07:00
Samuel Holland
7f1c3de137 riscv: Disallow PR_GET_TAGGED_ADDR_CTRL without Supm
When the prctl() interface for pointer masking was added, it did not
check that the pointer masking ISA extension was supported, only the
individual submodes. Userspace could still attempt to disable pointer
masking and query the pointer masking state. commit 81de1afb2dd1
("riscv: Fix kernel crash due to PR_SET_TAGGED_ADDR_CTRL") disallowed
the former, as the senvcfg write could crash on older systems.
PR_GET_TAGGED_ADDR_CTRL state does not crash, because it reads only
kernel-internal state and not senvcfg, but it should still be disallowed
for consistency.

Fixes: 09d6775f50 ("riscv: Add support for userspace pointer masking")
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20250507145230.2272871-1-samuel.holland@sifive.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-05-08 12:01:01 +00:00
Nam Cao
ae08d55807 riscv: Fix kernel crash due to PR_SET_TAGGED_ADDR_CTRL
When userspace does PR_SET_TAGGED_ADDR_CTRL, but Supm extension is not
available, the kernel crashes:

Oops - illegal instruction [#1]
    [snip]
epc : set_tagged_addr_ctrl+0x112/0x15a
 ra : set_tagged_addr_ctrl+0x74/0x15a
epc : ffffffff80011ace ra : ffffffff80011a30 sp : ffffffc60039be10
    [snip]
status: 0000000200000120 badaddr: 0000000010a79073 cause: 0000000000000002
    set_tagged_addr_ctrl+0x112/0x15a
    __riscv_sys_prctl+0x352/0x73c
    do_trap_ecall_u+0x17c/0x20c
    andle_exception+0x150/0x15c

Fix it by checking if Supm is available.

Fixes: 09d6775f50 ("riscv: Add support for userspace pointer masking")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: stable@vger.kernel.org
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20250504101920.3393053-1-namcao@linutronix.de
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-05-08 12:01:01 +00:00
Clément Léger
897e8aece3 riscv: misaligned: use get_user() instead of __get_user()
Now that we can safely handle user memory accesses while in the
misaligned access handlers, use get_user() instead of __get_user() to
have user memory access checks.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250422162324.956065-4-cleger@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-05-08 12:00:58 +00:00
Clément Léger
453805f0a2 riscv: misaligned: enable IRQs while handling misaligned accesses
We can safely reenable IRQs if coming from userspace. This allows to
access user memory that could potentially trigger a page fault.

Fixes: b686ecdeac ("riscv: misaligned: Restrict user access to kernel memory")
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250422162324.956065-3-cleger@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-05-08 12:00:36 +00:00
Clément Léger
fd94de9f9e riscv: misaligned: factorize trap handling
Since both load/store and user/kernel should use almost the same path and
that we are going to add some code around that, factorize it.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250422162324.956065-2-cleger@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-05-08 12:00:13 +00:00
Michal Wilczynski
a4c95b924d riscv: dts: thead: Add device tree VO clock controller
VO clocks reside in a different address space from the AP clocks on the
T-HEAD SoC. Add the device tree node of a clock-controller to handle
VO address space as well.

Reviewed-by: Drew Fustini <drew@pdp7.com>
Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
Signed-off-by: Drew Fustini <drew@pdp7.com>
2025-05-07 23:38:41 -07:00
Nylon Chen
eb16b3727c riscv: misaligned: Add handling for ZCB instructions
Add support for the Zcb extension's compressed half-word instructions
(C.LHU, C.LH, and C.SH) in the RISC-V misaligned access trap handler.

Signed-off-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Nylon Chen <nylon.chen@sifive.com>
Fixes: 956d705dd2 ("riscv: Unaligned load/store handling for M_MODE")
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250411073850.3699180-2-nylon.chen@sifive.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-05-05 13:09:38 +00:00
Herbert Xu
491d6024f2 crypto: riscv/sha256 - Add simd block function
Add CRYPTO_ARCH_HAVE_LIB_SHA256_SIMD and a SIMD block function
so that the caller can decide whether to use SIMD.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-05 18:20:45 +08:00
Herbert Xu
67488527af crypto: arch/sha256 - Export block functions as GPL only
Export the block functions as GPL only, there is no reason
to let arbitrary modules use these internal functions.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-05 18:20:45 +08:00
Herbert Xu
ef93f15628 Revert "crypto: run initcalls for generic implementations earlier"
This reverts commit c4741b2305.

Crypto API self-tests no longer run at registration time and now
occur either at late_initcall or upon the first use.

Therefore the premise of the above commit no longer exists.  Revert
it and subsequent additions of subsys_initcall and arch_initcall.

Note that lib/crypto calls will stay at subsys_initcall (or rather
downgraded from arch_initcall) because they may need to occur
before Crypto API registration.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-05 18:20:44 +08:00
Eric Biggers
bf52d93865 crypto: riscv/sha256 - implement library instead of shash
Instead of providing crypto_shash algorithms for the arch-optimized
SHA-256 code, instead implement the SHA-256 library.  This is much
simpler, it makes the SHA-256 library functions be arch-optimized, and
it fixes the longstanding issue where the arch-optimized SHA-256 was
disabled by default.  SHA-256 still remains available through
crypto_shash, but individual architectures no longer need to handle it.

To match sha256_blocks_arch(), change the type of the nblocks parameter
of the assembly function from int to size_t.  The assembly function
actually already treated it as size_t.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-05 18:20:43 +08:00
Herbert Xu
fba4aafaba Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux v6.15-rc5
Merge mainline to pick up bcachefs poly1305 patch 4bf4b5046d
("bcachefs: use library APIs for ChaCha20 and Poly1305").  This
is a prerequisite for removing the poly1305 shash algorithm.
2025-05-05 13:25:15 +08:00
Radim Krčmář
87ec7d5249 KVM: RISC-V: reset smstateen CSRs
Not resetting smstateen is a potential security hole, because VU might
be able to access state that VS does not properly context-switch.

Fixes: 81f0f314fe ("RISCV: KVM: Add sstateen0 context save/restore")
Signed-off-by: Radim Krčmář <rkrcmar@ventanamicro.com>
Link: https://lore.kernel.org/r/20250403112522.1566629-8-rkrcmar@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2025-05-01 18:26:14 +05:30
Ze Huang
3e6244429b riscv: dts: sophgo: fix DMA data-width configuration for CV18xx
The "snps,data-width" property[1] defines the AXI data width of the DMA
controller as:

    width = 8 × (2^n) bits

(0 = 8 bits, 1 = 16 bits, 2 = 32 bits, ..., 6 = 512 bits)
where "n" is the value of "snps,data-width".

For the CV18xx DMA controller, the correct AXI data width is 32 bits,
corresponding to "snps,data-width = 2".

Test results on Milkv Duo S can be found here [2].

Link: https://github.com/torvalds/linux/blob/master/Documentation/devicetree/bindings/dma/snps%2Cdw-axi-dmac.yaml#L74 [1]
Link: https://gist.github.com/Sutter099/4fa99bb2d89e5af975983124704b3861 [2]

Fixes: 514951a81a ("riscv: dts: sophgo: cv18xx: add DMA controller")
Co-developed-by: Yu Yuan <yu.yuan@sjtu.edu.cn>
Signed-off-by: Yu Yuan <yu.yuan@sjtu.edu.cn>
Signed-off-by: Ze Huang <huangze@whut.edu.cn>
Link: https://lore.kernel.org/r/20250428-duo-dma-config-v1-1-eb6ad836ca42@whut.edu.cn
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
Signed-off-by: Chen Wang <wangchen20@iscas.ac.cn>
2025-04-30 14:51:43 +08:00
Yury Norov
e5bf9a4b68 riscv: switch set_icache_stale_mask() to using non-atomic assign_cpu()
The atomic cpumask_assign_cpu() follows non-atomic cpumask_setall(),
which makes the whole operation non-atomic. Fix this by relaxing to
non-atomic __assign_cpu().

Fixes: 7c1e5b9690 ("riscv: Disable preemption while handling PR_RISCV_CTX_SW_FENCEI_OFF")
Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
2025-04-29 15:58:37 -04:00
Charlie Jenkins
5b3d6103b3 riscv: entry: Split ret_from_fork() into user and kernel
This function was unified into a single function in commit ab9164dae2
("riscv: entry: Consolidate ret_from_kernel_thread into ret_from_fork").
However that imposed a performance degradation.

Partially reverting this commit to have ret_from_fork() split again,
results in a 1% increase on the number of times fork is able to be called
per second.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/all/20250320-riscv_optimize_entry-v6-2-63e187e26041@rivosinc.com
2025-04-29 08:27:10 +02:00
Charlie Jenkins
f955aa8723 riscv: entry: Convert ret_from_fork() to C
Move the main section of ret_from_fork() to C to allow inlining of
syscall_exit_to_user_mode().

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/all/20250320-riscv_optimize_entry-v6-1-63e187e26041@rivosinc.com
2025-04-29 08:27:10 +02:00
Eric Biggers
879f47548b crypto: lib/chacha - remove INTERNAL symbol and selection of CRYPTO
Now that the architecture-optimized ChaCha kconfig symbols are defined
regardless of CRYPTO, there is no need for CRYPTO_LIB_CHACHA to select
CRYPTO.  So, remove that.  This makes the indirection through the
CRYPTO_LIB_CHACHA_INTERNAL symbol unnecessary, so get rid of that and
just use CRYPTO_LIB_CHACHA directly.  Finally, make the fallback to the
generic implementation use a default value instead of a select; this
makes it consistent with how the arch-optimized code gets enabled and
also with how CRYPTO_LIB_BLAKE2S_GENERIC gets enabled.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-28 19:40:54 +08:00
Eric Biggers
d604877c2f crypto: riscv - move library functions to arch/riscv/lib/crypto/
Continue disentangling the crypto library functions from the generic
crypto infrastructure by moving the riscv ChaCha library functions into
a new directory arch/riscv/lib/crypto/ that does not depend on CRYPTO.
This mirrors the distinction between crypto/ and lib/crypto/.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-28 19:40:53 +08:00
Haylen Chu
60c6d37972 riscv: defconfig: spacemit: enable clock controller driver for SpacemiT K1
Clock controller unit, or CCU, generates various clocks frequency for
peripherals integrated in SpacemiT K1 SoC and is essential for normal
operation. Let's enable it as built-in driver in defconfig.

Signed-off-by: Haylen Chu <heylenay@4d2.org>
Reviewed-by: Alex Elder <elder@riscstar.com>
Reviewed-by: Yixun Lan <dlan@gentoo.org>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2025-04-26 00:41:03 +01:00
Linus Torvalds
c3137514f1 RISC-V Fixes for 6.15-rc4
* A fix for a missing icache flush in uprobes, which manifests as at
   least a BFF selftest failure on the Spacemit X1.
 * A workaround for build warnings in flush_icache_range().
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmgLvt0THHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiSXTEACYyBa3WiAywufa8qeMli8s5McgIp76
 WQhTakhp2/2KWegbFfn4+ugvY9NhpY3CGGESJPo1tgHvvDh6kceVEV1jqVuQuPBl
 IEh4NImUc9UvimL9QhqsIEkGiF3lTSB+gsmp9VE/tk2BsPcq07oJgtTCaB87JQND
 mLpcBkHSu8T6YbN7ZM8XSVc0sbvTEzKFGBtXyVu5y/7OehiNXeFKVIZ1Qdv1tBf/
 QbfC43UBQ++mO7v6QTDy7BwWcBR3r6EBR/rLxutjyefhXNwHQVEGk2WP9pIVcc2P
 B4nihfiCfDFCwijgRQFtH54Cwdxr253bdci5tEw640hQCRHAJSVXKjjdxcdX9FXm
 qlXyXDgmDXUqGR5gC34i1eUT+ahjyvOeS39SJbE2QT8ZRbNRXIU6KNN9sO6h5x7P
 sN6pJin04nT0No9ADGnUza7rtVlEgCpM1KYTOVgVltHYJ5PtSdHv1dIiYjqlJx7G
 LaAbs93FUE6f7aJd6un/RMAb6g9sCu2GnDTU+iivU4IKRXQ+Bf1LSEst3lYNxMln
 r6o7jg0ZZFa9cqBKGVl4G74YOBzRi5wzlwtJsxo7uA3tc0WnlFVVSZv6lryGYR1S
 e8dMttYDQ+vpWrP5Qvo55YnebRt3khmxcWbVTcA8EBiX5cPHAvo456JBby7h7vqU
 BA8wEj0K7O/+Eg==
 =SrfS
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - A fix for a missing icache flush in uprobes, which manifests as at
   least a BFF selftest failure on the Spacemit X1

 - A workaround for build warnings in flush_icache_range()

* tag 'riscv-for-linus-6.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: uprobes: Add missing fence.i after building the XOL buffer
  riscv: Replace function-like macro by static inline function
2025-04-25 13:22:08 -07:00
Michal Wilczynski
1b136de08b riscv: dts: thead: Introduce reset controller node
T-HEAD TH1520 SoC requires to put the GPU out of the reset state as part
of the power-up sequence.

Link: https://lore.kernel.org/linux-riscv/81e53e3a-5873-44c7-9070-5596021daa42@samsung.com/
Reviewed-by: Drew Fustini <drew@pdp7.com>
Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
[drew: remove hunk that included thead,th1520-reset.h]
Signed-off-by: Drew Fustini <drew@pdp7.com>
2025-04-25 11:43:44 -07:00
Yixun Lan
8ec8b25dde riscv: defconfig: spacemit: enable gpio support for K1 SoC
Enable GPIO support, in order to activate follow-up GPIO LED,
and ethernet reset pin.

Signed-off-by: Yixun Lan <dlan@gentoo.org>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2025-04-25 15:35:30 +01:00
Nathan Chancellor
bf0b4f1526 crypto: riscv - Use SYM_FUNC_START for functions only called directly
After some recent changes to the RISC-V crypto code that turned some
indirect function calls into direct ones, builds with CONFIG_CFI_CLANG
fail with:

  ld.lld: error: undefined symbol: __kcfi_typeid_sm3_transform_zvksh_zvkb
  >>> referenced by arch/riscv/crypto/sm3-riscv64-zvksh-zvkb.o:(.text+0x2) in archive vmlinux.a

  ld.lld: error: undefined symbol: __kcfi_typeid_sha512_transform_zvknhb_zvkb
  >>> referenced by arch/riscv/crypto/sha512-riscv64-zvknhb-zvkb.o:(.text+0x2) in archive vmlinux.a

  ld.lld: error: undefined symbol: __kcfi_typeid_sha256_transform_zvknha_or_zvknhb_zvkb
  >>> referenced by arch/riscv/crypto/sha256-riscv64-zvknha_or_zvknhb-zvkb.o:(.text+0x2) in archive vmlinux.a

As these functions are no longer indirectly called (i.e., have their
address taken), the compiler will not emit __kcfi_typeid symbols for
them but SYM_TYPED_FUNC_START expects these to exist at link time.

Switch the definitions of these functions to use SYM_FUNC_START, as they
no longer need kCFI type information since they are only called
directly.

Fixes: 1523eaed0a ("crypto: riscv/sm3 - Use API partial block handling")
Fixes: 561aab1104 ("crypto: riscv/sha512 - Use API partial block handling")
Fixes: e6c5597bad ("crypto: riscv/sha256 - Use API partial block handling")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-25 10:46:05 +08:00
Björn Töpel
7d1d19a11c
riscv: uprobes: Add missing fence.i after building the XOL buffer
The XOL (execute out-of-line) buffer is used to single-step the
replaced instruction(s) for uprobes. The RISC-V port was missing a
proper fence.i (i$ flushing) after constructing the XOL buffer, which
can result in incorrect execution of stale/broken instructions.

This was found running the BPF selftests "test_progs:
uprobe_autoattach, attach_probe" on the Spacemit K1/X60, where the
uprobes tests randomly blew up.

Reviewed-by: Guo Ren <guoren@kernel.org>
Fixes: 74784081aa ("riscv: Add uprobes supported")
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20250419111402.1660267-2-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-04-24 13:20:02 -07:00
Björn Töpel
121f34341d
riscv: Replace function-like macro by static inline function
The flush_icache_range() function is implemented as a "function-like
macro with unused parameters", which can result in "unused variables"
warnings.

Replace the macro with a static inline function, as advised by
Documentation/process/coding-style.rst.

Fixes: 08f051eda3 ("RISC-V: Flush I$ when making a dirty page executable")
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20250419111402.1660267-1-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-04-24 13:20:01 -07:00
Linus Torvalds
a79be02bba Fix mis-uses of 'cc-option' for warning disablement
This was triggered by one of my mis-uses causing odd build warnings on
sparc in linux-next, but while figuring out why the "obviously correct"
use of cc-option caused such odd breakage, I found eight other cases of
the same thing in the tree.

The root cause is that 'cc-option' doesn't work for checking negative
warning options (ie things like '-Wno-stringop-overflow') because gcc
will silently accept options it doesn't recognize, and so 'cc-option'
ends up thinking they are perfectly fine.

And it all works, until you have a situation where _another_ warning is
emitted.  At that point the compiler will go "Hmm, maybe the user
intended to disable this warning but used that wrong option that I
didn't recognize", and generate a warning for the unrecognized negative
option.

Which explains why we have several cases of this in the tree: the
'cc-option' test really doesn't work for this situation, but most of the
time it simply doesn't matter that ity doesn't work.

The reason my recently added case caused problems on sparc was pointed
out by Thomas Weißschuh: the sparc build had a previous explicit warning
that then triggered the new one.

I think the best fix for this would be to make 'cc-option' a bit smarter
about this sitation, possibly by adding an intentional warning to the
test case that then triggers the unrecognized option warning reliably.

But the short-term fix is to replace 'cc-option' with an existing helper
designed for this exact case: 'cc-disable-warning', which picks the
negative warning but uses the positive form for testing the compiler
support.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://lore.kernel.org/all/20250422204718.0b4e3f81@canb.auug.org.au/
Explained-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-04-23 10:08:29 -07:00
Herbert Xu
1523eaed0a crypto: riscv/sm3 - Use API partial block handling
Use the Crypto API partial block handling.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-23 15:52:47 +08:00
Herbert Xu
561aab1104 crypto: riscv/sha512 - Use API partial block handling
Use the Crypto API partial block handling.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-23 15:52:46 +08:00
Herbert Xu
e6c5597bad crypto: riscv/sha256 - Use API partial block handling
Use the Crypto API partial block handling.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-23 15:52:45 +08:00
Herbert Xu
867a2177c2 crypto: riscv/ghash - Use API partial block handling
Use the Crypto API partial block handling.

As this was the last user relying on crypto/ghash.h for gf128mul.h,
remove the unnecessary inclusion of gf128mul.h from crypto/ghash.h.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-23 11:33:47 +08:00
Eric Biggers
8821d26926 crypto: lib/chacha - restore ability to remove modules
Though the module_exit functions are now no-ops, they should still be
defined, since otherwise the modules become unremovable.

Fixes: 08820553f3 ("crypto: arm/chacha - remove the redundant skcipher algorithms")
Fixes: 8c28abede1 ("crypto: arm64/chacha - remove the skcipher algorithms")
Fixes: f7915484c0 ("crypto: powerpc/chacha - remove the skcipher algorithms")
Fixes: ceba0eda83 ("crypto: riscv/chacha - implement library instead of skcipher")
Fixes: 632ab0978f ("crypto: x86/chacha - remove the skcipher algorithms")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-19 11:18:28 +08:00
Palmer Dabbelt
85f79dece5
Merge patch series "riscv: misaligned: Add ZCB handling and fix sleeping function"
Nylon Chen <nylon.chen@sifive.com> says:

1. Adds support for ZCB compressed instructions (C.LHU, C.LH, C.SH).

2. Fixes a bug where copy_from/to_user() calls in non-sleepable contexts
triggered attempts to sleep.

Signed-off-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Nylon Chen nylon.chen@sifive.com

Nylon Chen (2):
  riscv: misaligned: Add handling for ZCB instructions
  riscv: misaligned: fix sleeping function called during misaligned
    access handling

* b4-shazam-merge:
  riscv: misaligned: fix sleeping function called during misaligned access handling
  riscv: misaligned: Add handling for ZCB instructions

Link: https://lore.kernel.org/r/20250411073850.3699180-1-nylon.chen@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-04-18 10:52:26 -07:00
Nylon Chen
7b30b1b04e
riscv: misaligned: Add handling for ZCB instructions
Add support for the Zcb extension's compressed half-word instructions
(C.LHU, C.LH, and C.SH) in the RISC-V misaligned access trap handler.

Signed-off-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Nylon Chen <nylon.chen@sifive.com>
Link: https://lore.kernel.org/r/20250411073850.3699180-2-nylon.chen@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-04-18 10:52:26 -07:00
Nylon Chen
61a74ad254
riscv: misaligned: fix sleeping function called during misaligned access handling
Use copy_from_user_nofault() and copy_to_user_nofault() instead of
copy_from/to_user functions in the misaligned access trap handlers.

The following bug report was found when executing misaligned memory
accesses:

BUG: sleeping function called from invalid context at ./include/linux/uaccess.h:162
in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 115, name: two
preempt_count: 0, expected: 0
CPU: 0 UID: 0 PID: 115 Comm: two Not tainted 6.14.0-rc5 #24
Hardware name: riscv-virtio,qemu (DT)
Call Trace:
 [<ffffffff800160ea>] dump_backtrace+0x1c/0x24
 [<ffffffff80002304>] show_stack+0x28/0x34
 [<ffffffff80010fae>] dump_stack_lvl+0x4a/0x68
 [<ffffffff80010fe0>] dump_stack+0x14/0x1c
 [<ffffffff8004e44e>] __might_resched+0xfa/0x104
 [<ffffffff8004e496>] __might_sleep+0x3e/0x62
 [<ffffffff801963c4>] __might_fault+0x1c/0x24
 [<ffffffff80425352>] _copy_from_user+0x28/0xaa
 [<ffffffff8000296c>] handle_misaligned_store+0x204/0x254
 [<ffffffff809eae82>] do_trap_store_misaligned+0x24/0xee
 [<ffffffff809f4f1a>] handle_exception+0x146/0x152

Fixes: b686ecdeac ("riscv: misaligned: Restrict user access to kernel memory")
Fixes: 441381506b ("riscv: misaligned: remove CONFIG_RISCV_M_MODE specific code")

Signed-off-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Nylon Chen <nylon.chen@sifive.com>
Link: https://lore.kernel.org/r/20250411073850.3699180-3-nylon.chen@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-04-18 10:52:26 -07:00
Palmer Dabbelt
615e705fc8 riscv fixes for 6.15-rc3
- A couple of fixes regarding module relocations
 - Fix a build error by implementing missing alternative macros
 - Another fix for kexec by fixing /proc/iomem
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQgN2CKhD/Nf5v80u9kP7K8koXvigUCZ//mhQAKCRBkP7K8koXv
 iqjNAQCke5JNfU6ZhIfwZokrWCIU5CJO0WF9uGl0u972YiazkQD/fKC7nL0e6BDZ
 2/Ha31y8pmjLUDAlC8+yxzuhec5QyQ4=
 =A6JZ
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmf/8pgTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiQikD/95qOZZXzE55oTgD7noMwyDZQ8x41ND
 5fsH2YF+oziC3Z97IkEtgj24Bz0QY3Q5YLdB1lY7Ps0RrAXhC5KHAZyhOYe5sBwF
 oaB3b/MyuiPot7GbqSMN6400wV9mTiU4+os+GNoWutsH/Z3tQuFNTryO+Qy9jlzX
 raX/TxLKd8AVqva9y1U9wqoH2iLN2cXvY3/lfXCgTgDsYKVNMIFyOgEuaMiWq3Yt
 EjkT1A9RmNOaxpfPOhcmq5DkCKlNpih9t4OLurAFJTcmajVyjxP12SrmcfNQPaPz
 HdV9pGjC5ZfNribm2NrHtl8sJXppXUBGeKxFdT+7MGYfqCAsiBUXQ/PByybPU3Na
 Zy7576h9kXhsmwXQJvt8x7lrBhjBfGWwuVtyVZFcCKuvj9aPOoZPBOHA7iX2uuC3
 hF/6DWha8IDLO2PGYo6B4lCUnPZkAqBaOpaXmxCYAaCwKaiCx9x9zIYDsJfbD9ZC
 H+GpAizu8iCMtZXrg9pA5QvMEBTbFOLhjgD1KCk01KY8hsUihnAGfu4mktYsXYEe
 PokxhYIIvzcCRD6Yi5vCtVxL6YXjuTYHo8tNEENoFxi8nvdST3AhqkgbrP9MorOW
 879Zqiz/G37aFGOUh7uDGKoP1EcGdlZPWfnfKc+jSUNdStWWLBcHbkcd7EGeg+ov
 1MTELcNHvxl6EQ==
 =M/+3
 -----END PGP SIGNATURE-----

Merge tag 'riscv-fixes-6.15-rc3' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/alexghiti/linux into fixes

riscv fixes for 6.15-rc3

- A couple of fixes regarding module relocations
- Fix a build error by implementing missing alternative macros
- Another fix for kexec by fixing /proc/iomem

* tag 'riscv-fixes-6.15-rc3' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/alexghiti/linux:
  riscv: Avoid fortify warning in syscall_get_arguments()
  riscv: Provide all alternative macros all the time
  riscv: module: Allocate PLT entries for R_RISCV_PLT32
  riscv: module: Fix out-of-bounds relocation access
  riscv: Properly export reserved regions in /proc/iomem
  riscv: Fix unaligned access info messages
2025-04-16 11:10:25 -07:00
Joel Stanley
bafa451a96
riscv: defconfig: Remove EXPERT
The setting of EXPERT is a leftover from when the riscv defconfig was
first added. As mentioned in the EXPERT Kconfig help text it is not
intended to be set in the usual case.

Upon removal a bunch of intrusive debug-related kernel options are no
longer set, which is good. A few may want to come back in the future but
let those be advocated for on a case by case basis.

NAMESPACES, SYSFS_SYSCALL and MEDIA_SUPPORT_FILTER default on and thus
fall out of the defconfig.

Set VIDEO_CADENCE_CSI2RX=y to ensure VIDEO_CADENCE_CSI2RX stays enabled.

Set DEBUG_KERNEL=y in line with other arch defconfigs. This turns on
tracing.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20250415122832.982610-1-joel@jms.id.au
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-04-16 08:22:27 -07:00
Palmer Dabbelt
dc3e30b499
Merge patch series "riscv: Rework the arch_kgdb_breakpoint() implementation"
WangYuli <wangyuli@uniontech.com> says:

1. The arch_kgdb_breakpoint() function defines the kgdb_compiled_break
   symbol using inline assembly.

   There's a potential issue where the compiler might inline
   arch_kgdb_breakpoint(), which would then define the kgdb_compiled_break
   symbol multiple times, leading to fail to link vmlinux.o.

   This isn't merely a potential compilation problem. The intent here
   is to determine the global symbol address of kgdb_compiled_break,
   and if this function is inlined multiple times, it would logically
   be a grave error.

2. Remove ".option norvc/.option rvc" to fix a bug that the C extension
   would unconditionally enable even if the kernel is being built with
   CONFIG_RISCV_ISA_C=n.

* b4-shazam-merge:
  riscv: KGDB: Remove ".option norvc/.option rvc" for kgdb_compiled_break
  riscv: KGDB: Do not inline arch_kgdb_breakpoint()

Link: https://lore.kernel.org/r/D5A83DF3A06E1DF9+20250411072905.55134-1-wangyuli@uniontech.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-04-16 07:31:23 -07:00
WangYuli
550c2aa787
riscv: KGDB: Remove ".option norvc/.option rvc" for kgdb_compiled_break
[ Quoting Samuel Holland: ]

  This is a separate issue, but using ".option rvc" here is a bug.
  It will unconditionally enable the C extension for the rest of
  the file, even if the kernel is being built with CONFIG_RISCV_ISA_C=n.

[ Quoting Palmer Dabbelt: ]

  We're just looking at the address of kgdb_compiled_break, so it's
  fine if it ends up as a c.ebreak.

[ Quoting Alexandre Ghiti: ]

  .option norvc is used to prevent the assembler from using compressed
  instructions, but it's generally used when we need to ensure the
  size of the instructions that are used, which is not the case here
  as noted by Palmer since we only care about the address. So yes
  it will work fine with C enabled :)

So let's just remove them all.

Link: https://lore.kernel.org/all/4b4187c1-77e5-44b7-885f-d6826723dd9a@sifive.com/
Link: https://lore.kernel.org/all/mhng-69513841-5068-441d-be8f-2aeebdc56a08@palmer-ri-x1c9a/
Link: https://lore.kernel.org/all/23693e7f-4fff-40f3-a437-e06d827278a5@ghiti.fr/
Fixes: fe89bd2be8 ("riscv: Add KGDB support")
Cc: Samuel Holland <samuel.holland@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: WangYuli <wangyuli@uniontech.com>
Link: https://lore.kernel.org/r/8B431C6A4626225C+20250411073222.56820-2-wangyuli@uniontech.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-04-16 07:29:34 -07:00
WangYuli
3af4bec9c1
riscv: KGDB: Do not inline arch_kgdb_breakpoint()
The arch_kgdb_breakpoint() function defines the kgdb_compiled_break
symbol using inline assembly.

There's a potential issue where the compiler might inline
arch_kgdb_breakpoint(), which would then define the kgdb_compiled_break
symbol multiple times, leading to fail to link vmlinux.o.

This isn't merely a potential compilation problem. The intent here
is to determine the global symbol address of kgdb_compiled_break,
and if this function is inlined multiple times, it would logically
be a grave error.

Link: https://lore.kernel.org/all/4b4187c1-77e5-44b7-885f-d6826723dd9a@sifive.com/
Link: https://lore.kernel.org/all/5b0adf9b-2b22-43fe-ab74-68df94115b9a@ghiti.fr/
Link: https://lore.kernel.org/all/23693e7f-4fff-40f3-a437-e06d827278a5@ghiti.fr/
Fixes: fe89bd2be8 ("riscv: Add KGDB support")
Co-developed-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: WangYuli <wangyuli@uniontech.com>
Link: https://lore.kernel.org/r/F22359AFB6FF9FD8+20250411073222.56820-1-wangyuli@uniontech.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-04-16 07:29:33 -07:00
Xi Ruoyao
89079520ce
RISC-V: vDSO: Wire up getrandom() vDSO implementation
Hook up the generic vDSO implementation to the generic vDSO getrandom
implementation by providing the required __arch_chacha20_blocks_nostack
and getrandom_syscall implementations. Also wire up the selftests.

The benchmark result:

	vdso: 25000000 times in 2.466341333 seconds
	libc: 25000000 times in 41.447720005 seconds
	syscall: 25000000 times in 41.043926672 seconds

	vdso: 25000000 x 256 times in 162.286219353 seconds
	libc: 25000000 x 256 times in 2953.855018685 seconds
	syscall: 25000000 x 256 times in 2796.268546000 seconds

Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Link: https://lore.kernel.org/r/20250411024600.16045-1-xry111@xry111.site
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-04-16 07:28:34 -07:00
Nathan Chancellor
1413708f99 riscv: Avoid fortify warning in syscall_get_arguments()
When building with CONFIG_FORTIFY_SOURCE=y and W=1, there is a warning
because of the memcpy() in syscall_get_arguments():

  In file included from include/linux/string.h:392,
                   from include/linux/bitmap.h:13,
                   from include/linux/cpumask.h:12,
                   from arch/riscv/include/asm/processor.h:55,
                   from include/linux/sched.h:13,
                   from kernel/ptrace.c:13:
  In function 'fortify_memcpy_chk',
      inlined from 'syscall_get_arguments.isra' at arch/riscv/include/asm/syscall.h:66:2:
  include/linux/fortify-string.h:580:25: error: call to '__read_overflow2_field' declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Werror=attribute-warning]
    580 |                         __read_overflow2_field(q_size_field, size);
        |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  cc1: all warnings being treated as errors

The fortified memcpy() routine enforces that the source is not overread
and the destination is not overwritten if the size of either field and
the size of the copy are known at compile time. The memcpy() in
syscall_get_arguments() intentionally overreads from a1 to a5 in
'struct pt_regs' but this is bigger than the size of a1.

Normally, this could be solved by wrapping a1 through a5 with
struct_group() but there was already a struct_group() applied to these
members in commit bba547810c ("riscv: tracing: Fix
__write_overflow_field in ftrace_partial_regs()").

Just avoid memcpy() altogether and write the copying of args from regs
manually, which clears up the warning at the expense of three extra
lines of code.

Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Dmitry V. Levin <ldv@strace.io>
Link: https://lore.kernel.org/r/20250409-riscv-avoid-fortify-warning-syscall_get_arguments-v1-1-7853436d4755@kernel.org
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-16 11:09:38 +00:00
Herbert Xu
f4065b2f63 crypto: lib/sm3 - Move sm3 library into lib/crypto
Move the sm3 library code into lib/crypto.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-16 15:36:24 +08:00
Michal Wilczynski
2bae46e3de riscv: dts: thead: Introduce power domain nodes with aon firmware
The DRM Imagination GPU requires a power-domain driver. In the T-HEAD
TH1520 SoC implements power management capabilities through the E902
core, which can be communicated with through the mailbox, using firmware
protocol.

Add AON node, which servers as a power-domain controller.

Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
Reviewed-by: Drew Fustini <drew@pdp7.com>
Signed-off-by: Drew Fustini <drew@pdp7.com>
2025-04-14 19:04:44 -07:00
Andrew Jones
fb53a9aa5f riscv: Provide all alternative macros all the time
We need to provide all six forms of the alternative macros
(ALTERNATIVE, ALTERNATIVE_2, _ALTERNATIVE_CFG, _ALTERNATIVE_CFG_2,
__ALTERNATIVE_CFG, __ALTERNATIVE_CFG_2) for all four cases derived
from the two ifdefs (RISCV_ALTERNATIVE, __ASSEMBLY__) in order to
ensure all configs can compile. Define this missing ones and ensure
all are defined to consume all parameters passed.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202504130710.3IKz6Ibs-lkp@intel.com/
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250414120947.135173-2-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-14 13:07:08 +00:00
Samuel Holland
1ee1313f47 riscv: module: Allocate PLT entries for R_RISCV_PLT32
apply_r_riscv_plt32_rela() may need to emit a PLT entry for the
referenced symbol, so there must be space allocated in the PLT.

Fixes: 8fd6c51423 ("riscv: Add remaining module relocations")
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250409171526.862481-2-samuel.holland@sifive.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-14 13:07:07 +00:00
Samuel Holland
0b4cce68ef riscv: module: Fix out-of-bounds relocation access
The current code allows rel[j] to access one element past the end of the
relocation section. Simplify to num_relocations which is equivalent to
the existing size expression.

Fixes: 080c4324fa ("riscv: optimize ELF relocation function in riscv")
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Maxim Kochetkov <fido_max@inbox.ru>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250409171526.862481-1-samuel.holland@sifive.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-14 13:07:07 +00:00
Björn Töpel
e94eb7ea6f riscv: Properly export reserved regions in /proc/iomem
The /proc/iomem represents the kernel's memory map. Regions marked
with "Reserved" tells the user that the range should not be tampered
with. Kexec-tools, when using the older kexec_load syscall relies on
the "Reserved" regions to build the memory segments, that will be the
target of the new kexec'd kernel.

The RISC-V port tries to expose all reserved regions to userland, but
some regions were not properly exposed: Regions that resided in both
the "regular" and reserved memory block, e.g. the EFI Memory Map. A
missing entry could result in reserved memory being overwritten.

It turns out, that arm64, and loongarch had a similar issue a while
back:

  commit d91680e687 ("arm64: Fix /proc/iomem for reserved but not memory regions")
  commit 50d7ba36b9 ("arm64: export memblock_reserve()d regions via /proc/iomem")

Similar to the other ports, resolve the issue by splitting the regions
in an arch initcall, since we need a working allocator.

Fixes: ffe0e52612 ("RISC-V: Improve init_resources()")
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250409182129.634415-1-bjorn@kernel.org
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-14 13:07:07 +00:00
Andrew Jones
4410160560 riscv: Fix unaligned access info messages
Ensure we only print messages about command line parameters when
the parameters are actually in use. Also complain about the use
of the vector parameter when vector support isn't available.

Fixes: aecb09e091 ("riscv: Add parameter for skipping access speed tests")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Closes: https://lore.kernel.org/all/CAMuHMdVEp2_ho51gkpLLJG2HimqZ1gZ0fa=JA4uNNZjFFqaPMg@mail.gmail.com/
Closes: https://lore.kernel.org/all/CAMuHMdWVMP0MYCLFq+b7H_uz-2omdFiDDUZq0t_gw0L9rrJtkQ@mail.gmail.com/
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250409153650.84433-2-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-14 13:07:07 +00:00
Nathan Chancellor
adf53771a3
riscv: Avoid fortify warning in syscall_get_arguments()
When building with CONFIG_FORTIFY_SOURCE=y and W=1, there is a warning
because of the memcpy() in syscall_get_arguments():

  In file included from include/linux/string.h:392,
                   from include/linux/bitmap.h:13,
                   from include/linux/cpumask.h:12,
                   from arch/riscv/include/asm/processor.h:55,
                   from include/linux/sched.h:13,
                   from kernel/ptrace.c:13:
  In function 'fortify_memcpy_chk',
      inlined from 'syscall_get_arguments.isra' at arch/riscv/include/asm/syscall.h:66:2:
  include/linux/fortify-string.h:580:25: error: call to '__read_overflow2_field' declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Werror=attribute-warning]
    580 |                         __read_overflow2_field(q_size_field, size);
        |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  cc1: all warnings being treated as errors

The fortified memcpy() routine enforces that the source is not overread
and the destination is not overwritten if the size of either field and
the size of the copy are known at compile time. The memcpy() in
syscall_get_arguments() intentionally overreads from a1 to a5 in
'struct pt_regs' but this is bigger than the size of a1.

Normally, this could be solved by wrapping a1 through a5 with
struct_group() but there was already a struct_group() applied to these
members in commit bba547810c ("riscv: tracing: Fix
__write_overflow_field in ftrace_partial_regs()").

Just avoid memcpy() altogether and write the copying of args from regs
manually, which clears up the warning at the expense of three extra
lines of code.

Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Dmitry V. Levin <ldv@strace.io>
Fixes: e2c0cdfba7 ("RISC-V: User-facing API")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250409-riscv-avoid-fortify-warning-syscall_get_arguments-v1-1-7853436d4755@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-04-10 10:11:57 -07:00
Michal Wilczynski
54fe9380a5 riscv: Enable PM_GENERIC_DOMAINS for T-Head SoCs
T-Head SoCs feature separate power domains (power islands) for major
components like the GPU, Audio, and NPU. To manage the power states of
these components effectively, the kernel requires generic power domain
support.

This commit enables `CONFIG_PM_GENERIC_DOMAINS` for T-Head SoCs,
allowing the power domain driver for these components to be compiled and
integrated. This ensures proper power management and energy efficiency
on T-Head platforms.

By selecting `PM_GENERIC_DOMAINS`, we provide the necessary framework
for the power domain drivers to function correctly on RISC-V
architecture with T-Head SoCs.

Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
Reviewed-by: Drew Fustini <drew@pdp7.com>
Acked-by: Drew Fustini <drew@pdp7.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2025-04-07 16:51:48 +01:00
Will Pierce
8578b2f7e1
riscv: Use kvmalloc_array on relocation_hashtable
The number of relocations may be a huge value that is unallocatable
by kmalloc. Use kvmalloc instead so that it does not fail.

Fixes: 8fd6c51423 ("riscv: Add remaining module relocations")
Suggested-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Will Pierce <wgpierce17@gmail.com>
Link: https://lore.kernel.org/r/20250402081426.5197-1-wgpierce17@gmail.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-07 08:21:11 +00:00
Eric Biggers
4aa6dc909e crypto: chacha - centralize the skcipher wrappers for arch code
Following the example of the crc32 and crc32c code, make the crypto
subsystem register both generic and architecture-optimized chacha20,
xchacha20, and xchacha12 skcipher algorithms, all implemented on top of
the appropriate library functions.  This eliminates the need for every
architecture to implement the same skcipher glue code.

To register the architecture-optimized skciphers only when
architecture-optimized code is actually being used, add a function
chacha_is_arch_optimized() and make each arch implement it.  Change each
architecture's ChaCha module_init function to arch_initcall so that the
CPU feature detection is guaranteed to run before
chacha_is_arch_optimized() gets called by crypto/chacha.c.  In the case
of s390, remove the CPU feature based module autoloading, which is no
longer needed since the module just gets pulled in via function linkage.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-07 13:22:28 +08:00
Eric Biggers
ceba0eda83 crypto: riscv/chacha - implement library instead of skcipher
Currently the RISC-V optimized ChaCha20 is only wired up to the
crypto_skcipher API, which makes it unavailable to users of the library
API.  The crypto_skcipher API for ChaCha20 is going to change to be
implemented on top of the library API, so the library API needs to be
supported.  And of course it's needed anyway to serve the library users.

Therefore, change the RISC-V ChaCha20 code to implement the library API
instead of the crypto_skcipher API.

The library functions take the ChaCha state matrix directly (instead of
key and IV) and support both ChaCha20 and ChaCha12.  To make the RISC-V
code work properly for that, change the assembly code to take the state
matrix directly and add a nrounds parameter.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-07 13:22:28 +08:00
Linus Torvalds
f4d2ef4825 Kbuild updates for v6.15
- Improve performance in gendwarfksyms
 
  - Remove deprecated EXTRA_*FLAGS and KBUILD_ENABLE_EXTRA_GCC_CHECKS
 
  - Support CONFIG_HEADERS_INSTALL for ARCH=um
 
  - Use more relative paths to sources files for better reproducibility
 
  - Support the loong64 Debian architecture
 
  - Add Kbuild bash completion
 
  - Introduce intermediate vmlinux.unstripped for architectures that need
    static relocations to be stripped from the final vmlinux
 
  - Fix versioning in Debian packages for -rc releases
 
  - Treat missing MODULE_DESCRIPTION() as an error
 
  - Convert Nios2 Makefiles to use the generic rule for built-in DTB
 
  - Add debuginfo support to the RPM package
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmfxp2EVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGkIUP/AgNiP6or6fmY5+HSyjlrdutBWAh
 QNW0AiKh5vytmBIv63/i103OE0SRbt+U6IApn9c7FQKkeuyIlD1e9NfSwFMZixmP
 P7t6JqDCL61G5d3W2Iisqle1cpBoVvNgUwu0k3sTSXl0vNsDbiyxcCzQzLhZMKsd
 O+Ppwp3zNGE2vIUwpIjzJsR5Dt/Z5MfuKDi4UShsyWpFZ1rg9X93YKc9QJOXjKwj
 4Np2x2cukDo2oz4uXuZQ8F1+bOFsKYoilCwjtxlrC6BO0lSPiJsRTN6nGJ0ejns9
 GGD56mBNGcGk+NEPGhAMQmZHqNAP4JfjEvAgaoSBn0Rdnjd9Cj/2T+4n61xkR4Wu
 MXCP/LEJ3MyctmkZjUq+0fDAe2wjxuaAG15kAHCha+9KxIG2NzHbf2XXb4E49DDU
 2rw3fqA41/cKCq1ZEaqRn3pZZgU6ysfsEW42JmnNxO+7zz9k8RX4rk8CVaVIEUuw
 Xojkis//KnE6+OCBe6Tb0H2Rzo0JF3AG2eNF4zY/xnc562FRIMS19WYS38tKZng6
 Gr1BRG0bA4t9mf2Vck1W1LcAb3Jh0mddtyrgYKhbcwq0YOj2q/H6F50DkC+wL282
 wvhV6B/vKAH8BByEWAn3rBcN0N+w/VFc0uPCz//tkoAm4nPg8PvKq63JHPrHsyZe
 mOMhifoiVbjF4KFo
 =GiQ6
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Improve performance in gendwarfksyms

 - Remove deprecated EXTRA_*FLAGS and KBUILD_ENABLE_EXTRA_GCC_CHECKS

 - Support CONFIG_HEADERS_INSTALL for ARCH=um

 - Use more relative paths to sources files for better reproducibility

 - Support the loong64 Debian architecture

 - Add Kbuild bash completion

 - Introduce intermediate vmlinux.unstripped for architectures that need
   static relocations to be stripped from the final vmlinux

 - Fix versioning in Debian packages for -rc releases

 - Treat missing MODULE_DESCRIPTION() as an error

 - Convert Nios2 Makefiles to use the generic rule for built-in DTB

 - Add debuginfo support to the RPM package

* tag 'kbuild-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (40 commits)
  kbuild: rpm-pkg: build a debuginfo RPM
  kconfig: merge_config: use an empty file as initfile
  nios2: migrate to the generic rule for built-in DTB
  rust: kbuild: skip `--remap-path-prefix` for `rustdoc`
  kbuild: pacman-pkg: hardcode module installation path
  kbuild: deb-pkg: don't set KBUILD_BUILD_VERSION unconditionally
  modpost: require a MODULE_DESCRIPTION()
  kbuild: make all file references relative to source root
  x86: drop unnecessary prefix map configuration
  kbuild: deb-pkg: add comment about future removal of KDEB_COMPRESS
  kbuild: Add a help message for "headers"
  kbuild: deb-pkg: remove "version" variable in mkdebian
  kbuild: deb-pkg: fix versioning for -rc releases
  Documentation/kbuild: Fix indentation in modules.rst example
  x86: Get rid of Makefile.postlink
  kbuild: Create intermediate vmlinux build with relocations preserved
  kbuild: Introduce Kconfig symbol for linking vmlinux with relocations
  kbuild: link-vmlinux.sh: Make output file name configurable
  kbuild: do not generate .tmp_vmlinux*.map when CONFIG_VMLINUX_MAP=y
  Revert "kheaders: Ignore silly-rename files"
  ...
2025-04-05 15:46:50 -07:00
Linus Torvalds
4a1d8ababd RISC-V Patches for the 6.15 Merge Window, Part 1
* The sub-architecture selection Kconfig system has been cleaned up,
   the documentation has been improved, and various detections have been
   fixed.
 * The vector-related extensions dependencies are now validated when
   parsing from device tree and in the DT bindings.
 * Misaligned access probing can be overridden via a kernel command-line
   parameter, along with various fixes to misalign access handling.
 * Support for relocatable !MMU kernels builds.
 * Support for hpge pfnmaps, which should improve TLB utilization.
 * Support for runtime constants, which improves the d_hash()
   performance.
 * Support for bfloat16, Zicbom, Zaamo, Zalrsc, Zicntr, Zihpm.
 * Various fixes, including:
       - We were missing a secondary mmu notifier call when flushing the
 	tlb which is required for IOMMU.
       - Fix ftrace panics by saving the registers as expected by ftrace.
       - Fix a couple of stimecmp usage related to cpu hotplug.
       - purgatory_start is now aligned as per the STVEC requirements.
       - A fix for hugetlb when calculating the size of non-present PTEs.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmfv/soTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYierZEACDwI9lJFCEbQPon3z8rAy1moTj0+AZ
 bMfZFqMphUTrJ0cMm2+Bc+XZgck12zHCyu1UljDcZVYMCHA9aOoj5C5NkBBVLCuL
 uLYrhIoQXtJaVIANiFl0SHAZmh2s2OoSgmUzrEZ8JGlHpKCF7EVX5bHEsOvzn9ir
 B2W992W6q3ISuKXHKsTpa7rmTtf7swGYg6zW3pX3l6HmY+EMEQOcQl0tAB383J/T
 lm0K4+YvLpRJdm2ARpNGWlcFXj9/UXUM5hplK3aBAHpPKQ5/83/4tMDsfRvhpEVC
 VJXNgK+H4XLD542aQ8d4ZROguyhwn9e2n6Dkv0OqfNk4lg5pUBcJUZftQ+rB7AWg
 VYB1KVpxhwcruheXJFz8S3EzjZTcS+JrcD80vvx8JmHdXkZwHTfYUgiFwe/TR7yr
 b518fEbXpVwDZiCbaAe3Cmpw0mlNnSVmU4hgNbiwt0fu9DGdPN9WQbyds68RKb7A
 TWwDmmD6kV2BTWl0mHPtu9VhX58CDG+0WYbHA7r82p2T50187766C92GYfN2UPpz
 lH0iMRDkmucclZ3fEoosJ+HsDntc4oe6Bhdzuj52Q7vBpDd/QB6t5cfrlDpEEdgU
 3qoWMN5mb5l1rbvrqENh5ZgmEpzV8K0R5F5quiXh/9wO0y1kepDslTqC2oXK/m0p
 DzsvvD6UnNMOUQ==
 =nCJo
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.15-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - The sub-architecture selection Kconfig system has been cleaned up,
   the documentation has been improved, and various detections have been
   fixed

 - The vector-related extensions dependencies are now validated when
   parsing from device tree and in the DT bindings

 - Misaligned access probing can be overridden via a kernel command-line
   parameter, along with various fixes to misalign access handling

 - Support for relocatable !MMU kernels builds

 - Support for hpge pfnmaps, which should improve TLB utilization

 - Support for runtime constants, which improves the d_hash()
   performance

 - Support for bfloat16, Zicbom, Zaamo, Zalrsc, Zicntr, Zihpm

 - Various fixes, including:
      - We were missing a secondary mmu notifier call when flushing the
        tlb which is required for IOMMU
      - Fix ftrace panics by saving the registers as expected by ftrace
      - Fix a couple of stimecmp usage related to cpu hotplug
      - purgatory_start is now aligned as per the STVEC requirements
      - A fix for hugetlb when calculating the size of non-present PTEs

* tag 'riscv-for-linus-6.15-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (65 commits)
  riscv: Add norvc after .option arch in runtime const
  riscv: Make sure toolchain supports zba before using zba instructions
  riscv/purgatory: 4B align purgatory_start
  riscv/kexec_file: Handle R_RISCV_64 in purgatory relocator
  selftests: riscv: fix v_exec_initval_nolibc.c
  riscv: Fix hugetlb retrieval of number of ptes in case of !present pte
  riscv: print hartid on bringup
  riscv: Add norvc after .option arch in runtime const
  riscv: Remove CONFIG_PAGE_OFFSET
  riscv: Support CONFIG_RELOCATABLE on riscv32
  asm-generic: Always define Elf_Rel and Elf_Rela
  riscv: Support CONFIG_RELOCATABLE on NOMMU
  riscv: Allow NOMMU kernels to access all of RAM
  riscv: Remove duplicate CONFIG_PAGE_OFFSET definition
  RISC-V: errata: Use medany for relocatable builds
  dt-bindings: riscv: document vector crypto requirements
  dt-bindings: riscv: add vector sub-extension dependencies
  dt-bindings: riscv: d requires f
  RISC-V: add f & d extension validation checks
  RISC-V: add vector crypto extension validation checks
  ...
2025-04-04 09:49:17 -07:00
Palmer Dabbelt
3eb64093f5 riscv patches for 6.15-rc1, part 2
* A bunch of fixes:
   - 2 fixes in the purgatory code which prevented kexec to work
   - Workaround an issue with gcc-15
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQgN2CKhD/Nf5v80u9kP7K8koXvigUCZ+uRYgAKCRBkP7K8koXv
 iiroAQCIF7ojJGZvdRfAeknzb1WKM2GFucVTRxwyicyg/9omGQD5AYGYsaQSbN4H
 j8ToELbTEnsY8YqRaQgm/AiuIkpM1AE=
 =db0s
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAmfurxYTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRDvTKFQLMurQSEaD/9Lp/ZQxW2+oCZQ/MxXPnn7MVBn4ncY
 SC6xVzdSye14A9RyaTUnCZIklNhOA5iKs5uZBm3mH0MTaL5K/LqtO+gKCkduT30k
 Rt5DKJWaXzsi3QNVq12Lakun7xJV8auYJP3rWj6rLdSo8wOG3PF2T4+L9kie+WtB
 EhxdfOK3oF1JV0a0q7E0D599K9E/y0T/kXeNzjvt4uhJVHY096LJY3GcMBvAUuDx
 aS4gF3DS2iURvZfCFZKIJhzXMz5p2bQqngDvpNit1cmHgT/4EBM7hahqfGXlP1oG
 pUTsktsnPBntXWCIKLfsac6XMKVCj3J5pqmn8cvhyALO3AjB+kU921xhRe9OyMpL
 zlBb/4B9AB4Yf6EMGLecbzaf/WX2m+L/vS+AJdD7D88X9k4kSnT4WBs90gUwyD/I
 wCGadWQtZvrwH6LENdiuuyLdHldmG76hnHjglIBJSkQCqTBFlnvHwlYI7QQ3AXd7
 TrRS2G7tcMaAd0tyIJ9FaaZdlgmc7wTQjvaJz1oTwx8nHRo4ApwEnRs1oCxyDO3I
 L+cfVcQLdsHnxdUeCssLkJHAfjU4HC8jweh7u1Q0LDcrdJb3nMkwzYrHTXunGrK6
 TELnsbnRNrYmvsV0AWEiJB0ymgoCuhapqSUuaLyWaeoKXOPKrZltkRsPBKbUGaIm
 V3z/XjtMJenbOQ==
 =tmof
 -----END PGP SIGNATURE-----

Merge tag 'riscv-mw2-6.15-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/alexghiti/linux into for-next

riscv patches for 6.15-rc1, part 2

* A bunch of fixes:
  - 2 fixes in the purgatory code which prevented kexec to work
  - Workaround an issue with gcc-15

* tag 'riscv-mw2-6.15-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/alexghiti/linux:
  riscv: Add norvc after .option arch in runtime const
  riscv: Make sure toolchain supports zba before using zba instructions
  riscv/purgatory: 4B align purgatory_start
  riscv/kexec_file: Handle R_RISCV_64 in purgatory relocator
  selftests: riscv: fix v_exec_initval_nolibc.c
  riscv: Fix hugetlb retrieval of number of ptes in case of !present pte
  riscv: print hartid on bringup
  dt-bindings: riscv: document vector crypto requirements
  dt-bindings: riscv: add vector sub-extension dependencies
  dt-bindings: riscv: d requires f
  RISC-V: add f & d extension validation checks
  RISC-V: add vector crypto extension validation checks
  RISC-V: add vector extension validation checks
2025-04-03 08:53:19 -07:00
Qi Zheng
4239c198e8 riscv: pgtable: unconditionally use tlb_remove_ptdesc()
To support fast gup, the commit 69be3fb111 ("riscv: enable
MMU_GATHER_RCU_TABLE_FREE for SMP && MMU") did the following:

1) use tlb_remove_page_ptdesc() for those platforms which use IPI to
   perform TLB shootdown

2) use tlb_remove_ptdesc() for those platforms which use SBI to perform
   TLB shootdown

The tlb_remove_page_ptdesc() is the wrapper of the tlb_remove_page().  By
design, the tlb_remove_page() should be used to remove a normal page from
a page table entry, and should not be used for page table pages.

The tlb_remove_ptdesc() is the wrapper of the tlb_remove_table(), which is
designed specifically for freeing page table pages.  If the
CONFIG_MMU_GATHER_TABLE_FREE is enabled, the tlb_remove_table() will use
semi RCU to free page table pages, that is:

 - batch table freeing: asynchronous free by RCU
 - single table freeing: IPI + synchronous free

If the CONFIG_MMU_GATHER_TABLE_FREE is disabled, the tlb_remove_table()
will fall back to pagetable_dtor() + tlb_remove_page().

For case 1), since we need to perform TLB shootdown before freeing the
page table page, the local_irq_save() in fast gup can block the freeing
and protect the fast gup page walker.  Therefore we can ensure safety by
just using tlb_remove_page_ptdesc().  In addition, we can also the
tlb_remove_ptdesc()/tlb_remove_table() to achieve it, and it doesn't
matter whether CONFIG_MMU_GATHER_RCU_TABLE_FREE is selected.  And in
theory, the performance of freeing pages asynchronously via RCU will not
be lower than synchronous free.

For case 2), since local_irq_save() only disable S-privilege IPI irq but
not M-privilege's, which is used by the SBI implementation to perform TLB
shootdown, so we must select CONFIG_MMU_GATHER_RCU_TABLE_FREE and use
tlb_remove_ptdesc() to ensure safety.  The riscv selects this config for
SMP && MMU, the CONFIG_RISCV_SBI is dependent on MMU.  Therefore, only the
UP system may have the situation where CONFIG_MMU_GATHER_RCU_TABLE_FREE is
disabled but CONFIG_RISCV_SBI is enabled.  But there is no freeing vs fast
gup race in the UP system.

So, in summary, we can use tlb_remove_ptdesc() to support fast gup in all
cases, and this interface is specifically designed for page table pages. 
So let's use it unconditionally.

Link: https://lkml.kernel.org/r/9025595e895515515c95e48db54b29afa489c41d.1740454179.git.zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickens <hughd@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Matthew Wilcow (Oracle) <willy@infradead.org>
Cc: "Mike Rapoport (IBM)" <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-04-01 15:17:14 -07:00
Linus Torvalds
d6b02199cd - The 7 patch series "powerpc/crash: use generic crashkernel
reservation" from Sourabh Jain changes powerpc's kexec code to use more
   of the generic layers.
 
 - The 2 patch series "get_maintainer: report subsystem status
   separately" from Vlastimil Babka makes some long-requested improvements
   to the get_maintainer output.
 
 - The 4 patch series "ucount: Simplify refcounting with rcuref_t" from
   Sebastian Siewior cleans up and optimizing the refcounting in the ucount
   code.
 
 - The 12 patch series "reboot: support runtime configuration of
   emergency hw_protection action" from Ahmad Fatoum improves the ability
   for a driver to perform an emergency system shutdown or reboot.
 
 - The 16 patch series "Converge on using secs_to_jiffies() part two"
   from Easwar Hariharan performs further migrations from
   msecs_to_jiffies() to secs_to_jiffies().
 
 - The 7 patch series "lib/interval_tree: add some test cases and
   cleanup" from Wei Yang permits more userspace testing of kernel library
   code, adds some more tests and performs some cleanups.
 
 - The 2 patch series "hung_task: Dump the blocking task stacktrace" from
   Masami Hiramatsu arranges for the hung_task detector to dump the stack
   of the blocking task and not just that of the blocked task.
 
 - The 4 patch series "resource: Split and use DEFINE_RES*() macros" from
   Andy Shevchenko provides some cleanups to the resource definition
   macros.
 
 - Plus the usual shower of singleton patches - please see the individual
   changelogs for details.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZ+nuqwAKCRDdBJ7gKXxA
 jtNqAQDxqJpjWkzn4yN9CNSs1ivVx3fr6SqazlYCrt3u89WQvwEA1oRrGpETzUGq
 r6khQUIcQImPPcjFqEFpuiSOU0MBZA0=
 =Kii8
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2025-03-30-18-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - The series "powerpc/crash: use generic crashkernel reservation" from
   Sourabh Jain changes powerpc's kexec code to use more of the generic
   layers.

 - The series "get_maintainer: report subsystem status separately" from
   Vlastimil Babka makes some long-requested improvements to the
   get_maintainer output.

 - The series "ucount: Simplify refcounting with rcuref_t" from
   Sebastian Siewior cleans up and optimizing the refcounting in the
   ucount code.

 - The series "reboot: support runtime configuration of emergency
   hw_protection action" from Ahmad Fatoum improves the ability for a
   driver to perform an emergency system shutdown or reboot.

 - The series "Converge on using secs_to_jiffies() part two" from Easwar
   Hariharan performs further migrations from msecs_to_jiffies() to
   secs_to_jiffies().

 - The series "lib/interval_tree: add some test cases and cleanup" from
   Wei Yang permits more userspace testing of kernel library code, adds
   some more tests and performs some cleanups.

 - The series "hung_task: Dump the blocking task stacktrace" from Masami
   Hiramatsu arranges for the hung_task detector to dump the stack of
   the blocking task and not just that of the blocked task.

 - The series "resource: Split and use DEFINE_RES*() macros" from Andy
   Shevchenko provides some cleanups to the resource definition macros.

 - Plus the usual shower of singleton patches - please see the
   individual changelogs for details.

* tag 'mm-nonmm-stable-2025-03-30-18-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (77 commits)
  mailmap: consolidate email addresses of Alexander Sverdlin
  fs/procfs: fix the comment above proc_pid_wchan()
  relay: use kasprintf() instead of fixed buffer formatting
  resource: replace open coded variant of DEFINE_RES()
  resource: replace open coded variants of DEFINE_RES_*_NAMED()
  resource: replace open coded variant of DEFINE_RES_NAMED_DESC()
  resource: split DEFINE_RES_NAMED_DESC() out of DEFINE_RES_NAMED()
  samples: add hung_task detector mutex blocking sample
  hung_task: show the blocker task if the task is hung on mutex
  kexec_core: accept unaccepted kexec segments' destination addresses
  watchdog/perf: optimize bytes copied and remove manual NUL-termination
  lib/interval_tree: fix the comment of interval_tree_span_iter_next_gap()
  lib/interval_tree: skip the check before go to the right subtree
  lib/interval_tree: add test case for span iteration
  lib/interval_tree: add test case for interval_tree_iter_xxx() helpers
  lib/rbtree: add random seed
  lib/rbtree: split tests
  lib/rbtree: enable userland test suite for rbtree related data structure
  checkpatch: describe --min-conf-desc-length
  scripts/gdb/symbols: determine KASLR offset on s390
  ...
2025-04-01 10:06:52 -07:00
Linus Torvalds
eb0ece1602 - The 6 patch series "Enable strict percpu address space checks" from
Uros Bizjak uses x86 named address space qualifiers to provide
   compile-time checking of percpu area accesses.
 
   This has caused a small amount of fallout - two or three issues were
   reported.  In all cases the calling code was founf to be incorrect.
 
 - The 4 patch series "Some cleanup for memcg" from Chen Ridong
   implements some relatively monir cleanups for the memcontrol code.
 
 - The 17 patch series "mm: fixes for device-exclusive entries (hmm)"
   from David Hildenbrand fixes a boatload of issues which David found then
   using device-exclusive PTE entries when THP is enabled.  More work is
   needed, but this makes thins better - our own HMM selftests now succeed.
 
 - The 2 patch series "mm: zswap: remove z3fold and zbud" from Yosry
   Ahmed remove the z3fold and zbud implementations.  They have been
   deprecated for half a year and nobody has complained.
 
 - The 5 patch series "mm: further simplify VMA merge operation" from
   Lorenzo Stoakes implements numerous simplifications in this area.  No
   runtime effects are anticipated.
 
 - The 4 patch series "mm/madvise: remove redundant mmap_lock operations
   from process_madvise()" from SeongJae Park rationalizes the locking in
   the madvise() implementation.  Performance gains of 20-25% were observed
   in one MADV_DONTNEED microbenchmark.
 
 - The 12 patch series "Tiny cleanup and improvements about SWAP code"
   from Baoquan He contains a number of touchups to issues which Baoquan
   noticed when working on the swap code.
 
 - The 2 patch series "mm: kmemleak: Usability improvements" from Catalin
   Marinas implements a couple of improvements to the kmemleak user-visible
   output.
 
 - The 2 patch series "mm/damon/paddr: fix large folios access and
   schemes handling" from Usama Arif provides a couple of fixes for DAMON's
   handling of large folios.
 
 - The 3 patch series "mm/damon/core: fix wrong and/or useless
   damos_walk() behaviors" from SeongJae Park fixes a few issues with the
   accuracy of kdamond's walking of DAMON regions.
 
 - The 3 patch series "expose mapping wrprotect, fix fb_defio use" from
   Lorenzo Stoakes changes the interaction between framebuffer deferred-io
   and core MM.  No functional changes are anticipated - this is
   preparatory work for the future removal of page structure fields.
 
 - The 4 patch series "mm/damon: add support for hugepage_size DAMOS
   filter" from Usama Arif adds a DAMOS filter which permits the filtering
   by huge page sizes.
 
 - The 4 patch series "mm: permit guard regions for file-backed/shmem
   mappings" from Lorenzo Stoakes extends the guard region feature from its
   present "anon mappings only" state.  The feature now covers shmem and
   file-backed mappings.
 
 - The 4 patch series "mm: batched unmap lazyfree large folios during
   reclamation" from Barry Song cleans up and speeds up the unmapping for
   pte-mapped large folios.
 
 - The 18 patch series "reimplement per-vma lock as a refcount" from
   Suren Baghdasaryan puts the vm_lock back into the vma.  Our reasons for
   pulling it out were largely bogus and that change made the code more
   messy.  This patchset provides small (0-10%) improvements on one
   microbenchmark.
 
 - The 5 patch series "Docs/mm/damon: misc DAMOS filters documentation
   fixes and improves" from SeongJae Park does some maintenance work on the
   DAMON docs.
 
 - The 27 patch series "hugetlb/CMA improvements for large systems" from
   Frank van der Linden addresses a pile of issues which have been observed
   when using CMA on large machines.
 
 - The 2 patch series "mm/damon: introduce DAMOS filter type for unmapped
   pages" from SeongJae Park enables users of DMAON/DAMOS to filter my the
   page's mapped/unmapped status.
 
 - The 19 patch series "zsmalloc/zram: there be preemption" from Sergey
   Senozhatsky teaches zram to run its compression and decompression
   operations preemptibly.
 
 - The 12 patch series "selftests/mm: Some cleanups from trying to run
   them" from Brendan Jackman fixes a pile of unrelated issues which
   Brendan encountered while runnimg our selftests.
 
 - The 2 patch series "fs/proc/task_mmu: add guard region bit to pagemap"
   from Lorenzo Stoakes permits userspace to use /proc/pid/pagemap to
   determine whether a particular page is a guard page.
 
 - The 7 patch series "mm, swap: remove swap slot cache" from Kairui Song
   removes the swap slot cache from the allocation path - it simply wasn't
   being effective.
 
 - The 5 patch series "mm: cleanups for device-exclusive entries (hmm)"
   from David Hildenbrand implements a number of unrelated cleanups in this
   code.
 
 - The 5 patch series "mm: Rework generic PTDUMP configs" from Anshuman
   Khandual implements a number of preparatoty cleanups to the
   GENERIC_PTDUMP Kconfig logic.
 
 - The 8 patch series "mm/damon: auto-tune aggregation interval" from
   SeongJae Park implements a feedback-driven automatic tuning feature for
   DAMON's aggregation interval tuning.
 
 - The 5 patch series "Fix lazy mmu mode" from Ryan Roberts fixes some
   issues in powerpc, sparc and x86 lazy MMU implementations.  Ryan did
   this in preparation for implementing lazy mmu mode for arm64 to optimize
   vmalloc.
 
 - The 2 patch series "mm/page_alloc: Some clarifications for migratetype
   fallback" from Brendan Jackman reworks some commentary to make the code
   easier to follow.
 
 - The 3 patch series "page_counter cleanup and size reduction" from
   Shakeel Butt cleans up the page_counter code and fixes a size increase
   which we accidentally added late last year.
 
 - The 3 patch series "Add a command line option that enables control of
   how many threads should be used to allocate huge pages" from Thomas
   Prescher does that.  It allows the careful operator to significantly
   reduce boot time by tuning the parallalization of huge page
   initialization.
 
 - The 3 patch series "Fix calculations in trace_balance_dirty_pages()
   for cgwb" from Tang Yizhou fixes the tracing output from the dirty page
   balancing code.
 
 - The 9 patch series "mm/damon: make allow filters after reject filters
   useful and intuitive" from SeongJae Park improves the handling of allow
   and reject filters.  Behaviour is made more consistent and the
   documention is updated accordingly.
 
 - The 5 patch series "Switch zswap to object read/write APIs" from Yosry
   Ahmed updates zswap to the new object read/write APIs and thus permits
   the removal of some legacy code from zpool and zsmalloc.
 
 - The 6 patch series "Some trivial cleanups for shmem" from Baolin Wang
   does as it claims.
 
 - The 20 patch series "fs/dax: Fix ZONE_DEVICE page reference counts"
   from Alistair Popple regularizes the weird ZONE_DEVICE page refcount
   handling in DAX, permittig the removal of a number of special-case
   checks.
 
 - The 4 patch series "refactor mremap and fix bug" from Lorenzo Stoakes
   is a preparatoty refactoring and cleanup of the mremap() code.
 
 - The 20 patch series "mm: MM owner tracking for large folios (!hugetlb)
   + CONFIG_NO_PAGE_MAPCOUNT" from David Hildenbrand reworks the manner in
   which we determine whether a large folio is known to be mapped
   exclusively into a single MM.
 
 - The 8 patch series "mm/damon: add sysfs dirs for managing DAMOS
   filters based on handling layers" from SeongJae Park adds a couple of
   new sysfs directories to ease the management of DAMON/DAMOS filters.
 
 - The 13 patch series "arch, mm: reduce code duplication in mem_init()"
   from Mike Rapoport consolidates many per-arch implementations of
   mem_init() into code generic code, where that is practical.
 
 - The 13 patch series "mm/damon/sysfs: commit parameters online via
   damon_call()" from SeongJae Park continues the cleaning up of sysfs
   access to DAMON internal data.
 
 - The 3 patch series "mm: page_ext: Introduce new iteration API" from
   Luiz Capitulino reworks the page_ext initialization to fix a boot-time
   crash which was observed with an unusual combination of compile and
   cmdline options.
 
 - The 8 patch series "Buddy allocator like (or non-uniform) folio split"
   from Zi Yan reworks the code to split a folio into smaller folios.  The
   main benefit is lessened memory consumption: fewer post-split folios are
   generated.
 
 - The 2 patch series "Minimize xa_node allocation during xarry split"
   from Zi Yan reduces the number of xarray xa_nodes which are generated
   during an xarray split.
 
 - The 2 patch series "drivers/base/memory: Two cleanups" from Gavin Shan
   performs some maintenance work on the drivers/base/memory code.
 
 - The 3 patch series "Add tracepoints for lowmem reserves, watermarks
   and totalreserve_pages" from Martin Liu adds some more tracepoints to
   the page allocator code.
 
 - The 4 patch series "mm/madvise: cleanup requests validations and
   classifications" from SeongJae Park cleans up some warts which SeongJae
   observed during his earlier madvise work.
 
 - The 3 patch series "mm/hwpoison: Fix regressions in memory failure
   handling" from Shuai Xue addresses two quite serious regressions which
   Shuai has observed in the memory-failure implementation.
 
 - The 5 patch series "mm: reliable huge page allocator" from Johannes
   Weiner makes huge page allocations cheaper and more reliable by reducing
   fragmentation.
 
 - The 5 patch series "Minor memcg cleanups & prep for memdescs" from
   Matthew Wilcox is preparatory work for the future implementation of
   memdescs.
 
 - The 4 patch series "track memory used by balloon drivers" from Nico
   Pache introduces a way to track memory used by our various balloon
   drivers.
 
 - The 2 patch series "mm/damon: introduce DAMOS filter type for active
   pages" from Nhat Pham permits users to filter for active/inactive pages,
   separately for file and anon pages.
 
 - The 2 patch series "Adding Proactive Memory Reclaim Statistics" from
   Hao Jia separates the proactive reclaim statistics from the direct
   reclaim statistics.
 
 - The 2 patch series "mm/vmscan: don't try to reclaim hwpoison folio"
   from Jinjiang Tu fixes our handling of hwpoisoned pages within the
   reclaim code.
 -----BEGIN PGP SIGNATURE-----
 
 iHQEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZ+nZaAAKCRDdBJ7gKXxA
 jsOWAPiP4r7CJHMZRK4eyJOkvS1a1r+TsIarrFZtjwvf/GIfAQCEG+JDxVfUaUSF
 Ee93qSSLR1BkNdDw+931Pu0mXfbnBw==
 =Pn2K
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2025-03-30-16-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - The series "Enable strict percpu address space checks" from Uros
   Bizjak uses x86 named address space qualifiers to provide
   compile-time checking of percpu area accesses.

   This has caused a small amount of fallout - two or three issues were
   reported. In all cases the calling code was found to be incorrect.

 - The series "Some cleanup for memcg" from Chen Ridong implements some
   relatively monir cleanups for the memcontrol code.

 - The series "mm: fixes for device-exclusive entries (hmm)" from David
   Hildenbrand fixes a boatload of issues which David found then using
   device-exclusive PTE entries when THP is enabled. More work is
   needed, but this makes thins better - our own HMM selftests now
   succeed.

 - The series "mm: zswap: remove z3fold and zbud" from Yosry Ahmed
   remove the z3fold and zbud implementations. They have been deprecated
   for half a year and nobody has complained.

 - The series "mm: further simplify VMA merge operation" from Lorenzo
   Stoakes implements numerous simplifications in this area. No runtime
   effects are anticipated.

 - The series "mm/madvise: remove redundant mmap_lock operations from
   process_madvise()" from SeongJae Park rationalizes the locking in the
   madvise() implementation. Performance gains of 20-25% were observed
   in one MADV_DONTNEED microbenchmark.

 - The series "Tiny cleanup and improvements about SWAP code" from
   Baoquan He contains a number of touchups to issues which Baoquan
   noticed when working on the swap code.

 - The series "mm: kmemleak: Usability improvements" from Catalin
   Marinas implements a couple of improvements to the kmemleak
   user-visible output.

 - The series "mm/damon/paddr: fix large folios access and schemes
   handling" from Usama Arif provides a couple of fixes for DAMON's
   handling of large folios.

 - The series "mm/damon/core: fix wrong and/or useless damos_walk()
   behaviors" from SeongJae Park fixes a few issues with the accuracy of
   kdamond's walking of DAMON regions.

 - The series "expose mapping wrprotect, fix fb_defio use" from Lorenzo
   Stoakes changes the interaction between framebuffer deferred-io and
   core MM. No functional changes are anticipated - this is preparatory
   work for the future removal of page structure fields.

 - The series "mm/damon: add support for hugepage_size DAMOS filter"
   from Usama Arif adds a DAMOS filter which permits the filtering by
   huge page sizes.

 - The series "mm: permit guard regions for file-backed/shmem mappings"
   from Lorenzo Stoakes extends the guard region feature from its
   present "anon mappings only" state. The feature now covers shmem and
   file-backed mappings.

 - The series "mm: batched unmap lazyfree large folios during
   reclamation" from Barry Song cleans up and speeds up the unmapping
   for pte-mapped large folios.

 - The series "reimplement per-vma lock as a refcount" from Suren
   Baghdasaryan puts the vm_lock back into the vma. Our reasons for
   pulling it out were largely bogus and that change made the code more
   messy. This patchset provides small (0-10%) improvements on one
   microbenchmark.

 - The series "Docs/mm/damon: misc DAMOS filters documentation fixes and
   improves" from SeongJae Park does some maintenance work on the DAMON
   docs.

 - The series "hugetlb/CMA improvements for large systems" from Frank
   van der Linden addresses a pile of issues which have been observed
   when using CMA on large machines.

 - The series "mm/damon: introduce DAMOS filter type for unmapped pages"
   from SeongJae Park enables users of DMAON/DAMOS to filter my the
   page's mapped/unmapped status.

 - The series "zsmalloc/zram: there be preemption" from Sergey
   Senozhatsky teaches zram to run its compression and decompression
   operations preemptibly.

 - The series "selftests/mm: Some cleanups from trying to run them" from
   Brendan Jackman fixes a pile of unrelated issues which Brendan
   encountered while runnimg our selftests.

 - The series "fs/proc/task_mmu: add guard region bit to pagemap" from
   Lorenzo Stoakes permits userspace to use /proc/pid/pagemap to
   determine whether a particular page is a guard page.

 - The series "mm, swap: remove swap slot cache" from Kairui Song
   removes the swap slot cache from the allocation path - it simply
   wasn't being effective.

 - The series "mm: cleanups for device-exclusive entries (hmm)" from
   David Hildenbrand implements a number of unrelated cleanups in this
   code.

 - The series "mm: Rework generic PTDUMP configs" from Anshuman Khandual
   implements a number of preparatoty cleanups to the GENERIC_PTDUMP
   Kconfig logic.

 - The series "mm/damon: auto-tune aggregation interval" from SeongJae
   Park implements a feedback-driven automatic tuning feature for
   DAMON's aggregation interval tuning.

 - The series "Fix lazy mmu mode" from Ryan Roberts fixes some issues in
   powerpc, sparc and x86 lazy MMU implementations. Ryan did this in
   preparation for implementing lazy mmu mode for arm64 to optimize
   vmalloc.

 - The series "mm/page_alloc: Some clarifications for migratetype
   fallback" from Brendan Jackman reworks some commentary to make the
   code easier to follow.

 - The series "page_counter cleanup and size reduction" from Shakeel
   Butt cleans up the page_counter code and fixes a size increase which
   we accidentally added late last year.

 - The series "Add a command line option that enables control of how
   many threads should be used to allocate huge pages" from Thomas
   Prescher does that. It allows the careful operator to significantly
   reduce boot time by tuning the parallalization of huge page
   initialization.

 - The series "Fix calculations in trace_balance_dirty_pages() for cgwb"
   from Tang Yizhou fixes the tracing output from the dirty page
   balancing code.

 - The series "mm/damon: make allow filters after reject filters useful
   and intuitive" from SeongJae Park improves the handling of allow and
   reject filters. Behaviour is made more consistent and the documention
   is updated accordingly.

 - The series "Switch zswap to object read/write APIs" from Yosry Ahmed
   updates zswap to the new object read/write APIs and thus permits the
   removal of some legacy code from zpool and zsmalloc.

 - The series "Some trivial cleanups for shmem" from Baolin Wang does as
   it claims.

 - The series "fs/dax: Fix ZONE_DEVICE page reference counts" from
   Alistair Popple regularizes the weird ZONE_DEVICE page refcount
   handling in DAX, permittig the removal of a number of special-case
   checks.

 - The series "refactor mremap and fix bug" from Lorenzo Stoakes is a
   preparatoty refactoring and cleanup of the mremap() code.

 - The series "mm: MM owner tracking for large folios (!hugetlb) +
   CONFIG_NO_PAGE_MAPCOUNT" from David Hildenbrand reworks the manner in
   which we determine whether a large folio is known to be mapped
   exclusively into a single MM.

 - The series "mm/damon: add sysfs dirs for managing DAMOS filters based
   on handling layers" from SeongJae Park adds a couple of new sysfs
   directories to ease the management of DAMON/DAMOS filters.

 - The series "arch, mm: reduce code duplication in mem_init()" from
   Mike Rapoport consolidates many per-arch implementations of
   mem_init() into code generic code, where that is practical.

 - The series "mm/damon/sysfs: commit parameters online via
   damon_call()" from SeongJae Park continues the cleaning up of sysfs
   access to DAMON internal data.

 - The series "mm: page_ext: Introduce new iteration API" from Luiz
   Capitulino reworks the page_ext initialization to fix a boot-time
   crash which was observed with an unusual combination of compile and
   cmdline options.

 - The series "Buddy allocator like (or non-uniform) folio split" from
   Zi Yan reworks the code to split a folio into smaller folios. The
   main benefit is lessened memory consumption: fewer post-split folios
   are generated.

 - The series "Minimize xa_node allocation during xarry split" from Zi
   Yan reduces the number of xarray xa_nodes which are generated during
   an xarray split.

 - The series "drivers/base/memory: Two cleanups" from Gavin Shan
   performs some maintenance work on the drivers/base/memory code.

 - The series "Add tracepoints for lowmem reserves, watermarks and
   totalreserve_pages" from Martin Liu adds some more tracepoints to the
   page allocator code.

 - The series "mm/madvise: cleanup requests validations and
   classifications" from SeongJae Park cleans up some warts which
   SeongJae observed during his earlier madvise work.

 - The series "mm/hwpoison: Fix regressions in memory failure handling"
   from Shuai Xue addresses two quite serious regressions which Shuai
   has observed in the memory-failure implementation.

 - The series "mm: reliable huge page allocator" from Johannes Weiner
   makes huge page allocations cheaper and more reliable by reducing
   fragmentation.

 - The series "Minor memcg cleanups & prep for memdescs" from Matthew
   Wilcox is preparatory work for the future implementation of memdescs.

 - The series "track memory used by balloon drivers" from Nico Pache
   introduces a way to track memory used by our various balloon drivers.

 - The series "mm/damon: introduce DAMOS filter type for active pages"
   from Nhat Pham permits users to filter for active/inactive pages,
   separately for file and anon pages.

 - The series "Adding Proactive Memory Reclaim Statistics" from Hao Jia
   separates the proactive reclaim statistics from the direct reclaim
   statistics.

 - The series "mm/vmscan: don't try to reclaim hwpoison folio" from
   Jinjiang Tu fixes our handling of hwpoisoned pages within the reclaim
   code.

* tag 'mm-stable-2025-03-30-16-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (431 commits)
  mm/page_alloc: remove unnecessary __maybe_unused in order_to_pindex()
  x86/mm: restore early initialization of high_memory for 32-bits
  mm/vmscan: don't try to reclaim hwpoison folio
  mm/hwpoison: introduce folio_contain_hwpoisoned_page() helper
  cgroup: docs: add pswpin and pswpout items in cgroup v2 doc
  mm: vmscan: split proactive reclaim statistics from direct reclaim statistics
  selftests/mm: speed up split_huge_page_test
  selftests/mm: uffd-unit-tests support for hugepages > 2M
  docs/mm/damon/design: document active DAMOS filter type
  mm/damon: implement a new DAMOS filter type for active pages
  fs/dax: don't disassociate zero page entries
  MM documentation: add "Unaccepted" meminfo entry
  selftests/mm: add commentary about 9pfs bugs
  fork: use __vmalloc_node() for stack allocation
  docs/mm: Physical Memory: Populate the "Zones" section
  xen: balloon: update the NR_BALLOON_PAGES state
  hv_balloon: update the NR_BALLOON_PAGES state
  balloon_compaction: update the NR_BALLOON_PAGES state
  meminfo: add a per node counter for balloon drivers
  mm: remove references to folio in __memcg_kmem_uncharge_page()
  ...
2025-04-01 09:29:18 -07:00
Charlie Jenkins
6ee928185a
riscv: Add norvc after .option arch in runtime const
.option arch clobbers .option norvc. Prevent gas from emitting
compressed instructions in the runtime const alternative blocks by
setting .option norvc after .option arch. This issue starts appearing on
gcc 15, which adds zca to the march.

Reported by: Klara Modin <klarasmodin@gmail.com>
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Fixes: a44fb57221 ("riscv: Add runtime constant support")
Closes: https://lore.kernel.org/all/cc8f3525-20b7-445b-877b-2add28a160a2@gmail.com/
Tested-by: Klara Modin <klarasmodin@gmail.com>
Link: https://lore.kernel.org/r/20250331-fix_runtime_const_norvc-v1-1-89bc62687ab8@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-01 07:09:21 +00:00
Alexandre Ghiti
8a2f20ac8e
riscv: Make sure toolchain supports zba before using zba instructions
Old toolchain like gcc 8.5.0 does not support zba, so we must check that
the toolchain supports this extension before using it in the kernel.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202503281836.8pntHm6I-lkp@intel.com/
Link: https://lore.kernel.org/r/20250328115422.253670-1-alexghiti@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-01 07:07:13 +00:00
Björn Töpel
3f7023171d
riscv/purgatory: 4B align purgatory_start
When a crashkernel is launched on RISC-V, the entry to purgatory is
done by trapping via the stvec CSR. From riscv_kexec_norelocate():

  |  ...
  |  /*
  |   * Switch to physical addressing
  |   * This will also trigger a jump to CSR_STVEC
  |   * which in this case is the address of the new
  |   * kernel.
  |   */
  |  csrw    CSR_STVEC, a2
  |  csrw    CSR_SATP, zero

stvec requires that the address is 4B aligned, which was not the case,
e.g.:

  | Loaded purgatory at 0xffffc000
  | kexec_file: kexec_file_load: type:1, start:0xffffd232 head:0x4 flags:0x6

The address 0xffffd232 not 4B aligned.

Correct by adding proper function alignment.

With this change, crashkernels loaded with kexec-file will be able to
properly enter the purgatory.

Fixes: 736e30af58 ("RISC-V: Add purgatory")
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250328085313.1193815-1-bjorn@kernel.org
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-01 07:07:12 +00:00
Yao Zi
28093cfef5
riscv/kexec_file: Handle R_RISCV_64 in purgatory relocator
Commit 58ff537109 ("riscv: Omit optimized string routines when
using KASAN") introduced calls to EXPORT_SYMBOL() in assembly string
routines, which result in R_RISCV_64 relocations against
.export_symbol section. As these rountines are reused by RISC-V
purgatory and our relocator doesn't recognize these relocations, this
fails kexec-file-load with dmesg like

	[   11.344251] kexec_image: Unknown rela relocation: 2
	[   11.345972] kexec_image: Error loading purgatory ret=-8

Let's support R_RISCV_64 relocation to fix kexec on 64-bit RISC-V.
32-bit variant isn't covered since KEXEC_FILE and KEXEC_PURGATORY isn't
available.

Fixes: 58ff537109 ("riscv: Omit optimized string routines when using KASAN")
Signed-off-by: Yao Zi <ziyao@disroot.org>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20250326051445.55131-2-ziyao@disroot.org
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-01 07:07:12 +00:00
Alexandre Ghiti
0049618433
Merge patch series "Add some validation for vector, vector crypto and fp stuff"
Conor Dooley <conor@kernel.org> says:

From: Conor Dooley <conor.dooley@microchip.com>

Yo,

This series is partly leveraging Clement's work adding a validate
callback in the extension detection code so that things like checking
for whether a vector crypto extension is usable can be done like:
	has_extension(<vector crypto>)
rather than
	has_vector() && has_extension(<vector crypto>)
which Eric pointed out was a poor design some months ago.

The rest of this is adding some requirements to the bindings that
prevent combinations of extensions disallowed by the ISA.

There's a bunch of over-long lines in here, but I thought that the
over-long lines were clearer than breaking them up.

Cheers,
Conor.

* patches from https://lore.kernel.org/r/20250312-abide-pancreas-3576b8c44d2c@spud:
  dt-bindings: riscv: document vector crypto requirements
  dt-bindings: riscv: add vector sub-extension dependencies
  dt-bindings: riscv: d requires f
  RISC-V: add f & d extension validation checks
  RISC-V: add vector crypto extension validation checks
  RISC-V: add vector extension validation checks

Link: https://lore.kernel.org/r/20250312-abide-pancreas-3576b8c44d2c@spud
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-01 07:06:41 +00:00
Alexandre Ghiti
83d78ac677
riscv: Fix hugetlb retrieval of number of ptes in case of !present pte
Ryan sent a fix [1] for arm64 that applies to riscv too: in some hugetlb
functions, we must not use the pte value to get the size of a mapping
because the pte may not be present.

So use the already present size parameter for huge_pte_clear() and the
newly introduced size parameter for huge_ptep_get_and_clear(). And make
sure to gather A/D bits only on present ptes.

Fixes: 82a1a1f3bf ("riscv: mm: support Svnapot in hugetlb page")
Link: https://lore.kernel.org/all/20250217140419.1702389-1-ryan.roberts@arm.com/ [1]
Link: https://lore.kernel.org/r/20250317072551.572169-1-alexghiti@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-01 07:03:03 +00:00
Yunhui Cui
a3313375e8
riscv: print hartid on bringup
Firmware randomly releases cores, so CPU numbers don't linearly map
to hartids. When the system has an exception, we care more about hartids.
Adding "dyndbg="file smpboot.c +p" loglevel=8" to the cmdline can output
the hartid.

Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250303083424.14309-1-cuiyunhui@bytedance.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-01 07:03:03 +00:00
Charlie Jenkins
95c18b7ccd
riscv: Add norvc after .option arch in runtime const
.option arch clobbers .option norvc. Prevent gas from emitting
compressed instructions in the runtime const alternative blocks by
setting .option norvc after .option arch. This issue starts appearing on
gcc 15, which adds zca to the march.

Reported by: Klara Modin <klarasmodin@gmail.com>
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Fixes: a44fb57221 ("riscv: Add runtime constant support")
Closes: https://lore.kernel.org/all/cc8f3525-20b7-445b-877b-2add28a160a2@gmail.com/
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250331-fix_runtime_const_norvc-v1-1-89bc62687ab8@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-03-31 11:53:02 -07:00
Linus Torvalds
e5e0e6bebe This update includes the following changes:
API:
 
 - Remove legacy compression interface.
 - Improve scatterwalk API.
 - Add request chaining to ahash and acomp.
 - Add virtual address support to ahash and acomp.
 - Add folio support to acomp.
 - Remove NULL dst support from acomp.
 
 Algorithms:
 
 - Library options are fuly hidden (selected by kernel users only).
 - Add Kerberos5 algorithms.
 - Add VAES-based ctr(aes) on x86.
 - Ensure LZO respects output buffer length on compression.
 - Remove obsolete SIMD fallback code path from arm/ghash-ce.
 
 Drivers:
 
 - Add support for PCI device 0x1134 in ccp.
 - Add support for rk3588's standalone TRNG in rockchip.
 - Add Inside Secure SafeXcel EIP-93 crypto engine support in eip93.
 - Fix bugs in tegra uncovered by multi-threaded self-test.
 - Fix corner cases in hisilicon/sec2.
 
 Others:
 
 - Add SG_MITER_LOCAL to sg miter.
 - Convert ubifs, hibernate and xfrm_ipcomp from legacy API to acomp.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmfiQ9kACgkQxycdCkmx
 i6fFZg/9GWjC1FLEV66vNlYAIzFGwzwWdFGyQzXyP235Cphhm4qt9gx7P91N6Lvc
 pplVjNEeZHoP8lMw+AIeGc2cRhIwsvn8C+HA3tCBOoC1qSe8T9t7KHAgiRGd/0iz
 UrzVBFLYlR9i4tc0T5peyQwSctv8DfjWzduTmI3Ts8i7OQcfeVVgj3sGfWam7kjF
 1GJWIQH7aPzT8cwFtk8gAK1insuPPZelT1Ppl9kUeZe0XUibrP7Gb5G9simxXAyi
 B+nLCaJYS6Hc1f47cfR/qyZSeYQN35KTVrEoKb1pTYXfEtMv6W9fIvQVLJRYsqpH
 RUBdDJUseE+WckR6glX9USrh+Fv9d+HfsTXh1fhpApKU5sQJ7pDbUm4ge8p6htNG
 MIszbJPdqajYveRLuPUjFlUXaqomos8eT6BZA+RLHm1cogzEOm+5bjspbfRNAVPj
 x9KiDu5lXNiFj02v/MkLKUe3bnGIyVQnZNi7Rn0Rpxjv95tIjVpksZWMPJarxUC6
 5zdyM2I5X0Z9+teBpbfWyqfzSbAs/KpzV8S/xNvWDUT6NlpYGBeNXrCDTXcwJLAh
 PRW0w1EJUwsZbPi8GEh5jNzo/YK1cGsUKrihKv7YgqSSopMLI8e/WVr8nKZMVDFA
 O+6F6ec5lR7KsOIMGUqrBGFU1ccAeaLLvLK3H5J8//gMMg82Uik=
 =aQNt
 -----END PGP SIGNATURE-----

Merge tag 'v6.15-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:
   - Remove legacy compression interface
   - Improve scatterwalk API
   - Add request chaining to ahash and acomp
   - Add virtual address support to ahash and acomp
   - Add folio support to acomp
   - Remove NULL dst support from acomp

  Algorithms:
   - Library options are fuly hidden (selected by kernel users only)
   - Add Kerberos5 algorithms
   - Add VAES-based ctr(aes) on x86
   - Ensure LZO respects output buffer length on compression
   - Remove obsolete SIMD fallback code path from arm/ghash-ce

  Drivers:
   - Add support for PCI device 0x1134 in ccp
   - Add support for rk3588's standalone TRNG in rockchip
   - Add Inside Secure SafeXcel EIP-93 crypto engine support in eip93
   - Fix bugs in tegra uncovered by multi-threaded self-test
   - Fix corner cases in hisilicon/sec2

  Others:
   - Add SG_MITER_LOCAL to sg miter
   - Convert ubifs, hibernate and xfrm_ipcomp from legacy API to acomp"

* tag 'v6.15-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (187 commits)
  crypto: testmgr - Add multibuffer acomp testing
  crypto: acomp - Fix synchronous acomp chaining fallback
  crypto: testmgr - Add multibuffer hash testing
  crypto: hash - Fix synchronous ahash chaining fallback
  crypto: arm/ghash-ce - Remove SIMD fallback code path
  crypto: essiv - Replace memcpy() + NUL-termination with strscpy()
  crypto: api - Call crypto_alg_put in crypto_unregister_alg
  crypto: scompress - Fix incorrect stream freeing
  crypto: lib/chacha - remove unused arch-specific init support
  crypto: remove obsolete 'comp' compression API
  crypto: compress_null - drop obsolete 'comp' implementation
  crypto: cavium/zip - drop obsolete 'comp' implementation
  crypto: zstd - drop obsolete 'comp' implementation
  crypto: lzo - drop obsolete 'comp' implementation
  crypto: lzo-rle - drop obsolete 'comp' implementation
  crypto: lz4hc - drop obsolete 'comp' implementation
  crypto: lz4 - drop obsolete 'comp' implementation
  crypto: deflate - drop obsolete 'comp' implementation
  crypto: 842 - drop obsolete 'comp' implementation
  crypto: nx - Migrate to scomp API
  ...
2025-03-29 10:01:55 -07:00
Linus Torvalds
2f24482304 soc: devicetree updates for 6.15
There is new support for additional on-chip devices on Apple, Mediatek,
 Renesas, Rockchip, Samsung, Google, TI, ST, Nvidia and Amlogic devices.
 
 The Arm Morello reference platform gets a devicetree for booting in
 normal aarch64 mode. The hardware supports experimental CHERI support,
 which requires a modified kernel.
 
 The AMD (formerly Xilinx) Versal NET SoC gets added, this is a combined
 FPGA with Cortex-A78 CPUs in a SoC.
 
 Six new ST STM32MP2 SoC variants are added. Like the earlier STM32MP25,
 the MP211, MP213, MP215, MP231, MP233 and MP235 models are based on one
 or two Cortex-A35 cores but each feature a different set of I/O devices.
 
 Mediatek MT8370 is a minor variation of MT8390 with fewer CPU and
 GPU cores
 
 Apple T2 is the baseboard management controller on earlier Intel CPU
 based Macs, with 16 models now gaining initial support.
 
 All the above come with dts files for the reference boards. In
 addition, these boards are added for the SoCs that are already supported.
 
  - The Milk-V Jupiter board based on SpacemiT K1/M1
 
  - NetCube Systems Kumquat board based on the 32-bit Allwinner V3s SoC
 
  - Three boards based on 32-bit stm32mp1
 
  - 11 distinct board variants from Toradex and one from Variscite,
    all based on i.MX6
 
  - Google Pixel Pro 6 phone based on gs101 (Tensor)
 
  - Three additional variants of the i.MX8MP based "Skov" board
 
  - A second variant of the i.MX95 EVK board
 
  - Two boards based on Renesas SoCs
 
  - Four boards based the Rockchip RK35xx series, plus the RK3588
    "MNT Reform 2" laptop
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmfkN4kACgkQYKtH/8kJ
 Uied/g/+OA7VTWS9K9DMjmrDvrMn73GPRRuC3Se7NutHt5kQeQ8I0oFk3VILxiLx
 /Rck8imIac65ANejy7M0omPYbVak6lyC0PRT9v/gZT++3rQzbd5/MAXHyKftVk6V
 aEgIcFcbu3805dPf+ioIsyk5DshhlqRg0o3u9iMIJlzaykoLapSirnOW0/dcbz8V
 kZlxuJRkQopx/+7/sbOtKXcL6ybif9jmgflQHUdGBT7Z7lAnhnIceelfKWPCQPki
 TrQXG5YITJXEebyM399SwnAOl9pXs6z4KSYsULAfCONKOGwIXmcD1wDhZOtDUvoH
 tu/V07R70fbEhJMTY5eZEc4Ym0firfk3IU3VRGxxOZJEuBOXN49snZzutNZEyqjX
 WOjy4jgHTHhQQKc7++mB+5G7zfBPI2gcMnLWgU18Bq8LRcdLGOYPfQ6+PvkMJI8K
 Vn/otYr80cSoJIPAmJZ9KvZiRmdqkK1YLxfOcNNjmFwk2YJxta7wmnbYX4g+zkkT
 ZdlU5+iQp0BQnd37m3WU+b4Ed60dH/g5bflj6TzXEYqiHY4Kxoud9Z7AQTPj4FIu
 PJiz+7J1p4a/uKK8BbVMjvnomARggdacrVDTB2mCfr6Av6c9RifAcREm0x6hZ2L8
 dTR5fnh8zJ7sbmFwoS6ncUp4hgqotfZ/fYfK91xJ9rpyucOpnTE=
 =6ZZ6
 -----END PGP SIGNATURE-----

Merge tag 'soc-dt-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC devicetree updates from Arnd Bergmann:
 "There is new support for additional on-chip devices on Apple,
  Mediatek, Renesas, Rockchip, Samsung, Google, TI, ST, Nvidia and
  Amlogic devices.

  The Arm Morello reference platform gets a devicetree for booting in
  normal aarch64 mode. The hardware supports experimental CHERI support,
  which requires a modified kernel.

  The AMD (formerly Xilinx) Versal NET SoC gets added, this is a
  combined FPGA with Cortex-A78 CPUs in a SoC.

  Six new ST STM32MP2 SoC variants are added. Like the earlier
  STM32MP25, the MP211, MP213, MP215, MP231, MP233 and MP235 models are
  based on one or two Cortex-A35 cores but each feature a different set
  of I/O devices.

  Mediatek MT8370 is a minor variation of MT8390 with fewer CPU and GPU
  cores

  Apple T2 is the baseboard management controller on earlier Intel CPU
  based Macs, with 16 models now gaining initial support.

  All the above come with dts files for the reference boards. In
  addition, these boards are added for the SoCs that are already
  supported:

   - The Milk-V Jupiter board based on SpacemiT K1/M1

   - NetCube Systems Kumquat board based on the 32-bit Allwinner V3s SoC

   - Three boards based on 32-bit stm32mp1

   - 11 distinct board variants from Toradex and one from Variscite, all
     based on i.MX6

   - Google Pixel Pro 6 phone based on gs101 (Tensor)

   - Three additional variants of the i.MX8MP based "Skov" board

   - A second variant of the i.MX95 EVK board

   - Two boards based on Renesas SoCs

   - Four boards based the Rockchip RK35xx series, plus the RK3588 'MNT
     Reform 2' laptop"

* tag 'soc-dt-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (538 commits)
  arm64: dts: Add gpio_intc node for Amlogic A5 SoCs
  arm64: dts: Add gpio_intc node for Amlogic A4 SoCs
  arm64: dts: hi3660: Add property for fixing CPUIdle
  arm64: dts: rockchip: remove ethm0_clk0_25m_out from Sige5 gmac0
  arm64: dts: marvell: Use preferred node names for "simple-bus"
  arm64: dts: marvell: Drop unused CP11X_TYPE define
  arm64: dts: marvell: Move arch timer and pmu nodes to top-level
  arm64: dts: rockchip: Fix PWM pinctrl names
  arm64: dts: rockchip: fix RK3576 SCMI clock IDs
  dt-bindings: clock: rk3576: add SCMI clocks
  arm64: dts: rockchip: Fix pcie reset gpio on Orange Pi 5 Max
  arm64: dts: amd/seattle: Drop undocumented "spi-controller" properties
  arm64: dts: amd/seattle: Fix bus, mmc, and ethernet node names
  arm64: dts: amd/seattle: Move and simplify fixed clocks
  arm64: dts: amd/seattle: Base Overdrive B1 on top of B0 version
  arm64: dts: rockchip: Enable HDMI audio output for ArmSoM Sige7
  arm64: dts: rockchip: Enable onboard eMMC on Radxa E20C
  arm64: dts: rockchip: Add SDHCI controller for RK3528
  arm64: dts: rockchip: Remove bluetooth node from rock-3a
  arm64: dts: rockchip: Move rk356x scmi SHMEM to reserved memory
  ...
2025-03-27 09:01:37 -07:00
Linus Torvalds
1a9239bb42 Networking changes for 6.15.
Core & protocols
 ----------------
 
  - Continue Netlink conversions to per-namespace RTNL lock
    (IPv4 routing, routing rules, routing next hops, ARP ioctls).
 
  - Continue extending the use of netdev instance locks. As a driver
    opt-in protect queue operations and (in due course) ethtool
    operations with the instance lock and not RTNL lock.
 
  - Support collecting TCP timestamps (data submitted, sent, acked)
    in BPF, allowing for transparent (to the application) and lower
    overhead tracking of TCP RPC performance.
 
  - Tweak existing networking Rx zero-copy infra to support zero-copy
    Rx via io_uring.
 
  - Optimize MPTCP performance in single subflow mode by 29%.
 
  - Enable GRO on packets which went thru XDP CPU redirect (were queued
    for processing on a different CPU). Improving TCP stream performance
    up to 2x.
 
  - Improve performance of contended connect() by 200% by searching
    for an available 4-tuple under RCU rather than a spin lock.
    Bring an additional 229% improvement by tweaking hash distribution.
 
  - Avoid unconditionally touching sk_tsflags on RX, improving
    performance under UDP flood by as much as 10%.
 
  - Avoid skb_clone() dance in ping_rcv() to improve performance under
    ping flood.
 
  - Avoid FIB lookup in netfilter if socket is available, 20% perf win.
 
  - Rework network device creation (in-kernel) API to more clearly
    identify network namespaces and their roles.
    There are up to 4 namespace roles but we used to have just 2 netns
    pointer arguments, interpreted differently based on context.
 
  - Use sysfs_break_active_protection() instead of trylock to avoid
    deadlocks between unregistering objects and sysfs access.
 
  - Add a new sysctl and sockopt for capping max retransmit timeout
    in TCP.
 
  - Support masking port and DSCP in routing rule matches.
 
  - Support dumping IPv4 multicast addresses with RTM_GETMULTICAST.
 
  - Support specifying at what time packet should be sent on AF_XDP
    sockets.
 
  - Expose TCP ULP diagnostic info (for TLS and MPTCP) to non-admin users.
 
  - Add Netlink YAML spec for WiFi (nl80211) and conntrack.
 
  - Introduce EXPORT_IPV6_MOD() and EXPORT_IPV6_MOD_GPL() for symbols
    which only need to be exported when IPv6 support is built as a module.
 
  - Age FDB entries based on Rx not Tx traffic in VxLAN, similar
    to normal bridging.
 
  - Allow users to specify source port range for GENEVE tunnels.
 
  - netconsole: allow attaching kernel release, CPU ID and task name
    to messages as metadata
 
 Driver API
 ----------
 
  - Continue rework / fixing of Energy Efficient Ethernet (EEE) across
    the SW layers. Delegate the responsibilities to phylink where possible.
    Improve its handling in phylib.
 
  - Support symmetric OR-XOR RSS hashing algorithm.
 
  - Support tracking and preserving IRQ affinity by NAPI itself.
 
  - Support loopback mode speed selection for interface selftests.
 
 Device drivers
 --------------
 
  - Remove the IBM LCS driver for s390.
 
  - Remove the sb1000 cable modem driver.
 
  - Add support for SFP module access over SMBus.
 
  - Add MCTP transport driver for MCTP-over-USB.
 
  - Enable XDP metadata support in multiple drivers.
 
  - Ethernet high-speed NICs:
    - Broadcom (bnxt):
      - add PCIe TLP Processing Hints (TPH) support for new AMD platforms
      - support dumping RoCE queue state for debug
      - opt into instance locking
    - Intel (100G, ice, idpf):
      - ice: rework MSI-X IRQ management and distribution
      - ice: support for E830 devices
      - iavf: add support for Rx timestamping
      - iavf: opt into instance locking
    - nVidia/Mellanox:
      - mlx4: use page pool memory allocator for Rx
      - mlx5: support for one PTP device per hardware clock
      - mlx5: support for 200Gbps per-lane link modes
      - mlx5: move IPSec policy check after decryption
    - AMD/Solarflare:
      - support FW flashing via devlink
    - Cisco (enic):
      - use page pool memory allocator for Rx
      - enable 32, 64 byte CQEs
      - get max rx/tx ring size from the device
    - Meta (fbnic):
      - support flow steering and RSS configuration
      - report queue stats
      - support TCP segmentation
      - support IRQ coalescing
      - support ring size configuration
    - Marvell/Cavium:
      - support AF_XDP
    - Wangxun:
      - support for PTP clock and timestamping
    - Huawei (hibmcge):
      - checksum offload
      - add more statistics
 
  - Ethernet virtual:
    - VirtIO net:
      - aggressively suppress Tx completions, improve perf by 96% with
        1 CPU and 55% with 2 CPUs
      - expose NAPI to IRQ mapping and persist NAPI settings
    - Google (gve):
      - support XDP in DQO RDA Queue Format
      - opt into instance locking
    - Microsoft vNIC:
      - support BIG TCP
 
  - Ethernet NICs consumer, and embedded:
    - Synopsys (stmmac):
      - cleanup Tx and Tx clock setting and other link-focused cleanups
      - enable SGMII and 2500BASEX mode switching for Intel platforms
      - support Sophgo SG2044
    - Broadcom switches (b53):
      - support for BCM53101
    - TI:
      - iep: add perout configuration support
      - icssg: support XDP
    - Cadence (macb):
      - implement BQL
    - Xilinx (axinet):
      - support dynamic IRQ moderation and changing coalescing at runtime
      - implement BQL
      - report standard stats
    - MediaTek:
      - support phylink managed EEE
    - Intel:
      - igc: don't restart the interface on every XDP program change
    - RealTek (r8169):
      - support reading registers of internal PHYs directly
      - increase max jumbo packet size on RTL8125/RTL8126
    - Airoha:
      - support for RISC-V NPU packet processing unit
      - enable scatter-gather and support MTU up to 9kB
    - Tehuti (tn40xx):
      - support cards with TN4010 MAC and an Aquantia AQR105 PHY
 
  - Ethernet PHYs:
    - support for TJA1102S, TJA1121
    - dp83tg720: add randomized polling intervals for link detection
    - dp83822: support changing the transmit amplitude voltage
    - support for LEDs on 88q2xxx
 
  - CAN:
    - canxl: support Remote Request Substitution bit access
    - flexcan: add S32G2/S32G3 SoC
 
  - WiFi:
    - remove cooked monitor support
    - strict mode for better AP testing
    - basic EPCS support
    - OMI RX bandwidth reduction support
    - batman-adv: add support for jumbo frames
 
  - WiFi drivers:
    - RealTek (rtw88):
      - support RTL8814AE and RTL8814AU
    - RealTek (rtw89):
      - switch using wiphy_lock and wiphy_work
      - add BB context to manipulate two PHY as preparation of MLO
      - improve BT-coexistence mechanism to play A2DP smoothly
    - Intel (iwlwifi):
      - add new iwlmld sub-driver for latest HW/FW combinations
    - MediaTek (mt76):
      - preparation for mt7996 Multi-Link Operation (MLO) support
    - Qualcomm/Atheros (ath12k):
      - continued work on MLO
    - Silabs (wfx):
      - Wake-on-WLAN support
 
  - Bluetooth:
    - add support for skb TX SND/COMPLETION timestamping
    - hci_core: enable buffer flow control for SCO/eSCO
    - coredump: log devcd dumps into the monitor
 
  - Bluetooth drivers:
    - intel: add support to configure TX power
    - nxp: handle bootloader error during cmd5 and cmd7
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmfkLC8ACgkQMUZtbf5S
 Irsb5g/+L7oKOf0ALbaV9kxFsoz8AymZfAW9i/27F07omGJGpks8oX6j6rQLgIRO
 OQOFcp7XEdDh1+jh82gHVuPrw2/6lchLtW8ARtzdiQKFr5DRjrsbtua6GRc8iBqA
 DIRCBFoV2HuMkF39Vr09HMa9AZAT7QR2RLsRGpSq8E8Z8xxKz0X7oujs10PFpMTE
 IVKhTrVrk+NDot/IU2hzVpnpup+0ld+T2/ZaBklJGcU8uDffImsqNepHRyCG5UC3
 xz74Ju23MAj24Gct+og0yFUooF+lUltKyVm0FYCDCY3bASTwgY01NR3kEH/0NQvM
 cywLzd/ngHm/SMD2ggVAHkjZUieiIVHdaZ53dgjDeBOQoVP6p0dgUK7EumXX8Mx4
 8ReR2UiGoYRPaq9c4o+IjG4K027MwVK2p+mF1a6MLa+20XcyMbev8FIRbbHtC/V4
 z5/FsOAxcuICWkA1hU9bODrrGzIqemmdRgKG8sGuTJCt/kYGAn72/TCATGNSaCJ0
 00n2jN1aepa7wtywHJ5MhVzxN9iQX7+geUHXz0BI+lK4e1Pmk+vjGksymb9ai2fk
 eQAUV9ekub6q68/J16scD7XeOUM37bTLiMBQeIF8UtZBOJscKiS71zn9QP9Twwxv
 P2pm01RDZUI+z5ZX3hc12Pm1vjRHaAh9S1JpAw/pTOVlQ+mAJEM=
 =XY0S
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core & protocols:

   - Continue Netlink conversions to per-namespace RTNL lock
     (IPv4 routing, routing rules, routing next hops, ARP ioctls)

   - Continue extending the use of netdev instance locks. As a driver
     opt-in protect queue operations and (in due course) ethtool
     operations with the instance lock and not RTNL lock.

   - Support collecting TCP timestamps (data submitted, sent, acked) in
     BPF, allowing for transparent (to the application) and lower
     overhead tracking of TCP RPC performance.

   - Tweak existing networking Rx zero-copy infra to support zero-copy
     Rx via io_uring.

   - Optimize MPTCP performance in single subflow mode by 29%.

   - Enable GRO on packets which went thru XDP CPU redirect (were queued
     for processing on a different CPU). Improving TCP stream
     performance up to 2x.

   - Improve performance of contended connect() by 200% by searching for
     an available 4-tuple under RCU rather than a spin lock. Bring an
     additional 229% improvement by tweaking hash distribution.

   - Avoid unconditionally touching sk_tsflags on RX, improving
     performance under UDP flood by as much as 10%.

   - Avoid skb_clone() dance in ping_rcv() to improve performance under
     ping flood.

   - Avoid FIB lookup in netfilter if socket is available, 20% perf win.

   - Rework network device creation (in-kernel) API to more clearly
     identify network namespaces and their roles. There are up to 4
     namespace roles but we used to have just 2 netns pointer arguments,
     interpreted differently based on context.

   - Use sysfs_break_active_protection() instead of trylock to avoid
     deadlocks between unregistering objects and sysfs access.

   - Add a new sysctl and sockopt for capping max retransmit timeout in
     TCP.

   - Support masking port and DSCP in routing rule matches.

   - Support dumping IPv4 multicast addresses with RTM_GETMULTICAST.

   - Support specifying at what time packet should be sent on AF_XDP
     sockets.

   - Expose TCP ULP diagnostic info (for TLS and MPTCP) to non-admin
     users.

   - Add Netlink YAML spec for WiFi (nl80211) and conntrack.

   - Introduce EXPORT_IPV6_MOD() and EXPORT_IPV6_MOD_GPL() for symbols
     which only need to be exported when IPv6 support is built as a
     module.

   - Age FDB entries based on Rx not Tx traffic in VxLAN, similar to
     normal bridging.

   - Allow users to specify source port range for GENEVE tunnels.

   - netconsole: allow attaching kernel release, CPU ID and task name to
     messages as metadata

  Driver API:

   - Continue rework / fixing of Energy Efficient Ethernet (EEE) across
     the SW layers. Delegate the responsibilities to phylink where
     possible. Improve its handling in phylib.

   - Support symmetric OR-XOR RSS hashing algorithm.

   - Support tracking and preserving IRQ affinity by NAPI itself.

   - Support loopback mode speed selection for interface selftests.

  Device drivers:

   - Remove the IBM LCS driver for s390

   - Remove the sb1000 cable modem driver

   - Add support for SFP module access over SMBus

   - Add MCTP transport driver for MCTP-over-USB

   - Enable XDP metadata support in multiple drivers

   - Ethernet high-speed NICs:
      - Broadcom (bnxt):
         - add PCIe TLP Processing Hints (TPH) support for new AMD
           platforms
         - support dumping RoCE queue state for debug
         - opt into instance locking
      - Intel (100G, ice, idpf):
         - ice: rework MSI-X IRQ management and distribution
         - ice: support for E830 devices
         - iavf: add support for Rx timestamping
         - iavf: opt into instance locking
      - nVidia/Mellanox:
         - mlx4: use page pool memory allocator for Rx
         - mlx5: support for one PTP device per hardware clock
         - mlx5: support for 200Gbps per-lane link modes
         - mlx5: move IPSec policy check after decryption
      - AMD/Solarflare:
         - support FW flashing via devlink
      - Cisco (enic):
         - use page pool memory allocator for Rx
         - enable 32, 64 byte CQEs
         - get max rx/tx ring size from the device
      - Meta (fbnic):
         - support flow steering and RSS configuration
         - report queue stats
         - support TCP segmentation
         - support IRQ coalescing
         - support ring size configuration
      - Marvell/Cavium:
         - support AF_XDP
      - Wangxun:
         - support for PTP clock and timestamping
      - Huawei (hibmcge):
         - checksum offload
         - add more statistics

   - Ethernet virtual:
      - VirtIO net:
         - aggressively suppress Tx completions, improve perf by 96%
           with 1 CPU and 55% with 2 CPUs
         - expose NAPI to IRQ mapping and persist NAPI settings
      - Google (gve):
         - support XDP in DQO RDA Queue Format
         - opt into instance locking
      - Microsoft vNIC:
         - support BIG TCP

   - Ethernet NICs consumer, and embedded:
      - Synopsys (stmmac):
         - cleanup Tx and Tx clock setting and other link-focused
           cleanups
         - enable SGMII and 2500BASEX mode switching for Intel platforms
         - support Sophgo SG2044
      - Broadcom switches (b53):
         - support for BCM53101
      - TI:
         - iep: add perout configuration support
         - icssg: support XDP
      - Cadence (macb):
         - implement BQL
      - Xilinx (axinet):
         - support dynamic IRQ moderation and changing coalescing at
           runtime
         - implement BQL
         - report standard stats
      - MediaTek:
         - support phylink managed EEE
      - Intel:
         - igc: don't restart the interface on every XDP program change
      - RealTek (r8169):
         - support reading registers of internal PHYs directly
         - increase max jumbo packet size on RTL8125/RTL8126
      - Airoha:
         - support for RISC-V NPU packet processing unit
         - enable scatter-gather and support MTU up to 9kB
      - Tehuti (tn40xx):
         - support cards with TN4010 MAC and an Aquantia AQR105 PHY

   - Ethernet PHYs:
      - support for TJA1102S, TJA1121
      - dp83tg720: add randomized polling intervals for link detection
      - dp83822: support changing the transmit amplitude voltage
      - support for LEDs on 88q2xxx

   - CAN:
      - canxl: support Remote Request Substitution bit access
      - flexcan: add S32G2/S32G3 SoC

   - WiFi:
      - remove cooked monitor support
      - strict mode for better AP testing
      - basic EPCS support
      - OMI RX bandwidth reduction support
      - batman-adv: add support for jumbo frames

   - WiFi drivers:
      - RealTek (rtw88):
         - support RTL8814AE and RTL8814AU
      - RealTek (rtw89):
         - switch using wiphy_lock and wiphy_work
         - add BB context to manipulate two PHY as preparation of MLO
         - improve BT-coexistence mechanism to play A2DP smoothly
      - Intel (iwlwifi):
         - add new iwlmld sub-driver for latest HW/FW combinations
      - MediaTek (mt76):
         - preparation for mt7996 Multi-Link Operation (MLO) support
      - Qualcomm/Atheros (ath12k):
         - continued work on MLO
      - Silabs (wfx):
         - Wake-on-WLAN support

   - Bluetooth:
      - add support for skb TX SND/COMPLETION timestamping
      - hci_core: enable buffer flow control for SCO/eSCO
      - coredump: log devcd dumps into the monitor

   - Bluetooth drivers:
      - intel: add support to configure TX power
      - nxp: handle bootloader error during cmd5 and cmd7"

* tag 'net-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1681 commits)
  unix: fix up for "apparmor: add fine grained af_unix mediation"
  mctp: Fix incorrect tx flow invalidation condition in mctp-i2c
  net: usb: asix: ax88772: Increase phy_name size
  net: phy: Introduce PHY_ID_SIZE — minimum size for PHY ID string
  net: libwx: fix Tx L4 checksum
  net: libwx: fix Tx descriptor content for some tunnel packets
  atm: Fix NULL pointer dereference
  net: tn40xx: add pci-id of the aqr105-based Tehuti TN4010 cards
  net: tn40xx: prepare tn40xx driver to find phy of the TN9510 card
  net: tn40xx: create swnode for mdio and aqr105 phy and add to mdiobus
  net: phy: aquantia: add essential functions to aqr105 driver
  net: phy: aquantia: search for firmware-name in fwnode
  net: phy: aquantia: add probe function to aqr105 for firmware loading
  net: phy: Add swnode support to mdiobus_scan
  gve: add XDP DROP and PASS support for DQ
  gve: update XDP allocation path support RX buffer posting
  gve: merge packet buffer size fields
  gve: update GQ RX to use buf_size
  gve: introduce config-based allocation for XDP
  gve: remove xdp_xsk_done and xdp_xsk_wakeup statistics
  ...
2025-03-26 21:48:21 -07:00
Palmer Dabbelt
f633de4aa4
Merge patch series "riscv: Relocatable NOMMU kernels"
Samuel Holland <samuel.holland@sifive.com> says:

Currently, RISC-V NOMMU kernels are linked at CONFIG_PAGE_OFFSET, and
since they are not relocatable, must be loaded at this address as well.
CONFIG_PAGE_OFFSET is not a user-visible Kconfig option, so its value is
not obvious, and users must patch the kernel source if they want to load
it at a different address.

Make NOMMU kernels more portable by making them relocatable by default.
This allows a single kernel binary to work when loaded at any address.

* b4-shazam-merge:
  riscv: Remove CONFIG_PAGE_OFFSET
  riscv: Support CONFIG_RELOCATABLE on riscv32
  asm-generic: Always define Elf_Rel and Elf_Rela
  riscv: Support CONFIG_RELOCATABLE on NOMMU
  riscv: Allow NOMMU kernels to access all of RAM
  riscv: Remove duplicate CONFIG_PAGE_OFFSET definition

Link: https://lore.kernel.org/r/20241026171441.3047904-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-03-26 15:56:49 -07:00
Samuel Holland
e1cf2d009b
riscv: Remove CONFIG_PAGE_OFFSET
The current definition of CONFIG_PAGE_OFFSET is problematic for a couple
of reasons:
 1) The value is misleading for normal 64-bit kernels, where it is
    overridden at runtime if Sv48 or Sv39 is chosen. This is especially
    the case for XIP kernels, which always use Sv39.
 2) The option is not user-visible, but for NOMMU kernels it must be a
    valid RAM address, and for !RELOCATABLE it must additionally be the
    exact address where the kernel is loaded.

Fix both of these by removing the option.
 1) For MMU kernels, drop the indirection through Kconfig. Additionally,
    for XIP, drop the indirection through kernel_map.
 2) For NOMMU kernels, use the user-visible physical RAM base if
    provided. Otherwise, force the kernel to be relocatable.

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Jesse Taube <mr.bossman075@gmail.com>
Link: https://lore.kernel.org/r/20241026171441.3047904-7-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-03-26 15:56:46 -07:00
Samuel Holland
ea2bde36a4
riscv: Support CONFIG_RELOCATABLE on riscv32
When adjusted to use the correctly-sized ELF types, relocate_kernel()
works on riscv32 as well. The caveat about crossing an intermediate page
table boundary does not apply to riscv32, since for Sv32 the early
kernel mapping uses only PGD entries. Since KASLR is not yet supported
on riscv32, this option is mostly useful for NOMMU.

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20241026171441.3047904-6-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-03-26 15:56:45 -07:00
Samuel Holland
51b766c79a
riscv: Support CONFIG_RELOCATABLE on NOMMU
Move relocate_kernel() out of the CONFIG_MMU block so it can be called
from the NOMMU version of setup_vm(). Set some offsets in kernel_map so
relocate_kernel() does not need to be modified. Relocatable NOMMU
kernels can be loaded to any physical memory address; they no longer
depend on CONFIG_PAGE_OFFSET.

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20241026171441.3047904-4-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-03-26 15:56:41 -07:00
Samuel Holland
2c0391b29b
riscv: Allow NOMMU kernels to access all of RAM
NOMMU kernels currently cannot access memory below the kernel link
address. Remove this restriction by setting PAGE_OFFSET to the actual
start of RAM, as determined from the devicetree. The kernel link address
must be a constant, so keep using CONFIG_PAGE_OFFSET for that purpose.

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Jesse Taube <mr.bossman075@gmail.com>
Link: https://lore.kernel.org/r/20241026171441.3047904-3-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-03-26 15:56:40 -07:00
Samuel Holland
bffada8201
riscv: Remove duplicate CONFIG_PAGE_OFFSET definition
This definition is already provided by include/generated/autoconf.h,
so it does not need to be provided on the command line.

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Jesse Taube <mr.bossman075@gmail.com>
Link: https://lore.kernel.org/r/20241026171441.3047904-2-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-03-26 15:56:38 -07:00
Palmer Dabbelt
bb58e1579f
RISC-V: errata: Use medany for relocatable builds
We're trying to mix non-PIC/PIE objects into the otherwise-PIE
relocatable kernels, to avoid GOT/PLT references during early boot
alternative resolution (which happens before the GOT/PLT are set up).

riscv64-unknown-linux-gnu-ld: arch/riscv/errata/sifive/errata.o: relocation R_RISCV_HI20 against `tlb_flush_all_threshold' can not be used when making a shared object; recompile with -fPIC
riscv64-unknown-linux-gnu-ld: arch/riscv/errata/thead/errata.o: relocation R_RISCV_HI20 against `riscv_cbom_block_size' can not be used when making a shared object; recompile with -fPIC

Fixes: 8dc2a7e802 ("riscv: Fix relocatable kernels with early alternatives using -fno-pie")
Link: https://lore.kernel.org/r/20250326224506.27165-2-palmer@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-03-26 15:56:37 -07:00
Linus Torvalds
ee6740fd34 CRC updates for 6.15
Another set of improvements to the kernel's CRC (cyclic redundancy
 check) code:
 
 - Rework the CRC64 library functions to be directly optimized, like what
   I did last cycle for the CRC32 and CRC-T10DIF library functions.
 
 - Rewrite the x86 PCLMULQDQ-optimized CRC code, and add VPCLMULQDQ
   support and acceleration for crc64_be and crc64_nvme.
 
 - Rewrite the riscv Zbc-optimized CRC code, and add acceleration for
   crc_t10dif, crc64_be, and crc64_nvme.
 
 - Remove crc_t10dif and crc64_rocksoft from the crypto API, since they
   are no longer needed there.
 
 - Rename crc64_rocksoft to crc64_nvme, as the old name was incorrect.
 
 - Add kunit test cases for crc64_nvme and crc7.
 
 - Eliminate redundant functions for calculating the Castagnoli CRC32,
   settling on just crc32c().
 
 - Remove unnecessary prompts from some of the CRC kconfig options.
 
 - Further optimize the x86 crc32c code.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCZ+CGGhQcZWJpZ2dlcnNA
 Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK3wRAP4tbnzawUmlIHIF0hleoADXehUgAhMt
 NZn15mGvyiuwIQEA8W9qvnLdFXZkdxhxAEvDDFjyrRauL6eGtr/GvCx4AQY=
 =wmKG
 -----END PGP SIGNATURE-----

Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux

Pull CRC updates from Eric Biggers:
 "Another set of improvements to the kernel's CRC (cyclic redundancy
  check) code:

   - Rework the CRC64 library functions to be directly optimized, like
     what I did last cycle for the CRC32 and CRC-T10DIF library
     functions

   - Rewrite the x86 PCLMULQDQ-optimized CRC code, and add VPCLMULQDQ
     support and acceleration for crc64_be and crc64_nvme

   - Rewrite the riscv Zbc-optimized CRC code, and add acceleration for
     crc_t10dif, crc64_be, and crc64_nvme

   - Remove crc_t10dif and crc64_rocksoft from the crypto API, since
     they are no longer needed there

   - Rename crc64_rocksoft to crc64_nvme, as the old name was incorrect

   - Add kunit test cases for crc64_nvme and crc7

   - Eliminate redundant functions for calculating the Castagnoli CRC32,
     settling on just crc32c()

   - Remove unnecessary prompts from some of the CRC kconfig options

   - Further optimize the x86 crc32c code"

* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (36 commits)
  x86/crc: drop the avx10_256 functions and rename avx10_512 to avx512
  lib/crc: remove unnecessary prompt for CONFIG_CRC64
  lib/crc: remove unnecessary prompt for CONFIG_LIBCRC32C
  lib/crc: remove unnecessary prompt for CONFIG_CRC8
  lib/crc: remove unnecessary prompt for CONFIG_CRC7
  lib/crc: remove unnecessary prompt for CONFIG_CRC4
  lib/crc7: unexport crc7_be_syndrome_table
  lib/crc_kunit.c: update comment in crc_benchmark()
  lib/crc_kunit.c: add test and benchmark for crc7_be()
  x86/crc32: optimize tail handling for crc32c short inputs
  riscv/crc64: add Zbc optimized CRC64 functions
  riscv/crc-t10dif: add Zbc optimized CRC-T10DIF function
  riscv/crc32: reimplement the CRC32 functions using new template
  riscv/crc: add "template" for Zbc optimized CRC functions
  x86/crc: add ANNOTATE_NOENDBR to suppress objtool warnings
  x86/crc32: improve crc32c_arch() code generation with clang
  x86/crc64: implement crc64_be and crc64_nvme using new template
  x86/crc-t10dif: implement crc_t10dif using new template
  x86/crc32: implement crc32_le using new template
  x86/crc: add "template" for [V]PCLMULQDQ based CRC functions
  ...
2025-03-25 18:33:04 -07:00
Linus Torvalds
edb0e8f6e2 ARM:
* Nested virtualization support for VGICv3, giving the nested
 hypervisor control of the VGIC hardware when running an L2 VM
 
 * Removal of 'late' nested virtualization feature register masking,
   making the supported feature set directly visible to userspace
 
 * Support for emulating FEAT_PMUv3 on Apple silicon, taking advantage
   of an IMPLEMENTATION DEFINED trap that covers all PMUv3 registers
 
 * Paravirtual interface for discovering the set of CPU implementations
   where a VM may run, addressing a longstanding issue of guest CPU
   errata awareness in big-little systems and cross-implementation VM
   migration
 
 * Userspace control of the registers responsible for identifying a
   particular CPU implementation (MIDR_EL1, REVIDR_EL1, AIDR_EL1),
   allowing VMs to be migrated cross-implementation
 
 * pKVM updates, including support for tracking stage-2 page table
   allocations in the protected hypervisor in the 'SecPageTable' stat
 
 * Fixes to vPMU, ensuring that userspace updates to the vPMU after
   KVM_RUN are reflected into the backing perf events
 
 LoongArch:
 
 * Remove unnecessary header include path
 
 * Assume constant PGD during VM context switch
 
 * Add perf events support for guest VM
 
 RISC-V:
 
 * Disable the kernel perf counter during configure
 
 * KVM selftests improvements for PMU
 
 * Fix warning at the time of KVM module removal
 
 x86:
 
 * Add support for aging of SPTEs without holding mmu_lock.  Not taking mmu_lock
   allows multiple aging actions to run in parallel, and more importantly avoids
   stalling vCPUs.  This includes an implementation of per-rmap-entry locking;
   aging the gfn is done with only a per-rmap single-bin spinlock taken, whereas
   locking an rmap for write requires taking both the per-rmap spinlock and
   the mmu_lock.
 
   Note that this decreases slightly the accuracy of accessed-page information,
   because changes to the SPTE outside aging might not use atomic operations
   even if they could race against a clear of the Accessed bit.  This is
   deliberate because KVM and mm/ tolerate false positives/negatives for
   accessed information, and testing has shown that reducing the latency of
   aging is far more beneficial to overall system performance than providing
   "perfect" young/old information.
 
 * Defer runtime CPUID updates until KVM emulates a CPUID instruction, to
   coalesce updates when multiple pieces of vCPU state are changing, e.g. as
   part of a nested transition.
 
 * Fix a variety of nested emulation bugs, and add VMX support for synthesizing
   nested VM-Exit on interception (instead of injecting #UD into L2).
 
 * Drop "support" for async page faults for protected guests that do not set
   SEND_ALWAYS (i.e. that only want async page faults at CPL3)
 
 * Bring a bit of sanity to x86's VM teardown code, which has accumulated
   a lot of cruft over the years.  Particularly, destroy vCPUs before
   the MMU, despite the latter being a VM-wide operation.
 
 * Add common secure TSC infrastructure for use within SNP and in the
   future TDX
 
 * Block KVM_CAP_SYNC_REGS if guest state is protected.  It does not make
   sense to use the capability if the relevant registers are not
   available for reading or writing.
 
 * Don't take kvm->lock when iterating over vCPUs in the suspend notifier to
   fix a largely theoretical deadlock.
 
 * Use the vCPU's actual Xen PV clock information when starting the Xen timer,
   as the cached state in arch.hv_clock can be stale/bogus.
 
 * Fix a bug where KVM could bleed PVCLOCK_GUEST_STOPPED across different
   PV clocks; restrict PVCLOCK_GUEST_STOPPED to kvmclock, as KVM's suspend
   notifier only accounts for kvmclock, and there's no evidence that the
   flag is actually supported by Xen guests.
 
 * Clean up the per-vCPU "cache" of its reference pvclock, and instead only
   track the vCPU's TSC scaling (multipler+shift) metadata (which is moderately
   expensive to compute, and rarely changes for modern setups).
 
 * Don't write to the Xen hypercall page on MSR writes that are initiated by
   the host (userspace or KVM) to fix a class of bugs where KVM can write to
   guest memory at unexpected times, e.g. during vCPU creation if userspace has
   set the Xen hypercall MSR index to collide with an MSR that KVM emulates.
 
 * Restrict the Xen hypercall MSR index to the unofficial synthetic range to
   reduce the set of possible collisions with MSRs that are emulated by KVM
   (collisions can still happen as KVM emulates Hyper-V MSRs, which also reside
   in the synthetic range).
 
 * Clean up and optimize KVM's handling of Xen MSR writes and xen_hvm_config.
 
 * Update Xen TSC leaves during CPUID emulation instead of modifying the CPUID
   entries when updating PV clocks; there is no guarantee PV clocks will be
   updated between TSC frequency changes and CPUID emulation, and guest reads
   of the TSC leaves should be rare, i.e. are not a hot path.
 
 x86 (Intel):
 
 * Fix a bug where KVM unnecessarily reads XFD_ERR from hardware and thus
   modifies the vCPU's XFD_ERR on a #NM due to CR0.TS=1.
 
 * Pass XFD_ERR as the payload when injecting #NM, as a preparatory step
   for upcoming FRED virtualization support.
 
 * Decouple the EPT entry RWX protection bit macros from the EPT Violation
   bits, both as a general cleanup and in anticipation of adding support for
   emulating Mode-Based Execution Control (MBEC).
 
 * Reject KVM_RUN if userspace manages to gain control and stuff invalid guest
   state while KVM is in the middle of emulating nested VM-Enter.
 
 * Add a macro to handle KVM's sanity checks on entry/exit VMCS control pairs
   in anticipation of adding sanity checks for secondary exit controls (the
   primary field is out of bits).
 
 x86 (AMD):
 
 * Ensure the PSP driver is initialized when both the PSP and KVM modules are
   built-in (the initcall framework doesn't handle dependencies).
 
 * Use long-term pins when registering encrypted memory regions, so that the
   pages are migrated out of MIGRATE_CMA/ZONE_MOVABLE and don't lead to
   excessive fragmentation.
 
 * Add macros and helpers for setting GHCB return/error codes.
 
 * Add support for Idle HLT interception, which elides interception if the vCPU
   has a pending, unmasked virtual IRQ when HLT is executed.
 
 * Fix a bug in INVPCID emulation where KVM fails to check for a non-canonical
   address.
 
 * Don't attempt VMRUN for SEV-ES+ guests if the vCPU's VMSA is invalid, e.g.
   because the vCPU was "destroyed" via SNP's AP Creation hypercall.
 
 * Reject SNP AP Creation if the requested SEV features for the vCPU don't
   match the VM's configured set of features.
 
 Selftests:
 
 * Fix again the Intel PMU counters test; add a data load and do CLFLUSH{OPT} on the data
   instead of executing code.  The theory is that modern Intel CPUs have
   learned new code prefetching tricks that bypass the PMU counters.
 
 * Fix a flaw in the Intel PMU counters test where it asserts that an event is
   counting correctly without actually knowing what the event counts on the
   underlying hardware.
 
 * Fix a variety of flaws, bugs, and false failures/passes dirty_log_test, and
   improve its coverage by collecting all dirty entries on each iteration.
 
 * Fix a few minor bugs related to handling of stats FDs.
 
 * Add infrastructure to make vCPU and VM stats FDs available to tests by
   default (open the FDs during VM/vCPU creation).
 
 * Relax an assertion on the number of HLT exits in the xAPIC IPI test when
   running on a CPU that supports AMD's Idle HLT (which elides interception of
   HLT if a virtual IRQ is pending and unmasked).
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmfcTkEUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMnQAf/cPx72hJOdNy4Qrm8M33YLXVRVV00
 yEZ8eN8TWdOclr0ltE/w/ELGh/qS4CU8pjURAk0A6lPioU+mdcTn3dPEqMDMVYom
 uOQ2lusEHw0UuSnGZSEjvZJsE/Ro2NSAsHIB6PWRqig1ZBPJzyu0frce34pMpeQH
 diwriJL9lKPAhBWXnUQ9BKoi1R0P5OLW9ahX4SOWk7cAFg4DLlDE66Nqf6nKqViw
 DwEucTiUEg5+a3d93gihdD4JNl+fb3vI2erxrMxjFjkacl0qgqRu3ei3DG0MfdHU
 wNcFSG5B1n0OECKxr80lr1Ip1KTVNNij0Ks+w6Gc6lSg9c4PptnNkfLK3A==
 =nnCN
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM:

   - Nested virtualization support for VGICv3, giving the nested
     hypervisor control of the VGIC hardware when running an L2 VM

   - Removal of 'late' nested virtualization feature register masking,
     making the supported feature set directly visible to userspace

   - Support for emulating FEAT_PMUv3 on Apple silicon, taking advantage
     of an IMPLEMENTATION DEFINED trap that covers all PMUv3 registers

   - Paravirtual interface for discovering the set of CPU
     implementations where a VM may run, addressing a longstanding issue
     of guest CPU errata awareness in big-little systems and
     cross-implementation VM migration

   - Userspace control of the registers responsible for identifying a
     particular CPU implementation (MIDR_EL1, REVIDR_EL1, AIDR_EL1),
     allowing VMs to be migrated cross-implementation

   - pKVM updates, including support for tracking stage-2 page table
     allocations in the protected hypervisor in the 'SecPageTable' stat

   - Fixes to vPMU, ensuring that userspace updates to the vPMU after
     KVM_RUN are reflected into the backing perf events

  LoongArch:

   - Remove unnecessary header include path

   - Assume constant PGD during VM context switch

   - Add perf events support for guest VM

  RISC-V:

   - Disable the kernel perf counter during configure

   - KVM selftests improvements for PMU

   - Fix warning at the time of KVM module removal

  x86:

   - Add support for aging of SPTEs without holding mmu_lock.

     Not taking mmu_lock allows multiple aging actions to run in
     parallel, and more importantly avoids stalling vCPUs. This includes
     an implementation of per-rmap-entry locking; aging the gfn is done
     with only a per-rmap single-bin spinlock taken, whereas locking an
     rmap for write requires taking both the per-rmap spinlock and the
     mmu_lock.

     Note that this decreases slightly the accuracy of accessed-page
     information, because changes to the SPTE outside aging might not
     use atomic operations even if they could race against a clear of
     the Accessed bit.

     This is deliberate because KVM and mm/ tolerate false
     positives/negatives for accessed information, and testing has shown
     that reducing the latency of aging is far more beneficial to
     overall system performance than providing "perfect" young/old
     information.

   - Defer runtime CPUID updates until KVM emulates a CPUID instruction,
     to coalesce updates when multiple pieces of vCPU state are
     changing, e.g. as part of a nested transition

   - Fix a variety of nested emulation bugs, and add VMX support for
     synthesizing nested VM-Exit on interception (instead of injecting
     #UD into L2)

   - Drop "support" for async page faults for protected guests that do
     not set SEND_ALWAYS (i.e. that only want async page faults at CPL3)

   - Bring a bit of sanity to x86's VM teardown code, which has
     accumulated a lot of cruft over the years. Particularly, destroy
     vCPUs before the MMU, despite the latter being a VM-wide operation

   - Add common secure TSC infrastructure for use within SNP and in the
     future TDX

   - Block KVM_CAP_SYNC_REGS if guest state is protected. It does not
     make sense to use the capability if the relevant registers are not
     available for reading or writing

   - Don't take kvm->lock when iterating over vCPUs in the suspend
     notifier to fix a largely theoretical deadlock

   - Use the vCPU's actual Xen PV clock information when starting the
     Xen timer, as the cached state in arch.hv_clock can be stale/bogus

   - Fix a bug where KVM could bleed PVCLOCK_GUEST_STOPPED across
     different PV clocks; restrict PVCLOCK_GUEST_STOPPED to kvmclock, as
     KVM's suspend notifier only accounts for kvmclock, and there's no
     evidence that the flag is actually supported by Xen guests

   - Clean up the per-vCPU "cache" of its reference pvclock, and instead
     only track the vCPU's TSC scaling (multipler+shift) metadata (which
     is moderately expensive to compute, and rarely changes for modern
     setups)

   - Don't write to the Xen hypercall page on MSR writes that are
     initiated by the host (userspace or KVM) to fix a class of bugs
     where KVM can write to guest memory at unexpected times, e.g.
     during vCPU creation if userspace has set the Xen hypercall MSR
     index to collide with an MSR that KVM emulates

   - Restrict the Xen hypercall MSR index to the unofficial synthetic
     range to reduce the set of possible collisions with MSRs that are
     emulated by KVM (collisions can still happen as KVM emulates
     Hyper-V MSRs, which also reside in the synthetic range)

   - Clean up and optimize KVM's handling of Xen MSR writes and
     xen_hvm_config

   - Update Xen TSC leaves during CPUID emulation instead of modifying
     the CPUID entries when updating PV clocks; there is no guarantee PV
     clocks will be updated between TSC frequency changes and CPUID
     emulation, and guest reads of the TSC leaves should be rare, i.e.
     are not a hot path

  x86 (Intel):

   - Fix a bug where KVM unnecessarily reads XFD_ERR from hardware and
     thus modifies the vCPU's XFD_ERR on a #NM due to CR0.TS=1

   - Pass XFD_ERR as the payload when injecting #NM, as a preparatory
     step for upcoming FRED virtualization support

   - Decouple the EPT entry RWX protection bit macros from the EPT
     Violation bits, both as a general cleanup and in anticipation of
     adding support for emulating Mode-Based Execution Control (MBEC)

   - Reject KVM_RUN if userspace manages to gain control and stuff
     invalid guest state while KVM is in the middle of emulating nested
     VM-Enter

   - Add a macro to handle KVM's sanity checks on entry/exit VMCS
     control pairs in anticipation of adding sanity checks for secondary
     exit controls (the primary field is out of bits)

  x86 (AMD):

   - Ensure the PSP driver is initialized when both the PSP and KVM
     modules are built-in (the initcall framework doesn't handle
     dependencies)

   - Use long-term pins when registering encrypted memory regions, so
     that the pages are migrated out of MIGRATE_CMA/ZONE_MOVABLE and
     don't lead to excessive fragmentation

   - Add macros and helpers for setting GHCB return/error codes

   - Add support for Idle HLT interception, which elides interception if
     the vCPU has a pending, unmasked virtual IRQ when HLT is executed

   - Fix a bug in INVPCID emulation where KVM fails to check for a
     non-canonical address

   - Don't attempt VMRUN for SEV-ES+ guests if the vCPU's VMSA is
     invalid, e.g. because the vCPU was "destroyed" via SNP's AP
     Creation hypercall

   - Reject SNP AP Creation if the requested SEV features for the vCPU
     don't match the VM's configured set of features

  Selftests:

   - Fix again the Intel PMU counters test; add a data load and do
     CLFLUSH{OPT} on the data instead of executing code. The theory is
     that modern Intel CPUs have learned new code prefetching tricks
     that bypass the PMU counters

   - Fix a flaw in the Intel PMU counters test where it asserts that an
     event is counting correctly without actually knowing what the event
     counts on the underlying hardware

   - Fix a variety of flaws, bugs, and false failures/passes
     dirty_log_test, and improve its coverage by collecting all dirty
     entries on each iteration

   - Fix a few minor bugs related to handling of stats FDs

   - Add infrastructure to make vCPU and VM stats FDs available to tests
     by default (open the FDs during VM/vCPU creation)

   - Relax an assertion on the number of HLT exits in the xAPIC IPI test
     when running on a CPU that supports AMD's Idle HLT (which elides
     interception of HLT if a virtual IRQ is pending and unmasked)"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (216 commits)
  RISC-V: KVM: Optimize comments in kvm_riscv_vcpu_isa_disable_allowed
  RISC-V: KVM: Teardown riscv specific bits after kvm_exit
  LoongArch: KVM: Register perf callbacks for guest
  LoongArch: KVM: Implement arch-specific functions for guest perf
  LoongArch: KVM: Add stub for kvm_arch_vcpu_preempted_in_kernel()
  LoongArch: KVM: Remove PGD saving during VM context switch
  LoongArch: KVM: Remove unnecessary header include path
  KVM: arm64: Tear down vGIC on failed vCPU creation
  KVM: arm64: PMU: Reload when resetting
  KVM: arm64: PMU: Reload when user modifies registers
  KVM: arm64: PMU: Fix SET_ONE_REG for vPMC regs
  KVM: arm64: PMU: Assume PMU presence in pmu-emul.c
  KVM: arm64: PMU: Set raw values from user to PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR}
  KVM: arm64: Create each pKVM hyp vcpu after its corresponding host vcpu
  KVM: arm64: Factor out pKVM hyp vcpu creation to separate function
  KVM: arm64: Initialize HCRX_EL2 traps in pKVM
  KVM: arm64: Factor out setting HCRX_EL2 traps into separate function
  KVM: x86: block KVM_CAP_SYNC_REGS if guest state is protected
  KVM: x86: Add infrastructure for secure TSC
  KVM: x86: Push down setting vcpu.arch.user_set_tsc
  ...
2025-03-25 14:22:07 -07:00
Linus Torvalds
317a76a996 Updates for the VDSO infrastructure:
- Consolidate the VDSO storage
 
     The VDSO data storage and data layout has been largely architecture
     specific for historical reasons. That increases the maintenance effort
     and causes inconsistencies over and over.
 
     There is no real technical reason for architecture specific layouts and
     implementations. The architecture specific details can easily be
     integrated into a generic layout, which also reduces the amount of
     duplicated code for managing the mappings.
 
     Convert all architectures over to a unified layout and common mapping
     infrastructure. This splits the VDSO data layout into subsystem
     specific blocks, timekeeping, random and architecture parts, which
     provides a better structure and allows to improve and update the
     functionalities without conflict and interaction.
 
   - Rework the timekeeping data storage
 
     The current implementation is designed for exposing system timekeeping
     accessors, which was good enough at the time when it was designed.
 
     PTP and Time Sensitive Networking (TSN) change that as there are
     requirements to expose independent PTP clocks, which are not related to
     system timekeeping.
 
     Replace the monolithic data storage by a structured layout, which
     allows to add support for independent PTP clocks on top while reusing
     both the data structures and the time accessor implementations.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmfgSWUTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoYGED/0f/M8YyacAyErDYW4ufW+zh2sUidSf
 GVlK0Jn5BMljOoye+y2XfTxuvvXxEDjJNYiJm2uKGPdV29tjNXreGK39XyNqXPu5
 jwR4f/IN/QVSM2nCO6jyydMz8ympJ2k6M4RewwmxXBL2KsUzzJWSKTgRNqM5Tdjs
 1RhJMjkQVTiiSYerBpHXYCeZLM7/VEfZ120uuzVAYPXo0/R6zuyF7IBgIao9hbfO
 IQeCMLLfpDQHQhwquTA8ZbWqQusiEoSYHT+kTDa3eXDDbE/2UklAUs9gaatI979x
 73zs0Yqxyx2iIGaghACWOAbKdcBWBeCYDw5fFwYVKn4VMQi1+wcxbtOYL767jp9o
 vfkLXGilXcVkvDjv4fH+e1NoJXXBxq1Ug1silKdOeJzenQF8Q1i3tavkWUVCNfwH
 qyOIM72NiCEWbYBDcz0lwBxEAyO4o0E6NP1bDc4y50VedEYIbXwSh0QGrdev1abn
 rjY9vsuUR9oznmZ6BRPPxMTY87gOSHoKvqydgSZUACEgLV9346f5qZf341OReYai
 MXUmXOM4+LdyaM1+Mec8ppvjMbLw+736NZyZtT2InusEBE+Ddp25L3hYiWnklJu8
 2uwv0AoyrwaJ8y6ADOX4thcLZq0gND0Z/Ayz/XvpeI30eftsGUCt5KOVlqwfwOkI
 4EQKvk2fAixPxg==
 =rwei
 -----END PGP SIGNATURE-----

Merge tag 'timers-vdso-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull VDSO infrastructure updates from Thomas Gleixner:

 - Consolidate the VDSO storage

   The VDSO data storage and data layout has been largely architecture
   specific for historical reasons. That increases the maintenance
   effort and causes inconsistencies over and over.

   There is no real technical reason for architecture specific layouts
   and implementations. The architecture specific details can easily be
   integrated into a generic layout, which also reduces the amount of
   duplicated code for managing the mappings.

   Convert all architectures over to a unified layout and common mapping
   infrastructure. This splits the VDSO data layout into subsystem
   specific blocks, timekeeping, random and architecture parts, which
   provides a better structure and allows to improve and update the
   functionalities without conflict and interaction.

 - Rework the timekeeping data storage

   The current implementation is designed for exposing system
   timekeeping accessors, which was good enough at the time when it was
   designed.

   PTP and Time Sensitive Networking (TSN) change that as there are
   requirements to expose independent PTP clocks, which are not related
   to system timekeeping.

   Replace the monolithic data storage by a structured layout, which
   allows to add support for independent PTP clocks on top while reusing
   both the data structures and the time accessor implementations.

* tag 'timers-vdso-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (55 commits)
  sparc/vdso: Always reject undefined references during linking
  x86/vdso: Always reject undefined references during linking
  vdso: Rework struct vdso_time_data and introduce struct vdso_clock
  vdso: Move architecture related data before basetime data
  powerpc/vdso: Prepare introduction of struct vdso_clock
  arm64/vdso: Prepare introduction of struct vdso_clock
  x86/vdso: Prepare introduction of struct vdso_clock
  time/namespace: Prepare introduction of struct vdso_clock
  vdso/namespace: Rename timens_setup_vdso_data() to reflect new vdso_clock struct
  vdso/vsyscall: Prepare introduction of struct vdso_clock
  vdso/gettimeofday: Prepare helper functions for introduction of struct vdso_clock
  vdso/gettimeofday: Prepare do_coarse_timens() for introduction of struct vdso_clock
  vdso/gettimeofday: Prepare do_coarse() for introduction of struct vdso_clock
  vdso/gettimeofday: Prepare do_hres_timens() for introduction of struct vdso_clock
  vdso/gettimeofday: Prepare do_hres() for introduction of struct vdso_clock
  vdso/gettimeofday: Prepare introduction of struct vdso_clock
  vdso/helpers: Prepare introduction of struct vdso_clock
  vdso/datapage: Define vdso_clock to prepare for multiple PTP clocks
  vdso: Make vdso_time_data cacheline aligned
  arm64: Make asm/cache.h compatible with vDSO
  ...
2025-03-25 11:30:42 -07:00
Linus Torvalds
a50b4fe095 A treewide hrtimer timer cleanup
hrtimers are initialized with hrtimer_init() and a subsequent store to
   the callback pointer. This turned out to be suboptimal for the upcoming
   Rust integration and is obviously a silly implementation to begin with.
 
   This cleanup replaces the hrtimer_init(T); T->function = cb; sequence
   with hrtimer_setup(T, cb);
 
   The conversion was done with Coccinelle and a few manual fixups.
 
   Once the conversion has completely landed in mainline, hrtimer_init()
   will be removed and the hrtimer::function becomes a private member.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmff5jQTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoVvRD/wKtuwmiA66NJFgXC0qVq82A6fO3bY8
 GBdbfysDJIbqGu5PTcULTbJ8qkqv3jeLUv6CcXvS4sZ7y/uJQl2lzf8yrD/0bbwc
 rLI6sHiPSZmK93kNVN4X5H7kvt7cE/DYC9nnEOgK3BY5FgKc4n9887d4aVBhL8Lv
 ODwVXvZ+xi351YCj7qRyPU24zt/p4tkkT1o2k4a0HBluqLI0D+V20fke9IERUL8r
 d1uWKlcn0TqYDesE8HXKIhbst3gx52rMJrXBJDHwFmG6v8Pj1fkTXCVpPo8QcBz8
 OTVkpomN9f/Tx4+GZwhZOF86LhLL3OhxD6pT7JhFCXdmSGv+Ez8uyk1YZysM/XpV
 Juy/1yAcBpDIDkmhMFGdAAn48Nn9Fotty0r4je60zSEp1d/4QMXcFme29qr2JTUE
 iWnQ/HD6DxUjVHqy7CYvvo26Xegg1C7qgyOVt4PYZwAM1VKF5P3kzYTb4SAdxtop
 Tpji1sfW9QV08jqMNo6XntD32DSP9S2HqjO9LwBw700jnx2jjJ35fcJs6iodMOUn
 gckIZLMn3L0OoglPdyA5O7SNTbKE7aFiRKdnT/cJtR3Fa39Qu27CwC5gfiyuie9I
 Q+LG8GLuYSBHXAR+PBK4GWlzJ7Dn8k3eqmbnLeKpRMsU6ZzcttgA64xhaviN2wN0
 iJbvLJeisXr3GA==
 =bYAX
 -----END PGP SIGNATURE-----

Merge tag 'timers-cleanups-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer cleanups from Thomas Gleixner:
 "A treewide hrtimer timer cleanup

  hrtimers are initialized with hrtimer_init() and a subsequent store to
  the callback pointer. This turned out to be suboptimal for the
  upcoming Rust integration and is obviously a silly implementation to
  begin with.

  This cleanup replaces the hrtimer_init(T); T->function = cb; sequence
  with hrtimer_setup(T, cb);

  The conversion was done with Coccinelle and a few manual fixups.

  Once the conversion has completely landed in mainline, hrtimer_init()
  will be removed and the hrtimer::function becomes a private member"

* tag 'timers-cleanups-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (100 commits)
  wifi: rt2x00: Switch to use hrtimer_update_function()
  io_uring: Use helper function hrtimer_update_function()
  serial: xilinx_uartps: Use helper function hrtimer_update_function()
  ASoC: fsl: imx-pcm-fiq: Switch to use hrtimer_setup()
  RDMA: Switch to use hrtimer_setup()
  virtio: mem: Switch to use hrtimer_setup()
  drm/vmwgfx: Switch to use hrtimer_setup()
  drm/xe/oa: Switch to use hrtimer_setup()
  drm/vkms: Switch to use hrtimer_setup()
  drm/msm: Switch to use hrtimer_setup()
  drm/i915/request: Switch to use hrtimer_setup()
  drm/i915/uncore: Switch to use hrtimer_setup()
  drm/i915/pmu: Switch to use hrtimer_setup()
  drm/i915/perf: Switch to use hrtimer_setup()
  drm/i915/gvt: Switch to use hrtimer_setup()
  drm/i915/huc: Switch to use hrtimer_setup()
  drm/amdgpu: Switch to use hrtimer_setup()
  stm class: heartbeat: Switch to use hrtimer_setup()
  i2c: Switch to use hrtimer_setup()
  iio: Switch to use hrtimer_setup()
  ...
2025-03-25 10:54:15 -07:00
Linus Torvalds
0f40464674 Updates for interrupt chip drivers:
- Support for hard indices on RISC-V. The hart index identifies a hart
     (core) within a specific interrupt domain in RISC-V's Priviledged
     Architecture.
 
   - Rework of the RISC-V MSI driver.
 
     This moves the driver over to the generic MSI library and solves the
     affinity problem of unmaskable PCI/MSI controllers. Unmaskable PCI/MSI
     controllers are prone to lose interrupts when the MSI message is
     updated to change the affinity because the message write consists of
     three 32-bit subsequent writes, which update address and data. As these
     writes are non-atomic versus the device raising an interrupt, the
     device can observe a half written update and issue an interrupt on the
     wrong vector. This is mitiated by a carefully orchestrated step by step
     update and the observation of an eventually pending interrupt on the
     CPU which issues the update. The algorithm follows the well established
     method of the X86 MSI driver.
 
   - A new driver for the RISC-V Sophgo SG2042 MSI controller
 
   - Overhaul of the Renesas RZQ2L driver.
 
     Simplification of the probe function by using devm_*() mechanisms,
     which avoid the endless list of error prone gotos in the failure paths.
 
   - Expand the Renesas RZV2H driver to support RZ/G3E SoCs
 
   - A workaround for Rockchip 3568002 erratum in the GIC-V3 driver to
     ensure that the addressing is limited to the lower 32-bit of the
     physical address space.
 
   - Add support for the Allwinner AS23 NMI controller
 
   - Expand the IMX irqsteer driver to handle up to 960 input interrupts
 
   - The usual small updates, cleanups and device tree changes.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmff454THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoZqoD/4kdHzbxfLpf7vC3NnG8NWwTq5FpbSx
 6grQC9hWNMAs4n2IFjJRFLrjeX3AcdAQXL/BWuM0LfW9tQDQaVmqlSIlB/bn69KB
 7HyAR6ozbOgnHKGAqFUXSLf+4pq+6q3mOgGKIF289dy14HFu4ta0DqKgkPZeQnVs
 R/J8i7REUnn+YuxzSt5eOqyDPyt2EHJosSUABSWQZBlrM9jy1W7f6NqDFwawiVsa
 +tv4U/bz91vjzVxwTIgt7nJK+b2HVYdxoZYuKJwPaTsj26ANPp6ltjRTeOmZhb5h
 uKgw+OyzDnk6q+tjGcRqrqwl291VKxCvnRiqHFfu3CERdmI9qvpN9IRcEJqIbkcN
 cakekhAyt7OO7sEPcql5vBL97e9hpb7EcH78gYxwHf8Dy0rFZUvSC5v+L6VRFnJS
 XcKA1L+f9B6u5qxnBtLan9IW08HYNdvmPq6AuVjk+ndKioPUFqB2q6AtXpuA3Rmu
 Y3XH/wh/q5wk0pgeByxQW6swsfpMN3OYK3mpLx475wFh2NKzcdGlwGhDFhiw8DKX
 m1AESy3UZatj1a0qGaFS/M+mm9KGrDYIMrje832Wf4Yf1LGmTsDkd3/V99oazSsq
 Jm4qhDASXChJXd0imQICX9hPw0aHTlLYNs54obUXVULH4HivQKIgWhUXrjG0dBDL
 +tttjuv5FJxr3A==
 =jPHa
 -----END PGP SIGNATURE-----

Merge tag 'irq-drivers-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq driver updates from Thomas Gleixner:

 - Support for hard indices on RISC-V. The hart index identifies a hart
   (core) within a specific interrupt domain in RISC-V's Priviledged
   Architecture.

 - Rework of the RISC-V MSI driver

   This moves the driver over to the generic MSI library and solves the
   affinity problem of unmaskable PCI/MSI controllers. Unmaskable
   PCI/MSI controllers are prone to lose interrupts when the MSI message
   is updated to change the affinity because the message write consists
   of three 32-bit subsequent writes, which update address and data. As
   these writes are non-atomic versus the device raising an interrupt,
   the device can observe a half written update and issue an interrupt
   on the wrong vector. This is mitiated by a carefully orchestrated
   step by step update and the observation of an eventually pending
   interrupt on the CPU which issues the update. The algorithm follows
   the well established method of the X86 MSI driver.

 - A new driver for the RISC-V Sophgo SG2042 MSI controller

 - Overhaul of the Renesas RZQ2L driver

   Simplification of the probe function by using devm_*() mechanisms,
   which avoid the endless list of error prone gotos in the failure
   paths.

 - Expand the Renesas RZV2H driver to support RZ/G3E SoCs

 - A workaround for Rockchip 3568002 erratum in the GIC-V3 driver to
   ensure that the addressing is limited to the lower 32-bit of the
   physical address space.

 - Add support for the Allwinner AS23 NMI controller

 - Expand the IMX irqsteer driver to handle up to 960 input interrupts

 - The usual small updates, cleanups and device tree changes

* tag 'irq-drivers-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
  irqchip/imx-irqsteer: Support up to 960 input interrupts
  irqchip/sunxi-nmi: Support Allwinner A523 NMI controller
  dt-bindings: irq: sun7i-nmi: Document the Allwinner A523 NMI controller
  irqchip/davinci-cp-intc: Remove public header
  irqchip/renesas-rzv2h: Add RZ/G3E support
  irqchip/renesas-rzv2h: Update macros ICU_TSSR_TSSEL_{MASK,PREP}
  irqchip/renesas-rzv2h: Update TSSR_TIEN macro
  irqchip/renesas-rzv2h: Add field_width to struct rzv2h_hw_info
  irqchip/renesas-rzv2h: Add max_tssel to struct rzv2h_hw_info
  irqchip/renesas-rzv2h: Add struct rzv2h_hw_info with t_offs variable
  irqchip/renesas-rzv2h: Use devm_pm_runtime_enable()
  irqchip/renesas-rzv2h: Use devm_reset_control_get_exclusive_deasserted()
  irqchip/renesas-rzv2h: Simplify rzv2h_icu_init()
  irqchip/renesas-rzv2h: Drop irqchip from struct rzv2h_icu_priv
  irqchip/renesas-rzv2h: Fix wrong variable usage in rzv2h_tint_set_type()
  dt-bindings: interrupt-controller: renesas,rzv2h-icu: Document RZ/G3E SoC
  riscv: sophgo: dts: Add msi controller for SG2042
  irqchip: Add the Sophgo SG2042 MSI interrupt controller
  dt-bindings: interrupt-controller: Add Sophgo SG2042 MSI
  arm64: dts: rockchip: rk356x: Move PCIe MSI to use GIC ITS instead of MBI
  ...
2025-03-25 09:54:36 -07:00
Conor Dooley
12e7fbb6a8
RISC-V: add f & d extension validation checks
Using Clement's new validation callbacks, support checking that
dependencies have been satisfied for the floating point extensions.

The check for "d" might be slightly confusingly shorter than that of "f",
despite "d" depending on "f". This is because the requirement that a
hart supporting double precision must also support single precision,
should be validated by dt-bindings etc, not the kernel but lack of
support for single precision only is a limitation of the kernel.

Tested-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250312-reptile-platinum-62ee0f444a32@spud
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-25 14:19:24 +00:00
Conor Dooley
38077ec8fc
RISC-V: add vector crypto extension validation checks
Using Clement's new validation callbacks, support checking that
dependencies have been satisfied for the vector crpyto extensions.
Currently riscv_isa_extension_available(<vector crypto>) will return
true on systems that support the extensions but vector itself has been
disabled by the kernel, adding validation callbacks will prevent such a
scenario from occuring and make the behaviour of the extension detection
functions more consistent with user expectations - it's not expected to
have to check for vector AND the specific crypto extension.

The Unpriv spec states:
| The Zvknhb and Zvbc Vector Crypto Extensions --and accordingly the
| composite extensions Zvkn, Zvknc, Zvkng, and Zvksc-- require a Zve64x
| base, or application ("V") base Vector Extension. All of the other
| Vector Crypto Extensions can be built on any embedded (Zve*) or
| application ("V") base Vector Extension.

While this could be used as the basis for checking that the correct base
for individual crypto extensions, but that's not really the kernel's job
in my opinion and it is sufficient to leave that sort of precision to
the dt-bindings. The kernel only needs to make sure that vector, in some
form, is available.

Link: https://github.com/riscv/riscv-isa-manual/blob/main/src/vector-crypto.adoc#extensions-overview
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20250312-entertain-shaking-b664142c2f99@spud
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-25 14:10:08 +00:00
Conor Dooley
9324571e9e
RISC-V: add vector extension validation checks
Using Clement's new validation callbacks, support checking that
dependencies have been satisfied for the vector extensions. From the
kernel's perfective, it's not required to differentiate between the
conditions for all the various vector subsets - it's the firmware's job
to not report impossible combinations. Instead, the kernel only has to
check that the correct config options are enabled and to enforce its
requirement of the d extension being present for FPU support.

Since vector will now be disabled proactively, there's no need to clear
the bit in elf_hwcap in riscv_fill_hwcap() any longer.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250312-eclair-affluent-55b098c3602b@spud
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-25 14:10:08 +00:00
Linus Torvalds
e34c38057a [ Merge note: this pull request depends on you having merged
two locking commits in the locking tree,
 	      part of the locking-core-2025-03-22 pull request. ]
 
 x86 CPU features support:
   - Generate the <asm/cpufeaturemasks.h> header based on build config
     (H. Peter Anvin, Xin Li)
   - x86 CPUID parsing updates and fixes (Ahmed S. Darwish)
   - Introduce the 'setcpuid=' boot parameter (Brendan Jackman)
   - Enable modifying CPU bug flags with '{clear,set}puid='
     (Brendan Jackman)
   - Utilize CPU-type for CPU matching (Pawan Gupta)
   - Warn about unmet CPU feature dependencies (Sohil Mehta)
   - Prepare for new Intel Family numbers (Sohil Mehta)
 
 Percpu code:
   - Standardize & reorganize the x86 percpu layout and
     related cleanups (Brian Gerst)
   - Convert the stackprotector canary to a regular percpu
     variable (Brian Gerst)
   - Add a percpu subsection for cache hot data (Brian Gerst)
   - Unify __pcpu_op{1,2}_N() macros to __pcpu_op_N() (Uros Bizjak)
   - Construct __percpu_seg_override from __percpu_seg (Uros Bizjak)
 
 MM:
   - Add support for broadcast TLB invalidation using AMD's INVLPGB instruction
     (Rik van Riel)
   - Rework ROX cache to avoid writable copy (Mike Rapoport)
   - PAT: restore large ROX pages after fragmentation
     (Kirill A. Shutemov, Mike Rapoport)
   - Make memremap(MEMREMAP_WB) map memory as encrypted by default
     (Kirill A. Shutemov)
   - Robustify page table initialization (Kirill A. Shutemov)
   - Fix flush_tlb_range() when used for zapping normal PMDs (Jann Horn)
   - Clear _PAGE_DIRTY for kernel mappings when we clear _PAGE_RW
     (Matthew Wilcox)
 
 KASLR:
   - x86/kaslr: Reduce KASLR entropy on most x86 systems,
     to support PCI BAR space beyond the 10TiB region
     (CONFIG_PCI_P2PDMA=y) (Balbir Singh)
 
 CPU bugs:
   - Implement FineIBT-BHI mitigation (Peter Zijlstra)
   - speculation: Simplify and make CALL_NOSPEC consistent (Pawan Gupta)
   - speculation: Add a conditional CS prefix to CALL_NOSPEC (Pawan Gupta)
   - RFDS: Exclude P-only parts from the RFDS affected list (Pawan Gupta)
 
 System calls:
   - Break up entry/common.c (Brian Gerst)
   - Move sysctls into arch/x86 (Joel Granados)
 
 Intel LAM support updates: (Maciej Wieczor-Retman)
   - selftests/lam: Move cpu_has_la57() to use cpuinfo flag
   - selftests/lam: Skip test if LAM is disabled
   - selftests/lam: Test get_user() LAM pointer handling
 
 AMD SMN access updates:
   - Add SMN offsets to exclusive region access (Mario Limonciello)
   - Add support for debugfs access to SMN registers (Mario Limonciello)
   - Have HSMP use SMN through AMD_NODE (Yazen Ghannam)
 
 Power management updates: (Patryk Wlazlyn)
   - Allow calling mwait_play_dead with an arbitrary hint
   - ACPI/processor_idle: Add FFH state handling
   - intel_idle: Provide the default enter_dead() handler
   - Eliminate mwait_play_dead_cpuid_hint()
 
 Bootup:
 
 Build system:
   - Raise the minimum GCC version to 8.1 (Brian Gerst)
   - Raise the minimum LLVM version to 15.0.0
     (Nathan Chancellor)
 
 Kconfig: (Arnd Bergmann)
   - Add cmpxchg8b support back to Geode CPUs
   - Drop 32-bit "bigsmp" machine support
   - Rework CONFIG_GENERIC_CPU compiler flags
   - Drop configuration options for early 64-bit CPUs
   - Remove CONFIG_HIGHMEM64G support
   - Drop CONFIG_SWIOTLB for PAE
   - Drop support for CONFIG_HIGHPTE
   - Document CONFIG_X86_INTEL_MID as 64-bit-only
   - Remove old STA2x11 support
   - Only allow CONFIG_EISA for 32-bit
 
 Headers:
   - Replace __ASSEMBLY__ with __ASSEMBLER__ in UAPI and non-UAPI headers
     (Thomas Huth)
 
 Assembly code & machine code patching:
   - x86/alternatives: Simplify alternative_call() interface (Josh Poimboeuf)
   - x86/alternatives: Simplify callthunk patching (Peter Zijlstra)
   - KVM: VMX: Use named operands in inline asm (Josh Poimboeuf)
   - x86/hyperv: Use named operands in inline asm (Josh Poimboeuf)
   - x86/traps: Cleanup and robustify decode_bug() (Peter Zijlstra)
   - x86/kexec: Merge x86_32 and x86_64 code using macros from <asm/asm.h>
     (Uros Bizjak)
   - Use named operands in inline asm (Uros Bizjak)
   - Improve performance by using asm_inline() for atomic locking instructions
     (Uros Bizjak)
 
 Earlyprintk:
   - Harden early_serial (Peter Zijlstra)
 
 NMI handler:
   - Add an emergency handler in nmi_desc & use it in nmi_shootdown_cpus()
     (Waiman Long)
 
 Miscellaneous fixes and cleanups:
 
   - by Ahmed S. Darwish, Andy Shevchenko, Ard Biesheuvel,
     Artem Bityutskiy, Borislav Petkov, Brendan Jackman, Brian Gerst,
     Dan Carpenter, Dr. David Alan Gilbert, H. Peter Anvin,
     Ingo Molnar, Josh Poimboeuf, Kevin Brodsky, Mike Rapoport,
     Lukas Bulwahn, Maciej Wieczor-Retman, Max Grobecker,
     Patryk Wlazlyn, Pawan Gupta, Peter Zijlstra,
     Philip Redkin, Qasim Ijaz, Rik van Riel, Thomas Gleixner,
     Thorsten Blum, Tom Lendacky, Tony Luck, Uros Bizjak,
     Vitaly Kuznetsov, Xin Li, liuye.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmfenkQRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1g1FRAAi6OFTSn/5aeLMI0IMNBxJ6ddQiFc3imd
 7+C/vU5nul4CyDs8mKyj/+f/DDrbkG9lKz3VG631Yl237lXHjD8XWcVMeC/1z/q0
 3zInDIloE9/nBHRPkF6F7fARBLBZ0LFgaBsGrCo7mwpGybiQdqGcqcxllvTbtXaw
 OHta4q6ok+lBDNlfc0v6H4cRnzhmmlKu6Ng0j6UI3V7uFhi3vtxas32ltDQtzorq
 2+jbV6/+kbrrv+xPC+jlzOFhTEKRupNPQXmvyQteoQg6G3kqAKMDvBthGXd1rHuX
 Qa+BoDIifE/2NiVeRwNrhoqYH/pHCzUzDREW5IW8+ca+4XNKuzAC6EuC8CeCzyK1
 q8ZjZjooQW4zEeVFeJYllHONzJYfxfSH5CLsnbcuhq99yfGlrQhF1qL72/Omn1w/
 DfPJM8Zt5zyKvLqUg3Md+fkVCO2wyDNhB61QPzRgHF+yD+rvuDpoqvUWir+w7cSn
 fwEDVZGXlFx6dumtSrqRaTd1nvFt80s8yP2ll09DMvGQ8D/yruS7hndGAmmJVCSW
 NAfd8pSjq5v2+ux2UR92/Cc3VF3SjaUqHBOp/Nq9rESya18ZVa3cJpHhVYYtPIVf
 THW0h07RIkGVKs1uq+5ekLCr/8uAZg58UPIqmhTuW0ttymRHCNfohR45FQZzy+0M
 tJj1oc2TIZw=
 =Dcb3
 -----END PGP SIGNATURE-----

Merge tag 'x86-core-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull core x86 updates from Ingo Molnar:
 "x86 CPU features support:
   - Generate the <asm/cpufeaturemasks.h> header based on build config
     (H. Peter Anvin, Xin Li)
   - x86 CPUID parsing updates and fixes (Ahmed S. Darwish)
   - Introduce the 'setcpuid=' boot parameter (Brendan Jackman)
   - Enable modifying CPU bug flags with '{clear,set}puid=' (Brendan
     Jackman)
   - Utilize CPU-type for CPU matching (Pawan Gupta)
   - Warn about unmet CPU feature dependencies (Sohil Mehta)
   - Prepare for new Intel Family numbers (Sohil Mehta)

  Percpu code:
   - Standardize & reorganize the x86 percpu layout and related cleanups
     (Brian Gerst)
   - Convert the stackprotector canary to a regular percpu variable
     (Brian Gerst)
   - Add a percpu subsection for cache hot data (Brian Gerst)
   - Unify __pcpu_op{1,2}_N() macros to __pcpu_op_N() (Uros Bizjak)
   - Construct __percpu_seg_override from __percpu_seg (Uros Bizjak)

  MM:
   - Add support for broadcast TLB invalidation using AMD's INVLPGB
     instruction (Rik van Riel)
   - Rework ROX cache to avoid writable copy (Mike Rapoport)
   - PAT: restore large ROX pages after fragmentation (Kirill A.
     Shutemov, Mike Rapoport)
   - Make memremap(MEMREMAP_WB) map memory as encrypted by default
     (Kirill A. Shutemov)
   - Robustify page table initialization (Kirill A. Shutemov)
   - Fix flush_tlb_range() when used for zapping normal PMDs (Jann Horn)
   - Clear _PAGE_DIRTY for kernel mappings when we clear _PAGE_RW
     (Matthew Wilcox)

  KASLR:
   - x86/kaslr: Reduce KASLR entropy on most x86 systems, to support PCI
     BAR space beyond the 10TiB region (CONFIG_PCI_P2PDMA=y) (Balbir
     Singh)

  CPU bugs:
   - Implement FineIBT-BHI mitigation (Peter Zijlstra)
   - speculation: Simplify and make CALL_NOSPEC consistent (Pawan Gupta)
   - speculation: Add a conditional CS prefix to CALL_NOSPEC (Pawan
     Gupta)
   - RFDS: Exclude P-only parts from the RFDS affected list (Pawan
     Gupta)

  System calls:
   - Break up entry/common.c (Brian Gerst)
   - Move sysctls into arch/x86 (Joel Granados)

  Intel LAM support updates: (Maciej Wieczor-Retman)
   - selftests/lam: Move cpu_has_la57() to use cpuinfo flag
   - selftests/lam: Skip test if LAM is disabled
   - selftests/lam: Test get_user() LAM pointer handling

  AMD SMN access updates:
   - Add SMN offsets to exclusive region access (Mario Limonciello)
   - Add support for debugfs access to SMN registers (Mario Limonciello)
   - Have HSMP use SMN through AMD_NODE (Yazen Ghannam)

  Power management updates: (Patryk Wlazlyn)
   - Allow calling mwait_play_dead with an arbitrary hint
   - ACPI/processor_idle: Add FFH state handling
   - intel_idle: Provide the default enter_dead() handler
   - Eliminate mwait_play_dead_cpuid_hint()

  Build system:
   - Raise the minimum GCC version to 8.1 (Brian Gerst)
   - Raise the minimum LLVM version to 15.0.0 (Nathan Chancellor)

  Kconfig: (Arnd Bergmann)
   - Add cmpxchg8b support back to Geode CPUs
   - Drop 32-bit "bigsmp" machine support
   - Rework CONFIG_GENERIC_CPU compiler flags
   - Drop configuration options for early 64-bit CPUs
   - Remove CONFIG_HIGHMEM64G support
   - Drop CONFIG_SWIOTLB for PAE
   - Drop support for CONFIG_HIGHPTE
   - Document CONFIG_X86_INTEL_MID as 64-bit-only
   - Remove old STA2x11 support
   - Only allow CONFIG_EISA for 32-bit

  Headers:
   - Replace __ASSEMBLY__ with __ASSEMBLER__ in UAPI and non-UAPI
     headers (Thomas Huth)

  Assembly code & machine code patching:
   - x86/alternatives: Simplify alternative_call() interface (Josh
     Poimboeuf)
   - x86/alternatives: Simplify callthunk patching (Peter Zijlstra)
   - KVM: VMX: Use named operands in inline asm (Josh Poimboeuf)
   - x86/hyperv: Use named operands in inline asm (Josh Poimboeuf)
   - x86/traps: Cleanup and robustify decode_bug() (Peter Zijlstra)
   - x86/kexec: Merge x86_32 and x86_64 code using macros from
     <asm/asm.h> (Uros Bizjak)
   - Use named operands in inline asm (Uros Bizjak)
   - Improve performance by using asm_inline() for atomic locking
     instructions (Uros Bizjak)

  Earlyprintk:
   - Harden early_serial (Peter Zijlstra)

  NMI handler:
   - Add an emergency handler in nmi_desc & use it in
     nmi_shootdown_cpus() (Waiman Long)

  Miscellaneous fixes and cleanups:
   - by Ahmed S. Darwish, Andy Shevchenko, Ard Biesheuvel, Artem
     Bityutskiy, Borislav Petkov, Brendan Jackman, Brian Gerst, Dan
     Carpenter, Dr. David Alan Gilbert, H. Peter Anvin, Ingo Molnar,
     Josh Poimboeuf, Kevin Brodsky, Mike Rapoport, Lukas Bulwahn, Maciej
     Wieczor-Retman, Max Grobecker, Patryk Wlazlyn, Pawan Gupta, Peter
     Zijlstra, Philip Redkin, Qasim Ijaz, Rik van Riel, Thomas Gleixner,
     Thorsten Blum, Tom Lendacky, Tony Luck, Uros Bizjak, Vitaly
     Kuznetsov, Xin Li, liuye"

* tag 'x86-core-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (211 commits)
  zstd: Increase DYNAMIC_BMI2 GCC version cutoff from 4.8 to 11.0 to work around compiler segfault
  x86/asm: Make asm export of __ref_stack_chk_guard unconditional
  x86/mm: Only do broadcast flush from reclaim if pages were unmapped
  perf/x86/intel, x86/cpu: Replace Pentium 4 model checks with VFM ones
  perf/x86/intel, x86/cpu: Simplify Intel PMU initialization
  x86/headers: Replace __ASSEMBLY__ with __ASSEMBLER__ in non-UAPI headers
  x86/headers: Replace __ASSEMBLY__ with __ASSEMBLER__ in UAPI headers
  x86/locking/atomic: Improve performance by using asm_inline() for atomic locking instructions
  x86/asm: Use asm_inline() instead of asm() in clwb()
  x86/asm: Use CLFLUSHOPT and CLWB mnemonics in <asm/special_insns.h>
  x86/hweight: Use asm_inline() instead of asm()
  x86/hweight: Use ASM_CALL_CONSTRAINT in inline asm()
  x86/hweight: Use named operands in inline asm()
  x86/stackprotector/64: Only export __ref_stack_chk_guard on CONFIG_SMP
  x86/head/64: Avoid Clang < 17 stack protector in startup code
  x86/kexec: Merge x86_32 and x86_64 code using macros from <asm/asm.h>
  x86/runtime-const: Add the RUNTIME_CONST_PTR assembly macro
  x86/cpu/intel: Limit the non-architectural constant_tsc model checks
  x86/mm/pat: Replace Intel x86_model checks with VFM ones
  x86/cpu/intel: Fix fast string initialization for extended Families
  ...
2025-03-24 22:06:11 -07:00
Linus Torvalds
2f2d529458 bitmap changes for 6.15
This includes:
  - cpumask_next_wrap() rework from me;
  - GENMASK() simplification from I Hsin;
  - rust bindings for cpumasks from Viresh and me;
  - scattered cleanups from Andy, Tamir, Vincent, Ignacio and Joel.
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmfhicUACgkQsUSA/Tof
 vsiT1Av/TFpTFPcfb0/U6zTjhphqSkhCqBN4JcT+Qh1pyFN3Q8xh7FIRjqm6PoWb
 wypQTrsOuS1UImfxj2PkHPiagDHz3LBWRJ1WCBZPF3FgZaFdOtVDObn91APaX4Jz
 K7B2eghnDLk74+eV3aBLVCPgdFPm4Og+3W2J9loWDHYNBrlgQX/3T8gZzJcIzDxk
 8jDiy84cGQweW3K6VDr7WGb/gDBTNXKByFig4+rzuW8X/VcUB1wZi1lHqTL3yBMm
 hXGsa8/VFLVKpRhZxx7PeTiXF+Wp4Tu7iyCuLVK9F9P9pY4GBZ9KV69yaeHLwlwF
 P4eA3Lj1KvtwmZYDT19lB8V0El7nZzcTHtmSgII8JEniWvuVQjjARicIqFqh6zmX
 QaLOt/gfGT/tr9nPzsFMgQxHV0ocibqWmM0gZyfEDsqIX0ynSh1fbMf52PrbBBSX
 aOaVV55HWIjHzLPzqvVee8JMaCwn4hNDrVaWItedQzZkf8aXKLk/GUWYaaEwQ8yY
 N7D3sXbT
 =Bm5k
 -----END PGP SIGNATURE-----

Merge tag 'bitmap-for-6.15' of https://github.com/norov/linux

Pull bitmap updates from Yury Norov:

 - cpumask_next_wrap() rework (me)

 - GENMASK() simplification (I Hsin)

 - rust bindings for cpumasks (Viresh and me)

 - scattered cleanups (Andy, Tamir, Vincent, Ignacio and Joel)

* tag 'bitmap-for-6.15' of https://github.com/norov/linux: (22 commits)
  cpumask: align text in comment
  riscv: fix test_and_{set,clear}_bit ordering documentation
  treewide: fix typo 'unsigned __init128' -> 'unsigned __int128'
  MAINTAINERS: add rust bindings entry for bitmap API
  rust: Add cpumask helpers
  uapi: Revert "bitops: avoid integer overflow in GENMASK(_ULL)"
  cpumask: drop cpumask_next_wrap_old()
  PCI: hv: Switch hv_compose_multi_msi_req_get_cpu() to using cpumask_next_wrap()
  scsi: lpfc: rework lpfc_next_{online,present}_cpu()
  scsi: lpfc: switch lpfc_irq_rebalance() to using cpumask_next_wrap()
  s390: switch stop_machine_yield() to using cpumask_next_wrap()
  padata: switch padata_find_next() to using cpumask_next_wrap()
  cpumask: use cpumask_next_wrap() where appropriate
  cpumask: re-introduce cpumask_next{,_and}_wrap()
  cpumask: deprecate cpumask_next_wrap()
  powerpc/xmon: simplify xmon_batch_next_cpu()
  ibmvnic: simplify ibmvnic_set_queue_affinity()
  virtio_net: simplify virtnet_set_affinity()
  objpool: rework objpool_pop()
  cpumask: add for_each_{possible,online}_cpu_wrap
  ...
2025-03-24 19:11:58 -07:00
Paolo Abeni
f491593394 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.14-rc8).

Conflict:

tools/testing/selftests/net/Makefile
  03544faad7 ("selftest: net: add proc_net_pktgen")
  3ed61b8938 ("selftests: net: test for lwtunnel dst ref loops")

tools/testing/selftests/net/config:
  85cb3711ac ("selftests: net: Add test cases for link and peer netns")
  3ed61b8938 ("selftests: net: test for lwtunnel dst ref loops")

Adjacent commits:

tools/testing/selftests/net/Makefile
  c935af429e ("selftests: net: add support for testing SO_RCVMARK and SO_RCVPRIORITY")
  355d940f4d ("Revert "selftests: Add IPv6 link-local address generation tests for GRE devices."")

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20 21:38:01 +01:00
Paolo Bonzini
361da275e5 Merge branch 'kvm-nvmx-and-vm-teardown' into HEAD
The immediate issue being fixed here is a nVMX bug where KVM fails to
detect that, after nested VM-Exit, L1 has a pending IRQ (or NMI).
However, checking for a pending interrupt accesses the legacy PIC, and
x86's kvm_arch_destroy_vm() currently frees the PIC before destroying
vCPUs, i.e. checking for IRQs during the forced nested VM-Exit results
in a NULL pointer deref; that's a prerequisite for the nVMX fix.

The remaining patches attempt to bring a bit of sanity to x86's VM
teardown code, which has accumulated a lot of cruft over the years.  E.g.
KVM currently unloads each vCPU's MMUs in a separate operation from
destroying vCPUs, all because when guest SMP support was added, KVM had a
kludgy MMU teardown flow that broke when a VM had more than one 1 vCPU.
And that oddity lived on, for 18 years...

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-03-20 13:13:00 -04:00
Paolo Bonzini
c0f99fb4e5 KVM/riscv changes for 6.15
- Disable the kernel perf counter during configure
 - KVM selftests improvements for PMU
 - Fix warning at the time of KVM module removal
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmfbsiQACgkQrUjsVaLH
 LAfGAxAAlODgZMDt+f/YlOOmQPsLohGrWsZjvheRfrf3X5PFNuXd1frpZSbF1PCs
 lk4WZrekJo2VT2tWMEBjFrH6A9slHUzjZs+xIxig1N/Gmxy1clKp/svJVG20WwQI
 a5/5jPH3TNDL3KUphwBF0QIi/Tlp6+SZ5RAag6QsBXKnFy5+o9lioR2T6tFzMoxy
 ry8VD8e395bQXn5tYkk4vsr9vBRzpgq7ZBOMg6zyNTL86UBOVq3QXPNzSE5kenXW
 4zqDmuxbk+GgpUKLXA9aXwLvasGiOr+59uToZd2lHf5ynSdSprQyW11I5ZOFs2DX
 800mLeYtyHc4nGTi6OO3VN9toMhmgijtFLNpBDg4LRvwMfB+sp62Ixsmjs3CBCd+
 GqcbJaMtdgM1SpZ3tlzEQoWpIBlNYs7BuG/6Jen9xKHHvOg7Ty9UimJeMikV+FUm
 srq0+GlgjaniZ/d8arIIm7LVqZLAc6OSzn+A8IcE0KQ9fKBgB0CZrANW+N+b+7ub
 Lkq9hA8tM5Ti0bNszqPyJu3bmhWAOEOwF0CjYL79vEhIYsegCKKtruKOppdM+sl2
 dzk+DVV9Aar+bDu/v/bYV5TfIq39U86Tqxe+OU7EPzch5NdyDvKcpZ9/JIZnf/j5
 tviM3awZfFmgTeBUTAYvaR34KExBOLQLkGb9BqAydYjg+pl9jVE=
 =HP3N
 -----END PGP SIGNATURE-----

Merge tag 'kvm-riscv-6.15-1' of https://github.com/kvm-riscv/linux into HEAD

KVM/riscv changes for 6.15

- Disable the kernel perf counter during configure
- KVM selftests improvements for PMU
- Fix warning at the time of KVM module removal
2025-03-20 12:53:34 -04:00
Alexandre Ghiti
74f4bf9d15
Merge patch series "riscv: Add runtime constant support"
Charlie Jenkins <charlie@rivosinc.com> says:

Ard brought this to my attention in this patch [1].

I benchmarked this patch on the Nezha D1 (which does not contain Zba or
Zbkb so it uses the default algorithm) by navigating through a large
directory structure. I created a 1000-deep directory structure and then
cd and ls through it. With this patch there was a 0.57% performance
improvement.

[1] https://lore.kernel.org/lkml/CAMj1kXE4DJnwFejNWQu784GvyJO=aGNrzuLjSxiowX_e7nW8QA@mail.gmail.com/

* patches from https://lore.kernel.org/r/20250319-runtime_const_riscv-v10-0-745b31a11d65@rivosinc.com:
  riscv: Add runtime constant support
  riscv: Move nop definition to insn-def.h

Link: https://lore.kernel.org/linux-riscv/20250319-runtime_const_riscv-v10-0-745b31a11d65@rivosinc.com/
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-20 09:15:04 +00:00
Charlie Jenkins
a44fb57221
riscv: Add runtime constant support
Implement the runtime constant infrastructure for riscv. Use this
infrastructure to generate constants to be used by the d_hash()
function.

This is the riscv variant of commit 94a2bc0f61 ("arm64: add 'runtime
constant' support") and commit e3c92e8171 ("runtime constants: add
x86 architecture support").

[ alex: Remove trailing whitespace ]

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250319-runtime_const_riscv-v10-2-745b31a11d65@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-20 09:15:03 +00:00
Charlie Jenkins
afa8a93932
riscv: Move nop definition to insn-def.h
We have duplicated the definition of the nop instruction in ftrace.h and
in jump_label.c. Move this definition into the generic file insn-def.h
so that they can share the definition with each other and with future
files.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250319-runtime_const_riscv-v10-1-745b31a11d65@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-20 09:14:42 +00:00
Alexandre Ghiti
d9b65824d8
Merge patch series "riscv: Unaligned access speed probing fixes and skipping"
Andrew Jones <ajones@ventanamicro.com> says:

The first six patches of this series are fixes and cleanups of the
unaligned access speed probing code. The next patch introduces a
kernel command line option that allows the probing to be skipped.
This command line option is a different approach than Jesse's [1].
[1] takes a cpu-list for a particular speed, supporting heterogeneous
platforms. With this approach, the kernel command line should only
be used for homogeneous platforms. [1] also only allowed 'fast' and
'slow' to be selected. This parameter also supports 'unsupported',
which could be useful for testing code paths gated on that. The final
patch adds the documentation.

[1] https://lore.kernel.org/linux-riscv/20240805173816.3722002-1-jesse@rivosinc.com/

* patches from https://lore.kernel.org/r/20250304120014.143628-10-ajones@ventanamicro.com:
  Documentation/kernel-parameters: Add riscv unaligned speed parameters
  riscv: Add parameter for skipping access speed tests
  riscv: Fix set up of vector cpu hotplug callback
  riscv: Fix set up of cpu hotplug callbacks
  riscv: Change check_unaligned_access_speed_all_cpus to void
  riscv: Fix check_unaligned_access_all_cpus
  riscv: Fix riscv_online_cpu_vec
  riscv: Annotate unaligned access init functions

Link: https://lore.kernel.org/r/20250304120014.143628-10-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-20 07:25:24 +00:00
Chao Du
b3f263a98d RISC-V: KVM: Optimize comments in kvm_riscv_vcpu_isa_disable_allowed
The comments for EXT_SVADE are a bit confusing. Clarify it.

Signed-off-by: Chao Du <duchao@eswincomputing.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250221025929.31678-1-duchao@eswincomputing.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2025-03-20 11:33:56 +05:30
Arnd Bergmann
519df17cb0 RISC-V Devicetrees for v6.15
Starfive:
 All changes for jh7110-based boards including the removal of a dac
 that does not exist and the addition of usb3 support on the star64 board
 and pcie on the framework mainboard.
 
 Microchip:
 Update pcie reg properties to fix a mistake originally describing them.
 Here rather than in fixes, since the driver maintains support for the
 old format.
 
 Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRh246EGq/8RLhDjO14tDGHoIJi0gUCZ9lfIwAKCRB4tDGHoIJi
 0hoWAQD4bX80fEznTmpFKy5suiljz7v2ePTkOhRFqU9yu7RS9AD/WEZkw4tNU4Y9
 IxS/Znk5i/ECLgQGsi5axHTwYkoMFwQ=
 =V0XH
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmfbM+AACgkQYKtH/8kJ
 UieHeBAA17sRn2qDBsOVjOY5SU7j1vyWwWUEt5qJh4Ft55wEt2gZ/ay1pWqMST3q
 PpQ5cQQqf+lzhAweVVrmCsXSBwDe4AzArOdDpOxfsWCq8wkiq2/yV0220VD513j/
 qH1BmzO5QNC1rw5a9wvmOzjc/YJwEbnrfAwscBiuePvA+ve/vafngTd0ElJgc6dh
 5Si9OP2OvLZzNEiR3eKWLMd1PFCNwHUiAu8Ug50K0zYWt1p2jT8y+nFYI8bZIUjZ
 pltj1XHZe6h0eItt9w1aM4PY6TygiB5bc8teHWcp2tnoBplj7C+QGVw86M4YFI5D
 neKwiH1+ek8rVLORdAtH0eBG8BvLppZJ7enlY6aFgiYbIH3SYI9AGfM3ntnZhvRi
 Z6VrfhCkB2ddq1uTBgBqRld9ZgmCPQcSoTMVwG3KXbXk01f7JBDwG6bE1OBOeJbS
 cIUHwuxN+bwSPD/QOagUf+wsJt+XYTWv3iJIj46+mSD5qz6yKjmvRAMAotydG130
 G3ftE6I79E3zDYtbNt17d2K3sTJnLjjRk4gHp6g9SLf6QhLNd8SCp1zIVfUk1WoL
 CIUpPlgGgkbtLTmGAEXNW95zVRi+ZeFFDIOcCsahS33hmOkidthd5BzkSUXKa+Zh
 lzMujCh/4JUXToHmugwpi1ZEVl84rk9RLCgPd9PhfFvhCzL1rSA=
 =YiZd
 -----END PGP SIGNATURE-----

Merge tag 'riscv-dt-for-v6.15' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into soc/dt

RISC-V Devicetrees for v6.15

Starfive:
All changes for jh7110-based boards including the removal of a dac
that does not exist and the addition of usb3 support on the star64 board
and pcie on the framework mainboard.

Microchip:
Update pcie reg properties to fix a mistake originally describing them.
Here rather than in fixes, since the driver maintains support for the
old format.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>

* tag 'riscv-dt-for-v6.15' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
  riscv: dts: starfive: jh7110-pine64-star64: enable USB 3.0 port
  riscv: dts: starfive: jh7110: pciephy0 USB 3.0 configuration registers
  riscv: dts: starfive: fml13v01: enable pcie1
  riscv: dts: starfive: remove non-existent dac from jh7110
  riscv: dts: starfive: Unify regulator naming scheme
  riscv: dts: microchip: update pcie reg properties to new format

Link: https://lore.kernel.org/r/20250318-favorite-presuming-bf2fcf55bf6a@spud
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-03-19 22:15:12 +01:00
Arnd Bergmann
43adf498e9 RISC-V Devicetrees for v6.15
Sophgo:
 Add pwm controller support for SG2042.
 Add pwm-fan & cooling maps for Milk-V Pioneer.
 Updated MAINTAINERS info for SOPHGO DEVICETREES and DRIVERS.
 
 Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEdoBX2jyDC9ZCTwZjDCzASqG0i0IFAmfX3K0ACgkQDCzASqG0
 i0Ju0AwAogVPQ/u5zG4KVlcsFNUy+u369ZTOLGCv9fWFPqv962bbCAysFEMJmfRj
 tWCUUDDo31P/iLThghSUMAi9A4YOJAHc8RFMscNKqbaRoAPJ2rAsUL67f3VE6L1F
 eXoUUSA8oWYD+MIE75ba7LNxvWQ19Fy6AV8PqRHS6w8vWFh9lW6ghKx/1Yxq4MHq
 OFRrHKcYZfF3Z8T18JtOwbVgSuLAh0U2CZZWZw0LSr7eVie/4YUCjcvQcXR+lK5h
 h0y3zvNlph1vUfz5UxoMXd9w74xjPalxwacMNkA7DnL3uTwo4a8rqY8e0xdMXztD
 vORAj3nUMNgaMVfE6D4E5UF8yEkHd3K1TAMj4mqOZ2YjSjwnzg9ZAVp3itI7TTvs
 S4kAUNwlyOzoFmDOGDAll4k7VIt8GwmvNaDtXXyEAEW6mLHt5ygLC/Cr3N5OwHDR
 k7abl2WIM2VnQRNcZIejGK0BAzAMBas6W+t4Lp6Ax1vRJcC2ArVvc/izH7pDF9NL
 hKl7Bvie
 =UoSS
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmfbLgkACgkQYKtH/8kJ
 Uic2Lw//R+Iq1JNVwlbbGc/58kjuPOmKXV6jnlHmUVty58a/vZI/nwX8Jkpo7WPw
 4oFkp6p9OiZnsMRLv+OTFCCznIJm2l8glXmQf1xFjsPnJIylqsqSXttfiTutA/oW
 pqmzOwnlzbsXjgPT/O6SxaCkZrcbsj9aSDRJyj+tBbp6+SUF164agpONqa5JQAeA
 byXY3FhTg95aoQBCUrb+x9mLbas3Ei6LcjO9KdGmPHQ8TwyvqTRYhPinuY+DrrE0
 LS/RM/+i5JG9EeCTJdnHLowty7L6qdjXHZQYPMdZASs9JDvtAmbdnKkBeD4Gvbqm
 tLk8sggtoydlbiAAZZBHC+fgo2ZyjgFg/1tuOXEZjSG8obtkLbJjdWKH/OVuxJBb
 uxL00+9wdqS9zLfq1zpRid+Ekwkj5QPcy5g/izIk89BZkZRTMRuTKRY2myePyImU
 vruMli0I8czLRhjzphEVlzFGHIA998EA4m/MYV/rath46k6GnBHCM0/QO+PSq+rt
 l/MdivC68k1QRzNFQ++PPm7D0LYew1V3g2OourXR90SU3DDgAKyfVAoSO7fBXQkP
 /eykGFKZ78Jytfnhp5AbJxstIB4yWfBaKAYsBd2LgngGMhJpD6eDQQKTwYCbBKfH
 f3V+HBAJhxnGfYqjLQfS1y99YdVYQdTHyjBNOE+MMPRsK1lT1Z0=
 =ZMLq
 -----END PGP SIGNATURE-----

Merge tag 'riscv-sophgo-dt-for-v6.15' of https://github.com/sophgo/linux into soc/dt

RISC-V Devicetrees for v6.15

Sophgo:
Add pwm controller support for SG2042.
Add pwm-fan & cooling maps for Milk-V Pioneer.
Updated MAINTAINERS info for SOPHGO DEVICETREES and DRIVERS.

Signed-off-by: Chen Wang <unicorn_wang@outlook.com>

* tag 'riscv-sophgo-dt-for-v6.15' of https://github.com/sophgo/linux:
  riscv: sophgo: dts: add cooling maps for Milk-V Pioneer
  riscv: sophgo: dts: add pwm-fan for Milk-V Pioneer
  MAINTAINERS: update info for SOPHGO DEVICETREES and DRIVERS
  riscv: sophgo: dts: add pwm controller for SG2042 SoC

Link: https://lore.kernel.org/r/PN0PR01MB10393CEC71B623E0A779E7393FEDF2@PN0PR01MB10393.INDPRD01.PROD.OUTLOOK.COM
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-03-19 21:50:17 +01:00
Russell King (Oracle)
637af286f9 riscv: dts: starfive: remove "snps,en-tx-lpi-clockgating" property
Whether the MII transmit clock can be stopped is primarily a property
of the PHY (there is a capability bit that should be checked first.)
Whether the MAC is capable of stopping the transmit clock is a separate
issue, but this is already handled by the core DesignWare MAC code.

As commit "net: stmmac: starfive: use PHY capability for TX clock stop"
adds the flag to use the PHY capability, remove the DT property that is
now unecessary.

Cc: Samin Guo <samin.guo@starfivetech.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1tsIU5-005vGR-4c@rmk-PC.armlinux.org.uk
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-19 18:06:32 +01:00
Atish Patra
2d117e67f3 RISC-V: KVM: Teardown riscv specific bits after kvm_exit
During a module removal, kvm_exit invokes arch specific disable
call which disables AIA. However, we invoke aia_exit before kvm_exit
resulting in the following warning. KVM kernel module can't be inserted
afterwards due to inconsistent state of IRQ.

[25469.031389] percpu IRQ 31 still enabled on CPU0!
[25469.031732] WARNING: CPU: 3 PID: 943 at kernel/irq/manage.c:2476 __free_percpu_irq+0xa2/0x150
[25469.031804] Modules linked in: kvm(-)
[25469.031848] CPU: 3 UID: 0 PID: 943 Comm: rmmod Not tainted 6.14.0-rc5-06947-g91c763118f47-dirty #2
[25469.031905] Hardware name: riscv-virtio,qemu (DT)
[25469.031928] epc : __free_percpu_irq+0xa2/0x150
[25469.031976]  ra : __free_percpu_irq+0xa2/0x150
[25469.032197] epc : ffffffff8007db1e ra : ffffffff8007db1e sp : ff2000000088bd50
[25469.032241]  gp : ffffffff8131cef8 tp : ff60000080b96400 t0 : ff2000000088baf8
[25469.032285]  t1 : fffffffffffffffc t2 : 5249207570637265 s0 : ff2000000088bd90
[25469.032329]  s1 : ff60000098b21080 a0 : 037d527a15eb4f00 a1 : 037d527a15eb4f00
[25469.032372]  a2 : 0000000000000023 a3 : 0000000000000001 a4 : ffffffff8122dbf8
[25469.032410]  a5 : 0000000000000fff a6 : 0000000000000000 a7 : ffffffff8122dc10
[25469.032448]  s2 : ff60000080c22eb0 s3 : 0000000200000022 s4 : 000000000000001f
[25469.032488]  s5 : ff60000080c22e00 s6 : ffffffff80c351c0 s7 : 0000000000000000
[25469.032582]  s8 : 0000000000000003 s9 : 000055556b7fb490 s10: 00007ffff0e12fa0
[25469.032621]  s11: 00007ffff0e13e9a t3 : ffffffff81354ac7 t4 : ffffffff81354ac7
[25469.032664]  t5 : ffffffff81354ac8 t6 : ffffffff81354ac7
[25469.032698] status: 0000000200000100 badaddr: ffffffff8007db1e cause: 0000000000000003
[25469.032738] [<ffffffff8007db1e>] __free_percpu_irq+0xa2/0x150
[25469.032797] [<ffffffff8007dbfc>] free_percpu_irq+0x30/0x5e
[25469.032856] [<ffffffff013a57dc>] kvm_riscv_aia_exit+0x40/0x42 [kvm]
[25469.033947] [<ffffffff013b4e82>] cleanup_module+0x10/0x32 [kvm]
[25469.035300] [<ffffffff8009b150>] __riscv_sys_delete_module+0x18e/0x1fc
[25469.035374] [<ffffffff8000c1ca>] syscall_handler+0x3a/0x46
[25469.035456] [<ffffffff809ec9a4>] do_trap_ecall_u+0x72/0x134
[25469.035536] [<ffffffff809f5e18>] handle_exception+0x148/0x156

Invoke aia_exit and other arch specific cleanup functions after kvm_exit
so that disable gets a chance to be called first before exit.

Fixes: 54e43320c2 ("RISC-V: KVM: Initial skeletal support for AIA")
Fixes: eded6754f3 ("riscv: KVM: add basic support for host vs guest profiling")
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20250317-kvm_exit_fix-v1-1-aa5240c5dbd2@rivosinc.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2025-03-19 20:21:38 +05:30
Andrew Jones
aecb09e091
riscv: Add parameter for skipping access speed tests
Allow skipping scalar and vector unaligned access speed tests. This
is useful for testing alternative code paths and to skip the tests in
environments where they run too slowly. All CPUs must have the same
unaligned access speed.

The code movement is because we now need the scalar cpu hotplug
callback to always run, so we need to bring it and its supporting
functions out of CONFIG_RISCV_PROBE_UNALIGNED_ACCESS.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250304120014.143628-17-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19 14:23:30 +00:00
Andrew Jones
2744ec472d
riscv: Fix set up of vector cpu hotplug callback
Whether or not we have RISCV_PROBE_VECTOR_UNALIGNED_ACCESS we need to
set up a cpu hotplug callback to check if we have vector at all,
since, when we don't have vector, we need to set
vector_misaligned_access to unsupported rather than leave it the
default of unknown.

Fixes: e7c9d66e31 ("RISC-V: Report vector unaligned access speed hwprobe")
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250304120014.143628-16-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19 14:23:30 +00:00
Andrew Jones
05ee21f0fc
riscv: Fix set up of cpu hotplug callbacks
CPU hotplug callbacks should be set up even if we detected all
current cpus emulate misaligned accesses, since we want to
ensure our expectations of all cpus emulating is maintained.

Fixes: 6e5ce7f2ea ("riscv: Decouple emulated unaligned accesses from access speed")
Fixes: e7c9d66e31 ("RISC-V: Report vector unaligned access speed hwprobe")
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250304120014.143628-15-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19 14:23:30 +00:00
Andrew Jones
813d39baee
riscv: Change check_unaligned_access_speed_all_cpus to void
The return value of check_unaligned_access_speed_all_cpus() is always
zero, so make the function void so we don't need to concern ourselves
with it. The change also allows us to tidy up
check_unaligned_access_all_cpus() a bit.

Reviewed-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250304120014.143628-14-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19 14:23:29 +00:00
Andrew Jones
e6d0adf2eb
riscv: Fix check_unaligned_access_all_cpus
check_vector_unaligned_access_emulated_all_cpus(), like its name
suggests, will return true when all cpus emulate unaligned vector
accesses. If the function returned false it may have been because
vector isn't supported at all (!has_vector()) or because at least
one cpu doesn't emulate unaligned vector accesses. Since false may
be returned for two cases, checking for it isn't sufficient when
attempting to determine if we should proceed with the vector speed
check. Move the !has_vector() functionality to
check_unaligned_access_all_cpus() in order for
check_vector_unaligned_access_emulated_all_cpus() to return false
for a single case.

Fixes: e7c9d66e31 ("RISC-V: Report vector unaligned access speed hwprobe")
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250304120014.143628-13-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19 14:23:29 +00:00
Andrew Jones
5af72a8186
riscv: Fix riscv_online_cpu_vec
We shouldn't probe when we already know vector is unsupported and
we should probe when we see we don't yet know whether it's supported.
Furthermore, we should ensure we've set the access type to
unsupported when we don't have vector at all.

Fixes: e7c9d66e31 ("RISC-V: Report vector unaligned access speed hwprobe")
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250304120014.143628-12-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19 14:23:29 +00:00
Andrew Jones
a00e022be5
riscv: Annotate unaligned access init functions
Several functions used in unaligned access probing are only run at
init time. Annotate them appropriately.

Fixes: f413aae96c ("riscv: Set unaligned access speed at compile time")
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250304120014.143628-11-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19 14:23:28 +00:00
Clément Léger
2d79608cf6
RISC-V: KVM: Allow Zaamo/Zalrsc extensions for Guest/VM
Extend the KVM ISA extension ONE_REG interface to allow KVM user space
to detect and enable Zaamo/Zalrsc extensions for Guest/VM.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20240619153913.867263-5-cleger@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19 12:03:46 +00:00
Clément Léger
9d45d1ff90
riscv: hwprobe: export Zaamo and Zalrsc extensions
Export the Zaamo and Zalrsc extensions to userspace using hwprobe.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240619153913.867263-4-cleger@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19 12:03:45 +00:00
Clément Léger
35173b666a
riscv: add parsing for Zaamo and Zalrsc extensions
These 2 new extensions are actually a subset of the A extension which
provides atomic memory operations and load-reserved/store-conditional
instructions.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240619153913.867263-3-cleger@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19 12:03:45 +00:00
Pu Lehui
67a5ba8f74
riscv: fgraph: Fix stack layout to match __arch_ftrace_regs argument of ftrace_return_to_handler
Naresh Kamboju reported a "Bad frame pointer" kernel warning while
running LTP trace ftrace_stress_test.sh in riscv. We can reproduce the
same issue with the following command:

```
$ cd /sys/kernel/debug/tracing
$ echo 'f:myprobe do_nanosleep%return args1=$retval' > dynamic_events
$ echo 1 > events/fprobes/enable
$ echo 1 > tracing_on
$ sleep 1
```

And we can get the following kernel warning:

[  127.692888] ------------[ cut here ]------------
[  127.693755] Bad frame pointer: expected ff2000000065be50, received ba34c141e9594000
[  127.693755]   from func do_nanosleep return to ffffffff800ccb16
[  127.698699] WARNING: CPU: 1 PID: 129 at kernel/trace/fgraph.c:755 ftrace_return_to_handler+0x1b2/0x1be
[  127.699894] Modules linked in:
[  127.700908] CPU: 1 UID: 0 PID: 129 Comm: sleep Not tainted 6.14.0-rc3-g0ab191c74642 #32
[  127.701453] Hardware name: riscv-virtio,qemu (DT)
[  127.701859] epc : ftrace_return_to_handler+0x1b2/0x1be
[  127.702032]  ra : ftrace_return_to_handler+0x1b2/0x1be
[  127.702151] epc : ffffffff8013b5e0 ra : ffffffff8013b5e0 sp : ff2000000065bd10
[  127.702221]  gp : ffffffff819c12f8 tp : ff60000080853100 t0 : 6e00000000000000
[  127.702284]  t1 : 0000000000000020 t2 : 6e7566206d6f7266 s0 : ff2000000065bd80
[  127.702346]  s1 : ff60000081262000 a0 : 000000000000007b a1 : ffffffff81894f20
[  127.702408]  a2 : 0000000000000010 a3 : fffffffffffffffe a4 : 0000000000000000
[  127.702470]  a5 : 0000000000000000 a6 : 0000000000000008 a7 : 0000000000000038
[  127.702530]  s2 : ba34c141e9594000 s3 : 0000000000000000 s4 : ff2000000065bdd0
[  127.702591]  s5 : 00007fff8adcf400 s6 : 000055556dc1d8c0 s7 : 0000000000000068
[  127.702651]  s8 : 00007fff8adf5d10 s9 : 000000000000006d s10: 0000000000000001
[  127.702710]  s11: 00005555737377c8 t3 : ffffffff819d899e t4 : ffffffff819d899e
[  127.702769]  t5 : ffffffff819d89a0 t6 : ff2000000065bb18
[  127.702826] status: 0000000200000120 badaddr: 0000000000000000 cause: 0000000000000003
[  127.703292] [<ffffffff8013b5e0>] ftrace_return_to_handler+0x1b2/0x1be
[  127.703760] [<ffffffff80017bce>] return_to_handler+0x16/0x26
[  127.704009] [<ffffffff80017bb8>] return_to_handler+0x0/0x26
[  127.704057] [<ffffffff800d3352>] common_nsleep+0x42/0x54
[  127.704117] [<ffffffff800d44a2>] __riscv_sys_clock_nanosleep+0xba/0x10a
[  127.704176] [<ffffffff80901c56>] do_trap_ecall_u+0x188/0x218
[  127.704295] [<ffffffff8090cc3e>] handle_exception+0x14a/0x156
[  127.705436] ---[ end trace 0000000000000000 ]---

The reason is that the stack layout for constructing argument for the
ftrace_return_to_handler in the return_to_handler does not match the
__arch_ftrace_regs structure of riscv, leading to unexpected results.

Fixes: a3ed4157b7 ("fgraph: Replace fgraph_ret_regs with ftrace_regs")
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Closes: https://lore.kernel.org/all/CA+G9fYvp_oAxeDFj88Tk2rfEZ7jtYKAKSwfYS66=57Db9TBdyA@mail.gmail.com
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20250317031214.4138436-2-pulehui@huaweicloud.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19 12:03:25 +00:00
Ingo Molnar
89771319e0 Linux 6.14-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmfXVtUeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGN/sH/i5423Gt/z51gDjA
 s4v5Z7GaBJ9zOGBahn2RWFe72zytTqKrEJmMnGfguirs0atD1DtQj4WAP7iFKP+e
 WyO663X6HF7i5y37ja0Yd4PZc31hwtqzKH8LjBf8f8tTy8UsEVqumdi5A4sS9KTM
 qm4kTyyVEY9D/s7oRY8ywjDlRJtO6nT0aKMp4kAqNEbrNUYbilT/a0hgXcgSmPyB
 uIjmjL2fZfutxGI5LgfbaSHCa1ElmhvTvivOMpaAmZSGCRVHCKGgT0CTNnHyn/7C
 dB145JkRO4ZOUqirCdO4PE/23id3ajq9fcixJGBzAv7c45y+B3JZ1r2kAfKalE8/
 qrOKLys=
 =8r7a
 -----END PGP SIGNATURE-----

Merge tag 'v6.14-rc7' into x86/core, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-03-19 11:03:06 +01:00
Pu Lehui
e8eb8e1bda
riscv: fgraph: Select HAVE_FUNCTION_GRAPH_TRACER depends on HAVE_DYNAMIC_FTRACE_WITH_ARGS
Currently, fgraph on riscv relies on the infrastructure of
DYNAMIC_FTRACE_WITH_ARGS. However, DYNAMIC_FTRACE_WITH_ARGS may be
turned off on riscv, which will cause the enabled fgraph to be abnormal.
Therefore, let's select HAVE_FUNCTION_GRAPH_TRACER depends on
HAVE_DYNAMIC_FTRACE_WITH_ARGS.

Fixes: a3ed4157b7 ("fgraph: Replace fgraph_ret_regs with ftrace_regs")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202503160820.dvqMpH0g-lkp@intel.com/
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20250317031214.4138436-1-pulehui@huaweicloud.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 14:11:45 +00:00
Alexandre Ghiti
33981b1c4e
riscv: Fix missing __free_pages() in check_vector_unaligned_access()
The locally allocated pages are never freed up, so add the corresponding
__free_pages().

Fixes: e7c9d66e31 ("RISC-V: Report vector unaligned access speed hwprobe")
Link: https://lore.kernel.org/r/20250228090613.345309-1-alexghiti@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 13:37:56 +00:00
Tingbo Liao
475afa39b1
riscv: Fix the __riscv_copy_vec_words_unaligned implementation
Correct the VEC_S macro definition to fix the implementation
of vector words copy in the case of unalignment in RISC-V.

Fixes: e7c9d66e31 ("RISC-V: Report vector unaligned access speed hwprobe")
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Tingbo Liao <tingbo.liao@starfivetech.com>
Link: https://lore.kernel.org/r/20250228090801.8334-1-tingbo.liao@starfivetech.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 13:37:27 +00:00
Thomas Weißschuh
eb8db421ce
riscv: mm: Don't use %pK through printk
Restricted pointers ("%pK") are not meant to be used through printk().
It can unintentionally expose security sensitive, raw pointer values.

Use regular pointer formatting instead.

Link: https://lore.kernel.org/lkml/20250113171731-dc10e3c1-da64-4af0-b767-7c7070468023@linutronix.de/
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250217-restricted-pointers-riscv-v1-1-72a078076a76@linutronix.de
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 13:35:07 +00:00
Zixian Zeng
eac5b13881
riscv: remove redundant CMDLINE_FORCE check
Drop redundant CMDLINE_FORCE check as it's already done in
function early_init_dt_scan_chosen().

Signed-off-by: Zixian Zeng <sycamoremoon376@gmail.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250114-rebund-v1-1-5632b2d54d6c@gmail.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 13:33:31 +00:00
Juhan Jin
5f1a58ed91
riscv: ftrace: Add parentheses in macro definitions of make_call_t0 and make_call_ra
This patch adds parentheses to parameters caller and callee of macros
make_call_t0 and make_call_ra. Every existing invocation of these two
macros uses a single variable for each argument, so the absence of the
parentheses seems okay. However, future invocations might use more
complex expressions as arguments. For example, a future invocation might
look like this: make_call_t0(a - b, c, call). Without parentheses in the
macro definition, the macro invocation expands to:

...
unsigned int offset = (unsigned long) c - (unsigned long) a - b;
...

which is clearly wrong.

The use of parentheses ensures arguments are correctly evaluated and
potentially saves future users of make_call_t0 and make_call_ra debugging
trouble.

Fixes: 6724a76cff ("riscv: ftrace: Reduce the detour code size to half")
Signed-off-by: Juhan Jin <juhan.jin@foxmail.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/tencent_AE90AA59903A628E87E9F80E563DA5BA5508@qq.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 13:32:55 +00:00
Masahiro Yamada
82e81b8950
riscv: migrate to the generic rule for built-in DTB
Commit 654102df2a ("kbuild: add generic support for built-in boot
DTBs") introduced generic support for built-in DTBs.

Select GENERIC_BUILTIN_DTB when built-in DTB support is enabled.

To keep consistency across architectures, this commit also renames
CONFIG_BUILTIN_DTB_SOURCE to CONFIG_BUILTIN_DTB_NAME.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20241222000836.2578171-1-masahiroy@kernel.org
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 13:30:13 +00:00
Charlie Jenkins
bba547810c
riscv: tracing: Fix __write_overflow_field in ftrace_partial_regs()
The size of &regs->a0 is unknown, causing the error:

../include/linux/fortify-string.h:571:25: warning: call to
'__write_overflow_field' declared with attribute warning: detected write
beyond size of field (1st parameter); maybe use struct_group()?
[-Wattribute-warning]

Fix this by wrapping the required registers in pt_regs with
struct_group() and reference the group when doing the offending
memcpy().

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250224-fix_ftrace_partial_regs-v1-1-54b906417e86@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 13:06:05 +00:00
Geert Uytterhoeven
72770690e0
riscv: Remove duplicate CLINT_TIMER selections
Since commit f862bbf4cd ("riscv: Allow NOMMU kernels to run in
S-mode") in v6.10, CLINT_TIMER is selected by the main RISCV symbol when
RISCV_M_MODE is enabled.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/ce55529a42fa232cacd580e38866c60701f91095.1738764474.git.geert+renesas@glider.be
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 13:05:13 +00:00
Geert Uytterhoeven
c2db83fe10
riscv: defconfig: Disable Renesas SoC support
Follow-up to commit e36ddf3226 ("riscv: defconfig: Disable RZ/Five
peripheral support") in v6.12-rc1:

  - Disable ARCH_RENESAS, too, as currently RZ/Five is the sole Renesas
    RISC-V SoC,
  - Drop no longer needed explicit disable of USB_XHCI_RCAR, which
    depends on ARCH_RENESAS.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/e8a2fb273c8c68bd6d526b924b4212f397195b28.1738764211.git.geert+renesas@glider.be
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 13:04:25 +00:00
Chin Yik Ming
418af0eafb
riscv: Fix a comment typo in set_mm_asid()
s/verion/version

Signed-off-by: Chin Yik Ming <yikming2222@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20241114212725.4172401-1-yikming2222@gmail.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 13:02:47 +00:00
Alexandre Ghiti
a5edc510da
Merge patch series "Support SSTC while PM operations"
Nick Hu <nick.hu@sifive.com> says:

When the cpu is going to be hotplug, stop the stimecmp to prevent pending
interrupt.
When the cpu is going to be suspended, save the stimecmp before entering
the suspend state and restore it in the resume path.

* patches from https://lore.kernel.org/r/20250219114135.27764-1-nick.hu@sifive.com:
  clocksource/drivers/timer-riscv: Stop stimecmp when cpu hotplug
  riscv: Add stimecmp save and restore

Link: https://lore.kernel.org/r/20250219114135.27764-1-nick.hu@sifive.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 12:59:08 +00:00
Nick Hu
ffef54ad41
riscv: Add stimecmp save and restore
If the HW support the SSTC extension, we should save and restore the
stimecmp register while cpu non retention suspend.

Signed-off-by: Nick Hu <nick.hu@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20250219114135.27764-2-nick.hu@sifive.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 12:59:03 +00:00
Chin Yik Ming
7f238b1266
riscv: Simplify base extension checks and direct boolean return
Reduce three lines checking to single line using a ternary conditional
expression for getting the base extension word. In addition, the
test_bit macro function already return a boolean which matches the
return type of the caller, so directly return the result of the test_bit
macro function.

Signed-off-by: Chin Yik Ming <yikming2222@gmail.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250129203843.1136838-1-yikming2222@gmail.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 12:47:20 +00:00
Jinjie Ruan
a4a58f510b
riscv: Remove unused TASK_TI_FLAGS
Since commit f0bddf5058 ("riscv: entry: Convert to generic
entry"), TASK_TI_FLAGS is not used any more, so remove it.

Fixes: f0bddf5058 ("riscv: entry: Convert to generic entry")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20241109014605.2801492-1-ruanjinjie@huawei.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 12:46:15 +00:00
Yunhui Cui
eb10039709
RISC-V: hwprobe: Expose Zicbom extension and its block size
Expose Zicbom through hwprobe and also provide a key to extract its
respective block size.

[ alex: Fix merge conflicts and hwprobe numbering ]

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Link: https://lore.kernel.org/r/20250226063206.71216-3-cuiyunhui@bytedance.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 12:43:56 +00:00
Yunhui Cui
de70b532f9
RISC-V: Enable cbo.clean/flush in usermode
Enabling cbo.clean and cbo.flush in user mode makes it more
convenient to manage the cache state and achieve better performance.

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Link: https://lore.kernel.org/r/20250226063206.71216-2-cuiyunhui@bytedance.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 12:35:14 +00:00
Alexandre Ghiti
2f2cd9f334
Merge patch series "riscv: Add bfloat16 instruction support"
Inochi Amaoto <inochiama@gmail.com> says:

Add description for the BFloat16 precision Floating-Point ISA extension,
(Zfbfmin, Zvfbfmin, Zvfbfwma). which was ratified in commit 4dc23d62
("Added Chapter title to BF16") of the riscv-isa-manual.

* patches from https://lore.kernel.org/r/20250213003849.147358-1-inochiama@gmail.com:
  riscv: hwprobe: export bfloat16 ISA extension
  riscv: add ISA extension parsing for bfloat16 ISA extension
  dt-bindings: riscv: add bfloat16 ISA extension description

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250213003849.147358-1-inochiama@gmail.com
2025-03-18 11:52:54 +00:00
Inochi Amaoto
a4863e002c
riscv: hwprobe: export bfloat16 ISA extension
Export Zfbmin, Zvfbfmin, Zvfbfwma ISA extension through hwprobe.

Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Link: https://lore.kernel.org/r/20250213003849.147358-4-inochiama@gmail.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 11:52:02 +00:00
Inochi Amaoto
e186c28dda
riscv: add ISA extension parsing for bfloat16 ISA extension
Add parsing for Zfbmin, Zvfbfmin, Zvfbfwma ISA extension which
were ratified in 4dc23d62 ("Added Chapter title to BF16") of
the riscv-isa-manual.

Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Link: https://lore.kernel.org/r/20250213003849.147358-3-inochiama@gmail.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 11:52:02 +00:00
Guo Ren
d9708b1931
riscv: Implement smp_cond_load8/16() with Zawrs
RISC-V code uses the queued spinlock implementation, which calls
the macros smp_cond_load_acquire for one byte. So, complement the
implementation of byte and halfword versions.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Cc: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20241217013910.1039923-1-guoren@kernel.org
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 09:12:45 +00:00
Alexandre Ghiti
d9be2b9b60
riscv: Call secondary mmu notifier when flushing the tlb
This is required to allow the IOMMU driver to correctly flush its own
TLB.

Reviewed-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20250113142424.30487-1-alexghiti@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 09:11:43 +00:00
Miquel Sabaté Solà
4458b8f68d
riscv: hwprobe: export Zicntr and Zihpm extensions
Export Zicntr and Zihpm ISA extensions through the hwprobe syscall.

[ alex: Fix hwprobe numbering ]

Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com>
Acked-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240913051324.8176-1-mikisabate@gmail.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 09:10:22 +00:00
Clément Léger
d3817d091f
riscv: remove useless pc check in stacktrace handling
Checking for pc to be a kernel text address at this location is useless
since pc == handle_exception. Remove this check.

[ alex: Fix merge conflict ]

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240830084934.3690037-1-cleger@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 09:06:21 +00:00
Andrew Bresticker
03dc00a2b6
riscv: Support huge pfnmaps
Use RSW0 as the special bit for pmds and puds, just like for ptes.
Also define the {pte,pmd,pud}_pgprot helpers which were previously
missing and are needed for the follow_pfnmap APIs.

Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250108135700.2614848-1-abrestic@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 08:58:42 +00:00
Alexandre Ghiti
8df0cdcc21
Merge patch series "RISC-V: clarify what some RISCV_ISA* config options do & redo Zbb toolchain dependency"
Conor Dooley <conor@kernel.org> says:

Since one depends on the other, albeit trivially, here's a v4 of the Zbb
toolchain dep removal alongside the rewording of Kconfig options I'd
sent out before the merge window. I think I like this implementation
better than v1, but I couldn't think of a good name for a "public"
version of __ALTERNATIVE(), so I used it here directly.
Unfortunately "ALTERNATIVE_2_CFG" already exists and I couldn't think of
a good way to name an alternative macro that allows for several config
options that didn't make the distinction sufficiently clear.. Yell
if you have better suggestions than I did.

I am a wee bit "worried" that this makes the Kconfig option confusing as
it isn't immediately obvious if someone is or is not going to get the
toolchain based optimisations.

Cheers,
Conor.

* patches from https://lore.kernel.org/r/20241024-aspire-rectify-9982da6943e5@spud:
  RISC-V: separate Zbb optimisations requiring and not requiring toolchain support
  RISC-V: clarify what some RISCV_ISA* config options do

Link: https://lore.kernel.org/r/20241024-aspire-rectify-9982da6943e5@spud
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 08:53:10 +00:00
Conor Dooley
9343aaba1f
RISC-V: separate Zbb optimisations requiring and not requiring toolchain support
It seems a bit ridiculous to require toolchain support for BPF to
assemble Zbb instructions, so move the dependency on toolchain support
for Zbb optimisations out of the Kconfig option and to the callsites.

Zbb support has always depended on alternatives, so while adjusting the
config options guarding optimisations, remove any checks for
whether or not alternatives are enabled.

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20241024-chump-freebase-d26b6d81af33@spud
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 08:53:02 +00:00
Conor Dooley
6216182fb7
RISC-V: clarify what some RISCV_ISA* config options do
During some discussion on IRC yesterday and on Pu's bpf patch [1]
I noticed that these RISCV_ISA* Kconfig options are not really clear
about their implications. Many of these options have no impact on what
userspace is allowed to do, for example an application can use Zbb
regardless of whether or not the kernel does. Change the help text to
try and clarify whether or not an option affects just the kernel, or
also userspace. None of these options actually control whether or not an
extension is detected dynamically as that's done regardless of Kconfig
options, so drop any text that implies the option is required for
dynamic detection, rewording them as "do x when y is detected".

Link: https://lore.kernel.org/linux-riscv/20240328-ferocity-repose-c554f75a676c@spud/ [1]
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20241024-overdue-slogan-0b0f69d3da91@spud
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 08:53:02 +00:00
Mike Rapoport (Microsoft)
8afa901c14 arch, mm: make releasing of memory to page allocator more explicit
The point where the memory is released from memblock to the buddy
allocator is hidden inside arch-specific mem_init()s and the call to
memblock_free_all() is needlessly duplicated in every artiste cure and
after introduction of arch_mm_preinit() hook, mem_init() implementation on
many architecture only contains the call to memblock_free_all().

Pull memblock_free_all() call into mm_core_init() and drop mem_init() on
relevant architectures to make it more explicit where the free memory is
released from memblock to the buddy allocator and to reduce code
duplication in architecture specific code.

Link: https://lkml.kernel.org/r/20250313135003.836600-14-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>	[x86]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
Tested-by: Mark Brown <broonie@kernel.org>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Betkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Guo Ren (csky) <guoren@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russel King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17 22:06:53 -07:00
Mike Rapoport (Microsoft)
0d98484ee3 arch, mm: introduce arch_mm_preinit
Currently, implementation of mem_init() in every architecture consists of
one or more of the following:

* initializations that must run before page allocator is active, for
  instance swiotlb_init()
* a call to memblock_free_all() to release all the memory to the buddy
  allocator
* initializations that must run after page allocator is ready and there is
  no arch-specific hook other than mem_init() for that, like for example
  register_page_bootmem_info() in x86 and sparc64 or simple setting of
  mem_init_done = 1 in several architectures
* a bunch of semi-related stuff that apparently had no better place to
  live, for example a ton of BUILD_BUG_ON()s in parisc.

Introduce arch_mm_preinit() that will be the first thing called from
mm_core_init(). On architectures that have initializations that must happen
before the page allocator is ready, move those into arch_mm_preinit() along
with the code that does not depend on ordering with page allocator setup.

On several architectures this results in reduction of mem_init() to a
single call to memblock_free_all() that allows its consolidation next.

Link: https://lkml.kernel.org/r/20250313135003.836600-13-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>	[x86]
Tested-by: Mark Brown <broonie@kernel.org>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Betkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Guo Ren (csky) <guoren@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russel King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17 22:06:53 -07:00
Mike Rapoport (Microsoft)
e120d1bc12 arch, mm: set high_memory in free_area_init()
high_memory defines upper bound on the directly mapped memory.  This bound
is defined by the beginning of ZONE_HIGHMEM when a system has high memory
and by the end of memory otherwise.

All this is known to generic memory management initialization code that
can set high_memory while initializing core mm structures.

Add a generic calculation of high_memory to free_area_init() and remove
per-architecture calculation except for the architectures that set and use
high_memory earlier than that.

Link: https://lkml.kernel.org/r/20250313135003.836600-11-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>	[x86]
Tested-by: Mark Brown <broonie@kernel.org>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Betkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Guo Ren (csky) <guoren@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russel King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17 22:06:52 -07:00
Mike Rapoport (Microsoft)
8268af309d arch, mm: set max_mapnr when allocating memory map for FLATMEM
max_mapnr is essentially the size of the memory map for systems that use
FLATMEM. There is no reason to calculate it in each and every architecture
when it's anyway calculated in alloc_node_mem_map().

Drop setting of max_mapnr from architecture code and set it once in
alloc_node_mem_map().

While on it, move definition of mem_map and max_mapnr to mm/mm_init.c so
there won't be two copies for MMU and !MMU variants.

Link: https://lkml.kernel.org/r/20250313135003.836600-10-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>	[x86]
Tested-by: Mark Brown <broonie@kernel.org>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Betkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Guo Ren (csky) <guoren@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russel King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17 22:06:52 -07:00
Linus Torvalds
fc444ada13 soc: fixes for 6.14, part 2
The majority of the last fixes are for devicetree files. This address two
 important regressions for the Qualcomm SMMU and the Raspberry Pi 4 USB
 controller, as well as a larger number of patches fixing minor mistakes
 in board specific files for Rockchips, i.MX, starfive and broadcom.
 
 The non-DT changes are
 
  - A fix for an old boot regression on Renesas shmobile chips
 
  - Another boot time regression for for the Qualcomm PDR SoC driver,
    among a few other Qualcomm firmware driver fixes for efivars
    and tzmem.
 
  - Minor Kconfig fixes for davinci and OMAP1
 
  - Minor code fixes for sparx5 reset controllers, OMAP memory controller,
    i.MX SCU, cpufreq and SoC drivers and a Hisilicon SoC driver.
 
  - One more update to the Asahi maintainers, adding Neal Gompa as a
    reviewer
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmfYYesACgkQYKtH/8kJ
 UicnLxAA2rhqxXizpfBS1JBi3V1koh2JCbbsFfWPlYbgs6vFKfodvzpbz8nggWwK
 lM+nIeF7PgxwDzip9mu07EdJgFwrQN9SFHG1lkXKW+d+e5o8ZKJzVMF7Faxedqju
 vIWjpIiF3BlbLYOX+HkEL+fCmqmD7ehAWE4GJbIde3MdjEbzjJoAZqElh5BbPesq
 PMnH0eRQtHRKlLhrb5BB512G7QGwFPB2GXJkwfY1egUpFQqwKsY45F9e3XUzm8B0
 JlO45qcLJclHTA4ravLrZdXG38nF6DvwaHdgfyymu2QO+Hr6FHoQcjvS1ZkxByP1
 EXuPurnC17UhM7SNGCB7iqx4V9jsVyh20gR9G36vMxq3igSppShaniaADlfYBXOn
 To5ojVNHFWUJIFbFURPsLDXdo8CHjZDu+rB9p8eOeNl+A65CbOpa87L0K94JxrON
 Tm35vVLNKPpMuAcYiT77J2v0hKnHRsnyKj3ofA7mnZ2SoiMGhet+ahmlh2qvNfpY
 v32SEF31kpMsUDcZrX9Ex/ncwqba1M1VnnqZ9KZEJqvG6qZUo1M6338U7BGADV1t
 O7XVTgiOycJExxMVtwmLEoMriQRqjUiQ5iPDtnfkvWn1OsMzxXBC2XjF0zr2U7ot
 KpjydXdtrv8HrknsyvKZnJpn+0opbHq7yutgzjx1YW8i+cJPq/w=
 =9pDv
 -----END PGP SIGNATURE-----

Merge tag 'soc-fixes-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC fixes from Arnd Bergmann:
 "The majority of these last fixes are for devicetree files.

  These address two important regressions for the Qualcomm SMMU and the
  Raspberry Pi 4 USB controller, as well as a larger number of patches
  fixing minor mistakes in board specific files for Rockchips, i.MX,
  starfive and broadcom.

  The non-DT changes are

   - A fix for an old boot regression on Renesas shmobile chips

   - Another boot time regression for for the Qualcomm PDR SoC driver,
     among a few other Qualcomm firmware driver fixes for efivars and
     tzmem

   - Minor Kconfig fixes for davinci and OMAP1

   - Minor code fixes for sparx5 reset controllers, OMAP memory
     controller, i.MX SCU, cpufreq and SoC drivers and a Hisilicon SoC
     driver

   - One more update to the Asahi maintainers, adding Neal Gompa as a
     reviewer"

* tag 'soc-fixes-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (35 commits)
  ARM: davinci: da850: fix selecting ARCH_DAVINCI_DA8XX
  soc: hisilicon: kunpeng_hccs: Fix incorrect string assembly
  memory: omap-gpmc: drop no compatible check
  reset: mchp: sparx5: Fix for lan966x
  ARM: shmobile: smp: Enforce shmobile_smp_* alignment
  MAINTAINERS: Add myself (Neal Gompa) as a reviewer for ARM Apple support
  MAINTAINERS: Add apple-spi driver & binding files
  arm64: dts: rockchip: slow down emmc freq for rock 5 itx
  ARM: dts: BCM5301X: Fix switch port labels of ASUS RT-AC3200
  ARM: dts: BCM5301X: Fix switch port labels of ASUS RT-AC5300
  ARM: dts: bcm2711: Don't mark timer regs unconfigured
  ARM: OMAP1: select CONFIG_GENERIC_IRQ_CHIP
  arm64: dts: rockchip: Add missing PCIe supplies to RockPro64 board dtsi
  arm64: dts: rockchip: Add avdd HDMI supplies to RockPro64 board dtsi
  arm64: dts: rockchip: Remove undocumented sdmmc property from lubancat-1
  arm64: dts: rockchip: fix pinmux of UART5 for PX30 Ringneck on Haikou
  arm64: dts: rockchip: fix pinmux of UART0 for PX30 Ringneck on Haikou
  arm64: dts: rockchip: fix u2phy1_host status for NanoPi R4S
  arm64: dts: bcm2712: PL011 UARTs are actually r1p5
  ARM: dts: bcm2711: PL011 UARTs are actually r1p5
  ...
2025-03-17 14:40:40 -07:00
Anshuman Khandual
f9aad62200 mm: rename GENERIC_PTDUMP and PTDUMP_CORE
Platforms subscribe into generic ptdump implementation via GENERIC_PTDUMP.
But generic ptdump gets enabled via PTDUMP_CORE.  These configs
combination is confusing as they sound very similar and does not
differentiate between platform's feature subscription and feature
enablement for ptdump.  Rename the configs as ARCH_HAS_PTDUMP and PTDUMP
making it more clear and improve readability.

Link: https://lkml.kernel.org/r/20250226122404.1927473-6-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> (powerpc)
Acked-by: Catalin Marinas <catalin.marinas@arm.com>	[arm64]
Cc: Will Deacon <will@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Steven Price <steven.price@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17 00:05:32 -07:00
Sourabh Jain
7b54a96f30 crash: remove an unused argument from reserve_crashkernel_generic()
cmdline argument is not used in reserve_crashkernel_generic() so remove
it.  Correspondingly, all the callers have been updated as well.

No functional change intended.

Link: https://lkml.kernel.org/r/20250131113830.925179-3-sourabhjain@linux.ibm.com
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16 22:30:47 -07:00
Ryan Roberts
86758b5048 mm/ioremap: pass pgprot_t to ioremap_prot() instead of unsigned long
ioremap_prot() currently accepts pgprot_val parameter as an unsigned long,
thus implicitly assuming that pgprot_val and pgprot_t could never be
bigger than unsigned long.  But this assumption soon will not be true on
arm64 when using D128 pgtables.  In 128 bit page table configuration,
unsigned long is 64 bit, but pgprot_t is 128 bit.

Passing platform abstracted pgprot_t argument is better as compared to
size based data types.  Let's change the parameter to directly pass
pgprot_t like another similar helper generic_ioremap_prot().

Without this change in place, D128 configuration does not work on arm64 as
the top 64 bits gets silently stripped when passing the protection value
to this function.

Link: https://lkml.kernel.org/r/20250218101954.415331-1-anshuman.khandual@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Co-developed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16 22:06:23 -07:00
Barry Song
2f4ab3ac10 mm: support tlbbatch flush for a range of PTEs
This patch lays the groundwork for supporting batch PTE unmapping in
try_to_unmap_one().  It introduces range handling for TLB batch flushing,
with the range currently set to the size of PAGE_SIZE.

The function __flush_tlb_range_nosync() is architecture-specific and is
only used within arch/arm64.  This function requires the mm structure
instead of the vma structure.  To allow its reuse by
arch_tlbbatch_add_pending(), which operates with mm but not vma, this
patch modifies the argument of __flush_tlb_range_nosync() to take mm as
its parameter.

Link: https://lkml.kernel.org/r/20250214093015.51024-3-21cnbao@gmail.com
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shaoqin Huang <shahuang@redhat.com>
Cc: Gavin Shan <gshan@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Chis Li <chrisl@kernel.org>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mauricio Faria de Oliveira <mfo@canonical.com>
Cc: Tangquan Zheng <zhengtangquan@oppo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16 22:06:16 -07:00
Ard Biesheuvel
ac4f06789b kbuild: Create intermediate vmlinux build with relocations preserved
The imperative paradigm used to build vmlinux, extract some info from it
or perform some checks on it, and subsequently modify it again goes
against the declarative paradigm that is usually employed for defining
make rules.

In particular, the Makefile.postlink files that consume their input via
an output rule result in some dodgy logic in the decompressor makefiles
for RISC-V and x86, given that the vmlinux.relocs input file needed to
generate the arch-specific relocation tables may not exist or be out of
date, but cannot be constructed using the ordinary Make dependency based
rules, because the info needs to be extracted while vmlinux is in its
ephemeral, non-stripped form.

So instead, for architectures that require the static relocations that
are emitted into vmlinux when passing --emit-relocs to the linker, and
are subsequently stripped out again, introduce an intermediate vmlinux
target called vmlinux.unstripped, and organize the reset of the build
logic accordingly:

- vmlinux.unstripped is created only once, and not updated again
- build rules under arch/*/boot can depend on vmlinux.unstripped without
  running the risk of the data disappearing or being out of date
- the final vmlinux generated by the build is not bloated with static
  relocations that are never needed again after the build completes.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-03-17 00:29:50 +09:00
Ard Biesheuvel
9b400d1725 kbuild: Introduce Kconfig symbol for linking vmlinux with relocations
Some architectures build vmlinux with static relocations preserved, but
strip them again from the final vmlinux image. Arch specific tools
consume these static relocations in order to construct relocation tables
for KASLR.

The fact that vmlinux is created, consumed and subsequently updated goes
against the typical, declarative paradigm used by Make, which is based
on rules and dependencies. So as a first step towards cleaning this up,
introduce a Kconfig symbol to declare that the arch wants to consume the
static relocations emitted into vmlinux. This will be wired up further
in subsequent patches.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-03-17 00:29:50 +09:00
Ignacio Encinas
e3f42c436d riscv: fix test_and_{set,clear}_bit ordering documentation
test_and_{set,clear}_bit are fully ordered as specified in
Documentation/atomic_bitops.txt. Fix incorrect comment stating otherwise.

Note that the implementation is correct since commit
9347ce54cd ("RISC-V: __test_and_op_bit_ord should be strongly ordered")
was introduced.

Signed-off-by: Ignacio Encinas <ignacio@iencinas.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2025-03-12 19:27:15 -04:00
Linus Torvalds
0b46b049d6 Some further v6.14 fixes:
- Fix the regmap settings for bcm281xx, this was missing the
   stride.
 
 - NULL check for the Nuvoton npcm8xx devm_kasprintf()
 
 - Enable the Spacemit pin controller by default in the
   SoC config. The SoC will not boot without it so this one
   is prett much required.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAmfQEsEACgkQQRCzN7AZ
 XXNrMw//SAJpyYYdVcTlkjag1O0gklX/zwsm1QvdYFsysIivlp5aJxIonTgHKl/6
 vj1rd2w4yvsxOWDb7wQJGzqu1ijaL/spC4UOsaeJfN9orMgn3DogYswUT22MxPeN
 tYzpdrRRMDkNEcf8x+Y5ogANInbEkIjT2eGHGDschYUQw7l2rUrct4E8uelfWEW1
 t6UPa69DbEdz3E77Zrx9z35R0a9lyBgh1gkd0schboEfIuX5iunf6WqRx8oSCSH/
 +cS215SeGUO5SarWuw2X7bFvTEldm1wA6aQZloMfoae3Z49mXNIOI565rxVJFSSD
 JGjagkZhTzT1yy1xFtptVwBHURsx4joHErhQ9MTaDEOUvwMY3h0pHSZdFy1aUi/T
 PLkhfd4eMdXUqN739szG1zgm8esPNgmyqqXlnU6CY7lYqYQvjQescvIJ5brol4Bt
 k3vIiouRItSp0C14ci9SG0vFHUbwjznYWk6396GQClc1y/xTX0p7xqLBhciCE0s4
 Jdk3cFHe1QjF9T+Jn6UFv2YvTupj+X6bFWe+XOzUBaguDXI6AJgx2wzQiKWWOqRe
 DmG1+zTCC59kbh0j7huQesvUquYx5qZbqLNIlBeY10wrGiHI7+a7+8Kzc9BDXtlq
 DSkAQyFgctm7Bcm+dRH0qrkyKssTRb86ryIqk2GBPdokR2Vnmk8=
 =r2/Q
 -----END PGP SIGNATURE-----

Merge tag 'pinctrl-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:

 - Fix the regmap settings for bcm281xx, this was missing the stride

 - NULL check for the Nuvoton npcm8xx devm_kasprintf()

 - Enable the Spacemit pin controller by default in the SoC config. The
   SoC will not boot without it so this one is pretty much required

* tag 'pinctrl-v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: spacemit: enable config option
  pinctrl: nuvoton: npcm8xx: Add NULL check in npcm8xx_gpio_fw
  pinctrl: bcm281xx: Fix incorrect regmap max_registers value
2025-03-11 07:29:02 -10:00
Eric Biggers
511484fa88 riscv/crc64: add Zbc optimized CRC64 functions
Wire up crc64_be_arch() and crc64_nvme_arch() for 64-bit RISC-V using
crc-clmul-template.h.  This greatly improves the performance of these
CRCs on Zbc-capable CPUs in 64-bit kernels.

These optimized CRC64 functions are not yet supported in 32-bit kernels,
since crc-clmul-template.h assumes that the CRC fits in an unsigned
long.  That implementation limitation could be addressed, but it would
add a fair bit of complexity, so it has been omitted for now.

Tested-by: Björn Töpel <bjorn@rivosinc.com>
Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250216225530.306980-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-03-10 09:29:27 -07:00
Eric Biggers
8bf3e17898 riscv/crc-t10dif: add Zbc optimized CRC-T10DIF function
Wire up crc_t10dif_arch() for RISC-V using crc-clmul-template.h.  This
greatly improves CRC-T10DIF performance on Zbc-capable CPUs.

Tested-by: Björn Töpel <bjorn@rivosinc.com>
Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250216225530.306980-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-03-10 09:29:25 -07:00
Eric Biggers
72acff5f81 riscv/crc32: reimplement the CRC32 functions using new template
Delete the previous Zbc optimized CRC32 code, and re-implement it using
the new template.  The new implementation is more optimized and shares
more code among CRC variants.

Tested-by: Björn Töpel <bjorn@rivosinc.com>
Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250216225530.306980-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-03-10 09:29:22 -07:00
Eric Biggers
bbe2610bc5 riscv/crc: add "template" for Zbc optimized CRC functions
Add a "template" crc-clmul-template.h that can generate RISC-V Zbc
optimized CRC functions.  Each generated CRC function is parameterized
by CRC length and bit order, and it accepts a pointer to the constants
struct required for the specific CRC polynomial desired.  Update
gen-crc-consts.py to support generating the needed constants structs.

This makes it possible to easily wire up a Zbc optimized implementation
of almost any CRC.

The design generally follows what I did for x86, but it is simplified by
using RISC-V's scalar carryless multiplication Zbc, which has no
equivalent on x86.  RISC-V's clmulr instruction is also helpful.  A
potential switch to Zvbc (or support for Zvbc alongside Zbc) is left for
future work.  For long messages Zvbc should be fastest, but it would
need to be shown to be worthwhile over just using Zbc which is
significantly more convenient to use, especially in the kernel context.

Compared to the existing Zbc-optimized CRC32 code and the earlier
proposed Zbc-optimized CRC-T10DIF code
(https://lore.kernel.org/r/20250211071101.181652-1-zhihang.shao.iscas@gmail.com),
this submission deduplicates the code among CRC variants and is
significantly more optimized.  It uses "folding" to take better
advantage of instruction-level parallelism (to a more limited extent
than x86 for now, but it could be extended to more), it reworks the
Barrett reduction to eliminate unnecessary instructions, and it
documents all the math used and makes all the constants reproducible.

Tested-by: Björn Töpel <bjorn@rivosinc.com>
Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250216225530.306980-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-03-10 09:29:08 -07:00
Arnd Bergmann
26e29a26f8 RISC-V Devicetree fix for v6.14-rc6
A single fix for an incorrect define in the jh7110 pinctrl header.
 
 Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRh246EGq/8RLhDjO14tDGHoIJi0gUCZ8hOeAAKCRB4tDGHoIJi
 0ti5AQD7CJrTmBBqP5nxQHvgvz1KSSqfzpOMrtA3YpOkDaqYigD/Ww6z12oSHFuK
 lLFSnpVJLqFd01N0wP1e+TUrbZ++Hg4=
 =WHwP
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmfJwJIACgkQYKtH/8kJ
 Uidt6RAApTAvDdwAR4FAxHqSCK+mJkC7njGkGr5t1xNBFpWdKwsONdet84ErEEzl
 ift1XFMWwZ98jIR4cHAnsPinHyFZUDH48rOJw8AmygSaikmaH27604lqLF+i55nV
 JjqBrQDCA3T6haH/1uM6+Te8c7FH/klG6sAymZs+x+l8UER2l4U8eNThaU0Xk1/n
 ZMXQK/R1buv+6oGeY5mXNkdIerXM9fz/BjwcfECwZHyTk5gMTrwzM+6emlo78Y/V
 OvnX/FJOGZ3MNL/rj9LDFu8QDqYf7p7D7fkYpGqRgScFs6LMTjKb37uCVfU7i4i8
 4zMwUZjwOkaJOxBgnDHmySCLmlBhcj0FN34kRsVgdqG5YAYzcTzTBdBPeOjNCDpa
 7i8kg3mJQD5hVbZFHWIvTqpBXOxFJTKKbU+W8dmlTsB7DyK4uV3WPym+MnqIZWw/
 +xcBi7dJZEyMaEThuWjCfTrJu0rWJg3WeZjFLgxoiz+yB0NzDivXMN7lrmoYe/1U
 dIoP4yDCOzVmJIeW9CIXVd/sXTOHPVaWRSUV0dqjHYefUl/wz9Mo55DbTT+JK9Ng
 0d1iSQBmz/M5ThjYGDBYW1Ltq7zwSDxBIHwVBkKpopRPanbXSYHsByTMB3LNGDn1
 Uou9MCr65MqZNajbk+vE1kEvAke47+/SHau9arFVtkjcA0srdcE=
 =qqLw
 -----END PGP SIGNATURE-----

Merge tag 'riscv-dt-fixes-for-v6.14-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into arm/fixes

RISC-V Devicetree fix for v6.14-rc6

A single fix for an incorrect define in the jh7110 pinctrl header.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>

* tag 'riscv-dt-fixes-for-v6.14-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
  riscv: dts: starfive: Fix a typo in StarFive JH7110 pin function definitions

Link: https://lore.kernel.org/r/20250305-sip-unable-d56ef7dbf86b@spud
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-03-06 16:34:42 +01:00
Atish Patra
bbb6224887 RISC-V: KVM: Disable the kernel perf counter during configure
The perf event should be marked disabled during the creation as
it is not ready to be scheduled until there is SBI PMU start call
or config matching is called with auto start. Otherwise, event add/start
gets called during perf_event_create_kernel_counter function.
It will be enabled and scheduled to run via perf_event_enable during
either the above mentioned scenario.

Fixes: 0cb74b65d2 ("RISC-V: KVM: Implement perf support without sampling")

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250303-kvm_pmu_improve-v2-1-41d177e45929@rivosinc.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2025-03-06 09:57:07 +05:30
Ingo Molnar
1b4c36f9b1 Merge branch 'x86/urgent' into x86/cpu, to pick up dependent commits
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-03-04 11:15:26 +01:00
Herbert Xu
17ec3e71ba crypto: lib/Kconfig - Hide arch options from user
The ARCH_MAY_HAVE patch missed arm64, mips and s390.  But it may
also lead to arch options being enabled but ineffective because
of modular/built-in conflicts.

As the primary user of all these options wireguard is selecting
the arch options anyway, make the same selections at the lib/crypto
option level and hide the arch options from the user.

Instead of selecting them centrally from lib/crypto, simply set
the default of each arch option as suggested by Eric Biggers.

Change the Crypto API generic algorithms to select the top-level
lib/crypto options instead of the generic one as otherwise there
is no way to enable the arch options (Eric Biggers).  Introduce a
set of INTERNAL options to work around dependency cycles on the
CONFIG_CRYPTO symbol.

Fixes: 1047e21aec ("crypto: lib/Kconfig - Fix lib built-in failure when arch is modular")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Arnd Bergmann <arnd@kernel.org>
Closes: https://lore.kernel.org/oe-kbuild-all/202502232152.JC84YDLp-lkp@intel.com/
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-02 15:21:47 +08:00
Linus Torvalds
9d20040d71 arm64 fixes for -rc5
- Fix a sporadic boot failure due to incorrect randomization of the
   linear map on systems that support it
 
 - Fix the zapping (both clearing the entries *and* invalidating the TLB)
   of hugetlb PTEs constructed using the contiguous bit
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmfDdBIQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNN0GB/9gmEOX1GwMU6wFjPYqvjWlkGCFDwrldO84
 uF9jEUbPaw3P4xHTOFyPCfEWidktqa+yDVbe90mB7GVOM+1eEZ81em1k1hYBEXbz
 Q73Nl5VrNzxX4BjOrdxxoTSaR/TKklUh5mqWfIzy1RxEnBfpr/GuDPtUn1GViCAs
 sU16Ju12UdYXn3tyHFDHpjZS9WYZskfnrvS0QvXinz0LahZrCkeaH+ptYHrTjMFx
 hxyrRQwOlqLnZWvjLOegH9AC6uyRkKDinXKhXqHYvUfcfEkQsKwM7Fpc6cviUD0Q
 X2npLNegnYxPniwmLpXfNXazPDnKVMzxb9lpqw1fZS3nAuh8XOde
 =RqDZ
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "Ryan's been hard at work finding and fixing mm bugs in the arm64 code,
  so here's a small crop of fixes for -rc5.

  The main changes are to fix our zapping of non-present PTEs for
  hugetlb entries created using the contiguous bit in the page-table
  rather than a block entry at the level above. Prior to these fixes, we
  were pulling the contiguous bit back out of the PTE in order to
  determine the size of the hugetlb page but this is clearly bogus if
  the thing isn't present and consequently both the clearing of the
  PTE(s) and the TLB invalidation were unreliable.

  Although the problem was found by code inspection, we really don't
  want this sitting around waiting to trigger and the changes are CC'd
  to stable accordingly.

  Note that the diffstat looks a lot worse than it really is;
  huge_ptep_get_and_clear() now takes a size argument from the core code
  and so all the arch implementations of that have been updated in a
  pretty mechanical fashion.

   - Fix a sporadic boot failure due to incorrect randomization of the
     linear map on systems that support it

   - Fix the zapping (both clearing the entries *and* invalidating the
     TLB) of hugetlb PTEs constructed using the contiguous bit"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: hugetlb: Fix flush_hugetlb_tlb_range() invalidation level
  arm64: hugetlb: Fix huge_ptep_get_and_clear() for non-present ptes
  mm: hugetlb: Add huge page size param to huge_ptep_get_and_clear()
  arm64/mm: Fix Boot panic on Ampere Altra
2025-03-01 13:44:51 -08:00
Linus Torvalds
209cd6f2ca ARM:
* Fix TCR_EL2 configuration to not use the ASID in TTBR1_EL2
   and not mess-up T1SZ/PS by using the HCR_EL2.E2H==0 layout.
 
 * Bring back the VMID allocation to the vcpu_load phase, ensuring
   that we only setup VTTBR_EL2 once on VHE. This cures an ugly
   race that would lead to running with an unallocated VMID.
 
 RISC-V:
 
 * Fix hart status check in SBI HSM extension
 
 * Fix hart suspend_type usage in SBI HSM extension
 
 * Fix error returned by SBI IPI and TIME extensions for
   unsupported function IDs
 
 * Fix suspend_type usage in SBI SUSP extension
 
 * Remove unnecessary vcpu kick after injecting interrupt
   via IMSIC guest file
 
 x86:
 
 * Fix an nVMX bug where KVM fails to detect that, after nested
   VM-Exit, L1 has a pending IRQ (or NMI).
 
 * To avoid freeing the PIC while vCPUs are still around, which
   would cause a NULL pointer access with the previous patch,
   destroy vCPUs before any VM-level destruction.
 
 * Handle failures to create vhost_tasks
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmfCvVsUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPqGwf9FOWQRd/yCKHiufjPDefD1Og0DmgB
 Dgk0nmHxaxbyPw+5vYlhn/J3vZ54sNngBpmUekE5OuBMZ9EsxXAK/myByHkzNnV9
 cyLm4vYwpb9OQmbQ5MMdDlptYsjV40EmSfwwIJpBxjdkwAI3f7NgeHvG8EwkJgch
 C+X4JMrLu2+BGo7BUhuE/xrB8h0CBRnhalB5aK1wuF+ey8v06zcU0zdQCRLUpOsx
 mW9S0OpSpSlecvcblr0AhuajjHjwFaTFOQofaXaQFBW6kv3dXmSq/JRABEfx0TBb
 MTUDQtnnaYvPy/RWwZIzBpgfASLQNQNxSJ7DIw9C8IG7k6rK25BSRwTmSw==
 =afMB
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM:

   - Fix TCR_EL2 configuration to not use the ASID in TTBR1_EL2 and not
     mess-up T1SZ/PS by using the HCR_EL2.E2H==0 layout.

   - Bring back the VMID allocation to the vcpu_load phase, ensuring
     that we only setup VTTBR_EL2 once on VHE. This cures an ugly race
     that would lead to running with an unallocated VMID.

  RISC-V:

   - Fix hart status check in SBI HSM extension

   - Fix hart suspend_type usage in SBI HSM extension

   - Fix error returned by SBI IPI and TIME extensions for unsupported
     function IDs

   - Fix suspend_type usage in SBI SUSP extension

   - Remove unnecessary vcpu kick after injecting interrupt via IMSIC
     guest file

  x86:

   - Fix an nVMX bug where KVM fails to detect that, after nested
     VM-Exit, L1 has a pending IRQ (or NMI).

   - To avoid freeing the PIC while vCPUs are still around, which would
     cause a NULL pointer access with the previous patch, destroy vCPUs
     before any VM-level destruction.

   - Handle failures to create vhost_tasks"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm: retry nx_huge_page_recovery_thread creation
  vhost: return task creation error instead of NULL
  KVM: nVMX: Process events on nested VM-Exit if injectable IRQ or NMI is pending
  KVM: x86: Free vCPUs before freeing VM state
  riscv: KVM: Remove unnecessary vcpu kick
  KVM: arm64: Ensure a VMID is allocated before programming VTTBR_EL2
  KVM: arm64: Fix tcr_el2 initialisation in hVHE mode
  riscv: KVM: Fix SBI sleep_type use
  riscv: KVM: Fix SBI TIME error generation
  riscv: KVM: Fix SBI IPI error generation
  riscv: KVM: Fix hart suspend_type use
  riscv: KVM: Fix hart suspend status check
2025-03-01 08:48:53 -08:00
Ryan Roberts
02410ac72a mm: hugetlb: Add huge page size param to huge_ptep_get_and_clear()
In order to fix a bug, arm64 needs to be told the size of the huge page
for which the huge_pte is being cleared in huge_ptep_get_and_clear().
Provide for this by adding an `unsigned long sz` parameter to the
function. This follows the same pattern as huge_pte_clear() and
set_huge_pte_at().

This commit makes the required interface modifications to the core mm as
well as all arches that implement this function (arm64, loongarch, mips,
parisc, powerpc, riscv, s390, sparc). The actual arm64 bug will be fixed
in a separate commit.

Cc: stable@vger.kernel.org
Fixes: 66b3923a1a ("arm64: hugetlb: add support for PTE contiguous bit")
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> # riscv
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> # s390
Link: https://lore.kernel.org/r/20250226120656.2400136-2-ryan.roberts@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-02-27 17:40:57 +00:00
Sean Christopherson
b2aba529bf KVM: Drop kvm_arch_sync_events() now that all implementations are nops
Remove kvm_arch_sync_events() now that x86 no longer uses it (no other
arch has ever used it).

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Acked-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Message-ID: <20250224235542.2562848-8-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-02-26 13:17:23 -05:00
Chen Wang
0edaa4593e riscv: sophgo: dts: Add msi controller for SG2042
Add msi-controller node to dts for SG2042.

Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/f47c6c3f0309a543d495cb088d6c8c5750bb5647.1740535748.git.unicorn_wang@outlook.com
2025-02-26 08:41:28 +01:00
Yixun Lan
7ff4faba63 pinctrl: spacemit: enable config option
Pinctrl is an essential driver for SpacemiT's SoC,
The uart driver requires it, same as sd card driver,
so let's enable it by default for this SoC.

The CONFIG_PINCTRL_SPACEMIT_K1 isn't enabled when using
'make defconfig' to select kernel configuration options.
This result in a broken uart driver where fail at probe()
stage due to no pins found.

Fixes: a83c29e1d1 ("pinctrl: spacemit: add support for SpacemiT K1 SoC")
Reported-by: Alex Elder <elder@kernel.org>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Alex Elder <elder@riscstar.com>
Signed-off-by: Yixun Lan <dlan@gentoo.org>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Tested-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://lore.kernel.org/20250218-k1-pinctrl-option-v3-1-36e031e0da1b@gentoo.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2025-02-25 17:22:36 +01:00
Kirill A. Shutemov
a9ebcb8813 mm/memremap: Pass down MEMREMAP_* flags to arch_memremap_wb()
x86 version of arch_memremap_wb() needs the flags to decide if the mapping
has to be encrypted or decrypted.

Pass down the flag to arch_memremap_wb(). All current implementations
ignore the argument.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: linux-mm@kvack.org
Link: https://lore.kernel.org/r/20250217163822.343400-2-kirill.shutemov@linux.intel.com
2025-02-21 15:05:38 +01:00
BillXiang
d252435aca riscv: KVM: Remove unnecessary vcpu kick
Remove the unnecessary kick to the vCPU after writing to the vs_file
of IMSIC in kvm_riscv_vcpu_aia_imsic_inject.

For vCPUs that are running, writing to the vs_file directly forwards
the interrupt as an MSI to them and does not need an extra kick.

For vCPUs that are descheduled after emulating WFI, KVM will enable
the guest external interrupt for that vCPU in
kvm_riscv_aia_wakeon_hgei. This means that writing to the vs_file
will cause a guest external interrupt, which will cause KVM to wake
up the vCPU in hgei_interrupt to handle the interrupt properly.

Signed-off-by: BillXiang <xiangwencheng@lanxincomputing.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Radim Krčmář <rkrcmar@ventanamicro.com>
Link: https://lore.kernel.org/r/20250221104538.2147-1-xiangwencheng@lanxincomputing.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2025-02-21 17:27:32 +05:30
Thomas Weißschuh
46fe55b204 riscv: vdso: Switch to generic storage implementation
The generic storage implementation provides the same features as the
custom one. However it can be shared between architectures, making
maintenance easier.

Co-developed-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-9-13a4669dfc8c@linutronix.de
2025-02-21 09:54:02 +01:00
Thomas Weißschuh
127b0e05c1 vdso: Rename included Makefile
As the Makefile is included into other Makefiles it can not be used to
define objects to be built from the current source directory.
However the generic datastore will introduce such a local source file.
Rename the included Makefile so it is clear how it is to be used and to
make room for a regular Makefile in lib/vdso/.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-4-13a4669dfc8c@linutronix.de
2025-02-21 09:54:01 +01:00
Anup Patel
58d868b67a RISC-V: Select CONFIG_GENERIC_PENDING_IRQ
Enable CONFIG_GENERIC_PENDING_IRQ for RISC-V so that RISC-V interrupt chips
can support delayed interrupt mirgration in interrupt context.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250217085657.789309-7-apatel@ventanamicro.com
2025-02-20 15:19:26 +01:00
E Shattow
38818f7c9c riscv: dts: starfive: jh7110-pine64-star64: enable USB 3.0 port
One of four USB-A ports on the Pine64 Star64 is USB 3.0 which requires to
disable PCIE0 and change the mode of PCIE0 PHY to USB3.0 operation. The
remaining three USB-A ports are USB 2.0 with the USB0 PHY and do not
conflict with any of PCIE0 or PCIE1. PCIE1 (1-lane) routes to a PCIe X4
connector.

Signed-off-by: E Shattow <e@freeshell.de>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2025-02-18 16:32:25 +00:00
E Shattow
65e8b99126 riscv: dts: starfive: jh7110: pciephy0 USB 3.0 configuration registers
StarFive JH7110 contains a Cadence USB2.0+USB3.0 controller IP block that
may exclusively use pciephy0 for USB3.0 connectivity. Add the register
offsets for the driver to enable/disable USB3.0 on pciephy0.

Signed-off-by: E Shattow <e@freeshell.de>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2025-02-18 16:32:24 +00:00
Sandie Cao
57b5369f36 riscv: dts: starfive: fml13v01: enable pcie1
Starfive Soc common defines GPIO28 as pcie1 reset, GPIO21 as pcie1 wakeup;
But the FML13V01 board uses GPIO21 as pcie1 reset, GPIO28 as pcie1 wakeup;
redefine pcie1 gpio and enable pcie1 for pcie based Wi-Fi.

Signed-off-by: Sandie Cao <sandie.cao@deepcomputing.io>
Tested-by: Maud Spierings <maud_spierings@hotmail.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2025-02-18 16:28:33 +00:00
Conor Dooley
4bdea6e339 riscv: dts: starfive: remove non-existent dac from jh7110
The jh7110 boards do not have a Rohm DAC on them as far as I
can tell, and they certainly do not have a dh2228fv, as this device does
not actually exist! Remove the dac nodes from the devicetrees as it is
not acceptable to pretend to have a device on a board in order to bind
the spidev driver in Linux.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2025-02-18 16:27:53 +00:00
Nam Cao
92051cb9d3 riscv: kvm: Switch to use hrtimer_setup()
hrtimer_setup() takes the callback function pointer as argument and
initializes the timer completely.

Replace hrtimer_init() and the open coded initialization of
hrtimer::function with the new setup mechanism.

Patch was created by using Coccinelle.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/d5ededf778f59f2fc38ff4276fb7f4c893e4142c.1738746821.git.namcao@linutronix.de
2025-02-18 10:32:31 +01:00
Chen Wang
f047a9285f riscv: sophgo: dts: add cooling maps for Milk-V Pioneer
The normal operating temperature range of SG2042 is -20 degrees
Celsius ~ 85 degrees Celsius.

Simultaneously monitor soc temperature and board temperature to
improve redundancy and safety.

Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
Link: https://lore.kernel.org/r/5a36a2784d97ed7b1e06777cb0c3c14fe9185e99.1739351437.git.unicorn_wang@outlook.com
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
2025-02-18 10:11:37 +08:00
Chen Wang
62cdf0a06d riscv: sophgo: dts: add pwm-fan for Milk-V Pioneer
Milk-V Pioneer uses fan as cooling-device, and speed of the fan is
controlled by the first channel of pwm controller of SG2042.

Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
Link: https://lore.kernel.org/r/dd23362328f77dd91aa9354848bbb0abad0f554b.1739351437.git.unicorn_wang@outlook.com
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
2025-02-18 10:11:37 +08:00
Javier Martinez Canillas
5b90a3d609
riscv: dts: spacemit: Add Milk-V Jupiter board device tree
Add initial support for the Milk-V Jupiter board [1], which is a Mini ITX
computer based on the SpacemiT K1/M1 Octa-Core X60 64-bit RISC-V SoC [2].

There are two variant for this board, one using the K1 chip and another
using the M1 chip. The main difference is that the M1 can run at a higher
frequency than the K1, thanks to its packaging.

For now, only a DTS for the K1 variant is added since there isn't support
yet for the X60 cores operating performance and thermal trip points.

The support is minimal, but at least allows to boot into a serial console.

Link: https://milkv.io/jupiter [1]
Link: https://www.spacemit.com/en/key-stone-k1 [2]
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Yixun Lan <dlan@gentoo.org>
Link: https://lore.kernel.org/r/20250214151700.666544-3-javierm@redhat.com
Signed-off-by: Yixun Lan <dlan@gentoo.org>
2025-02-17 21:11:50 +08:00
Andrew Jones
351e02b173 riscv: KVM: Fix SBI sleep_type use
The spec says sleep_type is 32 bits wide and "In case the data is
defined as 32bit wide, higher privilege software must ensure that it
only uses 32 bit data." Mask off upper bits of sleep_type before
using it.

Fixes: 023c15151f ("RISC-V: KVM: Add SBI system suspend support")
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20250217084506.18763-12-ajones@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2025-02-17 16:28:28 +05:30
Andrew Jones
b901484852 riscv: KVM: Fix SBI TIME error generation
When an invalid function ID of an SBI extension is used we should
return not-supported, not invalid-param.

Fixes: 5f862df558 ("RISC-V: KVM: Add v0.1 replacement SBI extensions defined in v0.2")
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20250217084506.18763-11-ajones@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2025-02-17 16:28:28 +05:30
Andrew Jones
0611f78f83 riscv: KVM: Fix SBI IPI error generation
When an invalid function ID of an SBI extension is used we should
return not-supported, not invalid-param. Also, when we see that at
least one hartid constructed from the base and mask parameters is
invalid, then we should return invalid-param. Finally, rather than
relying on overflowing a left shift to result in zero and then using
that zero in a condition which [correctly] skips sending an IPI (but
loops unnecessarily), explicitly check for overflow and exit the loop
immediately.

Fixes: 5f862df558 ("RISC-V: KVM: Add v0.1 replacement SBI extensions defined in v0.2")
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20250217084506.18763-10-ajones@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2025-02-17 16:28:28 +05:30
Andrew Jones
e3219b0c49 riscv: KVM: Fix hart suspend_type use
The spec says suspend_type is 32 bits wide and "In case the data is
defined as 32bit wide, higher privilege software must ensure that it
only uses 32 bit data." Mask off upper bits of suspend_type before
using it.

Fixes: 763c8bed8c ("RISC-V: KVM: Implement SBI HSM suspend call")
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20250217084506.18763-9-ajones@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2025-02-17 16:28:28 +05:30
Andrew Jones
c7db342e3b riscv: KVM: Fix hart suspend status check
"Not stopped" means started or suspended so we need to check for
a single state in order to have a chance to check for each state.
Also, we need to use target_vcpu when checking for the suspend
state.

Fixes: 763c8bed8c ("RISC-V: KVM: Implement SBI HSM suspend call")
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20250217084506.18763-8-ajones@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2025-02-17 16:28:27 +05:30
Yong-Xuan Wang
564fc8eb6f
riscv: signal: fix signal_minsigstksz
The init_rt_signal_env() funciton is called before the alternative patch
is applied, so using the alternative-related API to check the availability
of an extension within this function doesn't have the intended effect.
This patch reorders the init_rt_signal_env() and apply_boot_alternatives()
to get the correct signal_minsigstksz.

Fixes: e92f469b07 ("riscv: signal: Report signal frame size to userspace via auxv")
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Reviewed-by: Zong Li <zong.li@sifive.com>
Reviewed-by: Andy Chiu <andybnac@gmail.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20241220083926.19453-3-yongxuan.wang@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-02-14 13:06:50 -08:00
Yong-Xuan Wang
aa49bc2ca8
riscv: signal: fix signal frame size
The signal context of certain RISC-V extensions will be appended after
struct __riscv_extra_ext_header, which already includes an empty context
header. Therefore, there is no need to preserve a separate hdr for the
END of signal context.

Fixes: 8ee0b41898 ("riscv: signal: Add sigcontext save/restore for vector")
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Reviewed-by: Zong Li <zong.li@sifive.com>
Reviewed-by: Andy Chiu <AndybnAC@gmail.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20241220083926.19453-2-yongxuan.wang@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-02-14 13:06:44 -08:00
Andreas Schwab
599c44cd21
riscv/futex: sign extend compare value in atomic cmpxchg
Make sure the compare value in the lr/sc loop is sign extended to match
what lr.w does.  Fortunately, due to the compiler keeping the register
contents sign extended anyway the lack of the explicit extension didn't
result in wrong code so far, but this cannot be relied upon.

Fixes: b90edb3301 ("RISC-V: Add futex support.")
Signed-off-by: Andreas Schwab <schwab@suse.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/mvmfrkv2vhz.fsf@suse.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-02-14 13:06:31 -08:00
Andreas Schwab
1898300abf
riscv/atomic: Do proper sign extension also for unsigned in arch_cmpxchg
Sign extend also an unsigned compare value to match what lr.w is doing.
Otherwise try_cmpxchg may spuriously return true when used on a u32 value
that has the sign bit set, as it happens often in inode_set_ctime_current.

Do this in three conversion steps.  The first conversion to long is needed
to avoid a -Wpointer-to-int-cast warning when arch_cmpxchg is used with a
pointer type.  Then convert to int and back to long to always sign extend
the 32-bit value to 64-bit.

Fixes: 6c58f25e69 ("riscv/atomic: Fix sign extension for RV64I")
Signed-off-by: Andreas Schwab <schwab@suse.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Tested-by: Xi Ruoyao <xry111@xry111.site>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/mvmed0k4prh.fsf@suse.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-02-14 13:06:23 -08:00
Clément Léger
c6ec1e1b07
riscv: cpufeature: use bitmap_equal() instead of memcmp()
Comparison of bitmaps should be done using bitmap_equal(), not memcmp(),
use the former one to compare isa bitmaps.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Fixes: 625034abd5 ("riscv: add ISA extensions validation callback")
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250210155615.1545738-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-02-14 13:06:16 -08:00
Rob Herring
fb8179ce29
riscv: cacheinfo: Use of_property_present() for non-boolean properties
The use of of_property_read_bool() for non-boolean properties is
deprecated in favor of of_property_present() when testing for property
presence.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Cc: stable@vger.kernel.org
Fixes: 76d2a0493a ("RISC-V: Init and Halt Code")
Link: https://lore.kernel.org/r/20241104190314.270095-1-robh@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-02-14 13:04:20 -08:00
Shengyu Qu
3d20e619c9 riscv: dts: starfive: Unify regulator naming scheme
Currently, there are 3 regulators defined in JH7110's common device tree,
but regulator names are mixed with "-" and "_". So unify them to "_",
which is more often to be seen in other dts files.

Signed-off-by: Shengyu Qu <wiagn233@outlook.com>
Acked-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2025-02-12 08:14:16 +00:00
Eric Biggers
68ea3c2ae0 lib/crc32: remove "_le" from crc32c base and arch functions
Following the standardization on crc32c() as the lib entry point for the
Castagnoli CRC32 instead of the previous mix of crc32c(), crc32c_le(),
and __crc32c_le(), make the same change to the underlying base and arch
functions that implement it.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250208024911.14936-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-02-08 20:06:30 -08:00
Eric Biggers
bc2736fe7e lib/crc32: don't bother with pure and const function attributes
Drop the use of __pure and __attribute_const__ from the CRC32 library
functions that had them.  Both of these are unusual optimizations that
don't help properly written code.  They seem more likely to cause
problems than have any real benefit.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250208024911.14936-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-02-08 20:06:30 -08:00
Chen Wang
255f83ba5c riscv: sophgo: dts: add pwm controller for SG2042 SoC
SG2042 has one PWM controller, which has 4 pwm output channels.

Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
Link: https://lore.kernel.org/r/f376e16c0ee0cdac51bb91421d78defc0601627a.1738737617.git.unicorn_wang@outlook.com
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
2025-02-08 20:26:49 +08:00
E Shattow
1b133129ad riscv: dts: starfive: Fix a typo in StarFive JH7110 pin function definitions
Fix a typo in StarFive JH7110 pin function definitions for GPOUT_SYS_SDIO1_DATA4

Fixes: e22f09e598 ("riscv: dts: starfive: Add StarFive JH7110 pin function definitions")
Signed-off-by: E Shattow <e@freeshell.de>
Acked-by: Hal Feng <hal.feng@starfivetech.com>
CC: stable@vger.kernel.org
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2025-02-04 20:31:30 +00:00
Conor Dooley
9b181f4a95 riscv: dts: microchip: update pcie reg properties to new format
The existing PolarFire SoC devicetrees all use root port instance 1,
update the reg properties in PCIe nodes to use the new format that
specifies the instance in use. Failing to do so would still work but
produces warnings:
mpfs-icicle-kit.dtb: pcie@3000000000: reg: [[48, 0, 0, 134217728], [0, 1124073472, 0, 65536]] is too short
mpfs-icicle-kit.dtb: pcie@3000000000: reg-names: ['cfg', 'apb'] is too short

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
---
CC: Conor Dooley <conor@kernel.org>
CC: Daire McNamara <daire.mcnamara@microchip.com>
CC: valentina.fernandezalanis@microchip.com
CC: Rob Herring <robh@kernel.org>
CC: Krzysztof Kozlowski <krzk+dt@kernel.org>
CC: linux-riscv@lists.infradead.org
CC: devicetree@vger.kernel.org
CC: linux-kernel@vger.kernel.org
2025-02-04 20:28:06 +00:00
Linus Torvalds
1b5f3c51fb RISC-V Patches for the 6.14 Merge Window, Part 1
* The PH1520 pinctrl and dwmac drivers are enabeled in defconfig.
 * A redundant AQRL barrier has been removed from the futex cmpxchg
   implementation.
 * Support for the T-Head vector extensions, which includes exposing
   these extensions to userspace on systems that implement them.
 * Some more page table information is now printed on die() and systems
   that cause PA overflows.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmedHIoTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYievXD/4hdt8h+fMM0I9mmJS096YevRJONdfe
 Wk7D5q4PBwSHISHahuzfphieBhqPVnYkkEd7Vw6xRrLbUnhA41Fe0uvR52dx5UZd
 3LwrDV/kjGTD59x6A2Zo9bSs/qPKJ2WHmHwHM21jY5tvcIB2Lo4dF8HT63OrwVNW
 DxsujLO0jUw+HEwXPsfmUAZJWOPZuUnatl/9CaLMLwQv5N7yiMuz5oYDzJXTLnNh
 m3Hv3CCtj1EeQPqDoWzz9nZvmAKOwcblSzz6OAy+xrRk1N0N3QFQPbIaRvkI9OVz
 +wPHQiyx4KZNeAe0csV0uLQRIiXZV8rkCz5UT65s3Bfy3vukvzz+1VBdNnCqiP8Q
 RpCTcYw62Cr6BWnvyTh+s9bhHb1ijG043nXd/Ty7ZRPCNLKHY6oL1CZ0pgqbTwPs
 D2U2ZTZFTc35mPrU6QMfbTiUVWCU2XagFhI27Dgj3xh9mkBOQCHwk2Mrzn7uS4iz
 xGNnrjRnKtuwBrvD68JzxCkEi8INFn2ifbVr44VZrOdTM7XtODGAYrBohQtV62kU
 2L+q8DoHYis+0xFbR1wdrY1mRZoe45boUFgwnOpmoBr9ULe584sL+526y7IkkEHu
 /9hmLPtLg7nyoR/rO1j1Sfg4Eqdwg5HY1TKNfagJZAdu23EDRwrcW1PD0P6vtDv8
 j4og8MmL7dTt3A==
 =HbAQ
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.14-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - The PH1520 pinctrl and dwmac drivers are enabeled in defconfig

 - A redundant AQRL barrier has been removed from the futex cmpxchg
   implementation

 - Support for the T-Head vector extensions, which includes exposing
   these extensions to userspace on systems that implement them

 - Some more page table information is now printed on die() and systems
   that cause PA overflows

* tag 'riscv-for-linus-6.14-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: add a warning when physical memory address overflows
  riscv/mm/fault: add show_pte() before die()
  riscv: Add ghostwrite vulnerability
  selftests: riscv: Support xtheadvector in vector tests
  selftests: riscv: Fix vector tests
  riscv: hwprobe: Document thead vendor extensions and xtheadvector extension
  riscv: hwprobe: Add thead vendor extension probing
  riscv: vector: Support xtheadvector save/restore
  riscv: Add xtheadvector instruction definitions
  riscv: csr: Add CSR encodings for CSR_VXRM/CSR_VXSAT
  RISC-V: define the elements of the VCSR vector CSR
  riscv: vector: Use vlenb from DT for thead
  riscv: Add thead and xtheadvector as a vendor extension
  riscv: dts: allwinner: Add xtheadvector to the D1/D1s devicetree
  dt-bindings: cpus: add a thead vlen register length property
  dt-bindings: riscv: Add xtheadvector ISA extension description
  RISC-V: Mark riscv_v_init() as __init
  riscv: defconfig: drop RT_GROUP_SCHED=y
  riscv/futex: Optimize atomic cmpxchg
  riscv: defconfig: enable pinctrl and dwmac support for TH1520
2025-01-31 15:13:25 -08:00
Linus Torvalds
fd8c09ad0d Kbuild updates for v6.14
- Support multiple hook locations for maint scripts of Debian package
 
  - Remove 'cpio' from the build tool requirement
 
  - Introduce gendwarfksyms tool, which computes CRCs for export symbols
    based on the DWARF information
 
  - Support CONFIG_MODVERSIONS for Rust
 
  - Resolve all conflicts in the genksyms parser
 
  - Fix several syntax errors in genksyms
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmedJ7AVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGBs0P/1svrWxVRdD7XFtM+UykQqf2R/be
 SnZDeeviqdbGr1J0X5FM/pjaeOb1BPnXsr6O67QaWhyVnpWUviBwwYG2NrCDsx9k
 AaMVsROzpDIeoMhMNeYVs+/SYhA8Jndnd4BOH5CBbo52k4/BLqhU9LXLncLgjZbF
 nys30TilwOaylKq2FHFVI1GhTCQtKKbq+tIxE1SKkHZ1LsSaFphe/eCHMpcOvQb+
 zIFHMm2eIcSlS8TCONFeTnw7NdeCVldYbPCyNAV+nC7Ow7VM8Ws1fq5vsKeH3TGE
 +3qtgQS41KpccNRdp4cTvy6p9iBEvpAvk1BAAOZ347EtLXrNNhngg65CbjyCUt7H
 yBpgWZ6GAGq2yExX5bHbbFHb/n4I3HhkZKeaFDZ3VnMPnni4zdbBqD+sBSI3yHLC
 LUh1NI8gIHjD4bIazbjxWMAQlamQNVNMaHkHqGjro2yIbLL1i2mqMdXuzYhAVwrx
 al7hv357fVPwQ1Gfin7R23T4/NqBTB+1IJnYTBYnnAFnUIAdhsuu9YkX8I97i6yu
 mdTrGROpEYL0GZTE+1LGz6V9DZfQHP8GZ5fDAU1X/f8Js9xkn1lHFhFCg/xM5f9j
 RnwHdiRbvq6uL40/zkinlcpPnGWJkAXWgZqlc0ZbwW8v/vKI51Uo0qUxlVSeu2uH
 mQG30SibArJFPpUN
 =D553
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Support multiple hook locations for maint scripts of Debian package

 - Remove 'cpio' from the build tool requirement

 - Introduce gendwarfksyms tool, which computes CRCs for export symbols
   based on the DWARF information

 - Support CONFIG_MODVERSIONS for Rust

 - Resolve all conflicts in the genksyms parser

 - Fix several syntax errors in genksyms

* tag 'kbuild-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (64 commits)
  kbuild: fix Clang LTO with CONFIG_OBJTOOL=n
  kbuild: Strip runtime const RELA sections correctly
  kconfig: fix memory leak in sym_warn_unmet_dep()
  kconfig: fix file name in warnings when loading KCONFIG_DEFCONFIG_LIST
  genksyms: fix syntax error for attribute before init-declarator
  genksyms: fix syntax error for builtin (u)int*x*_t types
  genksyms: fix syntax error for attribute after 'union'
  genksyms: fix syntax error for attribute after 'struct'
  genksyms: fix syntax error for attribute after abstact_declarator
  genksyms: fix syntax error for attribute before nested_declarator
  genksyms: fix syntax error for attribute before abstract_declarator
  genksyms: decouple ATTRIBUTE_PHRASE from type-qualifier
  genksyms: record attributes consistently for init-declarator
  genksyms: restrict direct-declarator to take one parameter-type-list
  genksyms: restrict direct-abstract-declarator to take one parameter-type-list
  genksyms: remove Makefile hack
  genksyms: fix last 3 shift/reduce conflicts
  genksyms: fix 6 shift/reduce conflicts and 5 reduce/reduce conflicts
  genksyms: reduce type_qualifier directly to decl_specifier
  genksyms: rename cvar_qualifier to type_qualifier
  ...
2025-01-31 12:07:07 -08:00
Ard Biesheuvel
71d815bf5d kbuild: Strip runtime const RELA sections correctly
Due to the fact that runtime const ELF sections are named without a
leading period or double underscore, the RSTRIP logic that removes the
static RELA sections from vmlinux fails to identify them. This results
in a situation like below, where some sections that were supposed to get
removed are left behind.

  [Nr] Name                              Type            Address          Off     Size   ES Flg Lk Inf Al

  [58] runtime_shift_d_hash_shift        PROGBITS        ffffffff83500f50 2900f50 000014 00   A  0   0  1
  [59] .relaruntime_shift_d_hash_shift   RELA            0000000000000000 55b6f00 000078 18   I 70  58  8
  [60] runtime_ptr_dentry_hashtable      PROGBITS        ffffffff83500f68 2900f68 000014 00   A  0   0  1
  [61] .relaruntime_ptr_dentry_hashtable RELA            0000000000000000 55b6f78 000078 18   I 70  60  8
  [62] runtime_ptr_USER_PTR_MAX          PROGBITS        ffffffff83500f80 2900f80 000238 00   A  0   0  1
  [63] .relaruntime_ptr_USER_PTR_MAX     RELA            0000000000000000 55b6ff0 000d50 18   I 70  62  8

So tweak the match expression to strip all sections starting with .rel.
While at it, consolidate the logic used by RISC-V, s390 and x86 into a
single shared Makefile library command.

Link: https://lore.kernel.org/all/CAHk-=wjk3ynjomNvFN8jf9A1k=qSc=JFF591W00uXj-qqNUxPQ@mail.gmail.com/
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-02-01 04:28:05 +09:00
Yunhui Cui
101971298b
riscv: add a warning when physical memory address overflows
The part of physical memory that exceeds the size of the linear mapping
will be discarded. When the system starts up normally, a warning message
will be printed to prevent confusion caused by the mismatch between the
system memory and the actual physical memory.

Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240814062625.19794-1-cuiyunhui@bytedance.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-29 18:36:09 -08:00
Joel Granados
1751f872cc treewide: const qualify ctl_tables where applicable
Add the const qualifier to all the ctl_tables in the tree except for
watchdog_hardlockup_sysctl, memory_allocation_profiling_sysctls,
loadpin_sysctl_table and the ones calling register_net_sysctl (./net,
drivers/inifiniband dirs). These are special cases as they use a
registration function with a non-const qualified ctl_table argument or
modify the arrays before passing them on to the registration function.

Constifying ctl_table structs will prevent the modification of
proc_handler function pointers as the arrays would reside in .rodata.
This is made possible after commit 78eb4ea25c ("sysctl: treewide:
constify the ctl_table argument of proc_handlers") constified all the
proc_handlers.

Created this by running an spatch followed by a sed command:
Spatch:
    virtual patch

    @
    depends on !(file in "net")
    disable optional_qualifier
    @

    identifier table_name != {
      watchdog_hardlockup_sysctl,
      iwcm_ctl_table,
      ucma_ctl_table,
      memory_allocation_profiling_sysctls,
      loadpin_sysctl_table
    };
    @@

    + const
    struct ctl_table table_name [] = { ... };

sed:
    sed --in-place \
      -e "s/struct ctl_table .table = &uts_kern/const struct ctl_table *table = \&uts_kern/" \
      kernel/utsname_sysctl.c

Reviewed-by: Song Liu <song@kernel.org>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> # for kernel/trace/
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> # SCSI
Reviewed-by: Darrick J. Wong <djwong@kernel.org> # xfs
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Bill O'Donnell <bodonnel@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Acked-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Acked-by: Anna Schumaker <anna.schumaker@oracle.com>
Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-01-28 13:48:37 +01:00
Linus Torvalds
9c5968db9e The various patchsets are summarized below. Plus of course many
indivudual patches which are described in their changelogs.
 
 - "Allocate and free frozen pages" from Matthew Wilcox reorganizes the
   page allocator so we end up with the ability to allocate and free
   zero-refcount pages.  So that callers (ie, slab) can avoid a refcount
   inc & dec.
 
 - "Support large folios for tmpfs" from Baolin Wang teaches tmpfs to use
   large folios other than PMD-sized ones.
 
 - "Fix mm/rodata_test" from Petr Tesarik performs some maintenance and
   fixes for this small built-in kernel selftest.
 
 - "mas_anode_descend() related cleanup" from Wei Yang tidies up part of
   the mapletree code.
 
 - "mm: fix format issues and param types" from Keren Sun implements a
   few minor code cleanups.
 
 - "simplify split calculation" from Wei Yang provides a few fixes and a
   test for the mapletree code.
 
 - "mm/vma: make more mmap logic userland testable" from Lorenzo Stoakes
   continues the work of moving vma-related code into the (relatively) new
   mm/vma.c.
 
 - "mm/page_alloc: gfp flags cleanups for alloc_contig_*()" from David
   Hildenbrand cleans up and rationalizes handling of gfp flags in the page
   allocator.
 
 - "readahead: Reintroduce fix for improper RA window sizing" from Jan
   Kara is a second attempt at fixing a readahead window sizing issue.  It
   should reduce the amount of unnecessary reading.
 
 - "synchronously scan and reclaim empty user PTE pages" from Qi Zheng
   addresses an issue where "huge" amounts of pte pagetables are
   accumulated
   (https://lore.kernel.org/lkml/cover.1718267194.git.zhengqi.arch@bytedance.com/).
   Qi's series addresses this windup by synchronously freeing PTE memory
   within the context of madvise(MADV_DONTNEED).
 
 - "selftest/mm: Remove warnings found by adding compiler flags" from
   Muhammad Usama Anjum fixes some build warnings in the selftests code
   when optional compiler warnings are enabled.
 
 - "mm: don't use __GFP_HARDWALL when migrating remote pages" from David
   Hildenbrand tightens the allocator's observance of __GFP_HARDWALL.
 
 - "pkeys kselftests improvements" from Kevin Brodsky implements various
   fixes and cleanups in the MM selftests code, mainly pertaining to the
   pkeys tests.
 
 - "mm/damon: add sample modules" from SeongJae Park enhances DAMON to
   estimate application working set size.
 
 - "memcg/hugetlb: Rework memcg hugetlb charging" from Joshua Hahn
   provides some cleanups to memcg's hugetlb charging logic.
 
 - "mm/swap_cgroup: remove global swap cgroup lock" from Kairui Song
   removes the global swap cgroup lock.  A speedup of 10% for a tmpfs-based
   kernel build was demonstrated.
 
 - "zram: split page type read/write handling" from Sergey Senozhatsky
   has several fixes and cleaups for zram in the area of zram_write_page().
   A watchdog softlockup warning was eliminated.
 
 - "move pagetable_*_dtor() to __tlb_remove_table()" from Kevin Brodsky
   cleans up the pagetable destructor implementations.  A rare
   use-after-free race is fixed.
 
 - "mm/debug: introduce and use VM_WARN_ON_VMG()" from Lorenzo Stoakes
   simplifies and cleans up the debugging code in the VMA merging logic.
 
 - "Account page tables at all levels" from Kevin Brodsky cleans up and
   regularizes the pagetable ctor/dtor handling.  This results in
   improvements in accounting accuracy.
 
 - "mm/damon: replace most damon_callback usages in sysfs with new core
   functions" from SeongJae Park cleans up and generalizes DAMON's sysfs
   file interface logic.
 
 - "mm/damon: enable page level properties based monitoring" from
   SeongJae Park increases the amount of information which is presented in
   response to DAMOS actions.
 
 - "mm/damon: remove DAMON debugfs interface" from SeongJae Park removes
   DAMON's long-deprecated debugfs interfaces.  Thus the migration to sysfs
   is completed.
 
 - "mm/hugetlb: Refactor hugetlb allocation resv accounting" from Peter
   Xu cleans up and generalizes the hugetlb reservation accounting.
 
 - "mm: alloc_pages_bulk: small API refactor" from Luiz Capitulino
   removes a never-used feature of the alloc_pages_bulk() interface.
 
 - "mm/damon: extend DAMOS filters for inclusion" from SeongJae Park
   extends DAMOS filters to support not only exclusion (rejecting), but
   also inclusion (allowing) behavior.
 
 - "Add zpdesc memory descriptor for zswap.zpool" from Alex Shi
   "introduces a new memory descriptor for zswap.zpool that currently
   overlaps with struct page for now.  This is part of the effort to reduce
   the size of struct page and to enable dynamic allocation of memory
   descriptors."
 
 - "mm, swap: rework of swap allocator locks" from Kairui Song redoes and
   simplifies the swap allocator locking.  A speedup of 400% was
   demonstrated for one workload.  As was a 35% reduction for kernel build
   time with swap-on-zram.
 
 - "mm: update mips to use do_mmap(), make mmap_region() internal" from
   Lorenzo Stoakes reworks MIPS's use of mmap_region() so that
   mmap_region() can be made MM-internal.
 
 - "mm/mglru: performance optimizations" from Yu Zhao fixes a few MGLRU
   regressions and otherwise improves MGLRU performance.
 
 - "Docs/mm/damon: add tuning guide and misc updates" from SeongJae Park
   updates DAMON documentation.
 
 - "Cleanup for memfd_create()" from Isaac Manjarres does that thing.
 
 - "mm: hugetlb+THP folio and migration cleanups" from David Hildenbrand
   provides various cleanups in the areas of hugetlb folios, THP folios and
   migration.
 
 - "Uncached buffered IO" from Jens Axboe implements the new
   RWF_DONTCACHE flag which provides synchronous dropbehind for pagecache
   reading and writing.  To permite userspace to address issues with
   massive buildup of useless pagecache when reading/writing fast devices.
 
 - "selftests/mm: virtual_address_range: Reduce memory" from Thomas
   Weißschuh fixes and optimizes some of the MM selftests.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZ5a+cwAKCRDdBJ7gKXxA
 jtoyAP9R58oaOKPJuTizEKKXvh/RpMyD6sYcz/uPpnf+cKTZxQEAqfVznfWlw/Lz
 uC3KRZYhmd5YrxU4o+qjbzp9XWX/xAE=
 =Ib2s
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2025-01-26-14-59' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "The various patchsets are summarized below. Plus of course many
  indivudual patches which are described in their changelogs.

   - "Allocate and free frozen pages" from Matthew Wilcox reorganizes
     the page allocator so we end up with the ability to allocate and
     free zero-refcount pages. So that callers (ie, slab) can avoid a
     refcount inc & dec

   - "Support large folios for tmpfs" from Baolin Wang teaches tmpfs to
     use large folios other than PMD-sized ones

   - "Fix mm/rodata_test" from Petr Tesarik performs some maintenance
     and fixes for this small built-in kernel selftest

   - "mas_anode_descend() related cleanup" from Wei Yang tidies up part
     of the mapletree code

   - "mm: fix format issues and param types" from Keren Sun implements a
     few minor code cleanups

   - "simplify split calculation" from Wei Yang provides a few fixes and
     a test for the mapletree code

   - "mm/vma: make more mmap logic userland testable" from Lorenzo
     Stoakes continues the work of moving vma-related code into the
     (relatively) new mm/vma.c

   - "mm/page_alloc: gfp flags cleanups for alloc_contig_*()" from David
     Hildenbrand cleans up and rationalizes handling of gfp flags in the
     page allocator

   - "readahead: Reintroduce fix for improper RA window sizing" from Jan
     Kara is a second attempt at fixing a readahead window sizing issue.
     It should reduce the amount of unnecessary reading

   - "synchronously scan and reclaim empty user PTE pages" from Qi Zheng
     addresses an issue where "huge" amounts of pte pagetables are
     accumulated:

       https://lore.kernel.org/lkml/cover.1718267194.git.zhengqi.arch@bytedance.com/

     Qi's series addresses this windup by synchronously freeing PTE
     memory within the context of madvise(MADV_DONTNEED)

   - "selftest/mm: Remove warnings found by adding compiler flags" from
     Muhammad Usama Anjum fixes some build warnings in the selftests
     code when optional compiler warnings are enabled

   - "mm: don't use __GFP_HARDWALL when migrating remote pages" from
     David Hildenbrand tightens the allocator's observance of
     __GFP_HARDWALL

   - "pkeys kselftests improvements" from Kevin Brodsky implements
     various fixes and cleanups in the MM selftests code, mainly
     pertaining to the pkeys tests

   - "mm/damon: add sample modules" from SeongJae Park enhances DAMON to
     estimate application working set size

   - "memcg/hugetlb: Rework memcg hugetlb charging" from Joshua Hahn
     provides some cleanups to memcg's hugetlb charging logic

   - "mm/swap_cgroup: remove global swap cgroup lock" from Kairui Song
     removes the global swap cgroup lock. A speedup of 10% for a
     tmpfs-based kernel build was demonstrated

   - "zram: split page type read/write handling" from Sergey Senozhatsky
     has several fixes and cleaups for zram in the area of
     zram_write_page(). A watchdog softlockup warning was eliminated

   - "move pagetable_*_dtor() to __tlb_remove_table()" from Kevin
     Brodsky cleans up the pagetable destructor implementations. A rare
     use-after-free race is fixed

   - "mm/debug: introduce and use VM_WARN_ON_VMG()" from Lorenzo Stoakes
     simplifies and cleans up the debugging code in the VMA merging
     logic

   - "Account page tables at all levels" from Kevin Brodsky cleans up
     and regularizes the pagetable ctor/dtor handling. This results in
     improvements in accounting accuracy

   - "mm/damon: replace most damon_callback usages in sysfs with new
     core functions" from SeongJae Park cleans up and generalizes
     DAMON's sysfs file interface logic

   - "mm/damon: enable page level properties based monitoring" from
     SeongJae Park increases the amount of information which is
     presented in response to DAMOS actions

   - "mm/damon: remove DAMON debugfs interface" from SeongJae Park
     removes DAMON's long-deprecated debugfs interfaces. Thus the
     migration to sysfs is completed

   - "mm/hugetlb: Refactor hugetlb allocation resv accounting" from
     Peter Xu cleans up and generalizes the hugetlb reservation
     accounting

   - "mm: alloc_pages_bulk: small API refactor" from Luiz Capitulino
     removes a never-used feature of the alloc_pages_bulk() interface

   - "mm/damon: extend DAMOS filters for inclusion" from SeongJae Park
     extends DAMOS filters to support not only exclusion (rejecting),
     but also inclusion (allowing) behavior

   - "Add zpdesc memory descriptor for zswap.zpool" from Alex Shi
     introduces a new memory descriptor for zswap.zpool that currently
     overlaps with struct page for now. This is part of the effort to
     reduce the size of struct page and to enable dynamic allocation of
     memory descriptors

   - "mm, swap: rework of swap allocator locks" from Kairui Song redoes
     and simplifies the swap allocator locking. A speedup of 400% was
     demonstrated for one workload. As was a 35% reduction for kernel
     build time with swap-on-zram

   - "mm: update mips to use do_mmap(), make mmap_region() internal"
     from Lorenzo Stoakes reworks MIPS's use of mmap_region() so that
     mmap_region() can be made MM-internal

   - "mm/mglru: performance optimizations" from Yu Zhao fixes a few
     MGLRU regressions and otherwise improves MGLRU performance

   - "Docs/mm/damon: add tuning guide and misc updates" from SeongJae
     Park updates DAMON documentation

   - "Cleanup for memfd_create()" from Isaac Manjarres does that thing

   - "mm: hugetlb+THP folio and migration cleanups" from David
     Hildenbrand provides various cleanups in the areas of hugetlb
     folios, THP folios and migration

   - "Uncached buffered IO" from Jens Axboe implements the new
     RWF_DONTCACHE flag which provides synchronous dropbehind for
     pagecache reading and writing. To permite userspace to address
     issues with massive buildup of useless pagecache when
     reading/writing fast devices

   - "selftests/mm: virtual_address_range: Reduce memory" from Thomas
     Weißschuh fixes and optimizes some of the MM selftests"

* tag 'mm-stable-2025-01-26-14-59' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits)
  mm/compaction: fix UBSAN shift-out-of-bounds warning
  s390/mm: add missing ctor/dtor on page table upgrade
  kasan: sw_tags: use str_on_off() helper in kasan_init_sw_tags()
  tools: add VM_WARN_ON_VMG definition
  mm/damon/core: use str_high_low() helper in damos_wmark_wait_us()
  seqlock: add missing parameter documentation for raw_seqcount_try_begin()
  mm/page-writeback: consolidate wb_thresh bumping logic into __wb_calc_thresh
  mm/page_alloc: remove the incorrect and misleading comment
  zram: remove zcomp_stream_put() from write_incompressible_page()
  mm: separate move/undo parts from migrate_pages_batch()
  mm/kfence: use str_write_read() helper in get_access_type()
  selftests/mm/mkdirty: fix memory leak in test_uffdio_copy()
  kasan: hw_tags: Use str_on_off() helper in kasan_init_hw_tags()
  selftests/mm: virtual_address_range: avoid reading from VM_IO mappings
  selftests/mm: vm_util: split up /proc/self/smaps parsing
  selftests/mm: virtual_address_range: unmap chunks after validation
  selftests/mm: virtual_address_range: mmap() without PROT_WRITE
  selftests/memfd/memfd_test: fix possible NULL pointer dereference
  mm: add FGP_DONTCACHE folio creation flag
  mm: call filemap_fdatawrite_range_kick() after IOCB_DONTCACHE issue
  ...
2025-01-26 18:36:23 -08:00
Linus Torvalds
5fb4088624 bitmap patches for v6.14.
Hi Linus,
 
 Please pull bitmap patches for v6.14.
 
 This includes const_true() series from Vincent Mailhol, another
 __always_inline rework from Nathan Chancellor for RISCV, and a
 couple random fixes from Dr. David Alan Gilbert and I Hsin Cheng.
 
 Thanks,
 Yury
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmeUMpEACgkQsUSA/Tof
 vsiqEAv/VvtTD6I3Ms3kIl2G1pBP/EFcYQGbwS1PCsX3RX16rinZ2XUDRtjvRy1Y
 FA+OsZ2yPtH8G1WRM8YauZsh2cZSCA4xTFadLSZkT8leSWERKaTyJI+PXe2A43IU
 d+FV4zYH5JYqV2u9aLHWMO8Voq9nNHZXOYHRu0q53TBFn7V294Lma9oDlK3Wjfur
 vSZZU9SKKlMV8Oy6/hZ3tDemUDM1jAGlqrxFb8aXRsTsCpsmlqE1bQdZ+AadjevZ
 cVeplB8OCCnqcYV28szIwsJpSzmd5/WBP6jLpeMgBYFGS0JT2USdZ8gsw+Yq/On5
 hjxek3cHBKdv0CINk0Ejf4aV0IvoX5S/VRlTjhttzyX68no1DoibDuWJWB42PRWS
 frllVOmdkm2DqA0G9mgxtwzBl5UqMFVe5LuVU9E9BZZeDmRZmS3obrUiMzpiqUOs
 zkdDvA0uaKgjx2qZADDEFqg1+XdX0A0iPebEv9vLaULXv0+D/PbkClNqIf8p7778
 2GWuBLJe
 =1hW6
 -----END PGP SIGNATURE-----

Merge tag 'bitmap-for-6.14' of https://github.com:/norov/linux

Pull bitmap updates from Yury Norov:
 "This includes const_true() series from Vincent Mailhol, another
  __always_inline rework from Nathan Chancellor for RISCV, and a couple
  of random fixes from Dr. David Alan Gilbert and I Hsin Cheng"

* tag 'bitmap-for-6.14' of https://github.com:/norov/linux:
  cpumask: Rephrase comments for cpumask_any*() APIs
  cpu: Remove unused init_cpu_online
  riscv: Always inline bitops
  linux/bits.h: simplify GENMASK_INPUT_CHECK()
  compiler.h: add const_true()
2025-01-26 14:03:44 -08:00
Guo Weikang
c6f239796b mm/memblock: add memblock_alloc_or_panic interface
Before SLUB initialization, various subsystems used memblock_alloc to
allocate memory.  In most cases, when memory allocation fails, an
immediate panic is required.  To simplify this behavior and reduce
repetitive checks, introduce `memblock_alloc_or_panic`.  This function
ensures that memory allocation failures result in a panic automatically,
improving code readability and consistency across subsystems that require
this behavior.

[guoweikang.kernel@gmail.com: arch/s390: save_area_alloc default failure behavior changed to panic]
  Link: https://lkml.kernel.org/r/20250109033136.2845676-1-guoweikang.kernel@gmail.com
  Link: https://lore.kernel.org/lkml/Z2fknmnNtiZbCc7x@kernel.org/
Link: https://lkml.kernel.org/r/20250102072528.650926-1-guoweikang.kernel@gmail.com
Signed-off-by: Guo Weikang <guoweikang.kernel@gmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>	[s390]
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25 20:22:38 -08:00
Kevin Brodsky
a9b3c355c2 asm-generic: pgalloc: provide generic __pgd_{alloc,free}
We already have a generic implementation of alloc/free up to P4D level, as
well as pgd_free().  Let's finish the work and add a generic PGD-level
alloc helper as well.

Unlike at lower levels, almost all architectures need some specific magic
at PGD level (typically initialising PGD entries), so introducing a
generic pgd_alloc() isn't worth it.  Instead we introduce two new helpers,
__pgd_alloc() and __pgd_free(), and make use of them in the arch-specific
pgd_alloc() and pgd_free() wherever possible.  To accommodate as many arch
as possible, __pgd_alloc() takes a page allocation order.

Because pagetable_alloc() allocates zeroed pages, explicit zeroing in
pgd_alloc() becomes redundant and we can get rid of it.  Some trivial
implementations of pgd_free() also become unnecessary once __pgd_alloc()
is used; remove them.

Another small improvement is consistent accounting of PGD pages by using
GFP_PGTABLE_{USER,KERNEL} as appropriate.

Not all PGD allocations can be handled by the generic helpers.  In
particular, multiple architectures allocate PGDs from a kmem_cache, and
those PGDs may not be page-sized.

Link: https://lkml.kernel.org/r/20250103184415.2744423-6-kevin.brodsky@arm.com
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25 20:22:24 -08:00
Qi Zheng
2dccdf7076 mm: pgtable: introduce generic __tlb_remove_table()
Several architectures (arm, arm64, riscv and x86) define exactly the same
__tlb_remove_table(), just introduce generic __tlb_remove_table() to
eliminate these duplications.

The s390 __tlb_remove_table() is nearly the same, so also make s390
__tlb_remove_table() version generic.

Link: https://lkml.kernel.org/r/ea372633d94f4d3f9f56a7ec5994bf050bf77e39.1736317725.git.zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com>
Acked-by: Andreas Larsson <andreas@gaisler.com>		[sparc]
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>	[s390]
Acked-by: Arnd Bergmann <arnd@arndb.de>			[asm-generic]
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Aneesh Kumar K.V (Arm) <aneesh.kumar@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25 20:22:23 -08:00
Qi Zheng
deab5a355e riscv: pgtable: move pagetable_dtor() to __tlb_remove_table()
Move pagetable_dtor() to __tlb_remove_table(), so that ptlock and page
table pages can be freed together (regardless of whether RCU is used). 
This prevents the use-after-free problem where the ptlock is freed
immediately but the page table pages is freed later via RCU.

Page tables shouldn't have swap cache, so use pagetable_free() instead of
free_page_and_swap_cache() to free page table pages.

By the way, move the comment above __tlb_remove_table() to
riscv_tlb_remove_ptdesc(), it will be more appropriate.

Link: https://lkml.kernel.org/r/b89d77c965507b1b102cbabe988e69365cb288b6.1736317725.git.zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Aneesh Kumar K.V (Arm) <aneesh.kumar@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25 20:22:22 -08:00
Qi Zheng
db6b435d73 mm: pgtable: introduce pagetable_dtor()
The pagetable_p*_dtor() are exactly the same except for the handling of
ptlock.  If we make ptlock_free() handle the case where ptdesc->ptl is
NULL and remove VM_BUG_ON_PAGE() from pmd_ptlock_free(), we can unify
pagetable_p*_dtor() into one function.  Let's introduce pagetable_dtor()
to do this.

Later, pagetable_dtor() will be moved to tlb_remove_ptdesc(), so that
ptlock and page table pages can be freed together (regardless of whether
RCU is used).  This prevents the use-after-free problem where the ptlock
is freed immediately but the page table pages is freed later via RCU.

Link: https://lkml.kernel.org/r/47f44fff9dc68d9d9e9a0d6c036df275f820598a.1736317725.git.zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Originally-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com>
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>	[s390]
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Aneesh Kumar K.V (Arm) <aneesh.kumar@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25 20:22:22 -08:00
Qi Zheng
5fcf5fa612 mm: pgtable: add statistics for P4D level page table
Like other levels of page tables, add statistics for P4D level page table.

Link: https://lkml.kernel.org/r/d55fe3c286305aae84457da9e1066df99b3de125.1736317725.git.zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Originally-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Aneesh Kumar K.V (Arm) <aneesh.kumar@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25 20:22:21 -08:00
Kevin Brodsky
98a7e47faa asm-generic: pgalloc: provide generic p4d_{alloc_one,free}
Four architectures currently implement 5-level pgtables: arm64, riscv, x86
and s390.  The first three have essentially the same implementation for
p4d_alloc_one() and p4d_free(), so we've got an opportunity to reduce
duplication like at the lower levels.

Provide a generic version of p4d_alloc_one() and p4d_free(), and make use
of it on those architectures.

Their implementation is the same as at PUD level, except that p4d_free()
performs a runtime check by calling mm_p4d_folded().  5-level pgtables
depend on a runtime-detected hardware feature on all supported
architectures, so we might as well include this check in the generic
implementation.  No runtime check is required in p4d_alloc_one() as the
top-level p4d_alloc() already does the required check.

Link: https://lkml.kernel.org/r/26d69c74a29183ecc335b9b407040d8e4cd70c6a.1736317725.git.zhengqi.arch@bytedance.com
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>		[asm-generic]
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Aneesh Kumar K.V (Arm) <aneesh.kumar@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25 20:22:21 -08:00
Kevin Brodsky
5a32443f94 riscv: mm: skip pgtable level check in {pud,p4d}_alloc_one
Patch series "move pagetable_*_dtor() to __tlb_remove_table()", v5.

As proposed [1] by Peter Zijlstra below, this patch series aims to move
pagetable_*_dtor() into __tlb_remove_table().  This will cleanup
pagetable_*_dtor() a bit and more gracefully fix the UAF issue [2]
reported by syzbot.

: Notably:
: 
:  - s390 pud isn't calling the existing pagetable_pud_[cd]tor()
:  - none of the p4d things have pagetable_p4d_[cd]tor() (x86,arm64,s390,riscv)
:    and they have inconsistent accounting
:  - while much of the _ctor calls are in generic code, many of the _dtor
:    calls are in arch code for hysterial raisins, this could easily be
:    fixed
:  - if we fix ptlock_free() to handle NULL, then all the _dtor()
:    functions can use it, and we can observe they're all identical
:    and can be folded
: 
: after all that cleanup, you can move the _dtor from *_free_tlb() into
: tlb_remove_table() -- which for the above case, would then have it called
: from __tlb_remove_table_free().


This patch (of 16):

{pmd,pud,p4d}_alloc_one() is never called if the corresponding page table
level is folded, as {pmd,pud,p4d}_alloc() already does the required check.
We can therefore remove the runtime page table level checks in
{pud,p4d}_alloc_one.  The PUD helper becomes equivalent to the generic
version, so we remove it altogether.

This is consistent with the way arm64 and x86 handle this situation
(runtime check in p4d_free() only).

Link: https://lkml.kernel.org/r/cover.1736317725.git.zhengqi.arch@bytedance.com
Link: https://lkml.kernel.org/r/93a1c6bddc0ded9f1a9f15658c1e4af5c93d1194.1736317725.git.zhengqi.arch@bytedance.com
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Aneesh Kumar K.V (Arm) <aneesh.kumar@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25 20:22:21 -08:00
Linus Torvalds
0f8e26b38d Loongarch:
* Clear LLBCTL if secondary mmu mapping changes.
 
 * Add hypercall service support for usermode VMM.
 
 x86:
 
 * Add a comment to kvm_mmu_do_page_fault() to explain why KVM performs a
   direct call to kvm_tdp_page_fault() when RETPOLINE is enabled.
 
 * Ensure that all SEV code is compiled out when disabled in Kconfig, even
   if building with less brilliant compilers.
 
 * Remove a redundant TLB flush on AMD processors when guest CR4.PGE changes.
 
 * Use str_enabled_disabled() to replace open coded strings.
 
 * Drop kvm_x86_ops.hwapic_irr_update() as KVM updates hardware's APICv cache
   prior to every VM-Enter.
 
 * Overhaul KVM's CPUID feature infrastructure to track all vCPU capabilities
   instead of just those where KVM needs to manage state and/or explicitly
   enable the feature in hardware.  Along the way, refactor the code to make
   it easier to add features, and to make it more self-documenting how KVM
   is handling each feature.
 
 * Rework KVM's handling of VM-Exits during event vectoring; this plugs holes
   where KVM unintentionally puts the vCPU into infinite loops in some scenarios
   (e.g. if emulation is triggered by the exit), and brings parity between VMX
   and SVM.
 
 * Add pending request and interrupt injection information to the kvm_exit and
   kvm_entry tracepoints respectively.
 
 * Fix a relatively benign flaw where KVM would end up redoing RDPKRU when
   loading guest/host PKRU, due to a refactoring of the kernel helpers that
   didn't account for KVM's pre-checking of the need to do WRPKRU.
 
 * Make the completion of hypercalls go through the complete_hypercall
   function pointer argument, no matter if the hypercall exits to
   userspace or not.  Previously, the code assumed that KVM_HC_MAP_GPA_RANGE
   specifically went to userspace, and all the others did not; the new code
   need not special case KVM_HC_MAP_GPA_RANGE and in fact does not care at
   all whether there was an exit to userspace or not.
 
 * As part of enabling TDX virtual machines, support support separation of
   private/shared EPT into separate roots.  When TDX will be enabled, operations
   on private pages will need to go through the privileged TDX Module via SEAMCALLs;
   as a result, they are limited and relatively slow compared to reading a PTE.
   The patches included in 6.14 allow KVM to keep a mirror of the private EPT in
   host memory, and define entries in kvm_x86_ops to operate on external page
   tables such as the TDX private EPT.
 
 * The recently introduced conversion of the NX-page reclamation kthread to
   vhost_task moved the task under the main process.  The task is created as
   soon as KVM_CREATE_VM was invoked and this, of course, broke userspace that
   didn't expect to see any child task of the VM process until it started
   creating its own userspace threads.  In particular crosvm refuses to fork()
   if procfs shows any child task, so unbreak it by creating the task lazily.
   This is arguably a userspace bug, as there can be other kinds of legitimate
   worker tasks and they wouldn't impede fork(); but it's not like userspace
   has a way to distinguish kernel worker tasks right now.  Should they show
   as "Kthread: 1" in proc/.../status?
 
 x86 - Intel:
 
 * Fix a bug where KVM updates hardware's APICv cache of the highest ISR bit
   while L2 is active, while ultimately results in a hardware-accelerated L1
   EOI effectively being lost.
 
 * Honor event priority when emulating Posted Interrupt delivery during nested
   VM-Enter by queueing KVM_REQ_EVENT instead of immediately handling the
   interrupt.
 
 * Rework KVM's processing of the Page-Modification Logging buffer to reap
   entries in the same order they were created, i.e. to mark gfns dirty in the
   same order that hardware marked the page/PTE dirty.
 
 * Misc cleanups.
 
 Generic:
 
 * Cleanup and harden kvm_set_memory_region(); add proper lockdep assertions when
   setting memory regions and add a dedicated API for setting KVM-internal
   memory regions.  The API can then explicitly disallow all flags for
   KVM-internal memory regions.
 
 * Explicitly verify the target vCPU is online in kvm_get_vcpu() to fix a bug
   where KVM would return a pointer to a vCPU prior to it being fully online,
   and give kvm_for_each_vcpu() similar treatment to fix a similar flaw.
 
 * Wait for a vCPU to come online prior to executing a vCPU ioctl, to fix a
   bug where userspace could coerce KVM into handling the ioctl on a vCPU that
   isn't yet onlined.
 
 * Gracefully handle xarray insertion failures; even though such failures are
   impossible in practice after xa_reserve(), reserving an entry is always followed
   by xa_store() which does not know (or differentiate) whether there was an
   xa_reserve() before or not.
 
 RISC-V:
 
 * Zabha, Svvptc, and Ziccrse extension support for guests.  None of them
   require anything in KVM except for detecting them and marking them
   as supported; Zabha adds byte and halfword atomic operations, while the
   others are markers for specific operation of the TLB and of LL/SC
   instructions respectively.
 
 * Virtualize SBI system suspend extension for Guest/VM
 
 * Support firmware counters which can be used by the guests to collect
   statistics about traps that occur in the host.
 
 Selftests:
 
 * Rework vcpu_get_reg() to return a value instead of using an out-param, and
   update all affected arch code accordingly.
 
 * Convert the max_guest_memory_test into a more generic mmu_stress_test.
   The basic gist of the "conversion" is to have the test do mprotect() on
   guest memory while vCPUs are accessing said memory, e.g. to verify KVM
   and mmu_notifiers are working as intended.
 
 * Play nice with treewrite builds of unsupported architectures, e.g. arm
   (32-bit), as KVM selftests' Makefile doesn't do anything to ensure the
   target architecture is actually one KVM selftests supports.
 
 * Use the kernel's $(ARCH) definition instead of the target triple for arch
   specific directories, e.g. arm64 instead of aarch64, mainly so as not to
   be different from the rest of the kernel.
 
 * Ensure that format strings for logging statements are checked by the
   compiler even when the logging statement itself is disabled.
 
 * Attempt to whack the last LLC references/misses mole in the Intel PMU
   counters test by adding a data load and doing CLFLUSH{OPT} on the data
   instead of the code being executed.  It seems that modern Intel CPUs
   have learned new code prefetching tricks that bypass the PMU counters.
 
 * Fix a flaw in the Intel PMU counters test where it asserts that events
   are counting correctly without actually knowing what the events count
   given the underlying hardware; this can happen if Intel reuses a
   formerly microarchitecture-specific event encoding as an architectural
   event, as was the case for Top-Down Slots.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmeTuzoUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroOkBwf8CRNExYaM3j9y2E7mmo6AiL2ug6+J
 Uy5Hai1poY48pPwKC6ke3EWT8WVsgj/Py5pCeHvLojQchWNjCCYNfSQluJdkRxwG
 DgP3QUljSxEJWBeSwyTRcKM+IySi5hZd1IFo3gePFRB829Jpnj05vjbvCyv8gIwU
 y3HXxSYDsViaaFoNg4OlZFsIGis7mtknsZzk++QjuCXmxNa6UCbv3qvE/UkVLhVg
 WH65RTRdjk+EsdwaOMHKuUvQoGa+iM4o39b6bqmw8+ZMK39+y33WeTX/y5RXsp1N
 tUUBRfS+MuuYgC/6LmTr66EkMzoChxk3Dp3kKUaCBcfqRC8PxQag5reZhw==
 =NEaO
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "Loongarch:

   - Clear LLBCTL if secondary mmu mapping changes

   - Add hypercall service support for usermode VMM

  x86:

   - Add a comment to kvm_mmu_do_page_fault() to explain why KVM
     performs a direct call to kvm_tdp_page_fault() when RETPOLINE is
     enabled

   - Ensure that all SEV code is compiled out when disabled in Kconfig,
     even if building with less brilliant compilers

   - Remove a redundant TLB flush on AMD processors when guest CR4.PGE
     changes

   - Use str_enabled_disabled() to replace open coded strings

   - Drop kvm_x86_ops.hwapic_irr_update() as KVM updates hardware's
     APICv cache prior to every VM-Enter

   - Overhaul KVM's CPUID feature infrastructure to track all vCPU
     capabilities instead of just those where KVM needs to manage state
     and/or explicitly enable the feature in hardware. Along the way,
     refactor the code to make it easier to add features, and to make it
     more self-documenting how KVM is handling each feature

   - Rework KVM's handling of VM-Exits during event vectoring; this
     plugs holes where KVM unintentionally puts the vCPU into infinite
     loops in some scenarios (e.g. if emulation is triggered by the
     exit), and brings parity between VMX and SVM

   - Add pending request and interrupt injection information to the
     kvm_exit and kvm_entry tracepoints respectively

   - Fix a relatively benign flaw where KVM would end up redoing RDPKRU
     when loading guest/host PKRU, due to a refactoring of the kernel
     helpers that didn't account for KVM's pre-checking of the need to
     do WRPKRU

   - Make the completion of hypercalls go through the complete_hypercall
     function pointer argument, no matter if the hypercall exits to
     userspace or not.

     Previously, the code assumed that KVM_HC_MAP_GPA_RANGE specifically
     went to userspace, and all the others did not; the new code need
     not special case KVM_HC_MAP_GPA_RANGE and in fact does not care at
     all whether there was an exit to userspace or not

   - As part of enabling TDX virtual machines, support support
     separation of private/shared EPT into separate roots.

     When TDX will be enabled, operations on private pages will need to
     go through the privileged TDX Module via SEAMCALLs; as a result,
     they are limited and relatively slow compared to reading a PTE.

     The patches included in 6.14 allow KVM to keep a mirror of the
     private EPT in host memory, and define entries in kvm_x86_ops to
     operate on external page tables such as the TDX private EPT

   - The recently introduced conversion of the NX-page reclamation
     kthread to vhost_task moved the task under the main process. The
     task is created as soon as KVM_CREATE_VM was invoked and this, of
     course, broke userspace that didn't expect to see any child task of
     the VM process until it started creating its own userspace threads.

     In particular crosvm refuses to fork() if procfs shows any child
     task, so unbreak it by creating the task lazily. This is arguably a
     userspace bug, as there can be other kinds of legitimate worker
     tasks and they wouldn't impede fork(); but it's not like userspace
     has a way to distinguish kernel worker tasks right now. Should they
     show as "Kthread: 1" in proc/.../status?

  x86 - Intel:

   - Fix a bug where KVM updates hardware's APICv cache of the highest
     ISR bit while L2 is active, while ultimately results in a
     hardware-accelerated L1 EOI effectively being lost

   - Honor event priority when emulating Posted Interrupt delivery
     during nested VM-Enter by queueing KVM_REQ_EVENT instead of
     immediately handling the interrupt

   - Rework KVM's processing of the Page-Modification Logging buffer to
     reap entries in the same order they were created, i.e. to mark gfns
     dirty in the same order that hardware marked the page/PTE dirty

   - Misc cleanups

  Generic:

   - Cleanup and harden kvm_set_memory_region(); add proper lockdep
     assertions when setting memory regions and add a dedicated API for
     setting KVM-internal memory regions. The API can then explicitly
     disallow all flags for KVM-internal memory regions

   - Explicitly verify the target vCPU is online in kvm_get_vcpu() to
     fix a bug where KVM would return a pointer to a vCPU prior to it
     being fully online, and give kvm_for_each_vcpu() similar treatment
     to fix a similar flaw

   - Wait for a vCPU to come online prior to executing a vCPU ioctl, to
     fix a bug where userspace could coerce KVM into handling the ioctl
     on a vCPU that isn't yet onlined

   - Gracefully handle xarray insertion failures; even though such
     failures are impossible in practice after xa_reserve(), reserving
     an entry is always followed by xa_store() which does not know (or
     differentiate) whether there was an xa_reserve() before or not

  RISC-V:

   - Zabha, Svvptc, and Ziccrse extension support for guests. None of
     them require anything in KVM except for detecting them and marking
     them as supported; Zabha adds byte and halfword atomic operations,
     while the others are markers for specific operation of the TLB and
     of LL/SC instructions respectively

   - Virtualize SBI system suspend extension for Guest/VM

   - Support firmware counters which can be used by the guests to
     collect statistics about traps that occur in the host

  Selftests:

   - Rework vcpu_get_reg() to return a value instead of using an
     out-param, and update all affected arch code accordingly

   - Convert the max_guest_memory_test into a more generic
     mmu_stress_test. The basic gist of the "conversion" is to have the
     test do mprotect() on guest memory while vCPUs are accessing said
     memory, e.g. to verify KVM and mmu_notifiers are working as
     intended

   - Play nice with treewrite builds of unsupported architectures, e.g.
     arm (32-bit), as KVM selftests' Makefile doesn't do anything to
     ensure the target architecture is actually one KVM selftests
     supports

   - Use the kernel's $(ARCH) definition instead of the target triple
     for arch specific directories, e.g. arm64 instead of aarch64,
     mainly so as not to be different from the rest of the kernel

   - Ensure that format strings for logging statements are checked by
     the compiler even when the logging statement itself is disabled

   - Attempt to whack the last LLC references/misses mole in the Intel
     PMU counters test by adding a data load and doing CLFLUSH{OPT} on
     the data instead of the code being executed. It seems that modern
     Intel CPUs have learned new code prefetching tricks that bypass the
     PMU counters

   - Fix a flaw in the Intel PMU counters test where it asserts that
     events are counting correctly without actually knowing what the
     events count given the underlying hardware; this can happen if
     Intel reuses a formerly microarchitecture-specific event encoding
     as an architectural event, as was the case for Top-Down Slots"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (151 commits)
  kvm: defer huge page recovery vhost task to later
  KVM: x86/mmu: Return RET_PF* instead of 1 in kvm_mmu_page_fault()
  KVM: Disallow all flags for KVM-internal memslots
  KVM: x86: Drop double-underscores from __kvm_set_memory_region()
  KVM: Add a dedicated API for setting KVM-internal memslots
  KVM: Assert slots_lock is held when setting memory regions
  KVM: Open code kvm_set_memory_region() into its sole caller (ioctl() API)
  LoongArch: KVM: Add hypercall service support for usermode VMM
  LoongArch: KVM: Clear LLBCTL if secondary mmu mapping is changed
  KVM: SVM: Use str_enabled_disabled() helper in svm_hardware_setup()
  KVM: VMX: read the PML log in the same order as it was written
  KVM: VMX: refactor PML terminology
  KVM: VMX: Fix comment of handle_vmx_instruction()
  KVM: VMX: Reinstate __exit attribute for vmx_exit()
  KVM: SVM: Use str_enabled_disabled() helper in sev_hardware_setup()
  KVM: x86: Avoid double RDPKRU when loading host/guest PKRU
  KVM: x86: Use LVT_TIMER instead of an open coded literal
  RISC-V: KVM: Add new exit statstics for redirected traps
  RISC-V: KVM: Update firmware counters for various events
  RISC-V: KVM: Redirect instruction access fault trap to guest
  ...
2025-01-25 09:55:09 -08:00
Linus Torvalds
917846e9f0 samsung: add gs101-mbox driver
microchip: add sbi-ipc driver
 zynqmp: fix invalid __percpu annotation
 qcom: add IPQ5424 APCS compatible
 mpfs fix copy and paste bug
 th1520: Fix NULL vs IS_ERR() and a memory corruption bug
 tegra-hsp: clear mailbox before using message
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE6EwehDt/SOnwFyTyf9lkf8eYP5UFAmeTDB4ACgkQf9lkf8eY
 P5UfghAAgusrJTWyRCeBC/pg6dwkacvl9zAOi+eIVWb6IJ/sZcEyHvWjo9jsqAw2
 mX2tqvcltUnh6ID1GdFz6bH1OYTQfcXhs0LO/ESkGCkVsrAh2uuBFVvxLwqahY/w
 64bEc+S/xVLpAp9/XA9HXgtyAoCueDLB0dreLB4umG9ps8j9n1g46GVAauf2ftmV
 5BN0ybnPd4nYzQ4ergD59jWMI1QOrb/F2JpK9M8RLCBt135xqZ1rz5cCiYZcN4LY
 gxX6HlbkVSaZ45ZEe7++iGK8c8M+uWsxOhjE0PXT1SeeNPyXz09Spsny75r1uUZo
 04t9kQ2yzcrBkHd7AInNBpiKQgFSvuhhjZDI0gQZLcD0aDsNvTTR86PoOuH7pP8B
 qGg+wCU1Puy1Olv9nelc+M5nSLMSBHcXvSDVN8kOCGOXGPgoPLEOD6K8L0s9WPRZ
 VnzQiddMcBuYhFXbGNv73EblrjGyMwJPeUl06yDnMBoaZXvqHirvYwANzJgvhZn2
 piq5bhX2eUrN0UIP+VInU2RAWFsN+ORAoL5TH/JTkA5S7+yzOCK0R/wkUsMcmvco
 gBmNtKOvuHEDl4M50TC9baL/3p07VM/hxXf9j6zGeQVpfITAQGh++/vMNug95ITM
 FlgWWWJziNb9v3F1XU2bLESqrTSjwNC5MXPDV+CraVorwzyeao0=
 =XBAn
 -----END PGP SIGNATURE-----

Merge tag 'mailbox-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox

Pull mailbox updates from Jassi Brar:

 - samsung: add gs101-mbox driver

 - microchip: add sbi-ipc driver

 - zynqmp: fix invalid __percpu annotation

 - qcom: add IPQ5424 APCS compatible

 - mpfs fix copy and paste bug

 - th1520: Fix NULL vs IS_ERR() and a memory corruption bug

 - tegra-hsp: clear mailbox before using message

* tag 'mailbox-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox:
  riscv: export __cpuid_to_hartid_map
  riscv: sbi: vendorid_list: Add Microchip Technology to the vendor list
  mailbox: th1520: Fix memory corruption due to incorrect array size
  mailbox: zynqmp: Remove invalid __percpu annotation in zynqmp_ipi_probe()
  MAINTAINERS: add entry for Samsung Exynos mailbox driver
  mailbox: add Samsung Exynos driver
  dt-bindings: mailbox: add google,gs101-mbox
  mailbox: qcom: Add support for IPQ5424 APCS IPC
  dt-bindings: mailbox: qcom: Add IPQ5424 APCS compatible
  mailbox: qcom-ipcc: Reset CLEAR_ON_RECV_RD if set from boot firmware
  mailbox: add Microchip IPC support
  dt-bindings: mailbox: add binding for Microchip IPC mailbox controller
  mailbox: tegra-hsp: Clear mailbox before using message
  mailbox: mpfs: fix copy and paste bug in probe
  mailbox: th1520: Fix a NULL vs IS_ERR() bug
2025-01-24 16:04:40 -08:00
Linus Torvalds
7108814670 soc: defconfig updates for 6.14
As usual, a number of new drivers get added to the defconfig to
 support additional hardware. The stm32 defconfig also turns off
 a few options to optimize for size.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmeTguUACgkQYKtH/8kJ
 Uifr2xAAtxKljNKki7bb9aeTcvZ98g34dSBCB76Oc80tYTM1Q9zZO+96ZBv/9W/F
 fhvAG8cSjfrXfUB8nwiQBzbV8Kl7jaxomb3h3z8snji6Kp4+IZtrYUPfS+QG5GMB
 o5U5eBTeipbsZRO3EwHdPvdCf39oeAvb+6IsESnbzAbZkhMOaUhzHok1G+fxekvv
 pF4mgFCYmnjq4TLMEikVlKF5qBQ2kScbECLDVx9/xvz8p6RotL0WOKGvfwJf4LFx
 nP1D1bBB01g/yilVZ6PAyfhnpN8KULdENllOBxnladPkt+Qb4zEHHuAChaSE8gQS
 ulZAq83OGyY85xTAUUPgFTpmLrIVeOPtjHuJQ2RScO+pB/MIf/fL1ozRvxD2jiKa
 KYFTlVFUX0xIeSAJ/xCp3/i2052AQ6ac5mp9P3ti77XSbSfCarWx882BfELZrbsw
 gKabUr3QJxJmxio8UL5HUPlxKAsBczBEa/jC/XqtSbQmcnfL3FxpNh77KgLx5ICx
 n3CGio4YmE/cVgWVjjJzq974MPpdvzRWN7cnLpWEQnOv3iuk2tomdGDqkUlDzuan
 7U2ylFa2q0Li7eKqplX6gqny4QKCNH1aMQlucmxifvAG0X9jYlq9sLVVeJM8LiCi
 C0cCVUx6KtN0ZjRawDSY0qZ8Yeq/NyIpU4/Ywc0aAZ5JfNqsA4Y=
 =ZFLM
 -----END PGP SIGNATURE-----

Merge tag 'soc-defconfig-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC defconfig updates from Arnd Bergmann:
 "As usual, a number of new drivers get added to the defconfig to
  support additional hardware.

  The stm32 defconfig also turns off a few options to optimize for size"

* tag 'soc-defconfig-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (27 commits)
  dt-bindings: soc: samsung: exynos-pmu: Add exynos990-pmu compatible
  arm64: defconfig: enable Maxim TCPCI driver
  ARM: configs: stm32: Remove useless flags in STM32 defconfig
  ARM: configs: stm32: Remove CRYPTO in STM32 defconfig
  ARM: configs: stm32: Clean STM32 defconfig
  ARM: configs: stm32: Remove FLASH_MEM_BASE and FLASH_SIZE in STM32 defconfig
  arm64: defconfig: Enable pinctrl-based I2C mux
  arm64: defconfig: Enable Rockchip extensions for Synopsys DW HDMI QP
  arm64: defconfig: Enable RFKILL GPIO
  arm64: defconfig: Enable TI K3 M4 remoteproc driver
  arm64: defconfig: Enable Qualcomm IPQ CMN PLL clock controller
  arm64: defconfig: Enable basic Qualcomm SM8750 SoC drivers
  arm64: defconfig: remove obsolete CONFIG_SM_DISPCC_8650
  arm64: defconfig: enable clock controller, interconnect and pinctrl for QCS8300
  arm64: defconfig: Enable sa8775p clock controllers
  arm64: defconfig: Enable MediaTek DWMAC
  arm64: defconfig: Enable sound for MT8188
  arm64: defconfig: Enable MediaTek STAR Ethernet MAC
  riscv: defconfig: enable pinctrl and dwmac support for TH1520
  arm64: defconfig: Enable Amazon Elastic Network Adaptor
  ...
2025-01-24 15:03:53 -08:00
Linus Torvalds
f102039270 soc: devicetree changes for 6.14
We see the addition of eleven new SoCs, including a total of sixx arm64
 chips from Qualcomm alone. Overall, the Qualcomm platforms once again
 make up the majority of all changes, after a couple of quieter releases.
 
 The new SoCs in this branch are:
 
  - Microchip sama7d65 is a new 32-bit embedded chip with a single
    Cortex-A7 and the current high end of the old Atmel SoC line.
 
  - Samsung Exynos 9810 is a mobile phone chip used in some older
    phones like the Samsung Galaxy S9
 
  - Renesas R-Car V4H ES3.0 (R8A779G3) is an updated version of
    the V4H (R8A779G0) low-power automotive SoC
 
  - Renesas RZ/G3E (R0A09G047) is a family of embedded chips
    using Cortex-A55 cores
 
  - Qualcomm Snapdragon 8 Elite (SM8750) is a new phone chip based on
    Qualcomm's Oryon CPU cores.
 
  - Qualcomm Snapdragon AR2 (SAR2130P) is a SoC for augmented reality
    glasses.
 
  - Qualcomm IQ6 (QCS610) and IQ8 (QCS8300) are two industrial
    IOT platforms.
 
  - Snapdragon 425 (MSM8917) is a mobile phone SoC from 2016
 
  - Qualcomm IPQ5424 is a Wi-Fi 7 networking chip
 
 All of the above are part of already supported SoC families that
 only need new devicetree files. Two additional SoCs in new
 families are part of a separate branch.
 
 There are 48 new machines in total, including six arm32 ones based
 on aspeed. broadcom, microchip and st SoCs all using Cortex-A7 cores,
 and a single risc-v board, the Banana Pi R3.
 
 The remaining ones use arm64 chips from Broadcom, Samsung, NXP, Mediatek,
 Qualcomm, Renesas and Rockchips and cover development boards, phones,
 laptops, industrial machines routers.
 
 A lot of ongoing work is for cleaning up build time warnings and other
 issues, in addition to the new machines and added features.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmeSdLQACgkQYKtH/8kJ
 UicovRAA0fABVQ8Fl45/NNaBGfXYagXptCSTGOFsdKJ49LVF4uLfWtL+0ENx5Ck5
 PJjr0n9kMNWqeJDiaaQtW21HhYxGxcz3MJEj60/C+D0QNQExPVROUHNy1aggxjNI
 qHf0DnTLAWzjtD0YdCmiI6JCDRdPIRQi2IJymAu7tlooc809PG15bbo6PpIYginC
 1U6cYtuyBuE/9ku2FgWX6E4T0aRjPyaR8thg9VAIsKsugdH3v9EdtLC/MUqOBHMt
 30PyghR9+r1LxQzOC/q7TFcPmnUb74fSPW85X7a5KXv53K6MeRXtRhnetts08R7Z
 iZCJi2ORO100RX7plAzxtF+CWI8eO3bVzibTcZmgxP/Is6CmrlnTcPzOFvqfyx1E
 AfeyEGA7XofjFwPJcc9bCQc3r2w90FpsKqtlaBAn2Od+1EUuuAAgUcjrNyNJqlkp
 8Vos0FxNOOnYULjndYZqa6MslBuxNXYtNj0Ph1/fpzUWKwo+x8LWy8Xb9a5Sdz0H
 OsPVWbumrXlG1rcNMFu8yPzKOBgO0t8on5MRwW+1Xmf1lcQNzJWeGqTzsFPObREV
 Ar7evGEgSb8qladOtzbg645wIezWIXpSJUICQhilxV8DUO+IYuMz668QoZZP40V5
 uHdWxFGdNe1cm5JAsjjwCeFNk/Pbro1+ojc4E6//MRp+WCgdPQ0=
 =vdmR
 -----END PGP SIGNATURE-----

Merge tag 'soc-dt-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC devicetree updates from Arnd Bergmann:
 "We see the addition of eleven new SoCs, including a total of sixx
  arm64 chips from Qualcomm alone. Overall, the Qualcomm platforms once
  again make up the majority of all changes, after a couple of quieter
  releases.

  The new SoCs in this branch are:

   - Microchip sama7d65 is a new 32-bit embedded chip with a single
     Cortex-A7 and the current high end of the old Atmel SoC line.

   - Samsung Exynos 9810 is a mobile phone chip used in some older
     phones like the Samsung Galaxy S9

   - Renesas R-Car V4H ES3.0 (R8A779G3) is an updated version of the V4H
     (R8A779G0) low-power automotive SoC

   - Renesas RZ/G3E (R0A09G047) is a family of embedded chips using
     Cortex-A55 cores

   - Qualcomm Snapdragon 8 Elite (SM8750) is a new phone chip based on
     Qualcomm's Oryon CPU cores.

   - Qualcomm Snapdragon AR2 (SAR2130P) is a SoC for augmented reality
     glasses.

   - Qualcomm IQ6 (QCS610) and IQ8 (QCS8300) are two industrial IOT
     platforms.

   - Snapdragon 425 (MSM8917) is a mobile phone SoC from 2016

   - Qualcomm IPQ5424 is a Wi-Fi 7 networking chip

  All of the above are part of already supported SoC families that only
  need new devicetree files. Two additional SoCs in new families are
  part of a separate branch.

  There are 48 new machines in total, including six arm32 ones based on
  aspeed. broadcom, microchip and st SoCs all using Cortex-A7 cores, and
  a single risc-v board, the Banana Pi R3.

  The remaining ones use arm64 chips from Broadcom, Samsung, NXP,
  Mediatek, Qualcomm, Renesas and Rockchips and cover development
  boards, phones, laptops, industrial machines routers.

 A lot of ongoing work is for cleaning up build time warnings and other
 issues, in addition to the new machines and added features"

* tag 'soc-dt-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (619 commits)
  arm64: tegra: Fix Tegra234 PCIe interrupt-map
  arm64: dts: qcom: x1e80100-romulus: Update firmware nodes
  arm64: dts: rockchip: add DTs for Firefly ITX-3588J and its Core-3588J SoM
  dt-bindings: arm: rockchip: Add Firefly ITX-3588J board
  arm64: dts: rockchip: Add Orange Pi 5 Max board
  dt-bindings: arm: rockchip: Add Xunlong Orange Pi 5 Max
  arm64: dts: rockchip: refactor common rk3588-orangepi-5.dtsi
  arm64: dts: rockchip: add WLAN to rk3588-evb1 controller
  arm64: dts: rockchip: increase gmac rx_delay on rk3399-puma
  arm64: dts: rockchip: Delete redundant RK3328 GMAC stability fixes
  arm64: tegra: Disable Tegra234 sce-fabric node
  arm64: tegra: Fix typo in Tegra234 dce-fabric compatible
  arm64: tegra: Fix DMA ID for SPI2
  arm64: dts: qcom: msm8916-samsung-serranove: Add display panel
  arm64: dts: qcom: sm8650: Add 'global' interrupt to the PCIe RC nodes
  arm64: dts: qcom: sm8550: Add 'global' interrupt to the PCIe RC nodes
  arm64: dts: qcom: Remove unused and undocumented properties
  arm64: dts: qcom: sdm450-lenovo-tbx605f: add DSI panel nodes
  arm64: dts: qcom: pmi8950: add LAB-IBB nodes
  arm64: dts: qcom: ipq5424: enable the download mode support
  ...
2025-01-24 14:48:03 -08:00
Linus Torvalds
4e517a6acd soc: new SoC support for 6.14
Two new SoC families are added here, with devicetree files and
 a little bit of infrastructure to allow booting:
 
  - Blaize BLZP1600 is an AI chip using custom GSP (Graph Streaming
    Processor) cores for computation, and two small Cortex-A53 cores
    that run the operating system.
 
  - SpacemiT K1 is a 64-bit RISC-V chip, using eight custom RVA22
    compatible CPU cores with vector support.
    Also marketed at AI applications, it has a much slower NPU compared
    to BLZP1600, but in turn focuses on the CPU performance
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmeSd7cACgkQYKtH/8kJ
 UienxxAAoUMAWex/k5gWfg3DdvDrleXydjYzIm6DBIad1kxHYy/gvEuQAgV9vxqb
 tQ763NKRLekmwv+CX+S47PjrMHgqqwtch5If8ixd50gYSZPukJmuCTkthMu1qBb9
 /zi0LXg3cViPQBAY0MwCQMzukRHsqmBzaUSle6bM1yDHhHAmcz88Iqs/LttJOnYk
 M/+ybxYoUdVmPI+Nt3e8hxlecfU2oM0S1hoStvdsA0fpuLfEBF8WpF7DnkSphT10
 JqVysB3FTF0PYXyBodcuh37iCCObAFVAUT/ZvKO3+Qo/rI2IseP8lSeAV84JP/5S
 w4MqpGhFecICBEWJ9h0FnbSI2NyON8uz23R+va6YAtTxuPL2Loq4HwvxXh5H5imC
 sxZ0SEnkwFQ7DJ10nV00FXlmKZ7+Ax0WLoboNHWYxdAfqhgcZ8q8gi/+WTETTBog
 jnTnWjscuEA5QTNfOP0AqUtVbWQOj09wLsiceJc8G5jpbe9WaZJy1yzXaCNu2hZV
 2x+2UT31Y6G2Poxee8Ti4u1ljieJh5BmD1Ts81P8EpAC0JsVyJ3HtouYeBHcBPoj
 HXmdGDPN8bS0Ugu7iFu3GO+TPcBzUw1ZND8L3QbBesb/oFK0455ARkVXO2nazP4p
 4s/iGWbGAKIyKPOdYGaiSXmbXurx99pOVY6G8ccqkNf+17MfTgE=
 =clzA
 -----END PGP SIGNATURE-----

Merge tag 'soc-new-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull new SoC support from Arnd Bergmann:
 "Two new SoC families are added here, with devicetree files and a
  little bit of infrastructure to allow booting:

   - Blaize BLZP1600 is an AI chip using custom GSP (Graph Streaming
     Processor) cores for computation, and two small Cortex-A53 cores
     that run the operating system.

   - SpacemiT K1 is a 64-bit RISC-V chip, using eight custom RVA22
     compatible CPU cores with vector support.

     Also marketed at AI applications, it has a much slower NPU compared
     to BLZP1600, but in turn focuses on the CPU performance"

* tag 'soc-new-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  riscv: dts: spacemit: move aliases to board dts
  riscv: dts: spacemit: add pinctrl property to uart0 in BPI-F3
  riscv: defconfig: enable SpacemiT SoC
  riscv: dts: spacemit: add Banana Pi BPI-F3 board device tree
  riscv: dts: add initial SpacemiT K1 SoC device tree
  riscv: add SpacemiT SoC family Kconfig support
  dt-bindings: serial: 8250: Add SpacemiT K1 uart compatible
  dt-bindings: interrupt-controller: Add SpacemiT K1 PLIC
  dt-bindings: timer: Add SpacemiT K1 CLINT
  dt-bindings: riscv: add SpacemiT K1 bindings
  dt-bindings: riscv: Add SpacemiT X60 compatibles
  MAINTAINERS: setup support for SpacemiT SoC tree
  MAINTAINER: Add entry for Blaize SoC
  arm64: defconfig: Enable Blaize BLZP1600 platform
  arm64: dts: Add initial support for Blaize BLZP1600 CB2
  arm64: Add Blaize BLZP1600 SoC family
  dt-bindings: arm: blaize: Add Blaize BLZP1600 SoC
  dt-bindings: Add Blaize vendor prefix
2025-01-24 14:31:06 -08:00
Linus Torvalds
37b33c68b0 CRC updates for 6.14
- Reorganize the architecture-optimized CRC32 and CRC-T10DIF code to be
   directly accessible via the library API, instead of requiring the
   crypto API.  This is much simpler and more efficient.
 
 - Convert some users such as ext4 to use the CRC32 library API instead
   of the crypto API.  More conversions like this will come later.
 
 - Add a KUnit test that tests and benchmarks multiple CRC variants.
   Remove older, less-comprehensive tests that are made redundant by
   this.
 
 - Add an entry to MAINTAINERS for the kernel's CRC library code.  I'm
   volunteering to maintain it.  I have additional cleanups and
   optimizations planned for future cycles.
 
 These patches have been in linux-next since -rc1.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCZ418ZRQcZWJpZ2dlcnNA
 Z29vZ2xlLmNvbQAKCRDzXCl4vpKOKyJYAP9kBlpm8W9/XY6N8SpjKaXE/vKQYHQl
 Nobhak06Us8uJwEAkcUTymWP4IwQj5A9jgBAPRw53FQcNVKIc+01C7gRHw0=
 =mqSH
 -----END PGP SIGNATURE-----

Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux

Pull CRC updates from Eric Biggers:

 - Reorganize the architecture-optimized CRC32 and CRC-T10DIF code to be
   directly accessible via the library API, instead of requiring the
   crypto API. This is much simpler and more efficient.

 - Convert some users such as ext4 to use the CRC32 library API instead
   of the crypto API. More conversions like this will come later.

 - Add a KUnit test that tests and benchmarks multiple CRC variants.
   Remove older, less-comprehensive tests that are made redundant by
   this.

 - Add an entry to MAINTAINERS for the kernel's CRC library code. I'm
   volunteering to maintain it. I have additional cleanups and
   optimizations planned for future cycles.

* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (31 commits)
  MAINTAINERS: add entry for CRC library
  powerpc/crc: delete obsolete crc-vpmsum_test.c
  lib/crc32test: delete obsolete crc32test.c
  lib/crc16_kunit: delete obsolete crc16_kunit.c
  lib/crc_kunit.c: add KUnit test suite for CRC library functions
  powerpc/crc-t10dif: expose CRC-T10DIF function through lib
  arm64/crc-t10dif: expose CRC-T10DIF function through lib
  arm/crc-t10dif: expose CRC-T10DIF function through lib
  x86/crc-t10dif: expose CRC-T10DIF function through lib
  crypto: crct10dif - expose arch-optimized lib function
  lib/crc-t10dif: add support for arch overrides
  lib/crc-t10dif: stop wrapping the crypto API
  scsi: target: iscsi: switch to using the crc32c library
  f2fs: switch to using the crc32 library
  jbd2: switch to using the crc32c library
  ext4: switch to using the crc32c library
  lib/crc32: make crc32c() go directly to lib
  bcachefs: Explicitly select CRYPTO from BCACHEFS_FS
  x86/crc32: expose CRC32 functions through lib
  x86/crc32: update prototype for crc32_pclmul_le_16()
  ...
2025-01-22 19:55:08 -08:00
Linus Torvalds
2e04247f7c ftrace updates for v6.14:
- Have fprobes built on top of function graph infrastructure
 
   The fprobe logic is an optimized kprobe that uses ftrace to attach to
   functions when a probe is needed at the start or end of the function. The
   fprobe and kretprobe logic implements a similar method as the function
   graph tracer to trace the end of the function. That is to hijack the
   return address and jump to a trampoline to do the trace when the function
   exits. To do this, a shadow stack needs to be created to store the
   original return address.  Fprobes and function graph do this slightly
   differently. Fprobes (and kretprobes) has slots per callsite that are
   reserved to save the return address. This is fine when just a few points
   are traced. But users of fprobes, such as BPF programs, are starting to add
   many more locations, and this method does not scale.
 
   The function graph tracer was created to trace all functions in the
   kernel. In order to do this, when function graph tracing is started, every
   task gets its own shadow stack to hold the return address that is going to
   be traced. The function graph tracer has been updated to allow multiple
   users to use its infrastructure. Now have fprobes be one of those users.
   This will also allow for the fprobe and kretprobe methods to trace the
   return address to become obsolete. With new technologies like CFI that
   need to know about these methods of hijacking the return address, going
   toward a solution that has only one method of doing this will make the
   kernel less complex.
 
 - Cleanup with guard() and free() helpers
 
   There were several places in the code that had a lot of "goto out" in the
   error paths to either unlock a lock or free some memory that was
   allocated. But this is error prone. Convert the code over to use the
   guard() and free() helpers that let the compiler unlock locks or free
   memory when the function exits.
 
 - Remove disabling of interrupts in the function graph tracer
 
   When function graph tracer was first introduced, it could race with
   interrupts and NMIs. To prevent that race, it would disable interrupts and
   not trace NMIs. But the code has changed to allow NMIs and also
   interrupts. This change was done a long time ago, but the disabling of
   interrupts was never removed. Remove the disabling of interrupts in the
   function graph tracer is it is not needed. This greatly improves its
   performance.
 
 - Allow the :mod: command to enable tracing module functions on the kernel
   command line.
 
   The function tracer already has a way to enable functions to be traced in
   modules by writing ":mod:<module>" into set_ftrace_filter. That will
   enable either all the functions for the module if it is loaded, or if it
   is not, it will cache that command, and when the module is loaded that
   matches <module>, its functions will be enabled. This also allows init
   functions to be traced. But currently events do not have that feature.
 
   Because enabling function tracing can be done very early at boot up
   (before scheduling is enabled), the commands that can be done when
   function tracing is started is limited. Having the ":mod:" command to
   trace module functions as they are loaded is very useful. Update the
   kernel command line function filtering to allow it.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZ42E2RQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qqXSAPwOMxuhye8tb1GYG62QD9+w7e6nOmlC
 2GCPj4detnEM2QD/ciivkhespVKhHpZHRewAuSnJgHPSM45NQ3EVESzjWQ4=
 =snbx
 -----END PGP SIGNATURE-----

Merge tag 'ftrace-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull ftrace updates from Steven Rostedt:

 - Have fprobes built on top of function graph infrastructure

   The fprobe logic is an optimized kprobe that uses ftrace to attach to
   functions when a probe is needed at the start or end of the function.
   The fprobe and kretprobe logic implements a similar method as the
   function graph tracer to trace the end of the function. That is to
   hijack the return address and jump to a trampoline to do the trace
   when the function exits. To do this, a shadow stack needs to be
   created to store the original return address. Fprobes and function
   graph do this slightly differently. Fprobes (and kretprobes) has
   slots per callsite that are reserved to save the return address. This
   is fine when just a few points are traced. But users of fprobes, such
   as BPF programs, are starting to add many more locations, and this
   method does not scale.

   The function graph tracer was created to trace all functions in the
   kernel. In order to do this, when function graph tracing is started,
   every task gets its own shadow stack to hold the return address that
   is going to be traced. The function graph tracer has been updated to
   allow multiple users to use its infrastructure. Now have fprobes be
   one of those users. This will also allow for the fprobe and kretprobe
   methods to trace the return address to become obsolete. With new
   technologies like CFI that need to know about these methods of
   hijacking the return address, going toward a solution that has only
   one method of doing this will make the kernel less complex.

 - Cleanup with guard() and free() helpers

   There were several places in the code that had a lot of "goto out" in
   the error paths to either unlock a lock or free some memory that was
   allocated. But this is error prone. Convert the code over to use the
   guard() and free() helpers that let the compiler unlock locks or free
   memory when the function exits.

 - Remove disabling of interrupts in the function graph tracer

   When function graph tracer was first introduced, it could race with
   interrupts and NMIs. To prevent that race, it would disable
   interrupts and not trace NMIs. But the code has changed to allow NMIs
   and also interrupts. This change was done a long time ago, but the
   disabling of interrupts was never removed. Remove the disabling of
   interrupts in the function graph tracer is it is not needed. This
   greatly improves its performance.

 - Allow the :mod: command to enable tracing module functions on the
   kernel command line.

   The function tracer already has a way to enable functions to be
   traced in modules by writing ":mod:<module>" into set_ftrace_filter.
   That will enable either all the functions for the module if it is
   loaded, or if it is not, it will cache that command, and when the
   module is loaded that matches <module>, its functions will be
   enabled. This also allows init functions to be traced. But currently
   events do not have that feature.

   Because enabling function tracing can be done very early at boot up
   (before scheduling is enabled), the commands that can be done when
   function tracing is started is limited. Having the ":mod:" command to
   trace module functions as they are loaded is very useful. Update the
   kernel command line function filtering to allow it.

* tag 'ftrace-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (26 commits)
  ftrace: Implement :mod: cache filtering on kernel command line
  tracing: Adopt __free() and guard() for trace_fprobe.c
  bpf: Use ftrace_get_symaddr() for kprobe_multi probes
  ftrace: Add ftrace_get_symaddr to convert fentry_ip to symaddr
  Documentation: probes: Update fprobe on function-graph tracer
  selftests/ftrace: Add a test case for repeating register/unregister fprobe
  selftests: ftrace: Remove obsolate maxactive syntax check
  tracing/fprobe: Remove nr_maxactive from fprobe
  fprobe: Add fprobe_header encoding feature
  fprobe: Rewrite fprobe on function-graph tracer
  s390/tracing: Enable HAVE_FTRACE_GRAPH_FUNC
  ftrace: Add CONFIG_HAVE_FTRACE_GRAPH_FUNC
  bpf: Enable kprobe_multi feature if CONFIG_FPROBE is enabled
  tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS
  tracing: Add ftrace_fill_perf_regs() for perf event
  tracing: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs
  fprobe: Use ftrace_regs in fprobe exit handler
  fprobe: Use ftrace_regs in fprobe entry handler
  fgraph: Pass ftrace_regs to retfunc
  fgraph: Replace fgraph_ret_regs with ftrace_regs
  ...
2025-01-21 15:15:28 -08:00
Linus Torvalds
4c551165e7 Updates for the interrupt subsystem:
- Consolidation of the machine_kexec_mask_interrupts() by providing a
     generic implementation and replacing the copy & pasta orgy in the
     relevant architectures.
 
   - Prevent unconditional operations on interrupt chips during kexec
     shutdown, which can trigger warnings in certain cases when the
     underlying interrupt has been shut down before.
 
   - Make the enforcement of interrupt handling in interrupt context
     unconditionally available, so that it actually works for non x86
     related interrupt chips. The earlier enablement for ARM GIC chips set
     the required chip flag, but did not notice that the check was hidden
     behind a config switch which is not selected by ARM[64].
 
   - Decrapify the handling of deferred interrupt affinity setting. Some
     interrupt chips require that affinity changes are made from the context
     of handling an interrupt to avoid certain race conditions. For x86 this
     was the default, but with interrupt remapping this requirement was
     lifted and a flag was introduced which tells the core code that
     affinity changes can be done in any context. Unrestricted affinity
     changes are the default for the majority of interrupt chips. RISCV has
     the requirement to add the deferred mode to one of it's interrupt
     controllers, but with the original implementation this would require to
     add the any context flag to all other RISC-V interrupt chips. That's
     backwards, so reverse the logic and require that chips, which need the
     deferred mode have to be marked accordingly. That avoids chasing the
     'sane' chips and marking them.
 
   - Add multi-node support to the Loongarch AVEC interrupt controller
     driver.
 
   - The usual tiny cleanups, fixes and improvements all over the place.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmePkVITHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoRbQD/9bHVph/V9Ekl7JAX3aY4gG4JbRhOc7
 dp1VAcHRhktRfoTztYRbjsbMu2nvZ58GKA8bkOS2jHSF/m3PbkIJfOhwk0YdIAoa
 +kdy5yDgqCGfkqW43DN4Cr+CnzGjWMitw67tFp3fhwehMDpDjdt2L28IjtanSS0f
 hO6FV7o65MWeJwxk4Isb2/nvkO+X23Lrp6RrWS8SXBnF9FFXxiPIg/fiOPTizhCh
 1W/bSGxLLb9WwsVzmlGAKVFlXDij0QGaIUug2fdVZ63OsELXD7tJrLSPG133yk92
 ppIa0s6BT4IBsfM00us4hG15PkLuJmP3yWWcoquG0rP8Wq58VOXiN6+rcJIyvB+5
 mWceTH6IKfZGoRQKwXC7BxeBAIb147reiJtb06meq1/8ADIvzafiNy0c8x9i/UaV
 QiyhPVENjaGCGDomZmJQqN7Yb02Wge1k8InQnodDrHxZNl/bX/B1Z8Bxd0n6hPHg
 NSJXYif2AxgaddpohsdygqRDbT6SNyQdj7YjJFY5qAGJ3yFyJ4JB6WTqkWW4o1vH
 3FVqdAnJmejAmmYSkah0Hkem2T5QASQmTWb93PLxiV6q+d0NM8stWAujjyVdIV/B
 W4Uj9mQ20cz54TjLtxqX+A1k6KcqOWRgh1l2QbUlFsgsOP3V8yz47yqYdR9qMWlO
 9kNEjI3sw+G/IQ==
 =q4rj
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull interrupt subsystem updates from Thomas Gleixner:

 - Consolidate the machine_kexec_mask_interrupts() by providing a
   generic implementation and replacing the copy & pasta orgy in the
   relevant architectures.

 - Prevent unconditional operations on interrupt chips during kexec
   shutdown, which can trigger warnings in certain cases when the
   underlying interrupt has been shut down before.

 - Make the enforcement of interrupt handling in interrupt context
   unconditionally available, so that it actually works for non x86
   related interrupt chips. The earlier enablement for ARM GIC chips set
   the required chip flag, but did not notice that the check was hidden
   behind a config switch which is not selected by ARM[64].

 - Decrapify the handling of deferred interrupt affinity setting.

   Some interrupt chips require that affinity changes are made from the
   context of handling an interrupt to avoid certain race conditions.
   For x86 this was the default, but with interrupt remapping this
   requirement was lifted and a flag was introduced which tells the core
   code that affinity changes can be done in any context. Unrestricted
   affinity changes are the default for the majority of interrupt chips.

   RISCV has the requirement to add the deferred mode to one of it's
   interrupt controllers, but with the original implementation this
   would require to add the any context flag to all other RISC-V
   interrupt chips. That's backwards, so reverse the logic and require
   that chips, which need the deferred mode have to be marked
   accordingly. That avoids chasing the 'sane' chips and marking them.

 - Add multi-node support to the Loongarch AVEC interrupt controller
   driver.

 - The usual tiny cleanups, fixes and improvements all over the place.

* tag 'irq-core-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/generic_chip: Export irq_gc_mask_disable_and_ack_set()
  genirq/timings: Add kernel-doc for a function parameter
  genirq: Remove IRQ_MOVE_PCNTXT and related code
  x86/apic: Convert to IRQCHIP_MOVE_DEFERRED
  genirq: Provide IRQCHIP_MOVE_DEFERRED
  hexagon: Remove GENERIC_PENDING_IRQ leftover
  ARC: Remove GENERIC_PENDING_IRQ
  genirq: Remove handle_enforce_irqctx() wrapper
  genirq: Make handle_enforce_irqctx() unconditionally available
  irqchip/loongarch-avec: Add multi-nodes topology support
  irqchip/ts4800: Replace seq_printf() by seq_puts()
  irqchip/ti-sci-inta : Add module build support
  irqchip/ti-sci-intr: Add module build support
  irqchip/irq-brcmstb-l2: Replace brcmstb_l2_mask_and_ack() by generic function
  irqchip: keystone: Use syscon_regmap_lookup_by_phandle_args
  genirq/kexec: Prevent redundant IRQ masking by checking state before shutdown
  kexec: Consolidate machine_kexec_mask_interrupts() implementation
  genirq: Reuse irq_thread_fn() for forced thread case
  genirq: Move irq_thread_fn() further up in the code
2025-01-21 13:51:07 -08:00
Valentina Fernandez
4783ce32b0 riscv: export __cpuid_to_hartid_map
EXPORT_SYMBOL_GPL() is missing for __cpuid_to_hartid_map array.
Export this symbol to allow drivers compiled as modules to use
cpuid_to_hartid_map().

Signed-off-by: Valentina Fernandez <valentina.fernandezalanis@microchip.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2025-01-20 10:25:11 -06:00
Valentina Fernandez
c138285233 riscv: sbi: vendorid_list: Add Microchip Technology to the vendor list
Add Microchip Technology to the RISC-V vendor list.

Signed-off-by: Valentina Fernandez <valentina.fernandezalanis@microchip.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2025-01-20 10:25:05 -06:00
Paolo Bonzini
43f640f4b9 KVM/riscv changes for 6.14
- Svvptc, Zabha, and Ziccrse extension support for Guest/VM
 - Virtualize SBI system suspend extension for Guest/VM
 - Trap related exit statstics as SBI PMU firmware counters for Guest/VM
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmeJ3zYACgkQrUjsVaLH
 LAdd4hAAj7MDKBXFfkgrg6M5b3CgjFp/1FbiQQe+gJNEOrDdWJy115imk2IKbEZq
 MAHLF1nlavmi3ZxTCGRPamyyVej5A0+oNujl+ktltHXIKPqXVcJvz5Iu15uDJNZf
 Ow6Vz74HONl3UKH1H029hFA8oXBTzlxyLv8EXfUJ/HZ/bUeQAfgX52pdvTZ54o4Z
 gJy+dODD7LOIh9WSlqsrqRcaZkQ4uWZNzQVuXQyubon/AhyqLZyoX/Kx+eLg0QSG
 LwwXQA5ew4fG3heOWGSSVhiBeqKj7Kmk7l3tyWe6f/YUw7BP1EEwY6ufdH2jy6Z3
 mEE/3+mYhkBvr6sWCMPgGt8IMGqe6vRPE1SjPRk3xjzQN4a5aSGXA3J2bBHZhdgt
 ksGKD0CNhFv/E9LyeQnIylqQqNL9IIb35hrRmVCGzeOJB5PhqDRZiuyyz4LMOgY9
 1uY6c5fuSxV0IIZV2Y4oTxZ26dCBkGzR5rrVBSakSF1xWfU+0rzX91FslEUzPyDO
 W+RcG9ziFTdCbYkTGlt6sSiXRwkYe/TD6VpDgsxbQECgW/9Itg2BSCZojuy7Lfo9
 idhjHLIouruGyQKrJmadUdOuHzLOCX8XMo1oTjlrPudNoILC6GmZs+X7xUUJ6Fzi
 mOgxcUBsByzZLhyhPnbwS0o0D7La7HJbuNne8VSHfEjPhr8/yl0=
 =EQ0o
 -----END PGP SIGNATURE-----

Merge tag 'kvm-riscv-6.14-1' of https://github.com/kvm-riscv/linux into HEAD

KVM/riscv changes for 6.14

- Svvptc, Zabha, and Ziccrse extension support for Guest/VM
- Virtualize SBI system suspend extension for Guest/VM
- Trap related exit statstics as SBI PMU firmware counters for Guest/VM
2025-01-20 07:01:17 -05:00
Yunhui Cui
b6de116e46
riscv/mm/fault: add show_pte() before die()
When the kernel displays "Unable to handle kernel paging request at
virtual address", we would like to confirm the status of the virtual
address in the page table. So add show_pte() before die().

Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240723021820.87718-1-cuiyunhui@bytedance.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-18 13:29:23 -08:00
Palmer Dabbelt
2613c15b0c
Merge patch series "riscv: Add support for xtheadvector"
Charlie Jenkins <charlie@rivosinc.com> says:

xtheadvector is a custom extension that is based upon riscv vector
version 0.7.1 [1]. All of the vector routines have been modified to
support this alternative vector version based upon whether xtheadvector
was determined to be supported at boot.

vlenb is not supported on the existing xtheadvector hardware, so a
devicetree property thead,vlenb is added to provide the vlenb to Linux.

There is a new hwprobe key RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0 that is
used to request which thead vendor extensions are supported on the
current platform. This allows future vendors to allocate hwprobe keys
for their vendor.

Support for xtheadvector is also added to the vector kselftests.

[1] 95358cb2cc/xtheadvector.adoc

* b4-shazam-merge:
  riscv: Add ghostwrite vulnerability
  selftests: riscv: Support xtheadvector in vector tests
  selftests: riscv: Fix vector tests
  riscv: hwprobe: Document thead vendor extensions and xtheadvector extension
  riscv: hwprobe: Add thead vendor extension probing
  riscv: vector: Support xtheadvector save/restore
  riscv: Add xtheadvector instruction definitions
  riscv: csr: Add CSR encodings for CSR_VXRM/CSR_VXSAT
  RISC-V: define the elements of the VCSR vector CSR
  riscv: vector: Use vlenb from DT for thead
  riscv: Add thead and xtheadvector as a vendor extension
  riscv: dts: allwinner: Add xtheadvector to the D1/D1s devicetree
  dt-bindings: cpus: add a thead vlen register length property
  dt-bindings: riscv: Add xtheadvector ISA extension description

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20241113-xtheadvector-v11-0-236c22791ef9@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-18 12:33:43 -08:00
Charlie Jenkins
4bf9706923
riscv: Add ghostwrite vulnerability
Follow the patterns of the other architectures that use
GENERIC_CPU_VULNERABILITIES for riscv to introduce the ghostwrite
vulnerability and mitigation. The mitigation is to disable all vector
which is accomplished by clearing the bit from the cpufeature field.

Ghostwrite only affects thead c9xx CPUs that impelment xtheadvector, so
the vulerability will only be mitigated on these CPUs.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Yangyu Chen <cyy@cyyself.name>
Link: https://lore.kernel.org/r/20241113-xtheadvector-v11-14-236c22791ef9@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-18 12:33:39 -08:00
Charlie Jenkins
a5ea53da65
riscv: hwprobe: Add thead vendor extension probing
Add a new hwprobe key "RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0" which
allows userspace to probe for the new RISCV_ISA_VENDOR_EXT_XTHEADVECTOR
vendor extension.

This new key will allow userspace code to probe for which thead vendor
extensions are supported. This API is modeled to be consistent with
RISCV_HWPROBE_KEY_IMA_EXT_0. The bitmask returned will have each bit
corresponding to a supported thead vendor extension of the cpumask set.
Just like RISCV_HWPROBE_KEY_IMA_EXT_0, this allows a userspace program
to determine all of the supported thead vendor extensions in one call.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Evan Green <evan@rivosinc.com>
Tested-by: Yangyu Chen <cyy@cyyself.name>
Link: https://lore.kernel.org/r/20241113-xtheadvector-v11-10-236c22791ef9@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-18 12:33:35 -08:00
Charlie Jenkins
d863910eab
riscv: vector: Support xtheadvector save/restore
Use alternatives to add support for xtheadvector vector save/restore
routines.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Yangyu Chen <cyy@cyyself.name>
Link: https://lore.kernel.org/r/20241113-xtheadvector-v11-9-236c22791ef9@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-18 12:33:33 -08:00
Charlie Jenkins
01e3313e34
riscv: Add xtheadvector instruction definitions
xtheadvector uses different encodings than standard vector for
vsetvli and vector loads/stores. Write the instruction formats to be
used in assembly code.

Co-developed-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Yangyu Chen <cyy@cyyself.name>
Link: https://lore.kernel.org/r/20241113-xtheadvector-v11-8-236c22791ef9@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-18 12:33:32 -08:00
Charlie Jenkins
b9a9314424
riscv: csr: Add CSR encodings for CSR_VXRM/CSR_VXSAT
The VXRM vector csr for xtheadvector has an encoding of 0xa and VXSAT
has an encoding of 0x9.

Co-developed-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Yangyu Chen <cyy@cyyself.name>
Link: https://lore.kernel.org/r/20241113-xtheadvector-v11-7-236c22791ef9@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-18 12:33:31 -08:00
Heiko Stuebner
66f197785d
RISC-V: define the elements of the VCSR vector CSR
The VCSR CSR contains two elements VXRM[2:1] and VXSAT[0].

Define constants for those to access the elements in a readable way.

Acked-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Yangyu Chen <cyy@cyyself.name>
Link: https://lore.kernel.org/r/20241113-xtheadvector-v11-6-236c22791ef9@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-18 12:33:30 -08:00
Charlie Jenkins
377be47f90
riscv: vector: Use vlenb from DT for thead
If thead,vlenb is provided in the device tree, prefer that over reading
the vlenb csr.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Yangyu Chen <cyy@cyyself.name>
Link: https://lore.kernel.org/r/20241113-xtheadvector-v11-5-236c22791ef9@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-18 12:33:29 -08:00
Charlie Jenkins
cddd63869f
riscv: Add thead and xtheadvector as a vendor extension
Add support to the kernel for THead vendor extensions with the target of
the new extension xtheadvector.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Yangyu Chen <cyy@cyyself.name>
Link: https://lore.kernel.org/r/20241113-xtheadvector-v11-4-236c22791ef9@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-18 12:33:28 -08:00
Charlie Jenkins
ce1daeeba6
riscv: dts: allwinner: Add xtheadvector to the D1/D1s devicetree
The D1/D1s SoCs support xtheadvector so it can be included in the
devicetree. Also include vlenb for the cpu.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Yangyu Chen <cyy@cyyself.name>
Link: https://lore.kernel.org/r/20241113-xtheadvector-v11-3-236c22791ef9@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-18 12:33:27 -08:00
Palmer Dabbelt
9d87cf525f
RISC-V: Mark riscv_v_init() as __init
This trips up with Xtheadvector enabled, but as far as I can tell it's
just been an issue since the original patchset.

Fixes: 7ca7a7b9b6 ("riscv: Add sysctl to set the default vector rule for new processes")
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20250115180251.31444-1-palmer@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-18 12:33:25 -08:00
Yixun Lan
3d72d603af riscv: dts: spacemit: move aliases to board dts
aliases info should belong to board dts, instead of
putting it at SoC dtsi file.

Fixes: d8fe646919 ("riscv: dts: add initial SpacemiT K1 SoC device tree")
Link: https://lore.kernel.org/all/6a8bb914-858e-479d-a7d9-09e0ff688160@app.fastmail.com
Signed-off-by: Yixun Lan <dlan@gentoo.org>
2025-01-17 08:05:42 +08:00
Yixun Lan
3579b3506f riscv: dts: spacemit: add pinctrl property to uart0 in BPI-F3
Before pinctrl driver implemented, the uart0 controller reply on
bootloader for setting correct pin mux and configurations.

Now, let's add pinctrl property to uart0 of Bananapi-F3 board.

Signed-off-by: Yixun Lan <dlan@gentoo.org>
2025-01-17 07:53:52 +08:00
Yangyu Chen
21bef40ad1 riscv: defconfig: enable SpacemiT SoC
Enable SpacemiT SoC config in defconfig to allow the default upstream
kernel booting on Banana Pi BPI-F3 board.

Signed-off-by: Yangyu Chen <cyy@cyyself.name>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Jesse Taube <jesse@rivosinc.com>
Signed-off-by: Yixun Lan <dlan@gentoo.org>
2025-01-17 07:53:52 +08:00
Yangyu Chen
d60d57ab6b riscv: dts: spacemit: add Banana Pi BPI-F3 board device tree
Banana Pi BPI-F3 [1] is a industrial grade RISC-V development board, it
design with SpacemiT K1 8 core RISC-V chip [2].

Currently only support booting into console with only uart enabled,
other features will be added soon later.

Link: https://docs.banana-pi.org/en/BPI-F3/BananaPi_BPI-F3 [1]
Link: https://www.spacemit.com/en/spacemit-key-stone-2/ [2]
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
Acked-by: Jesse Taube <jesse@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Yixun Lan <dlan@gentoo.org>
2025-01-17 07:53:52 +08:00
Yangyu Chen
d8fe646919 riscv: dts: add initial SpacemiT K1 SoC device tree
Banana Pi BPI-F3 motherboard is powered by SpacemiT K1[1].

Key features:
- 4 cores per cluster, 2 clusters on chip
- UART IP is Intel XScale UART

Some key considerations:
- ISA string is inferred from vendor documentation[2]
- Cluster topology is inferred from datasheet[1] and L2 in vendor dts[3]
- No coherent DMA on this board
    Inferred by taking vendor ethernet and MMC drivers to the mainline
    kernel. Without dma-noncoherent in soc node, the driver fails.
- Add cache nodes
    K1 SoC has 128 sets of 32KiB L1 I/D Cache for each hart, and 512 sets
    of 512KiB L2 Cache for each cluster.

Currently only support booting into console with only uart, other
features will be added soon later.

Link: https://docs.banana-pi.org/en/BPI-F3/SpacemiT_K1_datasheet [1]
Link: https://developer.spacemit.com/#/documentation?token=BWbGwbx7liGW21kq9lucSA6Vnpb [2]
Link: https://gitee.com/bianbu-linux/linux-6.1/blob/bl-v1.0.y/arch/riscv/boot/dts/spacemit/k1-x.dtsi [3]
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
Acked-by: Jesse Taube <jesse@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Yixun Lan <dlan@gentoo.org>
2025-01-17 07:53:52 +08:00
Yangyu Chen
8814aa123a riscv: add SpacemiT SoC family Kconfig support
The first SoC in the SpacemiT series is K1, which contains 8 RISC-V
cores with RISC-V Vector v1.0 support.

Link: https://www.spacemit.com/en/spacemit-key-stone-2/
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Yixun Lan <dlan@gentoo.org>
2025-01-17 07:53:51 +08:00
Arnd Bergmann
a48867bc2f ~RISC-V~ StarFive Devicetrees for v6.14
Not so much RISC-V, but rather StarFive, this time around as there are
 only two changes: the Milk-V Mars and Pine64 Star64 boards get their usb0
 interfaces moved from peripheral to host mode.
 
 Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRh246EGq/8RLhDjO14tDGHoIJi0gUCZ4VhEwAKCRB4tDGHoIJi
 0t1FAQCSFrIxz189bY1TjiyspNuggR2oNuCfg1j3X/piLDbDnQEAlY8+/v53mxUi
 fqn+HjkDWqdR9EnNR1s/sQYTEN3QSwk=
 =zXqg
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmeJKAIACgkQYKtH/8kJ
 UidepxAA4ypf7/f5DTRh7ZlWIzafjQzsryuAMuk25L8A/M7T2mIGfyCnnWpQPjDs
 GvrYXtKo3jsAXRSWQqShDF2zZrRua/UGhh5dlwq6yaX7XL8dUwQm2TxB732gBSa/
 CtI90BA/RyKbXyxeuFURAPla2P68d8sd0vdufp2FuPU9fXDfAOLlpujL7fusxiV4
 f/jJ5LGtf+Q9S2Ba67O+MVORIAfK3zCIL8GavwCturpS2duR9kt0mAPrLCEeP456
 1WnnDFgXDkMSHXsnmrvfWlvhuY8u3papqDzHOB+VhOY39zzycG5BukqPE/gQpl37
 8WNQtpvr/asGXPOBjjM7aBPGQV7181gm68v49QVFITKiBYyY9UBqALGcr+PN8BJH
 Yc6aoe9q7KAQe3nxZJ526WuhJ11I5Ngi7Wkp4U6ku0aHjypp8TSBd95h7wBKhMs0
 jeJSPC1eCNwuJvuvmLU2XW0MJStIPSwcXyym+Xbc83E3NbiXRQqFKC0b6PTg6RPC
 cKEfdXyDDz2MavW5QWXfszObCGUQunne5vcc0sadb3XW2dbZA8W3Sz9s1G6e1h9z
 4IQmyz1KD1oT1DEkZezubn/Hbk0rVgytWo+Z/MNoucZDo6caxyycldb6dcqDy81w
 irVdrYeXyLnj4JV13h14pW6zTNLiamCqahcv9YdU5XOEY4cS2js=
 =hu/V
 -----END PGP SIGNATURE-----

Merge tag 'riscv-dt-for-v6.14' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into soc/dt

~RISC-V~ StarFive Devicetrees for v6.14

Not so much RISC-V, but rather StarFive, this time around as there are
only two changes: the Milk-V Mars and Pine64 Star64 boards get their usb0
interfaces moved from peripheral to host mode.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>

* tag 'riscv-dt-for-v6.14' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
  riscv: dts: starfive: jh7110-milkv-mars: enable usb0 host function
  riscv: dts: starfive: jh7110-pine64-star64: enable usb0 host function

Link: https://lore.kernel.org/r/20250113-kennel-outplayed-21a52a654c36@spud
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-01-16 16:38:42 +01:00
Paolo Bonzini
5cf32aff20 LoongArch KVM changes for v6.14
1. Clear LLBCTL if secondary mmu mapping changed.
 2. Add hypercall service support for usermode VMM.
 
 This is a really small changeset, because the Chinese New Year
 (Spring Festival) is coming. Happy New Year!
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmeFGcIWHGNoZW5odWFj
 YWlAa2VybmVsLm9yZwAKCRAChivD8uImern1D/9AZ8M+0nBAaONZaq2qKLC+RaW6
 KqvFsR1PUUFzVcQZaHh9OZcx5s4EAH12EaxBH68W0o0ejbTUJp8QXT6cmO9bNFj1
 tVaczGACss34kDerrddHisOpimFdaP+ECX4Q43oTc5N7vG6zUu3ijOISnIIxkhHP
 RlX/+5Djw0NoaVAkhEj4v+LkY33z5QnDFNI0OjJiHpDepP2vLQ1FD573pLMeqcGs
 BVYwZv7DP3SnVajSjRhT/r5qjy9EMjrXRLkIIwyjOUArRaPq/Lfg4CTK85e5MZsR
 2GTkdjvh/YArpluRki4FX1cVOwpBbEtC+24/NWB+MPijtnYqMyAoIraZGqJMAzhw
 P6W70A15GvBhlhQmvKNai1oXkdZaaT7XDcbFT706Cwhu7LvcNM8kK7VrPc59WLTR
 uHO+ehJh0DCpBMC2BKH/8sztGx80u7SB4Ph0ytZCK+uYznTMEiBqRup7E/QLLG+1
 EotXv8U4+Bwx/inzMxwi6vR1ZXo0dIDsnvFdSZeA6PC/cSoPzdqCdrXjQT/7HUIu
 DNgcsRVL3LFE+A/sDVGb5/w9UPdQfCdO10bu97FkY37ftqp7LvPTlWJvDZJx+Wle
 KfErCOM1/ZRQ2knzE7fst58auA3ZFNn3jWRkD/0gJ4X1Fgu63VrYeuc4FL7r8ken
 HxKLYOLtD6dOzR5DeA==
 =z2VU
 -----END PGP SIGNATURE-----

Merge tag 'loongarch-kvm-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD

LoongArch KVM changes for v6.14

1. Clear LLBCTL if secondary mmu mapping changed.
2. Add hypercall service support for usermode VMM.

This is a really small changeset, because the Chinese New Year
(Spring Festival) is coming. Happy New Year!
2025-01-15 11:51:56 -05:00
Palmer Dabbelt
6f6ecce59d
Merge patch series "SBI PMU event related fixes"
Atish Patra <atishp@rivosinc.com> says:

Here are two minor improvement/fixes in the PMU event path. The first patch
was part of the series[1]. The 2nd patch was suggested during the series
review.

While the series can only be merged once SBI v3.0 is frozen, these two
patches can be independent of SBI v3.0 and can be merged sooner. Hence, these
two patches are sent as a separate series.

* b4-shazam-merge:
  drivers/perf: riscv: Do not allow invalid raw event config
  drivers/perf: riscv: Return error for default case
  drivers/perf: riscv: Fix Platform firmware event data

Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-0-813e8a4f5962@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-09 09:37:12 -08:00
Atish Patra
fc58db9aeb
drivers/perf: riscv: Fix Platform firmware event data
Platform firmware event data field is allowed to be 62 bits for
Linux as uppper most two bits are reserved to indicate SBI fw or
platform specific firmware events.
However, the event data field is masked as per the hardware raw
event mask which is not correct.

Fix the platform firmware event data field with proper mask.

Fixes: f0c9363db2 ("perf/riscv-sbi: Add platform specific firmware event handling")

Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-1-813e8a4f5962@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-09 09:37:08 -08:00
Clément Léger
5cd900b8b7
riscv: use local label names instead of global ones in assembly
Local labels should be prefix by '.L' or they'll be exported in the
symbol table. Additionally, this messes up the backtrace by displaying
an incorrect symbol:

  ...
  [   12.751810] [<ffffffff80441628>] _copy_from_user+0x28/0xc2
  [   12.752035] [<ffffffff800152ca>] handle_misaligned_load+0x1ca/0x2fc
  [   12.752310] [<ffffffff80a033e8>] do_trap_load_misaligned+0x24/0xee
  [   12.752596] [<ffffffff80a0dcae>] _new_vmalloc_restore_context_a0+0xc2/0xce

After:
  ...
  [   10.243916] [<ffffffff804415e4>] _copy_from_user+0x28/0xc2
  [   10.244026] [<ffffffff800152ca>] handle_misaligned_load+0x1ca/0x2fc
  [   10.244150] [<ffffffff80a033a0>] do_trap_load_misaligned+0x24/0xee
  [   10.244268] [<ffffffff80a0dc66>] handle_exception+0x146/0x152

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Fixes: 503638e0ba ("riscv: Stop emitting preventive sfence.vma for new vmalloc mappings")
Link: https://lore.kernel.org/r/20250103141814.508865-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:46:14 -08:00
Guo Ren
40e6073e76
riscv: qspinlock: Fixup _Q_PENDING_LOOPS definition
When CONFIG_RISCV_QUEUED_SPINLOCKS=y, the _Q_PENDING_LOOPS
definition is missing. Add the _Q_PENDING_LOOPS definition for
pure qspinlock usage.

Fixes: ab83647fad ("riscv: Add qspinlock support")
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20241215135252.201983-1-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:46:01 -08:00
Clément Léger
51356ce60e
riscv: stacktrace: fix backtracing through exceptions
Prior to commit 5d5fc33ce5 ("riscv: Improve exception and system call
latency"), backtrace through exception worked since ra was filled with
ret_from_exception symbol address and the stacktrace code checked 'pc' to
be equal to that symbol. Now that handle_exception uses regular 'call'
instructions, this isn't working anymore and backtrace stops at
handle_exception(). Since there are multiple call site to C code in the
exception handling path, rather than checking multiple potential return
addresses, add a new symbol at the end of exception handling and check pc
to be in that range.

Fixes: 5d5fc33ce5 ("riscv: Improve exception and system call latency")
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20241209155714.1239665-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:45:49 -08:00
Xu Lu
f754f27e98
riscv: mm: Fix the out of bound issue of vmemmap address
In sparse vmemmap model, the virtual address of vmemmap is calculated as:
((struct page *)VMEMMAP_START - (phys_ram_base >> PAGE_SHIFT)).
And the struct page's va can be calculated with an offset:
(vmemmap + (pfn)).

However, when initializing struct pages, kernel actually starts from the
first page from the same section that phys_ram_base belongs to. If the
first page's physical address is not (phys_ram_base >> PAGE_SHIFT), then
we get an va below VMEMMAP_START when calculating va for it's struct page.

For example, if phys_ram_base starts from 0x82000000 with pfn 0x82000, the
first page in the same section is actually pfn 0x80000. During
init_unavailable_range(), we will initialize struct page for pfn 0x80000
with virtual address ((struct page *)VMEMMAP_START - 0x2000), which is
below VMEMMAP_START as well as PCI_IO_END.

This commit fixes this bug by introducing a new variable
'vmemmap_start_pfn' which is aligned with memory section size and using
it to calculate vmemmap address instead of phys_ram_base.

Fixes: a11dd49dcb ("riscv: Sparse-Memory/vmemmap out-of-bounds fix")
Signed-off-by: Xu Lu <luxu.kernel@bytedance.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20241209122617.53341-1-luxu.kernel@bytedance.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:45:34 -08:00
Nam Cao
13134cc949
riscv: kprobes: Fix incorrect address calculation
p->ainsn.api.insn is a pointer to u32, therefore arithmetic operations are
multiplied by four. This is clearly undesirable for this case.

Cast it to (void *) first before any calculation.

Below is a sample before/after. The dumped memory is two kprobe slots, the
first slot has

  - c.addiw a0, 0x1c (0x7125)
  - ebreak           (0x00100073)

and the second slot has:

  - c.addiw a0, -4   (0x7135)
  - ebreak           (0x00100073)

Before this patch:

(gdb) x/16xh 0xff20000000135000
0xff20000000135000:	0x7125	0x0000	0x0000	0x0000	0x7135	0x0010	0x0000	0x0000
0xff20000000135010:	0x0073	0x0010	0x0000	0x0000	0x0000	0x0000	0x0000	0x0000

After this patch:

(gdb) x/16xh 0xff20000000125000
0xff20000000125000:	0x7125	0x0073	0x0010	0x0000	0x7135	0x0073	0x0010	0x0000
0xff20000000125010:	0x0000	0x0000	0x0000	0x0000	0x0000	0x0000	0x0000	0x0000

Fixes: b1756750a3 ("riscv: kprobes: Use patch_text_nosync() for insn slots")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: stable@vger.kernel.org
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20241119111056.2554419-1-namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:39:39 -08:00
Nam Cao
6a97f4118a
riscv: Fix sleeping in invalid context in die()
die() can be called in exception handler, and therefore cannot sleep.
However, die() takes spinlock_t which can sleep with PREEMPT_RT enabled.
That causes the following warning:

BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 285, name: mutex
preempt_count: 110001, expected: 0
RCU nest depth: 0, expected: 0
CPU: 0 UID: 0 PID: 285 Comm: mutex Not tainted 6.12.0-rc7-00022-ge19049cf7d56-dirty #234
Hardware name: riscv-virtio,qemu (DT)
Call Trace:
    dump_backtrace+0x1c/0x24
    show_stack+0x2c/0x38
    dump_stack_lvl+0x5a/0x72
    dump_stack+0x14/0x1c
    __might_resched+0x130/0x13a
    rt_spin_lock+0x2a/0x5c
    die+0x24/0x112
    do_trap_insn_illegal+0xa0/0xea
    _new_vmalloc_restore_context_a0+0xcc/0xd8
Oops - illegal instruction [#1]

Switch to use raw_spinlock_t, which does not sleep even with PREEMPT_RT
enabled.

Fixes: 76d2a0493a ("RISC-V: Init and Halt Code")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: stable@vger.kernel.org
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20241118091333.1185288-1-namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:23:17 -08:00
Clément Léger
03f0b54853
riscv: module: remove relocation_head rel_entry member allocation
relocation_head's list_head member, rel_entry, doesn't need to be
allocated, its storage can just be part of the allocated relocation_head.
Remove the pointer which allows to get rid of the allocation as well as
an existing memory leak found by Kai Zhang using kmemleak.

Fixes: 8fd6c51423 ("riscv: Add remaining module relocations")
Reported-by: Kai Zhang <zhangkai@iscas.ac.cn>
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20241128081636.3620468-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:22:52 -08:00
Nathan Chancellor
f9d2ee3f51 riscv: Always inline bitops
When building allmodconfig + ThinLTO with certain versions of clang,
arch_set_bit() may not be inlined, resulting in a modpost warning:

  WARNING: modpost: vmlinux: section mismatch in reference: arch_set_bit+0x58 (section: .text.arch_set_bit) -> numa_nodes_parsed (section: .init.data)

acpi_numa_rintc_affinity_init() calls arch_set_bit() via __node_set()
with numa_nodes_parsed, which is marked as __initdata. If arch_set_bit()
is not inlined, modpost will flag that it is being called with data that
will be freed after init.

As acpi_numa_rintc_affinity_init() is marked as __init, there is not
actually a functional issue here. However, the bitop functions should be
marked as __always_inline, so that they work consistently for init and
non-init code, which the comment in include/linux/nodemask.h alludes to.
This matches s390 and x86's implementations.

Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2024-12-30 10:29:25 -08:00
Atish Patra
af79caa83f RISC-V: KVM: Add new exit statstics for redirected traps
Currently, kvm doesn't delegate the few traps such as misaligned
load/store, illegal instruction and load/store access faults because it
is not expected to occur in the guest very frequently. Thus, kvm gets a
chance to act upon it or collect statistics about it before redirecting
the traps to the guest.

Collect both guest and host visible statistics during the traps.
Enable them so that both guest and host can collect the stats about
them if required.

Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20241224-kvm_guest_stat-v2-3-08a77ac36b02@rivosinc.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-12-30 14:01:02 +05:30
Atish Patra
2f15b5eaff RISC-V: KVM: Update firmware counters for various events
SBI PMU specification defines few firmware counters which can be
used by the guests to collect the statstics about various traps
occurred in the host.

Update these counters whenever a corresponding trap is taken

Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20241224-kvm_guest_stat-v2-2-08a77ac36b02@rivosinc.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-12-30 14:01:02 +05:30
Quan Zhou
51c5895673 RISC-V: KVM: Redirect instruction access fault trap to guest
The M-mode redirects an unhandled instruction access
fault trap back to S-mode when not delegating it to
VS-mode(hedeleg). However, KVM running in HS-mode
terminates the VS-mode software when back from M-mode.

The KVM should redirect the trap back to VS-mode, and
let VS-mode trap handler decide the next step.

Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20241224-kvm_guest_stat-v2-1-08a77ac36b02@rivosinc.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-12-30 14:01:02 +05:30
Quan Zhou
79be257b57 RISC-V: KVM: Allow Ziccrse extension for Guest/VM
Extend the KVM ISA extension ONE_REG interface to allow KVM user space
to detect and enable Ziccrse extension for Guest/VM.

Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/d10e746d165074174f830aa3d89bf3c92017acee.1732854096.git.zhouquan@iscas.ac.cn
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-12-30 14:01:02 +05:30
Quan Zhou
679e132c0a RISC-V: KVM: Allow Zabha extension for Guest/VM
Extend the KVM ISA extension ONE_REG interface to allow KVM user space
to detect and enable Zabha extension for Guest/VM.

Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/4074feb27819e23bab05b0fd6441a38bf0b6a5e2.1732854096.git.zhouquan@iscas.ac.cn
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-12-30 14:01:02 +05:30
Quan Zhou
0f89158597 RISC-V: KVM: Allow Svvptc extension for Guest/VM
Extend the KVM ISA extension ONE_REG interface to allow KVM user space
to detect and enable Svvptc extension for Guest/VM.

Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/133509ffe5783b62cf95e8f675cc3e327bee402e.1732854096.git.zhouquan@iscas.ac.cn
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-12-30 14:01:01 +05:30
Andrew Jones
023c15151f RISC-V: KVM: Add SBI system suspend support
Implement a KVM SBI SUSP extension handler. The handler only
validates the system suspend entry criteria and prepares for resuming
in the appropriate state at the resume_addr (as specified by the SBI
spec), but then it forwards the call to the VMM where any system
suspend behavior may be implemented. Since VMM support is needed, KVM
disables the extension by default.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20241017074538.18867-5-ajones@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-12-30 14:01:01 +05:30
Celeste Liu
26f2d6de41
riscv: defconfig: drop RT_GROUP_SCHED=y
Commit ba6cfef057 ("riscv: enable Docker requirements in defconfig")
introduced it because of Docker, but Docker has removed this requirement
since [1] (2023-04-19).

For cgroup v1, if turned on, and there's any cgroup in the "cpu" hierarchy it
needs an RT budget assigned, otherwise the processes in it will not be able to
get RT at all. The problem with RT group scheduling is that it requires the
budget assigned but there's no way we could assign a default budget, since the
values to assign are both upper and lower time limits, are absolute, and need to
be sum up to < 1 for each individal cgroup. That means we cannot really come up
with values that would work by default in the general case.[2]

For cgroup v2, it's almost unusable as well. If it turned on, the cpu controller
can only be enabled when all RT processes are in the root cgroup. But it will
lose the benefits of cgroup v2 if all RT process were placed in the same cgroup.

Red Hat, Gentoo, Arch Linux and Debian all disable it. systemd also doesn't
support it.[3]

[1]: 005150ed69
[2]: https://bugzilla.redhat.com/show_bug.cgi?id=1229700
[3]: https://github.com/systemd/systemd/issues/13781#issuecomment-549164383

Acked-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
Signed-off-by: Celeste Liu <CoelacanthusHex@gmail.com>
Acked-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240910-fix-riscv-rt_group_sched-v3-1-486e75e5ae6d@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-12-27 08:45:40 -08:00
Masami Hiramatsu (Google)
b5fa903b7f fprobe: Add fprobe_header encoding feature
Fprobe store its data structure address and size on the fgraph return stack
by __fprobe_header. But most 64bit architecture can combine those to
one unsigned long value because 4 MSB in the kernel address are the same.
With this encoding, fprobe can consume less space on ret_stack.

This introduces asm/fprobe.h to define arch dependent encode/decode
macros. Note that since fprobe depends on CONFIG_HAVE_FUNCTION_GRAPH_FREGS,
currently only arm64, loongarch, riscv, s390 and x86 are supported.

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: bpf <bpf@vger.kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Alan Maguire <alan.maguire@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/173519005783.391279.5307910947400277525.stgit@devnote2
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-12-26 10:50:05 -05:00
Masami Hiramatsu (Google)
4346ba1604 fprobe: Rewrite fprobe on function-graph tracer
Rewrite fprobe implementation on function-graph tracer.
Major API changes are:
 -  'nr_maxactive' field is deprecated.
 -  This depends on CONFIG_DYNAMIC_FTRACE_WITH_ARGS or
    !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS, and
    CONFIG_HAVE_FUNCTION_GRAPH_FREGS. So currently works only
    on x86_64.
 -  Currently the entry size is limited in 15 * sizeof(long).
 -  If there is too many fprobe exit handler set on the same
    function, it will fail to probe.

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: bpf <bpf@vger.kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Alan Maguire <alan.maguire@oracle.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Naveen N Rao <naveen@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lore.kernel.org/173519003970.391279.14406792285453830996.stgit@devnote2
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-12-26 10:50:05 -05:00
Masami Hiramatsu (Google)
a762e9267d ftrace: Add CONFIG_HAVE_FTRACE_GRAPH_FUNC
Add CONFIG_HAVE_FTRACE_GRAPH_FUNC kconfig in addition to ftrace_graph_func
macro check. This is for the other feature (e.g. FPROBE) which requires to
access ftrace_regs from fgraph_ops::entryfunc() can avoid compiling if
the fgraph can not pass the valid ftrace_regs.

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: bpf <bpf@vger.kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Alan Maguire <alan.maguire@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Naveen N Rao <naveen@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/173519001472.391279.1174901685282588467.stgit@devnote2
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-12-26 10:50:04 -05:00
Masami Hiramatsu (Google)
b9b55c8912 tracing: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs
Add ftrace_partial_regs() which converts the ftrace_regs to pt_regs.
This is for the eBPF which needs this to keep the same pt_regs interface
to access registers.
Thus when replacing the pt_regs with ftrace_regs in fprobes (which is
used by kprobe_multi eBPF event), this will be used.

If the architecture defines its own ftrace_regs, this copies partial
registers to pt_regs and returns it. If not, ftrace_regs is the same as
pt_regs and ftrace_partial_regs() will return ftrace_regs::regs.

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Florent Revest <revest@chromium.org>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: bpf <bpf@vger.kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Alan Maguire <alan.maguire@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Link: https://lore.kernel.org/173518996761.391279.4987911298206448122.stgit@devnote2
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-12-26 10:50:03 -05:00
Masami Hiramatsu (Google)
a3ed4157b7 fgraph: Replace fgraph_ret_regs with ftrace_regs
Use ftrace_regs instead of fgraph_ret_regs for tracing return value
on function_graph tracer because of simplifying the callback interface.

The CONFIG_HAVE_FUNCTION_GRAPH_RETVAL is also replaced by
CONFIG_HAVE_FUNCTION_GRAPH_FREGS.

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: bpf <bpf@vger.kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Alan Maguire <alan.maguire@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/173518991508.391279.16635322774382197642.stgit@devnote2
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-12-26 10:50:02 -05:00
Masami Hiramatsu (Google)
41705c4262 fgraph: Pass ftrace_regs to entryfunc
Pass ftrace_regs to the fgraph_ops::entryfunc(). If ftrace_regs is not
available, it passes a NULL instead. User callback function can access
some registers (including return address) via this ftrace_regs.

Note that the ftrace_regs can be NULL when the arch does NOT define:
HAVE_DYNAMIC_FTRACE_WITH_ARGS or HAVE_DYNAMIC_FTRACE_WITH_REGS.
More specifically, if HAVE_DYNAMIC_FTRACE_WITH_REGS is defined but
not the HAVE_DYNAMIC_FTRACE_WITH_ARGS, and the ftrace ops used to
register the function callback does not set FTRACE_OPS_FL_SAVE_REGS.
In this case, ftrace_regs can be NULL in user callback.

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: bpf <bpf@vger.kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Alan Maguire <alan.maguire@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Naveen N Rao <naveen@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/173518990044.391279.17406984900626078579.stgit@devnote2
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-12-26 10:50:02 -05:00
Paolo Bonzini
a066bad89c KVM selftests "tree"-wide changes for 6.14:
- Rework vcpu_get_reg() to return a value instead of using an out-param, and
    update all affected arch code accordingly.
 
  - Convert the max_guest_memory_test into a more generic mmu_stress_test.
    The basic gist of the "conversion" is to have the test do mprotect() on
    guest memory while vCPUs are accessing said memory, e.g. to verify KVM
    and mmu_notifiers are working as intended.
 
  - Play nice with treewrite builds of unsupported architectures, e.g. arm
    (32-bit), as KVM selftests' Makefile doesn't do anything to ensure the
    target architecture is actually one KVM selftests supports.
 
  - Use the kernel's $(ARCH) definition instead of the target triple for arch
    specific directories, e.g. arm64 instead of aarch64, mainly so as not to
    be different from the rest of the kernel.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmdjehwACgkQOlYIJqCj
 N/0+/RAAh3M2IsNCtaiJ7n3COe9DHUuqxherS625J9YEOTCcGrZUd1WoSBvdDCIW
 46YKEYdHIpFHOYMEDPg5ODd20/y6lLw1yDKW7xj22cC8Np1TrPbt0q+PqaDdb4RR
 UlyObB6OsI/wxUHjsrvg2ZmDwAH9hIzh0kTUKPv7NZdaB+kT49eBV+tILDHtUGTx
 m0LUcNUIZBUhUE2YjnGNIoPQg4w+H1bYFK8lM62Rx09HtXbJ93VlMSxBq4ModMDq
 v74F6263NpOiou87bXperWT7iAWAWSqjGR+K/cwtnJpg2sosLlgpXYTnuUEiQMOL
 DU2c2u3lPRZMAhUVzj6qRr9nVBICfgk/lS5nkCb6khNZ296VK2UzimQF429a8Dhy
 ZkgOduNNuGVMBW6/Mc96L9ygo2yXNpGcT72RZ/2C8u/Zt0RwtYMYkzrBZQeqvgXp
 wMPxgkwatHLm7VSXzSLVd1c28GXTS8Fdg77HbUbOJNjvigaOeUR8ti82LRIWPn0Y
 XRyMpU1fK1wtuSkIEdxF8KWqr9B4ZPMxqFNE+ntyIhfUKvNdz6mRpQj/eYVbVdfn
 poXco5iq02aCDo7q1Zqx1rxeNrYUvTdq9UmllilRbYLVOC0M+Rf+jBvqM+VBhUCu
 hFSd80Pkh2iR4JuRDCtXQmk/cNfjk1CTgkeDmpm00Hdk98brmlc=
 =mFjt
 -----END PGP SIGNATURE-----

Merge tag 'kvm-selftests-treewide-6.14' of https://github.com/kvm-x86/linux into HEAD

KVM selftests "tree"-wide changes for 6.14:

 - Rework vcpu_get_reg() to return a value instead of using an out-param, and
   update all affected arch code accordingly.

 - Convert the max_guest_memory_test into a more generic mmu_stress_test.
   The basic gist of the "conversion" is to have the test do mprotect() on
   guest memory while vCPUs are accessing said memory, e.g. to verify KVM
   and mmu_notifiers are working as intended.

 - Play nice with treewrite builds of unsupported architectures, e.g. arm
   (32-bit), as KVM selftests' Makefile doesn't do anything to ensure the
   target architecture is actually one KVM selftests supports.

 - Use the kernel's $(ARCH) definition instead of the target triple for arch
   specific directories, e.g. arm64 instead of aarch64, mainly so as not to
   be different from the rest of the kernel.
2024-12-19 07:50:06 -05:00
Sean Christopherson
915d2f0718 KVM: Move KVM_REG_SIZE() definition to common uAPI header
Define KVM_REG_SIZE() in the common kvm.h header, and delete the arm64 and
RISC-V versions.  As evidenced by the surrounding definitions, all aspects
of the register size encoding are generic, i.e. RISC-V should have moved
arm64's definition to common code instead of copy+pasting.

Acked-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Link: https://lore.kernel.org/r/20241128005547.4077116-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-12-17 08:49:48 -08:00
Drew Fustini
e7177ecdd2 riscv: defconfig: enable pinctrl and dwmac support for TH1520
Enable pinctrl and ethernet dwmac driver for the TH1520 SoC boards like
the BeagleV Ahead and the Sipeed LicheePi 4A.

Reviewed-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Signed-off-by: Drew Fustini <drew@pdp7.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-12-16 09:39:10 +00:00
Linus Torvalds
81576a9a27 ARM64:
* Fix confusion with implicitly-shifted MDCR_EL2 masks breaking
   SPE/TRBE initialization.
 
 * Align nested page table walker with the intended memory attribute
   combining rules of the architecture.
 
 * Prevent userspace from constraining the advertised ASID width,
   avoiding horrors of guest TLBIs not matching the intended context in
   hardware.
 
 * Don't leak references on LPIs when insertion into the translation
   cache fails.
 
 RISC-V:
 
 * Replace csr_write() with csr_set() for HVIEN PMU overflow bit.
 
 x86:
 
 * Cache CPUID.0xD XSTATE offsets+sizes during module init - On Intel's
   Emerald Rapids CPUID costs hundreds of cycles and there are a lot of
   leaves under 0xD.  Getting rid of the CPUIDs during nested VM-Enter and
   VM-Exit is planned for the next release, for now just cache them: even
   on Skylake that is 40% faster.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmdcibgUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroOQsgf+NwNdfNQ0V5vU7YNeVxyhkCyYvNiA
 njvBTd1Lwh7EDtJ2NLKzwHktH2ymQI8qykxKr/qY3Jxkow+vcvsK0LacAaJdIzGo
 jnMGxXxRCFpxdkNb1kDJk4Cd6GSSAxYwgPj3wj7whsMcVRjPlFcjuHf02bRUU0Gt
 yulzBOZJ/7QTquKSnwt1kZQ1i/mJ8wCh4vJArZqtcImrDSK7oh+BaQ44h+lNe8qa
 Xiw6Fw3tYXgHy5WlnUU/OyFs+bZbcVzPM75qYgdGIWSo0TdL69BeIw8S4K2Ri4eL
 EoEBigwAd8PiF16Q1wO4gXWcNwinMTs3LIftxYpENTHA5gnrS5hgWWDqHw==
 =4v2y
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM64:

   - Fix confusion with implicitly-shifted MDCR_EL2 masks breaking
     SPE/TRBE initialization

   - Align nested page table walker with the intended memory attribute
     combining rules of the architecture

   - Prevent userspace from constraining the advertised ASID width,
     avoiding horrors of guest TLBIs not matching the intended context
     in hardware

   - Don't leak references on LPIs when insertion into the translation
     cache fails

  RISC-V:

   - Replace csr_write() with csr_set() for HVIEN PMU overflow bit

  x86:

   - Cache CPUID.0xD XSTATE offsets+sizes during module init

     On Intel's Emerald Rapids CPUID costs hundreds of cycles and there
     are a lot of leaves under 0xD. Getting rid of the CPUIDs during
     nested VM-Enter and VM-Exit is planned for the next release, for
     now just cache them: even on Skylake that is 40% faster"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Cache CPUID.0xD XSTATE offsets+sizes during module init
  RISC-V: KVM: Fix csr_write -> csr_set for HVIEN PMU overflow bit
  KVM: arm64: vgic-its: Add error handling in vgic_its_cache_translation
  KVM: arm64: Do not allow ID_AA64MMFR0_EL1.ASIDbits to be overridden
  KVM: arm64: Fix S1/S2 combination when FWB==1 and S2 has Device memory type
  arm64: Fix usage of new shifted MDCR_EL2 values
2024-12-15 09:26:13 -08:00
Michal Wilczynski
c95c1362e5 riscv: dts: thead: Add mailbox node
Add mailbox device tree node. This work is based on the vendor kernel [1].

Link: https://github.com/revyos/thead-kernel.git [1]
Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
Reviewed-by: Drew Fustini <dfustini@tenstorrent.com>
Signed-off-by: Drew Fustini <dfustini@tenstorrent.com>
2024-12-12 20:07:16 -08:00
Björn Töpel
21f1b85c89
riscv: mm: Do not call pmd dtor on vmemmap page table teardown
The vmemmap's, which is used for RV64 with SPARSEMEM_VMEMMAP, page
tables are populated using pmd (page middle directory) hugetables.
However, the pmd allocation is not using the generic mechanism used by
the VMA code (e.g. pmd_alloc()), or the RISC-V specific
create_pgd_mapping()/alloc_pmd_late(). Instead, the vmemmap page table
code allocates a page, and calls vmemmap_set_pmd(). This results in
that the pmd ctor is *not* called, nor would it make sense to do so.

Now, when tearing down a vmemmap page table pmd, the cleanup code
would unconditionally, and incorrectly call the pmd dtor, which
results in a crash (best case).

This issue was found when running the HMM selftests:

  | tools/testing/selftests/mm# ./test_hmm.sh smoke
  | ... # when unloading the test_hmm.ko module
  | page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10915b
  | flags: 0x1000000000000000(node=0|zone=1)
  | raw: 1000000000000000 0000000000000000 dead000000000122 0000000000000000
  | raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
  | page dumped because: VM_BUG_ON_PAGE(ptdesc->pmd_huge_pte)
  | ------------[ cut here ]------------
  | kernel BUG at include/linux/mm.h:3080!
  | Kernel BUG [#1]
  | Modules linked in: test_hmm(-) sch_fq_codel fuse drm drm_panel_orientation_quirks backlight dm_mod
  | CPU: 1 UID: 0 PID: 514 Comm: modprobe Tainted: G        W          6.12.0-00982-gf2a4f1682d07 #2
  | Tainted: [W]=WARN
  | Hardware name: riscv-virtio qemu/qemu, BIOS 2024.10 10/01/2024
  | epc : remove_pgd_mapping+0xbec/0x1070
  |  ra : remove_pgd_mapping+0xbec/0x1070
  | epc : ffffffff80010a68 ra : ffffffff80010a68 sp : ff20000000a73940
  |  gp : ffffffff827b2d88 tp : ff6000008785da40 t0 : ffffffff80fbce04
  |  t1 : 0720072007200720 t2 : 706d756420656761 s0 : ff20000000a73a50
  |  s1 : ff6000008915cff8 a0 : 0000000000000039 a1 : 0000000000000008
  |  a2 : ff600003fff0de20 a3 : 0000000000000000 a4 : 0000000000000000
  |  a5 : 0000000000000000 a6 : c0000000ffffefff a7 : ffffffff824469b8
  |  s2 : ff1c0000022456c0 s3 : ff1ffffffdbfffff s4 : ff6000008915c000
  |  s5 : ff6000008915c000 s6 : ff6000008915c000 s7 : ff1ffffffdc00000
  |  s8 : 0000000000000001 s9 : ff1ffffffdc00000 s10: ffffffff819a31f0
  |  s11: ffffffffffffffff t3 : ffffffff8000c950 t4 : ff60000080244f00
  |  t5 : ff60000080244000 t6 : ff20000000a73708
  | status: 0000000200000120 badaddr: ffffffff80010a68 cause: 0000000000000003
  | [<ffffffff80010a68>] remove_pgd_mapping+0xbec/0x1070
  | [<ffffffff80fd238e>] vmemmap_free+0x14/0x1e
  | [<ffffffff8032e698>] section_deactivate+0x220/0x452
  | [<ffffffff8032ef7e>] sparse_remove_section+0x4a/0x58
  | [<ffffffff802f8700>] __remove_pages+0x7e/0xba
  | [<ffffffff803760d8>] memunmap_pages+0x2bc/0x3fe
  | [<ffffffff02a3ca28>] dmirror_device_remove_chunks+0x2ea/0x518 [test_hmm]
  | [<ffffffff02a3e026>] hmm_dmirror_exit+0x3e/0x1018 [test_hmm]
  | [<ffffffff80102c14>] __riscv_sys_delete_module+0x15a/0x2a6
  | [<ffffffff80fd020c>] do_trap_ecall_u+0x1f2/0x266
  | [<ffffffff80fde0a2>] _new_vmalloc_restore_context_a0+0xc6/0xd2
  | Code: bf51 7597 0184 8593 76a5 854a 4097 0029 80e7 2c00 (9002) 7597
  | ---[ end trace 0000000000000000 ]---
  | Kernel panic - not syncing: Fatal exception in interrupt

Add a check to avoid calling the pmd dtor, if the calling context is
vmemmap_free().

Fixes: c75a74f4ba ("riscv: mm: Add memory hotplugging support")
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20241120131203.1859787-1-bjorn@kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-12-11 11:44:21 -08:00
Alexandre Ghiti
b3431a8bb3
riscv: Fix IPIs usage in kfence_protect_page()
flush_tlb_kernel_range() may use IPIs to flush the TLBs of all the
cores, which triggers the following warning when the irqs are disabled:

[    3.455330] WARNING: CPU: 1 PID: 0 at kernel/smp.c:815 smp_call_function_many_cond+0x452/0x520
[    3.456647] Modules linked in:
[    3.457218] CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Not tainted 6.12.0-rc7-00010-g91d3de7240b8 #1
[    3.457416] Hardware name: QEMU QEMU Virtual Machine, BIOS
[    3.457633] epc : smp_call_function_many_cond+0x452/0x520
[    3.457736]  ra : on_each_cpu_cond_mask+0x1e/0x30
[    3.457786] epc : ffffffff800b669a ra : ffffffff800b67c2 sp : ff2000000000bb50
[    3.457824]  gp : ffffffff815212b8 tp : ff6000008014f080 t0 : 000000000000003f
[    3.457859]  t1 : ffffffff815221e0 t2 : 000000000000000f s0 : ff2000000000bc10
[    3.457920]  s1 : 0000000000000040 a0 : ffffffff815221e0 a1 : 0000000000000001
[    3.457953]  a2 : 0000000000010000 a3 : 0000000000000003 a4 : 0000000000000000
[    3.458006]  a5 : 0000000000000000 a6 : ffffffffffffffff a7 : 0000000000000000
[    3.458042]  s2 : ffffffff815223be s3 : 00fffffffffff000 s4 : ff600001ffe38fc0
[    3.458076]  s5 : ff600001ff950d00 s6 : 0000000200000120 s7 : 0000000000000001
[    3.458109]  s8 : 0000000000000001 s9 : ff60000080841ef0 s10: 0000000000000001
[    3.458141]  s11: ffffffff81524812 t3 : 0000000000000001 t4 : ff60000080092bc0
[    3.458172]  t5 : 0000000000000000 t6 : ff200000000236d0
[    3.458203] status: 0000000200000100 badaddr: ffffffff800b669a cause: 0000000000000003
[    3.458373] [<ffffffff800b669a>] smp_call_function_many_cond+0x452/0x520
[    3.458593] [<ffffffff800b67c2>] on_each_cpu_cond_mask+0x1e/0x30
[    3.458625] [<ffffffff8000e4ca>] __flush_tlb_range+0x118/0x1ca
[    3.458656] [<ffffffff8000e6b2>] flush_tlb_kernel_range+0x1e/0x26
[    3.458683] [<ffffffff801ea56a>] kfence_protect+0xc0/0xce
[    3.458717] [<ffffffff801e9456>] kfence_guarded_free+0xc6/0x1c0
[    3.458742] [<ffffffff801e9d6c>] __kfence_free+0x62/0xc6
[    3.458764] [<ffffffff801c57d8>] kfree+0x106/0x32c
[    3.458786] [<ffffffff80588cf2>] detach_buf_split+0x188/0x1a8
[    3.458816] [<ffffffff8058708c>] virtqueue_get_buf_ctx+0xb6/0x1f6
[    3.458839] [<ffffffff805871da>] virtqueue_get_buf+0xe/0x16
[    3.458880] [<ffffffff80613d6a>] virtblk_done+0x5c/0xe2
[    3.458908] [<ffffffff8058766e>] vring_interrupt+0x6a/0x74
[    3.458930] [<ffffffff800747d8>] __handle_irq_event_percpu+0x7c/0xe2
[    3.458956] [<ffffffff800748f0>] handle_irq_event+0x3c/0x86
[    3.458978] [<ffffffff800786cc>] handle_simple_irq+0x9e/0xbe
[    3.459004] [<ffffffff80073934>] generic_handle_domain_irq+0x1c/0x2a
[    3.459027] [<ffffffff804bf87c>] imsic_handle_irq+0xba/0x120
[    3.459056] [<ffffffff80073934>] generic_handle_domain_irq+0x1c/0x2a
[    3.459080] [<ffffffff804bdb76>] riscv_intc_aia_irq+0x24/0x34
[    3.459103] [<ffffffff809d0452>] handle_riscv_irq+0x2e/0x4c
[    3.459133] [<ffffffff809d923e>] call_on_irq_stack+0x32/0x40

So only flush the local TLB and let the lazy kfence page fault handling
deal with the faults which could happen when a core has an old protected
pte version cached in its TLB. That leads to potential inaccuracies which
can be tolerated when using kfence.

Fixes: 47513f243b ("riscv: Enable KFENCE for riscv64")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20241209074125.52322-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-12-11 11:44:03 -08:00
Alexandre Ghiti
c796e18720
riscv: Fix wrong usage of __pa() on a fixmap address
riscv uses fixmap addresses to map the dtb so we can't use __pa() which
is reserved for linear mapping addresses.

Fixes: b2473a3597 ("of/fdt: add dt_phys arg to early_init_dt_scan and early_init_dt_verify")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20241209074508.53037-1-alexghiti@rivosinc.com
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-12-11 11:43:44 -08:00
Guo Ren
b3134b8c1a
riscv: Fixup boot failure when CONFIG_DEBUG_RT_MUTEXES=y
When CONFIG_DEBUG_RT_MUTEXES=y, mutex_lock->rt_mutex_try_acquire
would change from rt_mutex_cmpxchg_acquire to
rt_mutex_slowtrylock():
	raw_spin_lock_irqsave(&lock->wait_lock, flags);
	ret = __rt_mutex_slowtrylock(lock);
	raw_spin_unlock_irqrestore(&lock->wait_lock, flags);

Because queued_spin_#ops to ticket_#ops is changed one by one by
jump_label, raw_spin_lock/unlock would cause a deadlock during the
changing.

That means in arch/riscv/kernel/jump_label.c:
1.
arch_jump_label_transform_queue() ->
mutex_lock(&text_mutex); +-> raw_spin_lock  -> queued_spin_lock
			 |-> raw_spin_unlock -> queued_spin_unlock
patch_insn_write -> change the raw_spin_lock to ticket_lock
mutex_unlock(&text_mutex);
...

2. /* Dirty the lock value */
arch_jump_label_transform_queue() ->
mutex_lock(&text_mutex); +-> raw_spin_lock -> *ticket_lock*
                         |-> raw_spin_unlock -> *queued_spin_unlock*
			  /* BUG: ticket_lock with queued_spin_unlock */
patch_insn_write  ->  change the raw_spin_unlock to ticket_unlock
mutex_unlock(&text_mutex);
...

3. /* Dead lock */
arch_jump_label_transform_queue() ->
mutex_lock(&text_mutex); +-> raw_spin_lock -> ticket_lock /* deadlock! */
                         |-> raw_spin_unlock -> ticket_unlock
patch_insn_write -> change other raw_spin_#op -> ticket_#op
mutex_unlock(&text_mutex);

So, the solution is to disable mutex usage of
arch_jump_label_transform_queue() during early_boot_irqs_disabled, just
like we have done for stop_machine.

Reported-by: Conor Dooley <conor@kernel.org>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Fixes: ab83647fad ("riscv: Add qspinlock support")
Link: https://lore.kernel.org/linux-riscv/CAJF2gTQwYTGinBmCSgVUoPv0_q4EPt_+WiyfUA1HViAKgUzxAg@mail.gmail.com/T/#mf488e6347817fca03bb93a7d34df33d8615b3775
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Nam Cao <namcao@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20241130153310.3349484-1-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-12-11 11:43:39 -08:00
Eliav Farber
bad6722e47 kexec: Consolidate machine_kexec_mask_interrupts() implementation
Consolidate the machine_kexec_mask_interrupts implementation into a common
function located in a new file: kernel/irq/kexec.c. This removes duplicate
implementations from architecture-specific files in arch/arm, arch/arm64,
arch/powerpc, and arch/riscv, reducing code duplication and improving
maintainability.

The new implementation retains architecture-specific behavior for
CONFIG_GENERIC_IRQ_KEXEC_CLEAR_VM_FORWARD, which was previously implemented
for ARM64. When enabled (currently for ARM64), it clears the active state
of interrupts forwarded to virtual machines (VMs) before handling other
interrupt masking operations.

Signed-off-by: Eliav Farber <farbere@amazon.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241204142003.32859-2-farbere@amazon.com
2024-12-11 20:32:34 +01:00
Davidlohr Bueso
9d0593da94
riscv/futex: Optimize atomic cmpxchg
Remove redundant release/acquire barriers, optimizing the lr/sc sequence
to provide conditional RCsc synchronization, per the RVWMO.

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Andrea Parri <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20241113183321.491113-1-dave@stgolabs.net
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-12-11 07:09:55 -08:00
Drew Fustini
0207244ea0
riscv: defconfig: enable pinctrl and dwmac support for TH1520
Enable pinctrl and ethernet dwmac driver for the TH1520 SoC boards like
the BeagleV Ahead and the Sipeed LicheePi 4A.

Signed-off-by: Drew Fustini <drew@pdp7.com>
Reviewed-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Link: https://lore.kernel.org/r/20241113184333.829716-1-drew@pdp7.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-12-11 07:09:46 -08:00
Michael Neuling
ea6398a5af RISC-V: KVM: Fix csr_write -> csr_set for HVIEN PMU overflow bit
This doesn't cause a problem currently as HVIEN isn't used elsewhere
yet. Found by inspection.

Signed-off-by: Michael Neuling <michaelneuling@tenstorrent.com>
Fixes: 16b0bde9a3 ("RISC-V: KVM: Add perf sampling support for guests")
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20241127041840.419940-1-michaelneuling@tenstorrent.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-12-06 18:42:38 +05:30
E Shattow
708d55db3e riscv: dts: starfive: jh7110-milkv-mars: enable usb0 host function
Milk-V Mars board routes one of four USB-A ports to USB0 on the SoC
rather than to the VL805 USB 3.0 <-> PCIe chip.
Set JH7110 on-chip USB host mode and vbus pin assignment accordingly.

Reviewed-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Signed-off-by: E Shattow <e@freeshell.de>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-12-02 19:06:40 +00:00
E Shattow
03bd268ae0 riscv: dts: starfive: jh7110-pine64-star64: enable usb0 host function
Pine64 Star64 board routes all four USB-A ports to USB0 on the SoC.
Set JH7110 on-chip USB host mode and vbus pin assignment accordingly.

Signed-off-by: E Shattow <e@freeshell.de>
Reviewed-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-12-02 19:06:33 +00:00
Eric Biggers
b5ae12e0ee lib/crc32: expose whether the lib is really optimized at runtime
Make the CRC32 library export a function crc32_optimizations() which
returns flags that indicate which CRC32 functions are actually executing
optimized code at runtime.

This will be used to determine whether the crc32[c]-$arch shash
algorithms should be registered in the crypto API.  btrfs could also
start using these flags instead of the hack that it currently uses where
it parses the crypto_shash_driver_name.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20241202010844.144356-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2024-12-01 17:23:01 -08:00
Eric Biggers
d36cebe03c lib/crc32: improve support for arch-specific overrides
Currently the CRC32 library functions are defined as weak symbols, and
the arm64 and riscv architectures override them.

This method of arch-specific overrides has the limitation that it only
works when both the base and arch code is built-in.  Also, it makes the
arch-specific code be silently not used if it is accidentally built with
lib-y instead of obj-y; unfortunately the RISC-V code does this.

This commit reorganizes the code to have explicit *_arch() functions
that are called when they are enabled, similar to how some of the crypto
library code works (e.g. chacha_crypt() calls chacha_crypt_arch()).

Make the existing kconfig choice for the CRC32 implementation also
control whether the arch-optimized implementation (if one is available)
is enabled or not.  Make it enabled by default if CRC32 is also enabled.

The result is that arch-optimized CRC32 library functions will be
included automatically when appropriate, but it is now possible to
disable them.  They can also now be built as a loadable module if the
CRC32 library functions happen to be used only by loadable modules, in
which case the arch and base CRC32 modules will be automatically loaded
via direct symbol dependency when appropriate.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20241202010844.144356-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2024-12-01 17:23:01 -08:00