Commit Graph

1577 Commits

Author SHA1 Message Date
Conor Dooley
12e7fbb6a8
RISC-V: add f & d extension validation checks
Using Clement's new validation callbacks, support checking that
dependencies have been satisfied for the floating point extensions.

The check for "d" might be slightly confusingly shorter than that of "f",
despite "d" depending on "f". This is because the requirement that a
hart supporting double precision must also support single precision,
should be validated by dt-bindings etc, not the kernel but lack of
support for single precision only is a limitation of the kernel.

Tested-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250312-reptile-platinum-62ee0f444a32@spud
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-25 14:19:24 +00:00
Conor Dooley
38077ec8fc
RISC-V: add vector crypto extension validation checks
Using Clement's new validation callbacks, support checking that
dependencies have been satisfied for the vector crpyto extensions.
Currently riscv_isa_extension_available(<vector crypto>) will return
true on systems that support the extensions but vector itself has been
disabled by the kernel, adding validation callbacks will prevent such a
scenario from occuring and make the behaviour of the extension detection
functions more consistent with user expectations - it's not expected to
have to check for vector AND the specific crypto extension.

The Unpriv spec states:
| The Zvknhb and Zvbc Vector Crypto Extensions --and accordingly the
| composite extensions Zvkn, Zvknc, Zvkng, and Zvksc-- require a Zve64x
| base, or application ("V") base Vector Extension. All of the other
| Vector Crypto Extensions can be built on any embedded (Zve*) or
| application ("V") base Vector Extension.

While this could be used as the basis for checking that the correct base
for individual crypto extensions, but that's not really the kernel's job
in my opinion and it is sufficient to leave that sort of precision to
the dt-bindings. The kernel only needs to make sure that vector, in some
form, is available.

Link: https://github.com/riscv/riscv-isa-manual/blob/main/src/vector-crypto.adoc#extensions-overview
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20250312-entertain-shaking-b664142c2f99@spud
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-25 14:10:08 +00:00
Conor Dooley
9324571e9e
RISC-V: add vector extension validation checks
Using Clement's new validation callbacks, support checking that
dependencies have been satisfied for the vector extensions. From the
kernel's perfective, it's not required to differentiate between the
conditions for all the various vector subsets - it's the firmware's job
to not report impossible combinations. Instead, the kernel only has to
check that the correct config options are enabled and to enforce its
requirement of the d extension being present for FPU support.

Since vector will now be disabled proactively, there's no need to clear
the bit in elf_hwcap in riscv_fill_hwcap() any longer.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250312-eclair-affluent-55b098c3602b@spud
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-25 14:10:08 +00:00
Alexandre Ghiti
74f4bf9d15
Merge patch series "riscv: Add runtime constant support"
Charlie Jenkins <charlie@rivosinc.com> says:

Ard brought this to my attention in this patch [1].

I benchmarked this patch on the Nezha D1 (which does not contain Zba or
Zbkb so it uses the default algorithm) by navigating through a large
directory structure. I created a 1000-deep directory structure and then
cd and ls through it. With this patch there was a 0.57% performance
improvement.

[1] https://lore.kernel.org/lkml/CAMj1kXE4DJnwFejNWQu784GvyJO=aGNrzuLjSxiowX_e7nW8QA@mail.gmail.com/

* patches from https://lore.kernel.org/r/20250319-runtime_const_riscv-v10-0-745b31a11d65@rivosinc.com:
  riscv: Add runtime constant support
  riscv: Move nop definition to insn-def.h

Link: https://lore.kernel.org/linux-riscv/20250319-runtime_const_riscv-v10-0-745b31a11d65@rivosinc.com/
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-20 09:15:04 +00:00
Charlie Jenkins
a44fb57221
riscv: Add runtime constant support
Implement the runtime constant infrastructure for riscv. Use this
infrastructure to generate constants to be used by the d_hash()
function.

This is the riscv variant of commit 94a2bc0f61 ("arm64: add 'runtime
constant' support") and commit e3c92e8171 ("runtime constants: add
x86 architecture support").

[ alex: Remove trailing whitespace ]

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250319-runtime_const_riscv-v10-2-745b31a11d65@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-20 09:15:03 +00:00
Charlie Jenkins
afa8a93932
riscv: Move nop definition to insn-def.h
We have duplicated the definition of the nop instruction in ftrace.h and
in jump_label.c. Move this definition into the generic file insn-def.h
so that they can share the definition with each other and with future
files.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250319-runtime_const_riscv-v10-1-745b31a11d65@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-20 09:14:42 +00:00
Alexandre Ghiti
d9b65824d8
Merge patch series "riscv: Unaligned access speed probing fixes and skipping"
Andrew Jones <ajones@ventanamicro.com> says:

The first six patches of this series are fixes and cleanups of the
unaligned access speed probing code. The next patch introduces a
kernel command line option that allows the probing to be skipped.
This command line option is a different approach than Jesse's [1].
[1] takes a cpu-list for a particular speed, supporting heterogeneous
platforms. With this approach, the kernel command line should only
be used for homogeneous platforms. [1] also only allowed 'fast' and
'slow' to be selected. This parameter also supports 'unsupported',
which could be useful for testing code paths gated on that. The final
patch adds the documentation.

[1] https://lore.kernel.org/linux-riscv/20240805173816.3722002-1-jesse@rivosinc.com/

* patches from https://lore.kernel.org/r/20250304120014.143628-10-ajones@ventanamicro.com:
  Documentation/kernel-parameters: Add riscv unaligned speed parameters
  riscv: Add parameter for skipping access speed tests
  riscv: Fix set up of vector cpu hotplug callback
  riscv: Fix set up of cpu hotplug callbacks
  riscv: Change check_unaligned_access_speed_all_cpus to void
  riscv: Fix check_unaligned_access_all_cpus
  riscv: Fix riscv_online_cpu_vec
  riscv: Annotate unaligned access init functions

Link: https://lore.kernel.org/r/20250304120014.143628-10-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-20 07:25:24 +00:00
Andrew Jones
aecb09e091
riscv: Add parameter for skipping access speed tests
Allow skipping scalar and vector unaligned access speed tests. This
is useful for testing alternative code paths and to skip the tests in
environments where they run too slowly. All CPUs must have the same
unaligned access speed.

The code movement is because we now need the scalar cpu hotplug
callback to always run, so we need to bring it and its supporting
functions out of CONFIG_RISCV_PROBE_UNALIGNED_ACCESS.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250304120014.143628-17-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19 14:23:30 +00:00
Andrew Jones
2744ec472d
riscv: Fix set up of vector cpu hotplug callback
Whether or not we have RISCV_PROBE_VECTOR_UNALIGNED_ACCESS we need to
set up a cpu hotplug callback to check if we have vector at all,
since, when we don't have vector, we need to set
vector_misaligned_access to unsupported rather than leave it the
default of unknown.

Fixes: e7c9d66e31 ("RISC-V: Report vector unaligned access speed hwprobe")
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250304120014.143628-16-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19 14:23:30 +00:00
Andrew Jones
05ee21f0fc
riscv: Fix set up of cpu hotplug callbacks
CPU hotplug callbacks should be set up even if we detected all
current cpus emulate misaligned accesses, since we want to
ensure our expectations of all cpus emulating is maintained.

Fixes: 6e5ce7f2ea ("riscv: Decouple emulated unaligned accesses from access speed")
Fixes: e7c9d66e31 ("RISC-V: Report vector unaligned access speed hwprobe")
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250304120014.143628-15-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19 14:23:30 +00:00
Andrew Jones
813d39baee
riscv: Change check_unaligned_access_speed_all_cpus to void
The return value of check_unaligned_access_speed_all_cpus() is always
zero, so make the function void so we don't need to concern ourselves
with it. The change also allows us to tidy up
check_unaligned_access_all_cpus() a bit.

Reviewed-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250304120014.143628-14-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19 14:23:29 +00:00
Andrew Jones
e6d0adf2eb
riscv: Fix check_unaligned_access_all_cpus
check_vector_unaligned_access_emulated_all_cpus(), like its name
suggests, will return true when all cpus emulate unaligned vector
accesses. If the function returned false it may have been because
vector isn't supported at all (!has_vector()) or because at least
one cpu doesn't emulate unaligned vector accesses. Since false may
be returned for two cases, checking for it isn't sufficient when
attempting to determine if we should proceed with the vector speed
check. Move the !has_vector() functionality to
check_unaligned_access_all_cpus() in order for
check_vector_unaligned_access_emulated_all_cpus() to return false
for a single case.

Fixes: e7c9d66e31 ("RISC-V: Report vector unaligned access speed hwprobe")
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250304120014.143628-13-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19 14:23:29 +00:00
Andrew Jones
5af72a8186
riscv: Fix riscv_online_cpu_vec
We shouldn't probe when we already know vector is unsupported and
we should probe when we see we don't yet know whether it's supported.
Furthermore, we should ensure we've set the access type to
unsupported when we don't have vector at all.

Fixes: e7c9d66e31 ("RISC-V: Report vector unaligned access speed hwprobe")
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250304120014.143628-12-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19 14:23:29 +00:00
Andrew Jones
a00e022be5
riscv: Annotate unaligned access init functions
Several functions used in unaligned access probing are only run at
init time. Annotate them appropriately.

Fixes: f413aae96c ("riscv: Set unaligned access speed at compile time")
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250304120014.143628-11-ajones@ventanamicro.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19 14:23:28 +00:00
Clément Léger
9d45d1ff90
riscv: hwprobe: export Zaamo and Zalrsc extensions
Export the Zaamo and Zalrsc extensions to userspace using hwprobe.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240619153913.867263-4-cleger@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19 12:03:45 +00:00
Clément Léger
35173b666a
riscv: add parsing for Zaamo and Zalrsc extensions
These 2 new extensions are actually a subset of the A extension which
provides atomic memory operations and load-reserved/store-conditional
instructions.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240619153913.867263-3-cleger@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19 12:03:45 +00:00
Pu Lehui
67a5ba8f74
riscv: fgraph: Fix stack layout to match __arch_ftrace_regs argument of ftrace_return_to_handler
Naresh Kamboju reported a "Bad frame pointer" kernel warning while
running LTP trace ftrace_stress_test.sh in riscv. We can reproduce the
same issue with the following command:

```
$ cd /sys/kernel/debug/tracing
$ echo 'f:myprobe do_nanosleep%return args1=$retval' > dynamic_events
$ echo 1 > events/fprobes/enable
$ echo 1 > tracing_on
$ sleep 1
```

And we can get the following kernel warning:

[  127.692888] ------------[ cut here ]------------
[  127.693755] Bad frame pointer: expected ff2000000065be50, received ba34c141e9594000
[  127.693755]   from func do_nanosleep return to ffffffff800ccb16
[  127.698699] WARNING: CPU: 1 PID: 129 at kernel/trace/fgraph.c:755 ftrace_return_to_handler+0x1b2/0x1be
[  127.699894] Modules linked in:
[  127.700908] CPU: 1 UID: 0 PID: 129 Comm: sleep Not tainted 6.14.0-rc3-g0ab191c74642 #32
[  127.701453] Hardware name: riscv-virtio,qemu (DT)
[  127.701859] epc : ftrace_return_to_handler+0x1b2/0x1be
[  127.702032]  ra : ftrace_return_to_handler+0x1b2/0x1be
[  127.702151] epc : ffffffff8013b5e0 ra : ffffffff8013b5e0 sp : ff2000000065bd10
[  127.702221]  gp : ffffffff819c12f8 tp : ff60000080853100 t0 : 6e00000000000000
[  127.702284]  t1 : 0000000000000020 t2 : 6e7566206d6f7266 s0 : ff2000000065bd80
[  127.702346]  s1 : ff60000081262000 a0 : 000000000000007b a1 : ffffffff81894f20
[  127.702408]  a2 : 0000000000000010 a3 : fffffffffffffffe a4 : 0000000000000000
[  127.702470]  a5 : 0000000000000000 a6 : 0000000000000008 a7 : 0000000000000038
[  127.702530]  s2 : ba34c141e9594000 s3 : 0000000000000000 s4 : ff2000000065bdd0
[  127.702591]  s5 : 00007fff8adcf400 s6 : 000055556dc1d8c0 s7 : 0000000000000068
[  127.702651]  s8 : 00007fff8adf5d10 s9 : 000000000000006d s10: 0000000000000001
[  127.702710]  s11: 00005555737377c8 t3 : ffffffff819d899e t4 : ffffffff819d899e
[  127.702769]  t5 : ffffffff819d89a0 t6 : ff2000000065bb18
[  127.702826] status: 0000000200000120 badaddr: 0000000000000000 cause: 0000000000000003
[  127.703292] [<ffffffff8013b5e0>] ftrace_return_to_handler+0x1b2/0x1be
[  127.703760] [<ffffffff80017bce>] return_to_handler+0x16/0x26
[  127.704009] [<ffffffff80017bb8>] return_to_handler+0x0/0x26
[  127.704057] [<ffffffff800d3352>] common_nsleep+0x42/0x54
[  127.704117] [<ffffffff800d44a2>] __riscv_sys_clock_nanosleep+0xba/0x10a
[  127.704176] [<ffffffff80901c56>] do_trap_ecall_u+0x188/0x218
[  127.704295] [<ffffffff8090cc3e>] handle_exception+0x14a/0x156
[  127.705436] ---[ end trace 0000000000000000 ]---

The reason is that the stack layout for constructing argument for the
ftrace_return_to_handler in the return_to_handler does not match the
__arch_ftrace_regs structure of riscv, leading to unexpected results.

Fixes: a3ed4157b7 ("fgraph: Replace fgraph_ret_regs with ftrace_regs")
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Closes: https://lore.kernel.org/all/CA+G9fYvp_oAxeDFj88Tk2rfEZ7jtYKAKSwfYS66=57Db9TBdyA@mail.gmail.com
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20250317031214.4138436-2-pulehui@huaweicloud.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-19 12:03:25 +00:00
Alexandre Ghiti
33981b1c4e
riscv: Fix missing __free_pages() in check_vector_unaligned_access()
The locally allocated pages are never freed up, so add the corresponding
__free_pages().

Fixes: e7c9d66e31 ("RISC-V: Report vector unaligned access speed hwprobe")
Link: https://lore.kernel.org/r/20250228090613.345309-1-alexghiti@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 13:37:56 +00:00
Tingbo Liao
475afa39b1
riscv: Fix the __riscv_copy_vec_words_unaligned implementation
Correct the VEC_S macro definition to fix the implementation
of vector words copy in the case of unalignment in RISC-V.

Fixes: e7c9d66e31 ("RISC-V: Report vector unaligned access speed hwprobe")
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Tingbo Liao <tingbo.liao@starfivetech.com>
Link: https://lore.kernel.org/r/20250228090801.8334-1-tingbo.liao@starfivetech.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 13:37:27 +00:00
Zixian Zeng
eac5b13881
riscv: remove redundant CMDLINE_FORCE check
Drop redundant CMDLINE_FORCE check as it's already done in
function early_init_dt_scan_chosen().

Signed-off-by: Zixian Zeng <sycamoremoon376@gmail.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250114-rebund-v1-1-5632b2d54d6c@gmail.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 13:33:31 +00:00
Alexandre Ghiti
a5edc510da
Merge patch series "Support SSTC while PM operations"
Nick Hu <nick.hu@sifive.com> says:

When the cpu is going to be hotplug, stop the stimecmp to prevent pending
interrupt.
When the cpu is going to be suspended, save the stimecmp before entering
the suspend state and restore it in the resume path.

* patches from https://lore.kernel.org/r/20250219114135.27764-1-nick.hu@sifive.com:
  clocksource/drivers/timer-riscv: Stop stimecmp when cpu hotplug
  riscv: Add stimecmp save and restore

Link: https://lore.kernel.org/r/20250219114135.27764-1-nick.hu@sifive.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 12:59:08 +00:00
Nick Hu
ffef54ad41
riscv: Add stimecmp save and restore
If the HW support the SSTC extension, we should save and restore the
stimecmp register while cpu non retention suspend.

Signed-off-by: Nick Hu <nick.hu@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20250219114135.27764-2-nick.hu@sifive.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 12:59:03 +00:00
Chin Yik Ming
7f238b1266
riscv: Simplify base extension checks and direct boolean return
Reduce three lines checking to single line using a ternary conditional
expression for getting the base extension word. In addition, the
test_bit macro function already return a boolean which matches the
return type of the caller, so directly return the result of the test_bit
macro function.

Signed-off-by: Chin Yik Ming <yikming2222@gmail.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250129203843.1136838-1-yikming2222@gmail.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 12:47:20 +00:00
Jinjie Ruan
a4a58f510b
riscv: Remove unused TASK_TI_FLAGS
Since commit f0bddf5058 ("riscv: entry: Convert to generic
entry"), TASK_TI_FLAGS is not used any more, so remove it.

Fixes: f0bddf5058 ("riscv: entry: Convert to generic entry")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20241109014605.2801492-1-ruanjinjie@huawei.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 12:46:15 +00:00
Yunhui Cui
eb10039709
RISC-V: hwprobe: Expose Zicbom extension and its block size
Expose Zicbom through hwprobe and also provide a key to extract its
respective block size.

[ alex: Fix merge conflicts and hwprobe numbering ]

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Link: https://lore.kernel.org/r/20250226063206.71216-3-cuiyunhui@bytedance.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 12:43:56 +00:00
Yunhui Cui
de70b532f9
RISC-V: Enable cbo.clean/flush in usermode
Enabling cbo.clean and cbo.flush in user mode makes it more
convenient to manage the cache state and achieve better performance.

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Link: https://lore.kernel.org/r/20250226063206.71216-2-cuiyunhui@bytedance.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 12:35:14 +00:00
Alexandre Ghiti
2f2cd9f334
Merge patch series "riscv: Add bfloat16 instruction support"
Inochi Amaoto <inochiama@gmail.com> says:

Add description for the BFloat16 precision Floating-Point ISA extension,
(Zfbfmin, Zvfbfmin, Zvfbfwma). which was ratified in commit 4dc23d62
("Added Chapter title to BF16") of the riscv-isa-manual.

* patches from https://lore.kernel.org/r/20250213003849.147358-1-inochiama@gmail.com:
  riscv: hwprobe: export bfloat16 ISA extension
  riscv: add ISA extension parsing for bfloat16 ISA extension
  dt-bindings: riscv: add bfloat16 ISA extension description

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20250213003849.147358-1-inochiama@gmail.com
2025-03-18 11:52:54 +00:00
Inochi Amaoto
a4863e002c
riscv: hwprobe: export bfloat16 ISA extension
Export Zfbmin, Zvfbfmin, Zvfbfwma ISA extension through hwprobe.

Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Link: https://lore.kernel.org/r/20250213003849.147358-4-inochiama@gmail.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 11:52:02 +00:00
Inochi Amaoto
e186c28dda
riscv: add ISA extension parsing for bfloat16 ISA extension
Add parsing for Zfbmin, Zvfbfmin, Zvfbfwma ISA extension which
were ratified in 4dc23d62 ("Added Chapter title to BF16") of
the riscv-isa-manual.

Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Link: https://lore.kernel.org/r/20250213003849.147358-3-inochiama@gmail.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 11:52:02 +00:00
Miquel Sabaté Solà
4458b8f68d
riscv: hwprobe: export Zicntr and Zihpm extensions
Export Zicntr and Zihpm ISA extensions through the hwprobe syscall.

[ alex: Fix hwprobe numbering ]

Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com>
Acked-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240913051324.8176-1-mikisabate@gmail.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 09:10:22 +00:00
Clément Léger
d3817d091f
riscv: remove useless pc check in stacktrace handling
Checking for pc to be a kernel text address at this location is useless
since pc == handle_exception. Remove this check.

[ alex: Fix merge conflict ]

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240830084934.3690037-1-cleger@rivosinc.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-03-18 09:06:21 +00:00
Ryan Roberts
86758b5048 mm/ioremap: pass pgprot_t to ioremap_prot() instead of unsigned long
ioremap_prot() currently accepts pgprot_val parameter as an unsigned long,
thus implicitly assuming that pgprot_val and pgprot_t could never be
bigger than unsigned long.  But this assumption soon will not be true on
arm64 when using D128 pgtables.  In 128 bit page table configuration,
unsigned long is 64 bit, but pgprot_t is 128 bit.

Passing platform abstracted pgprot_t argument is better as compared to
size based data types.  Let's change the parameter to directly pass
pgprot_t like another similar helper generic_ioremap_prot().

Without this change in place, D128 configuration does not work on arm64 as
the top 64 bits gets silently stripped when passing the protection value
to this function.

Link: https://lkml.kernel.org/r/20250218101954.415331-1-anshuman.khandual@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Co-developed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-16 22:06:23 -07:00
Thomas Weißschuh
46fe55b204 riscv: vdso: Switch to generic storage implementation
The generic storage implementation provides the same features as the
custom one. However it can be shared between architectures, making
maintenance easier.

Co-developed-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-9-13a4669dfc8c@linutronix.de
2025-02-21 09:54:02 +01:00
Thomas Weißschuh
127b0e05c1 vdso: Rename included Makefile
As the Makefile is included into other Makefiles it can not be used to
define objects to be built from the current source directory.
However the generic datastore will introduce such a local source file.
Rename the included Makefile so it is clear how it is to be used and to
make room for a regular Makefile in lib/vdso/.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-4-13a4669dfc8c@linutronix.de
2025-02-21 09:54:01 +01:00
Yong-Xuan Wang
564fc8eb6f
riscv: signal: fix signal_minsigstksz
The init_rt_signal_env() funciton is called before the alternative patch
is applied, so using the alternative-related API to check the availability
of an extension within this function doesn't have the intended effect.
This patch reorders the init_rt_signal_env() and apply_boot_alternatives()
to get the correct signal_minsigstksz.

Fixes: e92f469b07 ("riscv: signal: Report signal frame size to userspace via auxv")
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Reviewed-by: Zong Li <zong.li@sifive.com>
Reviewed-by: Andy Chiu <andybnac@gmail.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20241220083926.19453-3-yongxuan.wang@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-02-14 13:06:50 -08:00
Yong-Xuan Wang
aa49bc2ca8
riscv: signal: fix signal frame size
The signal context of certain RISC-V extensions will be appended after
struct __riscv_extra_ext_header, which already includes an empty context
header. Therefore, there is no need to preserve a separate hdr for the
END of signal context.

Fixes: 8ee0b41898 ("riscv: signal: Add sigcontext save/restore for vector")
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Reviewed-by: Zong Li <zong.li@sifive.com>
Reviewed-by: Andy Chiu <AndybnAC@gmail.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20241220083926.19453-2-yongxuan.wang@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-02-14 13:06:44 -08:00
Clément Léger
c6ec1e1b07
riscv: cpufeature: use bitmap_equal() instead of memcmp()
Comparison of bitmaps should be done using bitmap_equal(), not memcmp(),
use the former one to compare isa bitmaps.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Fixes: 625034abd5 ("riscv: add ISA extensions validation callback")
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250210155615.1545738-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-02-14 13:06:16 -08:00
Rob Herring
fb8179ce29
riscv: cacheinfo: Use of_property_present() for non-boolean properties
The use of of_property_read_bool() for non-boolean properties is
deprecated in favor of of_property_present() when testing for property
presence.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Cc: stable@vger.kernel.org
Fixes: 76d2a0493a ("RISC-V: Init and Halt Code")
Link: https://lore.kernel.org/r/20241104190314.270095-1-robh@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-02-14 13:04:20 -08:00
Linus Torvalds
1b5f3c51fb RISC-V Patches for the 6.14 Merge Window, Part 1
* The PH1520 pinctrl and dwmac drivers are enabeled in defconfig.
 * A redundant AQRL barrier has been removed from the futex cmpxchg
   implementation.
 * Support for the T-Head vector extensions, which includes exposing
   these extensions to userspace on systems that implement them.
 * Some more page table information is now printed on die() and systems
   that cause PA overflows.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmedHIoTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYievXD/4hdt8h+fMM0I9mmJS096YevRJONdfe
 Wk7D5q4PBwSHISHahuzfphieBhqPVnYkkEd7Vw6xRrLbUnhA41Fe0uvR52dx5UZd
 3LwrDV/kjGTD59x6A2Zo9bSs/qPKJ2WHmHwHM21jY5tvcIB2Lo4dF8HT63OrwVNW
 DxsujLO0jUw+HEwXPsfmUAZJWOPZuUnatl/9CaLMLwQv5N7yiMuz5oYDzJXTLnNh
 m3Hv3CCtj1EeQPqDoWzz9nZvmAKOwcblSzz6OAy+xrRk1N0N3QFQPbIaRvkI9OVz
 +wPHQiyx4KZNeAe0csV0uLQRIiXZV8rkCz5UT65s3Bfy3vukvzz+1VBdNnCqiP8Q
 RpCTcYw62Cr6BWnvyTh+s9bhHb1ijG043nXd/Ty7ZRPCNLKHY6oL1CZ0pgqbTwPs
 D2U2ZTZFTc35mPrU6QMfbTiUVWCU2XagFhI27Dgj3xh9mkBOQCHwk2Mrzn7uS4iz
 xGNnrjRnKtuwBrvD68JzxCkEi8INFn2ifbVr44VZrOdTM7XtODGAYrBohQtV62kU
 2L+q8DoHYis+0xFbR1wdrY1mRZoe45boUFgwnOpmoBr9ULe584sL+526y7IkkEHu
 /9hmLPtLg7nyoR/rO1j1Sfg4Eqdwg5HY1TKNfagJZAdu23EDRwrcW1PD0P6vtDv8
 j4og8MmL7dTt3A==
 =HbAQ
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.14-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - The PH1520 pinctrl and dwmac drivers are enabeled in defconfig

 - A redundant AQRL barrier has been removed from the futex cmpxchg
   implementation

 - Support for the T-Head vector extensions, which includes exposing
   these extensions to userspace on systems that implement them

 - Some more page table information is now printed on die() and systems
   that cause PA overflows

* tag 'riscv-for-linus-6.14-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: add a warning when physical memory address overflows
  riscv/mm/fault: add show_pte() before die()
  riscv: Add ghostwrite vulnerability
  selftests: riscv: Support xtheadvector in vector tests
  selftests: riscv: Fix vector tests
  riscv: hwprobe: Document thead vendor extensions and xtheadvector extension
  riscv: hwprobe: Add thead vendor extension probing
  riscv: vector: Support xtheadvector save/restore
  riscv: Add xtheadvector instruction definitions
  riscv: csr: Add CSR encodings for CSR_VXRM/CSR_VXSAT
  RISC-V: define the elements of the VCSR vector CSR
  riscv: vector: Use vlenb from DT for thead
  riscv: Add thead and xtheadvector as a vendor extension
  riscv: dts: allwinner: Add xtheadvector to the D1/D1s devicetree
  dt-bindings: cpus: add a thead vlen register length property
  dt-bindings: riscv: Add xtheadvector ISA extension description
  RISC-V: Mark riscv_v_init() as __init
  riscv: defconfig: drop RT_GROUP_SCHED=y
  riscv/futex: Optimize atomic cmpxchg
  riscv: defconfig: enable pinctrl and dwmac support for TH1520
2025-01-31 15:13:25 -08:00
Joel Granados
1751f872cc treewide: const qualify ctl_tables where applicable
Add the const qualifier to all the ctl_tables in the tree except for
watchdog_hardlockup_sysctl, memory_allocation_profiling_sysctls,
loadpin_sysctl_table and the ones calling register_net_sysctl (./net,
drivers/inifiniband dirs). These are special cases as they use a
registration function with a non-const qualified ctl_table argument or
modify the arrays before passing them on to the registration function.

Constifying ctl_table structs will prevent the modification of
proc_handler function pointers as the arrays would reside in .rodata.
This is made possible after commit 78eb4ea25c ("sysctl: treewide:
constify the ctl_table argument of proc_handlers") constified all the
proc_handlers.

Created this by running an spatch followed by a sed command:
Spatch:
    virtual patch

    @
    depends on !(file in "net")
    disable optional_qualifier
    @

    identifier table_name != {
      watchdog_hardlockup_sysctl,
      iwcm_ctl_table,
      ucma_ctl_table,
      memory_allocation_profiling_sysctls,
      loadpin_sysctl_table
    };
    @@

    + const
    struct ctl_table table_name [] = { ... };

sed:
    sed --in-place \
      -e "s/struct ctl_table .table = &uts_kern/const struct ctl_table *table = \&uts_kern/" \
      kernel/utsname_sysctl.c

Reviewed-by: Song Liu <song@kernel.org>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> # for kernel/trace/
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> # SCSI
Reviewed-by: Darrick J. Wong <djwong@kernel.org> # xfs
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Bill O'Donnell <bodonnel@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Acked-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Acked-by: Anna Schumaker <anna.schumaker@oracle.com>
Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-01-28 13:48:37 +01:00
Linus Torvalds
9c5968db9e The various patchsets are summarized below. Plus of course many
indivudual patches which are described in their changelogs.
 
 - "Allocate and free frozen pages" from Matthew Wilcox reorganizes the
   page allocator so we end up with the ability to allocate and free
   zero-refcount pages.  So that callers (ie, slab) can avoid a refcount
   inc & dec.
 
 - "Support large folios for tmpfs" from Baolin Wang teaches tmpfs to use
   large folios other than PMD-sized ones.
 
 - "Fix mm/rodata_test" from Petr Tesarik performs some maintenance and
   fixes for this small built-in kernel selftest.
 
 - "mas_anode_descend() related cleanup" from Wei Yang tidies up part of
   the mapletree code.
 
 - "mm: fix format issues and param types" from Keren Sun implements a
   few minor code cleanups.
 
 - "simplify split calculation" from Wei Yang provides a few fixes and a
   test for the mapletree code.
 
 - "mm/vma: make more mmap logic userland testable" from Lorenzo Stoakes
   continues the work of moving vma-related code into the (relatively) new
   mm/vma.c.
 
 - "mm/page_alloc: gfp flags cleanups for alloc_contig_*()" from David
   Hildenbrand cleans up and rationalizes handling of gfp flags in the page
   allocator.
 
 - "readahead: Reintroduce fix for improper RA window sizing" from Jan
   Kara is a second attempt at fixing a readahead window sizing issue.  It
   should reduce the amount of unnecessary reading.
 
 - "synchronously scan and reclaim empty user PTE pages" from Qi Zheng
   addresses an issue where "huge" amounts of pte pagetables are
   accumulated
   (https://lore.kernel.org/lkml/cover.1718267194.git.zhengqi.arch@bytedance.com/).
   Qi's series addresses this windup by synchronously freeing PTE memory
   within the context of madvise(MADV_DONTNEED).
 
 - "selftest/mm: Remove warnings found by adding compiler flags" from
   Muhammad Usama Anjum fixes some build warnings in the selftests code
   when optional compiler warnings are enabled.
 
 - "mm: don't use __GFP_HARDWALL when migrating remote pages" from David
   Hildenbrand tightens the allocator's observance of __GFP_HARDWALL.
 
 - "pkeys kselftests improvements" from Kevin Brodsky implements various
   fixes and cleanups in the MM selftests code, mainly pertaining to the
   pkeys tests.
 
 - "mm/damon: add sample modules" from SeongJae Park enhances DAMON to
   estimate application working set size.
 
 - "memcg/hugetlb: Rework memcg hugetlb charging" from Joshua Hahn
   provides some cleanups to memcg's hugetlb charging logic.
 
 - "mm/swap_cgroup: remove global swap cgroup lock" from Kairui Song
   removes the global swap cgroup lock.  A speedup of 10% for a tmpfs-based
   kernel build was demonstrated.
 
 - "zram: split page type read/write handling" from Sergey Senozhatsky
   has several fixes and cleaups for zram in the area of zram_write_page().
   A watchdog softlockup warning was eliminated.
 
 - "move pagetable_*_dtor() to __tlb_remove_table()" from Kevin Brodsky
   cleans up the pagetable destructor implementations.  A rare
   use-after-free race is fixed.
 
 - "mm/debug: introduce and use VM_WARN_ON_VMG()" from Lorenzo Stoakes
   simplifies and cleans up the debugging code in the VMA merging logic.
 
 - "Account page tables at all levels" from Kevin Brodsky cleans up and
   regularizes the pagetable ctor/dtor handling.  This results in
   improvements in accounting accuracy.
 
 - "mm/damon: replace most damon_callback usages in sysfs with new core
   functions" from SeongJae Park cleans up and generalizes DAMON's sysfs
   file interface logic.
 
 - "mm/damon: enable page level properties based monitoring" from
   SeongJae Park increases the amount of information which is presented in
   response to DAMOS actions.
 
 - "mm/damon: remove DAMON debugfs interface" from SeongJae Park removes
   DAMON's long-deprecated debugfs interfaces.  Thus the migration to sysfs
   is completed.
 
 - "mm/hugetlb: Refactor hugetlb allocation resv accounting" from Peter
   Xu cleans up and generalizes the hugetlb reservation accounting.
 
 - "mm: alloc_pages_bulk: small API refactor" from Luiz Capitulino
   removes a never-used feature of the alloc_pages_bulk() interface.
 
 - "mm/damon: extend DAMOS filters for inclusion" from SeongJae Park
   extends DAMOS filters to support not only exclusion (rejecting), but
   also inclusion (allowing) behavior.
 
 - "Add zpdesc memory descriptor for zswap.zpool" from Alex Shi
   "introduces a new memory descriptor for zswap.zpool that currently
   overlaps with struct page for now.  This is part of the effort to reduce
   the size of struct page and to enable dynamic allocation of memory
   descriptors."
 
 - "mm, swap: rework of swap allocator locks" from Kairui Song redoes and
   simplifies the swap allocator locking.  A speedup of 400% was
   demonstrated for one workload.  As was a 35% reduction for kernel build
   time with swap-on-zram.
 
 - "mm: update mips to use do_mmap(), make mmap_region() internal" from
   Lorenzo Stoakes reworks MIPS's use of mmap_region() so that
   mmap_region() can be made MM-internal.
 
 - "mm/mglru: performance optimizations" from Yu Zhao fixes a few MGLRU
   regressions and otherwise improves MGLRU performance.
 
 - "Docs/mm/damon: add tuning guide and misc updates" from SeongJae Park
   updates DAMON documentation.
 
 - "Cleanup for memfd_create()" from Isaac Manjarres does that thing.
 
 - "mm: hugetlb+THP folio and migration cleanups" from David Hildenbrand
   provides various cleanups in the areas of hugetlb folios, THP folios and
   migration.
 
 - "Uncached buffered IO" from Jens Axboe implements the new
   RWF_DONTCACHE flag which provides synchronous dropbehind for pagecache
   reading and writing.  To permite userspace to address issues with
   massive buildup of useless pagecache when reading/writing fast devices.
 
 - "selftests/mm: virtual_address_range: Reduce memory" from Thomas
   Weißschuh fixes and optimizes some of the MM selftests.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZ5a+cwAKCRDdBJ7gKXxA
 jtoyAP9R58oaOKPJuTizEKKXvh/RpMyD6sYcz/uPpnf+cKTZxQEAqfVznfWlw/Lz
 uC3KRZYhmd5YrxU4o+qjbzp9XWX/xAE=
 =Ib2s
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2025-01-26-14-59' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "The various patchsets are summarized below. Plus of course many
  indivudual patches which are described in their changelogs.

   - "Allocate and free frozen pages" from Matthew Wilcox reorganizes
     the page allocator so we end up with the ability to allocate and
     free zero-refcount pages. So that callers (ie, slab) can avoid a
     refcount inc & dec

   - "Support large folios for tmpfs" from Baolin Wang teaches tmpfs to
     use large folios other than PMD-sized ones

   - "Fix mm/rodata_test" from Petr Tesarik performs some maintenance
     and fixes for this small built-in kernel selftest

   - "mas_anode_descend() related cleanup" from Wei Yang tidies up part
     of the mapletree code

   - "mm: fix format issues and param types" from Keren Sun implements a
     few minor code cleanups

   - "simplify split calculation" from Wei Yang provides a few fixes and
     a test for the mapletree code

   - "mm/vma: make more mmap logic userland testable" from Lorenzo
     Stoakes continues the work of moving vma-related code into the
     (relatively) new mm/vma.c

   - "mm/page_alloc: gfp flags cleanups for alloc_contig_*()" from David
     Hildenbrand cleans up and rationalizes handling of gfp flags in the
     page allocator

   - "readahead: Reintroduce fix for improper RA window sizing" from Jan
     Kara is a second attempt at fixing a readahead window sizing issue.
     It should reduce the amount of unnecessary reading

   - "synchronously scan and reclaim empty user PTE pages" from Qi Zheng
     addresses an issue where "huge" amounts of pte pagetables are
     accumulated:

       https://lore.kernel.org/lkml/cover.1718267194.git.zhengqi.arch@bytedance.com/

     Qi's series addresses this windup by synchronously freeing PTE
     memory within the context of madvise(MADV_DONTNEED)

   - "selftest/mm: Remove warnings found by adding compiler flags" from
     Muhammad Usama Anjum fixes some build warnings in the selftests
     code when optional compiler warnings are enabled

   - "mm: don't use __GFP_HARDWALL when migrating remote pages" from
     David Hildenbrand tightens the allocator's observance of
     __GFP_HARDWALL

   - "pkeys kselftests improvements" from Kevin Brodsky implements
     various fixes and cleanups in the MM selftests code, mainly
     pertaining to the pkeys tests

   - "mm/damon: add sample modules" from SeongJae Park enhances DAMON to
     estimate application working set size

   - "memcg/hugetlb: Rework memcg hugetlb charging" from Joshua Hahn
     provides some cleanups to memcg's hugetlb charging logic

   - "mm/swap_cgroup: remove global swap cgroup lock" from Kairui Song
     removes the global swap cgroup lock. A speedup of 10% for a
     tmpfs-based kernel build was demonstrated

   - "zram: split page type read/write handling" from Sergey Senozhatsky
     has several fixes and cleaups for zram in the area of
     zram_write_page(). A watchdog softlockup warning was eliminated

   - "move pagetable_*_dtor() to __tlb_remove_table()" from Kevin
     Brodsky cleans up the pagetable destructor implementations. A rare
     use-after-free race is fixed

   - "mm/debug: introduce and use VM_WARN_ON_VMG()" from Lorenzo Stoakes
     simplifies and cleans up the debugging code in the VMA merging
     logic

   - "Account page tables at all levels" from Kevin Brodsky cleans up
     and regularizes the pagetable ctor/dtor handling. This results in
     improvements in accounting accuracy

   - "mm/damon: replace most damon_callback usages in sysfs with new
     core functions" from SeongJae Park cleans up and generalizes
     DAMON's sysfs file interface logic

   - "mm/damon: enable page level properties based monitoring" from
     SeongJae Park increases the amount of information which is
     presented in response to DAMOS actions

   - "mm/damon: remove DAMON debugfs interface" from SeongJae Park
     removes DAMON's long-deprecated debugfs interfaces. Thus the
     migration to sysfs is completed

   - "mm/hugetlb: Refactor hugetlb allocation resv accounting" from
     Peter Xu cleans up and generalizes the hugetlb reservation
     accounting

   - "mm: alloc_pages_bulk: small API refactor" from Luiz Capitulino
     removes a never-used feature of the alloc_pages_bulk() interface

   - "mm/damon: extend DAMOS filters for inclusion" from SeongJae Park
     extends DAMOS filters to support not only exclusion (rejecting),
     but also inclusion (allowing) behavior

   - "Add zpdesc memory descriptor for zswap.zpool" from Alex Shi
     introduces a new memory descriptor for zswap.zpool that currently
     overlaps with struct page for now. This is part of the effort to
     reduce the size of struct page and to enable dynamic allocation of
     memory descriptors

   - "mm, swap: rework of swap allocator locks" from Kairui Song redoes
     and simplifies the swap allocator locking. A speedup of 400% was
     demonstrated for one workload. As was a 35% reduction for kernel
     build time with swap-on-zram

   - "mm: update mips to use do_mmap(), make mmap_region() internal"
     from Lorenzo Stoakes reworks MIPS's use of mmap_region() so that
     mmap_region() can be made MM-internal

   - "mm/mglru: performance optimizations" from Yu Zhao fixes a few
     MGLRU regressions and otherwise improves MGLRU performance

   - "Docs/mm/damon: add tuning guide and misc updates" from SeongJae
     Park updates DAMON documentation

   - "Cleanup for memfd_create()" from Isaac Manjarres does that thing

   - "mm: hugetlb+THP folio and migration cleanups" from David
     Hildenbrand provides various cleanups in the areas of hugetlb
     folios, THP folios and migration

   - "Uncached buffered IO" from Jens Axboe implements the new
     RWF_DONTCACHE flag which provides synchronous dropbehind for
     pagecache reading and writing. To permite userspace to address
     issues with massive buildup of useless pagecache when
     reading/writing fast devices

   - "selftests/mm: virtual_address_range: Reduce memory" from Thomas
     Weißschuh fixes and optimizes some of the MM selftests"

* tag 'mm-stable-2025-01-26-14-59' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits)
  mm/compaction: fix UBSAN shift-out-of-bounds warning
  s390/mm: add missing ctor/dtor on page table upgrade
  kasan: sw_tags: use str_on_off() helper in kasan_init_sw_tags()
  tools: add VM_WARN_ON_VMG definition
  mm/damon/core: use str_high_low() helper in damos_wmark_wait_us()
  seqlock: add missing parameter documentation for raw_seqcount_try_begin()
  mm/page-writeback: consolidate wb_thresh bumping logic into __wb_calc_thresh
  mm/page_alloc: remove the incorrect and misleading comment
  zram: remove zcomp_stream_put() from write_incompressible_page()
  mm: separate move/undo parts from migrate_pages_batch()
  mm/kfence: use str_write_read() helper in get_access_type()
  selftests/mm/mkdirty: fix memory leak in test_uffdio_copy()
  kasan: hw_tags: Use str_on_off() helper in kasan_init_hw_tags()
  selftests/mm: virtual_address_range: avoid reading from VM_IO mappings
  selftests/mm: vm_util: split up /proc/self/smaps parsing
  selftests/mm: virtual_address_range: unmap chunks after validation
  selftests/mm: virtual_address_range: mmap() without PROT_WRITE
  selftests/memfd/memfd_test: fix possible NULL pointer dereference
  mm: add FGP_DONTCACHE folio creation flag
  mm: call filemap_fdatawrite_range_kick() after IOCB_DONTCACHE issue
  ...
2025-01-26 18:36:23 -08:00
Guo Weikang
c6f239796b mm/memblock: add memblock_alloc_or_panic interface
Before SLUB initialization, various subsystems used memblock_alloc to
allocate memory.  In most cases, when memory allocation fails, an
immediate panic is required.  To simplify this behavior and reduce
repetitive checks, introduce `memblock_alloc_or_panic`.  This function
ensures that memory allocation failures result in a panic automatically,
improving code readability and consistency across subsystems that require
this behavior.

[guoweikang.kernel@gmail.com: arch/s390: save_area_alloc default failure behavior changed to panic]
  Link: https://lkml.kernel.org/r/20250109033136.2845676-1-guoweikang.kernel@gmail.com
  Link: https://lore.kernel.org/lkml/Z2fknmnNtiZbCc7x@kernel.org/
Link: https://lkml.kernel.org/r/20250102072528.650926-1-guoweikang.kernel@gmail.com
Signed-off-by: Guo Weikang <guoweikang.kernel@gmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>	[s390]
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-25 20:22:38 -08:00
Linus Torvalds
917846e9f0 samsung: add gs101-mbox driver
microchip: add sbi-ipc driver
 zynqmp: fix invalid __percpu annotation
 qcom: add IPQ5424 APCS compatible
 mpfs fix copy and paste bug
 th1520: Fix NULL vs IS_ERR() and a memory corruption bug
 tegra-hsp: clear mailbox before using message
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE6EwehDt/SOnwFyTyf9lkf8eYP5UFAmeTDB4ACgkQf9lkf8eY
 P5UfghAAgusrJTWyRCeBC/pg6dwkacvl9zAOi+eIVWb6IJ/sZcEyHvWjo9jsqAw2
 mX2tqvcltUnh6ID1GdFz6bH1OYTQfcXhs0LO/ESkGCkVsrAh2uuBFVvxLwqahY/w
 64bEc+S/xVLpAp9/XA9HXgtyAoCueDLB0dreLB4umG9ps8j9n1g46GVAauf2ftmV
 5BN0ybnPd4nYzQ4ergD59jWMI1QOrb/F2JpK9M8RLCBt135xqZ1rz5cCiYZcN4LY
 gxX6HlbkVSaZ45ZEe7++iGK8c8M+uWsxOhjE0PXT1SeeNPyXz09Spsny75r1uUZo
 04t9kQ2yzcrBkHd7AInNBpiKQgFSvuhhjZDI0gQZLcD0aDsNvTTR86PoOuH7pP8B
 qGg+wCU1Puy1Olv9nelc+M5nSLMSBHcXvSDVN8kOCGOXGPgoPLEOD6K8L0s9WPRZ
 VnzQiddMcBuYhFXbGNv73EblrjGyMwJPeUl06yDnMBoaZXvqHirvYwANzJgvhZn2
 piq5bhX2eUrN0UIP+VInU2RAWFsN+ORAoL5TH/JTkA5S7+yzOCK0R/wkUsMcmvco
 gBmNtKOvuHEDl4M50TC9baL/3p07VM/hxXf9j6zGeQVpfITAQGh++/vMNug95ITM
 FlgWWWJziNb9v3F1XU2bLESqrTSjwNC5MXPDV+CraVorwzyeao0=
 =XBAn
 -----END PGP SIGNATURE-----

Merge tag 'mailbox-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox

Pull mailbox updates from Jassi Brar:

 - samsung: add gs101-mbox driver

 - microchip: add sbi-ipc driver

 - zynqmp: fix invalid __percpu annotation

 - qcom: add IPQ5424 APCS compatible

 - mpfs fix copy and paste bug

 - th1520: Fix NULL vs IS_ERR() and a memory corruption bug

 - tegra-hsp: clear mailbox before using message

* tag 'mailbox-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox:
  riscv: export __cpuid_to_hartid_map
  riscv: sbi: vendorid_list: Add Microchip Technology to the vendor list
  mailbox: th1520: Fix memory corruption due to incorrect array size
  mailbox: zynqmp: Remove invalid __percpu annotation in zynqmp_ipi_probe()
  MAINTAINERS: add entry for Samsung Exynos mailbox driver
  mailbox: add Samsung Exynos driver
  dt-bindings: mailbox: add google,gs101-mbox
  mailbox: qcom: Add support for IPQ5424 APCS IPC
  dt-bindings: mailbox: qcom: Add IPQ5424 APCS compatible
  mailbox: qcom-ipcc: Reset CLEAR_ON_RECV_RD if set from boot firmware
  mailbox: add Microchip IPC support
  dt-bindings: mailbox: add binding for Microchip IPC mailbox controller
  mailbox: tegra-hsp: Clear mailbox before using message
  mailbox: mpfs: fix copy and paste bug in probe
  mailbox: th1520: Fix a NULL vs IS_ERR() bug
2025-01-24 16:04:40 -08:00
Linus Torvalds
2e04247f7c ftrace updates for v6.14:
- Have fprobes built on top of function graph infrastructure
 
   The fprobe logic is an optimized kprobe that uses ftrace to attach to
   functions when a probe is needed at the start or end of the function. The
   fprobe and kretprobe logic implements a similar method as the function
   graph tracer to trace the end of the function. That is to hijack the
   return address and jump to a trampoline to do the trace when the function
   exits. To do this, a shadow stack needs to be created to store the
   original return address.  Fprobes and function graph do this slightly
   differently. Fprobes (and kretprobes) has slots per callsite that are
   reserved to save the return address. This is fine when just a few points
   are traced. But users of fprobes, such as BPF programs, are starting to add
   many more locations, and this method does not scale.
 
   The function graph tracer was created to trace all functions in the
   kernel. In order to do this, when function graph tracing is started, every
   task gets its own shadow stack to hold the return address that is going to
   be traced. The function graph tracer has been updated to allow multiple
   users to use its infrastructure. Now have fprobes be one of those users.
   This will also allow for the fprobe and kretprobe methods to trace the
   return address to become obsolete. With new technologies like CFI that
   need to know about these methods of hijacking the return address, going
   toward a solution that has only one method of doing this will make the
   kernel less complex.
 
 - Cleanup with guard() and free() helpers
 
   There were several places in the code that had a lot of "goto out" in the
   error paths to either unlock a lock or free some memory that was
   allocated. But this is error prone. Convert the code over to use the
   guard() and free() helpers that let the compiler unlock locks or free
   memory when the function exits.
 
 - Remove disabling of interrupts in the function graph tracer
 
   When function graph tracer was first introduced, it could race with
   interrupts and NMIs. To prevent that race, it would disable interrupts and
   not trace NMIs. But the code has changed to allow NMIs and also
   interrupts. This change was done a long time ago, but the disabling of
   interrupts was never removed. Remove the disabling of interrupts in the
   function graph tracer is it is not needed. This greatly improves its
   performance.
 
 - Allow the :mod: command to enable tracing module functions on the kernel
   command line.
 
   The function tracer already has a way to enable functions to be traced in
   modules by writing ":mod:<module>" into set_ftrace_filter. That will
   enable either all the functions for the module if it is loaded, or if it
   is not, it will cache that command, and when the module is loaded that
   matches <module>, its functions will be enabled. This also allows init
   functions to be traced. But currently events do not have that feature.
 
   Because enabling function tracing can be done very early at boot up
   (before scheduling is enabled), the commands that can be done when
   function tracing is started is limited. Having the ":mod:" command to
   trace module functions as they are loaded is very useful. Update the
   kernel command line function filtering to allow it.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZ42E2RQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qqXSAPwOMxuhye8tb1GYG62QD9+w7e6nOmlC
 2GCPj4detnEM2QD/ciivkhespVKhHpZHRewAuSnJgHPSM45NQ3EVESzjWQ4=
 =snbx
 -----END PGP SIGNATURE-----

Merge tag 'ftrace-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull ftrace updates from Steven Rostedt:

 - Have fprobes built on top of function graph infrastructure

   The fprobe logic is an optimized kprobe that uses ftrace to attach to
   functions when a probe is needed at the start or end of the function.
   The fprobe and kretprobe logic implements a similar method as the
   function graph tracer to trace the end of the function. That is to
   hijack the return address and jump to a trampoline to do the trace
   when the function exits. To do this, a shadow stack needs to be
   created to store the original return address. Fprobes and function
   graph do this slightly differently. Fprobes (and kretprobes) has
   slots per callsite that are reserved to save the return address. This
   is fine when just a few points are traced. But users of fprobes, such
   as BPF programs, are starting to add many more locations, and this
   method does not scale.

   The function graph tracer was created to trace all functions in the
   kernel. In order to do this, when function graph tracing is started,
   every task gets its own shadow stack to hold the return address that
   is going to be traced. The function graph tracer has been updated to
   allow multiple users to use its infrastructure. Now have fprobes be
   one of those users. This will also allow for the fprobe and kretprobe
   methods to trace the return address to become obsolete. With new
   technologies like CFI that need to know about these methods of
   hijacking the return address, going toward a solution that has only
   one method of doing this will make the kernel less complex.

 - Cleanup with guard() and free() helpers

   There were several places in the code that had a lot of "goto out" in
   the error paths to either unlock a lock or free some memory that was
   allocated. But this is error prone. Convert the code over to use the
   guard() and free() helpers that let the compiler unlock locks or free
   memory when the function exits.

 - Remove disabling of interrupts in the function graph tracer

   When function graph tracer was first introduced, it could race with
   interrupts and NMIs. To prevent that race, it would disable
   interrupts and not trace NMIs. But the code has changed to allow NMIs
   and also interrupts. This change was done a long time ago, but the
   disabling of interrupts was never removed. Remove the disabling of
   interrupts in the function graph tracer is it is not needed. This
   greatly improves its performance.

 - Allow the :mod: command to enable tracing module functions on the
   kernel command line.

   The function tracer already has a way to enable functions to be
   traced in modules by writing ":mod:<module>" into set_ftrace_filter.
   That will enable either all the functions for the module if it is
   loaded, or if it is not, it will cache that command, and when the
   module is loaded that matches <module>, its functions will be
   enabled. This also allows init functions to be traced. But currently
   events do not have that feature.

   Because enabling function tracing can be done very early at boot up
   (before scheduling is enabled), the commands that can be done when
   function tracing is started is limited. Having the ":mod:" command to
   trace module functions as they are loaded is very useful. Update the
   kernel command line function filtering to allow it.

* tag 'ftrace-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (26 commits)
  ftrace: Implement :mod: cache filtering on kernel command line
  tracing: Adopt __free() and guard() for trace_fprobe.c
  bpf: Use ftrace_get_symaddr() for kprobe_multi probes
  ftrace: Add ftrace_get_symaddr to convert fentry_ip to symaddr
  Documentation: probes: Update fprobe on function-graph tracer
  selftests/ftrace: Add a test case for repeating register/unregister fprobe
  selftests: ftrace: Remove obsolate maxactive syntax check
  tracing/fprobe: Remove nr_maxactive from fprobe
  fprobe: Add fprobe_header encoding feature
  fprobe: Rewrite fprobe on function-graph tracer
  s390/tracing: Enable HAVE_FTRACE_GRAPH_FUNC
  ftrace: Add CONFIG_HAVE_FTRACE_GRAPH_FUNC
  bpf: Enable kprobe_multi feature if CONFIG_FPROBE is enabled
  tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS
  tracing: Add ftrace_fill_perf_regs() for perf event
  tracing: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs
  fprobe: Use ftrace_regs in fprobe exit handler
  fprobe: Use ftrace_regs in fprobe entry handler
  fgraph: Pass ftrace_regs to retfunc
  fgraph: Replace fgraph_ret_regs with ftrace_regs
  ...
2025-01-21 15:15:28 -08:00
Linus Torvalds
4c551165e7 Updates for the interrupt subsystem:
- Consolidation of the machine_kexec_mask_interrupts() by providing a
     generic implementation and replacing the copy & pasta orgy in the
     relevant architectures.
 
   - Prevent unconditional operations on interrupt chips during kexec
     shutdown, which can trigger warnings in certain cases when the
     underlying interrupt has been shut down before.
 
   - Make the enforcement of interrupt handling in interrupt context
     unconditionally available, so that it actually works for non x86
     related interrupt chips. The earlier enablement for ARM GIC chips set
     the required chip flag, but did not notice that the check was hidden
     behind a config switch which is not selected by ARM[64].
 
   - Decrapify the handling of deferred interrupt affinity setting. Some
     interrupt chips require that affinity changes are made from the context
     of handling an interrupt to avoid certain race conditions. For x86 this
     was the default, but with interrupt remapping this requirement was
     lifted and a flag was introduced which tells the core code that
     affinity changes can be done in any context. Unrestricted affinity
     changes are the default for the majority of interrupt chips. RISCV has
     the requirement to add the deferred mode to one of it's interrupt
     controllers, but with the original implementation this would require to
     add the any context flag to all other RISC-V interrupt chips. That's
     backwards, so reverse the logic and require that chips, which need the
     deferred mode have to be marked accordingly. That avoids chasing the
     'sane' chips and marking them.
 
   - Add multi-node support to the Loongarch AVEC interrupt controller
     driver.
 
   - The usual tiny cleanups, fixes and improvements all over the place.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmePkVITHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoRbQD/9bHVph/V9Ekl7JAX3aY4gG4JbRhOc7
 dp1VAcHRhktRfoTztYRbjsbMu2nvZ58GKA8bkOS2jHSF/m3PbkIJfOhwk0YdIAoa
 +kdy5yDgqCGfkqW43DN4Cr+CnzGjWMitw67tFp3fhwehMDpDjdt2L28IjtanSS0f
 hO6FV7o65MWeJwxk4Isb2/nvkO+X23Lrp6RrWS8SXBnF9FFXxiPIg/fiOPTizhCh
 1W/bSGxLLb9WwsVzmlGAKVFlXDij0QGaIUug2fdVZ63OsELXD7tJrLSPG133yk92
 ppIa0s6BT4IBsfM00us4hG15PkLuJmP3yWWcoquG0rP8Wq58VOXiN6+rcJIyvB+5
 mWceTH6IKfZGoRQKwXC7BxeBAIb147reiJtb06meq1/8ADIvzafiNy0c8x9i/UaV
 QiyhPVENjaGCGDomZmJQqN7Yb02Wge1k8InQnodDrHxZNl/bX/B1Z8Bxd0n6hPHg
 NSJXYif2AxgaddpohsdygqRDbT6SNyQdj7YjJFY5qAGJ3yFyJ4JB6WTqkWW4o1vH
 3FVqdAnJmejAmmYSkah0Hkem2T5QASQmTWb93PLxiV6q+d0NM8stWAujjyVdIV/B
 W4Uj9mQ20cz54TjLtxqX+A1k6KcqOWRgh1l2QbUlFsgsOP3V8yz47yqYdR9qMWlO
 9kNEjI3sw+G/IQ==
 =q4rj
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull interrupt subsystem updates from Thomas Gleixner:

 - Consolidate the machine_kexec_mask_interrupts() by providing a
   generic implementation and replacing the copy & pasta orgy in the
   relevant architectures.

 - Prevent unconditional operations on interrupt chips during kexec
   shutdown, which can trigger warnings in certain cases when the
   underlying interrupt has been shut down before.

 - Make the enforcement of interrupt handling in interrupt context
   unconditionally available, so that it actually works for non x86
   related interrupt chips. The earlier enablement for ARM GIC chips set
   the required chip flag, but did not notice that the check was hidden
   behind a config switch which is not selected by ARM[64].

 - Decrapify the handling of deferred interrupt affinity setting.

   Some interrupt chips require that affinity changes are made from the
   context of handling an interrupt to avoid certain race conditions.
   For x86 this was the default, but with interrupt remapping this
   requirement was lifted and a flag was introduced which tells the core
   code that affinity changes can be done in any context. Unrestricted
   affinity changes are the default for the majority of interrupt chips.

   RISCV has the requirement to add the deferred mode to one of it's
   interrupt controllers, but with the original implementation this
   would require to add the any context flag to all other RISC-V
   interrupt chips. That's backwards, so reverse the logic and require
   that chips, which need the deferred mode have to be marked
   accordingly. That avoids chasing the 'sane' chips and marking them.

 - Add multi-node support to the Loongarch AVEC interrupt controller
   driver.

 - The usual tiny cleanups, fixes and improvements all over the place.

* tag 'irq-core-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/generic_chip: Export irq_gc_mask_disable_and_ack_set()
  genirq/timings: Add kernel-doc for a function parameter
  genirq: Remove IRQ_MOVE_PCNTXT and related code
  x86/apic: Convert to IRQCHIP_MOVE_DEFERRED
  genirq: Provide IRQCHIP_MOVE_DEFERRED
  hexagon: Remove GENERIC_PENDING_IRQ leftover
  ARC: Remove GENERIC_PENDING_IRQ
  genirq: Remove handle_enforce_irqctx() wrapper
  genirq: Make handle_enforce_irqctx() unconditionally available
  irqchip/loongarch-avec: Add multi-nodes topology support
  irqchip/ts4800: Replace seq_printf() by seq_puts()
  irqchip/ti-sci-inta : Add module build support
  irqchip/ti-sci-intr: Add module build support
  irqchip/irq-brcmstb-l2: Replace brcmstb_l2_mask_and_ack() by generic function
  irqchip: keystone: Use syscon_regmap_lookup_by_phandle_args
  genirq/kexec: Prevent redundant IRQ masking by checking state before shutdown
  kexec: Consolidate machine_kexec_mask_interrupts() implementation
  genirq: Reuse irq_thread_fn() for forced thread case
  genirq: Move irq_thread_fn() further up in the code
2025-01-21 13:51:07 -08:00
Valentina Fernandez
4783ce32b0 riscv: export __cpuid_to_hartid_map
EXPORT_SYMBOL_GPL() is missing for __cpuid_to_hartid_map array.
Export this symbol to allow drivers compiled as modules to use
cpuid_to_hartid_map().

Signed-off-by: Valentina Fernandez <valentina.fernandezalanis@microchip.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2025-01-20 10:25:11 -06:00
Palmer Dabbelt
2613c15b0c
Merge patch series "riscv: Add support for xtheadvector"
Charlie Jenkins <charlie@rivosinc.com> says:

xtheadvector is a custom extension that is based upon riscv vector
version 0.7.1 [1]. All of the vector routines have been modified to
support this alternative vector version based upon whether xtheadvector
was determined to be supported at boot.

vlenb is not supported on the existing xtheadvector hardware, so a
devicetree property thead,vlenb is added to provide the vlenb to Linux.

There is a new hwprobe key RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0 that is
used to request which thead vendor extensions are supported on the
current platform. This allows future vendors to allocate hwprobe keys
for their vendor.

Support for xtheadvector is also added to the vector kselftests.

[1] 95358cb2cc/xtheadvector.adoc

* b4-shazam-merge:
  riscv: Add ghostwrite vulnerability
  selftests: riscv: Support xtheadvector in vector tests
  selftests: riscv: Fix vector tests
  riscv: hwprobe: Document thead vendor extensions and xtheadvector extension
  riscv: hwprobe: Add thead vendor extension probing
  riscv: vector: Support xtheadvector save/restore
  riscv: Add xtheadvector instruction definitions
  riscv: csr: Add CSR encodings for CSR_VXRM/CSR_VXSAT
  RISC-V: define the elements of the VCSR vector CSR
  riscv: vector: Use vlenb from DT for thead
  riscv: Add thead and xtheadvector as a vendor extension
  riscv: dts: allwinner: Add xtheadvector to the D1/D1s devicetree
  dt-bindings: cpus: add a thead vlen register length property
  dt-bindings: riscv: Add xtheadvector ISA extension description

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20241113-xtheadvector-v11-0-236c22791ef9@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-18 12:33:43 -08:00
Charlie Jenkins
4bf9706923
riscv: Add ghostwrite vulnerability
Follow the patterns of the other architectures that use
GENERIC_CPU_VULNERABILITIES for riscv to introduce the ghostwrite
vulnerability and mitigation. The mitigation is to disable all vector
which is accomplished by clearing the bit from the cpufeature field.

Ghostwrite only affects thead c9xx CPUs that impelment xtheadvector, so
the vulerability will only be mitigated on these CPUs.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Yangyu Chen <cyy@cyyself.name>
Link: https://lore.kernel.org/r/20241113-xtheadvector-v11-14-236c22791ef9@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-18 12:33:39 -08:00
Charlie Jenkins
a5ea53da65
riscv: hwprobe: Add thead vendor extension probing
Add a new hwprobe key "RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0" which
allows userspace to probe for the new RISCV_ISA_VENDOR_EXT_XTHEADVECTOR
vendor extension.

This new key will allow userspace code to probe for which thead vendor
extensions are supported. This API is modeled to be consistent with
RISCV_HWPROBE_KEY_IMA_EXT_0. The bitmask returned will have each bit
corresponding to a supported thead vendor extension of the cpumask set.
Just like RISCV_HWPROBE_KEY_IMA_EXT_0, this allows a userspace program
to determine all of the supported thead vendor extensions in one call.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Evan Green <evan@rivosinc.com>
Tested-by: Yangyu Chen <cyy@cyyself.name>
Link: https://lore.kernel.org/r/20241113-xtheadvector-v11-10-236c22791ef9@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-18 12:33:35 -08:00
Charlie Jenkins
d863910eab
riscv: vector: Support xtheadvector save/restore
Use alternatives to add support for xtheadvector vector save/restore
routines.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Yangyu Chen <cyy@cyyself.name>
Link: https://lore.kernel.org/r/20241113-xtheadvector-v11-9-236c22791ef9@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-18 12:33:33 -08:00
Charlie Jenkins
377be47f90
riscv: vector: Use vlenb from DT for thead
If thead,vlenb is provided in the device tree, prefer that over reading
the vlenb csr.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Yangyu Chen <cyy@cyyself.name>
Link: https://lore.kernel.org/r/20241113-xtheadvector-v11-5-236c22791ef9@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-18 12:33:29 -08:00
Charlie Jenkins
cddd63869f
riscv: Add thead and xtheadvector as a vendor extension
Add support to the kernel for THead vendor extensions with the target of
the new extension xtheadvector.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Yangyu Chen <cyy@cyyself.name>
Link: https://lore.kernel.org/r/20241113-xtheadvector-v11-4-236c22791ef9@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-18 12:33:28 -08:00
Palmer Dabbelt
9d87cf525f
RISC-V: Mark riscv_v_init() as __init
This trips up with Xtheadvector enabled, but as far as I can tell it's
just been an issue since the original patchset.

Fixes: 7ca7a7b9b6 ("riscv: Add sysctl to set the default vector rule for new processes")
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20250115180251.31444-1-palmer@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-18 12:33:25 -08:00
Clément Léger
5cd900b8b7
riscv: use local label names instead of global ones in assembly
Local labels should be prefix by '.L' or they'll be exported in the
symbol table. Additionally, this messes up the backtrace by displaying
an incorrect symbol:

  ...
  [   12.751810] [<ffffffff80441628>] _copy_from_user+0x28/0xc2
  [   12.752035] [<ffffffff800152ca>] handle_misaligned_load+0x1ca/0x2fc
  [   12.752310] [<ffffffff80a033e8>] do_trap_load_misaligned+0x24/0xee
  [   12.752596] [<ffffffff80a0dcae>] _new_vmalloc_restore_context_a0+0xc2/0xce

After:
  ...
  [   10.243916] [<ffffffff804415e4>] _copy_from_user+0x28/0xc2
  [   10.244026] [<ffffffff800152ca>] handle_misaligned_load+0x1ca/0x2fc
  [   10.244150] [<ffffffff80a033a0>] do_trap_load_misaligned+0x24/0xee
  [   10.244268] [<ffffffff80a0dc66>] handle_exception+0x146/0x152

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Fixes: 503638e0ba ("riscv: Stop emitting preventive sfence.vma for new vmalloc mappings")
Link: https://lore.kernel.org/r/20250103141814.508865-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:46:14 -08:00
Clément Léger
51356ce60e
riscv: stacktrace: fix backtracing through exceptions
Prior to commit 5d5fc33ce5 ("riscv: Improve exception and system call
latency"), backtrace through exception worked since ra was filled with
ret_from_exception symbol address and the stacktrace code checked 'pc' to
be equal to that symbol. Now that handle_exception uses regular 'call'
instructions, this isn't working anymore and backtrace stops at
handle_exception(). Since there are multiple call site to C code in the
exception handling path, rather than checking multiple potential return
addresses, add a new symbol at the end of exception handling and check pc
to be in that range.

Fixes: 5d5fc33ce5 ("riscv: Improve exception and system call latency")
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20241209155714.1239665-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:45:49 -08:00
Nam Cao
13134cc949
riscv: kprobes: Fix incorrect address calculation
p->ainsn.api.insn is a pointer to u32, therefore arithmetic operations are
multiplied by four. This is clearly undesirable for this case.

Cast it to (void *) first before any calculation.

Below is a sample before/after. The dumped memory is two kprobe slots, the
first slot has

  - c.addiw a0, 0x1c (0x7125)
  - ebreak           (0x00100073)

and the second slot has:

  - c.addiw a0, -4   (0x7135)
  - ebreak           (0x00100073)

Before this patch:

(gdb) x/16xh 0xff20000000135000
0xff20000000135000:	0x7125	0x0000	0x0000	0x0000	0x7135	0x0010	0x0000	0x0000
0xff20000000135010:	0x0073	0x0010	0x0000	0x0000	0x0000	0x0000	0x0000	0x0000

After this patch:

(gdb) x/16xh 0xff20000000125000
0xff20000000125000:	0x7125	0x0073	0x0010	0x0000	0x7135	0x0073	0x0010	0x0000
0xff20000000125010:	0x0000	0x0000	0x0000	0x0000	0x0000	0x0000	0x0000	0x0000

Fixes: b1756750a3 ("riscv: kprobes: Use patch_text_nosync() for insn slots")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: stable@vger.kernel.org
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20241119111056.2554419-1-namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:39:39 -08:00
Nam Cao
6a97f4118a
riscv: Fix sleeping in invalid context in die()
die() can be called in exception handler, and therefore cannot sleep.
However, die() takes spinlock_t which can sleep with PREEMPT_RT enabled.
That causes the following warning:

BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 285, name: mutex
preempt_count: 110001, expected: 0
RCU nest depth: 0, expected: 0
CPU: 0 UID: 0 PID: 285 Comm: mutex Not tainted 6.12.0-rc7-00022-ge19049cf7d56-dirty #234
Hardware name: riscv-virtio,qemu (DT)
Call Trace:
    dump_backtrace+0x1c/0x24
    show_stack+0x2c/0x38
    dump_stack_lvl+0x5a/0x72
    dump_stack+0x14/0x1c
    __might_resched+0x130/0x13a
    rt_spin_lock+0x2a/0x5c
    die+0x24/0x112
    do_trap_insn_illegal+0xa0/0xea
    _new_vmalloc_restore_context_a0+0xcc/0xd8
Oops - illegal instruction [#1]

Switch to use raw_spinlock_t, which does not sleep even with PREEMPT_RT
enabled.

Fixes: 76d2a0493a ("RISC-V: Init and Halt Code")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: stable@vger.kernel.org
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20241118091333.1185288-1-namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:23:17 -08:00
Clément Léger
03f0b54853
riscv: module: remove relocation_head rel_entry member allocation
relocation_head's list_head member, rel_entry, doesn't need to be
allocated, its storage can just be part of the allocated relocation_head.
Remove the pointer which allows to get rid of the allocation as well as
an existing memory leak found by Kai Zhang using kmemleak.

Fixes: 8fd6c51423 ("riscv: Add remaining module relocations")
Reported-by: Kai Zhang <zhangkai@iscas.ac.cn>
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20241128081636.3620468-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08 10:22:52 -08:00
Masami Hiramatsu (Google)
a3ed4157b7 fgraph: Replace fgraph_ret_regs with ftrace_regs
Use ftrace_regs instead of fgraph_ret_regs for tracing return value
on function_graph tracer because of simplifying the callback interface.

The CONFIG_HAVE_FUNCTION_GRAPH_RETVAL is also replaced by
CONFIG_HAVE_FUNCTION_GRAPH_FREGS.

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: bpf <bpf@vger.kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Alan Maguire <alan.maguire@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/173518991508.391279.16635322774382197642.stgit@devnote2
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-12-26 10:50:02 -05:00
Masami Hiramatsu (Google)
41705c4262 fgraph: Pass ftrace_regs to entryfunc
Pass ftrace_regs to the fgraph_ops::entryfunc(). If ftrace_regs is not
available, it passes a NULL instead. User callback function can access
some registers (including return address) via this ftrace_regs.

Note that the ftrace_regs can be NULL when the arch does NOT define:
HAVE_DYNAMIC_FTRACE_WITH_ARGS or HAVE_DYNAMIC_FTRACE_WITH_REGS.
More specifically, if HAVE_DYNAMIC_FTRACE_WITH_REGS is defined but
not the HAVE_DYNAMIC_FTRACE_WITH_ARGS, and the ftrace ops used to
register the function callback does not set FTRACE_OPS_FL_SAVE_REGS.
In this case, ftrace_regs can be NULL in user callback.

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: bpf <bpf@vger.kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Alan Maguire <alan.maguire@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Naveen N Rao <naveen@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/173518990044.391279.17406984900626078579.stgit@devnote2
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-12-26 10:50:02 -05:00
Alexandre Ghiti
c796e18720
riscv: Fix wrong usage of __pa() on a fixmap address
riscv uses fixmap addresses to map the dtb so we can't use __pa() which
is reserved for linear mapping addresses.

Fixes: b2473a3597 ("of/fdt: add dt_phys arg to early_init_dt_scan and early_init_dt_verify")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20241209074508.53037-1-alexghiti@rivosinc.com
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-12-11 11:43:44 -08:00
Guo Ren
b3134b8c1a
riscv: Fixup boot failure when CONFIG_DEBUG_RT_MUTEXES=y
When CONFIG_DEBUG_RT_MUTEXES=y, mutex_lock->rt_mutex_try_acquire
would change from rt_mutex_cmpxchg_acquire to
rt_mutex_slowtrylock():
	raw_spin_lock_irqsave(&lock->wait_lock, flags);
	ret = __rt_mutex_slowtrylock(lock);
	raw_spin_unlock_irqrestore(&lock->wait_lock, flags);

Because queued_spin_#ops to ticket_#ops is changed one by one by
jump_label, raw_spin_lock/unlock would cause a deadlock during the
changing.

That means in arch/riscv/kernel/jump_label.c:
1.
arch_jump_label_transform_queue() ->
mutex_lock(&text_mutex); +-> raw_spin_lock  -> queued_spin_lock
			 |-> raw_spin_unlock -> queued_spin_unlock
patch_insn_write -> change the raw_spin_lock to ticket_lock
mutex_unlock(&text_mutex);
...

2. /* Dirty the lock value */
arch_jump_label_transform_queue() ->
mutex_lock(&text_mutex); +-> raw_spin_lock -> *ticket_lock*
                         |-> raw_spin_unlock -> *queued_spin_unlock*
			  /* BUG: ticket_lock with queued_spin_unlock */
patch_insn_write  ->  change the raw_spin_unlock to ticket_unlock
mutex_unlock(&text_mutex);
...

3. /* Dead lock */
arch_jump_label_transform_queue() ->
mutex_lock(&text_mutex); +-> raw_spin_lock -> ticket_lock /* deadlock! */
                         |-> raw_spin_unlock -> ticket_unlock
patch_insn_write -> change other raw_spin_#op -> ticket_#op
mutex_unlock(&text_mutex);

So, the solution is to disable mutex usage of
arch_jump_label_transform_queue() during early_boot_irqs_disabled, just
like we have done for stop_machine.

Reported-by: Conor Dooley <conor@kernel.org>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Fixes: ab83647fad ("riscv: Add qspinlock support")
Link: https://lore.kernel.org/linux-riscv/CAJF2gTQwYTGinBmCSgVUoPv0_q4EPt_+WiyfUA1HViAKgUzxAg@mail.gmail.com/T/#mf488e6347817fca03bb93a7d34df33d8615b3775
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Nam Cao <namcao@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20241130153310.3349484-1-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-12-11 11:43:39 -08:00
Eliav Farber
bad6722e47 kexec: Consolidate machine_kexec_mask_interrupts() implementation
Consolidate the machine_kexec_mask_interrupts implementation into a common
function located in a new file: kernel/irq/kexec.c. This removes duplicate
implementations from architecture-specific files in arch/arm, arch/arm64,
arch/powerpc, and arch/riscv, reducing code duplication and improving
maintainability.

The new implementation retains architecture-specific behavior for
CONFIG_GENERIC_IRQ_KEXEC_CLEAR_VM_FORWARD, which was previously implemented
for ARM64. When enabled (currently for ARM64), it clears the active state
of interrupts forwarded to virtual machines (VMs) before handling other
interrupt masking operations.

Signed-off-by: Eliav Farber <farbere@amazon.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241204142003.32859-2-farbere@amazon.com
2024-12-11 20:32:34 +01:00
Linus Torvalds
c4bb3a2d64 ARM:
* Fixes.
 
 RISC-V:
 
 * Svade and Svadu (accessed and dirty bit) extension support for host and
   guest.  This was acked on the mailing list by the RISC-V maintainer, see
   https://patchew.org/linux/20240726084931.28924-1-yongxuan.wang@sifive.com/.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmdKS0QUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroP7hggAmt5CJesFGIuDwQgJX1KuNWAS84AX
 Oq5SPZLH0XjE5YDm6AusSzvbtOhRM6mARU5/iqMRE6Mqpf4MXpP9tOo6xaDiL7+m
 bOFsDYEO73WQyrIfFUCZ7dXiTbDVtQfNH8Z1yQwHPsa1d+WDYY3tLbCe5qCdqYMF
 JDiB7K0cQzPDmhCwf3Zf8mW2ZRI0QsTqiuFUfVGGNgFDspWfBFBqkLCkrMNmbp9z
 ye375oKb2VCe6OBJCY+Nl6tdoBUkz+CtZDCxkxuh0Uk4NmsUC9JMye9iwgU9DuI7
 nagFuvpUGcgbZvrx1ly47TL+wcEFLwnBJ0xBZTGIgVoZHj/wX9GM+tSgIw==
 =semZ
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull more kvm updates from Paolo Bonzini:

 - ARM fixes

 - RISC-V Svade and Svadu (accessed and dirty bit) extension support for
   host and guest

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: riscv: selftests: Add Svade and Svadu Extension to get-reg-list test
  RISC-V: KVM: Add Svade and Svadu Extensions Support for Guest/VM
  dt-bindings: riscv: Add Svade and Svadu Entries
  RISC-V: Add Svade and Svadu Extensions Support
  KVM: arm64: Use MDCR_EL2.HPME to evaluate overflow of hyp counters
  KVM: arm64: Ignore PMCNTENSET_EL0 while checking for overflow status
  KVM: arm64: Mark set_sysreg_masks() as inline to avoid build failure
  KVM: arm64: vgic-its: Add stronger type-checking to the ITS entry sizes
  KVM: arm64: vgic: Kill VGIC_MAX_PRIVATE definition
  KVM: arm64: vgic: Make vgic_get_irq() more robust
  KVM: arm64: vgic-v3: Sanitise guest writes to GICR_INVLPIR
2024-11-30 14:51:08 -08:00
Linus Torvalds
91dbbe6c9f RISC-V Paches for the 6.13 Merge Window, Part 1
* Support for pointer masking in userspace,
 * Support for probing vector misaligned access performance.
 * Support for qspinlock on systems with Zacas and Zabha.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmdHNu4THHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiZW7D/oCjSIdBHZ6OJN8vATRn2FoHedMgKzE
 8OF0EXX85+PNmznxzyUirPerfQPcog5422vCKLUR5h8QD0x3wdMH8gUaV0Wa11k8
 ldXlV903k7gJLtJMnww2Eiha7kds5XpNWsWBTU0sBAxt2mMUE2VlloBY5YM/fitJ
 3TUihA7vyic5J0H3H4VrkuEoFnN4Xl9WclbwCYFg0uKmiogqXCe5LKey5/JjLpDR
 2DdFe/7PRjQMuUNVrNO4Vm+/YD1nwRdg5ukvIl42KINHWKyn1hl23cKsFobrilw5
 GyMbTzP4hBhy3kpX+zjWPpvTyoHSww7iJK6AvkvgQk/gua8M6abLJheachY/Ciz1
 lJy4okB8H2LtZwMYlJiIXBQzKE1qCwNA1/m24y8SUYQXvjxwGZxaPXAyWvvqBxOP
 /q/jQYfCiQi/h7BncMv9F8cxkU3J8cglzmxTKlM5Rf5YKdOzMyf4t0sm2pPsFX2l
 V4xjZQNMDJ1IHGnRbeMTOqHN6iKymyj8BKph5kATO5W9gq4tWXRSEIPfuGJMq2jq
 T64RweOdHlBPhiXu4hMmRXgT2rNBfTuaqEsVgXAZWkPmqum9uDPjBBiJ89bQO6pk
 dJl7jVJ27HKSd4zLwnxSGCsVahirF4CCtULRam08500Gfz6dEarD7shZznd86cEg
 QiBXqK5W6IWyJw==
 =ND+J
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.13-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-v updates from Palmer Dabbelt:

 - Support for pointer masking in userspace

 - Support for probing vector misaligned access performance

 - Support for qspinlock on systems with Zacas and Zabha

* tag 'riscv-for-linus-6.13-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (38 commits)
  RISC-V: Remove unnecessary include from compat.h
  riscv: Fix default misaligned access trap
  riscv: Add qspinlock support
  dt-bindings: riscv: Add Ziccrse ISA extension description
  riscv: Add ISA extension parsing for Ziccrse
  asm-generic: ticket-lock: Add separate ticket-lock.h
  asm-generic: ticket-lock: Reuse arch_spinlock_t of qspinlock
  riscv: Implement xchg8/16() using Zabha
  riscv: Implement arch_cmpxchg128() using Zacas
  riscv: Improve zacas fully-ordered cmpxchg()
  riscv: Implement cmpxchg8/16() using Zabha
  dt-bindings: riscv: Add Zabha ISA extension description
  riscv: Implement cmpxchg32/64() using Zacas
  riscv: Do not fail to build on byte/halfword operations with Zawrs
  riscv: Move cpufeature.h macros into their own header
  KVM: riscv: selftests: Add Smnpm and Ssnpm to get-reg-list test
  RISC-V: KVM: Allow Smnpm and Ssnpm extensions for guests
  riscv: hwprobe: Export the Supm ISA extension
  riscv: selftests: Add a pointer masking test
  riscv: Allow ptrace control of the tagged address ABI
  ...
2024-11-27 11:19:09 -08:00
Paolo Bonzini
4d911c7abe KVM/riscv changes for 6.13 part #2
- Svade and Svadu extension support for Host and Guest/VM
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmc/KesACgkQrUjsVaLH
 LAcuuQ/9Gv8qezVw5TiV3BiusRns50PVIVZA12lrLLXrjiUzuo0zbIRTozeZbTzb
 0HMuS8isfgNkRmj35ZXQ1nzeckf3GN3j0f/TYeAVDUCj5sARimcRzSL+k9KC62aW
 NxvmHsq+y4jlCr7V4viy9pkHrj2oNO4m6HhiAtgXfATXWYRtKOTkEKVYIPg3c77D
 U7FeUEA1ege6xKK5U+v2gpIZHOXv13RaXQqmZ3j+JfDyzIhakYSKE/mRslX3z4ch
 9SKSUt2PqustyT5qdmM1jt9+q6k1QQw2fWWnxJb6AGsmDfwkEEvPKuok74TUDVPa
 V4mnesTIf8wPjzw4n5XSqf5i+67WuNjBBtxMWhWGmcfpLbC9y0gE+lTaRZrXU7fL
 66VXVDK7YhTsWCJ9sJfRXnGAOtHcIp20lQwemdRYCSMRbD8I6OWwfBbXfo3pUo85
 8mCZtstySXAyr2pnfX44kuav70D46qWyiTm0o4WSVeCExtny0qslJLmKoYgjsB+T
 7eJCrpxcmsTXBMYST6AXiAg5GUVj4DoCzMDT2DcsGoa2DM1RLc8qlwSrsD4jdSdF
 Jc8HSOc54mSUvaE31LWwd1YJiQ20Nh8kfv5nmO7m9LB6ADiZOL1yL7tTsSxrMWSg
 l0lgJbjTRYTPVghXR/kkAD7jXEthWAFJ5knv5CT2bb6Qx7uWPpk=
 =juLq
 -----END PGP SIGNATURE-----

Merge tag 'kvm-riscv-6.13-2' of https://github.com/kvm-riscv/linux into HEAD

KVM/riscv changes for 6.13 part #2

- Svade and Svadu extension support for Host and Guest/VM
2024-11-27 12:00:28 -05:00
Paolo Bonzini
c1668520c9 RISC-V Paches for the 6.13 Merge Window, Part 1
* Support for pointer masking in userspace,
 * Support for probing vector misaligned access performance.
 * Support for qspinlock on systems with Zacas and Zabha.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmdHNu4THHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiZW7D/oCjSIdBHZ6OJN8vATRn2FoHedMgKzE
 8OF0EXX85+PNmznxzyUirPerfQPcog5422vCKLUR5h8QD0x3wdMH8gUaV0Wa11k8
 ldXlV903k7gJLtJMnww2Eiha7kds5XpNWsWBTU0sBAxt2mMUE2VlloBY5YM/fitJ
 3TUihA7vyic5J0H3H4VrkuEoFnN4Xl9WclbwCYFg0uKmiogqXCe5LKey5/JjLpDR
 2DdFe/7PRjQMuUNVrNO4Vm+/YD1nwRdg5ukvIl42KINHWKyn1hl23cKsFobrilw5
 GyMbTzP4hBhy3kpX+zjWPpvTyoHSww7iJK6AvkvgQk/gua8M6abLJheachY/Ciz1
 lJy4okB8H2LtZwMYlJiIXBQzKE1qCwNA1/m24y8SUYQXvjxwGZxaPXAyWvvqBxOP
 /q/jQYfCiQi/h7BncMv9F8cxkU3J8cglzmxTKlM5Rf5YKdOzMyf4t0sm2pPsFX2l
 V4xjZQNMDJ1IHGnRbeMTOqHN6iKymyj8BKph5kATO5W9gq4tWXRSEIPfuGJMq2jq
 T64RweOdHlBPhiXu4hMmRXgT2rNBfTuaqEsVgXAZWkPmqum9uDPjBBiJ89bQO6pk
 dJl7jVJ27HKSd4zLwnxSGCsVahirF4CCtULRam08500Gfz6dEarD7shZznd86cEg
 QiBXqK5W6IWyJw==
 =ND+J
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.13-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux into HEAD

RISC-V Paches for the 6.13 Merge Window, Part 1

* Support for pointer masking in userspace,
* Support for probing vector misaligned access performance.
* Support for qspinlock on systems with Zacas and Zabha.
2024-11-27 11:49:44 -05:00
Linus Torvalds
9f16d5e6f2 The biggest change here is eliminating the awful idea that KVM had, of
essentially guessing which pfns are refcounted pages.  The reason to
 do so was that KVM needs to map both non-refcounted pages (for example
 BARs of VFIO devices) and VM_PFNMAP/VM_MIXMEDMAP VMAs that contain
 refcounted pages.  However, the result was security issues in the past,
 and more recently the inability to map VM_IO and VM_PFNMAP memory
 that _is_ backed by struct page but is not refcounted.  In particular
 this broke virtio-gpu blob resources (which directly map host graphics
 buffers into the guest as "vram" for the virtio-gpu device) with the
 amdgpu driver, because amdgpu allocates non-compound higher order pages
 and the tail pages could not be mapped into KVM.
 
 This requires adjusting all uses of struct page in the per-architecture
 code, to always work on the pfn whenever possible.  The large series that
 did this, from David Stevens and Sean Christopherson, also cleaned up
 substantially the set of functions that provided arch code with the
 pfn for a host virtual addresses.  The previous maze of twisty little
 passages, all different, is replaced by five functions (__gfn_to_page,
 __kvm_faultin_pfn, the non-__ versions of these two, and kvm_prefetch_pages)
 saving almost 200 lines of code.
 
 ARM:
 
 * Support for stage-1 permission indirection (FEAT_S1PIE) and
   permission overlays (FEAT_S1POE), including nested virt + the
   emulated page table walker
 
 * Introduce PSCI SYSTEM_OFF2 support to KVM + client driver. This call
   was introduced in PSCIv1.3 as a mechanism to request hibernation,
   similar to the S4 state in ACPI
 
 * Explicitly trap + hide FEAT_MPAM (QoS controls) from KVM guests. As
   part of it, introduce trivial initialization of the host's MPAM
   context so KVM can use the corresponding traps
 
 * PMU support under nested virtualization, honoring the guest
   hypervisor's trap configuration and event filtering when running a
   nested guest
 
 * Fixes to vgic ITS serialization where stale device/interrupt table
   entries are not zeroed when the mapping is invalidated by the VM
 
 * Avoid emulated MMIO completion if userspace has requested synchronous
   external abort injection
 
 * Various fixes and cleanups affecting pKVM, vCPU initialization, and
   selftests
 
 LoongArch:
 
 * Add iocsr and mmio bus simulation in kernel.
 
 * Add in-kernel interrupt controller emulation.
 
 * Add support for virtualization extensions to the eiointc irqchip.
 
 PPC:
 
 * Drop lingering and utterly obsolete references to PPC970 KVM, which was
   removed 10 years ago.
 
 * Fix incorrect documentation references to non-existing ioctls
 
 RISC-V:
 
 * Accelerate KVM RISC-V when running as a guest
 
 * Perf support to collect KVM guest statistics from host side
 
 s390:
 
 * New selftests: more ucontrol selftests and CPU model sanity checks
 
 * Support for the gen17 CPU model
 
 * List registers supported by KVM_GET/SET_ONE_REG in the documentation
 
 x86:
 
 * Cleanup KVM's handling of Accessed and Dirty bits to dedup code, improve
   documentation, harden against unexpected changes.  Even if the hardware
   A/D tracking is disabled, it is possible to use the hardware-defined A/D
   bits to track if a PFN is Accessed and/or Dirty, and that removes a lot
   of special cases.
 
 * Elide TLB flushes when aging secondary PTEs, as has been done in x86's
   primary MMU for over 10 years.
 
 * Recover huge pages in-place in the TDP MMU when dirty page logging is
   toggled off, instead of zapping them and waiting until the page is
   re-accessed to create a huge mapping.  This reduces vCPU jitter.
 
 * Batch TLB flushes when dirty page logging is toggled off.  This reduces
   the time it takes to disable dirty logging by ~3x.
 
 * Remove the shrinker that was (poorly) attempting to reclaim shadow page
   tables in low-memory situations.
 
 * Clean up and optimize KVM's handling of writes to MSR_IA32_APICBASE.
 
 * Advertise CPUIDs for new instructions in Clearwater Forest
 
 * Quirk KVM's misguided behavior of initialized certain feature MSRs to
   their maximum supported feature set, which can result in KVM creating
   invalid vCPU state.  E.g. initializing PERF_CAPABILITIES to a non-zero
   value results in the vCPU having invalid state if userspace hides PDCM
   from the guest, which in turn can lead to save/restore failures.
 
 * Fix KVM's handling of non-canonical checks for vCPUs that support LA57
   to better follow the "architecture", in quotes because the actual
   behavior is poorly documented.  E.g. most MSR writes and descriptor
   table loads ignore CR4.LA57 and operate purely on whether the CPU
   supports LA57.
 
 * Bypass the register cache when querying CPL from kvm_sched_out(), as
   filling the cache from IRQ context is generally unsafe; harden the
   cache accessors to try to prevent similar issues from occuring in the
   future.  The issue that triggered this change was already fixed in 6.12,
   but was still kinda latent.
 
 * Advertise AMD_IBPB_RET to userspace, and fix a related bug where KVM
   over-advertises SPEC_CTRL when trying to support cross-vendor VMs.
 
 * Minor cleanups
 
 * Switch hugepage recovery thread to use vhost_task.  These kthreads can
   consume significant amounts of CPU time on behalf of a VM or in response
   to how the VM behaves (for example how it accesses its memory); therefore
   KVM tried to place the thread in the VM's cgroups and charge the CPU
   time consumed by that work to the VM's container.  However the kthreads
   did not process SIGSTOP/SIGCONT, and therefore cgroups which had KVM
   instances inside could not complete freezing.  Fix this by replacing the
   kthread with a PF_USER_WORKER thread, via the vhost_task abstraction.
   Another 100+ lines removed, with generally better behavior too like
   having these threads properly parented in the process tree.
 
 * Revert a workaround for an old CPU erratum (Nehalem/Westmere) that didn't
   really work; there was really nothing to work around anyway: the broken
   patch was meant to fix nested virtualization, but the PERF_GLOBAL_CTRL
   MSR is virtualized and therefore unaffected by the erratum.
 
 * Fix 6.12 regression where CONFIG_KVM will be built as a module even
   if asked to be builtin, as long as neither KVM_INTEL nor KVM_AMD is 'y'.
 
 x86 selftests:
 
 * x86 selftests can now use AVX.
 
 Documentation:
 
 * Use rST internal links
 
 * Reorganize the introduction to the API document
 
 Generic:
 
 * Protect vcpu->pid accesses outside of vcpu->mutex with a rwlock instead
   of RCU, so that running a vCPU on a different task doesn't encounter long
   due to having to wait for all CPUs become quiescent.  In general both reads
   and writes are rare, but userspace that supports confidential computing is
   introducing the use of "helper" vCPUs that may jump from one host processor
   to another.  Those will be very happy to trigger a synchronize_rcu(), and
   the effect on performance is quite the disaster.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmc9MRYUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroP00QgArxqxBIGLCW5t7bw7vtNq63QYRyh4
 dTiDguLiYQJ+AXmnRu11R6aPC7HgMAvlFCCmH+GEce4WEgt26hxCmncJr/aJOSwS
 letCS7TrME16PeZvh25A1nhPBUw6mTF1qqzgcdHMrqXG8LuHoGcKYGSRVbkf3kfI
 1ZoMq1r8ChXbVVmCx9DQ3gw1TVr5Dpjs2voLh8rDSE9Xpw0tVVabHu3/NhQEz/F+
 t8/nRaqH777icCHIf9PCk5HnarHxLAOvhM2M0Yj09PuBcE5fFQxpxltw/qiKQqqW
 ep4oquojGl87kZnhlDaac2UNtK90Ws+WxxvCwUmbvGN0ZJVaQwf4FvTwig==
 =lWpE
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "The biggest change here is eliminating the awful idea that KVM had of
  essentially guessing which pfns are refcounted pages.

  The reason to do so was that KVM needs to map both non-refcounted
  pages (for example BARs of VFIO devices) and VM_PFNMAP/VM_MIXMEDMAP
  VMAs that contain refcounted pages.

  However, the result was security issues in the past, and more recently
  the inability to map VM_IO and VM_PFNMAP memory that _is_ backed by
  struct page but is not refcounted. In particular this broke virtio-gpu
  blob resources (which directly map host graphics buffers into the
  guest as "vram" for the virtio-gpu device) with the amdgpu driver,
  because amdgpu allocates non-compound higher order pages and the tail
  pages could not be mapped into KVM.

  This requires adjusting all uses of struct page in the
  per-architecture code, to always work on the pfn whenever possible.
  The large series that did this, from David Stevens and Sean
  Christopherson, also cleaned up substantially the set of functions
  that provided arch code with the pfn for a host virtual addresses.

  The previous maze of twisty little passages, all different, is
  replaced by five functions (__gfn_to_page, __kvm_faultin_pfn, the
  non-__ versions of these two, and kvm_prefetch_pages) saving almost
  200 lines of code.

  ARM:

   - Support for stage-1 permission indirection (FEAT_S1PIE) and
     permission overlays (FEAT_S1POE), including nested virt + the
     emulated page table walker

   - Introduce PSCI SYSTEM_OFF2 support to KVM + client driver. This
     call was introduced in PSCIv1.3 as a mechanism to request
     hibernation, similar to the S4 state in ACPI

   - Explicitly trap + hide FEAT_MPAM (QoS controls) from KVM guests. As
     part of it, introduce trivial initialization of the host's MPAM
     context so KVM can use the corresponding traps

   - PMU support under nested virtualization, honoring the guest
     hypervisor's trap configuration and event filtering when running a
     nested guest

   - Fixes to vgic ITS serialization where stale device/interrupt table
     entries are not zeroed when the mapping is invalidated by the VM

   - Avoid emulated MMIO completion if userspace has requested
     synchronous external abort injection

   - Various fixes and cleanups affecting pKVM, vCPU initialization, and
     selftests

  LoongArch:

   - Add iocsr and mmio bus simulation in kernel.

   - Add in-kernel interrupt controller emulation.

   - Add support for virtualization extensions to the eiointc irqchip.

  PPC:

   - Drop lingering and utterly obsolete references to PPC970 KVM, which
     was removed 10 years ago.

   - Fix incorrect documentation references to non-existing ioctls

  RISC-V:

   - Accelerate KVM RISC-V when running as a guest

   - Perf support to collect KVM guest statistics from host side

  s390:

   - New selftests: more ucontrol selftests and CPU model sanity checks

   - Support for the gen17 CPU model

   - List registers supported by KVM_GET/SET_ONE_REG in the
     documentation

  x86:

   - Cleanup KVM's handling of Accessed and Dirty bits to dedup code,
     improve documentation, harden against unexpected changes.

     Even if the hardware A/D tracking is disabled, it is possible to
     use the hardware-defined A/D bits to track if a PFN is Accessed
     and/or Dirty, and that removes a lot of special cases.

   - Elide TLB flushes when aging secondary PTEs, as has been done in
     x86's primary MMU for over 10 years.

   - Recover huge pages in-place in the TDP MMU when dirty page logging
     is toggled off, instead of zapping them and waiting until the page
     is re-accessed to create a huge mapping. This reduces vCPU jitter.

   - Batch TLB flushes when dirty page logging is toggled off. This
     reduces the time it takes to disable dirty logging by ~3x.

   - Remove the shrinker that was (poorly) attempting to reclaim shadow
     page tables in low-memory situations.

   - Clean up and optimize KVM's handling of writes to
     MSR_IA32_APICBASE.

   - Advertise CPUIDs for new instructions in Clearwater Forest

   - Quirk KVM's misguided behavior of initialized certain feature MSRs
     to their maximum supported feature set, which can result in KVM
     creating invalid vCPU state. E.g. initializing PERF_CAPABILITIES to
     a non-zero value results in the vCPU having invalid state if
     userspace hides PDCM from the guest, which in turn can lead to
     save/restore failures.

   - Fix KVM's handling of non-canonical checks for vCPUs that support
     LA57 to better follow the "architecture", in quotes because the
     actual behavior is poorly documented. E.g. most MSR writes and
     descriptor table loads ignore CR4.LA57 and operate purely on
     whether the CPU supports LA57.

   - Bypass the register cache when querying CPL from kvm_sched_out(),
     as filling the cache from IRQ context is generally unsafe; harden
     the cache accessors to try to prevent similar issues from occuring
     in the future. The issue that triggered this change was already
     fixed in 6.12, but was still kinda latent.

   - Advertise AMD_IBPB_RET to userspace, and fix a related bug where
     KVM over-advertises SPEC_CTRL when trying to support cross-vendor
     VMs.

   - Minor cleanups

   - Switch hugepage recovery thread to use vhost_task.

     These kthreads can consume significant amounts of CPU time on
     behalf of a VM or in response to how the VM behaves (for example
     how it accesses its memory); therefore KVM tried to place the
     thread in the VM's cgroups and charge the CPU time consumed by that
     work to the VM's container.

     However the kthreads did not process SIGSTOP/SIGCONT, and therefore
     cgroups which had KVM instances inside could not complete freezing.

     Fix this by replacing the kthread with a PF_USER_WORKER thread, via
     the vhost_task abstraction. Another 100+ lines removed, with
     generally better behavior too like having these threads properly
     parented in the process tree.

   - Revert a workaround for an old CPU erratum (Nehalem/Westmere) that
     didn't really work; there was really nothing to work around anyway:
     the broken patch was meant to fix nested virtualization, but the
     PERF_GLOBAL_CTRL MSR is virtualized and therefore unaffected by the
     erratum.

   - Fix 6.12 regression where CONFIG_KVM will be built as a module even
     if asked to be builtin, as long as neither KVM_INTEL nor KVM_AMD is
     'y'.

  x86 selftests:

   - x86 selftests can now use AVX.

  Documentation:

   - Use rST internal links

   - Reorganize the introduction to the API document

  Generic:

   - Protect vcpu->pid accesses outside of vcpu->mutex with a rwlock
     instead of RCU, so that running a vCPU on a different task doesn't
     encounter long due to having to wait for all CPUs become quiescent.

     In general both reads and writes are rare, but userspace that
     supports confidential computing is introducing the use of "helper"
     vCPUs that may jump from one host processor to another. Those will
     be very happy to trigger a synchronize_rcu(), and the effect on
     performance is quite the disaster"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (298 commits)
  KVM: x86: Break CONFIG_KVM_X86's direct dependency on KVM_INTEL || KVM_AMD
  KVM: x86: add back X86_LOCAL_APIC dependency
  Revert "KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config()"
  KVM: x86: switch hugepage recovery thread to vhost_task
  KVM: x86: expose MSR_PLATFORM_INFO as a feature MSR
  x86: KVM: Advertise CPUIDs for new instructions in Clearwater Forest
  Documentation: KVM: fix malformed table
  irqchip/loongson-eiointc: Add virt extension support
  LoongArch: KVM: Add irqfd support
  LoongArch: KVM: Add PCHPIC user mode read and write functions
  LoongArch: KVM: Add PCHPIC read and write functions
  LoongArch: KVM: Add PCHPIC device support
  LoongArch: KVM: Add EIOINTC user mode read and write functions
  LoongArch: KVM: Add EIOINTC read and write functions
  LoongArch: KVM: Add EIOINTC device support
  LoongArch: KVM: Add IPI user mode read and write function
  LoongArch: KVM: Add IPI read and write function
  LoongArch: KVM: Add IPI device support
  LoongArch: KVM: Add iocsr and mmio bus simulation in kernel
  KVM: arm64: Pass on SVE mapping failures
  ...
2024-11-23 16:00:50 -08:00
Linus Torvalds
5c00ff742b - The series "zram: optimal post-processing target selection" from
Sergey Senozhatsky improves zram's post-processing selection algorithm.
   This leads to improved memory savings.
 
 - Wei Yang has gone to town on the mapletree code, contributing several
   series which clean up the implementation:
 
 	- "refine mas_mab_cp()"
 	- "Reduce the space to be cleared for maple_big_node"
 	- "maple_tree: simplify mas_push_node()"
 	- "Following cleanup after introduce mas_wr_store_type()"
 	- "refine storing null"
 
 - The series "selftests/mm: hugetlb_fault_after_madv improvements" from
   David Hildenbrand fixes this selftest for s390.
 
 - The series "introduce pte_offset_map_{ro|rw}_nolock()" from Qi Zheng
   implements some rationaizations and cleanups in the page mapping code.
 
 - The series "mm: optimize shadow entries removal" from Shakeel Butt
   optimizes the file truncation code by speeding up the handling of shadow
   entries.
 
 - The series "Remove PageKsm()" from Matthew Wilcox completes the
   migration of this flag over to being a folio-based flag.
 
 - The series "Unify hugetlb into arch_get_unmapped_area functions" from
   Oscar Salvador implements a bunch of consolidations and cleanups in the
   hugetlb code.
 
 - The series "Do not shatter hugezeropage on wp-fault" from Dev Jain
   takes away the wp-fault time practice of turning a huge zero page into
   small pages.  Instead we replace the whole thing with a THP.  More
   consistent cleaner and potentiall saves a large number of pagefaults.
 
 - The series "percpu: Add a test case and fix for clang" from Andy
   Shevchenko enhances and fixes the kernel's built in percpu test code.
 
 - The series "mm/mremap: Remove extra vma tree walk" from Liam Howlett
   optimizes mremap() by avoiding doing things which we didn't need to do.
 
 - The series "Improve the tmpfs large folio read performance" from
   Baolin Wang teaches tmpfs to copy data into userspace at the folio size
   rather than as individual pages.  A 20% speedup was observed.
 
 - The series "mm/damon/vaddr: Fix issue in
   damon_va_evenly_split_region()" fro Zheng Yejian fixes DAMON splitting.
 
 - The series "memcg-v1: fully deprecate charge moving" from Shakeel Butt
   removes the long-deprecated memcgv2 charge moving feature.
 
 - The series "fix error handling in mmap_region() and refactor" from
   Lorenzo Stoakes cleanup up some of the mmap() error handling and
   addresses some potential performance issues.
 
 - The series "x86/module: use large ROX pages for text allocations" from
   Mike Rapoport teaches x86 to use large pages for read-only-execute
   module text.
 
 - The series "page allocation tag compression" from Suren Baghdasaryan
   is followon maintenance work for the new page allocation profiling
   feature.
 
 - The series "page->index removals in mm" from Matthew Wilcox remove
   most references to page->index in mm/.  A slow march towards shrinking
   struct page.
 
 - The series "damon/{self,kunit}tests: minor fixups for DAMON debugfs
   interface tests" from Andrew Paniakin performs maintenance work for
   DAMON's self testing code.
 
 - The series "mm: zswap swap-out of large folios" from Kanchana Sridhar
   improves zswap's batching of compression and decompression.  It is a
   step along the way towards using Intel IAA hardware acceleration for
   this zswap operation.
 
 - The series "kasan: migrate the last module test to kunit" from
   Sabyrzhan Tasbolatov completes the migration of the KASAN built-in tests
   over to the KUnit framework.
 
 - The series "implement lightweight guard pages" from Lorenzo Stoakes
   permits userapace to place fault-generating guard pages within a single
   VMA, rather than requiring that multiple VMAs be created for this.
   Improved efficiencies for userspace memory allocators are expected.
 
 - The series "memcg: tracepoint for flushing stats" from JP Kobryn uses
   tracepoints to provide increased visibility into memcg stats flushing
   activity.
 
 - The series "zram: IDLE flag handling fixes" from Sergey Senozhatsky
   fixes a zram buglet which potentially affected performance.
 
 - The series "mm: add more kernel parameters to control mTHP" from
   Maíra Canal enhances our ability to control/configuremultisize THP from
   the kernel boot command line.
 
 - The series "kasan: few improvements on kunit tests" from Sabyrzhan
   Tasbolatov has a couple of fixups for the KASAN KUnit tests.
 
 - The series "mm/list_lru: Split list_lru lock into per-cgroup scope"
   from Kairui Song optimizes list_lru memory utilization when lockdep is
   enabled.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZzwFqgAKCRDdBJ7gKXxA
 jkeuAQCkl+BmeYHE6uG0hi3pRxkupseR6DEOAYIiTv0/l8/GggD/Z3jmEeqnZaNq
 xyyenpibWgUoShU2wZ/Ha8FE5WDINwg=
 =JfWR
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2024-11-18-19-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - The series "zram: optimal post-processing target selection" from
   Sergey Senozhatsky improves zram's post-processing selection
   algorithm. This leads to improved memory savings.

 - Wei Yang has gone to town on the mapletree code, contributing several
   series which clean up the implementation:
	- "refine mas_mab_cp()"
	- "Reduce the space to be cleared for maple_big_node"
	- "maple_tree: simplify mas_push_node()"
	- "Following cleanup after introduce mas_wr_store_type()"
	- "refine storing null"

 - The series "selftests/mm: hugetlb_fault_after_madv improvements" from
   David Hildenbrand fixes this selftest for s390.

 - The series "introduce pte_offset_map_{ro|rw}_nolock()" from Qi Zheng
   implements some rationaizations and cleanups in the page mapping
   code.

 - The series "mm: optimize shadow entries removal" from Shakeel Butt
   optimizes the file truncation code by speeding up the handling of
   shadow entries.

 - The series "Remove PageKsm()" from Matthew Wilcox completes the
   migration of this flag over to being a folio-based flag.

 - The series "Unify hugetlb into arch_get_unmapped_area functions" from
   Oscar Salvador implements a bunch of consolidations and cleanups in
   the hugetlb code.

 - The series "Do not shatter hugezeropage on wp-fault" from Dev Jain
   takes away the wp-fault time practice of turning a huge zero page
   into small pages. Instead we replace the whole thing with a THP. More
   consistent cleaner and potentiall saves a large number of pagefaults.

 - The series "percpu: Add a test case and fix for clang" from Andy
   Shevchenko enhances and fixes the kernel's built in percpu test code.

 - The series "mm/mremap: Remove extra vma tree walk" from Liam Howlett
   optimizes mremap() by avoiding doing things which we didn't need to
   do.

 - The series "Improve the tmpfs large folio read performance" from
   Baolin Wang teaches tmpfs to copy data into userspace at the folio
   size rather than as individual pages. A 20% speedup was observed.

 - The series "mm/damon/vaddr: Fix issue in
   damon_va_evenly_split_region()" fro Zheng Yejian fixes DAMON
   splitting.

 - The series "memcg-v1: fully deprecate charge moving" from Shakeel
   Butt removes the long-deprecated memcgv2 charge moving feature.

 - The series "fix error handling in mmap_region() and refactor" from
   Lorenzo Stoakes cleanup up some of the mmap() error handling and
   addresses some potential performance issues.

 - The series "x86/module: use large ROX pages for text allocations"
   from Mike Rapoport teaches x86 to use large pages for
   read-only-execute module text.

 - The series "page allocation tag compression" from Suren Baghdasaryan
   is followon maintenance work for the new page allocation profiling
   feature.

 - The series "page->index removals in mm" from Matthew Wilcox remove
   most references to page->index in mm/. A slow march towards shrinking
   struct page.

 - The series "damon/{self,kunit}tests: minor fixups for DAMON debugfs
   interface tests" from Andrew Paniakin performs maintenance work for
   DAMON's self testing code.

 - The series "mm: zswap swap-out of large folios" from Kanchana Sridhar
   improves zswap's batching of compression and decompression. It is a
   step along the way towards using Intel IAA hardware acceleration for
   this zswap operation.

 - The series "kasan: migrate the last module test to kunit" from
   Sabyrzhan Tasbolatov completes the migration of the KASAN built-in
   tests over to the KUnit framework.

 - The series "implement lightweight guard pages" from Lorenzo Stoakes
   permits userapace to place fault-generating guard pages within a
   single VMA, rather than requiring that multiple VMAs be created for
   this. Improved efficiencies for userspace memory allocators are
   expected.

 - The series "memcg: tracepoint for flushing stats" from JP Kobryn uses
   tracepoints to provide increased visibility into memcg stats flushing
   activity.

 - The series "zram: IDLE flag handling fixes" from Sergey Senozhatsky
   fixes a zram buglet which potentially affected performance.

 - The series "mm: add more kernel parameters to control mTHP" from
   Maíra Canal enhances our ability to control/configuremultisize THP
   from the kernel boot command line.

 - The series "kasan: few improvements on kunit tests" from Sabyrzhan
   Tasbolatov has a couple of fixups for the KASAN KUnit tests.

 - The series "mm/list_lru: Split list_lru lock into per-cgroup scope"
   from Kairui Song optimizes list_lru memory utilization when lockdep
   is enabled.

* tag 'mm-stable-2024-11-18-19-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (215 commits)
  cma: enforce non-zero pageblock_order during cma_init_reserved_mem()
  mm/kfence: add a new kunit test test_use_after_free_read_nofault()
  zram: fix NULL pointer in comp_algorithm_show()
  memcg/hugetlb: add hugeTLB counters to memcg
  vmstat: call fold_vm_zone_numa_events() before show per zone NUMA event
  mm: mmap_lock: check trace_mmap_lock_$type_enabled() instead of regcount
  zram: ZRAM_DEF_COMP should depend on ZRAM
  MAINTAINERS/MEMORY MANAGEMENT: add document files for mm
  Docs/mm/damon: recommend academic papers to read and/or cite
  mm: define general function pXd_init()
  kmemleak: iommu/iova: fix transient kmemleak false positive
  mm/list_lru: simplify the list_lru walk callback function
  mm/list_lru: split the lock to per-cgroup scope
  mm/list_lru: simplify reparenting and initial allocation
  mm/list_lru: code clean up for reparenting
  mm/list_lru: don't export list_lru_add
  mm/list_lru: don't pass unnecessary key parameters
  kasan: add kunit tests for kmalloc_track_caller, kmalloc_node_track_caller
  kasan: change kasan_atomics kunit test as KUNIT_CASE_SLOW
  kasan: use EXPORT_SYMBOL_IF_KUNIT to export symbols
  ...
2024-11-23 09:58:07 -08:00
Yong-Xuan Wang
94a7734d09 RISC-V: Add Svade and Svadu Extensions Support
Svade and Svadu extensions represent two schemes for managing the PTE A/D
bits. When the PTE A/D bits need to be set, Svade extension intdicates
that a related page fault will be raised. In contrast, the Svadu extension
supports hardware updating of PTE A/D bits. Since the Svade extension is
mandatory and the Svadu extension is optional in RVA23 profile, by default
the M-mode firmware will enable the Svadu extension in the menvcfg CSR
when only Svadu is present in DT.

This patch detects Svade and Svadu extensions from DT and adds
arch_has_hw_pte_young() to enable optimization in MGLRU and
__wp_page_copy_user() when we have the PTE A/D bits hardware updating
support.

Co-developed-by: Jinyu Tang <tjytimi@163.com>
Signed-off-by: Jinyu Tang <tjytimi@163.com>
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20240726084931.28924-2-yongxuan.wang@sifive.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-11-21 17:40:06 +05:30
Linus Torvalds
e6de688e93 Devicetree updates for v6.13:
Bindings:
 
 - Enable dtc "interrupt_provider" warnings for binding examples.
   Fix the warnings in fsl,mu-msi and ti,sci-inta due to this.
 
 - Convert zii,rave-sp-wdt, zii,rave-sp-pwrbutton,  and
   altr,fpga-passive-serial to DT schema format
 
 - Add some documentation on the different forms of YAML text blocks
   which are a constant source of review comments
 
 - Fix some schema errors in constraints for arrays
 
 - Add compatibles for qcom,sar2130p-pdc and onnn,adt7462
 
 DT core:
 
 - Allow overlay kunit tests to run CONFIG_OF_OVERLAY=n
 
 - Add some warnings on deprecated address handling
 
 - Rework early_init_dt_scan() so the arch can pass in the phys address
   of the DTB as __pa() is not always valid to use. This fixes a warning
   for arm64 with kexec.
 
 - Add and use some new DT graph iterators for iterating over ports and
   endpoints
 
 - Rework reserved-memory handling to be sized dynamically for fixed
   regions
 
 - Optimize of_modalias() to avoid a strlen() call
 
 - Constify struct device_node and property pointers where ever possible
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAmc7qaoACgkQ+vtdtY28
 YcN54g/+Ifz4hQTSWV+VBhihovMMPiQUdxZ+MfJfPnPcZ7NJzaTf+zqhZyS4wQou
 v0pdtyR0B1fCM/EvKaYD+1aTTAQFEIT5Dqac+9ePwqaYqSk+yCTxyzW9m+P3rTPV
 THo8SGRss7T+Rs+2WaUGxphTJItMGIRdbBvoqK+82EdKFXXKw2BSD8tlJTWwbTam
 9xkrpUzw7f4FvVY8vVhRyOd5i8/v+FH8D65DMIT6ME9zRn4MzKVzCg6udgYeCBld
 C2XbV+wnyewtjrN2IX+2uQ2mheb7yJu3AEI3iFR5x/sRrsSLpisxrUl38xOOpxrM
 XxYtHgE3omjagQ+y+L2PMthlKvhFrXVXIvhUH8xxje5z1Vyq3VMfiABkHlMpAnys
 5LY4xEhvqDkPNo65UmjMiHxGW/xtcKsmAZBOp+HLerZfCJIFvl380fi8mNg/Sjvz
 7ExCSpzCPsHASZg7QCTplU3BUtg+067Ch/k8Hsn/Og73Pqm3xH4IezQZKwweN9ZT
 LC6OQBI7C3Yt1hom9qgUcA4H4/aaPxTVV7i0DGuAKh8Lon6SaoX2yFpweUBgbsL/
 c9DIW4vbYBIGASxxUbHlNMKvPCKACKmpFXhsnH5Waj+VWSOwsJ8bjGpH8PfMKdFW
 dyJB/r94GqCGpCW7+FC1qGmXiQJGkCo89pKBVjSf4Kj45ht/76o=
 =NCYS
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull devicetree updates from Rob Herring:
 "Bindings:

   - Enable dtc "interrupt_provider" warnings for binding examples. Fix
     the warnings in fsl,mu-msi and ti,sci-inta due to this.

   - Convert zii,rave-sp-wdt, zii,rave-sp-pwrbutton, and
     altr,fpga-passive-serial to DT schema format

   - Add some documentation on the different forms of YAML text blocks
     which are a constant source of review comments

   - Fix some schema errors in constraints for arrays

   - Add compatibles for qcom,sar2130p-pdc and onnn,adt7462

  DT core:

   - Allow overlay kunit tests to run CONFIG_OF_OVERLAY=n

   - Add some warnings on deprecated address handling

   - Rework early_init_dt_scan() so the arch can pass in the phys
     address of the DTB as __pa() is not always valid to use. This fixes
     a warning for arm64 with kexec.

   - Add and use some new DT graph iterators for iterating over ports
     and endpoints

   - Rework reserved-memory handling to be sized dynamically for fixed
     regions

   - Optimize of_modalias() to avoid a strlen() call

   - Constify struct device_node and property pointers where ever
     possible"

* tag 'devicetree-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (36 commits)
  of: Allow overlay kunit tests to run CONFIG_OF_OVERLAY=n
  dt-bindings: interrupt-controller: qcom,pdc: Add SAR2130P compatible
  of/address: Rework bus matching to avoid warnings
  of: WARN on deprecated #address-cells/#size-cells handling
  of/fdt: Don't use default address cell sizes for address translation
  dt-bindings: Enable dtc "interrupt_provider" warnings
  of/fdt: add dt_phys arg to early_init_dt_scan and early_init_dt_verify
  dt-bindings: cache: qcom,llcc: Fix X1E80100 reg entries
  dt-bindings: watchdog: convert zii,rave-sp-wdt.txt to yaml format
  dt-bindings: input: convert zii,rave-sp-pwrbutton.txt to yaml
  media: xilinx-tpg: use new of_graph functions
  fbdev: omapfb: use new of_graph functions
  gpu: drm: omapdrm: use new of_graph functions
  ASoC: audio-graph-card2: use new of_graph functions
  ASoC: audio-graph-card: use new of_graph functions
  ASoC: test-component: use new of_graph functions
  of: property: use new of_graph functions
  of: property: add of_graph_get_next_port_endpoint()
  of: property: add of_graph_get_next_port()
  of: module: remove strlen() call in of_modalias()
  ...
2024-11-20 13:19:25 -08:00
Linus Torvalds
aad3a0d084 ftrace updates for v6.13:
- Merged tag ftrace-v6.12-rc4
 
   There was a fix to locking in register_ftrace_graph() for shadow stacks
   that was sent upstream. But this code was also being rewritten, and the
   locking fix was needed. Merging this fix was required to continue the
   work.
 
 - Restructure the function graph shadow stack to prepare it for use with
   kretprobes
 
   With the goal of merging the shadow stack logic of function graph and
   kretprobes, some more restructuring of the function shadow stack is
   required.
 
   Move out function graph specific fields from the fgraph infrastructure and
   store it on the new stack variables that can pass data from the entry
   callback to the exit callback.
 
   Hopefully, with this change, the merge of kretprobes to use fgraph shadow
   stacks will be ready by the next merge window.
 
 - Make shadow stack 4k instead of using PAGE_SIZE.
 
   Some architectures have very large PAGE_SIZE values which make its use for
   shadow stacks waste a lot of memory.
 
 - Give shadow stacks its own kmem cache.
 
   When function graph is started, every task on the system gets a shadow
   stack. In the future, shadow stacks may not be 4K in size. Have it have
   its own kmem cache so that whatever size it becomes will still be
   efficient in allocations.
 
 - Initialize profiler graph ops as it will be needed for new updates to fgraph
 
 - Convert to use guard(mutex) for several ftrace and fgraph functions
 
 - Add more comments and documentation
 
 - Show function return address in function graph tracer
 
   Add an option to show the caller of a function at each entry of the
   function graph tracer, similar to what the function tracer does.
 
 - Abstract out ftrace_regs from being used directly like pt_regs
 
   ftrace_regs was created to store a partial pt_regs. It holds only the
   registers and stack information to get to the function arguments and
   return values. On several archs, it is simply a wrapper around pt_regs.
   But some users would access ftrace_regs directly to get the pt_regs which
   will not work on all archs. Make ftrace_regs an abstract structure that
   requires all access to its fields be through accessor functions.
 
 - Show how long it takes to do function code modifications
 
   When code modification for function hooks happen, it always had the time
   recorded in how long it took to do the conversion. But this value was
   never exported. Recently the code was touched due to new ROX modification
   handling that caused a large slow down in doing the modifications and
   had a significant impact on boot times.
 
   Expose the timings in the dyn_ftrace_total_info file. This file was
   created a while ago to show information about memory usage and such to
   implement dynamic function tracing. It's also an appropriate file to store
   the timings of this modification as well. This will make it easier to see
   the impact of changes to code modification on boot up timings.
 
 - Other clean ups and small fixes
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZztrUxQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qnnNAQD6w4q9VQ7oOE2qKLqtnj87h4c1GqKn
 SPkpEfC3n/ATEAD/fnYjT/eOSlHiGHuD/aTA+U/bETrT99bozGM/4mFKEgY=
 =6nCa
 -----END PGP SIGNATURE-----

Merge tag 'ftrace-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull ftrace updates from Steven Rostedt:

 - Restructure the function graph shadow stack to prepare it for use
   with kretprobes

   With the goal of merging the shadow stack logic of function graph and
   kretprobes, some more restructuring of the function shadow stack is
   required.

   Move out function graph specific fields from the fgraph
   infrastructure and store it on the new stack variables that can pass
   data from the entry callback to the exit callback.

   Hopefully, with this change, the merge of kretprobes to use fgraph
   shadow stacks will be ready by the next merge window.

 - Make shadow stack 4k instead of using PAGE_SIZE.

   Some architectures have very large PAGE_SIZE values which make its
   use for shadow stacks waste a lot of memory.

 - Give shadow stacks its own kmem cache.

   When function graph is started, every task on the system gets a
   shadow stack. In the future, shadow stacks may not be 4K in size.
   Have it have its own kmem cache so that whatever size it becomes will
   still be efficient in allocations.

 - Initialize profiler graph ops as it will be needed for new updates to
   fgraph

 - Convert to use guard(mutex) for several ftrace and fgraph functions

 - Add more comments and documentation

 - Show function return address in function graph tracer

   Add an option to show the caller of a function at each entry of the
   function graph tracer, similar to what the function tracer does.

 - Abstract out ftrace_regs from being used directly like pt_regs

   ftrace_regs was created to store a partial pt_regs. It holds only the
   registers and stack information to get to the function arguments and
   return values. On several archs, it is simply a wrapper around
   pt_regs. But some users would access ftrace_regs directly to get the
   pt_regs which will not work on all archs. Make ftrace_regs an
   abstract structure that requires all access to its fields be through
   accessor functions.

 - Show how long it takes to do function code modifications

   When code modification for function hooks happen, it always had the
   time recorded in how long it took to do the conversion. But this
   value was never exported. Recently the code was touched due to new
   ROX modification handling that caused a large slow down in doing the
   modifications and had a significant impact on boot times.

   Expose the timings in the dyn_ftrace_total_info file. This file was
   created a while ago to show information about memory usage and such
   to implement dynamic function tracing. It's also an appropriate file
   to store the timings of this modification as well. This will make it
   easier to see the impact of changes to code modification on boot up
   timings.

 - Other clean ups and small fixes

* tag 'ftrace-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (22 commits)
  ftrace: Show timings of how long nop patching took
  ftrace: Use guard to take ftrace_lock in ftrace_graph_set_hash()
  ftrace: Use guard to take the ftrace_lock in release_probe()
  ftrace: Use guard to lock ftrace_lock in cache_mod()
  ftrace: Use guard for match_records()
  fgraph: Use guard(mutex)(&ftrace_lock) for unregister_ftrace_graph()
  fgraph: Give ret_stack its own kmem cache
  fgraph: Separate size of ret_stack from PAGE_SIZE
  ftrace: Rename ftrace_regs_return_value to ftrace_regs_get_return_value
  selftests/ftrace: Fix check of return value in fgraph-retval.tc test
  ftrace: Use arch_ftrace_regs() for ftrace_regs_*() macros
  ftrace: Consolidate ftrace_regs accessor functions for archs using pt_regs
  ftrace: Make ftrace_regs abstract from direct use
  fgragh: No need to invoke the function call_filter_check_discard()
  fgraph: Simplify return address printing in function graph tracer
  function_graph: Remove unnecessary initialization in ftrace_graph_ret_addr()
  function_graph: Support recording and printing the function return address
  ftrace: Have calltime be saved in the fgraph storage
  ftrace: Use a running sleeptime instead of saving on shadow stack
  fgraph: Use fgraph data to store subtime for profiler
  ...
2024-11-20 11:34:10 -08:00
Linus Torvalds
0352387523 First step of consolidating the VDSO data page handling:
The VDSO data page handling is architecture specific for historical
   reasons, but there is no real technical reason to do so.
 
   Aside of that VDSO data has become a dump ground for various mechanisms
   and fail to provide a clear separation of the functionalities.
 
   Clean this up by:
 
     * consolidating the VDSO page data by getting rid of architecture
       specific warts especially in x86 and PowerPC.
 
     * removing the last includes of header files which are pulling in other
       headers outside of the VDSO namespace.
 
     * seperating timekeeping and other VDSO data accordingly.
 
   Further consolidation of the VDSO page handling is done in subsequent
   changes scheduled for the next merge window.
 
   This also lays the ground for expanding the VDSO time getters for
   independent PTP clocks in a generic way without making every architecture
   add support seperately.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmc7kyoTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoVBjD/9awdN2YeCGIM9rlHIktUdNRmRSL2SL
 6av1CPffN5DenONYTXWrDYPkC4yfjUwIs8H57uzFo10yA7RQ/Qfq+O68k5GnuFew
 jvpmmYSZ6TT21AmAaCIhn+kdl9YbEJFvN2AWH85Bl29k9FGB04VzJlQMMjfEZ1a5
 Mhwv+cfYNuPSZmU570jcxW2XgbyTWlLZBByXX/Tuz9bwpmtszba507bvo45x6gIP
 twaWNzrsyJpdXfMrfUnRiChN8jHlDN7I6fgQvpsoRH5FOiVwIFo0Ip2rKbk+ONfD
 W/rcU5oeqRIxRVDHzf2Sv8WPHMCLRv01ZHBcbJOtgvZC3YiKgKYoeEKabu9ZL1BH
 6VmrxjYOBBFQHOYAKPqBuS7BgH5PmtMbDdSZXDfRaAKaCzhCRysdlWW7z48r2R//
 zPufb7J6Tle23AkuZWhFjvlGgSBl4zxnTFn31HYOyQps3TMI4y50Z2DhE/EeU8a6
 DRl8/k1KQVDUZ6udJogS5kOr1J8pFtUPrA2uhR8UyLdx7YKiCzcdO1qWAjtXlVe8
 oNpzinU+H9bQqGe9IyS7kCG9xNaCRZNkln5Q1WfnkTzg5f6ihfaCvIku3l4bgVpw
 3HmcxYiC6RxQB+ozwN7hzCCKT4L9aMhr/457TNOqRkj2Elw3nvJ02L4aI86XAKLE
 jwO9Fkp9qcCxCw==
 =q5eD
 -----END PGP SIGNATURE-----

Merge tag 'timers-vdso-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull vdso data page handling updates from Thomas Gleixner:
 "First steps of consolidating the VDSO data page handling.

  The VDSO data page handling is architecture specific for historical
  reasons, but there is no real technical reason to do so.

  Aside of that VDSO data has become a dump ground for various
  mechanisms and fail to provide a clear separation of the
  functionalities.

  Clean this up by:

   - consolidating the VDSO page data by getting rid of architecture
     specific warts especially in x86 and PowerPC.

   - removing the last includes of header files which are pulling in
     other headers outside of the VDSO namespace.

   - seperating timekeeping and other VDSO data accordingly.

  Further consolidation of the VDSO page handling is done in subsequent
  changes scheduled for the next merge window.

  This also lays the ground for expanding the VDSO time getters for
  independent PTP clocks in a generic way without making every
  architecture add support seperately"

* tag 'timers-vdso-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits)
  x86/vdso: Add missing brackets in switch case
  vdso: Rename struct arch_vdso_data to arch_vdso_time_data
  powerpc: Split systemcfg struct definitions out from vdso
  powerpc: Split systemcfg data out of vdso data page
  powerpc: Add kconfig option for the systemcfg page
  powerpc/pseries/lparcfg: Use num_possible_cpus() for potential processors
  powerpc/pseries/lparcfg: Fix printing of system_active_processors
  powerpc/procfs: Propagate error of remap_pfn_range()
  powerpc/vdso: Remove offset comment from 32bit vdso_arch_data
  x86/vdso: Split virtual clock pages into dedicated mapping
  x86/vdso: Delete vvar.h
  x86/vdso: Access vdso data without vvar.h
  x86/vdso: Move the rng offset to vsyscall.h
  x86/vdso: Access rng vdso data without vvar.h
  x86/vdso: Access timens vdso data without vvar.h
  x86/vdso: Allocate vvar page from C code
  x86/vdso: Access rng data from kernel without vvar
  x86/vdso: Place vdso_data at beginning of vvar page
  x86/vdso: Use __arch_get_vdso_data() to access vdso data
  x86/mm/mmap: Remove arch_vma_name()
  ...
2024-11-19 16:09:13 -08:00
Paolo Bonzini
2e9a2c624e Merge branch 'kvm-docs-6.13' into HEAD
- Drop obsolete references to PPC970 KVM, which was removed 10 years ago.

- Fix incorrect references to non-existing ioctls

- List registers supported by KVM_GET/SET_ONE_REG on s390

- Use rST internal links

- Reorganize the introduction to the API document
2024-11-13 07:18:12 -05:00
Palmer Dabbelt
64f7b77f0b
Merge patch series "Zacas/Zabha support and qspinlocks"
Alexandre Ghiti <alexghiti@rivosinc.com> says:

This implements [cmp]xchgXX() macros using Zacas and Zabha extensions
and finally uses those newly introduced macros to add support for
qspinlocks: note that this implementation of qspinlocks satisfies the
forward progress guarantee.

It also uses Ziccrse to provide the qspinlock implementation.

Thanks to Guo and Leonardo for their work!

* b4-shazam-merge: (1314 commits)
  riscv: Add qspinlock support
  dt-bindings: riscv: Add Ziccrse ISA extension description
  riscv: Add ISA extension parsing for Ziccrse
  asm-generic: ticket-lock: Add separate ticket-lock.h
  asm-generic: ticket-lock: Reuse arch_spinlock_t of qspinlock
  riscv: Implement xchg8/16() using Zabha
  riscv: Implement arch_cmpxchg128() using Zacas
  riscv: Improve zacas fully-ordered cmpxchg()
  riscv: Implement cmpxchg8/16() using Zabha
  dt-bindings: riscv: Add Zabha ISA extension description
  riscv: Implement cmpxchg32/64() using Zacas
  riscv: Do not fail to build on byte/halfword operations with Zawrs
  riscv: Move cpufeature.h macros into their own header

Link: https://lore.kernel.org/r/20241103145153.105097-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-11-11 07:35:09 -08:00
Alexandre Ghiti
ab83647fad
riscv: Add qspinlock support
In order to produce a generic kernel, a user can select
CONFIG_COMBO_SPINLOCKS which will fallback at runtime to the ticket
spinlock implementation if Zabha or Ziccrse are not present.

Note that we can't use alternatives here because the discovery of
extensions is done too late and we need to start with the qspinlock
implementation because the ticket spinlock implementation would pollute
the spinlock value, so let's use static keys.

This is largely based on Guo's work and Leonardo reviews at [1].

Link: https://lore.kernel.org/linux-riscv/20231225125847.2778638-1-guoren@kernel.org/ [1]
Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrea Parri <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20241103145153.105097-14-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-11-11 07:33:20 -08:00
Alexandre Ghiti
2d36fe89d8
riscv: Add ISA extension parsing for Ziccrse
Add support to parse the Ziccrse string in the riscv,isa string.

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Andrea Parri <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20241103145153.105097-12-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-11-11 07:33:18 -08:00
Alexandre Ghiti
1658ef4314
riscv: Implement cmpxchg8/16() using Zabha
This adds runtime support for Zabha in cmpxchg8/16() operations.

Note that in the absence of Zacas support in the toolchain, CAS
instructions from Zabha won't be used.

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Andrea Parri <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20241103145153.105097-6-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-11-11 07:33:12 -08:00
Mike Rapoport (Microsoft)
0c3beacf68 asm-generic: introduce text-patching.h
Several architectures support text patching, but they name the header
files that declare patching functions differently.

Make all such headers consistently named text-patching.h and add an empty
header in asm-generic for architectures that do not support text patching.

Link: https://lkml.kernel.org/r/20241023162711.2579610-4-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Tested-by: kdevops <kdevops@lists.linux.dev>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Brian Cain <bcain@quicinc.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Song Liu <song@kernel.org>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-07 14:25:15 -08:00
Nam Cao
a812eee0b6 vdso: Rename struct arch_vdso_data to arch_vdso_time_data
The struct arch_vdso_data is only about vdso time data. So rename it to
arch_vdso_time_data to make it obvious.
Non time-related data will be migrated out of these structs soon.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390
Link: https://lore.kernel.org/all/20241010-vdso-generic-base-v1-28-b64f0842d512@linutronix.de
2024-11-02 12:37:36 +01:00
Thomas Weißschuh
d34b60752f riscv: vdso: Use only one single vvar mapping
The vvar mapping is the same for all processes. Use a single mapping to
simplify the logic and align it with the other architectures.

In addition this will enable the move of the vvar handling into generic code.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241010-vdso-generic-base-v1-6-b64f0842d512@linutronix.de
2024-11-02 12:37:33 +01:00
Usama Arif
b2473a3597 of/fdt: add dt_phys arg to early_init_dt_scan and early_init_dt_verify
__pa() is only intended to be used for linear map addresses and using
it for initial_boot_params which is in fixmap for arm64 will give an
incorrect value. Hence save the physical address when it is known at
boot time when calling early_init_dt_scan for arm64 and use it at kexec
time instead of converting the virtual address using __pa().

Note that arm64 doesn't need the FDT region reserved in the DT as the
kernel explicitly reserves the passed in FDT. Therefore, only a debug
warning is fixed with this change.

Reported-by: Breno Leitao <leitao@debian.org>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Usama Arif <usamaarif642@gmail.com>
Fixes: ac10be5cdb ("arm64: Use common of_kexec_alloc_and_setup_fdt()")
Link: https://lore.kernel.org/r/20241023171426.452688-1-usamaarif642@gmail.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2024-10-29 15:32:45 -05:00
Quan Zhou
5bb5ccb3e8 riscv: perf: add guest vs host distinction
Introduce basic guest support in perf, enabling it to distinguish
between PMU interrupts in the host or guest, and collect
fundamental information.

Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/a67d527dc1b11493fe11f7f53584772fdd983744.1728957131.git.zhouquan@iscas.ac.cn
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-10-28 16:41:12 +05:30
Palmer Dabbelt
5f153a692b
Merge commit 'bf40167d54d5' into fixes
This fix is part of a series on for-next, but it fixes broken builds so
I'm picking it up as a fix.

* commit 'bf40167d54d5':
  riscv: vdso: Prevent the compiler from inserting calls to memset()
2024-10-25 06:18:43 -07:00
Chunyan Zhang
164f66de6b
riscv: Remove duplicated GET_RM
The macro GET_RM defined twice in this file, one can be removed.

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
Fixes: 956d705dd2 ("riscv: Unaligned load/store handling for M_MODE")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20241008094141.549248-3-zhangchunyan@iscas.ac.cn
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-25 06:18:42 -07:00
Chunyan Zhang
46d4e5ac6f
riscv: Remove unused GENERATING_ASM_OFFSETS
The macro is not used in the current version of kernel, it looks like
can be removed to avoid a build warning:

../arch/riscv/kernel/asm-offsets.c: At top level:
../arch/riscv/kernel/asm-offsets.c:7: warning: macro "GENERATING_ASM_OFFSETS" is not used [-Wunused-macros]
    7 | #define GENERATING_ASM_OFFSETS

Fixes: 9639a44394 ("RISC-V: Provide a cleaner raw_smp_processor_id()")
Cc: stable@vger.kernel.org
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
Link: https://lore.kernel.org/r/20241008094141.549248-2-zhangchunyan@iscas.ac.cn
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-25 06:18:41 -07:00
WangYuli
e0872ab726
riscv: Use '%u' to format the output of 'cpu'
'cpu' is an unsigned integer, so its conversion specifier should
be %u, not %d.

Suggested-by: Wentao Guan <guanwentao@uniontech.com>
Suggested-by: Maciej W. Rozycki <macro@orcam.me.uk>
Link: https://lore.kernel.org/all/alpine.DEB.2.21.2409122309090.40372@angie.orcam.me.uk/
Signed-off-by: WangYuli <wangyuli@uniontech.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Fixes: f1e58583b9 ("RISC-V: Support cpu hotplug")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/4C127DEECDA287C8+20241017032010.96772-1-wangyuli@uniontech.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-25 06:18:40 -07:00
Miquel Sabaté Solà
37233169a6
riscv: Prevent a bad reference count on CPU nodes
When populating cache leaves we previously fetched the CPU device node
at the very beginning. But when ACPI is enabled we go through a
specific branch which returns early and does not call 'of_node_put' for
the node that was acquired.

Since we are not using a CPU device node for the ACPI code anyways, we
can simply move the initialization of it just passed the ACPI block, and
we are guaranteed to have an 'of_node_put' call for the acquired node.
This prevents a bad reference count of the CPU device node.

Moreover, the previous function did not check for errors when acquiring
the device node, so a return -ENOENT has been added for that case.

Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Fixes: 604f32ea69 ("riscv: cacheinfo: initialize cacheinfo's level and  type from ACPI PPTT")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240913080053.36636-1-mikisabate@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-25 06:18:39 -07:00
Heinrich Schuchardt
d41373a4b9
riscv: efi: Set NX compat flag in PE/COFF header
The IMAGE_DLLCHARACTERISTICS_NX_COMPAT informs the firmware that the
EFI binary does not rely on pages that are both executable and
writable.

The flag is used by some distro versions of GRUB to decide if the EFI
binary may be executed.

As the Linux kernel neither has RWX sections nor needs RWX pages for
relocation we should set the flag.

Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
Reviewed-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Fixes: cb7d2dd561 ("RISC-V: Add PE/COFF header for EFI stub")
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240929140233.211800-1-heinrich.schuchardt@canonical.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-25 06:18:38 -07:00
Alexandre Ghiti
afedc3126e
riscv: Do not use fortify in early code
Early code designates the code executed when the MMU is not yet enabled,
and this comes with some limitations (see
Documentation/arch/riscv/boot.rst, section "Pre-MMU execution").

FORTIFY_SOURCE must be disabled then since it can trigger kernel panics
as reported in [1].

Reported-by: Jason Montleon <jmontleo@redhat.com>
Closes: https://lore.kernel.org/linux-riscv/CAJD_bPJes4QhmXY5f63GHV9B9HFkSCoaZjk-qCT2NGS7Q9HODg@mail.gmail.com/ [1]
Fixes: a35707c3d8 ("riscv: add memory-type errata for T-Head")
Fixes: 26e7aacb83 ("riscv: Allow to downgrade paging mode from the command line")
Cc: stable@vger.kernel.org
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20241009072749.45006-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-25 06:18:36 -07:00
Yunhui Cui
1966db682f
RISC-V: ACPI: fix early_ioremap to early_memremap
When SVPBMT is enabled, __acpi_map_table() will directly access the
data in DDR through the IO attribute, rather than through hardware
cache consistency, resulting in incorrect data in the obtained ACPI
table.

The log: ACPI: [ACPI:0x18] Invalid zero length.

We do not assume whether the bootloader flushes or not. We should
access in a cacheable way instead of maintaining cache consistency
by software.

Fixes: 3b426d4b5b ("RISC-V: ACPI : Fix for usage of pointers in different address space")
Cc: stable@vger.kernel.org
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Link: https://lore.kernel.org/r/20241014130141.86426-1-cuiyunhui@bytedance.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-25 06:18:31 -07:00
Palmer Dabbelt
075fde5818
Merge patch series "riscv: Userspace pointer masking and tagged address ABI"
Samuel Holland <samuel.holland@sifive.com> says:

RISC-V defines three extensions for pointer masking[1]:
 - Smmpm: configured in M-mode, affects M-mode
 - Smnpm: configured in M-mode, affects the next lower mode (S or U-mode)
 - Ssnpm: configured in S-mode, affects the next lower mode (VS, VU, or U-mode)

This series adds support for configuring Smnpm or Ssnpm (depending on
which privilege mode the kernel is running in) to allow pointer masking
in userspace (VU or U-mode), extending the PR_SET_TAGGED_ADDR_CTRL API
from arm64. Unlike arm64 TBI, userspace pointer masking is not enabled
by default on RISC-V. Additionally, the tag width (referred to as PMLEN)
is variable, so userspace needs to ask the kernel for a specific tag
width, which is interpreted as a lower bound on the number of tag bits.

This series also adds support for a tagged address ABI similar to arm64
and x86. Since accesses from the kernel to user memory use the kernel's
pointer masking configuration, not the user's, the kernel must untag
user pointers in software before dereferencing them. And since the tag
width is variable, as with LAM on x86, it must be kept the same across
all threads in a process so untagged_addr_remote() can work.

[1]: https://github.com/riscv/riscv-j-extension/raw/d70011dde6c2/zjpm-spec.pdf

* b4-shazam-merge:
  KVM: riscv: selftests: Add Smnpm and Ssnpm to get-reg-list test
  RISC-V: KVM: Allow Smnpm and Ssnpm extensions for guests
  riscv: hwprobe: Export the Supm ISA extension
  riscv: selftests: Add a pointer masking test
  riscv: Allow ptrace control of the tagged address ABI
  riscv: Add support for the tagged address ABI
  riscv: Add support for userspace pointer masking
  riscv: Add CSR definitions for pointer masking
  riscv: Add ISA extension parsing for pointer masking
  dt-bindings: riscv: Add pointer masking ISA extensions

Link: https://lore.kernel.org/r/20241016202814.4061541-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-24 14:13:03 -07:00
Samuel Holland
3c2e0aff7b
riscv: hwprobe: Export the Supm ISA extension
Supm is a virtual ISA extension defined in the RISC-V Pointer Masking
specification, which indicates that pointer masking is available in
U-mode. It can be provided by either Smnpm or Ssnpm, depending on which
mode the kernel runs in. Userspace should not care about this
distinction, so export Supm instead of either underlying extension.

Hide the extension if the kernel was compiled without support for the
pointer masking prctl() interface.

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20241016202814.4061541-9-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-24 14:12:59 -07:00
Samuel Holland
78844482a1
riscv: Allow ptrace control of the tagged address ABI
This allows a tracer to control the ABI of the tracee, as on arm64.

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20241016202814.4061541-7-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-24 14:12:57 -07:00
Samuel Holland
2e17430858
riscv: Add support for the tagged address ABI
When pointer masking is enabled for userspace, the kernel can accept
tagged pointers as arguments to some system calls. Allow this by
untagging the pointers in access_ok() and the uaccess routines. The
uaccess routines must peform untagging in software because U-mode and
S-mode have entirely separate pointer masking configurations. In fact,
hardware may not even implement pointer masking for S-mode.

Since the number of tag bits is variable, untagged_addr_remote() needs
to know what PMLEN to use for the remote mm. Therefore, the pointer
masking mode must be the same for all threads sharing an mm. Enforce
this with a lock flag in the mm context, as x86 does for LAM. The flag
gets reset in init_new_context() during fork(), as the new mm is no
longer multithreaded.

Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20241016202814.4061541-6-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-24 14:12:56 -07:00
Samuel Holland
09d6775f50
riscv: Add support for userspace pointer masking
RISC-V supports pointer masking with a variable number of tag bits
(which is called "PMLEN" in the specification) and which is configured
at the next higher privilege level.

Wire up the PR_SET_TAGGED_ADDR_CTRL and PR_GET_TAGGED_ADDR_CTRL prctls
so userspace can request a lower bound on the number of tag bits and
determine the actual number of tag bits. As with arm64's
PR_TAGGED_ADDR_ENABLE, the pointer masking configuration is
thread-scoped, inherited on clone() and fork() and cleared on execve().

Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20241016202814.4061541-5-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-24 14:12:55 -07:00
Samuel Holland
2e6f6ea452
riscv: Add ISA extension parsing for pointer masking
The RISC-V Pointer Masking specification defines three extensions:
Smmpm, Smnpm, and Ssnpm. Add support for parsing each of them. The
specific extension which provides pointer masking support to userspace
(Supm) depends on the kernel's privilege mode, so provide a macro to
abstract this selection.

Smmpm implies the existence of the mseccfg CSR. As it is the only user
of this CSR so far, there is no need for an Xlinuxmseccfg extension.

Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20241016202814.4061541-3-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-24 14:12:53 -07:00
Palmer Dabbelt
ce16531d48
Merge patch series "Prevent dynamic relocations in vDSO"
The first is a fix and the second a check to make sure we don't
regress on the relocations, so I'm picking this up as a series to get
the fix into fixes.

* b4-shazam-merge:
  riscv: Check that vdso does not contain any dynamic relocations
  riscv: vdso: Prevent the compiler from inserting calls to memset()

Link: https://lore.kernel.org/r/20241016083625.136311-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-24 14:12:12 -07:00
Alexandre Ghiti
c6898d66fd
riscv: Check that vdso does not contain any dynamic relocations
Like other architectures, use the common cmd_vdso_check to make sure of
that.

Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Tested-by: Vladimir Isaev <vladimir.isaev@syntacore.com>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/20241016083625.136311-3-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-24 10:52:53 -07:00
Alexandre Ghiti
bf40167d54
riscv: vdso: Prevent the compiler from inserting calls to memset()
The compiler is smart enough to insert a call to memset() in
riscv_vdso_get_cpus(), which generates a dynamic relocation.

So prevent this by using -fno-builtin option.

Fixes: e2c0cdfba7 ("RISC-V: User-facing API")
Cc: stable@vger.kernel.org
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/20241016083625.136311-2-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-24 10:52:52 -07:00
Palmer Dabbelt
18efe86bf2
Merge patch series "RISC-V: Detect and report speed of unaligned vector accesses"
Charlie Jenkins <charlie@rivosinc.com> says:

Adds support for detecting and reporting the speed of unaligned vector
accesses on RISC-V CPUs. Adds vec_misaligned_speed key to the hwprobe
adds Zicclsm to cpufeature and fixes the check for scalar unaligned
emulated all CPUs. The vec_misaligned_speed key keeps the same format
as the scalar unaligned access speed key.

This set does not emulate unaligned vector accesses on CPUs that do not
support them. Only reports if userspace can run them and speed of
unaligned vector accesses if supported.

* b4-shazam-merge:
  RISC-V: hwprobe: Document unaligned vector perf key
  RISC-V: Report vector unaligned access speed hwprobe
  RISC-V: Detect unaligned vector accesses supported
  RISC-V: Replace RISCV_MISALIGNED with RISCV_SCALAR_MISALIGNED
  RISC-V: Scalar unaligned access emulated on hotplug CPUs
  RISC-V: Check scalar unaligned access on all CPUs

Link: https://lore.kernel.org/r/20241017-jesse_unaligned_vector-v10-0-5b33500160f8@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-18 12:38:36 -07:00
Jesse Taube
e7c9d66e31
RISC-V: Report vector unaligned access speed hwprobe
Detect if vector misaligned accesses are faster or slower than
equivalent vector byte accesses. This is useful for usermode to know
whether vector byte accesses or vector misaligned accesses have a better
bandwidth for operations like memcpy.

Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20241017-jesse_unaligned_vector-v10-5-5b33500160f8@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-18 12:38:34 -07:00
Jesse Taube
d1703dc7bc
RISC-V: Detect unaligned vector accesses supported
Run an unaligned vector access to test if the system supports
vector unaligned access. Add the result to a new key in hwprobe.
This is useful for usermode to know if vector misaligned accesses are
supported and if they are faster or slower than equivalent byte accesses.

Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20241017-jesse_unaligned_vector-v10-4-5b33500160f8@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-18 12:38:33 -07:00
Jesse Taube
c05a62c925
RISC-V: Replace RISCV_MISALIGNED with RISCV_SCALAR_MISALIGNED
Replace RISCV_MISALIGNED with RISCV_SCALAR_MISALIGNED to allow
for the addition of RISCV_VECTOR_MISALIGNED in a later patch.

Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Evan Green <evan@rivosinc.com>
Link: https://lore.kernel.org/r/20241017-jesse_unaligned_vector-v10-3-5b33500160f8@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-18 12:38:32 -07:00
Jesse Taube
9c528b5f79
RISC-V: Scalar unaligned access emulated on hotplug CPUs
The check_unaligned_access_emulated() function should have been called
during CPU hotplug to ensure that if all CPUs had emulated unaligned
accesses, the new CPU also does.

This patch adds the call to check_unaligned_access_emulated() in
the hotplug path.

Fixes: 55e0bf49a0 ("RISC-V: Probe misaligned access speed in parallel")
Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Evan Green <evan@rivosinc.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20241017-jesse_unaligned_vector-v10-2-5b33500160f8@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-18 12:38:31 -07:00
Jesse Taube
8d20a739f1
RISC-V: Check scalar unaligned access on all CPUs
Originally, the check_unaligned_access_emulated_all_cpus function
only checked the boot hart. This fixes the function to check all
harts.

Fixes: 71c54b3d16 ("riscv: report misaligned accesses emulation to hwprobe")
Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Evan Green <evan@rivosinc.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20241017-jesse_unaligned_vector-v10-1-5b33500160f8@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-18 12:38:10 -07:00
Steven Rostedt
7888af4166 ftrace: Make ftrace_regs abstract from direct use
ftrace_regs was created to hold registers that store information to save
function parameters, return value and stack. Since it is a subset of
pt_regs, it should only be used by its accessor functions. But because
pt_regs can easily be taken from ftrace_regs (on most archs), it is
tempting to use it directly. But when running on other architectures, it
may fail to build or worse, build but crash the kernel!

Instead, make struct ftrace_regs an empty structure and have the
architectures define __arch_ftrace_regs and all the accessor functions
will typecast to it to get to the actual fields. This will help avoid
usage of ftrace_regs directly.

Link: https://lore.kernel.org/all/20241007171027.629bdafd@gandalf.local.home/

Cc: "linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>
Cc: "x86@kernel.org" <x86@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Naveen N Rao <naveen@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Paul  Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas  Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav  Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/20241008230628.958778821@goodmis.org
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-10 20:18:01 -04:00
Samuel Holland
368546ebe7
riscv: Call riscv_user_isa_enable() only on the boot hart
Now that the [ms]envcfg CSR value is maintained per thread, not per
hart, riscv_user_isa_enable() only needs to be called once during boot,
to set the value for the init task. This also allows it to be marked as
__init.

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Deepak Gupta <debug@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240814081126.956287-4-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-05 08:51:15 -07:00
Samuel Holland
5fc7355f01
riscv: Add support for per-thread envcfg CSR values
Some bits in the [ms]envcfg CSR, such as the CFI state and pointer
masking mode, need to be controlled on a per-thread basis. Support this
by keeping a copy of the CSR value in struct thread_struct and writing
it during context switches. It is safe to discard the old CSR value
during the context switch because the CSR is modified only by software,
so the CSR will remain in sync with the copy in thread_struct.

Use ALTERNATIVE directly instead of riscv_has_extension_unlikely() to
minimize branchiness in the context switching code.

Since thread_struct is copied during fork(), setting the value for the
init task sets the default value for all other threads.

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Deepak Gupta <debug@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240814081126.956287-3-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-05 08:51:14 -07:00
Samuel Holland
1b57747e97
riscv: Enable cbo.zero only when all harts support Zicboz
Currently, we enable cbo.zero for usermode on each hart that supports
the Zicboz extension. This means that the [ms]envcfg CSR value may
differ between harts. Other features, such as pointer masking and CFI,
require setting [ms]envcfg bits on a per-thread basis. The combination
of these two adds quite some complexity and overhead to context
switching, as we would need to maintain two separate masks for the
per-hart and per-thread bits. Andrew Jones, who originally added Zicboz
support, writes[1][2]:

  I've approached Zicboz the same way I would approach all
  extensions, which is to be per-hart. I'm not currently aware of
  a platform that is / will be composed of harts where some have
  Zicboz and others don't, but there's nothing stopping a platform
  like that from being built.

  So, how about we add code that confirms Zicboz is on all harts.
  If any hart does not have it, then we complain loudly and disable
  it on all the other harts. If it was just a hardware description
  bug, then it'll get fixed. If there's actually a platform which
  doesn't have Zicboz on all harts, then, when the issue is reported,
  we can decide to not support it, support it with defconfig, or
  support it under a Kconfig guard which must be enabled by the user.

Let's follow his suggested solution and require the extension to be
available on all harts, so the envcfg CSR value does not need to change
when a thread migrates between harts. Since we are doing this for all
extensions with fields in envcfg, the CSR itself only needs to be saved/
restored when it is present on all harts.

This should not be a regression as no known hardware has asymmetric
Zicboz support, but if anyone reports seeing the warning, we will
re-evaluate our solution.

Link: https://lore.kernel.org/linux-riscv/20240322-168f191eeb8479b2ea169a5e@orel/ [1]
Link: https://lore.kernel.org/linux-riscv/20240323-28943722feb57a41fb0ff488@orel/ [2]
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Deepak Gupta <debug@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240814081126.956287-2-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-10-05 08:51:13 -07:00
Linus Torvalds
97d8894b6f RISC-V Patches for the 6.12 Merge Window, Part 1
* Support for using Zkr to seed KASLR.
 * Support for IPI-triggered CPU backtracing.
 * Support for generic CPU vulnerabilities reporting to userspace.
 * A few cleanups for missing licenses.
 * The size limit on the XIP kernel has been removed.
 * Support for tracing userspace stacks.
 * Support for the Svvptc extension.
 * Various cleanups and fixes throughout the tree.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmbykZITHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiTDLEACABTPlMX+Ke1lMYWj6MLBTXMSlQJ6w
 TKLVk3slp4POPsd+ViWy4J6EJDpCkNp6iK30Bv4AGainA3RJnbS7neKYy+MTw0ZV
 pJWTn73sxHeF+APnZ+geFYcmyteL/SZplgHgwLipFaojs7i/+tOvLFRRD3LueCz6
 Q6Muvke9R5uN6LL3tUrmIhJeyZjOwJtKEYoRz6Fo5Tq3RRK0VHYZkdpqRQ86rr9U
 yNbRNlBLlANstq8xjtiwm22ImPGIsvvhs0AvFft0H72zhf3Tfy0VcTLlDJYwkdKN
 Bz59v4N4jg1ajXuAsj4/BQdj5uRkzJNZ3Yq1yvMmGI+2UV1tInHEMeZi63+4gs8F
 0FYCziA+j6/08u8v8CrwdDOpJ9Iq/NMV6359bt5ySu3rM6QnPRGnHGkv4Ptk9Oyh
 sSPiuHS8YxpRBOB8xXNp5xFipTZ4+65VqCz253pAsbbp5aZ/o9Jw/bbBFFWcSsRZ
 tV2xhELVzPiC7dowfYkQhcNZOLlCaJ/Mvo2nMOtjbUwzaXkcjIRcYEy7+dTlfXyr
 3h5sStjWAKEmtLEvCYSI+lAT544tbV1x+lCMJGuvztapsTMtAP4GpQKblXl5qqnV
 L+VWIPJV1elI26H/p/Max9Z1s48NtfwoJSRL7Rnaya6uJ3iWG/BtajHlFbvgOkJf
 ObPZ//Yr9e91gg==
 =zDwL
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.12-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support using Zkr to seed KASLR

 - Support IPI-triggered CPU backtracing

 - Support for generic CPU vulnerabilities reporting to userspace

 - A few cleanups for missing licenses

 - The size limit on the XIP kernel has been removed

 - Support for tracing userspace stacks

 - Support for the Svvptc extension

 - Various cleanups and fixes throughout the tree

* tag 'riscv-for-linus-6.12-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (47 commits)
  crash: Fix riscv64 crash memory reserve dead loop
  perf/riscv-sbi: Add platform specific firmware event handling
  tools: Optimize ring buffer for riscv
  tools: Add riscv barrier implementation
  RISC-V: Don't have MAX_PHYSMEM_BITS exceed phys_addr_t
  ACPI: NUMA: initialize all values of acpi_early_node_map to NUMA_NO_NODE
  riscv: Enable bitops instrumentation
  riscv: Omit optimized string routines when using KASAN
  ACPI: RISCV: Make acpi_numa_get_nid() to be static
  riscv: Randomize lower bits of stack address
  selftests: riscv: Allow mmap test to compile on 32-bit
  riscv: Make riscv_isa_vendor_ext_andes array static
  riscv: Use LIST_HEAD() to simplify code
  riscv: defconfig: Disable RZ/Five peripheral support
  RISC-V: Implement kgdb_roundup_cpus() to enable future NMI Roundup
  riscv: avoid Imbalance in RAS
  riscv: cacheinfo: Add back init_cache_level() function
  riscv: Remove unused _TIF_WORK_MASK
  drivers/perf: riscv: Remove redundant macro check
  riscv: define ILLEGAL_POINTER_VALUE for 64bit
  ...
2024-09-24 10:59:17 -07:00
Haibo Xu
732b177663
ACPI: NUMA: initialize all values of acpi_early_node_map to NUMA_NO_NODE
Currently, only acpi_early_node_map[0] was initialized to NUMA_NO_NODE.
To ensure all the values were properly initialized, switch to initialize
all of them to NUMA_NO_NODE.

Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> (arm64 platform)
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240729035958.1957185-1-haibo1.xu@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-20 01:31:26 -07:00
Palmer Dabbelt
5835437609
Merge patch series "riscv: Improve KASAN coverage to fix unit tests"
Samuel Holland <samuel.holland@sifive.com> says:

This series fixes two areas where uninstrumented assembly routines
caused gaps in KASAN coverage on RISC-V, which were caught by KUnit
tests. The KASAN KUnit test suite passes after applying this series.

This series fixes the following test failures:
  # kasan_strings: EXPECTATION FAILED at mm/kasan/kasan_test.c:1520
  KASAN failure expected in "kasan_int_result = strcmp(ptr, "2")", but none occurred
  # kasan_strings: EXPECTATION FAILED at mm/kasan/kasan_test.c:1524
  KASAN failure expected in "kasan_int_result = strlen(ptr)", but none occurred
  not ok 60 kasan_strings
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1531
  KASAN failure expected in "set_bit(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1533
  KASAN failure expected in "clear_bit(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1535
  KASAN failure expected in "clear_bit_unlock(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1536
  KASAN failure expected in "__clear_bit_unlock(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1537
  KASAN failure expected in "change_bit(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1543
  KASAN failure expected in "test_and_set_bit(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1545
  KASAN failure expected in "test_and_set_bit_lock(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1546
  KASAN failure expected in "test_and_clear_bit(nr, addr)", but none occurred
  # kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1548
  KASAN failure expected in "test_and_change_bit(nr, addr)", but none occurred
  not ok 61 kasan_bitops_generic

Samuel Holland (2):
  riscv: Omit optimized string routines when using KASAN
  riscv: Enable bitops instrumentation

arch/riscv/include/asm/bitops.h | 43 ++++++++++++++++++---------------
 arch/riscv/include/asm/string.h |  2 ++
 arch/riscv/kernel/riscv_ksyms.c |  3 ---
 arch/riscv/lib/Makefile         |  2 ++
 arch/riscv/lib/strcmp.S         |  1 +
 arch/riscv/lib/strlen.S         |  1 +
 arch/riscv/lib/strncmp.S        |  1 +
 arch/riscv/purgatory/Makefile   |  2 ++
 8 files changed, 32 insertions(+), 23 deletions(-)

* b4-shazam-merge:
  riscv: Enable bitops instrumentation
  riscv: Omit optimized string routines when using KASAN

Link: https://lore.kernel.org/r/20240801033725.28816-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-19 01:10:44 -07:00
Samuel Holland
58ff537109
riscv: Omit optimized string routines when using KASAN
The optimized string routines are implemented in assembly, so they are
not instrumented for use with KASAN. Fall back to the C version of the
routines in order to improve KASAN coverage. This fixes the
kasan_strings() unit test.

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240801033725.28816-2-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-19 01:10:00 -07:00
Hanjun Guo
21d98d658f
ACPI: RISCV: Make acpi_numa_get_nid() to be static
acpi_numa_get_nid() is only called in acpi_numa.c for riscv,
no need to add it in head file, so make it static and remove
related functions in the asm/acpi.h.

Spotted by doing some cleanup for arm64 ACPI.

Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Haibo Xu <haibo1.xu@intel.com>
Link: https://lore.kernel.org/r/20240811031804.3347298-1-guohanjun@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-17 12:02:48 -07:00
Yunhui Cui
048e2906d4
riscv: Randomize lower bits of stack address
Implement arch_align_stack() to randomize the lower bits
of the stack address.

Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Link: https://lore.kernel.org/r/20240625030502.68988-1-cuiyunhui@bytedance.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-17 08:05:10 -07:00
Charlie Jenkins
594ffcf4ef
riscv: Make riscv_isa_vendor_ext_andes array static
Since this array is only used in this file, it should be static.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202407241530.ej5SVgX1-lkp@intel.com/
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240807-make_andes_static-v1-1-b64bf4c3d941@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-17 08:05:08 -07:00
Jinjie Ruan
3cc754c237
riscv: Use LIST_HEAD() to simplify code
list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD().

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240904013344.2026738-1-ruanjinjie@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-17 06:26:07 -07:00
Jinjie Ruan
983f121499
RISC-V: Implement kgdb_roundup_cpus() to enable future NMI Roundup
Until now, the generic weak kgdb_roundup_cpus() has been used for kgdb on
RISCV. A custom one allows to debug CPUs that are stuck with interrupts
disabled with NMI support in the future. And using an IPI is better than
the generic one since it avoids the potential situation described in the
generic kgdb_call_nmi_hook(). As Andrew pointed out, once there is NMI
support, we can easily extend this and the CPU backtrace support
to use NMIs.

After this patch, the kgdb test show that:
	# echo g > /proc/sysrq-trigger
	[2]kdb> btc
	btc: cpu status: Currently on cpu 2
	Available cpus: 0-1(-), 2, 3(-)
	Stack traceback for pid 0
	0xffffffff81c13a40        0        0  1    0   -  0xffffffff81c14510  swapper/0
	CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.10.0-g3120273055b6-dirty #51
	Hardware name: riscv-virtio,qemu (DT)
	Call Trace:
	[<ffffffff80006c48>] dump_backtrace+0x28/0x30
	[<ffffffff80fceb38>] show_stack+0x38/0x44
	[<ffffffff80fe6a04>] dump_stack_lvl+0x58/0x7a
	[<ffffffff80fe6a3e>] dump_stack+0x18/0x20
	[<ffffffff801143fa>] kgdb_cpu_enter+0x682/0x6b2
	[<ffffffff801144ca>] kgdb_nmicallback+0xa0/0xac
	[<ffffffff8000a392>] handle_IPI+0x9c/0x120
	[<ffffffff800a2baa>] handle_percpu_devid_irq+0xa4/0x1e4
	[<ffffffff8009cca8>] generic_handle_domain_irq+0x28/0x36
	[<ffffffff800a9e5c>] ipi_mux_process+0xe8/0x110
	[<ffffffff806e1e30>] imsic_handle_irq+0xf8/0x13a
	[<ffffffff8009cca8>] generic_handle_domain_irq+0x28/0x36
	[<ffffffff806dff12>] riscv_intc_aia_irq+0x2e/0x40
	[<ffffffff80fe6ab0>] handle_riscv_irq+0x54/0x86
	[<ffffffff80ff2e4a>] call_on_irq_stack+0x32/0x40

Rebased on Ryo Takakura's "RISC-V: Enable IPI CPU Backtrace" patch.

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240727063438.886155-1-ruanjinjie@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-17 05:52:44 -07:00
Jisheng Zhang
8f1534e744
riscv: avoid Imbalance in RAS
Inspired by[1], modify the code to remove the code of modifying ra to
avoid imbalance RAS (return address stack) which may lead to incorret
predictions on return.

Link: https://lore.kernel.org/linux-riscv/20240607061335.2197383-1-cyrilbur@tenstorrent.com/ [1]
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Cyril Bur <cyrilbur@tenstorrent.com>
Link: https://lore.kernel.org/r/20240720170659.1522-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 20:58:25 -07:00
Palmer Dabbelt
7e340f4fad
Merge patch series "Svvptc extension to remove preventive sfence.vma"
Alexandre Ghiti <alexghiti@rivosinc.com> says:

In RISC-V, after a new mapping is established, a sfence.vma needs to be
emitted for different reasons:

- if the uarch caches invalid entries, we need to invalidate it otherwise
  we would trap on this invalid entry,
- if the uarch does not cache invalid entries, a reordered access could fail
  to see the new mapping and then trap (sfence.vma acts as a fence).

We can actually avoid emitting those (mostly) useless and costly sfence.vma
by handling the traps instead:

- for new kernel mappings: only vmalloc mappings need to be taken care of,
  other new mapping are rare and already emit the required sfence.vma if
  needed.
  That must be achieved very early in the exception path as explained in
  patch 3, and this also fixes our fragile way of dealing with vmalloc faults.

- for new user mappings: Svvptc makes update_mmu_cache() a no-op but we can
  take some gratuitous page faults (which are very unlikely though).

Patch 1 and 2 introduce Svvptc extension probing.

On our uarch that does not cache invalid entries and a 6.5 kernel, the
gains are measurable:

* Kernel boot:                  6%
* ltp - mmapstress01:           8%
* lmbench - lat_pagefault:      20%
* lmbench - lat_mmap:           5%

Here are the corresponding numbers of sfence.vma emitted:

* Ubuntu boot to login:
Before: ~630k sfence.vma
After:  ~200k sfence.vma

* ltp - mmapstress01
Before: ~45k
After:  ~6.3k

* lmbench - lat_pagefault
Before: ~665k
After:   832 (!)

* lmbench - lat_mmap
Before: ~546k
After:   718 (!)

Thanks to Ved and Matt Evans for triggering the discussion that led to
this patchset!

* b4-shazam-merge:
  riscv: Stop emitting preventive sfence.vma for new userspace mappings with Svvptc
  riscv: Stop emitting preventive sfence.vma for new vmalloc mappings
  dt-bindings: riscv: Add Svvptc ISA extension description
  riscv: Add ISA extension parsing for Svvptc

Link: https://lore.kernel.org/r/20240717060125.139416-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 20:58:24 -07:00
Steffen Persvold
1845d381f2
riscv: cacheinfo: Add back init_cache_level() function
commit 5944ce092b (arch_topology: Build cacheinfo from primary CPU)
removed the init_cache_level() function from arch/riscv/kernel/cacheinfo.c
and relies on the init_cpu_topology() function in drivers/base/arch_topology.c
to call fetch_cache_info() which in turn calls init_of_cache_level() to
populate the cache hierarchy information. However, init_cpu_topology() is only
called from smpboot.c:smp_prepare_cpus() and thus only available when
CONFIG_SMP is defined.

To support non-SMP enabled kernels to still detect cache hierarchy, we add back
the init_cache_level() function. The init_level_allocate_ci() function handles
this gracefully on SMP-enabled kernels anyway where fetch_cache_info() is
called from init_cpu_topology() earlier in the boot phase.

Signed-off-by: Steffen Persvold <spersvold@gmail.com>
Link: https://lore.kernel.org/r/20240707003515.5058-1-spersvold@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 20:15:50 -07:00
Palmer Dabbelt
f25170a053
Merge patch series "riscv: stacktrace: Add USER_STACKTRACE support"
Jinjie Ruan <ruanjinjie@huawei.com> says:

Add RISC-V USER_STACKTRACE support, and fix the fp alignment bug
in perf_callchain_user() by the way as Björn pointed out.

* b4-shazam-merge:
  riscv: stacktrace: Add USER_STACKTRACE support
  riscv: Fix fp alignment bug in perf_callchain_user()

Link: https://lore.kernel.org/r/20240708032847.2998158-1-ruanjinjie@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 20:15:47 -07:00
Alexandre Ghiti
503638e0ba
riscv: Stop emitting preventive sfence.vma for new vmalloc mappings
In 6.5, we removed the vmalloc fault path because that can't work (see
[1] [2]). Then in order to make sure that new page table entries were
seen by the page table walker, we had to preventively emit a sfence.vma
on all harts [3] but this solution is very costly since it relies on IPI.

And even there, we could end up in a loop of vmalloc faults if a vmalloc
allocation is done in the IPI path (for example if it is traced, see
[4]), which could result in a kernel stack overflow.

Those preventive sfence.vma needed to be emitted because:

- if the uarch caches invalid entries, the new mapping may not be
  observed by the page table walker and an invalidation may be needed.
- if the uarch does not cache invalid entries, a reordered access
  could "miss" the new mapping and traps: in that case, we would actually
  only need to retry the access, no sfence.vma is required.

So this patch removes those preventive sfence.vma and actually handles
the possible (and unlikely) exceptions. And since the kernel stacks
mappings lie in the vmalloc area, this handling must be done very early
when the trap is taken, at the very beginning of handle_exception: this
also rules out the vmalloc allocations in the fault path.

Link: https://lore.kernel.org/linux-riscv/20230531093817.665799-1-bjorn@kernel.org/ [1]
Link: https://lore.kernel.org/linux-riscv/20230801090927.2018653-1-dylan@andestech.com [2]
Link: https://lore.kernel.org/linux-riscv/20230725132246.817726-1-alexghiti@rivosinc.com/ [3]
Link: https://lore.kernel.org/lkml/20200508144043.13893-1-joro@8bytes.org/ [4]
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Yunhui Cui <cuiyunhui@bytedance.com>
Link: https://lore.kernel.org/r/20240717060125.139416-4-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 00:11:04 -07:00
Alexandre Ghiti
a6efe33cc5
riscv: Add ISA extension parsing for Svvptc
Add support to parse the Svvptc string in the riscv,isa string.

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240717060125.139416-2-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15 00:11:02 -07:00
Jinjie Ruan
1a74833182
riscv: stacktrace: Add USER_STACKTRACE support
Currently, userstacktrace is unsupported for riscv. So use the
perf_callchain_user() code as blueprint to implement the
arch_stack_walk_user() which add userstacktrace support on riscv.
Meanwhile, we can use arch_stack_walk_user() to simplify the implementation
of perf_callchain_user().

A ftrace test case is shown as below:

	# cd /sys/kernel/debug/tracing
	# echo 1 > options/userstacktrace
	# echo 1 > options/sym-userobj
	# echo 1 > events/sched/sched_process_fork/enable
	# cat trace
	......
	            bash-178     [000] ...1.    97.968395: sched_process_fork: comm=bash pid=178 child_comm=bash child_pid=231
	            bash-178     [000] ...1.    97.970075: <user stack trace>
	 => /lib/libc.so.6[+0xb5090]

Also a simple perf test is ok as below:

	# perf record -e cpu-clock --call-graph fp top
	# perf report --call-graph

	.....
	[[31m  66.54%[[m     0.00%  top      [kernel.kallsyms]            [k] ret_from_exception
            |
            ---ret_from_exception
               |
               |--[[31m58.97%[[m--do_trap_ecall_u
               |          |
               |          |--[[31m17.34%[[m--__riscv_sys_read
               |          |          ksys_read
               |          |          |
               |          |           --[[31m16.88%[[m--vfs_read
               |          |                     |
               |          |                     |--[[31m10.90%[[m--seq_read

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Tested-by: Jinjie Ruan <ruanjinjie@huawei.com>
Cc: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20240708032847.2998158-3-ruanjinjie@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-14 23:57:16 -07:00
Jinjie Ruan
22ab08955e
riscv: Fix fp alignment bug in perf_callchain_user()
The standard RISC-V calling convention said:
	"The stack grows downward and the stack pointer is always
	kept 16-byte aligned".

So perf_callchain_user() should check whether 16-byte aligned for fp.

Link: https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf

Fixes: dbeb90b0c1 ("riscv: Add perf callchain support")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Cc: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20240708032847.2998158-2-ruanjinjie@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-14 23:57:15 -07:00
Changbin Du
7587a3602b
riscv: vdso: do not strip debugging info for vdso.so.dbg
The vdso.so.dbg is a debug version of vdso and could be used for debugging
purpose. For example, perf-annotate requires debugging info to show source
lines. So let's keep its debugging info.

Signed-off-by: Changbin Du <changbin.du@huawei.com>
Reviewed-by: Cyril Bur <cyrilbur@tenstorrent.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240611040947.3024710-1-changbin.du@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-14 01:02:30 -07:00
Palmer Dabbelt
9ea7b92b77
Merge patch series "remove size limit on XIP kernel"
Nam Cao <namcao@linutronix.de> says:

Hi,

For XIP kernel, the writable data section is always at offset specified in
XIP_OFFSET, which is hard-coded to 32MB.

Unfortunately, this means the read-only section (placed before the
writable section) is restricted in size. This causes build failure if the
kernel gets too large.

This series remove the use of XIP_OFFSET one by one, then remove this
macro entirely at the end, with the goal of lifting this size restriction.

Also some cleanup and documentation along the way.

* b4-shazam-merge
  riscv: remove limit on the size of read-only section for XIP kernel
  riscv: drop the use of XIP_OFFSET in create_kernel_page_table()
  riscv: drop the use of XIP_OFFSET in kernel_mapping_va_to_pa()
  riscv: drop the use of XIP_OFFSET in XIP_FIXUP_FLASH_OFFSET
  riscv: drop the use of XIP_OFFSET in XIP_FIXUP_OFFSET
  riscv: replace misleading va_kernel_pa_offset on XIP kernel
  riscv: don't export va_kernel_pa_offset in vmcoreinfo for XIP kernel
  riscv: cleanup XIP_FIXUP macro
  riscv: change XIP's kernel_map.size to be size of the entire kernel
  ...

Link: https://lore.kernel.org/r/cover.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-12 07:23:05 -07:00
Nam Cao
b635a84bde
riscv: remove limit on the size of read-only section for XIP kernel
XIP_OFFSET is the hard-coded offset of writable data section within the
kernel.

By hard-coding this value, the read-only section of the kernel (which is
placed before the writable data section) is restricted in size. This causes
build failures if the kernel gets too big [1].

Remove this limit.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202404211031.J6l2AfJk-lkp@intel.com [1]
Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/3bf3a77be10ebb0d8086c028500baa16e7a8e648.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-12 07:23:02 -07:00
Nam Cao
f2df5b4fdd
riscv: don't export va_kernel_pa_offset in vmcoreinfo for XIP kernel
The crash utility uses va_kernel_pa_offset to translate virtual addresses.
This is incorrect in the case of XIP kernel, because va_kernel_pa_offset is
not the virtual-physical address offset (yes, the name is misleading; this
variable will be removed for XIP in a following commit).

Stop exporting this variable for XIP kernel. The replacement is to be
determined, note it as a TODO for now.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/8f8760d3f9a11af4ea0acbc247e4f49ff5d317e9.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-12 07:22:56 -07:00
Rafael J. Wysocki
45de40574f Merge branch 'acpi-riscv'
Merge ACPI and irqchip updates related to external interrupt controller
support on RISC-V:

 - Add ACPI device enumeration support for interrupt controller probing
   including taking dependencies into account (Sunil V L).

 - Implement ACPI-based interrupt controller probing on RISC-V (Sunil V L).

 - Add ACPI support for AIA in riscv-intc and add ACPI support to
   riscv-imsic, riscv-aplic, and sifive-plic (Sunil V L).

* acpi-riscv:
  irqchip/sifive-plic: Add ACPI support
  irqchip/riscv-aplic: Add ACPI support
  irqchip/riscv-imsic: Add ACPI support
  irqchip/riscv-imsic-state: Create separate function for DT
  irqchip/riscv-intc: Add ACPI support for AIA
  ACPI: RISC-V: Implement function to add implicit dependencies
  ACPI: RISC-V: Initialize GSI mapping structures
  ACPI: RISC-V: Implement function to reorder irqchip probe entries
  ACPI: RISC-V: Implement PCI related functionality
  ACPI: pci_link: Clear the dependencies after probe
  ACPI: bus: Add RINTC IRQ model for RISC-V
  ACPI: scan: Define weak function to populate dependencies
  ACPI: scan: Add RISC-V interrupt controllers to honor list
  ACPI: scan: Refactor dependency creation
  ACPI: bus: Add acpi_riscv_init() function
  ACPI: scan: Add a weak arch_sort_irqchip_probe() to order the IRQCHIP probe
  arm64: PCI: Migrate ACPI related functions to pci-acpi.c
2024-09-11 21:44:22 +02:00
Alexandre Ghiti
1ff95eb2be
riscv: Fix RISCV_ALTERNATIVE_EARLY
RISCV_ALTERNATIVE_EARLY will issue sbi_ecall() very early in the boot
process, before the first memory mapping is setup so we can't have any
instrumentation happening here.

In addition, when the kernel is relocatable, we must also not issue any
relocation this early since they would have been patched virtually only.

So, instead of disabling instrumentation for the whole kernel/sbi.c file
and compiling it with -fno-pie, simply move __sbi_ecall() and
__sbi_base_ecall() into their own file where this is fixed.

Reported-by: Conor Dooley <conor.dooley@microchip.com>
Closes: https://lore.kernel.org/linux-riscv/20240813-pony-truck-3e7a83e9759e@spud/
Reported-by: syzbot+cfbcb82adf6d7279fd35@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-riscv/00000000000065062c061fcec37b@google.com/
Fixes: 1745cfafeb ("riscv: don't use global static vars to store alternative data")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240829165048.49756-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-03 07:57:55 -07:00
Samuel Holland
b686ecdeac
riscv: misaligned: Restrict user access to kernel memory
raw_copy_{to,from}_user() do not call access_ok(), so this code allowed
userspace to access any virtual memory address.

Cc: stable@vger.kernel.org
Fixes: 7c83232161 ("riscv: add support for misaligned trap handling in S-mode")
Fixes: 441381506b ("riscv: misaligned: remove CONFIG_RISCV_M_MODE specific code")
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240815005714.1163136-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-31 17:43:38 -07:00
Sunil V L
01415e78cf ACPI: RISC-V: Implement PCI related functionality
Replace the dummy implementation for PCI related functions with actual
implementation. This needs ECAM and MCFG CONFIG options to be enabled
for RISC-V.

Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://patch.msgid.link/20240812005929.113499-10-sunilvl@ventanamicro.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-08-27 15:48:35 +02:00
Palmer Dabbelt
32d5f7add0
Merge patch series "RISC-V: hwprobe: Misaligned scalar perf fix and rename"
Evan Green <evan@rivosinc.com> says:

The CPUPERF0 hwprobe key was documented and identified in code as
a bitmask value, but its contents were an enum. This produced
incorrect behavior in conjunction with the WHICH_CPUS hwprobe flag.
The first patch in this series fixes the bitmask/enum problem by
creating a new hwprobe key that returns the same data, but is
properly described as a value instead of a bitmask. The second patch
renames the value definitions in preparation for adding vector misaligned
access info. As of this version, the old defines are kept in place to
maintain source compatibility with older userspace programs.

* b4-shazam-merge:
  RISC-V: hwprobe: Add SCALAR to misaligned perf defines
  RISC-V: hwprobe: Add MISALIGNED_PERF key

Link: https://lore.kernel.org/r/20240809214444.3257596-1-evan@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-15 13:12:21 -07:00
Alexandre Ghiti
e01d48c699
riscv: Fix out-of-bounds when accessing Andes per hart vendor extension array
The out-of-bounds access is reported by UBSAN:

[    0.000000] UBSAN: array-index-out-of-bounds in ../arch/riscv/kernel/vendor_extensions.c:41:66
[    0.000000] index -1 is out of range for type 'riscv_isavendorinfo [32]'
[    0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc2ubuntu-defconfig #2
[    0.000000] Hardware name: riscv-virtio,qemu (DT)
[    0.000000] Call Trace:
[    0.000000] [<ffffffff94e078ba>] dump_backtrace+0x32/0x40
[    0.000000] [<ffffffff95c83c1a>] show_stack+0x38/0x44
[    0.000000] [<ffffffff95c94614>] dump_stack_lvl+0x70/0x9c
[    0.000000] [<ffffffff95c94658>] dump_stack+0x18/0x20
[    0.000000] [<ffffffff95c8bbb2>] ubsan_epilogue+0x10/0x46
[    0.000000] [<ffffffff95485a82>] __ubsan_handle_out_of_bounds+0x94/0x9c
[    0.000000] [<ffffffff94e09442>] __riscv_isa_vendor_extension_available+0x90/0x92
[    0.000000] [<ffffffff94e043b6>] riscv_cpufeature_patch_func+0xc4/0x148
[    0.000000] [<ffffffff94e035f8>] _apply_alternatives+0x42/0x50
[    0.000000] [<ffffffff95e04196>] apply_boot_alternatives+0x3c/0x100
[    0.000000] [<ffffffff95e05b52>] setup_arch+0x85a/0x8bc
[    0.000000] [<ffffffff95e00ca0>] start_kernel+0xa4/0xfb6

The dereferencing using cpu should actually not happen, so remove it.

Fixes: 23c996fc2b ("riscv: Extend cpufeature.c to detect vendor extensions")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240814192619.276794-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-15 13:12:16 -07:00
Ying Sun
c6ebf2c528
riscv/kexec_file: Fix relocation type R_RISCV_ADD16 and R_RISCV_SUB16 unknown
Runs on the kernel with CONFIG_RISCV_ALTERNATIVE enabled:
  kexec -sl vmlinux

Error:
  kexec_image: Unknown rela relocation: 34
  kexec_image: Error loading purgatory ret=-8
and
  kexec_image: Unknown rela relocation: 38
  kexec_image: Error loading purgatory ret=-8

The purgatory code uses the 16-bit addition and subtraction relocation
type, but not handled, resulting in kexec_file_load failure.
So add handle to arch_kexec_apply_relocations_add().

Tested on RISC-V64 Qemu-virt, issue fixed.

Co-developed-by: Petr Tesarik <petr@tesarici.cz>
Signed-off-by: Petr Tesarik <petr@tesarici.cz>
Signed-off-by: Ying Sun <sunying@isrc.iscas.ac.cn>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240711083236.2859632-1-sunying@isrc.iscas.ac.cn
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-14 17:44:33 -07:00
Evan Green
1f5288874d
RISC-V: hwprobe: Add SCALAR to misaligned perf defines
In preparation for misaligned vector performance hwprobe keys, rename
the hwprobe key values associated with misaligned scalar accesses to
include the term SCALAR. Leave the old defines in place to maintain
source compatibility.

This change is intended to be a functional no-op.

Signed-off-by: Evan Green <evan@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240809214444.3257596-3-evan@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-14 13:13:24 -07:00
Evan Green
c42e2f0767
RISC-V: hwprobe: Add MISALIGNED_PERF key
RISCV_HWPROBE_KEY_CPUPERF_0 was mistakenly flagged as a bitmask in
hwprobe_key_is_bitmask(), when in reality it was an enum value. This
causes problems when used in conjunction with RISCV_HWPROBE_WHICH_CPUS,
since SLOW, FAST, and EMULATED have values whose bits overlap with
each other. If the caller asked for the set of CPUs that was SLOW or
EMULATED, the returned set would also include CPUs that were FAST.

Introduce a new hwprobe key, RISCV_HWPROBE_KEY_MISALIGNED_PERF, which
returns the same values in response to a direct query (with no flags),
but is properly handled as an enumerated value. As a result, SLOW,
FAST, and EMULATED are all correctly treated as distinct values under
the new key when queried with the WHICH_CPUS flag.

Leave the old key in place to avoid disturbing applications which may
have already come to rely on the key, with or without its broken
behavior with respect to the WHICH_CPUS flag.

Fixes: e178bf146e ("RISC-V: hwprobe: Introduce which-cpus flag")
Signed-off-by: Evan Green <evan@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240809214444.3257596-2-evan@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-14 13:13:23 -07:00
Haibo Xu
a445699879
RISC-V: ACPI: NUMA: initialize all values of acpi_early_node_map to NUMA_NO_NODE
Currently, only acpi_early_node_map[0] was initialized to NUMA_NO_NODE.
To ensure all the values were properly initialized, switch to initialize
all of them to NUMA_NO_NODE.

Fixes: eabd9db64e ("ACPI: RISCV: Add NUMA support based on SRAT and SLIT")
Reported-by: Andrew Jones <ajones@ventanamicro.com>
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Link: https://lore.kernel.org/r/0d362a8ae50558b95685da4c821b2ae9e8cf78be.1722828421.git.haibo1.xu@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-14 13:12:41 -07:00
Celeste Liu
6111939463
riscv: entry: always initialize regs->a0 to -ENOSYS
Otherwise when the tracer changes syscall number to -1, the kernel fails
to initialize a0 with -ENOSYS and subsequently fails to return the error
code of the failed syscall to userspace. For example, it will break
strace syscall tampering.

Fixes: 52449c17bd ("riscv: entry: set a0 = -ENOSYS only when syscall != -1")
Reported-by: "Dmitry V. Levin" <ldv@strace.io>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Cc: stable@vger.kernel.org
Signed-off-by: Celeste Liu <CoelacanthusHex@gmail.com>
Link: https://lore.kernel.org/r/20240627142338.5114-2-CoelacanthusHex@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-14 13:12:22 -07:00
Ryo Takakura
f15c21a3de
RISC-V: Enable IPI CPU Backtrace
Add arch_trigger_cpumask_backtrace() which is a generic infrastructure
for sampling other CPUs' backtrace using IPI.

The feature is used when lockups are detected or in case of oops/panic
if parameters are set accordingly.

Below is the case of oops with the oops_all_cpu_backtrace enabled.

$ sysctl kernel.oops_all_cpu_backtrace=1

triggering oops shows:
[  212.214237] NMI backtrace for cpu 1
[  212.214390] CPU: 1 PID: 610 Comm: in:imklog Tainted: G           OE      6.10.0-rc6 #1
[  212.214570] Hardware name: riscv-virtio,qemu (DT)
[  212.214690] epc : fallback_scalar_usercopy+0x8/0xdc
[  212.214809]  ra : _copy_to_user+0x20/0x40
[  212.214913] epc : ffffffff80c3a930 ra : ffffffff8059ba7e sp : ff20000000eabb50
[  212.215061]  gp : ffffffff82066f90 tp : ff6000008e958000 t0 : 3463303866660000
[  212.215210]  t1 : 000000000000005b t2 : 3463303866666666 s0 : ff20000000eabb60
[  212.215358]  s1 : 0000000000000386 a0 : 00007ff6e81df926 a1 : ff600000824df800
[  212.215505]  a2 : 000000000000003f a3 : 7fffffffffffffc0 a4 : 0000000000000000
[  212.215651]  a5 : 000000000000003f a6 : 0000000000000000 a7 : 0000000000000000
[  212.215857]  s2 : ff600000824df800 s3 : ffffffff82066cc0 s4 : 0000000000001c1a
[  212.216074]  s5 : ffffffff8206a5a8 s6 : 00007ff6e81df926 s7 : ffffffff8206a5a0
[  212.216278]  s8 : ff600000824df800 s9 : ffffffff81e25de0 s10: 000000000000003f
[  212.216471]  s11: ffffffff8206a59d t3 : ff600000824df812 t4 : ff600000824df812
[  212.216651]  t5 : ff600000824df818 t6 : 0000000000040000
[  212.216796] status: 0000000000040120 badaddr: 0000000000000000 cause: 8000000000000001
[  212.217035] [<ffffffff80c3a930>] fallback_scalar_usercopy+0x8/0xdc
[  212.217207] [<ffffffff80095f56>] syslog_print+0x1f4/0x2b2
[  212.217362] [<ffffffff80096e5c>] do_syslog.part.0+0x94/0x2d8
[  212.217502] [<ffffffff800979e8>] do_syslog+0x66/0x88
[  212.217636] [<ffffffff803a5dda>] kmsg_read+0x44/0x5c
[  212.217764] [<ffffffff80392dbe>] proc_reg_read+0x7a/0xa8
[  212.217952] [<ffffffff802ff726>] vfs_read+0xb0/0x24e
[  212.218090] [<ffffffff803001ba>] ksys_read+0x64/0xe4
[  212.218264] [<ffffffff8030025a>] __riscv_sys_read+0x20/0x2c
[  212.218453] [<ffffffff80c4af9a>] do_trap_ecall_u+0x60/0x1d4
[  212.218664] [<ffffffff80c56998>] ret_from_exception+0x0/0x64

Signed-off-by: Ryo Takakura <takakura@valinux.co.jp>
Link: https://lore.kernel.org/r/20240718093659.158912-1-takakura@valinux.co.jp
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-07 07:11:59 -07:00
Alexandre Ghiti
ee9a68394b
riscv: Re-introduce global icache flush in patch_text_XXX()
commit edf2d546bf ("riscv: patch: Flush the icache right after
patching to avoid illegal insns") mistakenly removed the global icache
flush in patch_text_nosync() and patch_text_set_nosync() functions, so
reintroduce them.

Fixes: edf2d546bf ("riscv: patch: Flush the icache right after patching to avoid illegal insns")
Reported-by: Samuel Holland <samuel.holland@sifive.com>
Closes: https://lore.kernel.org/linux-riscv/a28ddc26-d77a-470a-a33f-88144f717e86@sifive.com/
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240801191404.55181-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-06 06:49:14 -07:00
Palmer Dabbelt
7c08a2615f
Merge patch series "RISC-V: Parse DT for Zkr to seed KASLR"
Jesse Taube <jesse@rivosinc.com> says:

Add functions to pi/fdt_early.c to help parse the FDT to check if
the isa string has the Zkr extension. Then use the Zkr extension to
seed the KASLR base address.

The first two patches fix the visibility of symbols.

* b4-shazam-merge:
  RISC-V: Use Zkr to seed KASLR base address
  RISC-V: pi: Add kernel/pi/pi.h
  RISC-V: lib: Add pi aliases for string functions
  RISC-V: pi: Force hidden visibility for all symbol references

Link: https://lore.kernel.org/r/20240709173937.510084-1-jesse@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-05 12:06:43 -07:00
Jesse Taube
945302df3d
RISC-V: Use Zkr to seed KASLR base address
Parse the device tree for Zkr in the isa string.
If Zkr is present, use it to seed the kernel base address.

On an ACPI system, as of this commit, there is no easy way to check if
Zkr is present. Blindly running the instruction isn't an option as;
we have to be able to trust the firmware.

Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Zong Li <zong.li@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240709173937.510084-5-jesse@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-05 12:06:41 -07:00
Jesse Taube
b331182715
RISC-V: pi: Add kernel/pi/pi.h
Add pi.h header for declarations of the kernel/pi prefixed functions
and any other related declarations.

Suggested-by: Charlie Jenkins <charlie@rivosinc.com>
Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240709173937.510084-4-jesse@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-05 12:06:40 -07:00
Jesse Taube
14c3ec6723
RISC-V: pi: Force hidden visibility for all symbol references
Eliminate all GOT entries in the .pi section, by forcing hidden
visibility for all symbol references, which informs the compiler that
such references will be resolved at link time without the need for
allocating GOT entries.

Include linux/hidden.h in Makefile, like arm64, for the
hidden visibility attribute.

Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240709173937.510084-2-jesse@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-05 12:06:38 -07:00
Linus Torvalds
948752d2e0 RISC-V Fixes for 6.11-rc2
* A fix to avoid dropping some of the internal pseudo-extensions, which
   breaks *envcfg dependency parsing.
 * The kernel entry address is now aligned in purgatory, which avoids a
   misaligned load that can lead to crash on systems that don't support
   misaligned accesses early in boot.
 * The FW_SFENCE_VMA_RECEIVED perf event was duplicated in a handful of
   perf JSON configurations, one of them been updated to
   FW_SFENCE_VMA_ASID_SENT.
 * The starfive cache driver is now restricted to 64-bit systems, as it
   isn't 32-bit clean.
 * A fix for to avoid aliasing legacy-mode perf counters with software
   perf counters.
 * VM_FAULT_SIGSEGV is now handled in the page fault code.
 * A fix for stalls during CPU hotplug due to IPIs being disabled.
 * A fix for memblock bounds checking.  This manifests as a crash on
   systems with discontinuous memory maps that have regions that don't
   fit in the linear map.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmas/qwTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiWp7EACDcorcihBG8uSsX//GKJPjkiGIbZkT
 MIMN3yqIzJuSftxpvgVxpyq2MFKYy7BK/75sK+4VoQpoCJEtdxbdh0JUqck/Nrgj
 Kn0hxWy7RO6Rp9ggf9dTdca64Tdxh32Eegpum3E46zuhYQBMcNze4z4NsOXs6ems
 254ww8+v7V5R7FGsxm1PG4Hs3soxZ9FPdWE69ndxmjr9N5FFkchk5YbV8AgKYtSJ
 sfu5Q+68zh58GVZhn0usug0fHNgVzdvwy3PIBDGD58hqIDAs9WlF80MiW3sESTIe
 PrJcAFBU4tHp+8h+OMaKw2xfybrZpNmqobx7dED34PJu0R4+Uvz7MUKMMPUJeB+q
 7UOZokjF2Hvd5VsAeTc1PisvzVsWkWpkzJqZmdaTr2m8J4m5z7/nby+ZcXmoOlVz
 JiMDgrkM4KIziq++9bYbBfcxsS9dMsvNtEQAHByL/zdVfAFTvWUMUmAgg27C3K9Z
 QbHfbpxqQ/pEu4CsRUIx4GnkEKnWPLuGovnYboGmC3BCDwQkkV8H0tcEhJtWMKte
 6h+vvKBX2POS4l8467ElmcTRv5Cfpi/dmhZrC9SHHQhNF5OiHHM2CmSEOKS1bUPj
 e4+k/QGmVQOAJGRRPkpD+DFMhHT/jhvbYV4kDXr/h9AKJQ2eWRGMSOMaPJ/X311N
 R5W1yiJilIhXuQ==
 =K52W
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - A fix to avoid dropping some of the internal pseudo-extensions, which
   breaks *envcfg dependency parsing

 - The kernel entry address is now aligned in purgatory, which avoids a
   misaligned load that can lead to crash on systems that don't support
   misaligned accesses early in boot

 - The FW_SFENCE_VMA_RECEIVED perf event was duplicated in a handful of
   perf JSON configurations, one of them been updated to
   FW_SFENCE_VMA_ASID_SENT

 - The starfive cache driver is now restricted to 64-bit systems, as it
   isn't 32-bit clean

 - A fix for to avoid aliasing legacy-mode perf counters with software
   perf counters

 - VM_FAULT_SIGSEGV is now handled in the page fault code

 - A fix for stalls during CPU hotplug due to IPIs being disabled

 - A fix for memblock bounds checking. This manifests as a crash on
   systems with discontinuous memory maps that have regions that don't
   fit in the linear map

* tag 'riscv-for-linus-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Fix linear mapping checks for non-contiguous memory regions
  RISC-V: Enable the IPI before workqueue_online_cpu()
  riscv/mm: Add handling for VM_FAULT_SIGSEGV in mm_fault_error()
  perf: riscv: Fix selecting counters in legacy mode
  cache: StarFive: Require a 64-bit system
  perf arch events: Fix duplicate RISC-V SBI firmware event name
  riscv/purgatory: align riscv_kernel_entry
  riscv: cpufeature: Do not drop Linux-internal extensions
2024-08-02 09:33:35 -07:00
Arnd Bergmann
343416f0c1 syscalls: fix syscall macros for newfstat/newfstatat
The __NR_newfstat and __NR_newfstatat macros accidentally got renamed
in the conversion to the syscall.tbl format, dropping the 'new' portion
of the name.

In an unrelated change, the two syscalls are no longer architecture
specific but are once more defined on all 64-bit architectures, so the
'newstat' ABI keyword can be dropped from the table as a simplification.

Fixes: Fixes: 4fe53bf2ba ("syscalls: add generic scripts/syscall.tbl")
Closes: https://lore.kernel.org/lkml/838053e0-b186-4e9f-9668-9a3384a71f23@app.fastmail.com/T/#t
Reported-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-08-02 15:20:47 +02:00