Commit Graph

400 Commits

Author SHA1 Message Date
Andrew Jones
7ea5a73617
RISC-V: Add Zicboz detection and block size parsing
Parse "riscv,cboz-block-size" from the DT by piggybacking on Zicbom's
riscv_init_cbom_blocksize(). Additionally check the DT for the presence
of the "zicboz" extension and, when it's present, validate the parsed
cboz block size as we do Zicbom's cbom block size with
riscv_isa_extension_check().

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230224162631.405473-5-ajones@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-14 21:26:04 -07:00
Andrew Jones
8b05e7d040
RISC-V: Factor out body of riscv_init_cbom_blocksize loop
Refactor riscv_init_cbom_blocksize() to prepare for it to be used
for both cbom block size and cboz block size.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230224162631.405473-3-ajones@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-14 21:26:02 -07:00
Dylan Jhong
47dd902aae
RISC-V: mm: Support huge page in vmalloc_fault()
Since RISC-V supports ioremap() with huge page (pud/pmd) mapping,
However, vmalloc_fault() assumes that the vmalloc range is limited
to pte mappings. To complete the vmalloc_fault() function by adding
huge page support.

Fixes: 310f541a02 ("riscv: Enable HAVE_ARCH_HUGE_VMAP for 64BIT")
Cc: stable@vger.kernel.org
Signed-off-by: Dylan Jhong <dylan@andestech.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20230310075021.3919290-1-dylan@andestech.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-14 19:15:34 -07:00
Palmer Dabbelt
4a4c459872
Merge patch series "riscv, mm: detect svnapot cpu support at runtime"
Qinglin Pan <panqinglin00@gmail.com> says:

Svnapot is a RISC-V extension for marking contiguous 4K pages as a non-4K
page. This patch set is for using Svnapot in hugetlb fs and huge vmap.

This patchset adds a Kconfig item for using Svnapot in
"Platform type"->"SVNAPOT extension support". Its default value is on,
and people can set it off if they don't allow kernel to detect Svnapot
hardware support and leverage it.

Tested on:
  - qemu rv64 with "Svnapot support" off and svnapot=true.
  - qemu rv64 with "Svnapot support" on and svnapot=true.
  - qemu rv64 with "Svnapot support" off and svnapot=false.
  - qemu rv64 with "Svnapot support" on and svnapot=false.

* b4-shazam-merge:
  riscv: mm: support Svnapot in huge vmap
  riscv: mm: support Svnapot in hugetlb page
  riscv: mm: modify pte format for Svnapot

Link: https://lore.kernel.org/r/20230209131647.17245-1-panqinglin00@gmail.com
[Palmer: fix up the feature ordering in the merge]
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-09 18:13:45 -08:00
Palmer Dabbelt
9b7fef255c
Merge patch series "riscv: asid: switch to alternative way to fix stale TLB entries"
Sergey Matyukevich <geomatsi@gmail.com> says:

Some time ago two different patches have been posted to fix stale TLB
entries that caused applications crashes.

The patch [0] suggested 'aggregating' mm_cpumask, i.e. current cpu is not
cleared for the switched-out task in switch_mm function. For additional
explanations see the commit message by Guo Ren. The same approach is
used by arc architecture, so another good comment is for switch_mm
in arch/arc/include/asm/mmu_context.h.

The patch [1] attempted to reduce the number of TLB flushes by deferring
(and possibly avoiding) them for CPUs not running the task.

Patch [1] has been merged. However we already have two bug reports from
different vendors. So apparently something is missing in the approach
suggested in [1]. In both cases the patch [0] fixed the issue.

This patch series reverts [1] and replaces it by [0].

[0] https://lore.kernel.org/linux-riscv/20221111075902.798571-1-guoren@kernel.org/
[1] https://lore.kernel.org/linux-riscv/20220829205219.283543-1-geomatsi@gmail.com/

* b4-shazam-merge:
  riscv: asid: Fixup stale TLB entry cause application crash
  Revert "riscv: mm: notify remote harts about mmu cache updates"

Link: https://lore.kernel.org/r/20230226150137.1919750-1-geomatsi@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-09 15:22:05 -08:00
Guo Ren
82dd33fde0
riscv: asid: Fixup stale TLB entry cause application crash
After use_asid_allocator is enabled, the userspace application will
crash by stale TLB entries. Because only using cpumask_clear_cpu without
local_flush_tlb_all couldn't guarantee CPU's TLB entries were fresh.
Then set_mm_asid would cause the user space application to get a stale
value by stale TLB entry, but set_mm_noasid is okay.

Here is the symptom of the bug:
unhandled signal 11 code 0x1 (coredump)
   0x0000003fd6d22524 <+4>:     auipc   s0,0x70
   0x0000003fd6d22528 <+8>:     ld      s0,-148(s0) # 0x3fd6d92490
=> 0x0000003fd6d2252c <+12>:    ld      a5,0(s0)
(gdb) i r s0
s0          0x8082ed1cc3198b21       0x8082ed1cc3198b21
(gdb) x /2x 0x3fd6d92490
0x3fd6d92490:   0xd80ac8a8      0x0000003f
The core dump file shows that register s0 is wrong, but the value in
memory is correct. Because 'ld s0, -148(s0)' used a stale mapping entry
in TLB and got a wrong result from an incorrect physical address.

When the task ran on CPU0, which loaded/speculative-loaded the value of
address(0x3fd6d92490), then the first version of the mapping entry was
PTWed into CPU0's TLB.
When the task switched from CPU0 to CPU1 (No local_tlb_flush_all here by
asid), it happened to write a value on the address (0x3fd6d92490). It
caused do_page_fault -> wp_page_copy -> ptep_clear_flush ->
ptep_get_and_clear & flush_tlb_page.
The flush_tlb_page used mm_cpumask(mm) to determine which CPUs need TLB
flush, but CPU0 had cleared the CPU0's mm_cpumask in the previous
switch_mm. So we only flushed the CPU1 TLB and set the second version
mapping of the PTE. When the task switched from CPU1 to CPU0 again, CPU0
still used a stale TLB mapping entry which contained a wrong target
physical address. It raised a bug when the task happened to read that
value.

   CPU0                               CPU1
   - switch 'task' in
   - read addr (Fill stale mapping
     entry into TLB)
   - switch 'task' out (no tlb_flush)
                                      - switch 'task' in (no tlb_flush)
                                      - write addr cause pagefault
                                        do_page_fault() (change to
                                        new addr mapping)
                                          wp_page_copy()
                                            ptep_clear_flush()
                                              ptep_get_and_clear()
                                              & flush_tlb_page()
                                        write new value into addr
                                      - switch 'task' out (no tlb_flush)
   - switch 'task' in (no tlb_flush)
   - read addr again (Use stale
     mapping entry in TLB)
     get wrong value from old phyical
     addr, BUG!

The solution is to keep all CPUs' footmarks of cpumask(mm) in switch_mm,
which could guarantee to invalidate all stale TLB entries during TLB
flush.

Fixes: 65d4b9c530 ("RISC-V: Implement ASID allocator")
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Tested-by: Zong Li <zong.li@sifive.com>
Tested-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com>
Cc: Anup Patel <apatel@ventanamicro.com>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Cc: stable@vger.kernel.org
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20230226150137.1919750-3-geomatsi@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-09 15:22:02 -08:00
Sergey Matyukevich
e921050022
Revert "riscv: mm: notify remote harts about mmu cache updates"
This reverts the remaining bits of commit 4bd1d80efb ("riscv: mm:
notify remote harts harts about mmu cache updates").

According to bug reports, suggested approach to fix stale TLB entries
is not sufficient. It needs to be replaced by a more robust solution.

Fixes: 4bd1d80efb ("riscv: mm: notify remote harts about mmu cache updates")
Reported-by: Zong Li <zong.li@sifive.com>
Reported-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com>
Cc: stable@vger.kernel.org
Reviewed-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/20230226150137.1919750-2-geomatsi@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-09 15:22:01 -08:00
Qinglin Pan
82a1a1f3bf
riscv: mm: support Svnapot in hugetlb page
Svnapot can be used to support 64KB hugetlb page, so it can become a new
option when using hugetlbfs. Add a basic implementation of hugetlb page,
and support 64KB as a size in it by using Svnapot.

For test, boot kernel with command line contains "default_hugepagesz=64K
hugepagesz=64K hugepages=20" and run a simple test like this:

tools/testing/selftests/vm/map_hugetlb 1 16

And it should be passed.

Signed-off-by: Qinglin Pan <panqinglin00@gmail.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20230209131647.17245-3-panqinglin00@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-07 19:39:16 -08:00
Linus Torvalds
1a8d05a726 VM_FAULT_RETRY fixes
Some of the page fault handlers do not deal with the following case
 correctly:
 	* handle_mm_fault() has returned VM_FAULT_RETRY
 	* there is a pending fatal signal
 	* fault had happened in kernel mode
 Correct action in such case is not "return unconditionally" - fatal
 signals are handled only upon return to userland and something like
 copy_to_user() would end up retrying the faulting instruction and
 triggering the same fault again and again.
 
 What we need to do in such case is to make the caller to treat that
 as failed uaccess attempt - handle exception if there is an exception
 handler for faulting instruction or oops if there isn't one.
 
 Over the years some architectures had been fixed and now are handling
 that case properly; some still do not.  This series should fix the
 remaining ones.
 
 Status:
 	m68k, riscv, hexagon, parisc: tested/acked by maintainers.
 	alpha, sparc32, sparc64: tested locally - bug has been
 reproduced on the unpatched kernel and verified to be fixed by
 this series.
 	ia64, microblaze, nios2, openrisc: build, but otherwise
 completely untested.
 
 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCZAP54AAKCRBZ7Krx/gZQ
 6/IjAP92oS8kDuodAtT7WorV+GETWr2HFKfvDiPSmEGiSnXIigD9Hj9svXyeeAgl
 /TqGR50Yrvn3IUQ0A0bUDpBAG1qyVwY=
 =9+Bd
 -----END PGP SIGNATURE-----

Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull VM_FAULT_RETRY fixes from Al Viro:
 "Some of the page fault handlers do not deal with the following case
  correctly:

   - handle_mm_fault() has returned VM_FAULT_RETRY

   - there is a pending fatal signal

   - fault had happened in kernel mode

  Correct action in such case is not "return unconditionally" - fatal
  signals are handled only upon return to userland and something like
  copy_to_user() would end up retrying the faulting instruction and
  triggering the same fault again and again.

  What we need to do in such case is to make the caller to treat that as
  failed uaccess attempt - handle exception if there is an exception
  handler for faulting instruction or oops if there isn't one.

  Over the years some architectures had been fixed and now are handling
  that case properly; some still do not. This series should fix the
  remaining ones.

  Status:

   - m68k, riscv, hexagon, parisc: tested/acked by maintainers.

   - alpha, sparc32, sparc64: tested locally - bug has been reproduced
     on the unpatched kernel and verified to be fixed by this series.

   - ia64, microblaze, nios2, openrisc: build, but otherwise completely
     untested"

* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  openrisc: fix livelock in uaccess
  nios2: fix livelock in uaccess
  microblaze: fix livelock in uaccess
  ia64: fix livelock in uaccess
  sparc: fix livelock in uaccess
  alpha: fix livelock in uaccess
  parisc: fix livelock in uaccess
  hexagon: fix livelock in uaccess
  riscv: fix livelock in uaccess
  m68k: fix livelock in uaccess
2023-03-05 11:07:58 -08:00
Al Viro
d835eb3a57 riscv: fix livelock in uaccess
riscv equivalent of 26178ec11e "x86: mm: consolidate VM_FAULT_RETRY handling"
If e.g. get_user() triggers a page fault and a fatal signal is caught, we might
end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything
to page tables.  In such case we must *not* return to the faulting insn -
that would repeat the entire thing without making any progress; what we need
instead is to treat that as failed (user) memory access.

Tested-by: Björn Töpel <bjorn@kernel.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2023-03-02 12:30:15 -05:00
Linus Torvalds
01687e7c93 RISC-V Patches for the 6.3 Merge Window, Part 1
There's a bunch of fixes/cleanups throughout the tree as usual, but we
 also have a handful of new features.
 
 * Various improvements to the extension detection and alternative
   patching infrastructure.
 * Zbb-optimized string routines.
 * Support for cpu-capacity in the RISC-V DT bindings.
 * Zicbom no longer depends on toolchain support.
 * Some performance and code size improvements to ftrace.
 * Support for ARCH_WANT_LD_ORPHAN_WARN.
 * Oops now contain the faulting instruction.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmP49coTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiTFnEACuQtGJWSwzH+ORswVyItKzqHcBMU3t
 rWHTFxQ7MdWgQO8nrWUtSypGY4n0DFTCe9w4H3tQFRDaTXbI+ycFjidEDt3eJCMb
 n6WiGuZpdKVS81CQ0Es4dTWQ1i/28fe1851CGK/PkybXdrPPofdCJ9k3Wepxflb/
 2UYxRDyjKt3KbJ2OmN2oF8Ek1rrsGhIC/Dhbdb2JsGZhYF10ZYjquaOLs31WbHMG
 O+n/N/JfZRAif1MDQ71ygAm9KV0kGqe/wcRtsJGETwJ8U3I/cjn2mAGd8BRdy4iL
 9GFmTmi8q27ntUbakikNz3b4aE9xVnLDvRIyOciI3l8rQjrFAsfnQbuRwlaq6BVJ
 BF3e6nAjkcLj23FhbROTlfncEOzrklbNZ+uQIuvyffAUjDoePw9x7o0r+qj7FnOY
 WMfNecJMeE5OGVBqHSVFEcAMlN6uYu6wqbEipEpc+8sTg+w1LM0bUVNhV86/BrnL
 bh+4+7MPYtg45vy2Y8AuPUBFqR2uCekDpbxciCEGsaIzUYRas2zrt9UkWGjKs1VV
 q0qeLSNNA1wBq+q6FprTceipFQIqD5KnmI2GMucF6v4YFg5AzeSOpRc6aeqcs7Z2
 +ApShSOFPjjntZbcpTgkvhrPExr0Jel0xY7YSazUUqY0xOHUwGNBEh/E4rzsRLxr
 qvUpFAIZT60dfQ==
 =XgYl
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.3-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:
 "There's a bunch of fixes/cleanups throughout the tree as usual, but we
  also have a handful of new features:

   - Various improvements to the extension detection and alternative
     patching infrastructure

   - Zbb-optimized string routines

   - Support for cpu-capacity in the RISC-V DT bindings

   - Zicbom no longer depends on toolchain support

   - Some performance and code size improvements to ftrace

   - Support for ARCH_WANT_LD_ORPHAN_WARN

   - Oops now contain the faulting instruction"

* tag 'riscv-for-linus-6.3-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (67 commits)
  RISC-V: add a spin_shadow_stack declaration
  riscv: mm: hugetlb: Enable ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
  riscv: Add header include guards to insn.h
  riscv: alternative: proceed one more instruction for auipc/jalr pair
  riscv: Avoid enabling interrupts in die()
  riscv, mm: Perform BPF exhandler fixup on page fault
  RISC-V: take text_mutex during alternative patching
  riscv: hwcap: Don't alphabetize ISA extension IDs
  RISC-V: fix ordering of Zbb extension
  riscv: jump_label: Fixup unaligned arch_static_branch function
  RISC-V: Only provide the single-letter extensions in HWCAP
  riscv: mm: fix regression due to update_mmu_cache change
  scripts/decodecode: Add support for RISC-V
  riscv: Add instruction dump to RISC-V splats
  riscv: select ARCH_WANT_LD_ORPHAN_WARN for !XIP_KERNEL
  riscv: vmlinux.lds.S: explicitly catch .init.bss sections from EFI stub
  riscv: vmlinux.lds.S: explicitly catch .riscv.attributes sections
  riscv: vmlinux.lds.S: explicitly catch .rela.dyn symbols
  riscv: lds: define RUNTIME_DISCARD_EXIT
  RISC-V: move some stray __RISCV_INSN_FUNCS definitions from kprobes
  ...
2023-02-25 11:14:08 -08:00
Björn Töpel
416721ff05
riscv, mm: Perform BPF exhandler fixup on page fault
Commit 21855cac82 ("riscv/mm: Prevent kernel module to access user
memory without uaccess routines") added early exits/deaths for page
faults stemming from accesses to user-space without using proper
uaccess routines (where sstatus.SUM is set).

Unfortunatly, this is too strict for some BPF programs, which relies
on BPF exhandler fixups. These BPF programs loads "BTF pointers". A
BTF pointers could either be a valid kernel pointer or NULL, but not a
userspace address.

Resolve the problem by calling the fixup handler in the early exit
path.

Fixes: 21855cac82 ("riscv/mm: Prevent kernel module to access user memory without uaccess routines")
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20230214162515.184827-1-bjorn@kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-02-21 17:21:39 -08:00
Guo Ren
950b879b7f
riscv: Fixup race condition on PG_dcache_clean in flush_icache_pte
In commit 588a513d34 ("arm64: Fix race condition on PG_dcache_clean
in __sync_icache_dcache()"), we found RISC-V has the same issue as the
previous arm64. The previous implementation didn't guarantee the correct
sequence of operations, which means flush_icache_all() hasn't been
called when the PG_dcache_clean was set. That would cause a risk of page
synchronization.

Fixes: 08f051eda3 ("RISC-V: Flush I$ when making a dirty page executable")
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230127035306.1819561-1-guoren@kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-02-09 11:40:32 -08:00
Mayuresh Chitale
f0293cd1f4
riscv: mm: Implement pmdp_collapse_flush for THP
When THP is enabled, 4K pages are collapsed into a single huge
page using the generic pmdp_collapse_flush() which will further
use flush_tlb_range() to shoot-down stale TLB entries. Unfortunately,
the generic pmdp_collapse_flush() only invalidates cached leaf PTEs
using address specific SFENCEs which results in repetitive (or
unpredictable) page faults on RISC-V implementations which cache
non-leaf PTEs.

Provide a RISC-V specific pmdp_collapse_flush() which ensures both
cached leaf and non-leaf PTEs are invalidated by using non-address
specific SFENCEs as recommended by the RISC-V privileged specification.

Fixes: e88b333142 ("riscv: mm: add THP support on 64-bit")
Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Link: https://lore.kernel.org/r/20230130074815.1694055-1-mchitale@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-02-01 20:52:09 -08:00
Linus Torvalds
eb67d239f3 RISC-V Patches for the 6.2 Merge Window, Part 1
* Support for the T-Head PMU via the perf subsystem.
 * ftrace support for rv32.
 * Support for non-volatile memory devices.
 * Various fixes and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmOZ6WsTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiWGcD/wLGiHq3ekQhl5D+CaA1WlJ5XzQFfY2
 bv1ZCZGdjuiv66jiMlmEsbpfUCk3bSAIjCO3MHQNDmTuPJztCHVJXOHbZFWItzzO
 soW4nXHKW1sGHa7hDLGQUPkltA48OdPoyqEDvlnpyEWFT+2xHwdFEURWE85FXGeq
 ZzFSKUQqX/V52n9TS4M4QtmNnQatR3TgIs8ttzD4JqwWFBbp4/iBfIGt6n3W24XH
 9lKWikO4YOYUPl0KVIakM4d8NmX7g+7vhCKWavLke1fF/IQOlyWwA0eM8ryj33OG
 L1nFkqfF3mCw9i72WHftlc0rAgVqcYS8ntnQkPNpt2zPp3xFjDwEy+XiZrRE+sAp
 m5Ma2Tkw7G3ueBtXwP1yo+EKa7PrVFbCRD/rEpLJAC6+9ktvc7cYs39E08O+wrwT
 qkYThDolovqMOqfOq6afEGy5lfIa5U00vxK+3MXiE3eLEjHSJhwTXadUbwyMjJWE
 zOwA6p5NfDFzklESSNTtIBY85Zlh/g2q6GWCy7yBQnlaSdbpDxcnAlSZipq66Iqm
 9ytdZiHid4BIRQxr5qyXTB184BvFnWNRs9NGhCj38uLEnuxwSChzwoh/WPDxLNte
 U9ouvwJO5U2qAZsMGJhY8W2s/9WvWpSqRhSMA/nnNV1Hh+URFz8rFXAln6kNn//v
 j+cYGCyjLnO1hg==
 =4Ak2
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.2-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for the T-Head PMU via the perf subsystem

 - ftrace support for rv32

 - Support for non-volatile memory devices

 - Various fixes and cleanups

* tag 'riscv-for-linus-6.2-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (52 commits)
  Documentation: RISC-V: patch-acceptance: s/implementor/implementer
  Documentation: RISC-V: Mention the UEFI Standards
  Documentation: RISC-V: Allow patches for non-standard behavior
  Documentation: RISC-V: Fix a typo in patch-acceptance
  riscv: Fixup compile error with !MMU
  riscv: Fix P4D_SHIFT definition for 3-level page table mode
  riscv: Apply a static assert to riscv_isa_ext_id
  RISC-V: Add some comments about the shadow and overflow stacks
  RISC-V: Align the shadow stack
  RISC-V: Ensure Zicbom has a valid block size
  RISC-V: Introduce riscv_isa_extension_check
  RISC-V: Improve use of isa2hwcap[]
  riscv: Don't duplicate _ALTERNATIVE_CFG* macros
  riscv: alternatives: Drop the underscores from the assembly macro names
  riscv: alternatives: Don't name unused macro parameters
  riscv: Don't duplicate __ALTERNATIVE_CFG in __ALTERNATIVE_CFG_2
  riscv: mm: call best_map_size many times during linear-mapping
  riscv: Move cast inside kernel_mapping_[pv]a_to_[vp]a
  riscv: Fix crash during early errata patching
  riscv: boot: add zstd support
  ...
2022-12-14 15:23:49 -08:00
Palmer Dabbelt
59a582ad13
Merge patch series "RISC-V: Ensure Zicbom has a valid block size"
Andrew Jones <ajones@ventanamicro.com> says:

When a DT puts zicbom in the isa string, but does not provide a block
size, ALT_CMO_OP() will attempt to do cache operations on address
zero since the start address will be ANDed with zero. We can't simply
BUG() in riscv_init_cbom_blocksize() when we fail to find a block
size because the failure will happen before logging works, leaving
users to scratch their heads as to why the boot hung. Instead, ensure
Zicbom is disabled and output an error which will hopefully alert
people that the DT needs to be fixed. While at it, add a check that
the block size is a power-of-2 too.

* b4-shazam-merge:
  RISC-V: Ensure Zicbom has a valid block size
  RISC-V: Introduce riscv_isa_extension_check
  RISC-V: Improve use of isa2hwcap[]

Link: https://lore.kernel.org/r/20221129143447.49714-1-ajones@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-09 19:13:01 -08:00
Qinglin Pan
6ff8ca3f93
riscv: mm: call best_map_size many times during linear-mapping
Modify the best_map_size function to give map_size many times instead
of only once, so a memory region can be mapped by both PMD_SIZE and
PAGE_SIZE.

Signed-off-by: Qinglin Pan <panqinglin2020@iscas.ac.cn>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20221128023643.329091-1-panqinglin2020@iscas.ac.cn
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-09 13:42:26 -08:00
Palmer Dabbelt
61b2f0bdaa
Merge patch series "riscv: Fix crash during early errata patching"
These are fixes, but due to the possible early boot fallout they're
going in the merge window to get a bit more time to bake on linux-next.

* b4-shazam-merge
  riscv: Move cast inside kernel_mapping_[pv]a_to_[vp]a
  riscv: Fix crash during early errata patching

Link: https://lore.kernel.org/r/20221126060920.65009-1-samuel@sholland.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-08 17:22:57 -08:00
Samuel Holland
583286e207
riscv: Move cast inside kernel_mapping_[pv]a_to_[vp]a
Before commit 44c9225729 ("RISC-V: enable XIP"), these macros cast
their argument to unsigned long. That commit moved the cast after an
assignment to an unsigned long variable, rendering it ineffectual.
Move the cast back, so we can remove the cast at each call site.

Reviewed-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Link: https://lore.kernel.org/r/20221126060920.65009-2-samuel@sholland.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-08 17:22:54 -08:00
Samuel Holland
0c49688174
riscv: Fix crash during early errata patching
The patch function for the T-Head PBMT errata calls __pa_symbol() before
relocation. This crashes when CONFIG_DEBUG_VIRTUAL is enabled, because
__pa_symbol() forwards to __phys_addr_symbol(), and __phys_addr_symbol()
checks against the absolute kernel start/end address.

Fix this by checking against the kernel map instead of a symbol address.

Fixes: a35707c3d8 ("riscv: add memory-type errata for T-Head")
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Link: https://lore.kernel.org/r/20221126060920.65009-1-samuel@sholland.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-08 17:22:53 -08:00
Palmer Dabbelt
049696a39d
Merge patch series "Add PMEM support for RISC-V"
Anup Patel <apatel@ventanamicro.com> says:

The Linux NVDIMM PEM drivers require arch support to map and access the
persistent memory device. This series adds RISC-V PMEM support using
recently added Svpbmt and Zicbom support.

* b4-shazam-merge:
  RISC-V: Enable PMEM drivers
  RISC-V: Implement arch specific PMEM APIs
  RISC-V: Fix MEMREMAP_WB for systems with Svpbmt

Link: https://lore.kernel.org/r/20221114090536.1662624-1-apatel@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-08 15:45:28 -08:00
Anup Patel
a49ab905a1
RISC-V: Implement arch specific PMEM APIs
The NVDIMM PMEM driver expects arch specific APIs for cache maintenance
and if arch does not provide these APIs then NVDIMM PMEM driver will
always use MEMREMAP_WT to map persistent memory which in-turn maps as
UC memory type defined by the RISC-V Svpbmt specification.

Now that the Svpbmt and Zicbom support is available in RISC-V kernel,
we implement PMEM APIs using ALT_CMO_OP() macros so that the NVDIMM
PMEM driver can use MEMREMAP_WB to map persistent memory.

Co-developed-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20221114090536.1662624-3-apatel@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-08 15:43:59 -08:00
Sergey Matyukevich
4bd1d80efb
riscv: mm: notify remote harts about mmu cache updates
Current implementation of update_mmu_cache function performs local TLB
flush. It does not take into account ASID information. Besides, it does
not take into account other harts currently running the same mm context
or possible migration of the running context to other harts. Meanwhile
TLB flush is not performed for every context switch if ASID support
is enabled.

Patch [1] proposed to add ASID support to update_mmu_cache to avoid
flushing local TLB entirely. This patch takes into account other
harts currently running the same mm context as well as possible
migration of this context to other harts.

For this purpose the approach from flush_icache_mm is reused. Remote
harts currently running the same mm context are informed via SBI calls
that they need to flush their local TLBs. All the other harts are marked
as needing a deferred TLB flush when this mm context runs on them.

[1] https://lore.kernel.org/linux-riscv/20220821013926.8968-1-tjytimi@163.com/

Signed-off-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com>
Fixes: 65d4b9c530 ("RISC-V: Implement ASID allocator")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/linux-riscv/20220829205219.283543-1-geomatsi@gmail.com/#t
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-08 15:18:16 -08:00
Tong Tiangen
d33deda095
riscv/mm: hugepage's PG_dcache_clean flag is only set in head page
HugeTLB pages are always fully mapped, so only setting head page's
PG_dcache_clean flag is enough.

Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
Link: https://lore.kernel.org/lkml/20220331065640.5777-2-songmuchun@bytedance.com/
Link: https://lore.kernel.org/r/20221024094725.3054311-2-tongtiangen@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-02 11:25:49 -08:00
Linus Torvalds
991f173cd2 RISC-V Fixes for 6.1-rc5
* A fix to add the missing PWM LEDs into the  SiFive HiFive Unleashed
   device tree.
 * A fix to fully clear a task's registers on creation, as they end up in
   userspace and thus leak kernel memory.
 * A pair of VDSO-related build fixes that manifest on recent LLVM-based
   toolchains.
 * A fix to our early init to ensure the DT is adequately processed
   before reserved memory nodes are processed.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmNuex4THHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiUm5D/94ElBfEEH1Si2qwzfk5Vwxap0Jsjwj
 BA7GHito1P22P4yHlvXnMyPvjnl3f0/Cc7Q6O5+J8HvnR9FWeu97s84oBJU/Thkt
 D9pd68nknY/HPZ1P5eb4gzBUwo2D6WWn4ISk2uoa8XAJ9EKfhylnYbn32rzyRerj
 p8HFT1lP71mryLCr0FsxEOdqtBl7+41/Htd/jxEA+lwJtHJ3F3Yk+TnaeCzebGjT
 kg5vwTzNwgkl7o4rlLsBv3d/Oz8kvLZh7V9qsZ9Ta//ZNpCGw0PMOwzxCKSYS6C0
 ZK342BRThsqxAjc2y1QtKOzFBPzEECzbgSSixFiEVXY4qpDY4ZlojXEOxRLevMjl
 KpLJL4e3wXpfyknZXLeIPIh0tFzHnuWrgliBVwa5CO4hYXhxWRY+qzh6LTqypx/W
 ewvuQ1auxmFFB6U6RHEXWuUv5mtzymXVT8lcHltpGcpGvxcbvnSw9SYtXTM6Sz5d
 DzEtEpgCZTx9MEp981FJ+uMYs4mGtURRDqqKZzeLOiBmXe8r7wfqnTIyosW1/f8V
 aXYJB90yMX5QvVJPKJHocphaKbNl75ywF8z0JSdnilO8CKJMJ6elJ9KleiJGJGFc
 zFWyFVPt18Eg+w+r4vMGmH2tz+hlaaZ+sOOGFExeM7KRU18WhP4MMVLgJiaLtoul
 UUX63X1wx7S2JA==
 =EHH7
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - A fix to add the missing PWM LEDs into the SiFive HiFive Unleashed
   device tree.

 - A fix to fully clear a task's registers on creation, as they end up
   in userspace and thus leak kernel memory.

 - A pair of VDSO-related build fixes that manifest on recent LLVM-based
   toolchains.

 - A fix to our early init to ensure the DT is adequately processed
   before reserved memory nodes are processed.

* tag 'riscv-for-linus-6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  RISC-V: vdso: Do not add missing symbols to version section in linker script
  riscv: fix reserved memory setup
  riscv: vdso: fix build with llvm
  riscv: process: fix kernel info leakage
  riscv: dts: sifive unleashed: Add PWM controlled LEDs
2022-11-11 09:40:19 -08:00
Conor Dooley
50e63dd8ed
riscv: fix reserved memory setup
Currently, RISC-V sets up reserved memory using the "early" copy of the
device tree. As a result, when trying to get a reserved memory region
using of_reserved_mem_lookup(), the pointer to reserved memory regions
is using the early, pre-virtual-memory address which causes a kernel
panic when trying to use the buffer's name:

 Unable to handle kernel paging request at virtual address 00000000401c31ac
 Oops [#1]
 Modules linked in:
 CPU: 0 PID: 0 Comm: swapper Not tainted 6.0.0-rc1-00001-g0d9d6953d834 #1
 Hardware name: Microchip PolarFire-SoC Icicle Kit (DT)
 epc : string+0x4a/0xea
  ra : vsnprintf+0x1e4/0x336
 epc : ffffffff80335ea0 ra : ffffffff80338936 sp : ffffffff81203be0
  gp : ffffffff812e0a98 tp : ffffffff8120de40 t0 : 0000000000000000
  t1 : ffffffff81203e28 t2 : 7265736572203a46 s0 : ffffffff81203c20
  s1 : ffffffff81203e28 a0 : ffffffff81203d22 a1 : 0000000000000000
  a2 : ffffffff81203d08 a3 : 0000000081203d21 a4 : ffffffffffffffff
  a5 : 00000000401c31ac a6 : ffff0a00ffffff04 a7 : ffffffffffffffff
  s2 : ffffffff81203d08 s3 : ffffffff81203d00 s4 : 0000000000000008
  s5 : ffffffff000000ff s6 : 0000000000ffffff s7 : 00000000ffffff00
  s8 : ffffffff80d9821a s9 : ffffffff81203d22 s10: 0000000000000002
  s11: ffffffff80d9821c t3 : ffffffff812f3617 t4 : ffffffff812f3617
  t5 : ffffffff812f3618 t6 : ffffffff81203d08
 status: 0000000200000100 badaddr: 00000000401c31ac cause: 000000000000000d
 [<ffffffff80338936>] vsnprintf+0x1e4/0x336
 [<ffffffff80055ae2>] vprintk_store+0xf6/0x344
 [<ffffffff80055d86>] vprintk_emit+0x56/0x192
 [<ffffffff80055ed8>] vprintk_default+0x16/0x1e
 [<ffffffff800563d2>] vprintk+0x72/0x80
 [<ffffffff806813b2>] _printk+0x36/0x50
 [<ffffffff8068af48>] print_reserved_mem+0x1c/0x24
 [<ffffffff808057ec>] paging_init+0x528/0x5bc
 [<ffffffff808031ae>] setup_arch+0xd0/0x592
 [<ffffffff8080070e>] start_kernel+0x82/0x73c

early_init_fdt_scan_reserved_mem() takes no arguments as it operates on
initial_boot_params, which is populated by early_init_dt_verify(). On
RISC-V, early_init_dt_verify() is called twice. Once, directly, in
setup_arch() if CONFIG_BUILTIN_DTB is not enabled and once indirectly,
very early in the boot process, by parse_dtb() when it calls
early_init_dt_scan_nodes().

This first call uses dtb_early_va to set initial_boot_params, which is
not usable later in the boot process when
early_init_fdt_scan_reserved_mem() is called. On arm64 for example, the
corresponding call to early_init_dt_scan_nodes() uses fixmap addresses
and doesn't suffer the same fate.

Move early_init_fdt_scan_reserved_mem() further along the boot sequence,
after the direct call to early_init_dt_verify() in setup_arch() so that
the names use the correct virtual memory addresses. The above supposed
that CONFIG_BUILTIN_DTB was not set, but should work equally in the case
where it is - unflatted_and_copy_device_tree() also updates
initial_boot_params.

Reported-by: Valentina Fernandez <valentina.fernandezalanis@microchip.com>
Reported-by: Evgenii Shatokhin <e.shatokhin@yadro.com>
Link: https://lore.kernel.org/linux-riscv/f8e67f82-103d-156c-deb0-d6d6e2756f5e@microchip.com/
Fixes: 922b0375fc ("riscv: Fix memblock reservation for device tree blob")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Evgenii Shatokhin <e.shatokhin@yadro.com>
Link: https://lore.kernel.org/r/20221107151524.3941467-1-conor.dooley@microchip.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-10 14:46:33 -08:00
Liu Shixin
310f541a02
riscv: Enable HAVE_ARCH_HUGE_VMAP for 64BIT
This sets the HAVE_ARCH_HUGE_VMAP option, and defines the required page
table functions. With this feature, ioremap area will be mapped with
huge page granularity according to its actual size. This feature can be
disabled by kernel parameter "nohugeiomap".

Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Björn Töpel <bjorn@kernel.org>
Tested-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20221012120038.1034354-2-liushixin2@huawei.com
[Palmer: minor formatting]
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-28 17:10:01 -07:00
Linus Torvalds
283f13d43b RISC-V Fixes for 6.1-rc3
* A fix for a build warning in the jump_label code.
 * One of the git://github -> https://github cleanups, for the SiFive
   drivers.
 * A fix for the kasan initialization code, this still likely warrants
   some cleanups but that's a bigger problem and at least this fixes the
   crashes in the short term.
 * A pair of fixes for extension support detection on mixed LLVM/GNU
   toolchains.
 * A fix for a runtime warning in the /proc/cpuinfo code.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmNcIeETHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiQooEAC23Pgmdp5A74VgmCxt1pLPuLwapHL+
 piAvzBx96GMajdWhzfBxwR8PTAAdiffpyFD+mal9MmmDAy+fduYQD76ge28Kx33y
 yJIIya+9vSHcCkMUfj8Kag93qAsXTwJIvuyrHJ8JGdnxgs4jywKPge6S4GYOiFD2
 8UALXgr/4Xi1WJnHvGPElLbVUqAcha2F4Wbl8P0dEUnuCFbiHQfDqWAx7OgC9zrZ
 CJhM3vQNADXO+7Yj8MDyeRQW2JVeBi1xNJ00fGgArUNHQqRMnEgZ4Hbfymf6j+yq
 1RP6sCjha3qIGofqD7B0pfcgThjRi8kpHL+RjiFrb4UigfxErdZDhBsOxTVB5slG
 oe4V7mrm3gdHPgW3ZytjmkXhHNzRRFfmuUwfqR/h9iumkpnNntuphPQVCtCBKDNx
 dZ6JGdVw7ljgp51t66vvJLy3lpYYcnhh64sCFYdWSY1+n0sJB6QfwnDPp8AFznCn
 9PafHYzEOM/e5PBlRqoORuRfkRmk54pJu+uKHh/FtcG3Z7q6htyF8QHDMaYizPna
 MDqcCl1DdKBh24D0E2KRNzhNTGNSFNKcSnXB6dF9wH4jLlglyIaNe0vqmfFxVWkV
 ufMcfXjxgl5zHJUf0IBuw57pVZddriRhVU+2YECnMgquuC9drjHa1sG71oYm/loY
 9XYPR12ckU7lqg==
 =In1r
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - A fix for a build warning in the jump_label code

 - One of the git://github -> https://github cleanups, for the SiFive
   drivers

 - A fix for the kasan initialization code, this still likely warrants
   some cleanups but that's a bigger problem and at least this fixes the
   crashes in the short term

 - A pair of fixes for extension support detection on mixed LLVM/GNU
   toolchains

 - A fix for a runtime warning in the /proc/cpuinfo code

* tag 'riscv-for-linus-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  RISC-V: Fix /proc/cpuinfo cpumask warning
  riscv: fix detection of toolchain Zihintpause support
  riscv: fix detection of toolchain Zicbom support
  riscv: mm: add missing memcpy in kasan_init
  MAINTAINERS: git://github.com -> https://github.com for sifive
  riscv: jump_label: mark arguments as const to satisfy asm constraints
2022-10-28 17:03:00 -07:00
Qinglin Pan
9f2ac64d6c
riscv: mm: add missing memcpy in kasan_init
Hi Atish,

It seems that the panic is due to the missing memcpy during kasan_init.
Could you please check whether this patch is helpful?

When doing kasan_populate, the new allocated base_pud/base_p4d should
contain kasan_early_shadow_{pud, p4d}'s content. Add the missing memcpy
to avoid page fault when read/write kasan shadow region.

Tested on:
 - qemu with sv57 and CONFIG_KASAN on.
 - qemu with sv48 and CONFIG_KASAN on.

Signed-off-by: Qinglin Pan <panqinglin2020@iscas.ac.cn>
Tested-by: Atish Patra <atishp@rivosinc.com>
Fixes: 8fbdccd2b1 ("riscv: mm: Support kasan for sv57")
Link: https://lore.kernel.org/r/20221009083050.3814850-1-panqinglin2020@iscas.ac.cn
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-27 14:55:58 -07:00
Andrew Jones
5c20a3a9df RISC-V: Fix compilation without RISCV_ISA_ZICBOM
riscv_cbom_block_size and riscv_init_cbom_blocksize() should always
be available and riscv_init_cbom_blocksize() should always be
invoked, even when compiling without RISCV_ISA_ZICBOM enabled. This
is because disabling RISCV_ISA_ZICBOM means "don't use zicbom
instructions in the kernel" not "pretend there isn't zicbom, even
when there is". When zicbom is available, whether the kernel enables
its use with RISCV_ISA_ZICBOM or not, KVM will offer it to guests.
Ensure we can build KVM and that the block size is initialized even
when compiling without RISCV_ISA_ZICBOM.

Fixes: 8f7e001e03 ("RISC-V: Clean up the Zicbom block size probing")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Anup Patel <anup@brainfault.org>
2022-10-21 11:52:39 +05:30
Linus Torvalds
498574970f RISC-V Patches for the 6.1 Merge Window, Part 2
* A handful of DT updates for the PolarFire SOC.
 * A fix to correct the handling of write-only mappings.
 * m{vetndor,arcd,imp}id is now in /proc/cpuinfo
 * The SiFive L2 cache controller support has been refactored to also
   support L3 caches.
 
 There's also a handful of fixes, cleanups and improvements throughout
 the tree.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmNJiegTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiaW+D/9nN7JMKD4KbED8MUgRVcs+TS4MQk5J
 QkjmAXme7w2H1+T5mxNWHk0QvEC6qMu2JQWHeott0ROYPkbXoNtuOOkzcsZhVaeb
 rjWH/WC5QhEeMPDc1qc0AmxVkOa937f2NkGtoEvhW6SMWvStHpefdOHo/ij106Re
 7wxkcj0fjgn/zPmDltRbWSMyFZWJec5DvZ3AB4NYHnc2ycr8Z7HAG08rjZkdkIjQ
 zYGsCkUtrk4qShikKK3cSelW6hH1/FdM+bAo0rzt1frmTw1FLaFOsZZjOaVYekzi
 l0jxsb8FRBWyKBFqcagukjZwFy6D+7Q+masb0cXY03eZpEVrroA4bHiPkuZ22Ms/
 7ol/I5FvTyrj2R4zd70ziYVF3usO78t5HC4AIQmFl25TNcQdYQd8X28yh12iG6QN
 Pa0lh/EOr5idaT2+TErhzRepICnK1Nj9y0H5TZxYljLAhH9j0d/8+Iw88vzJnPga
 vek7unZ3BzkooLVIpfpkT8vC94MA0hoP849MVFQDtZgZhPMbjIkN91HrDiw9ktV2
 SK9cuPndfrs5nW1WKu8F0cDziusbMHv4F51TPVRu/dFcIkspv+aLojAThgvcm42K
 55LgvDgLjo7P3PUghiDXUoPZXvL19t5Bnq2/E47rlSTvvGssIu/onZO6BkxybAm6
 BkHuchr8TGBWMg==
 =iCOr
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.1-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull more RISC-V updates from Palmer Dabbelt:

 - DT updates for the PolarFire SOC

 - a fix to correct the handling of write-only mappings

 - m{vetndor,arcd,imp}id is now in /proc/cpuinfo

 - the SiFive L2 cache controller support has been refactored to also
   support L3 caches

 - misc fixes, cleanups and improvements throughout the tree

* tag 'riscv-for-linus-6.1-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (42 commits)
  MAINTAINERS: add RISC-V's patchwork
  RISC-V: Make port I/O string accessors actually work
  riscv: enable software resend of irqs
  RISC-V: Re-enable counter access from userspace
  riscv: vdso: fix NULL deference in vdso_join_timens() when vfork
  riscv: Add cache information in AUX vector
  soc: sifive: ccache: define the macro for the register shifts
  soc: sifive: ccache: use pr_fmt() to remove CCACHE: prefixes
  soc: sifive: ccache: reduce printing on init
  soc: sifive: ccache: determine the cache level from dts
  soc: sifive: ccache: Rename SiFive L2 cache to Composable cache.
  dt-bindings: sifive-ccache: change Sifive L2 cache to Composable cache
  riscv: check for kernel config option in t-head memory types errata
  riscv: use BIT() marco for cpufeature probing
  riscv: use BIT() macros in t-head errata init
  riscv: drop some idefs from CMO initialization
  riscv: cleanup svpbmt cpufeature probing
  riscv: Pass -mno-relax only on lld < 15.0.0
  RISC-V: Avoid dereferening NULL regs in die()
  dt-bindings: riscv: add new riscv,isa strings for emulators
  ...
2022-10-14 11:21:11 -07:00
Palmer Dabbelt
8aeb7b17f0
RISC-V: Make mmap() with PROT_WRITE imply PROT_READ
Commit 2139619bca ("riscv: mmap with PROT_WRITE but no PROT_READ is
invalid") made mmap() reject mappings with only PROT_WRITE set in an
attempt to fix an observed inconsistency in behavior when attempting
to read from a PROT_WRITE-only mapping. The root cause of this behavior
was actually that while RISC-V's protection_map maps VM_WRITE to
readable PTE permissions (since write-only PTEs are considered reserved
by the privileged spec), the page fault handler considered loads from
VM_WRITE-only VMAs illegal accesses. Fix the underlying cause by
handling faults in VM_WRITE-only VMAs (patch 1) and then re-enable
use of mmap(PROT_WRITE) (patch 2), making RISC-V's behavior consistent
with all other architectures that don't support write-only PTEs.

* remotes/palmer/riscv-wonly:
  riscv: Allow PROT_WRITE-only mmap()
  riscv: Make VM_WRITE imply VM_READ

Link: https://lore.kernel.org/r/20220915193702.2201018-1-abrestic@rivosinc.com/
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-13 12:49:12 -07:00
Andrew Jones
afd5dde9a1 RISC-V: KVM: Provide UAPI for Zicbom block size
We're about to allow guests to use the Zicbom extension. KVM
userspace needs to know the cache block size in order to
properly advertise it to the guest. Provide a virtual config
register for userspace to get it with the GET_ONE_REG API, but
setting it cannot be supported, so disallow SET_ONE_REG.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2022-10-02 10:18:59 +05:30
Linus Torvalds
a7b7751aeb RISC-V Fixes for 6.0-rc7
* A handful of build fixes for the T-Head errata, including some
   functional issues the compilers found.
 * A fix for a nasty sigreturn bug.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmMtlZYTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYifj4EAC/POyM8gOYkiaaAgMvjeNTiJYdvBq/
 ZuWBpnmDfayXyg+OHRpe2EQDJL51oDLA0LejN3V0C3fMOA1/zvrb/2LGbwd8v3Ly
 i/Lm9/18P6MJo3vSZ0A5DEOZKj41QsoGd5PIn8uuYT8nYQhCtm2g2ug8lX5Eb3QH
 pI9xjYZWI3nJ8/ah7NOrtB7LWan8BZdH6VuPBKN6zbG466Td7jT3QFQuyw4Ri0Lk
 rbMifm985EUEs/o9X+ObXmi+5M1PoAw+DVMeFq7VkgkHHBwCuVBlEd80wT6wT//1
 qSPQh5i/zs7GT5o8fgRQ+VJCLBh2s3meIFPa3TZ+fkh17w8Ww7tx+NtcOFgp6jWP
 NcPsaF8tokOwNVJNbFdCjXRAHxc6TuZ8hfjj+IsDxxbit9BMVSXdEpeOjURbLCNX
 HI3qooTCIBYh+9CzK3dF+ep0dnGC/awBnoLoH8UgNVxmitSVyc5PgR4w3NBpjoO8
 Htxl1ESll8nIXam6Rzi0UHgDgmMCRNdrW6vil/QD9UNcBYbkrUWhD/PmTfH+2uYC
 UtfzVCnXpeDRkLbNxBMwBGSsb/kMl1hkHcMbXRRgIQRxzqRPs69K8umcc6Xy+3AB
 timl5kzIeGid0nFG1+H67uGjc8Ea0RW3awel2Y8PmCGJDUNMBmKlizqrwMDfZrB0
 HxxG8vcuPF7zWg==
 =9l0i
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - A handful of build fixes for the T-Head errata, including some
   functional issues the compilers found

 - A fix for a nasty sigreturn bug

* tag 'riscv-for-linus-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  RISC-V: Avoid coupling the T-Head CMOs and Zicbom
  riscv: fix a nasty sigreturn bug...
  riscv: make t-head erratas depend on MMU
  riscv: fix RISCV_ISA_SVPBMT kconfig dependency warning
  RISC-V: Clean up the Zicbom block size probing
2022-09-23 08:51:05 -07:00
Andrew Bresticker
7ab72c5973
riscv: Make VM_WRITE imply VM_READ
RISC-V does not presently have write-only mappings as that PTE bit pattern
is considered reserved in the privileged spec, so allow handling of read
faults in VMAs that have VM_WRITE without VM_READ in order to be consistent
with other architectures that have similar limitations.

Fixes: 2139619bca ("riscv: mmap with PROT_WRITE but no PROT_READ is invalid")
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220915193702.2201018-2-abrestic@rivosinc.com/
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-09-22 09:44:50 -07:00
Palmer Dabbelt
8f7e001e03
RISC-V: Clean up the Zicbom block size probing
This fixes two issues: I truncated the warning's hart ID when porting to
the 64-bit hart ID code, and the original code's warning handling could
fire on an uninitialized hart ID.

The biggest change here is that riscv_cbom_block_size is no longer
initialized, as IMO the default isn't sane: there's nothing in the ISA
that mandates any specific cache block size, so falling back to one will
just silently produce the wrong answer on some systems.  This also
changes the probing order so the cache block size is known before
enabling Zicbom support.

CC: stable@vger.kernel.org
CC: Andrew Jones <ajones@ventanamicro.com>
CC: Heiko Stuebner <heiko@sntech.de>
CC: Atish Patra <atishp@rivosinc.com>
Fixes: 3aefb2ee5b ("riscv: implement Zicbom-based CMO instructions + the t-head variant")
Fixes: 1631ba1259 ("riscv: Add support for non-coherent devices using zicbom extension")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
[Conor: fixed the redefinition errors]
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220912224800.998121-1-mail@conchuod.ie
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-09-13 02:06:11 -07:00
Steven Price
8782fb61cc mm: pagewalk: Fix race between unmap and page walker
The mmap lock protects the page walker from changes to the page tables
during the walk.  However a read lock is insufficient to protect those
areas which don't have a VMA as munmap() detaches the VMAs before
downgrading to a read lock and actually tearing down PTEs/page tables.

For users of walk_page_range() the solution is to simply call pte_hole()
immediately without checking the actual page tables when a VMA is not
present. We now never call __walk_page_range() without a valid vma.

For walk_page_range_novma() the locking requirements are tightened to
require the mmap write lock to be taken, and then walking the pgd
directly with 'no_vma' set.

This in turn means that all page walkers either have a valid vma, or
it's that special 'novma' case for page table debugging.  As a result,
all the odd '(!walk->vma && !walk->no_vma)' tests can be removed.

Fixes: dd2283f260 ("mm: mmap: zap pages with read mmap_sem in munmap")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-09-03 10:13:13 -07:00
Palmer Dabbelt
da06cc5bb6
RISC-V: fixups to work with crash tool
A handful of fixes to our kexec/crash kernel support that allow crash
tool to function.

Link: https://lore.kernel.org/r/mhng-f5fdaa37-e99a-4214-a297-ec81f0fed0c1@palmer-mbp2014

* commit 'f9293ad46d8ba9909187a37b7215324420ad4596':
  RISC-V: Add modules to virtual kernel memory layout dump
  RISC-V: Fixup schedule out issue in machine_crash_shutdown()
  RISC-V: Fixup get incorrect user mode PC for kernel mode regs
  RISC-V: kexec: Fixup use of smp_processor_id() in preemptible context
2022-08-11 09:04:01 -07:00
Xianting Tian
f9293ad46d
RISC-V: Add modules to virtual kernel memory layout dump
Modules always live before the kernel, MODULES_END is fixed but
MODULES_VADDR isn't fixed, it depends on the kernel size.
Let's add it to virtual kernel memory layout dump.

As MODULES is only defined for CONFIG_64BIT, so we dump it when
CONFIG_64BIT=y.

eg,
MODULES_VADDR - MODULES_END
0xffffffff01133000 - 0xffffffff80000000

Reviewed-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220811074150.3020189-5-xianting.tian@linux.alibaba.com
Cc: stable@vger.kernel.org
Fixes: 2bfc6cd81b ("riscv: Move kernel mapping outside of linear mapping")
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-08-11 08:58:21 -07:00
Palmer Dabbelt
3aefb2ee5b
riscv: implement Zicbom-based CMO instructions + the t-head variant
This series is based on the alternatives changes done in my svpbmt
series and thus also depends on Atish's isa-extension parsing series.

It implements using the cache-management instructions from the  Zicbom-
extension to handle cache flush, etc actions on platforms needing them.

SoCs using cpu cores from T-Head like the Allwinne D1 implement a
different set of cache instructions. But while they are different,
instructions they provide the same functionality, so a variant can easly
hook into the existing alternatives mechanism on those.

[Palmer:  Some minor fixups, including a RISCV_ISA_ZICBOM dependency on
MMU that's probably not strictly necessary.  The Zicbom support will
trip up sparse for users that have new toolchains, I just sent a patch.]

Link: https://lore.kernel.org/all/20220706231536.2041855-1-heiko@sntech.de/
Link: https://lore.kernel.org/linux-sparse/20220811033138.20676-1-palmer@rivosinc.com/T/#u

* palmer/riscv-zicbom:
  riscv: implement cache-management errata for T-Head SoCs
  riscv: Add support for non-coherent devices using zicbom extension
  dt-bindings: riscv: document cbom-block-size
  of: also handle dma-noncoherent in of_dma_is_coherent()
2022-08-10 20:49:32 -07:00
Heiko Stuebner
1631ba1259
riscv: Add support for non-coherent devices using zicbom extension
The Zicbom ISA-extension was ratified in november 2021
and introduces instructions for dcache invalidate, clean
and flush operations.

Implement cache management operations for non-coherent devices
based on them.

Of course not all cores will support this, so implement an
alternative-based mechanism that replaces empty instructions
with ones done around Zicbom instructions.

As discussed in previous versions, assume the platform
being coherent by default so that non-coherent devices need
to get marked accordingly by firmware.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/20220706231536.2041855-4-heiko@sntech.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-07-28 15:30:51 -07:00
Anshuman Khandual
4147b5e2d5 riscv/mm: enable ARCH_HAS_VM_GET_PAGE_PROT
This enables ARCH_HAS_VM_GET_PAGE_PROT on the platform and exports
standard vm_get_page_prot() implementation via DECLARE_VM_GET_PAGE_PROT,
which looks up a private and static protection_map[] array.  Subsequently
all __SXXX and __PXXX macros can be dropped which are no longer needed.

Link: https://lkml.kernel.org/r/20220711070600.2378316-17-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Brian Cain <bcain@quicinc.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-17 17:14:39 -07:00
Peter Xu
d92725256b mm: avoid unnecessary page fault retires on shared memory types
I observed that for each of the shared file-backed page faults, we're very
likely to retry one more time for the 1st write fault upon no page.  It's
because we'll need to release the mmap lock for dirty rate limit purpose
with balance_dirty_pages_ratelimited() (in fault_dirty_shared_page()).

Then after that throttling we return VM_FAULT_RETRY.

We did that probably because VM_FAULT_RETRY is the only way we can return
to the fault handler at that time telling it we've released the mmap lock.

However that's not ideal because it's very likely the fault does not need
to be retried at all since the pgtable was well installed before the
throttling, so the next continuous fault (including taking mmap read lock,
walk the pgtable, etc.) could be in most cases unnecessary.

It's not only slowing down page faults for shared file-backed, but also add
more mmap lock contention which is in most cases not needed at all.

To observe this, one could try to write to some shmem page and look at
"pgfault" value in /proc/vmstat, then we should expect 2 counts for each
shmem write simply because we retried, and vm event "pgfault" will capture
that.

To make it more efficient, add a new VM_FAULT_COMPLETED return code just to
show that we've completed the whole fault and released the lock.  It's also
a hint that we should very possibly not need another fault immediately on
this page because we've just completed it.

This patch provides a ~12% perf boost on my aarch64 test VM with a simple
program sequentially dirtying 400MB shmem file being mmap()ed and these are
the time it needs:

  Before: 650.980 ms (+-1.94%)
  After:  569.396 ms (+-1.38%)

I believe it could help more than that.

We need some special care on GUP and the s390 pgfault handler (for gmap
code before returning from pgfault), the rest changes in the page fault
handlers should be relatively straightforward.

Another thing to mention is that mm_account_fault() does take this new
fault as a generic fault to be accounted, unlike VM_FAULT_RETRY.

I explicitly didn't touch hmm_vma_fault() and break_ksm() because they do
not handle VM_FAULT_RETRY even with existing code, so I'm literally keeping
them as-is.

Link: https://lkml.kernel.org/r/20220530183450.42886-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vineet Gupta <vgupta@kernel.org>
Acked-by: Guo Ren <guoren@kernel.org>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>	[arm part]
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Stafford Horne <shorne@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Brian Cain <bcain@quicinc.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Richard Weinberger <richard@nod.at>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Will Deacon <will@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Chris Zankel <chris@zankel.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Rich Felker <dalias@libc.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Helge Deller <deller@gmx.de>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-16 19:48:27 -07:00
Jisheng Zhang
9c375cfc73
riscv: mm: init: make pt_ops_set_[early|late|fixmap] static
These three functions are only used in init.c, so make them static.
Fix W=1 warnings like below:

arch/riscv/mm/init.c:721:13: warning: no previous prototype for function
'pt_ops_set_early' [-Wmissing-prototypes]
   void __init pt_ops_set_early(void)
               ^

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20220516143204.2603-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-06-02 14:50:41 -07:00
Alexandre Ghiti
26b8f69edd
riscv: Improve virtual kernel memory layout dump
With the arrival of sv48 and its large address space, it would be
cumbersome to statically define the unit size to use to print the different
portions of the virtual memory layout: instead, determine it dynamically.

Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-06-01 21:46:29 -07:00
Linus Torvalds
35b51afd23 RISC-V Patches for the 5.19 Merge Window, Part 1
* Support for the Svpbmt extension, which allows memory attributes to be
   encoded in pages.
 * Support for the Allwinner D1's implementation of page-based memory
   attributes.
 * Support for running rv32 binaries on rv64 systems, via the compat
   subsystem.
 * Support for kexec_file().
 * Support for the new generic ticket-based spinlocks, which allows us to
   also move to qrwlock.  These should have already gone in through the
   asm-geneic tree as well.
 * A handful of cleanups and fixes, include some larger ones around
   atomics and XIP.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmKWOx8THHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYieAiEADAUdP7ctoaSQwk5skd/fdA3b4KJuKn
 1Zjl+Br32WP0DlbirYBYWRUQZnCCsvABbTiwSJMcG7NBpU5pyQ5XDtB3OA5kJswO
 Fdp8Nd53//+GK1M5zdEM9OdgvT9fbfTZ3qTu8bKsROOQhGwnYL+Csc9KjFRqEmzN
 oQii0jlb3n5PM4FL3GsbV4uMn9zzkP9mnVAPQktcock2EKFEK/Fy3uNYMQiO2KPi
 n8O6bIDaeRdQ6SurzWOuOkt0cro0tEF85ilzT04mynQsOU0el5oGqCxnOhNH3VWg
 ndqPT6Yafw12hZOtbKJeP+nF8IIR6aJLP3jOtRwEVgcfbXYAw4QwbAV8kQZISefN
 ipn8JGY7GX9Y9TYU692OUGkcmAb3/dxb6c0WihBdvJ0M6YyLD5X+YKHNuG2onLgK
 ss43C5Mxsu629rsjdu/PV91B1+pve3rG9siVmF+g4eo0x9rjMq6/JB0Kal/8SLI1
 Je5T55d5ujV1a2XxhZLQOSD5owrK7J1M9owb0bloTnr9nVwFTWDrfEQEU82o3kP+
 Xm+FfXktnz9ai55NjkMbbEur5D++dKJhBavwCTnBcTrJmMtEH0R45GTK9ZehP+WC
 rNVrRXjIsS18wsTfJxnkZeFQA38as6VBKTzvwHvOgzTrrZU1/xk3lpkouYtAO6BG
 gKacHshVilmUuA==
 =Loi6
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.19-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for the Svpbmt extension, which allows memory attributes to
   be encoded in pages

 - Support for the Allwinner D1's implementation of page-based memory
   attributes

 - Support for running rv32 binaries on rv64 systems, via the compat
   subsystem

 - Support for kexec_file()

 - Support for the new generic ticket-based spinlocks, which allows us
   to also move to qrwlock. These should have already gone in through
   the asm-geneic tree as well

 - A handful of cleanups and fixes, include some larger ones around
   atomics and XIP

* tag 'riscv-for-linus-5.19-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (51 commits)
  RISC-V: Prepare dropping week attribute from arch_kexec_apply_relocations[_add]
  riscv: compat: Using seperated vdso_maps for compat_vdso_info
  RISC-V: Fix the XIP build
  RISC-V: Split out the XIP fixups into their own file
  RISC-V: ignore xipImage
  RISC-V: Avoid empty create_*_mapping definitions
  riscv: Don't output a bogus mmu-type on a no MMU kernel
  riscv: atomic: Add custom conditional atomic operation implementation
  riscv: atomic: Optimize dec_if_positive functions
  riscv: atomic: Cleanup unnecessary definition
  RISC-V: Load purgatory in kexec_file
  RISC-V: Add purgatory
  RISC-V: Support for kexec_file on panic
  RISC-V: Add kexec_file support
  RISC-V: use memcpy for kexec_file mode
  kexec_file: Fix kexec_file.c build error for riscv platform
  riscv: compat: Add COMPAT Kbuild skeletal support
  riscv: compat: ptrace: Add compat_arch_ptrace implement
  riscv: compat: signal: Add rt_frame implementation
  riscv: add memory-type errata for T-Head
  ...
2022-05-31 14:10:54 -07:00
Palmer Dabbelt
4e2bbecd71
RISC-V: Various XIP fixes
This fixes a handful of issues with the XIP support, which has bit
rotted some lately.

* palmer/riscv-xip:
  RISC-V: Fix the XIP build
  RISC-V: Split out the XIP fixups into their own file
  RISC-V: ignore xipImage
  RISC-V: Avoid empty create_*_mapping definitions
2022-05-26 14:35:07 -07:00
Linus Torvalds
3f306ea2e1 dma-mapping updates for Linux 5.19
- don't over-decrypt memory (Robin Murphy)
  - takes min align mask into account for the swiotlb max mapping size
    (Tianyu Lan)
  - use GFP_ATOMIC in dma-debug (Mikulas Patocka)
  - fix DMA_ATTR_NO_KERNEL_MAPPING on xen/arm (me)
  - don't fail on highmem CMA pages in dma_direct_alloc_pages (me)
  - cleanup swiotlb initialization and share more code with swiotlb-xen
    (me, Stefano Stabellini)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmKObTQLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYObmA//dIcDB/q4iFGD+WJh4MhM+asx0ZsdF2OJz42WEhgT
 Z9duOrgcneEQundCamqJP9rNTs980LHDA8uWQC5rZEc9vxuRVOdS7bSgYRUwWh6B
 r0ZjOsvQCn+ChoZML8uyk4rfmEINq+EvJuec3G5fgecZOhPuJS2i2uzzv5cHwqgP
 ChC0fwyZlkfdECXgvZXbEoCJLfTgGNlziN6Ai8dirSoqgEQUoCsY89/M7OiEBvV2
 R4XUWD7OvQERfB4t6xLuUHyzf9PAuWB+OiblRVNeAmK3lMjxVrc3k4kIowgklnzD
 8hfmphAa9Zou3zdfi6Gd4fiQRHRVOwKVp1rtqUmJ+lPSiwyMzu64z9ld2+2qac0h
 V4sSr/yJkhxnBT4/0MkTChvhnRobisackpUzNRpiM4ck7cNVb7eAvkISsbH+pWI9
 aEexPhbyskjlV+GOyM4QL4ygG0dpXY0HSyoh6uaSVsaXMycnWIsJCPidXxV1HGV0
 q2/RLHuHwYxia8cYCF01/DQvwOKSjwbU0zModxtRezGD5GYh2C0a+SrA1aX+qiTu
 yGJCs2UHtSQstAt78tTVp499YeDeL/oGSQkPAu8zyRkSczzF+CncGTuXyoJbAWyK
 otcgERWljgZ4scxjfu1uacfoVhKQ7nOu7hiJokL0U80FESAennLC3ZlocvB9h/ff
 HNA=
 =n2rk
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-5.19-2022-05-25' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping updates from Christoph Hellwig:

 - don't over-decrypt memory (Robin Murphy)

 - takes min align mask into account for the swiotlb max mapping size
   (Tianyu Lan)

 - use GFP_ATOMIC in dma-debug (Mikulas Patocka)

 - fix DMA_ATTR_NO_KERNEL_MAPPING on xen/arm (me)

 - don't fail on highmem CMA pages in dma_direct_alloc_pages (me)

 - cleanup swiotlb initialization and share more code with swiotlb-xen
   (me, Stefano Stabellini)

* tag 'dma-mapping-5.19-2022-05-25' of git://git.infradead.org/users/hch/dma-mapping: (23 commits)
  dma-direct: don't over-decrypt memory
  swiotlb: max mapping size takes min align mask into account
  swiotlb: use the right nslabs-derived sizes in swiotlb_init_late
  swiotlb: use the right nslabs value in swiotlb_init_remap
  swiotlb: don't panic when the swiotlb buffer can't be allocated
  dma-debug: change allocation mode from GFP_NOWAIT to GFP_ATIOMIC
  dma-direct: don't fail on highmem CMA pages in dma_direct_alloc_pages
  swiotlb-xen: fix DMA_ATTR_NO_KERNEL_MAPPING on arm
  x86: remove cruft from <asm/dma-mapping.h>
  swiotlb: remove swiotlb_init_with_tbl and swiotlb_init_late_with_tbl
  swiotlb: merge swiotlb-xen initialization into swiotlb
  swiotlb: provide swiotlb_init variants that remap the buffer
  swiotlb: pass a gfp_mask argument to swiotlb_init_late
  swiotlb: add a SWIOTLB_ANY flag to lift the low memory restriction
  swiotlb: make the swiotlb_init interface more useful
  x86: centralize setting SWIOTLB_FORCE when guest memory encryption is enabled
  x86: remove the IOMMU table infrastructure
  MIPS/octeon: use swiotlb_init instead of open coding it
  arm/xen: don't check for xen_initial_domain() in xen_create_contiguous_region
  swiotlb: rename swiotlb_late_init_with_default_size
  ...
2022-05-25 19:18:36 -07:00
Palmer Dabbelt
d9e418d0ca
RISC-V: Fix the XIP build
A handful of functions unused functions were enabled during XIP builds,
which themselves didn't build correctly.  This just disables the
functions entirely.

Fixes: e8a62cc26d ("riscv: Implement sv48 support")
Reviewed-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/20220420184056.7886-5-palmer@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-05-25 14:43:42 -07:00
Palmer Dabbelt
f83050a82d
RISC-V: Avoid empty create_*_mapping definitions
These trigger a handful of build warnings.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 677b9eb881 ("riscv: mm: Prepare pt_ops helper functions for sv57")
Link: https://lore.kernel.org/r/20220420184056.7886-2-palmer@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-05-24 17:50:35 -07:00
Palmer Dabbelt
93c0651617
riscv: support for Svpbmt and D1 memory types
Adds support for Svpbmt, the "Supervisor-mode: page-based memory types"
extension, which allows pages to be marked as non-cacheable and/or I/O.
This also includes support for the Allwinner D1's page table attributes
via the alternatives framework, which differ from Svpbmt in various ways
but are necessary to make the D1 function.

* palmer/riscv-d1:
  riscv: add memory-type errata for T-Head
  riscv: don't use global static vars to store alternative data
  riscv: remove FIXMAP_PAGE_IO and fall back to its default value
  riscv: add RISC-V Svpbmt extension support
  riscv: Fix accessing pfn bits in PTEs for non-32bit variants
  riscv: move boot alternatives to after fill_hwcap
  riscv: prevent compressed instructions in alternatives
  riscv: extend concatenated alternatives-lines to the same length
  riscv: implement ALTERNATIVE_2 macro
  riscv: implement module alternatives
  riscv: allow different stages with alternatives
  riscv: integrate alternatives better into the main architecture
2022-05-12 09:12:09 -07:00
Heiko Stuebner
a35707c3d8
riscv: add memory-type errata for T-Head
Some current cpus based on T-Head cores implement memory-types
way different than described in the svpbmt spec even going
so far as using PTE bits marked as reserved.

Add the T-Head vendor-id and necessary errata code to
replace the affected instructions.

Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Samuel Holland <samuel@sholland.org>
Link: https://lore.kernel.org/r/20220511192921.2223629-13-heiko@sntech.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-05-11 21:36:33 -07:00
Nick Kossifidis
c6fe81191b
RISC-V: relocate DTB if it's outside memory region
In case the DTB provided by the bootloader/BootROM is before the kernel
image or outside /memory, we won't be able to access it through the
linear mapping, and get a segfault on setup_arch(). Currently OpenSBI
relocates DTB but that's not always the case (e.g. if FW_JUMP_FDT_ADDR
is not specified), and it's also not the most portable approach since
the default FW_JUMP_FDT_ADDR of the generic platform relocates the DTB
at a specific offset that may not be available. To avoid this situation
copy DTB so that it's visible through the linear mapping.

Signed-off-by: Nick Kossifidis <mick@ics.forth.gr>
Link: https://lore.kernel.org/r/20220322132839.3653682-1-mick@ics.forth.gr
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Fixes: f105aa940e ("riscv: add BUILTIN_DTB support for MMU-enabled targets")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-04-29 07:59:18 -07:00
Anup Patel
d5fdade933
RISC-V: mm: Fix set_satp_mode() for platform not having Sv57
When Sv57 is not available the satp.MODE test in set_satp_mode() will
fail and lead to pgdir re-programming for Sv48. The pgdir re-programming
will fail as well due to pre-existing pgdir entry used for Sv57 and as
a result kernel fails to boot on RISC-V platform not having Sv57.

To fix above issue, we should clear the pgdir memory in set_satp_mode()
before re-programming.

Fixes: 011f09d120 ("riscv: mm: Set sv57 on defaultly")
Reported-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-04-21 15:10:39 -07:00
Chuanhua Han
7da9ca3f5b
riscv: mm: Remove the copy operation of pmd
Since all processes share the kernel address space,
we only need to copy pgd in case of a vmalloc page
fault exception, the other levels of page tables are
shared, so the operation of copying pmd is unnecessary.

Signed-off-by: Chuanhua Han <hanchuanhua@oppo.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-04-20 17:36:58 -07:00
Christoph Hellwig
c6af2aa9ff swiotlb: make the swiotlb_init interface more useful
Pass a boolean flag to indicate if swiotlb needs to be enabled based on
the addressing needs, and replace the verbose argument with a set of
flags, including one to force enable bounce buffering.

Note that this patch removes the possibility to force xen-swiotlb use
with the swiotlb=force parameter on the command line on x86 (arm and
arm64 never supported that), but this interface will be restored shortly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2022-04-18 07:21:11 +02:00
Linus Torvalds
aa5b537b0e RISC-V Patches for the 5.18 Merge Window, Part 1
* Support for Sv57-based virtual memory.
 * Various improvements for the MicroChip PolarFire SOC and the
   associated Icicle dev board, which should allow upstream kernels to
   boot without any additional modifications.
 * An improved memmove() implementation.
 * Support for the new Ssconfpmf and SBI PMU extensions, which allows for
   a much more useful perf implementation on RISC-V systems.
 * Support for restartable sequences.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmI96FcTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiQBFD/425+6xmoOru6Wiki3Ja0fqQToNrQyW
 IbmE/8AxUP7UxMvJSNzvQm8deXgklzvmegXCtnjwZZins971vMzzDSI83k/zn8I7
 m5thVC9z01BjodV+pvIp/44hS6FesolOLzkVHksX0Zh6h0iidrc34Qf5HrqvvNfN
 CZ/4K1+E9ig5r9qZp4WdvocCXj+FzwF/30GjKoW9vwA599CEG/dCo+TNN9GKD6XS
 k+xiUGwlIRA+kCLSPFCi7ev9XPr1tCmQB7uB8Igcvr7Y3mWl8HKfajQVXBnXNRC3
 ifbDxpx1elJiLPyf7Rza8jIDwDhLQdxBiwPgDgP9h9R4x0uF4efq8PzLzFlFmaE+
 9Z9thfykBb5dXYDFDje9bAOXvKnGk7Iqxdsz0qWo/ChEQawX1+11bJb0TNN8QTT9
 YvlQfUXgb1dmEcj5yG2uVE1Y8L7YNLRMsZU3W3FbmPJZoavSOuU4x0yCGeLyv597
 76af3nuBJ5v80Db97gu6St+HIACeevKflsZUf/8GS/p7d1DlvmrWzQUMEycxPTG9
 UZpZak58jh7AqQ9JbLnavhwmeacY50vpZOw6QHGAHSN+8daCPlOHDG7Ver7Z+kNj
 +srJ7iKMvLnnaEjGNgavfxdqTOme1gv4LWs/JdHYMkpphqVN92xBDJnhXTPRVZiQ
 0x39vK86qtB46A==
 =Omc6
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.18-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for Sv57-based virtual memory.

 - Various improvements for the MicroChip PolarFire SOC and the
   associated Icicle dev board, which should allow upstream kernels to
   boot without any additional modifications.

 - An improved memmove() implementation.

 - Support for the new Ssconfpmf and SBI PMU extensions, which allows
   for a much more useful perf implementation on RISC-V systems.

 - Support for restartable sequences.

* tag 'riscv-for-linus-5.18-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (36 commits)
  rseq/selftests: Add support for RISC-V
  RISC-V: Add support for restartable sequence
  MAINTAINERS: Add entry for RISC-V PMU drivers
  Documentation: riscv: Remove the old documentation
  RISC-V: Add sscofpmf extension support
  RISC-V: Add perf platform driver based on SBI PMU extension
  RISC-V: Add RISC-V SBI PMU extension definitions
  RISC-V: Add a simple platform driver for RISC-V legacy perf
  RISC-V: Add a perf core library for pmu drivers
  RISC-V: Add CSR encodings for all HPMCOUNTERS
  RISC-V: Remove the current perf implementation
  RISC-V: Improve /proc/cpuinfo output for ISA extensions
  RISC-V: Do no continue isa string parsing without correct XLEN
  RISC-V: Implement multi-letter ISA extension probing framework
  RISC-V: Extract multi-letter extension names from "riscv, isa"
  RISC-V: Minimal parser for "riscv, isa" strings
  RISC-V: Correctly print supported extensions
  riscv: Fixed misaligned memory access. Fixed pointer comparison.
  MAINTAINERS: update riscv/microchip entry
  riscv: dts: microchip: add new peripherals to icicle kit device tree
  ...
2022-03-25 10:11:38 -07:00
Jisheng Zhang
d414cb379a riscv: mm: init: use IS_ENABLED(CONFIG_KEXEC_CORE) instead of #ifdef
Replace the conditional compilation using "#ifdef CONFIG_KEXEC_CORE" by a
check for "IS_ENABLED(CONFIG_KEXEC_CORE)", to simplify the code and
increase compile coverage.

Link: https://lkml.kernel.org/r/20211206160514.2000-3-jszhang@kernel.org
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-23 19:00:34 -07:00
Alexandre Ghiti
e4fcfe6eca
riscv: Fix kasan pud population
In sv48, the kasan inner regions are not aligned on PGDIR_SIZE and then
when we populate the kasan linear mapping region, we clear the kasan
vmalloc region which is in the same PGD.

Fix this by copying the content of the kasan early pud after allocating a
new PGD for the first time.

Fixes: e8a62cc26d ("riscv: Implement sv48 support")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-03 15:34:29 -08:00
Alexandre Ghiti
625e24a550
riscv: Move high_memory initialization to setup_bootmem
high_memory used to be initialized in mem_init, way after setup_bootmem.
But a call to dma_contiguous_reserve in this function gives rise to the
below warning because high_memory is equal to 0 and is used at the very
beginning at cma_declare_contiguous_nid.

It went unnoticed since the move of the kasan region redefined
KERN_VIRT_SIZE so that it does not encompass -1 anymore.

Fix this by initializing high_memory in setup_bootmem.

------------[ cut here ]------------
virt_to_phys used for non-linear address: ffffffffffffffff (0xffffffffffffffff)
WARNING: CPU: 0 PID: 0 at arch/riscv/mm/physaddr.c:14 __virt_to_phys+0xac/0x1b8
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.17.0-rc1-00007-ga68b89289e26 #27
Hardware name: riscv-virtio,qemu (DT)
epc : __virt_to_phys+0xac/0x1b8
 ra : __virt_to_phys+0xac/0x1b8
epc : ffffffff80014922 ra : ffffffff80014922 sp : ffffffff84a03c30
 gp : ffffffff85866c80 tp : ffffffff84a3f180 t0 : ffffffff86bce657
 t1 : fffffffef09406e8 t2 : 0000000000000000 s0 : ffffffff84a03c70
 s1 : ffffffffffffffff a0 : 000000000000004f a1 : 00000000000f0000
 a2 : 0000000000000002 a3 : ffffffff8011f408 a4 : 0000000000000000
 a5 : 0000000000000000 a6 : 0000000000f00000 a7 : ffffffff84a03747
 s2 : ffffffd800000000 s3 : ffffffff86ef4000 s4 : ffffffff8467f828
 s5 : fffffff800000000 s6 : 8000000000006800 s7 : 0000000000000000
 s8 : 0000000480000000 s9 : 0000000080038ea0 s10: 0000000000000000
 s11: ffffffffffffffff t3 : ffffffff84a035c0 t4 : fffffffef09406e8
 t5 : fffffffef09406e9 t6 : ffffffff84a03758
status: 0000000000000100 badaddr: 0000000000000000 cause: 0000000000000003
[<ffffffff8322ef4c>] cma_declare_contiguous_nid+0xf2/0x64a
[<ffffffff83212a58>] dma_contiguous_reserve_area+0x46/0xb4
[<ffffffff83212c3a>] dma_contiguous_reserve+0x174/0x18e
[<ffffffff83208fc2>] paging_init+0x12c/0x35e
[<ffffffff83206bd2>] setup_arch+0x120/0x74e
[<ffffffff83201416>] start_kernel+0xce/0x68c
irq event stamp: 0
hardirqs last  enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<0000000000000000>] 0x0
softirqs last  enabled at (0): [<0000000000000000>] 0x0
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace 0000000000000000 ]---

Fixes: f7ae02333d ("riscv: Move KASAN mapping next to the kernel mapping")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-03 15:34:12 -08:00
Alexandre Ghiti
c648c4bb7d
riscv: Fix config KASAN && DEBUG_VIRTUAL
__virt_to_phys function is called very early in the boot process (ie
kasan_early_init) so it should not be instrumented by KASAN otherwise it
bugs.

Fix this by declaring phys_addr.c as non-kasan instrumentable.

Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Fixes: 8ad8b72721 (riscv: Add KASAN support)
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-03 15:32:41 -08:00
Alexandre Ghiti
5f763b3b59
riscv: Fix DEBUG_VIRTUAL false warnings
KERN_VIRT_SIZE used to encompass the kernel mapping before it was
redefined when moving the kasan mapping next to the kernel mapping to only
match the maximum amount of physical memory.

Then, kernel mapping addresses that go through __virt_to_phys are now
declared as wrong which is not true, one can use __virt_to_phys on such
addresses.

Fix this by redefining the condition that matches wrong addresses.

Fixes: f7ae02333d ("riscv: Move KASAN mapping next to the kernel mapping")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-03 15:32:04 -08:00
Alexandre Ghiti
a3d3280378
riscv: Fix config KASAN && SPARSEMEM && !SPARSE_VMEMMAP
In order to get the pfn of a struct page* when sparsemem is enabled
without vmemmap, the mem_section structures need to be initialized which
happens in sparse_init.

But kasan_early_init calls pfn_to_page way before sparse_init is called,
which then tries to dereference a null mem_section pointer.

Fix this by removing the usage of this function in kasan_early_init.

Fixes: 8ad8b72721 ("riscv: Add KASAN support")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-03 13:11:30 -08:00
Palmer Dabbelt
9195c294bc
RISC-V: Add Sv57 page table support
This implements Sv57 support at runtime. The kernel will try to boot
with 5-level page table firstly , and will fallback to 4-level if the HW
does not support it. And it will finally fallback to 3-level if the HW
alse does not support sv48.

* riscv-sv57:
  riscv: mm: Support kasan for sv57
  riscv: mm: Set sv57 on defaultly
  riscv: mm: Prepare pt_ops helper functions for sv57
  riscv: mm: Control p4d's folding by pgtable_l5_enabled
2022-02-22 09:40:52 -08:00
Qinglin Pan
8fbdccd2b1
riscv: mm: Support kasan for sv57
This patchset add kasan_populate and kasan_shallow_populate for sv57,
and is tested on both qemu and unmatched with CONFIG_KASAN and
CONFIG_KASAN_VMALLOC on.

Signed-off-by: Qinglin Pan <panqinglin2020@iscas.ac.cn>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-14 16:32:47 -08:00
Qinglin Pan
011f09d120
riscv: mm: Set sv57 on defaultly
This patch sets sv57 on defaultly if CONFIG_64BIT. And do fallback to try
to set sv48 on boot time if sv57 is not supported in current hardware.

Signed-off-by: Qinglin Pan <panqinglin2020@iscas.ac.cn>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-14 16:32:45 -08:00
Qinglin Pan
677b9eb881
riscv: mm: Prepare pt_ops helper functions for sv57
This patch prepare some pt_ops helper functions which will be used in
creating sv57 mappings during boot time.

Signed-off-by: Qinglin Pan <panqinglin2020@iscas.ac.cn>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-14 16:32:42 -08:00
Qinglin Pan
d10efa21a9
riscv: mm: Control p4d's folding by pgtable_l5_enabled
To determine pgtable level at boot time, we can not use helper functions
in include/asm-generic/pgtable-nop4d.h and must implement these
functions. This patch uses pgtable_l5_enabled variable instead of
including pgtable-nop4d.h to controle p4d's folding, and implements
corresponding helper functions.

Signed-off-by: Qinglin Pan <panqinglin2020@iscas.ac.cn>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-14 16:32:39 -08:00
Jisheng Zhang
67ff2f2626
riscv: mm: init: mark satp_mode __ro_after_init
satp_mode is never modified after init, so it can be marked as
__ro_after_init.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-14 15:41:41 -08:00
Jisheng Zhang
f81393a5b2
riscv: extable: fix err reg writing in dedicated uaccess handler
Mayuresh reported commit 20802d8d47 ("riscv: extable: add a dedicated
uaccess handler") breaks the writev02 test case in LTP. This is due to
the err reg isn't correctly set with the errno(-EFAULT in writev02
case). First of all, the err and zero regs are reg numbers rather than
reg offsets in struct pt_regs; Secondly, regs_set_gpr() should write
the regs when offset isn't zero(zero means epc)

Fix it by correcting regs_set_gpr() logic and passing the correct reg
offset to it.

Reported-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Fixes: 20802d8d47 ("riscv: extable: add a dedicated uaccess handler")
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-08 17:02:47 -08:00
Palmer Dabbelt
ca0cb9a60f
riscv/mm: Add XIP_FIXUP for riscv_pfn_base
This manifests as a crash early in boot on VexRiscv.

Signed-off-by: Myrtle Shah <gatecat@ds0.me>
[Palmer: split commit]
Fixes: 44c9225729 ("RISC-V: enable XIP")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-04 13:27:23 -08:00
Palmer Dabbelt
4b1c70aa8e
riscv/mm: Add XIP_FIXUP for phys_ram_base
This manifests as a crash early in boot on VexRiscv.

Signed-off-by: Myrtle Shah <gatecat@ds0.me>
[Palmer: split commit]
Fixes: 6d7f91d914 ("riscv: Get rid of CONFIG_PHYS_RAM_BASE in kernel physical address conversion")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-04 13:18:56 -08:00
Atish Patra
26fb751ca3
RISC-V: Do not use cpumask data structure for hartid bitmap
Currently, SBI APIs accept a hartmask that is generated from struct
cpumask. Cpumask data structure can hold upto NR_CPUs value. Thus, it
is not the correct data structure for hartids as it can be higher
than NR_CPUs for platforms with sparse or discontguous hartids.

Remove all association between hartid mask and struct cpumask.

Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-20 09:27:22 -08:00
kernel test robot
20aa49541a
riscv: fix boolconv.cocci warnings
arch/riscv/mm/init.c:48:11-16: WARNING: conversion to bool not needed here

 Remove unneeded conversion to bool

Semantic patch information:
 Relational and logical operators evaluate to bool,
 explicit conversion is overly verbose and unneeded.

Generated by: scripts/coccinelle/misc/boolconv.cocci

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 19:52:30 -08:00
Palmer Dabbelt
0c34e79e52
RISC-V: Introduce sv48 support without relocatable kernel
This patchset allows to have a single kernel for sv39 and sv48 without
being relocatable.

The idea comes from Arnd Bergmann who suggested to do the same as x86,
that is mapping the kernel to the end of the address space, which allows
the kernel to be linked at the same address for both sv39 and sv48 and
then does not require to be relocated at runtime.

This implements sv48 support at runtime. The kernel will try to boot
with 4-level page table and will fallback to 3-level if the HW does not
support it. Folding the 4th level into a 3-level page table has almost
no cost at runtime.

Note that kasan region had to be moved to the end of the address space
since its location must be known at compile-time and then be valid for
both sv39 and sv48 (and sv57 that is coming).

* riscv-sv48-v3:
  riscv: Explicit comment about user virtual address space size
  riscv: Use pgtable_l4_enabled to output mmu_type in cpuinfo
  riscv: Implement sv48 support
  asm-generic: Prepare for riscv use of pud_alloc_one and pud_free
  riscv: Allow to dynamically define VA_BITS
  riscv: Introduce functions to switch pt_ops
  riscv: Split early kasan mapping to prepare sv48 introduction
  riscv: Move KASAN mapping next to the kernel mapping
  riscv: Get rid of MAXPHYSMEM configs

Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 19:37:44 -08:00
Alexandre Ghiti
e8a62cc26d
riscv: Implement sv48 support
By adding a new 4th level of page table, give the possibility to 64bit
kernel to address 2^48 bytes of virtual address: in practice, that offers
128TB of virtual address space to userspace and allows up to 64TB of
physical memory.

If the underlying hardware does not support sv48, we will automatically
fallback to a standard 3-level page table by folding the new PUD level into
PGDIR level. In order to detect HW capabilities at runtime, we
use SATP feature that ignores writes with an unsupported mode.

Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 17:54:09 -08:00
Alexandre Ghiti
840125a97a
riscv: Introduce functions to switch pt_ops
This simply gathers the different pt_ops initialization in functions
where a comment was added to explain why the page table operations must
be changed along the boot process.

Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 17:54:06 -08:00
Alexandre Ghiti
2efad17e57
riscv: Split early kasan mapping to prepare sv48 introduction
Now that kasan shadow region is next to the kernel, for sv48, this
region won't be aligned on PGDIR_SIZE and then when populating this
region, we'll need to get down to lower levels of the page table. So
instead of reimplementing the page table walk for the early population,
take advantage of the existing functions used for the final population.

Note that kasan swapper initialization must also be split since memblock
is not initialized at this point and as the last PGD is shared with the
kernel, we'd need to allocate a PUD so postpone the kasan final
population after the kernel population is done.

Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 17:54:05 -08:00
Alexandre Ghiti
f7ae02333d
riscv: Move KASAN mapping next to the kernel mapping
Now that KASAN_SHADOW_OFFSET is defined at compile time as a config,
this value must remain constant whatever the size of the virtual address
space, which is only possible by pushing this region at the end of the
address space next to the kernel mapping.

Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 17:54:04 -08:00
Jisheng Zhang
805a3ebed5
riscv: mm: init: try best to remove #ifdef CONFIG_XIP_KERNEL usage
Currently, the #ifdef CONFIG_XIP_KERNEL usage can be divided into the
following three types:

The first one is for functions/declarations only used in XIP case.

The second one is for XIP_FIXUP case. Something as below:
|foo_type foo;
|#ifdef CONFIG_XIP_KERNEL
|#define foo    (*(foo_type *)XIP_FIXUP(&foo))
|#endif

Usually, it's better to let the foo macro sit with the foo var
together. But if various foos are defined adjacently, we can
save some #ifdef CONFIG_XIP_KERNEL usage by grouping them together.

The third one is for different implementations for XIP, usually, this
is a #ifdef...#else...#endif case.

This patch moves the pt_ops macro to adjacent #ifdef CONFIG_XIP_KERNEL
and group first type usage cases into one.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 09:56:35 -08:00
Jisheng Zhang
fe036db7d8
riscv: mm: init: try IS_ENABLED(CONFIG_XIP_KERNEL) instead of #ifdef
Try our best to replace the conditional compilation using
"#ifdef CONFIG_XIP_KERNEL" with "IS_ENABLED(CONFIG_XIP_KERNEL)", to
simplify the code and to increase compile coverage.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 09:56:33 -08:00
Jisheng Zhang
3274a6ef3b
riscv: mm: init: remove _pt_ops and use pt_ops directly
Except "pt_ops", other global vars when CONFIG_XIP_KERNEL=y is defined
as below:

|foo_type foo;
|#ifdef CONFIG_XIP_KERNEL
|#define foo	(*(foo_type *)XIP_FIXUP(&foo))
|#endif

Follow the same way for pt_ops to unify the style and to simplify code.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 09:56:32 -08:00
Jisheng Zhang
07aabe8fb6
riscv: mm: init: try best to use IS_ENABLED(CONFIG_64BIT) instead of #ifdef
Try our best to replace the conditional compilation using
"#ifdef CONFIG_64BIT" by a check for "IS_ENABLED(CONFIG_64BIT)", to
simplify the code and to increase compile coverage.

Now we can also remove the __maybe_unused used in max_mapped_addr
declaration.

We also remove the BUG_ON check of mapping the last 4K bytes of the
addressable memory since this is always true for every kernel actually.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 09:56:31 -08:00
Jisheng Zhang
902d6364aa
riscv: mm: init: remove unnecessary "#ifdef CONFIG_CRASH_DUMP"
The is_kdump_kernel() returns false for !CRASH_DUMP case, so we don't
need the #ifdef CONFIG_CRASH_DUMP for is_kdump_kernel() checking.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 09:56:30 -08:00
Linus Torvalds
f1b744f65e RISC-V Patches for the 5.17 Merge Window, Part 1
* Support for the DA9063 as used on the HiFive Unmatched.
 * Support for relative extables, which puts us in line with other
   architectures and save some space in vmlinux.
 * A handful of kexec fixes/improvements, including the ability to run
   crash kernels from PCI-addressable memory on the HiFive Unmatched.
 * Support for the SBI SRST extension, which allows systems that do not
   have an explicit driver in Linux to reboot.
 * A handful of fixes and cleanups, including to the defconfigs and
   device trees.
 
 ---
 This time I do expect to have a part 2, as there's still some smaller
 patches on the list.  I was hoping to get through more of that over the
 weekend, but I got distracted with the ABI issues.  Figured it's better
 to send this sooner rather than waiting.
 
 Included are my merge resolutions against a master from this morning, if
 that helps any:
 
 diff --cc arch/riscv/include/asm/sbi.h
 index 289621da4a2a,9c46dd3ff4a2..000000000000
 --- a/arch/riscv/include/asm/sbi.h
 +++ b/arch/riscv/include/asm/sbi.h
 @@@ -27,7 -27,14 +27,15 @@@ enum sbi_ext_id
         SBI_EXT_IPI = 0x735049,
         SBI_EXT_RFENCE = 0x52464E43,
         SBI_EXT_HSM = 0x48534D,
  +      SBI_EXT_SRST = 0x53525354,
 +
 +       /* Experimentals extensions must lie within this range */
 +       SBI_EXT_EXPERIMENTAL_START = 0x08000000,
 +       SBI_EXT_EXPERIMENTAL_END = 0x08FFFFFF,
 +
 +       /* Vendor extensions must lie within this range */
 +       SBI_EXT_VENDOR_START = 0x09000000,
 +       SBI_EXT_VENDOR_END = 0x09FFFFFF,
   };
 
   enum sbi_ext_base_fid {
 diff --git a/arch/riscv/boot/dts/sifive/hifive-unmatched-a00.dts b/arch/riscv/boot/dts/sifive/hifive-unmatched-a00.dts
 index e03a4c94cf3f..6bfa1f24d3de 100644
 --- a/arch/riscv/boot/dts/sifive/hifive-unmatched-a00.dts
 +++ b/arch/riscv/boot/dts/sifive/hifive-unmatched-a00.dts
 @@ -188,14 +188,6 @@ vdd_ldo11: ldo11 {
                                 regulator-always-on;
                         };
                 };
 -
 -               rtc {
 -                       compatible = "dlg,da9063-rtc";
 -               };
 -
 -               wdt {
 -                       compatible = "dlg,da9063-watchdog";
 -               };
         };
  };
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmHnDV4THHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiaGWD/wOMHVLrkLZDxKHY3lFU7S7FanpFgcU
 L265fgKtoG/QOI9WPuQlN7pYvrC4ssUvtQ23WwZ+iz4pJlUwoMb2TAqBBeTXxEbW
 pVF2QqnlPdv2ZEn95MFxZ0HQB2+xgJKPL5gdD6Iz7oe2378lf7tywSF7MYpxG/AA
 CeHUxzhEPhQJntufTievMhvYpM7ZyhCr19ZAHXRaPoGReJK5ZMCeYHGTrHD4EisG
 hO/Pg2vx/Ynxi/vb/C69kpTBvu4Qsxnbhgfy1SowrO3FhxcZTbyrZ6l8uRxSAHIg
 dA0NLPh/YDQCPXYnphQcLo+Q9Gy4Sz5es7ULnnMyyEOZxoVyy4up3rCAFAL3Ubav
 CNQdk/ZWtrZ+s4chilA1kW97apxocvmq5ULg+7Hi58ZUzk+y7MQBVCClohyONVEU
 /leJzJ3nq3YHFgfo8Uh7L+iPzlNgycfi4gRnGJIkEVRhXBPTfJ/Pc5wjPoPVsFvt
 pjEYT4YaXITZ0QBLdcuPex5h3PXkRsORsZl8eJGnIz8742KA4tfFraZ4BkbrjoqC
 tLsi7Si9hN3JKhLsNgclb76tDkoz4CY7yZ7TT7hRbKdZZJkVRu1XqUq75X18CVQv
 9p7Q7j1b5H3Z+/5KOxwS0UO73y92yvyVvi0cLqBoD2Tkeq3beumxmy50Qy+O+h1D
 Ut7GwcyavzfS8Q==
 =uqtf
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.17-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for the DA9063 as used on the HiFive Unmatched.

 - Support for relative extables, which puts us in line with other
   architectures and save some space in vmlinux.

 - A handful of kexec fixes/improvements, including the ability to run
   crash kernels from PCI-addressable memory on the HiFive Unmatched.

 - Support for the SBI SRST extension, which allows systems that do not
   have an explicit driver in Linux to reboot.

 - A handful of fixes and cleanups, including to the defconfigs and
   device trees.

* tag 'riscv-for-linus-5.17-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (52 commits)
  RISC-V: Use SBI SRST extension when available
  riscv: mm: fix wrong phys_ram_base value for RV64
  RISC-V: Use common riscv_cpuid_to_hartid_mask() for both SMP=y and SMP=n
  riscv: head: remove useless __PAGE_ALIGNED_BSS and .balign
  riscv: errata: alternative: mark vendor_patch_func __initdata
  riscv: head: make secondary_start_common() static
  riscv: remove cpu_stop()
  riscv: try to allocate crashkern region from 32bit addressible memory
  riscv: use hart id instead of cpu id on machine_kexec
  riscv: Don't use va_pa_offset on kdump
  riscv: dts: sifive: fu540-c000: Fix PLIC node
  riscv: dts: sifive: fu540-c000: Drop bogus soc node compatible values
  riscv: dts: sifive: Group tuples in register properties
  riscv: dts: sifive: Group tuples in interrupt properties
  riscv: dts: microchip: mpfs: Group tuples in interrupt properties
  riscv: dts: microchip: mpfs: Fix clock controller node
  riscv: dts: microchip: mpfs: Fix reference clock node
  riscv: dts: microchip: mpfs: Fix PLIC node
  riscv: dts: microchip: mpfs: Drop empty chosen node
  riscv: dts: canaan: Group tuples in interrupt properties
  ...
2022-01-19 11:38:21 +02:00
Linus Torvalds
35ce8ae9ae Merge branch 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull signal/exit/ptrace updates from Eric Biederman:
 "This set of changes deletes some dead code, makes a lot of cleanups
  which hopefully make the code easier to follow, and fixes bugs found
  along the way.

  The end-game which I have not yet reached yet is for fatal signals
  that generate coredumps to be short-circuit deliverable from
  complete_signal, for force_siginfo_to_task not to require changing
  userspace configured signal delivery state, and for the ptrace stops
  to always happen in locations where we can guarantee on all
  architectures that the all of the registers are saved and available on
  the stack.

  Removal of profile_task_ext, profile_munmap, and profile_handoff_task
  are the big successes for dead code removal this round.

  A bunch of small bug fixes are included, as most of the issues
  reported were small enough that they would not affect bisection so I
  simply added the fixes and did not fold the fixes into the changes
  they were fixing.

  There was a bug that broke coredumps piped to systemd-coredump. I
  dropped the change that caused that bug and replaced it entirely with
  something much more restrained. Unfortunately that required some
  rebasing.

  Some successes after this set of changes: There are few enough calls
  to do_exit to audit in a reasonable amount of time. The lifetime of
  struct kthread now matches the lifetime of struct task, and the
  pointer to struct kthread is no longer stored in set_child_tid. The
  flag SIGNAL_GROUP_COREDUMP is removed. The field group_exit_task is
  removed. Issues where task->exit_code was examined with
  signal->group_exit_code should been examined were fixed.

  There are several loosely related changes included because I am
  cleaning up and if I don't include them they will probably get lost.

  The original postings of these changes can be found at:
     https://lkml.kernel.org/r/87a6ha4zsd.fsf@email.froward.int.ebiederm.org
     https://lkml.kernel.org/r/87bl1kunjj.fsf@email.froward.int.ebiederm.org
     https://lkml.kernel.org/r/87r19opkx1.fsf_-_@email.froward.int.ebiederm.org

  I trimmed back the last set of changes to only the obviously correct
  once. Simply because there was less time for review than I had hoped"

* 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (44 commits)
  ptrace/m68k: Stop open coding ptrace_report_syscall
  ptrace: Remove unused regs argument from ptrace_report_syscall
  ptrace: Remove second setting of PT_SEIZED in ptrace_attach
  taskstats: Cleanup the use of task->exit_code
  exit: Use the correct exit_code in /proc/<pid>/stat
  exit: Fix the exit_code for wait_task_zombie
  exit: Coredumps reach do_group_exit
  exit: Remove profile_handoff_task
  exit: Remove profile_task_exit & profile_munmap
  signal: clean up kernel-doc comments
  signal: Remove the helper signal_group_exit
  signal: Rename group_exit_task group_exec_task
  coredump: Stop setting signal->group_exit_task
  signal: Remove SIGNAL_GROUP_COREDUMP
  signal: During coredumps set SIGNAL_GROUP_EXIT in zap_process
  signal: Make coredump handling explicit in complete_signal
  signal: Have prepare_signal detect coredumps using signal->core_state
  signal: Have the oom killer detect coredumps using signal->core_state
  exit: Move force_uaccess back into do_exit
  exit: Guarantee make_task_dead leaks the tsk when calling do_task_exit
  ...
2022-01-17 05:49:30 +02:00
Qi Zheng
36ef159f44 mm: remove redundant check about FAULT_FLAG_ALLOW_RETRY bit
Since commit 4064b98270 ("mm: allow VM_FAULT_RETRY for multiple
times") allowed VM_FAULT_RETRY for multiple times, the
FAULT_FLAG_ALLOW_RETRY bit of fault_flag will not be changed in the page
fault path, so the following check is no longer needed:

	flags & FAULT_FLAG_ALLOW_RETRY

So just remove it.

[akpm@linux-foundation.org: coding style fixes]

Link: https://lkml.kernel.org/r/20211110123358.36511-1-zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Kirill Shutemov <kirill@shutemov.name>
Cc: Peter Xu <peterx@redhat.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-15 16:30:27 +02:00
Jisheng Zhang
b0fd4b1bf9
riscv: mm: fix wrong phys_ram_base value for RV64
Currently, if 64BIT and !XIP_KERNEL, the phys_ram_base is always 0,
no matter the real start of dram reported by memblock is.

Fixes: 6d7f91d914 ("riscv: Get rid of CONFIG_PHYS_RAM_BASE in kernel physical address conversion")
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Alexandre Ghiti <alex@ghiti.fr>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-09 13:51:48 -08:00
Nick Kossifidis
decf89f86e
riscv: try to allocate crashkern region from 32bit addressible memory
When allocating crash kernel region without explicitly specifying its
base address/size, memblock_phys_alloc_range will attempt to allocate
memory top to bottom (memblock.bottom_up is false), so the crash
kernel region will end up in highmem on 64bit systems. This way
swiotlb can't work on the crash kernel, since there won't be any
32bit addressible memory available for the bounce buffers.

Try to allocate 32bit addressible memory if available, for the
crash kernel by restricting the top search address to be less
than SZ_4G. If that fails fallback to the previous behavior.

I tested this on HiFive Unmatched where the pci-e controller needs
swiotlb to work, with this patch it's possible to access the pci-e
controller on crash kernel and mount the rootfs from the nvme.

Signed-off-by: Nick Kossifidis <mick@ics.forth.gr>
Fixes: e53d28180d ("RISC-V: Add kdump support")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-09 10:40:14 -08:00
Kefeng Wang
5a7ac592c5
riscv: mm: Enable PMD split page table lock for RV64
After commit 1355c31eeb ("asm-generic: pgalloc: provide generic
pmd_alloc_one() and pmd_free_one()"), the main part to support
PMD split page table lock is in asm-generic/pgalloc.h.

The only change is add pgtable_pmd_page_ctor() into alloc_pmd_late(),
then we could enable ARCH_ENABLE_SPLIT_PMD_PTLOCK for RV64.

Reviewed-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-05 19:31:49 -08:00
Alexandre Ghiti
7cc8c75b54
riscv: Make vmalloc/vmemmap end equal to the start of the next region
We used to define VMALLOC_END equal to the start of the next region
*minus one* which is inconsistent with the use of this define in the
core code (for example, see the definitions of VMALLOC_TOTAL and
is_vmalloc_addr).

And then make the definition of VMEMMAP_END consistent with VMALLOC_END
and all other regions actually.

Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Reviewed-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-05 19:22:13 -08:00
Jisheng Zhang
20802d8d47
riscv: extable: add a dedicated uaccess handler
Inspired by commit 2e77a62cb3 ("arm64: extable: add a dedicated
uaccess handler"), do similar to riscv to add a dedicated uaccess
exception handler to update registers in exception context and
subsequently return back into the function which faulted, so we remove
the need for fixups specialized to each faulting instruction.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-05 17:53:29 -08:00
Jisheng Zhang
2bf847db0c
riscv: extable: add type and data fields
This is a riscv port of commit d6e2cc5647 ("arm64: extable: add `type`
and `data` fields").

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-05 17:52:54 -08:00
Jisheng Zhang
4c2e7ce8b9
riscv: extable: use ex for exception_table_entry
The var name "fixup" is a bit confusing, since this is a
exception_table_entry. Use "ex" instead  to refer to an entire entry.
In subsequent patches we'll use `fixup` to refer to the fixup
field specifically.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-05 17:52:34 -08:00
Jisheng Zhang
ef127bca11
riscv: extable: make fixup_exception() return bool
The return values of fixup_exception() and riscv_bpf_fixup_exception()
represent a boolean condition rather than an error code, so it's better
to return `bool` rather than `int`.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-05 17:52:29 -08:00
Jisheng Zhang
c07935cb3c
riscv: bpf: move rv_bpf_fixup_exception signature to extable.h
This is to group riscv related extable related functions signature
into one file.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-05 17:52:24 -08:00
Jisheng Zhang
bb1f85d604
riscv: switch to relative exception tables
Similar as other architectures such as arm64, x86 and so on, use
offsets relative to the exception table entry values rather than
absolute addresses for both the exception locationand the fixup.

However, RISCV label difference will actually produce two relocations,
a pair of R_RISCV_ADD32 and R_RISCV_SUB32. Take below simple code for
example:

$ cat test.S
.section .text
1:
        nop
.section __ex_table,"a"
        .balign 4
        .long (1b - .)
.previous

$ riscv64-linux-gnu-gcc -c test.S
$ riscv64-linux-gnu-readelf -r test.o
Relocation section '.rela__ex_table' at offset 0x100 contains 2 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000000  000600000023 R_RISCV_ADD32     0000000000000000 .L1^B1 + 0
000000000000  000500000027 R_RISCV_SUB32     0000000000000000 .L0  + 0

The modpost will complain the R_RISCV_SUB32 relocation, so we need to
patch modpost.c to skip this relocation for .rela__ex_table section.

After this patch, the __ex_table section size of defconfig vmlinux is
reduced from 7072 Bytes to 3536 Bytes.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-05 17:52:20 -08:00
Eric W. Biederman
0e25498f8c exit: Add and use make_task_dead.
There are two big uses of do_exit.  The first is it's design use to be
the guts of the exit(2) system call.  The second use is to terminate
a task after something catastrophic has happened like a NULL pointer
in kernel code.

Add a function make_task_dead that is initialy exactly the same as
do_exit to cover the cases where do_exit is called to handle
catastrophic failure.  In time this can probably be reduced to just a
light wrapper around do_task_dead. For now keep it exactly the same so
that there will be no behavioral differences introducing this new
concept.

Replace all of the uses of do_exit that use it for catastraphic
task cleanup with make_task_dead to make it clear what the code
is doing.

As part of this rename rewind_stack_do_exit
rewind_stack_and_make_dead.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-12-13 12:04:45 -06:00
Linus Torvalds
b89f311d7e RISC-V Patches for the 5.16 Merge Window, Part 1
* Support for time namespaces in the VDSO, along with some associated
   cleanups.
 * Support for building rv32 randconfigs.
 * Improvements to the XIP port that allow larger kernels to function
 * Various device tree cleanups for both the SiFive and Microchip boards
 * A handful of defconfig updates, including enabling Nouveau.
 
 There are also various small cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmGN9T4THHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiQEBD/9EvGe5jPb8binNiLN81ExYh3HGBJ9A
 sY+/rhVWnTUX/igBU4M9gNHu5Rts2ITlsBda9YnYRgMunAS1cnZJ/jbvckfsFA+s
 svehPgDZSR+BqBnVDN1hEsY+TrNbFFFZIApGDbImuC5+81np/p3u1zkl3hhO0amJ
 7T46qxRq4iazbuohi47/ba6ufXyLb8XBg3LTUfp+lTnH7/VLbwspPWdzNgiJpMzB
 qEWfYd4au2zfqHC6SYHXidy/tkquhJ6DgKq0G3+XUCugeennMeisDWI9GsxTAzIa
 Pynm9GHA/AuK1/kt9qB/Czsg7bqY8RUreUMLNZEyHzwKA1OWQVxlfCdl5zIpxlkq
 eJkxjbhhPuneVRGUSJgKQqBEUtGlH9yISqO2CFFdKRqzm1ImJtkAJGC+SubpKSbU
 D+ZAGbAyEwDUfl1yyKzKRg6YnzntMxh8sfNdJRYMrOhczk8L//R8yxG6SX6bjw7W
 lTQk7wkpo2yS7L799dukKdlllYbUvqavZwsIHTyb1TsRoavQeqoJq/6jKXJNhFHD
 rA/OQgCUXuHTmpgdDUhHuLZVvJqT5n8Vxigi30nIq24KOgTom8tRRtiAWOJv7G+Z
 b/Y3eq0VE8yjhPSbtsAxooZnpEVgXsVqx3t/lBt0MiJE1a3JLSSchRTHBjWllCiV
 +yvCerCh8PoT8w==
 =Mpen
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.16-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for time namespaces in the VDSO, along with some associated
   cleanups.

 - Support for building rv32 randconfigs.

 - Improvements to the XIP port that allow larger kernels to function

 - Various device tree cleanups for both the SiFive and Microchip boards

 - A handful of defconfig updates, including enabling Nouveau.

There are also various small cleanups.

* tag 'riscv-for-linus-5.16-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: defconfig: enable DRM_NOUVEAU
  riscv/vdso: Drop unneeded part due to merge issue
  riscv: remove .text section size limitation for XIP
  riscv: dts: sifive: add missing compatible for plic
  riscv: dts: microchip: add missing compatibles for clint and plic
  riscv: dts: sifive: drop duplicated nodes and properties in sifive
  riscv: dts: sifive: fix Unleashed board compatible
  riscv: dts: sifive: use only generic JEDEC SPI NOR flash compatible
  riscv: dts: microchip: use vendor compatible for Cadence SD4HC
  riscv: dts: microchip: drop unused pinctrl-names
  riscv: dts: microchip: drop duplicated MMC/SDHC node
  riscv: dts: microchip: fix board compatible
  riscv: dts: microchip: drop duplicated nodes
  dt-bindings: mmc: cdns: document Microchip MPFS MMC/SDHCI controller
  riscv: add rv32 and rv64 randconfig build targets
  riscv: mm: don't advertise 1 num_asid for 0 asid bits
  riscv: set default pm_power_off to NULL
  riscv/vdso: Add support for time namespaces
2021-11-13 09:15:42 -08:00
Björn Töpel
f47d4ffe3a riscv, bpf: Fix RV32 broken build, and silence RV64 warning
Commit 252c765bd7 ("riscv, bpf: Add BPF exception tables") only addressed
RV64, and broke the RV32 build [1]. Fix by gating the exception tables code
with CONFIG_ARCH_RV64I.

Further, silence a "-Wmissing-prototypes" warning [2] in the RV64 BPF JIT.

  [1] https://lore.kernel.org/llvm/202111020610.9oy9Rr0G-lkp@intel.com/
  [2] https://lore.kernel.org/llvm/202110290334.2zdMyRq4-lkp@intel.com/

Fixes: 252c765bd7 ("riscv, bpf: Add BPF exception tables")
Signed-off-by: Björn Töpel <bjorn@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Tong Tiangen <tongtiangen@huawei.com>
Link: https://lore.kernel.org/bpf/20211103115453.397209-1-bjorn@kernel.org
2021-11-05 16:52:34 +01:00