va_kernel_pa_offset was only used for 64-bit as the kernel mapping lies
in the linear mapping for 32-bit kernel and then only the offset between
the PAGE_OFFSET and the kernel load address is needed.
But this distinction complexifies the code with #ifdefs and especially
with a separate definition of the address conversions macros.
Simplify the code by defining this variable for both 32-bit and 64-bit.
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
The usage of CONFIG_PHYS_RAM_BASE for all kernel types was a mistake:
this value is implementation-specific and this breaks the genericity of
the RISC-V kernel.
Fix this by introducing a new variable phys_ram_base that holds this
value at runtime and use it in the kernel physical address conversion
macro. Since this value is used only for XIP kernels, evaluate it only if
CONFIG_XIP_KERNEL is set which in addition optimizes this macro for
standard kernels at compile-time.
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Tested-by: Emil Renner Berthing <kernel@esmil.dk>
Reviewed-by: Jisheng Zhang <jszhang@kernel.org>
Fixes: 44c9225729 ("RISC-V: enable XIP")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
The check that is done in setup_bootmem currently only works for 32-bit
kernel since the kernel mapping has been moved outside of the linear
mapping for 64-bit kernel. So make sure that for 64-bit kernel, the kernel
mapping does not overlap with the last 4K of the addressable memory.
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Fixes: 2bfc6cd81b ("riscv: Move kernel mapping outside of linear mapping")
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
For 64-bit kernel, the end of the address space is occupied by the
kernel mapping and currently, the functions to populate the kernel page
tables (i.e. create_p*d_mapping) do not override existing mapping so we
must make sure the linear mapping does not map memory in the kernel mapping
by clipping the memory above the memory limit.
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Fixes: c9811e379b ("riscv: Add mem kernel parameter support")
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
As described in Documentation/riscv/vm-layout.rst, the end of the
virtual address space for 64-bit kernel is occupied by the modules/BPF/
kernel mappings so this actually reduces the amount of memory we are able
to map and then use in the linear mapping. So make sure this limit is
correctly set.
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Fixes: 2bfc6cd81b ("riscv: Move kernel mapping outside of linear mapping")
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
This contains a single fix for 32-bit boot. It happens this was already
fixed by c9811e379b ("riscv: Add mem kernel parameter support"), but
the bug existed before that feature addition so I've applied the patch
earlier and then merged it in (which results in a conflict, which is
fixed via not changing the resulting tree).
* riscv/riscv-fix-32bit:
riscv: Fix 32-bit RISC-V boot failure
Commit dd2d082b57 ("riscv: Cleanup setup_bootmem()") adjusted
the calling sequence in setup_bootmem(), which invalidates the fix
commit de043da0b9 ("RISC-V: Fix usage of memblock_enforce_memory_limit")
did for 32-bit RISC-V unfortunately.
So now 32-bit RISC-V does not boot again when testing booting kernel
on QEMU 'virt' with '-m 2G', which was exactly what the original
commit de043da0b9 ("RISC-V: Fix usage of memblock_enforce_memory_limit")
tried to fix.
Fixes: dd2d082b57 ("riscv: Cleanup setup_bootmem()")
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
We have a lot of variables that are used to hold kernel mapping addresses,
offsets between physical and virtual mappings and some others used for XIP
kernels: they are all defined at different places in mm/init.c, so group
them into a single structure with, for some of them, more explicit and concise
names.
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
This contains both the short-term fix for the W+X boot mappings and the
larger cleanup.
* riscv-wx-mappings:
riscv: Map the kernel with correct permissions the first time
riscv: Introduce set_kernel_memory helper
riscv: Simplify xip and !xip kernel address conversion macros
riscv: Remove CONFIG_PHYS_RAM_BASE_FIXED
riscv: mm: Fix W+X mappings at boot
For 64-bit kernels, we map all the kernel with write and execute
permissions and afterwards remove writability from text and executability
from data.
For 32-bit kernels, the kernel mapping resides in the linear mapping, so we
map all the linear mapping as writable and executable and afterwards we
remove those properties for unused memory and kernel mapping as
described above.
Change this behavior to directly map the kernel with correct permissions
and avoid going through the whole mapping to fix the permissions.
At the same time, this fixes an issue introduced by commit 2bfc6cd81b
("riscv: Move kernel mapping outside of linear mapping") as reported
here https://github.com/starfive-tech/linux/issues/17.
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
The memblock_enforce_memory_limit() could change the memblock
range, so move the dram_end assignment after it in bootmem_init(),
then support mem= cmdline.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
The SWIOTLB buffer is not needed unless the physical address space
is beyond the limit of dma, only initialize swiotlb when swiotlb_force
is true or not all system memory is DMA-able.
Also move the swiotlb_init() into mem_init().
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Consolidate the following items in init.c
Staticize global vars as much as possible;
Add __initdata mark if the global var isn't needed after init
Add __init mark if the func isn't needed after init
Add __ro_after_init if the global var is read only after init
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
The _sdata/_edata is already in sections.h, drop redundant
declaration.
Also move _xiprom/_exiprom declarations at the beginning of
the file, cleanup one CONFIG_XIP_KERNEL.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
The empty_zero_page sits at .bss..page_aligned section, so will be
cleared to zero during clearing bss, we don't need to clear it again.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
The various uses of protect_kernel_linear_mapping_text_rodata() are
not consistent:
- Its definition depends on "64BIT && !XIP_KERNEL",
- Its forward declaration depends on MMU,
- Its single caller depends on "STRICT_KERNEL_RWX && 64BIT && MMU &&
!XIP_KERNEL".
Fix this by settling on the dependencies of the caller, which can be
simplified as STRICT_KERNEL_RWX depends on "MMU && !XIP_KERNEL".
Provide a dummy definition, as the caller is protected by
"IS_ENABLED(CONFIG_STRICT_KERNEL_RWX)" instead of "#ifdef
CONFIG_STRICT_KERNEL_RWX".
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
When the kernel mapping was moved outside of the linear mapping, the
kernel memory reservation was increased, to take into account mapping
granularity. However, this is done unconditionally, regardless of
whether the kernel memory is mapped read-only or not.
If this extension is not needed, up to 2 MiB may be lost, which has a
big impact on e.g. Canaan K210 (64-bit nommu) platforms with only 8 MiB
of RAM.
Reclaim the lost memory by only extending the reserved region when
needed, i.e. depending on a simplified version of the conditional logic
around the call to protect_kernel_linear_mapping_text_rodata().
Fixes: 2bfc6cd81b ("riscv: Move kernel mapping outside of linear mapping")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
* Support for the memtest= kernel command-line argument.
* Support for building the kernel with FORTIFY_SOURCE.
* Support for generic clockevent broadcasts.
* Support for the buildtar build target.
* Some build system cleanups to pass more LLVM-friendly arguments.
* Support for kprobes.
* A rearranged kernel memory map, the first part of supporting sv48
systems.
* Improvements to kexec, along with support for kdump and crash kernels.
* An alternatives-based errata framework, along with support for
handling a pair of errata that manifest on some SiFive designs
(including the HiFive Unmatched).
* Support for XIP.
* A device tree for the Microchip PolarFire ICICLE SoC and associated
dev board.
Along with a bunch of cleanups. There are already a handful of fixes
on the list so there will likely be a part 2.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmCS4lITHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRAuExnzX7sYieZqEACSihfcOgZ/oyGWN3chca917/yCWimM
DOu37Zlh81TNPgzzJwbT44IY5sg/lSecwktxs665TChiJjr3JlM4jmz+u64KOTA8
mTWhqZNr5zT9kFj/m3x0V9yYOVr9g43QRmIlc14d+8JaQDw0N8WeH/yK85/CXDSS
X5gQK/e9q/yPf/NPyPuPm67jDsFnJERINWaAHI8lhA5fvFyy/xRLmSkuexchysss
XOGfyxxX590jGLK1vD+5wccX7ZwfwU4jriTaxyah/VBl8QUur/xSPVyspHIdWiMG
jrNXI1dg6oI861BdjryUpZI0iYJaRe5FRWUx7uTIqHfIyL/MnvYI7USVYOOPb72M
yZgN903R++5NeUUVTzfXwaigTwfXAPB6USFqZpEfRAf204pgNybmznJWThAVBdYG
rUixp7GsEMU3aAT2tE/iHR33JQxQfnZq8Tg43/4gB7MoACrzQrYrGcPnj9xssMyV
F1hnao3dr+5Xjo3MwfkW9JvLPwvDuE3mdrdj+a0XZ45gbTJeuBhYxo3VOsFeijhQ
gf/VYuoNn5iae9fiMzx5rlmFT9NJDYKDhla+BpAel84/6nRryyfCZCaE5FvDynOO
CNQynaeJMIMEygPBYR9FVVCwm+EtVsz3NVFKEuo5ilQpgX8ipctxiqy2+moZALLN
OWlEH6BKEgXqkw==
=PsA8
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-5.13-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- Support for the memtest= kernel command-line argument.
- Support for building the kernel with FORTIFY_SOURCE.
- Support for generic clockevent broadcasts.
- Support for the buildtar build target.
- Some build system cleanups to pass more LLVM-friendly arguments.
- Support for kprobes.
- A rearranged kernel memory map, the first part of supporting sv48
systems.
- Improvements to kexec, along with support for kdump and crash
kernels.
- An alternatives-based errata framework, along with support for
handling a pair of errata that manifest on some SiFive designs
(including the HiFive Unmatched).
- Support for XIP.
- A device tree for the Microchip PolarFire ICICLE SoC and associated
dev board.
... along with a bunch of cleanups. There are already a handful of fixes
on the list so there will likely be a part 2.
* tag 'riscv-for-linus-5.13-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (45 commits)
RISC-V: Always define XIP_FIXUP
riscv: Remove 32b kernel mapping from page table dump
riscv: Fix 32b kernel build with CONFIG_DEBUG_VIRTUAL=y
RISC-V: Fix error code returned by riscv_hartid_to_cpuid()
RISC-V: Enable Microchip PolarFire ICICLE SoC
RISC-V: Initial DTS for Microchip ICICLE board
dt-bindings: riscv: microchip: Add YAML documentation for the PolarFire SoC
RISC-V: Add Microchip PolarFire SoC kconfig option
RISC-V: enable XIP
RISC-V: Add crash kernel support
RISC-V: Add kdump support
RISC-V: Improve init_resources()
RISC-V: Add kexec support
RISC-V: Add EM_RISCV to kexec UAPI header
riscv: vdso: fix and clean-up Makefile
riscv/mm: Use BUG_ON instead of if condition followed by BUG.
riscv/kprobe: fix kernel panic when invoking sys_read traced by kprobe
riscv: Set ARCH_HAS_STRICT_MODULE_RWX if MMU
riscv: module: Create module allocations without exec permissions
riscv: bpf: Avoid breaking W^X
...
mem_init_print_info() is called in mem_init() on each architecture, and
pass NULL argument, so using void argument and move it into mm_init().
Link: https://lkml.kernel.org/r/20210317015210.33641-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com> [x86]
Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> [powerpc]
Acked-by: David Hildenbrand <david@redhat.com>
Tested-by: Anatoly Pugachev <matorola@gmail.com> [sparc64]
Acked-by: Russell King <rmk+kernel@armlinux.org.uk> [arm]
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Guo Ren <guoren@kernel.org>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce XIP (eXecute In Place) support for RISC-V platforms.
It allows code to be executed directly from non-volatile storage
directly addressable by the CPU, such as QSPI NOR flash which can
be found on many RISC-V platforms. This makes way for significant
optimization of RAM footprint. The XIP kernel is not compressed
since it has to run directly from flash, so it will occupy more
space on the non-volatile storage. The physical flash address used
to link the kernel object files and for storing it has to be known
at compile time and is represented by a Kconfig option.
XIP on RISC-V will for the time being only work on MMU-enabled
kernels.
Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.com>
[Alex: Rebase on top of "Move kernel mapping outside the linear mapping" ]
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
[Palmer: disable XIP for allyesconfig]
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
This patch allows Linux to act as a crash kernel for use with
kdump. Userspace will let the crash kernel know about the
memory region it can use through linux,usable-memory property
on the /memory node (overriding its reg property), and about the
memory region where the elf core header of the previous kernel
is saved, through a reserved-memory node with a compatible string
of "linux,elfcorehdr". This approach is the least invasive and
re-uses functionality already present.
I tested this on riscv64 qemu and it works as expected, you
may test it by retrieving the dmesg of the previous kernel
through /proc/vmcore, using the vmcore-dmesg utility from
kexec-tools.
Signed-off-by: Nick Kossifidis <mick@ics.forth.gr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
This patch adds support for kdump, the kernel will reserve a
region for the crash kernel and jump there on panic. In order
for userspace tools (kexec-tools) to prepare the crash kernel
kexec image, we also need to expose some information on
/proc/iomem for the memory regions used by the kernel and for
the region reserved for crash kernel. Note that on userspace
the device tree is used to determine the system's memory
layout so the "System RAM" on /proc/iomem is ignored.
I tested this on riscv64 qemu and works as expected, you may
test it by triggering a crash through /proc/sysrq_trigger:
echo c > /proc/sysrq_trigger
Signed-off-by: Nick Kossifidis <mick@ics.forth.gr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
BUG_ON() uses unlikely in if(), which can be optimized at compile time.
Signed-off-by: zhouchuangao <zhouchuangao@vivo.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
All of these are never modified after init, so they can be
__ro_after_init.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
They are not needed after booting, so mark them as __init to move them
to the __init section.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
This is a preparatory patch for relocatable kernel and sv48 support.
The kernel used to be linked at PAGE_OFFSET address therefore we could use
the linear mapping for the kernel mapping. But the relocated kernel base
address will be different from PAGE_OFFSET and since in the linear mapping,
two different virtual addresses cannot point to the same physical address,
the kernel mapping needs to lie outside the linear mapping so that we don't
have to copy it at the same physical offset.
The kernel mapping is moved to the last 2GB of the address space, BPF
is now always after the kernel and modules use the 2GB memory range right
before the kernel, so BPF and modules regions do not overlap. KASLR
implementation will simply have to move the kernel in the last 2GB range
and just take care of leaving enough space for BPF.
In addition, by moving the kernel to the end of the address space, both
sv39 and sv48 kernels will be exactly the same without needing to be
relocated at runtime.
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
[Palmer: Squash the STRICT_RWX fix, and a !MMU fix]
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
The riscv [rv32_]defconfig enabled CONFIG_MEMTEST,
but memtest feature is not supported in RISCV.
Add early_memtest() to support for memtest.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
After the following patches,
commit de043da0b9 ("RISC-V: Fix usage of memblock_enforce_memory_limit")
commit 1bd14a66ee ("RISC-V: Remove any memblock representing unusable memory area")
commit b10d6bca87 ("arch, drivers: replace for_each_membock() with for_each_mem_range()")
some logic is useless, kill the mem_start/start/end and unneeded code.
Reviewed-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
I have a handful of new RISC-V related patches for this merge window:
* A check to ensure drivers are properly using uaccess. This isn't
manifesting with any of the drivers I'm currently using, but may catch
errors in new drivers.
* Some preliminary support for the FU740, along with the HiFive
Unleashed it will appear on.
* NUMA support for RISC-V, which involves making the arm64 code generic.
* Support for kasan on the vmalloc region.
* A handful of new drivers for the Kendryte K210, along with the DT
plumbing required to boot on a handful of K210-based boards.
* Support for allocating ASIDs.
* Preliminary support for kernels larger than 128MiB.
* Various other improvements to our KASAN support, including the
utilization of huge pages when allocating the KASAN regions.
We may have already found a bug with the KASAN_VMALLOC code, but it's
passing my tests. There's a fix in the works, but that will probably
miss the merge window.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmA4hXATHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRAuExnzX7sYifryD/0SfXGOfj93Cxq7I7AYhhzCN7lJ5jvv
iEQScTlPqU9nfvYodo4EDq0fp+5LIPpTL/XBHtqVjzv0FqRNa28Ea0K7kO8HuXc4
BaUd0m/DqyB4Gfgm4qjc5bDneQ1ZYxVXprYERWNQ5Fj+tdWhaQGOW64N/TVodjjj
NgJtTqbIAcjJqjUtttM8TZN5U1TgwLo+KCqw3iYW12lV1YKBBuvrwvSdD6jnFdIQ
AzG/wRGZhxLoFxgBB/NEsZxDoSd6ztiwxLhS9lX4okZVsryyIdOE70Q/MflfiTlU
xE+AdxQXTMUiiqYSmHeDD6PDb57GT/K3hnjI1yP+lIZpbInsi29JKow1qjyYjfHl
9cSSKYCIXHL7jKU6pgt34G1O5N5+fgqHQhNbfKvlrQ2UPlfs/tWdKHpFIP/z9Jlr
0vCAou7NSEB9zZGqzO63uBLXoN8yfL8FT3uRnnRvoRpfpex5dQX2QqPLQ7327D7N
GUG31nd1PHTJPdxJ1cI4SO24PqPpWDWY9uaea+0jv7ivGClVadZPco/S3ZKloguT
lazYUvyA4oRrSAyln785Rd8vg4CinqTxMtIyZbRMbNkgzVQARi9a8rjvu4n9qms2
2wlXDFi8nR8B4ih5n79dSiiLM9ay9GJDxMcf9VxIxSAYZV2fJALnpK6gV2fzRBUe
+k/uv8BIsFmlwQ==
=CutX
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-5.12-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
"A handful of new RISC-V related patches for this merge window:
- A check to ensure drivers are properly using uaccess. This isn't
manifesting with any of the drivers I'm currently using, but may
catch errors in new drivers.
- Some preliminary support for the FU740, along with the HiFive
Unleashed it will appear on.
- NUMA support for RISC-V, which involves making the arm64 code
generic.
- Support for kasan on the vmalloc region.
- A handful of new drivers for the Kendryte K210, along with the DT
plumbing required to boot on a handful of K210-based boards.
- Support for allocating ASIDs.
- Preliminary support for kernels larger than 128MiB.
- Various other improvements to our KASAN support, including the
utilization of huge pages when allocating the KASAN regions.
We may have already found a bug with the KASAN_VMALLOC code, but it's
passing my tests. There's a fix in the works, but that will probably
miss the merge window.
* tag 'riscv-for-linus-5.12-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (75 commits)
riscv: Improve kasan population by using hugepages when possible
riscv: Improve kasan population function
riscv: Use KASAN_SHADOW_INIT define for kasan memory initialization
riscv: Improve kasan definitions
riscv: Get rid of MAX_EARLY_MAPPING_SIZE
soc: canaan: Sort the Makefile alphabetically
riscv: Disable KSAN_SANITIZE for vDSO
riscv: Remove unnecessary declaration
riscv: Add Canaan Kendryte K210 SD card defconfig
riscv: Update Canaan Kendryte K210 defconfig
riscv: Add Kendryte KD233 board device tree
riscv: Add SiPeed MAIXDUINO board device tree
riscv: Add SiPeed MAIX GO board device tree
riscv: Add SiPeed MAIX DOCK board device tree
riscv: Add SiPeed MAIX BiT board device tree
riscv: Update Canaan Kendryte K210 device tree
dt-bindings: add resets property to dw-apb-timer
dt-bindings: fix sifive gpio properties
dt-bindings: update sifive uart compatible string
dt-bindings: update sifive clint compatible string
...
At early boot stage, we have a whole PGDIR to map the kernel, so there
is no need to restrict the early mapping size to 128MB. Removing this
define also allows us to simplify some compile time logic.
This fixes large kernel mappings with a size greater than 128MB, as it
is the case for syzbot kernels whose size was just ~130MB.
Note that on rv64, for now, we are then limited to PGDIR size for early
mapping as we can't use PGD mappings (see [1]). That should be enough
given the relative small size of syzbot kernels compared to PGDIR_SIZE
which is 1GB.
[1] https://lore.kernel.org/lkml/20200603153608.30056-1-alex@ghiti.fr/
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Covert to the generic reserve_initrd_mem() function.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Sometimes, especially in a production system we may not want to
use a "smart bootloader" like u-boot to load kernel, ramdisk and
device tree from a filesystem on eMMC, but rather load the kernel
from a NAND partition and just run it as soon as we can, and in
this case it is convenient to have device tree compiled into the
kernel binary. Since this case is not limited to MMU-less systems,
let's support it for these which have MMU enabled too.
While at it, provide __dtb_start as a parameter to setup_vm() in
BUILTIN_DTB case, so we don't have to duplicate BUILTIN_DTB specific
processing in MMU-enabled and MMU-disabled versions of setup_vm().
Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
The max_mapnr is the number of PFNs, not absolute PFN offset.
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Fixes: d0d8aae645 ("RISC-V: Set maximum number of mapped pages correctly")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Currently, linux kernel can not use last 4k bytes of addressable space
because IS_ERR_VALUE macro treats those as an error. This will be an issue
for RV32 as any memblock allocator potentially allocate chunk of memory
from the end of DRAM (2GB) leading bad address error even though the
address was technically valid.
Fix this issue by limiting the memblock if available memory spans the
entire address space.
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Use the generic numa implementation to add NUMA support for RISC-V.
This is based on Greentime's patch[1] but modified to use generic NUMA
implementation and few more fixes.
[1] https://lkml.org/lkml/2020/1/10/233
Co-developed-by: Greentime Hu <greentime.hu@sifive.com>
Signed-off-by: Greentime Hu <greentime.hu@sifive.com>
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Currently, we perform some memory init functions in paging init. But,
that will be an issue for NUMA support where DT needs to be flattened
before numa initialization and memblock_present can only be called
after numa initialization.
Move memory initialization related functions to a separate function.
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Greentime Hu <greentime.hu@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
All SiPeed K210 MAIX boards have the exact same vendor, arch and
implementation IDs, preventing differentiation to select the correct
device tree to use through the SOC_BUILTIN_DTB_DECLARE() macro. This
result in this macro to be useless and mandates changing the code of
the sysctl driver to change the builtin device tree suitable for the
target board.
Fix this problem by removing the SOC_BUILTIN_DTB_DECLARE() macro since
it is used only for the K210 support. The code searching the builtin
DTBs using the vendor, arch an implementation IDs is also removed.
Support for builtin DTB falls back to the simpler and more traditional
handling of builtin DTB using the CONFIG_BUILTIN_DTB option, similarly
to other architectures.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
memblock_enforce_memory_limit accepts the maximum memory size not the
maximum address that can be handled by kernel. Fix the function invocation
accordingly.
Fixes: 1bd14a66ee ("RISC-V: Remove any memblock representing unusable memory area")
Cc: stable@vger.kernel.org
Reported-by: Bin Meng <bin.meng@windriver.com>
Tested-by: Bin Meng <bin.meng@windriver.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
We have a handful of new kernel features for 5.11:
* Support for the contiguous memory allocator.
* Support for IRQ Time Accounting
* Support for stack tracing
* Support for strict /dev/mem
* Support for kernel section protection
I'm being a bit conservative on the cutoff for this round due to the
timing, so this is all the new development I'm going to take for this
cycle (even if some of it probably normally would have been OK). There
are, however, some fixes on the list that I will likely be sending along
either later this week or early next week.
There is one issue in here: one of my test configurations
(PREEMPT{,_DEBUG}=y) fails to boot on QEMU 5.0.0 (from April) as of the
.text.init alignment patch. With any luck we'll sort out the issue, but
given how many bugs get fixed all over the place and how unrelated those
features seem my guess is that we're just running into something that's
been lurking for a while and has already been fixed in the newer QEMU
(though I wouldn't be surprised if it's one of these implicit
assumptions we have in the boot flow). If it was hardware I'd be
strongly inclined to look more closely, but given that users can upgrade
their simulators I'm less worried about it.
There are two merge conflicts, both in build files. They're both a bit
clunky: arch/riscv/Kconfig is out of order (I have a script that's
supposed to keep them in order, I'll fix it) and lib/Makefile is out of
order (though GENERIC_LIB here doesn't mean quite what it does above).
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAl/cHO4THHBhbG1lckBk
YWJiZWx0LmNvbQAKCRAuExnzX7sYiTlmD/4uDyNHBM1XH/XD4fSEwTYJvGLqt/Jo
vtrGR/fm0SlQFUKCcywSzxcVAeGn56CACbEIDYLuL4xXRJmbwEuaRrHVx2sEhS9p
pNhy+wus/SgDz5EUAawMyR2AEWgzl77hY5T/+AAo4yv65SGGBfsIdz5noIVwGNqW
r0g5cw2O99z0vwu1aSrK4isWHconG9MfQnnVyepPSh67pyWS4aUCr1K3vLiqD2dE
XcgtwdcgzUIY5aEoJNrWo5qTrcaG8m6MRNCDAKJ6MKdDA2wdGIN868G0wQnoURRm
Y+yW7w3P20kM0b87zH50jujTWg38NBKOfaXb0mAfawZMapL60veTVmvs2kNtFXCy
F6JWRkgTiRnGY72FtRR0igWXT5M7fz0EiLFXLMItGcgj79TUget4l/3sRMN47S/O
cA/WiwptJH3mh8IkL6z5ZxWEThdOrbFt8F1T+Gyq/ayblcPnJaLn/wrWoeOwviWR
fvEC7smuF5SBTbWZK5tBOP21Nvhb7bfr49Sgr8Tvdjl15tz97qK+2tsLXwkBoQnJ
wU45jcXfzr5wgiGBOQANRite5bLsJ0TuOrTgA5gsGpv+JSDGbpcJbm0833x00nX/
3GsW5xr+vsLCvljgPAtKsyDNRlGQu908Gxrat2+s8u92bLr1bwn30uKL5h6i/n1w
QgWATuPPGXZZdw==
=GWIH
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-5.11-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
"We have a handful of new kernel features for 5.11:
- Support for the contiguous memory allocator.
- Support for IRQ Time Accounting
- Support for stack tracing
- Support for strict /dev/mem
- Support for kernel section protection
I'm being a bit conservative on the cutoff for this round due to the
timing, so this is all the new development I'm going to take for this
cycle (even if some of it probably normally would have been OK). There
are, however, some fixes on the list that I will likely be sending
along either later this week or early next week.
There is one issue in here: one of my test configurations
(PREEMPT{,_DEBUG}=y) fails to boot on QEMU 5.0.0 (from April) as of
the .text.init alignment patch.
With any luck we'll sort out the issue, but given how many bugs get
fixed all over the place and how unrelated those features seem my
guess is that we're just running into something that's been lurking
for a while and has already been fixed in the newer QEMU (though I
wouldn't be surprised if it's one of these implicit assumptions we
have in the boot flow). If it was hardware I'd be strongly inclined to
look more closely, but given that users can upgrade their simulators
I'm less worried about it"
* tag 'riscv-for-linus-5.11-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
arm64: Use the generic devmem_is_allowed()
arm: Use the generic devmem_is_allowed()
RISC-V: Use the new generic devmem_is_allowed()
lib: Add a generic version of devmem_is_allowed()
riscv: Fixed kernel test robot warning
riscv: kernel: Drop unused clean rule
riscv: provide memmove implementation
RISC-V: Move dynamic relocation section under __init
RISC-V: Protect all kernel sections including init early
RISC-V: Align the .init.text section
RISC-V: Initialize SBI early
riscv: Enable ARCH_STACKWALK
riscv: Make stack walk callback consistent with generic code
riscv: Cleanup stacktrace
riscv: Add HAVE_IRQ_TIME_ACCOUNTING
riscv: Enable CMA support
riscv: Ignore Image.* and loader.bin
riscv: Clean up boot dir
riscv: Fix compressed Image formats build
RISC-V: Add kernel image sections to the resource tree
Currently, .init.text & .init.data are intermixed which makes it impossible
apply different permissions to them. .init.data shouldn't need exec
permissions while .init.text shouldn't have write permission. Moreover,
the strict permission are only enforced /init starts. This leaves the
kernel vulnerable from possible buggy built-in modules.
Keep .init.text & .data in separate sections so that different permissions
are applied to each section. Apply permissions to individual sections as
early as possible. This improves the kernel protection under
CONFIG_STRICT_KERNEL_RWX. We also need to restore the permissions for the
entire _init section after it is freed so that those pages can be used
for other purpose.
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Tested-by: Greentime Hu <greentime.hu@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
riscv has selected HAVE_DMA_CONTIGUOUS, but doesn't call
dma_contiguous_reserve(). This calls dma_contiguous_reserve(), which
enables CMA.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
This patch (previously part of my kexec/kdump series) populates
/proc/iomem with the various sections of the kernel image. We need
this for kexec-tools to be able to prepare the crashkernel image
for kdump to work. Since resource tree initialization is not
related to memory initialization I added the code to kernel/setup.c
and removed the original code (derived from the arm64 tree) from
mm/init.c.
Signed-off-by: Nick Kossifidis <mick@ics.forth.gr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Currently, we use PGD mappings for early DTB mapping in early_pgd
but this breaks Linux kernel on SiFive Unleashed because on SiFive
Unleashed PMP checks don't work correctly for PGD mappings.
To fix early DTB mappings on SiFive Unleashed, we use non-PGD
mappings (i.e. PMD) for early DTB access.
Fixes: 8f3a2b4a96 ("RISC-V: Move DT mapping outof fixmap")
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Atish Patra <atish.patra@wdc.com>
Tested-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
RISC-V limits the physical memory size by -PAGE_OFFSET. Any memory beyond
that size from DRAM start is unusable. Just remove any memblock pointing
to those memory region without worrying about computing the maximum size.
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
This contains a handful of cleanups and new features, including:
* A handful of cleanups for our page fault handling.
* Improvements to how we fill out cacheinfo.
* Support for EFI-based systems.
---
This contains a merge from the EFI tree that was necessary as some of the EFI
support landed over there. It's my first time doing something like this,
I haven't included the set_fs stuff because the base branch it depends on
hasn't been merged yet. I'll probably have another merge window PR, as
there's more in flight (most notably the fix for new binutils I just sent out),
but I figured there was no reason to delay this any longer.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAl+KQ6gTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRAuExnzX7sYibmwD/4qWfOW7R/kUWi08ethcaAhNEWLvqIh
2/KjGLORw+NTZ1F4pEFyQG5LRd3yWDT/UXh/k8gXINqmdclNV01Z3T+O7WuRlISs
07i26W1qRpNeJ7lDVhr9foKpeOU/AXvidgoF330nGlyO4HZkYKhK2yB3t8uGWywr
Zt/EpMJeBIRKzWiLhOgLAdYJthhZ9AlnouNnr9myHnO5Ksel+AZ/BKYvn7ZbHMns
6vFUxp6392/LERRRIfDqPsTuxPIYMHjuEsGSESLsjAIyq/shgN1knG/C+zwU5DcK
zUDBt1DEP7Tb45w7VBASSjn1M+cUolz9/c2dBhlVcdBlk1GKF+KILSTmWUBpQ8oP
ETVAuQK5HTcjy9bVcJMj0Oa3mFshVAAByOH+Wyrdo+qSLkb7y3spPvsL4dyjrKjL
+pe6C7WvavaEFoQXVWO2sTUBGYt7qDLRdrDgOGBIHylTXhTxf2wYzAF4ZmDROECT
Qfc7Ac3aIWYvWDmxE+x8OniuclfZ0DndKLKQj6FJWUTIxFZzTxsHK75d47D1ID0S
ZwAmUd0eYjjwMTO/6AM/Aqu3o8IP4GOXjJf4ijxH9+LjpUhm/ibmHDAUY69sU1WX
kdX51gQzoEuW7XMVz1HoTSvaGGKtyFDuRxs8RG/tSFaRtznRz0Sro6BpLCeG968n
k/d5WL/vZZ/NDA==
=FYs/
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-5.10-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
"A handful of cleanups and new features:
- A handful of cleanups for our page fault handling
- Improvements to how we fill out cacheinfo
- Support for EFI-based systems"
* tag 'riscv-for-linus-5.10-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (22 commits)
RISC-V: Add page table dump support for uefi
RISC-V: Add EFI runtime services
RISC-V: Add EFI stub support.
RISC-V: Add PE/COFF header for EFI stub
RISC-V: Implement late mapping page table allocation functions
RISC-V: Add early ioremap support
RISC-V: Move DT mapping outof fixmap
RISC-V: Fix duplicate included thread_info.h
riscv/mm/fault: Set FAULT_FLAG_INSTRUCTION flag in do_page_fault()
riscv/mm/fault: Fix inline placement in vmalloc_fault() declaration
riscv: Add cache information in AUX vector
riscv: Define AT_VECTOR_SIZE_ARCH for ARCH_DLINFO
riscv: Set more data to cacheinfo
riscv/mm/fault: Move access error check to function
riscv/mm/fault: Move FAULT_FLAG_WRITE handling in do_page_fault()
riscv/mm/fault: Simplify mm_fault_error()
riscv/mm/fault: Move fault error handling to mm_fault_error()
riscv/mm/fault: Simplify fault error handling
riscv/mm/fault: Move vmalloc fault handling to vmalloc_fault()
riscv/mm/fault: Move bad area handling to bad_area()
...
for_each_memblock() is used to iterate over memblock.memory in a few
places that use data from memblock_region rather than the memory ranges.
Introduce separate for_each_mem_region() and
for_each_reserved_mem_region() to improve encapsulation of memblock
internals from its users.
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Baoquan He <bhe@redhat.com>
Acked-by: Ingo Molnar <mingo@kernel.org> [x86]
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> [MIPS]
Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> [.clang-format]
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Emil Renner Berthing <kernel@esmil.dk>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: https://lkml.kernel.org/r/20200818151634.14343-18-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are several occurrences of the following pattern:
for_each_memblock(memory, reg) {
start = __pfn_to_phys(memblock_region_memory_base_pfn(reg);
end = __pfn_to_phys(memblock_region_memory_end_pfn(reg));
/* do something with start and end */
}
Using for_each_mem_range() iterator is more appropriate in such cases and
allows simpler and cleaner code.
[akpm@linux-foundation.org: fix arch/arm/mm/pmsa-v7.c build]
[rppt@linux.ibm.com: mips: fix cavium-octeon build caused by memblock refactoring]
Link: http://lkml.kernel.org/r/20200827124549.GD167163@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Emil Renner Berthing <kernel@esmil.dk>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: https://lkml.kernel.org/r/20200818151634.14343-13-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
RISC-V does not (yet) support NUMA and for UMA architectures node 0 is
used implicitly during early memory initialization.
There is no need to call memblock_set_node(), remove this call and the
surrounding code.
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Emil Renner Berthing <kernel@esmil.dk>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: https://lkml.kernel.org/r/20200818151634.14343-7-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, the memory containing DT is not reserved. Thus, that region
of memory can be reallocated or reused for other purposes. This may result
in corrupted DT for nommu virt board in Qemu. We may not face any issue
in kendryte as DT is embedded in the kernel image for that.
Fixes: 6bd33e1ece ("riscv: add nommu support")
Cc: stable@vger.kernel.org
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Currently, page table setup is done during setup_va_final where fixmap can
be used to create the temporary mappings. The physical frame is allocated
from memblock_alloc_* functions. However, this won't work if page table
mapping needs to be created for a different mm context (i.e. efi mm) at
a later point of time.
Use generic kernel page allocation function & macros for any mapping
after setup_vm_final.
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
UEFI uses early IO or memory mappings for runtime services before
normal ioremap() is usable. Add the necessary fixmap bindings and
pmd mappings for generic ioremap support to work.
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Currently, RISC-V reserves 1MB of fixmap memory for device tree. However,
it maps only single PMD (2MB) space for fixmap which leaves only < 1MB space
left for other kernel features such as early ioremap which requires fixmap
as well. The fixmap size can be increased by another 2MB but it brings
additional complexity and changes the virtual memory layout as well.
If we require some additional feature requiring fixmap again, it has to be
moved again.
Technically, DT doesn't need a fixmap as the memory occupied by the DT is
only used during boot. That's why, We map device tree in early page table
using two consecutive PGD mappings at lower addresses (< PAGE_OFFSET).
This frees lot of space in fixmap and also makes maximum supported
device tree size supported as PGDIR_SIZE. Thus, init memory section can be used
for the same purpose as well. This simplifies fixmap implementation.
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
This invalidates local TLB after modifying the page tables during early init as
it's too early to handle suprious faults as we otherwise do.
Fixes: f2c17aabc9 ("RISC-V: Implement compile-time fixed mappings")
Reported-by: Syven Wang <syven.wang@sifive.com>
Signed-off-by: Syven Wang <syven.wang@sifive.com>
Signed-off-by: Greentime Hu <greentime.hu@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
[Palmer: Cleaned up the commit text]
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Merge misc updates from Andrew Morton:
- a few MM hotfixes
- kthread, tools, scripts, ntfs and ocfs2
- some of MM
Subsystems affected by this patch series: kthread, tools, scripts, ntfs,
ocfs2 and mm (hofixes, pagealloc, slab-generic, slab, slub, kcsan,
debug, pagecache, gup, swap, shmem, memcg, pagemap, mremap, mincore,
sparsemem, vmalloc, kasan, pagealloc, hugetlb and vmscan).
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (162 commits)
mm: vmscan: consistent update to pgrefill
mm/vmscan.c: fix typo
khugepaged: khugepaged_test_exit() check mmget_still_valid()
khugepaged: retract_page_tables() remember to test exit
khugepaged: collapse_pte_mapped_thp() protect the pmd lock
khugepaged: collapse_pte_mapped_thp() flush the right range
mm/hugetlb: fix calculation of adjust_range_if_pmd_sharing_possible
mm: thp: replace HTTP links with HTTPS ones
mm/page_alloc: fix memalloc_nocma_{save/restore} APIs
mm/page_alloc.c: skip setting nodemask when we are in interrupt
mm/page_alloc: fallbacks at most has 3 elements
mm/page_alloc: silence a KASAN false positive
mm/page_alloc.c: remove unnecessary end_bitidx for [set|get]_pfnblock_flags_mask()
mm/page_alloc.c: simplify pageblock bitmap access
mm/page_alloc.c: extract the common part in pfn_to_bitidx()
mm/page_alloc.c: replace the definition of NR_MIGRATETYPE_BITS with PB_migratetype_bits
mm/shuffle: remove dynamic reconfiguration
mm/memory_hotplug: document why shuffle_zone() is relevant
mm/page_alloc: remove nr_free_pagecache_pages()
mm: remove vm_total_pages
...
After removal of CONFIG_HAVE_MEMBLOCK_NODE_MAP we have two equivalent
functions that call memory_present() for each region in memblock.memory:
sparse_memory_present_with_active_regions() and membocks_present().
Moreover, all architectures have a call to either of these functions
preceding the call to sparse_init() and in the most cases they are called
one after the other.
Mark the regions from memblock.memory as present during sparce_init() by
making sparse_init() call memblocks_present(), make memblocks_present()
and memory_present() functions static and remove redundant
sparse_memory_present_with_active_regions() function.
Also remove no longer required HAVE_MEMORY_PRESENT configuration option.
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200712083130.22919-1-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "arm64: Enable vmemmap mapping from device memory", v4.
This series enables vmemmap backing memory allocation from device memory
ranges on arm64. But before that, it enables vmemmap_populate_basepages()
and vmemmap_alloc_block_buf() to accommodate struct vmem_altmap based
alocation requests.
This patch (of 3):
vmemmap_populate_basepages() is used across platforms to allocate backing
memory for vmemmap mapping. This is used as a standard default choice or
as a fallback when intended huge pages allocation fails. This just
creates entire vmemmap mapping with base pages (PAGE_SIZE).
On arm64 platforms, vmemmap_populate_basepages() is called instead of the
platform specific vmemmap_populate() when ARM64_SWAPPER_USES_SECTION_MAPS
is not enabled as in case for ARM64_16K_PAGES and ARM64_64K_PAGES configs.
At present vmemmap_populate_basepages() does not support allocating from
driver defined struct vmem_altmap while trying to create vmemmap mapping
for a device memory range. It prevents ARM64_16K_PAGES and
ARM64_64K_PAGES configs on arm64 from supporting device memory with
vmemap_altmap request.
This enables vmem_altmap support in vmemmap_populate_basepages() unlocking
device memory allocation for vmemap mapping on arm64 platforms with 16K or
64K base page configs.
Each architecture should evaluate and decide on subscribing device memory
based base page allocation through vmemmap_populate_basepages(). Hence
lets keep it disabled on all archs in order to preserve the existing
semantics. A subsequent patch enables it on arm64.
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Jia He <justin.he@arm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Hsin-Yi Wang <hsinyi@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Yu Zhao <yuzhao@google.com>
Link: http://lkml.kernel.org/r/1594004178-8861-1-git-send-email-anshuman.khandual@arm.com
Link: http://lkml.kernel.org/r/1594004178-8861-2-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We have a lot of new kernel features for this merge window:
* ARCH_SUPPORTS_ATOMIC_RMW, to allow OSQ locks to be enabled.
* The ability to enable NO_HZ_FULL
* Support for enabling kcov, kmemleak, stack protector, and VM debugging.
* JUMP_LABEL support.
There are also a handful of cleanups.
next points out a trivial Kconfig merge conflict. I don't see any way to have
done this better: the symbols are sorted, it just happens that
HAVE_COPY_THREAD_TLS was in the middle of two new symbols. In case it helps
any, here's a pretty current conflict resolution:
diff --cc arch/riscv/Kconfig
index bc37241a6875,6c4bce7cad8a..7b5905529146
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@@ -57,9 -54,6 +59,8 @@@ config RISC
select HAVE_ARCH_SECCOMP_FILTER
select HAVE_ARCH_TRACEHOOK
select HAVE_ASM_MODVERSIONS
+ select HAVE_CONTEXT_TRACKING
- select HAVE_COPY_THREAD_TLS
+ select HAVE_DEBUG_KMEMLEAK
select HAVE_DMA_CONTIGUOUS if MMU
select HAVE_EBPF_JIT if MMU
select HAVE_FUTEX_CMPXCHG if FUTEX
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAl8sa2YTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRAuExnzX7sYieFTD/0RlhicBrn5UGpaUwtgwuMENYOrb3pn
5SsPzhCni3/8HvMzr/gOGHWM2YOQZkY5FReIqy1IdbPSe/exjv6DyKMdZr+OI+3+
232TAwAtILGlKB+zBMJWA3eLah0pbDvs7RpuhiPfWSzzWUrAMHcGSq0TzM2pYe1r
KPOLj6bSKpnZO+Dto80V8w4ZeWmtAArVDiujIy8zlvgpM1Z66C2SazloQH7HkPS7
D2hvLZZU00etyAZI/aJJsemCBRg9nsVoqGTBSXWpUPATBRMZFfHovbh7AUJlqY5E
HPHBSf3KDeOjoF8EXkvT/6z5Q6+LUpRyK+KKwWCs/337i1652P31nczDps9J6Eq0
IC3J/YWcy4eZ7pMEps0vQmr9aX+FusOCJUmJqJW77Uzi0fHHTXa+O3olwiz/JqYz
c3yIihVAvtw9eCSoqlL7YoL9HNyfrXKCtZ4l/DLjwGG5WJw7C7mGMbBMClcD4ytU
/Z91ON/UgWFW5805dBaajp72SStp1FP54HsAH5E12XYZSiepnu70G3BUNJHvDJT5
QOKkrhOswwit5DW30Celh6kmtidAWiavy6R0AbbpcqI+bItkEZWD1BSStTc0WdhV
JgOp0ieCzu5Jyw03KC1nZA8VgM3zrAmU0moSDqirddzNYuuBTqt39doZOEs1MS2W
TiR1JSnGNaKl4Q==
=asH5
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-5.9-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
"We have a lot of new kernel features for this merge window:
- ARCH_SUPPORTS_ATOMIC_RMW, to allow OSQ locks to be enabled
- The ability to enable NO_HZ_FULL
- Support for enabling kcov, kmemleak, stack protector, and VM
debugging
- JUMP_LABEL support
There are also a handful of cleanups"
* tag 'riscv-for-linus-5.9-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (24 commits)
riscv: disable stack-protector for vDSO
RISC-V: Fix build warning for smpboot.c
riscv: fix build warning of mm/pageattr
riscv: Fix build warning for mm/init
RISC-V: Setup exception vector early
riscv: Select ARCH_HAS_DEBUG_VM_PGTABLE
riscv: Use generic pgprot_* macros from <linux/pgtable.h>
mm: pgtable: Make generic pgprot_* macros available for no-MMU
riscv: Cleanup unnecessary define in asm-offset.c
riscv: Add jump-label implementation
riscv: Support R_RISCV_ADD64 and R_RISCV_SUB64 relocs
Replace HTTP links with HTTPS ones: RISC-V
riscv: Add STACKPROTECTOR supported
riscv: Fix typo in asm/hwcap.h uapi header
riscv: Add kmemleak support
riscv: Allow building with kcov coverage
riscv: Enable context tracking
riscv: Support irq_work via self IPIs
riscv: Enable LOCKDEP_SUPPORT & fixup TRACE_IRQFLAGS_SUPPORT
riscv: Fixup lockdep_assert_held with wrong param cpu_running
...
Add static keyword for resource_init, this function is only used in this
object file.
Signed-off-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Currently, maximum physical memory allowed is equal to -PAGE_OFFSET.
That's why we remove any memory blocks spanning beyond that size. However,
it is done only for memblock containing linux kernel which will not work
if there are multiple memblocks.
Process all memory blocks to figure out how much memory needs to be removed
and remove at the end instead of updating the memblock list in place.
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Currently, maximum number of mapper pages are set to the pfn calculated
from the memblock size of the memblock containing kernel. This will work
until that memblock spans the entire memory. However, it will be set to
a wrong value if there are multiple memblocks defined in kernel
(e.g. with efi runtime services).
Set the the maximum value to the pfn calculated from dram size.
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Add System RAM to /proc/iomem, various tools expect it such as kdump.
It is also needed for page_is_ram API which checks the specified address
whether registered as System RAM in iomem_resource list.
Signed-off-by: Zong Li <zong.li@sifive.com>
[Palmer: check MEMBLOCK_NOMAP]
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
* Select statements are now sorted alphanumerically.
* Our first-level interrupts are now handled via a full irqchip driver.
* CPU hotplug is fixed.
* Our vDSO calls now use the common vDSO infrastructure.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAl7hq3QTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRAuExnzX7sYiRcXD/9dEmZ/UgKNGE1BYlQoLbS4o3u4dt6K
aZkl4AvadpgxlmCl5OAqv/8+UIsMmzhJ4y8bQL1FOdPhRQfModFlQFwzDiUbPguU
Fgh+wXF+/iDywtfA2fVm7OaMBKpftzTBF+YKRsZHdrUF1l3es9f99mxfelcZWx2h
nMrOdKFjmEeqhPlkF17Wr30elKGO7NqT3caBam9X/do1bgGnJ9sLfehr4b7dXdzk
QWm6cp8xmSM7A2jKUT8l7WKmZn3a8DDTDws/yKDuFr+2UxfXspPtc+XzN36zRSAd
DkL3Zwp+egld4y43019BaK2yY4sQ59HzJYRD+4Z0BiRltBs2gexVqkFy2k8kGemh
X4kLe2opNQdsh9tcAM+s2VnBuwuiKPXc6AtNXaQKzeuZ6286axweYlCcYufTgzXP
oEu1haDMjsZz9/mXNiQhvGIPMU/obXSRdJYvryhIwpDOqR3cvbpeQTtC/16raNwd
OjE0qFE7AtI9pa7+oCQPfcJurjm6cPkv25b+L2SQ+dW9WkE6QzIP5ynMuxdhxg2m
OxKbuV0mZ3MgbdK+nEc72gUtbUjdb3t/1a9GwoNNLW78eKER3uXl4vxAyIqSKgf7
RViL0/CzEPqU97S/3qVPC27KhsBbqvXwM7gE1MVnm1HiEUiKnlZkLjzFqkorLUMz
emv+mW+kdjZ1aQ==
=FQnf
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-5.8-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull more RISC-V updates from Palmer Dabbelt:
- Kconfig select statements are now sorted alphanumerically
- first-level interrupts are now handled via a full irqchip driver
- CPU hotplug is fixed
- vDSO calls now use the common vDSO infrastructure
* tag 'riscv-for-linus-5.8-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: set the permission of vdso_data to read-only
riscv: use vDSO common flow to reduce the latency of the time-related functions
riscv: fix build warning of missing prototypes
RISC-V: Don't mark init section as non-executable
RISC-V: Force select RISCV_INTC for CONFIG_RISCV
RISC-V: Remove do_IRQ() function
clocksource/drivers/timer-riscv: Use per-CPU timer interrupt
irqchip: RISC-V per-HART local interrupt controller driver
RISC-V: Rename and move plic_find_hart_id() to arch directory
RISC-V: self-contained IPI handling routine
RISC-V: Sort select statements alphanumerically
The head text section (i.e. _start, secondary_start_sbi, etc) and the
init section fall under same page table level-1 mapping.
Currently, the runtime CPU hotplug is broken because we are marking
init section as non-executable which in-turn marks head text section
as non-executable.
Further investigating other architectures, it seems marking the init
section as non-executable is redundant because the init section pages
are anyway poisoned and freed.
To fix broken runtime CPU hotplug, we simply remove the code marking
the init section as non-executable.
Fixes: d27c3c9081 ("riscv: add STRICT_KERNEL_RWX support")
Cc: stable@vger.kernel.org
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Zong Li <zong.li@sifive.com>
Reviewed-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
All architectures define pte_index() as
(address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)
and all architectures define pte_offset_kernel() as an entry in the array
of PTEs indexed by the pte_index().
For the most architectures the pte_offset_kernel() implementation relies
on the availability of pmd_page_vaddr() that converts a PMD entry value to
the virtual address of the page containing PTEs array.
Let's move x86 definitions of the PTE accessors to the generic place in
<linux/pgtable.h> and then simply drop the respective definitions from the
other architectures.
The architectures that didn't provide pmd_page_vaddr() are updated to have
that defined.
The generic implementation of pte_offset_kernel() can be overridden by an
architecture and alpha makes use of this because it has special ordering
requirements for its version of pte_offset_kernel().
[rppt@linux.ibm.com: v2]
Link: http://lkml.kernel.org/r/20200514170327.31389-11-rppt@kernel.org
[rppt@linux.ibm.com: update]
Link: http://lkml.kernel.org/r/20200514170327.31389-12-rppt@kernel.org
[rppt@linux.ibm.com: update]
Link: http://lkml.kernel.org/r/20200514170327.31389-13-rppt@kernel.org
[akpm@linux-foundation.org: fix x86 warning]
[sfr@canb.auug.org.au: fix powerpc build]
Link: http://lkml.kernel.org/r/20200607153443.GB738695@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200514170327.31389-10-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "mm: consolidate definitions of page table accessors", v2.
The low level page table accessors (pXY_index(), pXY_offset()) are
duplicated across all architectures and sometimes more than once. For
instance, we have 31 definition of pgd_offset() for 25 supported
architectures.
Most of these definitions are actually identical and typically it boils
down to, e.g.
static inline unsigned long pmd_index(unsigned long address)
{
return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
}
static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
{
return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(address);
}
These definitions can be shared among 90% of the arches provided
XYZ_SHIFT, PTRS_PER_XYZ and xyz_page_vaddr() are defined.
For architectures that really need a custom version there is always
possibility to override the generic version with the usual ifdefs magic.
These patches introduce include/linux/pgtable.h that replaces
include/asm-generic/pgtable.h and add the definitions of the page table
accessors to the new header.
This patch (of 12):
The linux/mm.h header includes <asm/pgtable.h> to allow inlining of the
functions involving page table manipulations, e.g. pte_alloc() and
pmd_alloc(). So, there is no point to explicitly include <asm/pgtable.h>
in the files that include <linux/mm.h>.
The include statements in such cases are remove with a simple loop:
for f in $(git grep -l "include <linux/mm.h>") ; do
sed -i -e '/include <asm\/pgtable.h>/ d' $f
done
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200514170327.31389-1-rppt@kernel.org
Link: http://lkml.kernel.org/r/20200514170327.31389-2-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* The remainder of the code necessary to support the Kendryte K210.
* Support for building device trees into the kernel, as the K210 doesn't
have a bootloader that provides one.
* A K210 device tree and the associated defconfig update.
* Support for skipping PMP initialization on systems that trap on PMP
accesses rather than treating them as WARL.
* Support for KGDB.
* Improvements to text patching.
* Some cleanups to the SiFive L2 cache driver.
I may have a second part, but I wanted to get this out earlier rather than
later as they've been ready to go for a while now.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAl7YRtMTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRAuExnzX7sYif+oD/4kaoQqmiw0VvyFjBlPT/TC7OPbaoXO
5WBegSEAl+oOq+fUtjHomzn6HaTW2cPNexsqC/KuGP1/QL0OQ23T5O18LaCmrGMd
CvELeAcvu6G4kbRuHwF0U16hBI2A1xIYVUJ8qmbuHPpEwVel3AsP2RIjQ+bOvm7g
fVTIrlzl1s1eTsOXdOJb3sFZtISHb7o3lHZmcplDva3N13x8E0FrjuoJhHv0f2mj
kcmZcy3sr8luAkwAJmbqmdPuwYTlteMnucXdCUv/kG2SaPkBwU5TS+MN7No2ylXH
v4vxjkuSyJ1db72pla6uUB/cYD+mh2YI4tjjvH93s3k4I/GZMCrqivXZW2QtxBdt
na++GxK8e4FrBG+VCCCQvx2mqugpMFMDnZDt36N2BqwjOtEzO3W8GqjrKCovXtQf
farlgAiuFNVcoo7VFdpNZE+N5elQa0GWf57csCqqMNN4J8JLaOgyS7ByozJ9b9/A
1x4g7kQANnMvYiYAeV+C2YfBcK6x8gHEPlWio1QZqirW5la34VUEWa18+rQeSRMz
DArKPauCR1GuYZZVUkkDI8cXj0Gs5fTXsYmn+WsIpjFHnDYeYVe6Ocmczxyb3o3O
lQwm7+drHlSmMvgrOlWJMrIUbdiomgelDU/iD4PiZzKozC52yKDS6A5gzoG6uZSM
UleeAIeVG/TDCw==
=keTq
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-5.8-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- The remainder of the code necessary to support the Kendryte K210:
* Support for building device trees into the kernel, as the K210
doesn't have a bootloader that provides one
* A K210 device tree and the associated defconfig update
* Support for skipping PMP initialization on systems that trap on
PMP accesses rather than treating them as WARL
- Support for KGDB
- Improvements to text patching
- Some cleanups to the SiFive L2 cache driver
* tag 'riscv-for-linus-5.8-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
soc: sifive: l2 cache: Mark l2_get_priv_group as static
soc: sifive: l2 cache: Eliminate an unsigned zero compare warning
riscv: Add support to determine no. of L2 cache way enabled
riscv: cacheinfo: Implement cache_get_priv_group with a generic ops structure
riscv: Use text_mutex instead of patch_lock
riscv: Use NOKPROBE_SYMBOL() instead of __krpobes annotation
riscv: Remove the 'riscv_' prefix of function name
riscv: Add SW single-step support for KDB
riscv: Use the XML target descriptions to report 3 system registers
riscv: Add KGDB support
kgdb: Add kgdb_has_hit_break function
RISC-V: Skip setting up PMPs on traps
riscv: K210: Update defconfig
riscv: K210: Add a built-in device tree
riscv: Allow device trees to be built into the kernel
Support DEBUG_WX to check whether there are mapping with write and execute
permission at the same time.
[akpm@linux-foundation.org: replace macros with C]
Signed-off-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/282e266311bced080bc6f7c255b92f87c1eb65d6.1587455584.git.zong.li@sifive.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
free_area_init() has effectively became a wrapper for
free_area_init_nodes() and there is no point of keeping it. Still
free_area_init() name is shorter and more general as it does not imply
necessity to initialize multiple nodes.
Rename free_area_init_nodes() to free_area_init(), update the callers and
drop old version of free_area_init().
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Hoan Tran <hoan@os.amperecomputing.com> [arm64]
Reviewed-by: Baoquan He <bhe@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200412194859.12663-6-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
arch/riscv/mm/init.c: In function ‘print_vm_layout’:
arch/riscv/mm/init.c:68:37: error: ‘FIXADDR_START’ undeclared (first use in this function);
arch/riscv/mm/init.c:69:20: error: ‘FIXADDR_TOP’ undeclared
arch/riscv/mm/init.c:70:37: error: ‘PCI_IO_START’ undeclared
arch/riscv/mm/init.c:71:20: error: ‘PCI_IO_END’ undeclared
arch/riscv/mm/init.c:72:38: error: ‘VMEMMAP_START’ undeclared
arch/riscv/mm/init.c:73:20: error: ‘VMEMMAP_END’ undeclared (first use in this function);
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Some systems don't provide a useful device tree to the kernel on boot.
Chasing around bootloaders for these systems is a headache, so instead
le't's just keep a device tree table in the kernel, keyed by the SOC's
unique identifier, that contains the relevant DTB.
This is only implemented for M mode right now. While we could implement
this via the SBI calls that allow access to these identifiers, we don't
have any systems that need this right now.
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
This patch removes the unused functions set_kernel_text_rw/ro.
Currently, it is not being invoked from anywhere and no other architecture
(except arm) uses this code. Even in ARM, these functions are not invoked
from anywhere currently.
Fixes: d27c3c9081 ("riscv: add STRICT_KERNEL_RWX support")
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
This tag contains the patches I'd like to target for 5.7. It has a handful of
new features:
* Partial support for the Kendryte K210. There are still a few outstanding
issues that I have patches for, but I don't actually have a board to test
them so they're not included yet.
* SBI v0.2 support.
* Fixes to support for building with LLVM-based toolchains. The resulting
images are known not to boot yet.
This builds and boots for me. There is one merge conflict, it's just a Kconfig
merge issue. I can publish a resolved branch if you'd like.
I don't anticipate a part two, but I'll probably have something early in the
RCs to finish up the K210 support.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAl6OAAoTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRAuExnzX7sYiUqKEACidkNwwFf10hN6ojnIsBeh0mvZ0QuD
qw5Uj0L5rmKdf84IRUGH8A3tyal39BoNz41Eo0mvZoInj542fVMArrqpAIKHN6e+
GMOoHgeZO329zQYMqBX1RN/W9MV80KPKZcROeWkL+AbAmbQBaVRq08Ur1QIg2bHI
84H0LzlCd1xz9k827ypOyz7ix4OYkli7DcUgdiPTK95CjaseALQXvSYA237lcXpB
3g2L+/TDrjtGHn+vy3XWLJISY/BY4ZKfWN0UL4CJHvGuL61tJ+VRXaA3DQcBNd56
7du41GTz9BU6J5wZTVnB5HstebwiXyP8pY34Pp8S4/wWyVdoi5hZ0Jn7sC9oDdnA
r/CjawrGCZv6IEt69YA1edo3AoR13gXCbylRovdxVMRYa0OLmcTfFr843svTZzbQ
ECSt6te2J2YwtYeLO6AlZeu2gBLW0Mxh5JBmiB8sy9C8tVlD/EFTYrnhEQnjUEVx
wV76wfbeYL1be5IS4Tu/d0F5My6miIL+JafUND0bJQ7igp08po/YY4NIg/xyYlM2
Aqie3MuTYlA3/I20N1K2mQkQnjKS4Y5AqNDj5povew2mPUvTGuLhZDZ/asKxdBIf
BSq3V74V/Vc+qsh1d5IhUCDVthGYqBoJoBSUjcbItrpgmhLyvhbbSCLeF8ehDPeI
Y9074bg5YH79pg==
=P1DO
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
"This contains a handful of new features:
- Partial support for the Kendryte K210.
There are still a few outstanding issues that I have patches for,
but I don't actually have a board to test them so they're not
included yet.
- SBI v0.2 support.
- Fixes to support for building with LLVM-based toolchains. The
resulting images are known not to boot yet.
I don't anticipate a part two, but I'll probably have something early
in the RCs to finish up the K210 support"
* tag 'riscv-for-linus-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (38 commits)
riscv: create a loader.bin boot image for Kendryte SoC
riscv: Kendryte K210 default config
riscv: Add Kendryte K210 device tree
riscv: Select required drivers for Kendryte SOC
riscv: Add Kendryte K210 SoC support
riscv: Add SOC early init support
riscv: Unaligned load/store handling for M_MODE
RISC-V: Support cpu hotplug
RISC-V: Add supported for ordered booting method using HSM
RISC-V: Add SBI HSM extension definitions
RISC-V: Export SBI error to linux error mapping function
RISC-V: Add cpu_ops and modify default booting method
RISC-V: Move relocate and few other functions out of __init
RISC-V: Implement new SBI v0.2 extensions
RISC-V: Introduce a new config for SBI v0.1
RISC-V: Add SBI v0.2 extension definitions
RISC-V: Add basic support for SBI v0.2
RISC-V: Mark existing SBI as 0.1 SBI.
riscv: Use macro definition instead of magic number
riscv: Add support to dump the kernel page tables
...
The commit contains that make text section as non-writable, rodata
section as read-only, and data section as non-executable.
The init section should be changed to non-executable.
Signed-off-by: Zong Li <zong.li@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
When looking for the memblock where the kernel lives, we should check
that the memory range associated to the memblock entirely comprises the
kernel image and not only intersects with it.
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
__pa_symbol is the marcro that should be used for kernel symbols. It is
also a pre-requisite for DEBUG_VIRTUAL which will do bounds checking.
Signed-off-by: Zong Li <zong.li@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
Add support for dumping the kernel address space layout to the console.
User can enable CONFIG_DEBUG_VM to dump the virtual memory region into
dmesg buffer during boot-up.
Signed-off-by: Yash Shah <yash.shah@sifive.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
[paul.walmsley@sifive.com: dropped .init/.text/.data/.bss prints;
added PCI legacy I/O region display]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
The kernel runs in M-mode without using page tables, and thus can't run
bare metal without help from additional firmware.
Most of the patch is just stubbing out code not needed without page
tables, but there is an interesting detail in the signals implementation:
- The normal RISC-V syscall ABI only implements rt_sigreturn as VDSO
entry point, but the ELF VDSO is not supported for nommu Linux.
We instead copy the code to call the syscall onto the stack.
In addition to enabling the nommu code a new defconfig for a small
kernel image that can run in nommu mode on qemu is also provided, to run
a kernel in qemu you can use the following command line:
qemu-system-riscv64 -smp 2 -m 64 -machine virt -nographic \
-kernel arch/riscv/boot/loader \
-drive file=rootfs.ext2,format=raw,id=hd0 \
-device virtio-blk-device,drive=hd0
Contains contributions from Damien Le Moal <Damien.LeMoal@wdc.com>.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
[paul.walmsley@sifive.com: updated to apply; add CONFIG_MMU guards
around PCI_IOBASE definition to fix build issues; fixed checkpatch
issues; move the PCI_IO_* and VMEMMAP address space macros along
with the others; resolve sparse warning]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
The PMD_SIZE is equal to PGDIR_SIZE when __PAGETABLE_PMD_FOLDED is
defined.
Signed-off-by: Zong Li <zong.li@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[paul.walmsley@sifive.com: fixed spelling in commit summary]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
sparse complains loudly when string literals associated with
preprocessor directives are split into multiple, separately quoted
strings across different lines:
arch/riscv/mm/init.c:341:9: error: Expected ; at the end of type declaration
arch/riscv/mm/init.c:341:9: error: got "not use absolute addressing."
arch/riscv/mm/init.c:358:9: error: Trying to use reserved word 'do' as identifier
arch/riscv/mm/init.c:358:9: error: Expected ; at end of declaration
[ ... ]
It turns out this doesn't compile. The existing Linux practice for
this situation is simply to use a single long line. So, fix by
concatenating the strings.
This patch should have no functional impact.
This version incorporates changes based on feedback from Luc Van
Oostenryck <luc.vanoostenryck@gmail.com>.
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
Reviewed-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/linux-riscv/CAAhSdy2nX2LwEEAZuMtW_ByGTkHO6KaUEvVxRnba_ENEjmFayQ@mail.gmail.com/T/#mc1a58bc864f71278123d19a7abc083a9c8e37033
Fixes: 387181dcdb ("RISC-V: Always compile mm/init.c with cmodel=medany and notrace")
Cc: Anup Patel <anup.patel@wdc.com>
Add prototypes for assembly language functions defined in head.S,
and include these prototypes into C source files that call those
functions.
This patch resolves the following warnings from sparse:
arch/riscv/kernel/setup.c:39:10: warning: symbol 'hart_lottery' was not declared. Should it be static?
arch/riscv/kernel/setup.c:42:13: warning: symbol 'parse_dtb' was not declared. Should it be static?
arch/riscv/kernel/smpboot.c:33:6: warning: symbol '__cpu_up_stack_pointer' was not declared. Should it be static?
arch/riscv/kernel/smpboot.c:34:6: warning: symbol '__cpu_up_task_pointer' was not declared. Should it be static?
arch/riscv/mm/fault.c:25:17: warning: symbol 'do_page_fault' was not declared. Should it be static?
This change should have no functional impact.
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
Using CONFIG_SPARSEMEM_VMEMMAP instead of CONFIG_SPARSEMEM to fix
following build issue.
riscv64-linux-ld: arch/riscv/mm/init.o: in function 'vmemmap_populate':
init.c:(.meminit.text+0x8): undefined reference to 'vmemmap_populate_basepages'
Cc: Logan Gunthorpe <logang@deltatee.com>
Fixes: d95f1a542c ("RISC-V: Implement sparsemem")
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
This fixes an error with how the FDT blob is reserved in memblock.
An incorrect physical address calculation exposed the FDT header to
unintended corruption, which typically manifested with of_fdt_raw_init()
faulting during late boot after fdt_totalsize() returned a wrong value.
Systems with smaller physical memory sizes more frequently trigger this
issue, as the kernel is more likely to allocate from the DMA32 zone
where bbl places the DTB after the kernel image.
Commit 671f9a3e2e ("RISC-V: Setup initial page tables in two stages")
changed the mapping of the DTB to reside in the fixmap area.
Consequently, early_init_fdt_reserve_self() cannot be used anymore in
setup_bootmem() since it relies on __pa() to derive a physical address,
which does not work with dtb_early_va that is no longer a valid kernel
logical address.
The reserved[0x1] region shows the effect of the pointer underflow
resulting from the __pa(initial_boot_params) offset subtraction:
[ 0.000000] MEMBLOCK configuration:
[ 0.000000] memory size = 0x000000001fe00000 reserved size = 0x0000000000a2e514
[ 0.000000] memory.cnt = 0x1
[ 0.000000] memory[0x0] [0x0000000080200000-0x000000009fffffff], 0x000000001fe00000 bytes flags: 0x0
[ 0.000000] reserved.cnt = 0x2
[ 0.000000] reserved[0x0] [0x0000000080200000-0x0000000080c2dfeb], 0x0000000000a2dfec bytes flags: 0x0
[ 0.000000] reserved[0x1] [0xfffffff080100000-0xfffffff080100527], 0x0000000000000528 bytes flags: 0x0
With the fix applied:
[ 0.000000] MEMBLOCK configuration:
[ 0.000000] memory size = 0x000000001fe00000 reserved size = 0x0000000000a2e514
[ 0.000000] memory.cnt = 0x1
[ 0.000000] memory[0x0] [0x0000000080200000-0x000000009fffffff], 0x000000001fe00000 bytes flags: 0x0
[ 0.000000] reserved.cnt = 0x2
[ 0.000000] reserved[0x0] [0x0000000080200000-0x0000000080c2dfeb], 0x0000000000a2dfec bytes flags: 0x0
[ 0.000000] reserved[0x1] [0x0000000080e00000-0x0000000080e00527], 0x0000000000000528 bytes flags: 0x0
Fixes: 671f9a3e2e ("RISC-V: Setup initial page tables in two stages")
Signed-off-by: Albert Ou <aou@eecs.berkeley.edu>
Tested-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
Implement sparsemem support for Risc-v which helps pave the
way for memory hotplug and eventually P2P support.
Introduce Kconfig options for virtual and physical address bits which
are used to calculate the size of the vmemmap and set the
MAX_PHYSMEM_BITS.
The vmemmap is located directly before the VMALLOC region and sized
such that we can allocate enough pages to populate all the virtual
address space in the system (similar to the way it's done in arm64).
During initialization, call memblocks_present() and sparse_init(),
and provide a stub for vmemmap_populate() (all of which is similar to
arm64).
[greentime.hu@sifive.com: fixed pfn_valid, FIXADDR_TOP and fixed a bug
rebasing onto v5.3]
Signed-off-by: Greentime Hu <greentime.hu@sifive.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Andrew Waterman <andrew@sifive.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Michael Clark <michaeljclark@mac.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Zong Li <zong@andestech.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
[paul.walmsley@sifive.com: updated to apply; minor commit message
reformat]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
Since commit a3182c91ef ("RISC-V: Access CSRs using CSR numbers"),
we should prefer accessing CSRs using their CSR numbers, but there
are several leftovers like sstatus / sptbr we missed.
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
Currently, the setup_vm() does initial page table setup in one-shot
very early before enabling MMU. Due to this, the setup_vm() has to map
all possible kernel virtual addresses since it does not know size and
location of RAM. This means we have kernel mappings for non-existent
RAM and any buggy driver (or kernel) code doing out-of-bound access
to RAM will not fault and cause underterministic behaviour.
Further, the setup_vm() creates PMD mappings (i.e. 2M mappings) for
RV64 systems. This means for PAGE_OFFSET=0xffffffe000000000 (i.e.
MAXPHYSMEM_128GB=y), the setup_vm() will require 129 pages (i.e.
516 KB) of memory for initial page tables which is never freed. The
memory required for initial page tables will further increase if
we chose a lower value of PAGE_OFFSET (e.g. 0xffffff0000000000)
This patch implements two-staged initial page table setup, as follows:
1. Early (i.e. setup_vm()): This stage maps kernel image and DTB in
a early page table (i.e. early_pg_dir). The early_pg_dir will be used
only by boot HART so it can be freed as-part of init memory free-up.
2. Final (i.e. setup_vm_final()): This stage maps all possible RAM
banks in the final page table (i.e. swapper_pg_dir). The boot HART
will start using swapper_pg_dir at the end of setup_vm_final(). All
non-boot HARTs directly use the swapper_pg_dir created by boot HART.
We have following advantages with this new approach:
1. Kernel mappings for non-existent RAM don't exists anymore.
2. Memory consumed by initial page tables is now indpendent of the
chosen PAGE_OFFSET.
3. Memory consumed by initial page tables on RV64 system is 2 pages
(i.e. 8 KB) which has significantly reduced and these pages will be
freed as-part of the init memory free-up.
The patch also provides a foundation for implementing strict kernel
mappings where we protect kernel text and rodata using PTE permissions.
Suggested-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
[paul.walmsley@sifive.com: updated to apply; fixed a checkpatch warning]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
The RISC-V free_initrd_mem is identical to the default one, except
that it doesn't poison the freed memory. Remove it so that the
default implementations gets used instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
Currently, the setup_bootmem() reserves memory from RAM start to the
kernel end. This prevents us from exploring ways to use the RAM below
(or before) the kernel start hence this patch updates setup_bootmem()
to only reserve memory from the kernel start to the kernel end.
Suggested-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation version 2 this program is distributed
in the hope that it will be useful but without any warranty without
even the implied warranty of merchantability or fitness for a
particular purpose see the gnu general public license for more
details
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 97 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141901.025053186@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The riscv version of free_initmem() differs from the generic one only in
that it sets the freed memory to zero.
Make ricsv use the generic version and poison the freed memory.
Link: http://lkml.kernel.org/r/1550515285-17446-5-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Richard Kuo <rkuo@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The Maximum Physical Memory 2GiB option for 64bit systems is currently
broken because kernel hangs at boot-time when this option is enabled
and the underlying system has more than 2GiB memory.
This issue can be easily reproduced on SiFive Unleashed board where
we have 8GiB of memory.
This patch fixes above issue by removing unusable memory region in
setup_bootmem().
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
The Linux RISC-V 32bit kernel is broken after we moved setup_vm() from
kernel/setup.c to mm/init.c because Linux RISC-V 32bit kernel by default
uses cmodel=medlow which results in a non-position-independent setup_vm().
This patch fixes Linux RISC-V 32bit kernel booting by:
1. Forcing cmodel=medany for mm/init.c
2. Moving remaing MM-related stuff va_pa_offset, pfn_base and
empty_zero_page from kernel/setup.c to mm/init.c
Further, the setup_vm() cannot handle GCC instrumentation for FTRACE so
we disable it for mm/init.c by not using "-pg" compiler flag.
Fixes: 6f1e9e946f ("RISC-V: Move setup_vm() to mm/init.c")
Suggested-by: Christoph Hellwig <hch@lst.de>
Suggested-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
We should free-up initrd memory in free_initrd_mem() instead
of doing nothing.
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
This patch implements compile-time virtual to physical mappings. These
compile-time fixed mappings can be used by earlycon, ACPI, and early
ioremap for creating fixed mappings when FIX_EARLYCON_MEM=y.
To start with, we have enabled compile-time fixed mappings for earlycon.
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
The setup_vm() is responsible for setting up initial page table hence
should be placed in mm/init.c.
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
The setup_bootmem() mainly populates memblocks and does early memory
reservations. The right location for this function is mm/init.c. It
calls setup_initrd() so we move that as well.
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
max_low_pfn should be pfn_size not byte_size.
Signed-off-by: Guo Ren <ren_guo@c-sky.com>
Signed-off-by: Mao Han <mao_han@c-sky.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Move remaining definitions and declarations from include/linux/bootmem.h
into include/linux/memblock.h and remove the redundant header.
The includes were replaced with the semantic patch below and then
semi-automated removal of duplicated '#include <linux/memblock.h>
@@
@@
- #include <linux/bootmem.h>
+ #include <linux/memblock.h>
[sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
[sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
[sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The DMA32 is for 64-bit usage.
Signed-off-by: Zong Li <zong@andestech.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
init_mm.pgd (aka swapped_pgd) gets relocated like all other kernel
symbols by the elf loader, so there is no need to reload it from satp.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
This patch allows devices that require memory that can be addressed
using 32-bit addresses to work easily on RISC-V systems. The newly
improved dma-direct ops will tap into this pool automatically for
32-bit addressing.
Based on an earlier patch from Wesley W. Terpstra.
CC: Wesley W. Terpstra <terpstra@sifive.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
This contains the various __init C functions, the initial assembly
kernel entry point, and the code to reset the system. When a file was
init-related this patch contains the entire file.
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>