mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
synced 2025-09-05 11:53:41 +00:00
loongarch-next
452 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
![]() |
34e3c4500c |
LoongArch: Rework CPU feature probe from CPUCFG/IOCSR
Probe ISA level, TLB, IOCSR information from CPUCFG to improve kernel resilience to different core implementations. BTW, IOCSR register definition appears to be a platform-specific spec instead of an architecture spec, even for the Loongson CPUs there is no guarantee that IOCSR will always present. Thus it's dangerous to perform IOCSR probing without checking CPU type and instruction availability. Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
d0bb0b6000 |
LoongArch: Enable ACPI BGRT handling
Add ACPI BGRT support on LoongArch so it can display image provied by acpi table at boot stage and switch to graphical UI smoothly. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
617a814f14 |
ALong with the usual shower of singleton patches, notable patch series in
this pull request are: "Align kvrealloc() with krealloc()" from Danilo Krummrich. Adds consistency to the APIs and behaviour of these two core allocation functions. This also simplifies/enables Rustification. "Some cleanups for shmem" from Baolin Wang. No functional changes - mode code reuse, better function naming, logic simplifications. "mm: some small page fault cleanups" from Josef Bacik. No functional changes - code cleanups only. "Various memory tiering fixes" from Zi Yan. A small fix and a little cleanup. "mm/swap: remove boilerplate" from Yu Zhao. Code cleanups and simplifications and .text shrinkage. "Kernel stack usage histogram" from Pasha Tatashin and Shakeel Butt. This is a feature, it adds new feilds to /proc/vmstat such as $ grep kstack /proc/vmstat kstack_1k 3 kstack_2k 188 kstack_4k 11391 kstack_8k 243 kstack_16k 0 which tells us that 11391 processes used 4k of stack while none at all used 16k. Useful for some system tuning things, but partivularly useful for "the dynamic kernel stack project". "kmemleak: support for percpu memory leak detect" from Pavel Tikhomirov. Teaches kmemleak to detect leaksage of percpu memory. "mm: memcg: page counters optimizations" from Roman Gushchin. "3 independent small optimizations of page counters". "mm: split PTE/PMD PT table Kconfig cleanups+clarifications" from David Hildenbrand. Improves PTE/PMD splitlock detection, makes powerpc/8xx work correctly by design rather than by accident. "mm: remove arch_make_page_accessible()" from David Hildenbrand. Some folio conversions which make arch_make_page_accessible() unneeded. "mm, memcg: cg2 memory{.swap,}.peak write handlers" fro David Finkel. Cleans up and fixes our handling of the resetting of the cgroup/process peak-memory-use detector. "Make core VMA operations internal and testable" from Lorenzo Stoakes. Rationalizaion and encapsulation of the VMA manipulation APIs. With a view to better enable testing of the VMA functions, even from a userspace-only harness. "mm: zswap: fixes for global shrinker" from Takero Funaki. Fix issues in the zswap global shrinker, resulting in improved performance. "mm: print the promo watermark in zoneinfo" from Kaiyang Zhao. Fill in some missing info in /proc/zoneinfo. "mm: replace follow_page() by folio_walk" from David Hildenbrand. Code cleanups and rationalizations (conversion to folio_walk()) resulting in the removal of follow_page(). "improving dynamic zswap shrinker protection scheme" from Nhat Pham. Some tuning to improve zswap's dynamic shrinker. Significant reductions in swapin and improvements in performance are shown. "mm: Fix several issues with unaccepted memory" from Kirill Shutemov. Improvements to the new unaccepted memory feature, "mm/mprotect: Fix dax puds" from Peter Xu. Implements mprotect on DAX PUDs. This was missing, although nobody seems to have notied yet. "Introduce a store type enum for the Maple tree" from Sidhartha Kumar. Cleanups and modest performance improvements for the maple tree library code. "memcg: further decouple v1 code from v2" from Shakeel Butt. Move more cgroup v1 remnants away from the v2 memcg code. "memcg: initiate deprecation of v1 features" from Shakeel Butt. Adds various warnings telling users that memcg v1 features are deprecated. "mm: swap: mTHP swap allocator base on swap cluster order" from Chris Li. Greatly improves the success rate of the mTHP swap allocation. "mm: introduce numa_memblks" from Mike Rapoport. Moves various disparate per-arch implementations of numa_memblk code into generic code. "mm: batch free swaps for zap_pte_range()" from Barry Song. Greatly improves the performance of munmap() of swap-filled ptes. "support large folio swap-out and swap-in for shmem" from Baolin Wang. With this series we no longer split shmem large folios into simgle-page folios when swapping out shmem. "mm/hugetlb: alloc/free gigantic folios" from Yu Zhao. Nice performance improvements and code reductions for gigantic folios. "support shmem mTHP collapse" from Baolin Wang. Adds support for khugepaged's collapsing of shmem mTHP folios. "mm: Optimize mseal checks" from Pedro Falcato. Fixes an mprotect() performance regression due to the addition of mseal(). "Increase the number of bits available in page_type" from Matthew Wilcox. Increases the number of bits available in page_type! "Simplify the page flags a little" from Matthew Wilcox. Many legacy page flags are now folio flags, so the page-based flags and their accessors/mutators can be removed. "mm: store zero pages to be swapped out in a bitmap" from Usama Arif. An optimization which permits us to avoid writing/reading zero-filled zswap pages to backing store. "Avoid MAP_FIXED gap exposure" from Liam Howlett. Fixes a race window which occurs when a MAP_FIXED operqtion is occurring during an unrelated vma tree walk. "mm: remove vma_merge()" from Lorenzo Stoakes. Major rotorooting of the vma_merge() functionality, making ot cleaner, more testable and better tested. "misc fixups for DAMON {self,kunit} tests" from SeongJae Park. Minor fixups of DAMON selftests and kunit tests. "mm: memory_hotplug: improve do_migrate_range()" from Kefeng Wang. Code cleanups and folio conversions. "Shmem mTHP controls and stats improvements" from Ryan Roberts. Cleanups for shmem controls and stats. "mm: count the number of anonymous THPs per size" from Barry Song. Expose additional anon THP stats to userspace for improved tuning. "mm: finish isolate/putback_lru_page()" from Kefeng Wang: more folio conversions and removal of now-unused page-based APIs. "replace per-quota region priorities histogram buffer with per-context one" from SeongJae Park. DAMON histogram rationalization. "Docs/damon: update GitHub repo URLs and maintainer-profile" from SeongJae Park. DAMON documentation updates. "mm/vdpa: correct misuse of non-direct-reclaim __GFP_NOFAIL and improve related doc and warn" from Jason Wang: fixes usage of page allocator __GFP_NOFAIL and GFP_ATOMIC flags. "mm: split underused THPs" from Yu Zhao. Improve THP=always policy - this was overprovisioning THPs in sparsely accessed memory areas. "zram: introduce custom comp backends API" frm Sergey Senozhatsky. Add support for zram run-time compression algorithm tuning. "mm: Care about shadow stack guard gap when getting an unmapped area" from Mark Brown. Fix up the various arch_get_unmapped_area() implementations to better respect guard areas. "Improve mem_cgroup_iter()" from Kinsey Ho. Improve the reliability of mem_cgroup_iter() and various code cleanups. "mm: Support huge pfnmaps" from Peter Xu. Extends the usage of huge pfnmap support. "resource: Fix region_intersects() vs add_memory_driver_managed()" from Huang Ying. Fix a bug in region_intersects() for systems with CXL memory. "mm: hwpoison: two more poison recovery" from Kefeng Wang. Teaches a couple more code paths to correctly recover from the encountering of poisoned memry. "mm: enable large folios swap-in support" from Barry Song. Support the swapin of mTHP memory into appropriately-sized folios, rather than into single-page folios. -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZu1BBwAKCRDdBJ7gKXxA jlWNAQDYlqQLun7bgsAN4sSvi27VUuWv1q70jlMXTfmjJAvQqwD/fBFVR6IOOiw7 AkDbKWP2k0hWPiNJBGwoqxdHHx09Xgo= =s0T+ -----END PGP SIGNATURE----- Merge tag 'mm-stable-2024-09-20-02-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Along with the usual shower of singleton patches, notable patch series in this pull request are: - "Align kvrealloc() with krealloc()" from Danilo Krummrich. Adds consistency to the APIs and behaviour of these two core allocation functions. This also simplifies/enables Rustification. - "Some cleanups for shmem" from Baolin Wang. No functional changes - mode code reuse, better function naming, logic simplifications. - "mm: some small page fault cleanups" from Josef Bacik. No functional changes - code cleanups only. - "Various memory tiering fixes" from Zi Yan. A small fix and a little cleanup. - "mm/swap: remove boilerplate" from Yu Zhao. Code cleanups and simplifications and .text shrinkage. - "Kernel stack usage histogram" from Pasha Tatashin and Shakeel Butt. This is a feature, it adds new feilds to /proc/vmstat such as $ grep kstack /proc/vmstat kstack_1k 3 kstack_2k 188 kstack_4k 11391 kstack_8k 243 kstack_16k 0 which tells us that 11391 processes used 4k of stack while none at all used 16k. Useful for some system tuning things, but partivularly useful for "the dynamic kernel stack project". - "kmemleak: support for percpu memory leak detect" from Pavel Tikhomirov. Teaches kmemleak to detect leaksage of percpu memory. - "mm: memcg: page counters optimizations" from Roman Gushchin. "3 independent small optimizations of page counters". - "mm: split PTE/PMD PT table Kconfig cleanups+clarifications" from David Hildenbrand. Improves PTE/PMD splitlock detection, makes powerpc/8xx work correctly by design rather than by accident. - "mm: remove arch_make_page_accessible()" from David Hildenbrand. Some folio conversions which make arch_make_page_accessible() unneeded. - "mm, memcg: cg2 memory{.swap,}.peak write handlers" fro David Finkel. Cleans up and fixes our handling of the resetting of the cgroup/process peak-memory-use detector. - "Make core VMA operations internal and testable" from Lorenzo Stoakes. Rationalizaion and encapsulation of the VMA manipulation APIs. With a view to better enable testing of the VMA functions, even from a userspace-only harness. - "mm: zswap: fixes for global shrinker" from Takero Funaki. Fix issues in the zswap global shrinker, resulting in improved performance. - "mm: print the promo watermark in zoneinfo" from Kaiyang Zhao. Fill in some missing info in /proc/zoneinfo. - "mm: replace follow_page() by folio_walk" from David Hildenbrand. Code cleanups and rationalizations (conversion to folio_walk()) resulting in the removal of follow_page(). - "improving dynamic zswap shrinker protection scheme" from Nhat Pham. Some tuning to improve zswap's dynamic shrinker. Significant reductions in swapin and improvements in performance are shown. - "mm: Fix several issues with unaccepted memory" from Kirill Shutemov. Improvements to the new unaccepted memory feature, - "mm/mprotect: Fix dax puds" from Peter Xu. Implements mprotect on DAX PUDs. This was missing, although nobody seems to have notied yet. - "Introduce a store type enum for the Maple tree" from Sidhartha Kumar. Cleanups and modest performance improvements for the maple tree library code. - "memcg: further decouple v1 code from v2" from Shakeel Butt. Move more cgroup v1 remnants away from the v2 memcg code. - "memcg: initiate deprecation of v1 features" from Shakeel Butt. Adds various warnings telling users that memcg v1 features are deprecated. - "mm: swap: mTHP swap allocator base on swap cluster order" from Chris Li. Greatly improves the success rate of the mTHP swap allocation. - "mm: introduce numa_memblks" from Mike Rapoport. Moves various disparate per-arch implementations of numa_memblk code into generic code. - "mm: batch free swaps for zap_pte_range()" from Barry Song. Greatly improves the performance of munmap() of swap-filled ptes. - "support large folio swap-out and swap-in for shmem" from Baolin Wang. With this series we no longer split shmem large folios into simgle-page folios when swapping out shmem. - "mm/hugetlb: alloc/free gigantic folios" from Yu Zhao. Nice performance improvements and code reductions for gigantic folios. - "support shmem mTHP collapse" from Baolin Wang. Adds support for khugepaged's collapsing of shmem mTHP folios. - "mm: Optimize mseal checks" from Pedro Falcato. Fixes an mprotect() performance regression due to the addition of mseal(). - "Increase the number of bits available in page_type" from Matthew Wilcox. Increases the number of bits available in page_type! - "Simplify the page flags a little" from Matthew Wilcox. Many legacy page flags are now folio flags, so the page-based flags and their accessors/mutators can be removed. - "mm: store zero pages to be swapped out in a bitmap" from Usama Arif. An optimization which permits us to avoid writing/reading zero-filled zswap pages to backing store. - "Avoid MAP_FIXED gap exposure" from Liam Howlett. Fixes a race window which occurs when a MAP_FIXED operqtion is occurring during an unrelated vma tree walk. - "mm: remove vma_merge()" from Lorenzo Stoakes. Major rotorooting of the vma_merge() functionality, making ot cleaner, more testable and better tested. - "misc fixups for DAMON {self,kunit} tests" from SeongJae Park. Minor fixups of DAMON selftests and kunit tests. - "mm: memory_hotplug: improve do_migrate_range()" from Kefeng Wang. Code cleanups and folio conversions. - "Shmem mTHP controls and stats improvements" from Ryan Roberts. Cleanups for shmem controls and stats. - "mm: count the number of anonymous THPs per size" from Barry Song. Expose additional anon THP stats to userspace for improved tuning. - "mm: finish isolate/putback_lru_page()" from Kefeng Wang: more folio conversions and removal of now-unused page-based APIs. - "replace per-quota region priorities histogram buffer with per-context one" from SeongJae Park. DAMON histogram rationalization. - "Docs/damon: update GitHub repo URLs and maintainer-profile" from SeongJae Park. DAMON documentation updates. - "mm/vdpa: correct misuse of non-direct-reclaim __GFP_NOFAIL and improve related doc and warn" from Jason Wang: fixes usage of page allocator __GFP_NOFAIL and GFP_ATOMIC flags. - "mm: split underused THPs" from Yu Zhao. Improve THP=always policy. This was overprovisioning THPs in sparsely accessed memory areas. - "zram: introduce custom comp backends API" frm Sergey Senozhatsky. Add support for zram run-time compression algorithm tuning. - "mm: Care about shadow stack guard gap when getting an unmapped area" from Mark Brown. Fix up the various arch_get_unmapped_area() implementations to better respect guard areas. - "Improve mem_cgroup_iter()" from Kinsey Ho. Improve the reliability of mem_cgroup_iter() and various code cleanups. - "mm: Support huge pfnmaps" from Peter Xu. Extends the usage of huge pfnmap support. - "resource: Fix region_intersects() vs add_memory_driver_managed()" from Huang Ying. Fix a bug in region_intersects() for systems with CXL memory. - "mm: hwpoison: two more poison recovery" from Kefeng Wang. Teaches a couple more code paths to correctly recover from the encountering of poisoned memry. - "mm: enable large folios swap-in support" from Barry Song. Support the swapin of mTHP memory into appropriately-sized folios, rather than into single-page folios" * tag 'mm-stable-2024-09-20-02-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (416 commits) zram: free secondary algorithms names uprobes: turn xol_area->pages[2] into xol_area->page uprobes: introduce the global struct vm_special_mapping xol_mapping Revert "uprobes: use vm_special_mapping close() functionality" mm: support large folios swap-in for sync io devices mm: add nr argument in mem_cgroup_swapin_uncharge_swap() helper to support large folios mm: fix swap_read_folio_zeromap() for large folios with partial zeromap mm/debug_vm_pgtable: Use pxdp_get() for accessing page table entries set_memory: add __must_check to generic stubs mm/vma: return the exact errno in vms_gather_munmap_vmas() memcg: cleanup with !CONFIG_MEMCG_V1 mm/show_mem.c: report alloc tags in human readable units mm: support poison recovery from copy_present_page() mm: support poison recovery from do_cow_fault() resource, kunit: add test case for region_intersects() resource: make alloc_free_mem_region() works for iomem_resource mm: z3fold: deprecate CONFIG_Z3FOLD vfio/pci: implement huge_fault support mm/arm64: support large pfn mappings mm/x86: support large pfn mappings ... |
||
![]() |
4a39ac5b7d |
Random number generator updates for Linux 6.12-rc1.
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmboHyUACgkQSfxwEqXe A66wGQ/8DRIjBllwf1YuTWi4T6OcfoYxK6C9bXO6QPP5gzdTyFE9pvDuuPyad6+F FR086ydTHeodemz1dFiQCL9etcUaxo4+6FRKyXKF9/1ezGbTA5nJd0/fKJGlqbI2 EoA4LNYHOsvCZk1BTpxRNWKeKphU9zQgQdSigy6Rx8p269UkGmIZjD1PtUc+vqfR Ox0dK/Cswyo236fRi5HzaoMntWI4vXgLfxty0e1R7tfbstkCxSKWAON1lo3uHgkA 0HpJXWgWXAPt9gp++Fs/jGNpOqbt6IaKeV5f7CjYfvWhlFjNMhQxF+PbxknaZn/k K0gQsItOIoFTfbQdLDIdfnj9awMdLW8FB2A1WXHpNr9pVC4ickPb1bMTF/XRd0tm wBNu4BL0gklx6017KZg5uINMIduzMLGkBLRFiBW0en/sZMLTJTMg58BJn0CL1Pmh 1ll/Q3ToSMHalvxU2OnJagTwh4fzzCEpK/hW9WiDO4jSCsMXyX0clinrCjNo1JfA tqgTWEy3uGtg+dg0Du9VD5JASbNQSJ0ZRnas5+qz10IRWWfTolrsk61dliXLQ4Sv tSryDtsE2znwJF1Krh4aHNSSVhD5/l/8QaXkf9aZc/kkaHxwsx83FuWnqw6nMz8c l4B2MbH0jUgsEqEyx+0iwk+FXE9kZKWumTVLjFZ6bRnq3q+uq0U= =mWCw -----END PGP SIGNATURE----- Merge tag 'random-6.12-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator updates from Jason Donenfeld: "Originally I'd planned on sending each of the vDSO getrandom() architecture ports to their respective arch trees. But as we started to work on this, we found lots of interesting issues in the shared code and infrastructure, the fixes for which the various archs needed to base their work. So in the end, this turned into a nice collaborative effort fixing up issues and porting to 5 new architectures -- arm64, powerpc64, powerpc32, s390x, and loongarch64 -- with everybody pitching in and commenting on each other's code. It was a fun development cycle. This contains: - Numerous fixups to the vDSO selftest infrastructure, getting it running successfully on more platforms, and fixing bugs in it. - Additions to the vDSO getrandom & chacha selftests. Basically every time manual review unearthed a bug in a revision of an arch patch, or an ambiguity, the tests were augmented. By the time the last arch was submitted for review, s390x, v1 of the series was essentially fine right out of the gate. - Fixes to the the generic C implementation of vDSO getrandom, to build and run successfully on all archs, decoupling it from assumptions we had (unintentionally) made on x86_64 that didn't carry through to the other architectures. - Port of vDSO getrandom to LoongArch64, from Xi Ruoyao and acked by Huacai Chen. - Port of vDSO getrandom to ARM64, from Adhemerval Zanella and acked by Will Deacon. - Port of vDSO getrandom to PowerPC, in both 32-bit and 64-bit varieties, from Christophe Leroy and acked by Michael Ellerman. - Port of vDSO getrandom to S390X from Heiko Carstens, the arch maintainer. While it'd be natural for there to be things to fix up over the course of the development cycle, these patches got a decent amount of review from a fairly diverse crew of folks on the mailing lists, and, for the most part, they've been cooking in linux-next, which has been helpful for ironing out build issues. In terms of architectures, I think that mostly takes care of the important 64-bit archs with hardware still being produced and running production loads in settings where vDSO getrandom is likely to help. Arguably there's still RISC-V left, and we'll see for 6.13 whether they find it useful and submit a port" * tag 'random-6.12-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (47 commits) selftests: vDSO: check cpu caps before running chacha test s390/vdso: Wire up getrandom() vdso implementation s390/vdso: Move vdso symbol handling to separate header file s390/vdso: Allow alternatives in vdso code s390/module: Provide find_section() helper s390/facility: Let test_facility() generate static branch if possible s390/alternatives: Remove ALT_FACILITY_EARLY s390/facility: Disable compile time optimization for decompressor code selftests: vDSO: fix vdso_config for s390 selftests: vDSO: fix ELF hash table entry size for s390x powerpc/vdso: Wire up getrandom() vDSO implementation on VDSO64 powerpc/vdso: Wire up getrandom() vDSO implementation on VDSO32 powerpc/vdso: Refactor CFLAGS for CVDSO build powerpc/vdso32: Add crtsavres mm: Define VM_DROPPABLE for powerpc/32 powerpc/vdso: Fix VDSO data access when running in a non-root time namespace selftests: vDSO: don't include generated headers for chacha test arm64: vDSO: Wire up getrandom() vDSO implementation arm64: alternative: make alternative_has_cap_likely() VDSO compatible selftests: vDSO: also test counter in vdso_test_chacha ... |
||
![]() |
0eb0bd21e8 |
LoongArch: Remove STACK_FRAME_NON_STANDARD(do_syscall)
For now, we can remove STACK_FRAME_NON_STANDARD(do_syscall) because
there is no objtool warning "do_syscall+0x11c: return with modified
stack frame", then there is handle_syscall() which is the previous
frame of do_syscall() in the call trace when executing the command
"echo l > /proc/sysrq-trigger".
Fixes:
|
||
![]() |
987cbafe62 |
Merge tag 'irq-core-2024-09-16' into loongarch-next
LoongArch architecture changes for 6.12 depend on the irq core changes about AVEC irqchip to avoid confliction, so merge them to create a base. |
||
![]() |
cb69d86550 |
Updates for the interrupt subsystem:
- Core: - Remove a global lock in the affinity setting code The lock protects a cpumask for intermediate results and the lock causes a bottleneck on simultaneous start of multiple virtual machines. Replace the lock and the static cpumask with a per CPU cpumask which is nicely serialized by raw spinlock held when executing this code. - Provide support for giving a suffix to interrupt domain names. That's required to support devices with subfunctions so that the domain names are distinct even if they originate from the same device node. - The usual set of cleanups and enhancements all over the place - Drivers: - Support for longarch AVEC interrupt chip - Refurbishment of the Armada driver so it can be extended for new variants. - The usual set of cleanups and enhancements all over the place -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmbn5p8THHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoRFtD/43eB3h5usY2OPW0JmDqrE6qnzsvjPZ 1H52BcmMcOuI6yCfTnbi/fBB52mwSEGq9Dmt1GXradyq9/CJDIqZ1ajI1rA2jzW2 YdbeTDpKm1rS2ddzfp2LT2BryrNt+7etrRO7qHn4EKSuOcNuV2f58WPbIIqasvaK uPbUDVDPrvXxLNcjoab6SqaKrEoAaHSyKpd0MvDd80wHrtcSC/QouW7JDSUXv699 RwvLebN1OF6mQ2J8Z3DLeCQpcbAs+UT8UvID7kYUJi1g71J/ZY+xpMLoX/gHiDNr isBtsuEAiZeNaFpksc7A6Jgu5ljZf2/aLCqbPLlHaduHFNmo94x9KUbIF2cpEMN+ rsf5Ff7AVh1otz3cUwLLsm+cFLWRRoZdLuncn7rrgB4Yg0gll7qzyLO6YGvQHr8U Ocj1RXtvvWsMk4XzhgCt1AH/42cO6go+bhA4HspeYykNpsIldIUl1MeFbO8sWiDJ kybuwiwHp3oaMLjEK4Lpq65u7Ll8Lju2zRde65YUJN2nbNmJFORrOLmeC1qsr6ri dpend6n2qD9UD1oAt32ej/uXnG160nm7UKescyxiZNeTm1+ez8GW31hY128ifTY3 4R3urGS38p3gazXBsfw6eqkeKx0kEoDNoQqrO5gBvb8kowYTvoZtkwMGAN9OADwj w6vvU0i+NIyVMA== =JlJ2 -----END PGP SIGNATURE----- Merge tag 'irq-core-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "Core: - Remove a global lock in the affinity setting code The lock protects a cpumask for intermediate results and the lock causes a bottleneck on simultaneous start of multiple virtual machines. Replace the lock and the static cpumask with a per CPU cpumask which is nicely serialized by raw spinlock held when executing this code. - Provide support for giving a suffix to interrupt domain names. That's required to support devices with subfunctions so that the domain names are distinct even if they originate from the same device node. - The usual set of cleanups and enhancements all over the place Drivers: - Support for longarch AVEC interrupt chip - Refurbishment of the Armada driver so it can be extended for new variants. - The usual set of cleanups and enhancements all over the place" * tag 'irq-core-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (73 commits) genirq: Use cpumask_intersects() genirq/cpuhotplug: Use cpumask_intersects() irqchip/apple-aic: Only access system registers on SoCs which provide them irqchip/apple-aic: Add a new "Global fast IPIs only" feature level irqchip/apple-aic: Skip unnecessary enabling of use_fast_ipi dt-bindings: apple,aic: Document A7-A11 compatibles irqdomain: Use IS_ERR_OR_NULL() in irq_domain_trim_hierarchy() genirq/msi: Use kmemdup_array() instead of kmemdup() genirq/proc: Change the return value for set affinity permission error genirq/proc: Use irq_move_pending() in show_irq_affinity() genirq/proc: Correctly set file permissions for affinity control files genirq: Get rid of global lock in irq_do_set_affinity() genirq: Fix typo in struct comment irqchip/loongarch-avec: Add AVEC irqchip support irqchip/loongson-pch-msi: Prepare get_pch_msi_handle() for AVECINTC irqchip/loongson-eiointc: Rename CPUHP_AP_IRQ_LOONGARCH_STARTING LoongArch: Architectural preparation for AVEC irqchip LoongArch: Move irqchip function prototypes to irq-loongson.h irqchip/loongson-pch-msi: Switch to MSI parent domains softirq: Remove unused 'action' parameter from action callback ... |
||
![]() |
18efd0b10e |
LoongArch: vDSO: Wire up getrandom() vDSO implementation
Hook up the generic vDSO implementation to the LoongArch vDSO data page by providing the required __arch_chacha20_blocks_nostack, __arch_get_k_vdso_rng_data, and getrandom_syscall implementations. Also wire up the selftests. Signed-off-by: Xi Ruoyao <xry111@xry111.site> Acked-by: Huacai Chen <chenhuacai@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> |
||
![]() |
3abb708ec0 |
LoongArch: KVM: Implement function kvm_para_has_feature()
Implement function kvm_para_has_feature() to detect supported paravirt features. It can be used by device driver to detect and enable paravirt features, such as the EIOINTC irqchip driver is able to detect feature KVM_FEATURE_VIRT_EXTIOI and do some optimization. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
cdc118f802 |
LoongArch: KVM: Enable paravirt feature control from VMM
Export kernel paravirt features to user space, so that VMM can control each single paravirt feature. By default paravirt features will be the same with kvm supported features if VMM does not set it. Also a new feature KVM_FEATURE_VIRT_EXTIOI is added which can be set from user space. This feature indicates that the virt EIOINTC can route interrupts to 256 vCPUs, rather than 4 vCPUs like with real HW. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
e5ba90abb2 |
LoongArch: Revert qspinlock to test-and-set simple lock on VM
Similar with x86, when VM is detected, revert to a simple test-and-set lock to avoid the horrors of queue preemption. Tested on 3C5000 Dual-way machine with 32 cores and 2 numa nodes, test case is kcbench on kernel mainline 6.10, the detailed command is "kcbench --src /root/src/linux" Performance on host machine kernel compile time performance impact Original 150.29 seconds With patch 150.19 seconds almost no impact Performance on virtual machine: 1. 1 VM with 32 vCPUs and 2 numa node, numa node pinned kernel compile time performance impact Original 170.87 seconds With patch 171.73 seconds almost no impact 2. 2 VMs, each VM with 32 vCPUs and 2 numa node, numa node pinned kernel compile time performance impact Original 2362.04 seconds With patch 354.73 seconds +565% Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
3515863d9f |
arch, mm: pull out allocation of NODE_DATA to generic code
Architectures that support NUMA duplicate the code that allocates NODE_DATA on the node-local memory with slight variations in reporting of the addresses where the memory was allocated. Use x86 version as the basis for the generic alloc_node_data() function and call this function in architecture specific numa initialization. Round up node data size to SMP_CACHE_BYTES rather than to PAGE_SIZE like x86 used to do since the bootmem era when allocation granularity was PAGE_SIZE anyway. Link: https://lkml.kernel.org/r/20240807064110.1003856-10-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Zi Yan <ziy@nvidia.com> # for x86_64 and arm64 Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> [arm64 + CXL via QEMU] Acked-by: Dan Williams <dan.j.williams@intel.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Andreas Larsson <andreas@gaisler.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: David S. Miller <davem@davemloft.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiaxun Yang <jiaxun.yang@flygoat.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Rob Herring (Arm) <robh@kernel.org> Cc: Samuel Holland <samuel.holland@sifive.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
![]() |
46bcce5031 |
arch, mm: move definition of node_data to generic code
Every architecture that supports NUMA defines node_data in the same way: struct pglist_data *node_data[MAX_NUMNODES]; No reason to keep multiple copies of this definition and its forward declarations, especially when such forward declaration is the only thing in include/asm/mmzone.h for many architectures. Add definition and declaration of node_data to generic code and drop architecture-specific versions. Link: https://lkml.kernel.org/r/20240807064110.1003856-8-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Tested-by: Zi Yan <ziy@nvidia.com> # for x86_64 and arm64 Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> [arm64 + CXL via QEMU] Acked-by: Dan Williams <dan.j.williams@intel.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Andreas Larsson <andreas@gaisler.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiaxun Yang <jiaxun.yang@flygoat.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Rob Herring (Arm) <robh@kernel.org> Cc: Samuel Holland <samuel.holland@sifive.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
![]() |
80376323e2 |
LoongArch: Add ifdefs to fix LSX and LASX related warnings
There exist some warnings when building kernel if CONFIG_CPU_HAS_LBT is
set but CONFIG_CPU_HAS_LSX and CONFIG_CPU_HAS_LASX are not set. In this
case, there are no definitions of _restore_lsx & _restore_lasx and there
are also no definitions of kvm_restore_lsx & kvm_restore_lasx in fpu.S
and switch.S respectively, just add some ifdefs to fix these warnings.
AS arch/loongarch/kernel/fpu.o
arch/loongarch/kernel/fpu.o: warning: objtool: unexpected relocation symbol type in .rela.discard.func_stack_frame_non_standard: 0
arch/loongarch/kernel/fpu.o: warning: objtool: unexpected relocation symbol type in .rela.discard.func_stack_frame_non_standard: 0
AS [M] arch/loongarch/kvm/switch.o
arch/loongarch/kvm/switch.o: warning: objtool: unexpected relocation symbol type in .rela.discard.func_stack_frame_non_standard: 0
arch/loongarch/kvm/switch.o: warning: objtool: unexpected relocation symbol type in .rela.discard.func_stack_frame_non_standard: 0
MODPOST Module.symvers
ERROR: modpost: "kvm_restore_lsx" [arch/loongarch/kvm/kvm.ko] undefined!
ERROR: modpost: "kvm_restore_lasx" [arch/loongarch/kvm/kvm.ko] undefined!
Cc: stable@vger.kernel.org # 6.9+
Fixes:
|
||
![]() |
274ea3563e |
LoongArch: Define ARCH_IRQ_INIT_FLAGS as IRQ_NOPROBE
Currently we call irq_set_noprobe() in a loop for all IRQs, but indeed it only works for IRQs below NR_IRQS_LEGACY because at init_IRQ() only legacy interrupts have been allocated. Instead, we can define ARCH_IRQ_INIT_FLAGS as IRQ_NOPROBE in asm/hwirq.h and the core will automatically set the flag for all interrupts. Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn> |
||
![]() |
ae16f05c92 |
irqchip/loongarch-avec: Add AVEC irqchip support
Introduce the advanced extended interrupt controllers (AVECINTC). This feature will allow each core to have 256 independent interrupt vectors and MSI interrupts can be independently routed to any vector on any CPU. The whole topology of irqchips in LoongArch machines looks like this if AVECINTC is supported: +-----+ +-----------------------+ +-------+ | IPI | --> | CPUINTC | <-- | Timer | +-----+ +-----------------------+ +-------+ ^ ^ ^ | | | +---------+ +----------+ +---------+ +-------+ | EIOINTC | | AVECINTC | | LIOINTC | <-- | UARTs | +---------+ +----------+ +---------+ +-------+ ^ ^ | | +---------+ +---------+ | PCH-PIC | | PCH-MSI | +---------+ +---------+ ^ ^ ^ | | | +---------+ +---------+ +---------+ | Devices | | PCH-LPC | | Devices | +---------+ +---------+ +---------+ ^ | +---------+ | Devices | +---------+ Co-developed-by: Jianmin Lv <lvjianmin@loongson.cn> Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn> Co-developed-by: Liupu Wang <wangliupu@loongson.cn> Signed-off-by: Liupu Wang <wangliupu@loongson.cn> Co-developed-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20240823104337.25577-2-zhangtianyang@loongson.cn |
||
![]() |
843ed9317b |
LoongArch: Architectural preparation for AVEC irqchip
Add architectural preparation for AVEC irqchip, including: 1. CPUCFG feature bits definition for AVEC; 2. Detection of AVEC irqchip in cpu_probe(); 3. New IPI type definition (IPI_CLEAR_VECTOR) for AVEC; 4. Provide arch_probe_nr_irqs() for large NR_IRQS; 5. Other related changes about the number of interrupts. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20240823103936.25092-2-zhangtianyang@loongson.cn |
||
![]() |
e688c22073 |
LoongArch: Enable general EFI poweroff method
efi_shutdown_init() can register a general sys_off handler named efi_power_off(). Enable this by providing efi_poweroff_required(), like arm and x86. Since EFI poweroff is also supported on LoongArch, and the enablement makes the poweroff function usable for hardwares which lack ACPI S5. We prefer ACPI poweroff rather than EFI poweroff (like x86), so we only require EFI poweroff if acpi_gbl_reduced_hardware or acpi_no_s5 is true. Cc: stable@vger.kernel.org Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Miao Wang <shankerwangmiao@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
343416f0c1 |
syscalls: fix syscall macros for newfstat/newfstatat
The __NR_newfstat and __NR_newfstatat macros accidentally got renamed
in the conversion to the syscall.tbl format, dropping the 'new' portion
of the name.
In an unrelated change, the two syscalls are no longer architecture
specific but are once more defined on all 64-bit architectures, so the
'newstat' ABI keyword can be dropped from the table as a simplification.
Fixes: Fixes:
|
||
![]() |
a362ade892 |
LoongArch changes for v6.11
1, Define __ARCH_WANT_NEW_STAT in unistd.h; 2, Always enumerate MADT and setup logical-physical CPU mapping; 3, Add irq_work support via self IPIs; 4, Add RANDOMIZE_KSTACK_OFFSET support; 5, Add ARCH_HAS_PTE_DEVMAP support; 6, Add ARCH_HAS_DEBUG_VM_PGTABLE support; 7, Add writecombine support for DMW-based ioremap(); 8, Add architectural preparation for CPUFreq; 9, Add ACPI standard hardware register based S3 support; 10, Add support for relocating the kernel with RELR relocation; 11, Some bug fixes and other small changes. -----BEGIN PGP SIGNATURE----- iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmabzOsWHGNoZW5odWFj YWlAa2VybmVsLm9yZwAKCRAChivD8uImeqssD/9AG3WGb25R4IvgnZYuRCxpXsLk Qrj4YSPazaTLrQBWk1g+KqcBLe+jZV4zmnz0H93qoOpyMDwsmExugDug7QCKiBl1 olVZ0CeQ6dyMHAnFjTgy29KcyJRFith4jXFGq6kpNa80pezsXz7b869GkLZflZfy W9hALfcaxB4kx+z4HXblbOIsfzVwh2eBD/nkWukBG28CPMQ7pV4TtejIqSd9kDC5 LQjVQhjyrDgR3EPJEzr+48/hgFB6cZ8fmfv5JVTu+rQMngUldxDijj8xfoIUgIjN 2khFc2Orx5RVyIuBxtLKWf70HD9xXC0fqUVjFEn0Yn5i1JVLoMdqjownSWvPy3t7 z3V0E0VaYUdLgA3GeA5Fw1uZbORlocAZbA5B8bXY2foNfwPwLlGpNiyNiqx5kQmQ O+9jQJqdrZZ18wXEW8sR8AnT5+lzIQdv1GlkYt2f5a1rjMZwHtPZI4aPRDojPo/3 Fv0Q1+2XVnbPngzJJz9tlYCzt5iuY9z7DwsnbEBSiLZRapJ9ZECmJjSGnnR/fLLS ifdyooua8bviMwzmUEmfSgPRHyTZs+BjkD7AQ4xyRDAv0T2d9sDwkAWYBcViTslF awe6+x+zn6yXekhiloN8L+3HJ67bYojXmLciNqvFcVtSNgJQpXBjLDO9orCbNqmw ISxNA0GbR+eWGMdvCA== =bla1 -----END PGP SIGNATURE----- Merge tag 'loongarch-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch updates from Huacai Chen: - Define __ARCH_WANT_NEW_STAT in unistd.h - Always enumerate MADT and setup logical-physical CPU mapping - Add irq_work support via self IPIs - Add RANDOMIZE_KSTACK_OFFSET support - Add ARCH_HAS_PTE_DEVMAP support - Add ARCH_HAS_DEBUG_VM_PGTABLE support - Add writecombine support for DMW-based ioremap() - Add architectural preparation for CPUFreq - Add ACPI standard hardware register based S3 support - Add support for relocating the kernel with RELR relocation - Some bug fixes and other small changes * tag 'loongarch-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: Make the users of larch_insn_gen_break() constant LoongArch: Check TIF_LOAD_WATCH to enable user space watchpoint LoongArch: Use rustc option -Zdirect-access-external-data LoongArch: Add support for relocating the kernel with RELR relocation LoongArch: Remove a redundant checking in relocator LoongArch: Use correct API to map cmdline in relocate_kernel() LoongArch: Automatically disable KASLR for hibernation LoongArch: Add ACPI standard hardware register based S3 support LoongArch: Add architectural preparation for CPUFreq LoongArch: Add writecombine support for DMW-based ioremap() LoongArch: Add ARCH_HAS_DEBUG_VM_PGTABLE support LoongArch: Add ARCH_HAS_PTE_DEVMAP support LoongArch: Add RANDOMIZE_KSTACK_OFFSET support LoongArch: Add irq_work support via self IPIs LoongArch: Always enumerate MADT and setup logical-physical CPU mapping LoongArch: Define __ARCH_WANT_NEW_STAT in unistd.h |
||
![]() |
2c9b351240 |
ARM:
* Initial infrastructure for shadow stage-2 MMUs, as part of nested virtualization enablement * Support for userspace changes to the guest CTR_EL0 value, enabling (in part) migration of VMs between heterogenous hardware * Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of the protocol * FPSIMD/SVE support for nested, including merged trap configuration and exception routing * New command-line parameter to control the WFx trap behavior under KVM * Introduce kCFI hardening in the EL2 hypervisor * Fixes + cleanups for handling presence/absence of FEAT_TCRX * Miscellaneous fixes + documentation updates LoongArch: * Add paravirt steal time support. * Add support for KVM_DIRTY_LOG_INITIALLY_SET. * Add perf kvm-stat support for loongarch. RISC-V: * Redirect AMO load/store access fault traps to guest * perf kvm stat support * Use guest files for IMSIC virtualization, when available ONE_REG support for the Zimop, Zcmop, Zca, Zcf, Zcd, Zcb and Zawrs ISA extensions is coming through the RISC-V tree. s390: * Assortment of tiny fixes which are not time critical x86: * Fixes for Xen emulation. * Add a global struct to consolidate tracking of host values, e.g. EFER * Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the effective APIC bus frequency, because TDX. * Print the name of the APICv/AVIC inhibits in the relevant tracepoint. * Clean up KVM's handling of vendor specific emulation to consistently act on "compatible with Intel/AMD", versus checking for a specific vendor. * Drop MTRR virtualization, and instead always honor guest PAT on CPUs that support self-snoop. * Update to the newfangled Intel CPU FMS infrastructure. * Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as it reads '0' and writes from userspace are ignored. * Misc cleanups x86 - MMU: * Small cleanups, renames and refactoring extracted from the upcoming Intel TDX support. * Don't allocate kvm_mmu_page.shadowed_translation for shadow pages that can't hold leafs SPTEs. * Unconditionally drop mmu_lock when allocating TDP MMU page tables for eager page splitting, to avoid stalling vCPUs when splitting huge pages. * Bug the VM instead of simply warning if KVM tries to split a SPTE that is non-present or not-huge. KVM is guaranteed to end up in a broken state because the callers fully expect a valid SPTE, it's all but dangerous to let more MMU changes happen afterwards. x86 - AMD: * Make per-CPU save_area allocations NUMA-aware. * Force sev_es_host_save_area() to be inlined to avoid calling into an instrumentable function from noinstr code. * Base support for running SEV-SNP guests. API-wise, this includes a new KVM_X86_SNP_VM type, encrypting/measure the initial image into guest memory, and finalizing it before launching it. Internally, there are some gmem/mmu hooks needed to prepare gmem-allocated pages before mapping them into guest private memory ranges. This includes basic support for attestation guest requests, enough to say that KVM supports the GHCB 2.0 specification. There is no support yet for loading into the firmware those signing keys to be used for attestation requests, and therefore no need yet for the host to provide certificate data for those keys. To support fetching certificate data from userspace, a new KVM exit type will be needed to handle fetching the certificate from userspace. An attempt to define a new KVM_EXIT_COCO/KVM_EXIT_COCO_REQ_CERTS exit type to handle this was introduced in v1 of this patchset, but is still being discussed by community, so for now this patchset only implements a stub version of SNP Extended Guest Requests that does not provide certificate data. x86 - Intel: * Remove an unnecessary EPT TLB flush when enabling hardware. * Fix a series of bugs that cause KVM to fail to detect nested pending posted interrupts as valid wake eents for a vCPU executing HLT in L2 (with HLT-exiting disable by L1). * KVM: x86: Suppress MMIO that is triggered during task switch emulation Explicitly suppress userspace emulated MMIO exits that are triggered when emulating a task switch as KVM doesn't support userspace MMIO during complex (multi-step) emulation. Silently ignoring the exit request can result in the WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to userspace for some other reason prior to purging mmio_needed. See commit |
||
![]() |
998b17d444 |
LoongArch: Make the users of larch_insn_gen_break() constant
LoongArch defines UPROBE_SWBP_INSN as a function call and this breaks
arch_uprobe_trampoline() which uses it to initialize a static variable.
Add the new "__builtin_constant_p" helper, __emit_break(), and redefine
the current users of larch_insn_gen_break() to use it.
Fixes:
|
||
![]() |
3892b11eac |
LoongArch: Check TIF_LOAD_WATCH to enable user space watchpoint
Currently, there are some places to set CSR.PRMD.PWE, the first one is
in hw_breakpoint_thread_switch() to enable user space singlestep via
checking TIF_SINGLESTEP, the second one is in hw_breakpoint_control() to
enable user space watchpoint. For the latter case, it should also check
TIF_LOAD_WATCH to make the logic correct and clear.
Fixes:
|
||
![]() |
e05d4cd9b8 |
LoongArch: Add support for relocating the kernel with RELR relocation
RELR as a relocation packing format for relative relocations for reducing the size of relative relocation records. In a position independent executable there are often many relative relocation records, and our vmlinux is a PIE. The LLD linker (since 17.0.0) and the BFD linker (since 2.43) supports packing the relocations in the RELR format for LoongArch, with the flag -z pack-relative-relocs. Commits |
||
![]() |
0ad158e4ef |
LoongArch: Remove a redundant checking in relocator
With our linker script "relocated_addr >= VMLINUX_LOAD_ADDRESS" should be always true. Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
0124fbb4c6 |
LoongArch: Use correct API to map cmdline in relocate_kernel()
fw_arg1 is in memory space rather than I/O space, so we should use early_memremap_ro() instead of early_ioremap() to map the cmdline. Moreover, we should unmap it after using. Suggested-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
67e6b115dd |
LoongArch: Automatically disable KASLR for hibernation
Hibernation assumes the memory layout after resume be the same as that before sleep, so it expects the kernel is loaded at the same position. To achieve this goal we automatically disable KASLR if user explicitly requests hibernation via the "resume=" command line. Since "nohibernate" and "noresume" have higher priorities than "resume=", we only disable KASLR if there is no "nohibernate" and "noresume". Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
8e02c3b782 |
LoongArch: Add writecombine support for DMW-based ioremap()
Currently, only TLB-based ioremap() support writecombine, so add the counterpart for DMW-based ioremap() with help of DMW2. The base address (WRITECOMBINE_BASE) is configured as 0xa000000000000000. DMW3 is unused by kernel now, however firmware may leave garbage in them and interfere kernel's address mapping. So clear it as necessary. BTW, centralize the DMW configuration to macro SETUP_DMWINS. Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
a0f7085f6a |
LoongArch: Add RANDOMIZE_KSTACK_OFFSET support
Add support of kernel stack offset randomization while handling syscall, the offset is defaultly limited by KSTACK_OFFSET_MAX(). In order to avoid triggering stack canaries (due to __builtin_alloca()) and slowing down the entry path, use __no_stack_protector attribute to disable stack protector for do_syscall() at function level. With this patch, the REPORT_STACK test show that: `loongarch64 bits of stack entropy: 7` Reviewed-by: Kees Cook <kees@kernel.org> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
08f417db70 |
LoongArch: Add irq_work support via self IPIs
Add irq_work support for LoongArch via self IPIs. This make it possible to run works in hardware interrupt context, which is a prerequisite for NOHZ_FULL. Implement: - arch_irq_work_raise() - arch_irq_work_has_interrupt() Reviewed-by: Guo Ren <guoren@kernel.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
12d3b559b8 |
LoongArch: Always enumerate MADT and setup logical-physical CPU mapping
Some drivers want to use cpu_logical_map(), early_cpu_to_node() and some other CPU mapping APIs, even if we use "nr_cpus=1" to hard limit the CPU number. This is strongly required for the multi-bridges machines. Currently, we stop parsing the MADT if the nr_cpus limit is reached, but to achieve the above goal we should always enumerate the MADT table and setup logical-physical CPU mapping whether there is a nr_cpus limit. Rework the MADT enumeration: 1. Define a flag "cpu_enumerated" to distinguish the first enumeration (cpu_enumerated=0) and the physical hotplug case (cpu_enumerated=1) for set_processor_mask(). 2. If cpu_enumerated=0, stop parsing only when NR_CPUS limit is reached, so we can setup logical-physical CPU mapping; if cpu_enumerated=1, stop parsing when nr_cpu_ids limit is reached, so we can avoid some runtime bugs. Once logical-physical CPU mapping is setup, we will let cpu_enumerated=1. 3. Use find_first_zero_bit() instead of cpumask_next_zero() to find the next zero bit (free logical CPU id) in the cpu_present_mask, because cpumask_next_zero() will stop at nr_cpu_ids. 4. Only touch cpu_possible_mask if cpu_enumerated=0, this is in order to avoid some potential crashes, because cpu_possible_mask is marked as __ro_after_init. 5. In prefill_possible_map(), clear cpu_present_mask bits greater than nr_cpu_ids, in order to avoid a CPU be "present" but not "possible". Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
7697a0fe01 |
LoongArch: Define __ARCH_WANT_NEW_STAT in unistd.h
Chromium sandbox apparently wants to deny statx [1] so it could properly inspect arguments after the sandboxed process later falls back to fstat. Because there's currently not a "fd-only" version of statx, so that the sandbox has no way to ensure the path argument is empty without being able to peek into the sandboxed process's memory. For architectures able to do newfstatat though, glibc falls back to newfstatat after getting -ENOSYS for statx, then the respective SIGSYS handler [2] takes care of inspecting the path argument, transforming allowed newfstatat's into fstat instead which is allowed and has the same type of return value. But, as LoongArch is the first architecture to not have fstat nor newfstatat, the LoongArch glibc does not attempt falling back at all when it gets -ENOSYS for statx -- and you see the problem there! Actually, back when the LoongArch port was under review, people were aware of the same problem with sandboxing clone3 [3], so clone was eventually kept. Unfortunately it seemed at that time no one had noticed statx, so besides restoring fstat/newfstatat to LoongArch uapi (and postponing the problem further), it seems inevitable that we would need to tackle seccomp deep argument inspection. However, this is obviously a decision that shouldn't be taken lightly, so we just restore fstat/newfstatat by defining __ARCH_WANT_NEW_STAT in unistd.h. This is the simplest solution for now, and so we hope the community will tackle the long-standing problem of seccomp deep argument inspection in the future [4][5]. Also add "newstat" to syscall_abis_64 in Makefile.syscalls due to upstream asm-generic changes. More infomation please reading this thread [6]. [1] https://chromium-review.googlesource.com/c/chromium/src/+/2823150 [2] https://chromium.googlesource.com/chromium/src/sandbox/+/c085b51940bd/linux/seccomp-bpf-helpers/sigsys_handlers.cc#355 [3] https://lore.kernel.org/linux-arch/20220511211231.GG7074@brightrain.aerifal.cx/ [4] https://lwn.net/Articles/799557/ [5] https://lpc.events/event/4/contributions/560/attachments/397/640/deep-arg-inspection.pdf [6] https://lore.kernel.org/loongarch/20240226-granit-seilschaft-eccc2433014d@brauner/T/#t Cc: stable@vger.kernel.org Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
26a3b85bac |
loongarch: convert to generic syscall table
The uapi/asm/unistd_64.h and asm/syscall_table_64.h headers can now be generated from scripts/syscall.tbl, which makes this consistent with the other architectures that have their own syscall.tbl. Unlike the other architectures using the asm-generic header, loongarch uses none of the deprecated system calls at the moment. Both the user visible side of asm/unistd.h and the internal syscall table in the kernel should have the same effective contents after this. Signed-off-by: Arnd Bergmann <arnd@arndb.de> |
||
![]() |
03779999ac |
LoongArch: KVM: Add PV steal time support in guest side
Per-cpu struct kvm_steal_time is added here, its size is 64 bytes and also defined as 64 bytes, so that the whole structure is in one physical page. When a VCPU is online, function pv_enable_steal_time() is called. This function will pass guest physical address of struct kvm_steal_time and tells hypervisor to enable steal time. When a vcpu is offline, physical address is set as 0 and tells hypervisor to disable steal time. Here is an output of vmstat on guest when there is workload on both host and guest. It shows steal time stat information. procs -----------memory---------- -----io---- -system-- ------cpu----- r b swpd free inact active bi bo in cs us sy id wa st 15 1 0 7583616 184112 72208 20 0 162 52 31 6 43 0 20 17 0 0 7583616 184704 72192 0 0 6318 6885 5 60 8 5 22 16 0 0 7583616 185392 72144 0 0 1766 1081 0 49 0 1 50 16 0 0 7583616 184816 72304 0 0 6300 6166 4 62 12 2 20 18 0 0 7583632 184480 72240 0 0 2814 1754 2 58 4 1 35 Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
295f10061a |
syscalls: mmap(): use unsigned offset type consistently
Most architectures that implement the old-style mmap() with byte offset use 'unsigned long' as the type for that offset, but microblaze and riscv have the off_t type that is shared with userspace, matching the prototype in include/asm-generic/syscalls.h. Make this consistent by using an unsigned argument everywhere. This changes the behavior slightly, as the argument is shifted to a page number, and an user input with the top bit set would result in a negative page offset rather than a large one as we use elsewhere. For riscv, the 32-bit sys_mmap2() definition actually used a custom type that is different from the global declaration, but this was missed due to an incorrect type check. Signed-off-by: Arnd Bergmann <arnd@arndb.de> |
||
![]() |
3eb2a8b235 |
LoongArch: Fix multiple hardware watchpoint issues
In the current code, if multiple hardware breakpoints/watchpoints in a user-space thread, some of them will not be triggered. When debugging the following code using gdb. lihui@bogon:~$ cat test.c #include <stdio.h> int a = 0; int main() { printf("start test\n"); a = 1; printf("a = %d\n", a); printf("end test\n"); return 0; } lihui@bogon:~$ gcc -g test.c -o test lihui@bogon:~$ gdb test ... (gdb) start ... Temporary breakpoint 1, main () at test.c:5 5 printf("start test\n"); (gdb) watch a Hardware watchpoint 2: a (gdb) hbreak 8 Hardware assisted breakpoint 3 at 0x1200006ec: file test.c, line 8. (gdb) c Continuing. start test a = 1 Breakpoint 3, main () at test.c:8 8 printf("end test\n"); ... The first hardware watchpoint is not triggered, the root causes are: 1. In hw_breakpoint_control(), The FWPnCFG1.2.4/MWPnCFG1.2.4 register settings are not distinguished. They should be set based on hardware watchpoint functions (fetch or load/store operations). 2. In breakpoint_handler() and watchpoint_handler(), it doesn't identify which watchpoint is triggered. So, all watchpoint-related perf_event callbacks are called and siginfo is sent to the user space. This will cause user-space unable to determine which watchpoint is triggered. The kernel need to identity which watchpoint is triggered via MWPS/ FWPS registers, and then call the corresponding perf event callbacks to report siginfo to the user-space. Modify the relevant code to solve above issues. All changes according to the LoongArch Reference Manual: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#control-and-status-registers-related-to-watchpoints With this patch: lihui@bogon:~$ gdb test ... (gdb) start ... Temporary breakpoint 1, main () at test.c:5 5 printf("start test\n"); (gdb) watch a Hardware watchpoint 2: a (gdb) hbreak 8 Hardware assisted breakpoint 3 at 0x1200006ec: file test.c, line 8. (gdb) c Continuing. start test Hardware watchpoint 2: a Old value = 0 New value = 1 main () at test.c:7 7 printf("a = %d\n", a); (gdb) c Continuing. a = 1 Breakpoint 3, main () at test.c:8 8 printf("end test\n"); (gdb) c Continuing. end test [Inferior 1 (process 778) exited normally] Cc: stable@vger.kernel.org Signed-off-by: Hui Li <lihui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
c8e57ab099 |
LoongArch: Trigger user-space watchpoints correctly
In the current code, gdb can set the watchpoint successfully through ptrace interface, but watchpoint will not be triggered. When debugging the following code using gdb. lihui@bogon:~$ cat test.c #include <stdio.h> int a = 0; int main() { a = 1; printf("a = %d\n", a); return 0; } lihui@bogon:~$ gcc -g test.c -o test lihui@bogon:~$ gdb test ... (gdb) watch a ... (gdb) r ... a = 1 [Inferior 1 (process 4650) exited normally] No watchpoints were triggered, the root causes are: 1. Kernel uses perf_event and hw_breakpoint framework to control watchpoint, but the perf_event corresponding to watchpoint is not enabled. So it needs to be enabled according to MWPnCFG3 or FWPnCFG3 PLV bit field in ptrace_hbp_set_ctrl(), and privilege is set according to the monitored addr in hw_breakpoint_control(). Furthermore, add a judgment in ptrace_hbp_set_addr() to ensure kernel-space addr cannot be monitored in user mode. 2. The global enable control for all watchpoints is the WE bit of CSR.CRMD, and hardware sets the value to 0 when an exception is triggered. When the ERTN instruction is executed to return, the hardware restores the value of the PWE field of CSR.PRMD here. So, before a thread containing watchpoints be scheduled, the PWE field of CSR.PRMD needs to be set to 1. Add this modification in hw_breakpoint_control(). All changes according to the LoongArch Reference Manual: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#control-and-status-registers-related-to-watchpoints https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#basic-control-and-status-registers With this patch: lihui@bogon:~$ gdb test ... (gdb) watch a Hardware watchpoint 1: a (gdb) r ... Hardware watchpoint 1: a Old value = 0 New value = 1 main () at test.c:6 6 printf("a = %d\n", a); (gdb) c Continuing. a = 1 [Inferior 1 (process 775) exited normally] Cc: stable@vger.kernel.org Signed-off-by: Hui Li <lihui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
f63a47b34b |
LoongArch: Fix watchpoint setting error
In the current code, when debugging the following code using gdb, "invalid argument ..." message will be displayed. lihui@bogon:~$ cat test.c #include <stdio.h> int a = 0; int main() { a = 1; return 0; } lihui@bogon:~$ gcc -g test.c -o test lihui@bogon:~$ gdb test ... (gdb) watch a Hardware watchpoint 1: a (gdb) r ... Invalid argument setting hardware debug registers There are mainly two types of issues. 1. Some incorrect judgment condition existed in user_watch_state argument parsing, causing -EINVAL to be returned. When setting up a watchpoint, gdb uses the ptrace interface, ptrace(PTRACE_SETREGSET, tid, NT_LOONGARCH_HW_WATCH, (void *) &iov)). Register values in user_watch_state as follows: addr[0] = 0x0, mask[0] = 0x0, ctrl[0] = 0x0 addr[1] = 0x0, mask[1] = 0x0, ctrl[1] = 0x0 addr[2] = 0x0, mask[2] = 0x0, ctrl[2] = 0x0 addr[3] = 0x0, mask[3] = 0x0, ctrl[3] = 0x0 addr[4] = 0x0, mask[4] = 0x0, ctrl[4] = 0x0 addr[5] = 0x0, mask[5] = 0x0, ctrl[5] = 0x0 addr[6] = 0x0, mask[6] = 0x0, ctrl[6] = 0x0 addr[7] = 0x12000803c, mask[7] = 0x0, ctrl[7] = 0x610 In arch_bp_generic_fields(), return -EINVAL when ctrl.len is LOONGARCH_BREAKPOINT_LEN_8(0b00). So delete the incorrect judgment here. In ptrace_hbp_fill_attr_ctrl(), when note_type is NT_LOONGARCH_HW_WATCH and ctrl[0] == 0x0, if ((type & HW_BREAKPOINT_RW) != type) will return -EINVAL. Here ctrl.type should be set based on note_type, and unnecessary judgments can be removed. 2. The watchpoint argument was not set correctly due to unnecessary offset and alignment_mask. Modify ptrace_hbp_fill_attr_ctrl() and hw_breakpoint_arch_parse(), which ensure the watchpont argument is set correctly. All changes according to the LoongArch Reference Manual: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#control-and-status-registers-related-to-watchpoints Cc: stable@vger.kernel.org Signed-off-by: Hui Li <lihui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
beb2800074 |
LoongArch: Fix entry point in kernel image header
Currently kernel entry in head.S is in DMW address range, firmware is instructed to jump to this address after loading the kernel image. However kernel should not make any assumption on firmware's DMW setting, thus the entry point should be a physical address falls into direct translation region. Fix by converting entry address to physical and amend entry calculation logic in libstub accordingly. BTW, use ABSOLUTE() to calculate variables to make Clang/LLVM happy. Cc: stable@vger.kernel.org Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
3de9c42d02 |
LoongArch: Add all CPUs enabled by fdt to NUMA node 0
NUMA enabled kernel on FDT based machine fails to boot because CPUs
are all in NUMA_NO_NODE and mm subsystem won't accept that.
Fix by adding them to default NUMA node at FDT parsing phase and move
numa_add_cpu(0) to a later point.
Cc: stable@vger.kernel.org
Fixes:
|
||
![]() |
b56f67a6c7 |
LoongArch: Fix built-in DTB detection
fdt_check_header(__dtb_start) will always success because kernel
provides a dummy dtb, and by coincidence __dtb_start clashed with
entry of this dummy dtb. The consequence is fdt passed from firmware
will never be taken.
Fix by trying to utilise __dtb_start only when CONFIG_BUILTIN_DTB is
enabled.
Cc: stable@vger.kernel.org
Fixes:
|
||
![]() |
6c3ca6654a |
LoongArch: Remove CONFIG_ACPI_TABLE_UPGRADE in platform_init()
Both acpi_table_upgrade() and acpi_boot_table_init() are defined as empty functions under !CONFIG_ACPI_TABLE_UPGRADE and !CONFIG_ACPI in include/linux/acpi.h, there are no implicit declaration errors with various configs. #ifdef CONFIG_ACPI_TABLE_UPGRADE void acpi_table_upgrade(void); #else static inline void acpi_table_upgrade(void) { } #endif #ifdef CONFIG_ACPI ... void acpi_boot_table_init (void); ... #else /* !CONFIG_ACPI */ ... static inline void acpi_boot_table_init(void) { } ... #endif /* !CONFIG_ACPI */ As Huacai suggested, CONFIG_ACPI_TABLE_UPGRADE is ugly and not necessary here, just remove it. At the same time, just keep CONFIG_ACPI to prevent potential build errors in future, and give a signal to indicate the code is ACPI-specific. For the same reason, we also put acpi_table_upgrade() under CONFIG_ACPI. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
4f05e82003 |
LoongArch changes for v6.10
1, Select some options in Kconfig; 2, Give a chance to build with !CONFIG_SMP; 3, Switch to use built-in rustc target; 4, Add new supported device nodes to dts; 5, Some bug fixes and other small changes; 6, Update the default config file. -----BEGIN PGP SIGNATURE----- iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmZKCycWHGNoZW5odWFj YWlAa2VybmVsLm9yZwAKCRAChivD8uImeoXWD/9pFhbbJj49T1xiwc2j/XgQL8HI s88/h4z5AXEbHFO8XIG1Cpw/Z3a1DsCiWBsOkCogagILzYuN0r7UqcrI02ZoeY6N fbuDatB3i+hJWCBzcl1HPkFy/9av4j4EktZs0+X/wVgKkd0aIh78qs8+1RwKhshf FoOv+cMu7zFS8Jrt+w16diNCY1JsDv7TCkCVhvJxAodrtGg4oo2NPfrGOrKAP8Dq LClvFEqDcXq1kKcipw3Q7BwDlBpJEvLZ0iAl19BnLAmBzI3Wfze9ouoYv8WiUyaY br0GPShGf16I3DKtTdHsHH/zmayQ7JSmFzZ9JEHzcBrE4AprfWLuwsUjd2WXDD6U wK+p4tWd0AUFf+/h4u1yQB9/rlt+JZ2ny/A2u4YR/BPtthiYqp8SDSH62vpCSFOE dByDeTbfjTdJsWr+bsI2gOO0sVwDYpph9SJfAyBn4miKw7v8w+2rI1oqo/ZQkP59 0SczM9C9jzpgXSGDc4yQbnqoA4KA9U6zljd12mYL5HV/AjhD19va3FmENgByZUuE Z7A0RZsiU5T401xEZiOUhwzy9m/USc1O2ivCmeowx9kP/gWic0KeAsmlMiro0jeR y9jthci8iOgjjLmCEVC06GWGUojP2roXI/38We6enVevy2GXbEEDRa1QGbQ5ndoJ MEPm4NvW1wsBgWIYmg== =WR15 -----END PGP SIGNATURE----- Merge tag 'loongarch-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch updates from Huacai Chen: - Select some options in Kconfig - Give a chance to build with !CONFIG_SMP - Switch to use built-in rustc target - Add new supported device nodes to dts - Some bug fixes and other small changes - Update the default config file * tag 'loongarch-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: Update Loongson-3 default config file LoongArch: dts: Add new supported device nodes to Loongson-2K2000 LoongArch: dts: Add new supported device nodes to Loongson-2K0500 LoongArch: dts: Remove "disabled" state of clock controller node LoongArch: rust: Switch to use built-in rustc target LoongArch: Fix callchain parse error with kernel tracepoint events again LoongArch: Give a chance to build with !CONFIG_SMP LoongArch: Select THP_SWAP if HAVE_ARCH_TRANSPARENT_HUGEPAGE LoongArch: Select ARCH_WANT_DEFAULT_BPF_JIT LoongArch: Select ARCH_SUPPORTS_INT128 if CC_HAS_INT128 LoongArch: Select ARCH_HAS_FAST_MULTIPLIER |
||
![]() |
0cc6f45cec |
IOMMU Updates for Linux v6.10
Including: - Core: - IOMMU memory usage observability - This will make the memory used for IO page tables explicitly visible. - Simplify arch_setup_dma_ops() - Intel VT-d: - Consolidate domain cache invalidation - Remove private data from page fault message - Allocate DMAR fault interrupts locally - Cleanup and refactoring - ARM-SMMUv2: - Support for fault debugging hardware on Qualcomm implementations - Re-land support for the ->domain_alloc_paging() callback - ARM-SMMUv3: - Improve handling of MSI allocation failure - Drop support for the "disable_bypass" cmdline option - Major rework of the CD creation code, following on directly from the STE rework merged last time around. - Add unit tests for the new STE/CD manipulation logic - AMD-Vi: - Final part of SVA changes with generic IO page fault handling - Renesas IPMMU: - Add support for R8A779H0 hardware - A couple smaller fixes and updates across the sub-tree -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmZHJMkACgkQK/BELZcB GuND1Q/+M4RN5jM66XCfhqoP8QaI8I7zDlPDd14ismx0bjtOZhoiXpptKkAA8guo 7mS57MLqBw/hKYucm1mw+F1qi1HnRWSstKXiCPmzDm3UXYgZJlKkrOw6vydFeHJH zx2ei7TmBrc0SrsybWK3NWRfVBBkO8enGZTmti0DfHL/rOFcUM0LHegY51GcDaaH SlDr+LLDMeGynSQWhRlVNJVmEI5gpVPitY/mDUpVPoELiW9C0WGk8kPlR11z2pCR eUNiqGJUcGasOhmfiYnpJR462eg7J41glquu+YHj8ivPbbu3C4wxgruY/tR4dmJG 8s6AMAWR53JzG2SrCCwtzyRPSXmKfvixF+VKmlB2Ksc7VAn1xA0DYnY5Tx99EtXu qcEaR4SICMti0urmBGo/cGFdXi2TB1ccXqwoRtp1N3KiYnnOaQdLNO9qZdl9uUTI uleXACzkCVSssSpBfGjFcPyHU4r3WjMfX0f5ZJPpFMoQmvwV1yeMX7xTEZz4Sxew cHfBt9FAW9+4mBMTQfokBt0hZ6jwKcYl/z3Xi2oD+Ik/Qrzx5kcLA8LZLEVRXIBa SZh2ASazq/dr8YoZ744VRmlmi+nISAIHbbQMeqQEQgYQh0HpwS9g5HtpsBzNP6aB 91RHqZSccb/zNdi8e+RH79Y7pX/G5QcuVKcW6KQUBcAAb6hAgOg= =JUzp -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: "Core: - IOMMU memory usage observability - This will make the memory used for IO page tables explicitly visible. - Simplify arch_setup_dma_ops() Intel VT-d: - Consolidate domain cache invalidation - Remove private data from page fault message - Allocate DMAR fault interrupts locally - Cleanup and refactoring ARM-SMMUv2: - Support for fault debugging hardware on Qualcomm implementations - Re-land support for the ->domain_alloc_paging() callback ARM-SMMUv3: - Improve handling of MSI allocation failure - Drop support for the "disable_bypass" cmdline option - Major rework of the CD creation code, following on directly from the STE rework merged last time around. - Add unit tests for the new STE/CD manipulation logic AMD-Vi: - Final part of SVA changes with generic IO page fault handling Renesas IPMMU: - Add support for R8A779H0 hardware ... and a couple smaller fixes and updates across the sub-tree" * tag 'iommu-updates-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (80 commits) iommu/arm-smmu-v3: Make the kunit into a module arm64: Properly clean up iommu-dma remnants iommu/amd: Enable Guest Translation after reading IOMMU feature register iommu/vt-d: Decouple igfx_off from graphic identity mapping iommu/amd: Fix compilation error iommu/arm-smmu-v3: Add unit tests for arm_smmu_write_entry iommu/arm-smmu-v3: Build the whole CD in arm_smmu_make_s1_cd() iommu/arm-smmu-v3: Move the CD generation for SVA into a function iommu/arm-smmu-v3: Allocate the CD table entry in advance iommu/arm-smmu-v3: Make arm_smmu_alloc_cd_ptr() iommu/arm-smmu-v3: Consolidate clearing a CD table entry iommu/arm-smmu-v3: Move the CD generation for S1 domains into a function iommu/arm-smmu-v3: Make CD programming use arm_smmu_write_entry() iommu/arm-smmu-v3: Add an ops indirection to the STE code iommu/arm-smmu-qcom: Don't build debug features as a kernel module iommu/amd: Add SVA domain support iommu: Add ops->domain_alloc_sva() iommu/amd: Initial SVA support for AMD IOMMU iommu/amd: Add support for enable/disable IOPF iommu/amd: Add IO page fault notifier handler ... |
||
![]() |
70a663205d |
Probes updates for v6.10:
- tracing/probes: Adding new pseudo-types %pd and %pD support for dumping dentry name from 'struct dentry *' and file name from 'struct file *'. - uprobes: Some performance optimizations have been done. . Speed up the BPF uprobe event by delaying the fetching of the uprobe event arguments that are not used in BPF. . Avoid locking by speculatively checking whether uprobe event is valid. . Reduce lock contention by using read/write_lock instead of spinlock for uprobe list operation. This improved BPF uprobe benchmark result 43% on average. - rethook: Removes non-fatal warning messages when tracing stack from BPF and skip rcu_is_watching() validation in rethook if possible. - objpool: Optimizing objpool (which is used by kretprobes and fprobe as rethook backend storage) by inlining functions and avoid caching nr_cpu_ids because it is a const value. - fprobe: Add entry/exit callbacks types (code cleanup) - kprobes: Check ftrace was killed in kprobes if it uses ftrace. -----BEGIN PGP SIGNATURE----- iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmZFUxsbHG1hc2FtaS5o aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8b+fIH/A96/SeC5WRLhXmHfTCM IvKUea2n0b0oV/2pVfHqfkCBTICuUZ97Opd9VH9jLtjBOTh0fUOGZ2DNVGdSYfWm IIkS5dhuZxHXrSHEVYykwLHI3AOL7Q6Ny9EmOg1CNMidUkPMNtBvppsBYPlFU/B/ qQJAvOdkVOnNITCaas0+MNgepoVVKdJzdNQ1I4WrGyG8isCZBaCYKo2QcGyheCNN y8NXvnVHgmgHQ8nTaeE5AawclFzFnhwHfPQPe1kiyGrx15b8K+VYmaZxPKv33A1a KT3TKJ1Ep7s7iWFh2iPVJzIwOXCmSnvNTKfNx/MDuKtO7UVfFwytoMEaekbmv3bG VqM= =n/mW -----END PGP SIGNATURE----- Merge tag 'probes-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull probes updates from Masami Hiramatsu: - tracing/probes: Add new pseudo-types %pd and %pD support for dumping dentry name from 'struct dentry *' and file name from 'struct file *' - uprobes performance optimizations: - Speed up the BPF uprobe event by delaying the fetching of the uprobe event arguments that are not used in BPF - Avoid locking by speculatively checking whether uprobe event is valid - Reduce lock contention by using read/write_lock instead of spinlock for uprobe list operation. This improved BPF uprobe benchmark result 43% on average - rethook: Remove non-fatal warning messages when tracing stack from BPF and skip rcu_is_watching() validation in rethook if possible - objpool: Optimize objpool (which is used by kretprobes and fprobe as rethook backend storage) by inlining functions and avoid caching nr_cpu_ids because it is a const value - fprobe: Add entry/exit callbacks types (code cleanup) - kprobes: Check ftrace was killed in kprobes if it uses ftrace * tag 'probes-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: kprobe/ftrace: bail out if ftrace was killed selftests/ftrace: Fix required features for VFS type test case objpool: cache nr_possible_cpus() and avoid caching nr_cpu_ids objpool: enable inlining objpool_push() and objpool_pop() operations rethook: honor CONFIG_FTRACE_VALIDATE_RCU_IS_WATCHING in rethook_try_get() ftrace: make extra rcu_is_watching() validation check optional uprobes: reduce contention on uprobes_tree access rethook: Remove warning messages printed for finding return address of a frame. fprobe: Add entry/exit callbacks types selftests/ftrace: add fprobe test cases for VFS type "%pd" and "%pD" selftests/ftrace: add kprobe test cases for VFS type "%pd" and "%pD" Documentation: tracing: add new type '%pd' and '%pD' for kprobe tracing/probes: support '%pD' type for print struct file's name tracing/probes: support '%pd' type for print struct dentry's name uprobes: add speculative lockless system-wide uprobe filter check uprobes: prepare uprobe args buffer lazily uprobes: encapsulate preparation of uprobe args buffer |
||
![]() |
1a7d0890dd |
kprobe/ftrace: bail out if ftrace was killed
If an error happens in ftrace, ftrace_kill() will prevent disarming kprobes. Eventually, the ftrace_ops associated with the kprobes will be freed, yet the kprobes will still be active, and when triggered, they will use the freed memory, likely resulting in a page fault and panic. This behavior can be reproduced quite easily, by creating a kprobe and then triggering a ftrace_kill(). For simplicity, we can simulate an ftrace error with a kernel module like [1]: [1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer sudo perf probe --add commit_creds sudo perf trace -e probe:commit_creds # In another terminal make sudo insmod ftrace_killer.ko # calls ftrace_kill(), simulating bug # Back to perf terminal # ctrl-c sudo perf probe --del commit_creds After a short period, a page fault and panic would occur as the kprobe continues to execute and uses the freed ftrace_ops. While ftrace_kill() is supposed to be used only in extreme circumstances, it is invoked in FTRACE_WARN_ON() and so there are many places where an unexpected bug could be triggered, yet the system may continue operating, possibly without the administrator noticing. If ftrace_kill() does not panic the system, then we should do everything we can to continue operating, rather than leave a ticking time bomb. Link: https://lore.kernel.org/all/20240501162956.229427-1-stephen.s.brennan@oracle.com/ Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Acked-by: Guo Ren <guoren@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> |
||
![]() |
f4b0c4b508 |
ARM:
* Move a lot of state that was previously stored on a per vcpu basis into a per-CPU area, because it is only pertinent to the host while the vcpu is loaded. This results in better state tracking, and a smaller vcpu structure. * Add full handling of the ERET/ERETAA/ERETAB instructions in nested virtualisation. The last two instructions also require emulating part of the pointer authentication extension. As a result, the trap handling of pointer authentication has been greatly simplified. * Turn the global (and not very scalable) LPI translation cache into a per-ITS, scalable cache, making non directly injected LPIs much cheaper to make visible to the vcpu. * A batch of pKVM patches, mostly fixes and cleanups, as the upstreaming process seems to be resuming. Fingers crossed! * Allocate PPIs and SGIs outside of the vcpu structure, allowing for smaller EL2 mapping and some flexibility in implementing more or less than 32 private IRQs. * Purge stale mpidr_data if a vcpu is created after the MPIDR map has been created. * Preserve vcpu-specific ID registers across a vcpu reset. * Various minor cleanups and improvements. LoongArch: * Add ParaVirt IPI support. * Add software breakpoint support. * Add mmio trace events support. RISC-V: * Support guest breakpoints using ebreak * Introduce per-VCPU mp_state_lock and reset_cntx_lock * Virtualize SBI PMU snapshot and counter overflow interrupts * New selftests for SBI PMU and Guest ebreak * Some preparatory work for both TDX and SNP page fault handling. This also cleans up the page fault path, so that the priorities of various kinds of fauls (private page, no memory, write to read-only slot, etc.) are easier to follow. x86: * Minimize amount of time that shadow PTEs remain in the special REMOVED_SPTE state. This is a state where the mmu_lock is held for reading but concurrent accesses to the PTE have to spin; shortening its use allows other vCPUs to repopulate the zapped region while the zapper finishes tearing down the old, defunct page tables. * Advertise the max mappable GPA in the "guest MAXPHYADDR" CPUID field, which is defined by hardware but left for software use. This lets KVM communicate its inability to map GPAs that set bits 51:48 on hosts without 5-level nested page tables. Guest firmware is expected to use the information when mapping BARs; this avoids that they end up at a legal, but unmappable, GPA. * Fixed a bug where KVM would not reject accesses to MSR that aren't supposed to exist given the vCPU model and/or KVM configuration. * As usual, a bunch of code cleanups. x86 (AMD): * Implement a new and improved API to initialize SEV and SEV-ES VMs, which will also be extendable to SEV-SNP. The new API specifies the desired encryption in KVM_CREATE_VM and then separately initializes the VM. The new API also allows customizing the desired set of VMSA features; the features affect the measurement of the VM's initial state, and therefore enabling them cannot be done tout court by the hypervisor. While at it, the new API includes two bugfixes that couldn't be applied to the old one without a flag day in userspace or without affecting the initial measurement. When a SEV-ES VM is created with the new VM type, KVM_GET_REGS/KVM_SET_REGS and friends are rejected once the VMSA has been encrypted. Also, the FPU and AVX state will be synchronized and encrypted too. * Support for GHCB version 2 as applicable to SEV-ES guests. This, once more, is only accessible when using the new KVM_SEV_INIT2 flow for initialization of SEV-ES VMs. x86 (Intel): * An initial bunch of prerequisite patches for Intel TDX were merged. They generally don't do anything interesting. The only somewhat user visible change is a new debugging mode that checks that KVM's MMU never triggers a #VE virtualization exception in the guest. * Clear vmcs.EXIT_QUALIFICATION when synthesizing an EPT Misconfig VM-Exit to L1, as per the SDM. Generic: * Use vfree() instead of kvfree() for allocations that always use vcalloc() or __vcalloc(). * Remove .change_pte() MMU notifier - the changes to non-KVM code are small and Andrew Morton asked that I also take those through the KVM tree. The callback was only ever implemented by KVM (which was also the original user of MMU notifiers) but it had been nonfunctional ever since calls to set_pte_at_notify were wrapped with invalidate_range_start and invalidate_range_end... in 2012. Selftests: * Enhance the demand paging test to allow for better reporting and stressing of UFFD performance. * Convert the steal time test to generate TAP-friendly output. * Fix a flaky false positive in the xen_shinfo_test due to comparing elapsed time across two different clock domains. * Skip the MONITOR/MWAIT test if the host doesn't actually support MWAIT. * Avoid unnecessary use of "sudo" in the NX hugepage test wrapper shell script, to play nice with running in a minimal userspace environment. * Allow skipping the RSEQ test's sanity check that the vCPU was able to complete a reasonable number of KVM_RUNs, as the assert can fail on a completely valid setup. If the test is run on a large-ish system that is otherwise idle, and the test isn't affined to a low-ish number of CPUs, the vCPU task can be repeatedly migrated to CPUs that are in deep sleep states, which results in the vCPU having very little net runtime before the next migration due to high wakeup latencies. * Define _GNU_SOURCE for all selftests to fix a warning that was introduced by a change to kselftest_harness.h late in the 6.9 cycle, and because forcing every test to #define _GNU_SOURCE is painful. * Provide a global pseudo-RNG instance for all tests, so that library code can generate random, but determinstic numbers. * Use the global pRNG to randomly force emulation of select writes from guest code on x86, e.g. to help validate KVM's emulation of locked accesses. * Allocate and initialize x86's GDT, IDT, TSS, segments, and default exception handlers at VM creation, instead of forcing tests to manually trigger the related setup. Documentation: * Fix a goof in the KVM_CREATE_GUEST_MEMFD documentation. -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmZE878UHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroOukQf+LcvZsWtrC7Wd5K9SQbYXaS4Rk6P6 JHoQW2d0hUN893J2WibEw+l1J/0vn5JumqHXyZgJ7CbaMtXkWWQTwDSDLuURUKpv XNB3Sb17G87NH+s1tOh0tA9h5upbtlHVHvrtIwdbb9+XHgQ6HTL4uk+HdfO/p9fW cWBEZAKoWcCIa99Numv3pmq5vdrvBlNggwBugBS8TH69EKMw+V1Vu1SFkIdNDTQk NJJ28cohoP3wnwlIHaXSmU4RujipPH3Lm/xupyA5MwmzO713eq2yUqV49jzhD5/I MA4Ruvgrdm4wpp89N9lQMyci91u6q7R9iZfMu0tSg2qYI3UPKIdstd8sOA== =2lED -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM updates from Paolo Bonzini: "ARM: - Move a lot of state that was previously stored on a per vcpu basis into a per-CPU area, because it is only pertinent to the host while the vcpu is loaded. This results in better state tracking, and a smaller vcpu structure. - Add full handling of the ERET/ERETAA/ERETAB instructions in nested virtualisation. The last two instructions also require emulating part of the pointer authentication extension. As a result, the trap handling of pointer authentication has been greatly simplified. - Turn the global (and not very scalable) LPI translation cache into a per-ITS, scalable cache, making non directly injected LPIs much cheaper to make visible to the vcpu. - A batch of pKVM patches, mostly fixes and cleanups, as the upstreaming process seems to be resuming. Fingers crossed! - Allocate PPIs and SGIs outside of the vcpu structure, allowing for smaller EL2 mapping and some flexibility in implementing more or less than 32 private IRQs. - Purge stale mpidr_data if a vcpu is created after the MPIDR map has been created. - Preserve vcpu-specific ID registers across a vcpu reset. - Various minor cleanups and improvements. LoongArch: - Add ParaVirt IPI support - Add software breakpoint support - Add mmio trace events support RISC-V: - Support guest breakpoints using ebreak - Introduce per-VCPU mp_state_lock and reset_cntx_lock - Virtualize SBI PMU snapshot and counter overflow interrupts - New selftests for SBI PMU and Guest ebreak - Some preparatory work for both TDX and SNP page fault handling. This also cleans up the page fault path, so that the priorities of various kinds of fauls (private page, no memory, write to read-only slot, etc.) are easier to follow. x86: - Minimize amount of time that shadow PTEs remain in the special REMOVED_SPTE state. This is a state where the mmu_lock is held for reading but concurrent accesses to the PTE have to spin; shortening its use allows other vCPUs to repopulate the zapped region while the zapper finishes tearing down the old, defunct page tables. - Advertise the max mappable GPA in the "guest MAXPHYADDR" CPUID field, which is defined by hardware but left for software use. This lets KVM communicate its inability to map GPAs that set bits 51:48 on hosts without 5-level nested page tables. Guest firmware is expected to use the information when mapping BARs; this avoids that they end up at a legal, but unmappable, GPA. - Fixed a bug where KVM would not reject accesses to MSR that aren't supposed to exist given the vCPU model and/or KVM configuration. - As usual, a bunch of code cleanups. x86 (AMD): - Implement a new and improved API to initialize SEV and SEV-ES VMs, which will also be extendable to SEV-SNP. The new API specifies the desired encryption in KVM_CREATE_VM and then separately initializes the VM. The new API also allows customizing the desired set of VMSA features; the features affect the measurement of the VM's initial state, and therefore enabling them cannot be done tout court by the hypervisor. While at it, the new API includes two bugfixes that couldn't be applied to the old one without a flag day in userspace or without affecting the initial measurement. When a SEV-ES VM is created with the new VM type, KVM_GET_REGS/KVM_SET_REGS and friends are rejected once the VMSA has been encrypted. Also, the FPU and AVX state will be synchronized and encrypted too. - Support for GHCB version 2 as applicable to SEV-ES guests. This, once more, is only accessible when using the new KVM_SEV_INIT2 flow for initialization of SEV-ES VMs. x86 (Intel): - An initial bunch of prerequisite patches for Intel TDX were merged. They generally don't do anything interesting. The only somewhat user visible change is a new debugging mode that checks that KVM's MMU never triggers a #VE virtualization exception in the guest. - Clear vmcs.EXIT_QUALIFICATION when synthesizing an EPT Misconfig VM-Exit to L1, as per the SDM. Generic: - Use vfree() instead of kvfree() for allocations that always use vcalloc() or __vcalloc(). - Remove .change_pte() MMU notifier - the changes to non-KVM code are small and Andrew Morton asked that I also take those through the KVM tree. The callback was only ever implemented by KVM (which was also the original user of MMU notifiers) but it had been nonfunctional ever since calls to set_pte_at_notify were wrapped with invalidate_range_start and invalidate_range_end... in 2012. Selftests: - Enhance the demand paging test to allow for better reporting and stressing of UFFD performance. - Convert the steal time test to generate TAP-friendly output. - Fix a flaky false positive in the xen_shinfo_test due to comparing elapsed time across two different clock domains. - Skip the MONITOR/MWAIT test if the host doesn't actually support MWAIT. - Avoid unnecessary use of "sudo" in the NX hugepage test wrapper shell script, to play nice with running in a minimal userspace environment. - Allow skipping the RSEQ test's sanity check that the vCPU was able to complete a reasonable number of KVM_RUNs, as the assert can fail on a completely valid setup. If the test is run on a large-ish system that is otherwise idle, and the test isn't affined to a low-ish number of CPUs, the vCPU task can be repeatedly migrated to CPUs that are in deep sleep states, which results in the vCPU having very little net runtime before the next migration due to high wakeup latencies. - Define _GNU_SOURCE for all selftests to fix a warning that was introduced by a change to kselftest_harness.h late in the 6.9 cycle, and because forcing every test to #define _GNU_SOURCE is painful. - Provide a global pseudo-RNG instance for all tests, so that library code can generate random, but determinstic numbers. - Use the global pRNG to randomly force emulation of select writes from guest code on x86, e.g. to help validate KVM's emulation of locked accesses. - Allocate and initialize x86's GDT, IDT, TSS, segments, and default exception handlers at VM creation, instead of forcing tests to manually trigger the related setup. Documentation: - Fix a goof in the KVM_CREATE_GUEST_MEMFD documentation" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (225 commits) selftests/kvm: remove dead file KVM: selftests: arm64: Test vCPU-scoped feature ID registers KVM: selftests: arm64: Test that feature ID regs survive a reset KVM: selftests: arm64: Store expected register value in set_id_regs KVM: selftests: arm64: Rename helper in set_id_regs to imply VM scope KVM: arm64: Only reset vCPU-scoped feature ID regs once KVM: arm64: Reset VM feature ID regs from kvm_reset_sys_regs() KVM: arm64: Rename is_id_reg() to imply VM scope KVM: arm64: Destroy mpidr_data for 'late' vCPU creation KVM: arm64: Use hVHE in pKVM by default on CPUs with VHE support KVM: arm64: Fix hvhe/nvhe early alias parsing KVM: SEV: Allow per-guest configuration of GHCB protocol version KVM: SEV: Add GHCB handling for termination requests KVM: SEV: Add GHCB handling for Hypervisor Feature Support requests KVM: SEV: Add support to handle AP reset MSR protocol KVM: x86: Explicitly zero kvm_caps during vendor module load KVM: x86: Fully re-initialize supported_mce_cap on vendor module load KVM: x86: Fully re-initialize supported_vm_types on vendor module load KVM: x86/mmu: Sanity check that __kvm_faultin_pfn() doesn't create noslot pfns KVM: x86/mmu: Initialize kvm_page_fault's pfn and hva to error values ... |
||
![]() |
0cc2dc4902 |
arch: make execmem setup available regardless of CONFIG_MODULES
execmem does not depend on modules, on the contrary modules use execmem. To make execmem available when CONFIG_MODULES=n, for instance for kprobes, split execmem_params initialization out from arch/*/kernel/module.c and compile it when CONFIG_EXECMEM=y Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> |
||
![]() |
f6bec26c0a |
mm/execmem, arch: convert simple overrides of module_alloc to execmem
Several architectures override module_alloc() only to define address range for code allocations different than VMALLOC address space. Provide a generic implementation in execmem that uses the parameters for address space ranges, required alignment and page protections provided by architectures. The architectures must fill execmem_info structure and implement execmem_arch_setup() that returns a pointer to that structure. This way the execmem initialization won't be called from every architecture, but rather from a central place, namely a core_initcall() in execmem. The execmem provides execmem_alloc() API that wraps __vmalloc_node_range() with the parameters defined by the architectures. If an architecture does not implement execmem_arch_setup(), execmem_alloc() will fall back to module_alloc(). Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Acked-by: Song Liu <song@kernel.org> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> |
||
![]() |
5685d7fcb5 |
LoongArch: Give a chance to build with !CONFIG_SMP
In the current code, SMP is selected in Kconfig for LoongArch, the users can not unset it, this is reasonable for a multi-processor machine. But as the help info of config SMP said, if you have a system with only one CPU, say N. On a uni-processor machine, the kernel will run faster if you say N here. Loongson-2K0500 is a single-core CPU for applications like industrial control, printing terminals, and BMC (Baseboard Management Controller), there are many development boards, products and solutions on the market, so it is better and necessary to give a chance to build with !CONFIG_SMP for a uni-processor machine. First of all, do not select SMP for config LOONGARCH in Kconfig to make it possible to unset CONFIG_SMP. Then, do some changes to fix warnings and errors if CONFIG_SMP is not set. (1) Define get_ipi_irq() only if CONFIG_SMP is set to fix the warning: arch/loongarch/kernel/irq.c:90:19: warning: 'get_ipi_irq' defined but not used [-Wunused-function] (2) Add "#ifdef CONFIG_SMP" in asm/smp.h to fix the warning: ./arch/loongarch/include/asm/smp.h:49:9: warning: "raw_smp_processor_id" redefined 49 | #define raw_smp_processor_id raw_smp_processor_id | ^~~~~~~~~~~~~~~~~~~~ ./include/linux/smp.h:198:9: note: this is the location of the previous definition 198 | #define raw_smp_processor_id() 0 (3) Define machine_shutdown() as empty under !CONFIG_SMP to fix the error: arch/loongarch/kernel/machine_kexec.c: In function 'machine_shutdown': arch/loongarch/kernel/machine_kexec.c:233:25: error: implicit declaration of function 'cpu_device_up'; did you mean 'put_device'? [-Wimplicit-function-declaration] (4) Make config SCHED_SMT depends on SMP to fix many errors such as: kernel/sched/core.c: In function 'sched_core_find': kernel/sched/core.c:310:43: error: 'struct rq' has no member named 'cpu' (5) Define cpu_logical_map(cpu) as 0 under !CONFIG_SMP in asm/smp.h, then include asm/smp.h in asm/acpi.h (because acpi.h is included in linux/irq.h indirectly) to fix many build errors under drivers/irqchip such as: drivers/irqchip/irq-loongson-eiointc.c: In function 'cpu_to_eio_node': drivers/irqchip/irq-loongson-eiointc.c:59:16: error: implicit declaration of function 'cpu_logical_map' [-Wimplicit-function-declaration] (6) Do not write per_cpu_offset(0) to PERCPU_BASE_KS when resume because the per_cpu_offset(x) macro is defined as (__per_cpu_offset[x]) only under CONFIG_SMP in include/asm-generic/percpu.h. Just save the value of PERCPU_BASE_KS when suspend and restore it when resume to fix the error: arch/loongarch/power/suspend.c: In function 'loongarch_common_resume': arch/loongarch/power/suspend.c:47:21: error: implicit declaration of function 'per_cpu_offset' [-Wimplicit-function-declaration] (7) Fix huge page handling under !CONFIG_SMP in tlbex.S. When running the UnixBench tests with "-c 1" single-streamed pass, the improvement of performance is about 9 percent with this patch. By the way, it is helpful to debug and analysis the kernel issues of multi-processor system under !CONFIG_SMP. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
2bd5059c6c | Merge branches 'arm/renesas', 'arm/smmu', 'x86/amd', 'core' and 'x86/vt-d' into next | ||
![]() |
74c16b2e2b |
LoongArch: KVM: Add PV IPI support on guest side
PARAVIRT config option and PV IPI is added for the guest side, function pv_ipi_init() is used to add IPI sending and IPI receiving hooks. This function firstly checks whether system runs in VM mode, and if kernel runs in VM mode, it will call function kvm_para_available() to detect the current hypervirsor type (now only KVM type detection is supported). The paravirt functions can work only if current hypervisor type is KVM, since there is only KVM supported on LoongArch now. PV IPI uses virtual IPI sender and virtual IPI receiver functions. With virtual IPI sender, IPI message is stored in memory rather than emulated HW. IPI multicast is also supported, and 128 vcpus can received IPIs at the same time like X86 KVM method. Hypercall method is used for IPI sending. With virtual IPI receiver, HW SWI0 is used rather than real IPI HW. Since VCPU has separate HW SWI0 like HW timer, there is no trap in IPI interrupt acknowledge. Since IPI message is stored in memory, there is no trap in getting IPI message. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
316863cb62 |
LoongArch/smp: Refine some ipi functions on LoongArch platform
Refine the ipi handling on LoongArch platform, there are three modifications: 1. Add generic function get_percpu_irq(), replacing some percpu irq functions such as get_ipi_irq()/get_pmc_irq()/get_timer_irq() with get_percpu_irq(). 2. Change definition about parameter action called by function loongson_send_ipi_single() and loongson_send_ipi_mask(), and it is defined as decimal encoding format at ipi sender side. Normal decimal encoding is used rather than binary bitmap encoding for ipi action, ipi hw sender uses decimal encoding code, and ipi receiver will get binary bitmap encoding, the ipi hw will convert it into bitmap in ipi message buffer. 3. Add a structure smp_ops on LoongArch platform so that pv ipi can be used later. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
fece6530bf |
dma-mapping: Add helpers for dma_range_map bounds
Several places want to compute the lower and/or upper bounds of a dma_range_map, so let's factor that out into reusable helpers. Acked-by: Rob Herring <robh@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> # For arm64 Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/45ec52f033ec4dfb364e23f48abaf787f612fa53.1713523152.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de> |
||
![]() |
f3334ebb8a |
LoongArch: Lately init pmu after smp is online
There is an smp function call named reset_counters() to init PMU registers of every CPU in PMU initialization state. It requires that all CPUs are online. However there is an early_initcall() wrapper for the PMU init funciton init_hw_perf_events(), so that pmu init funciton is called in do_pre_smp_initcalls() which before function smp_init(). Function reset_counters() cannot work on other CPUs since they haven't boot up still. Here replace the wrapper early_initcall() with pure_initcall(), so that the PMU init function is called after every cpu is online. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
1e3cd03c54 |
LoongArch changes for v6.9
1, Add objtool support for LoongArch; 2, Add ORC stack unwinder support for LoongArch; 3, Add kernel livepatching support for LoongArch; 4, Select ARCH_HAS_CURRENT_STACK_POINTER in Kconfig; 5, Select HAVE_ARCH_USERFAULTFD_MINOR in Kconfig; 6, Some bug fixes and other small changes. -----BEGIN PGP SIGNATURE----- iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmX69KsWHGNoZW5odWFj YWlAa2VybmVsLm9yZwAKCRAChivD8uImerjFD/9rAzm1+G4VxFvFzlOiJXEqquNJ +Vz2fAZLU3lJhBlx0uUKXFVijvLe8s/DnoLrM9e/M6gk4ivT9eszy3DnqT3NjGDX njYFkPUWZhZGACmbkoVk9St80R8sPIdZrwXtW3q7g3T0bC7LXUXrJw52Sh4gmbYx RqLsE6GoEWGY0zhhWqeeAM9LkKDuLxxyjH4fYE4g623EhQt7A7hP5okyaC+xHzp+ qp/4dPFLu61LeqIfeBUKK7nQ6uzno3EWLiME2eHEHiuelYfzmh+BtNMcX9Ugb/En j0vLGNsoDGmEYw7xGa6OSRaCR/nCwVJz4SvuH33wbbbHhVAiUKUBVNFR3gmAtLlc BSa2dDZbKhHkiWSUCM9K2ihr7WiQNuraTK1kKHwBgfa+RbEVOTu1q8yokAB9XCaT T7lijJ8MKQmzHpMvgev7nN41baDB6V5bPIni0Ueh+NhQJKZ2/IxtYA3XzV5D0UgL TBovVgYB/VNThS9gzOrlenKuDX9hT+kCQgyudErXaoIo645P6dsPFowOZRQxCEIv WnLskZatLTCA8xWl1XyC1bqtGxhp34Gbhg0ZcvUqlNE20luaK/qi8wtW9Mv1Utp+ aXFO3i7d93z99oAcUT0oc1N83T0x0M/p69Z+rL/2+L0sYQgBf1cwUEiDNRW4OCdI h15289rRTxjeL7NZPw== =lSkY -----END PGP SIGNATURE----- Merge tag 'loongarch-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch updates from Huacai Chen: - Add objtool support for LoongArch - Add ORC stack unwinder support for LoongArch - Add kernel livepatching support for LoongArch - Select ARCH_HAS_CURRENT_STACK_POINTER in Kconfig - Select HAVE_ARCH_USERFAULTFD_MINOR in Kconfig - Some bug fixes and other small changes * tag 'loongarch-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch/crypto: Clean up useless assignment operations LoongArch: Define the __io_aw() hook as mmiowb() LoongArch: Remove superfluous flush_dcache_page() definition LoongArch: Move {dmw,tlb}_virt_to_page() definition to page.h LoongArch: Change __my_cpu_offset definition to avoid mis-optimization LoongArch: Select HAVE_ARCH_USERFAULTFD_MINOR in Kconfig LoongArch: Select ARCH_HAS_CURRENT_STACK_POINTER in Kconfig LoongArch: Add kernel livepatching support LoongArch: Add ORC stack unwinder support objtool: Check local label in read_unwind_hints() objtool: Check local label in add_dead_ends() objtool/LoongArch: Enable orc to be built objtool/x86: Separate arch-specific and generic parts objtool/LoongArch: Implement instruction decoder objtool/LoongArch: Enable objtool to be built |
||
![]() |
902861e34c |
- Sumanth Korikkar has taught s390 to allocate hotplug-time page frames
from hotplugged memory rather than only from main memory. Series "implement "memmap on memory" feature on s390". - More folio conversions from Matthew Wilcox in the series "Convert memcontrol charge moving to use folios" "mm: convert mm counter to take a folio" - Chengming Zhou has optimized zswap's rbtree locking, providing significant reductions in system time and modest but measurable reductions in overall runtimes. The series is "mm/zswap: optimize the scalability of zswap rb-tree". - Chengming Zhou has also provided the series "mm/zswap: optimize zswap lru list" which provides measurable runtime benefits in some swap-intensive situations. - And Chengming Zhou further optimizes zswap in the series "mm/zswap: optimize for dynamic zswap_pools". Measured improvements are modest. - zswap cleanups and simplifications from Yosry Ahmed in the series "mm: zswap: simplify zswap_swapoff()". - In the series "Add DAX ABI for memmap_on_memory", Vishal Verma has contributed several DAX cleanups as well as adding a sysfs tunable to control the memmap_on_memory setting when the dax device is hotplugged as system memory. - Johannes Weiner has added the large series "mm: zswap: cleanups", which does that. - More DAMON work from SeongJae Park in the series "mm/damon: make DAMON debugfs interface deprecation unignorable" "selftests/damon: add more tests for core functionalities and corner cases" "Docs/mm/damon: misc readability improvements" "mm/damon: let DAMOS feeds and tame/auto-tune itself" - In the series "mm/mempolicy: weighted interleave mempolicy and sysfs extension" Rakie Kim has developed a new mempolicy interleaving policy wherein we allocate memory across nodes in a weighted fashion rather than uniformly. This is beneficial in heterogeneous memory environments appearing with CXL. - Christophe Leroy has contributed some cleanup and consolidation work against the ARM pagetable dumping code in the series "mm: ptdump: Refactor CONFIG_DEBUG_WX and check_wx_pages debugfs attribute". - Luis Chamberlain has added some additional xarray selftesting in the series "test_xarray: advanced API multi-index tests". - Muhammad Usama Anjum has reworked the selftest code to make its human-readable output conform to the TAP ("Test Anything Protocol") format. Amongst other things, this opens up the use of third-party tools to parse and process out selftesting results. - Ryan Roberts has added fork()-time PTE batching of THP ptes in the series "mm/memory: optimize fork() with PTE-mapped THP". Mainly targeted at arm64, this significantly speeds up fork() when the process has a large number of pte-mapped folios. - David Hildenbrand also gets in on the THP pte batching game in his series "mm/memory: optimize unmap/zap with PTE-mapped THP". It implements batching during munmap() and other pte teardown situations. The microbenchmark improvements are nice. - And in the series "Transparent Contiguous PTEs for User Mappings" Ryan Roberts further utilizes arm's pte's contiguous bit ("contpte mappings"). Kernel build times on arm64 improved nicely. Ryan's series "Address some contpte nits" provides some followup work. - In the series "mm/hugetlb: Restore the reservation" Breno Leitao has fixed an obscure hugetlb race which was causing unnecessary page faults. He has also added a reproducer under the selftest code. - In the series "selftests/mm: Output cleanups for the compaction test", Mark Brown did what the title claims. - Kinsey Ho has added the series "mm/mglru: code cleanup and refactoring". - Even more zswap material from Nhat Pham. The series "fix and extend zswap kselftests" does as claimed. - In the series "Introduce cpu_dcache_is_aliasing() to fix DAX regression" Mathieu Desnoyers has cleaned up and fixed rather a mess in our handling of DAX on archiecctures which have virtually aliasing data caches. The arm architecture is the main beneficiary. - Lokesh Gidra's series "per-vma locks in userfaultfd" provides dramatic improvements in worst-case mmap_lock hold times during certain userfaultfd operations. - Some page_owner enhancements and maintenance work from Oscar Salvador in his series "page_owner: print stacks and their outstanding allocations" "page_owner: Fixup and cleanup" - Uladzislau Rezki has contributed some vmalloc scalability improvements in his series "Mitigate a vmap lock contention". It realizes a 12x improvement for a certain microbenchmark. - Some kexec/crash cleanup work from Baoquan He in the series "Split crash out from kexec and clean up related config items". - Some zsmalloc maintenance work from Chengming Zhou in the series "mm/zsmalloc: fix and optimize objects/page migration" "mm/zsmalloc: some cleanup for get/set_zspage_mapping()" - Zi Yan has taught the MM to perform compaction on folios larger than order=0. This a step along the path to implementaton of the merging of large anonymous folios. The series is named "Enable >0 order folio memory compaction". - Christoph Hellwig has done quite a lot of cleanup work in the pagecache writeback code in his series "convert write_cache_pages() to an iterator". - Some modest hugetlb cleanups and speedups in Vishal Moola's series "Handle hugetlb faults under the VMA lock". - Zi Yan has changed the page splitting code so we can split huge pages into sizes other than order-0 to better utilize large folios. The series is named "Split a folio to any lower order folios". - David Hildenbrand has contributed the series "mm: remove total_mapcount()", a cleanup. - Matthew Wilcox has sought to improve the performance of bulk memory freeing in his series "Rearrange batched folio freeing". - Gang Li's series "hugetlb: parallelize hugetlb page init on boot" provides large improvements in bootup times on large machines which are configured to use large numbers of hugetlb pages. - Matthew Wilcox's series "PageFlags cleanups" does that. - Qi Zheng's series "minor fixes and supplement for ptdesc" does that also. S390 is affected. - Cleanups to our pagemap utility functions from Peter Xu in his series "mm/treewide: Replace pXd_large() with pXd_leaf()". - Nico Pache has fixed a few things with our hugepage selftests in his series "selftests/mm: Improve Hugepage Test Handling in MM Selftests". - Also, of course, many singleton patches to many things. Please see the individual changelogs for details. -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZfJpPQAKCRDdBJ7gKXxA joxeAP9TrcMEuHnLmBlhIXkWbIR4+ki+pA3v+gNTlJiBhnfVSgD9G55t1aBaRplx TMNhHfyiHYDTx/GAV9NXW84tasJSDgA= =TG55 -----END PGP SIGNATURE----- Merge tag 'mm-stable-2024-03-13-20-04' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Sumanth Korikkar has taught s390 to allocate hotplug-time page frames from hotplugged memory rather than only from main memory. Series "implement "memmap on memory" feature on s390". - More folio conversions from Matthew Wilcox in the series "Convert memcontrol charge moving to use folios" "mm: convert mm counter to take a folio" - Chengming Zhou has optimized zswap's rbtree locking, providing significant reductions in system time and modest but measurable reductions in overall runtimes. The series is "mm/zswap: optimize the scalability of zswap rb-tree". - Chengming Zhou has also provided the series "mm/zswap: optimize zswap lru list" which provides measurable runtime benefits in some swap-intensive situations. - And Chengming Zhou further optimizes zswap in the series "mm/zswap: optimize for dynamic zswap_pools". Measured improvements are modest. - zswap cleanups and simplifications from Yosry Ahmed in the series "mm: zswap: simplify zswap_swapoff()". - In the series "Add DAX ABI for memmap_on_memory", Vishal Verma has contributed several DAX cleanups as well as adding a sysfs tunable to control the memmap_on_memory setting when the dax device is hotplugged as system memory. - Johannes Weiner has added the large series "mm: zswap: cleanups", which does that. - More DAMON work from SeongJae Park in the series "mm/damon: make DAMON debugfs interface deprecation unignorable" "selftests/damon: add more tests for core functionalities and corner cases" "Docs/mm/damon: misc readability improvements" "mm/damon: let DAMOS feeds and tame/auto-tune itself" - In the series "mm/mempolicy: weighted interleave mempolicy and sysfs extension" Rakie Kim has developed a new mempolicy interleaving policy wherein we allocate memory across nodes in a weighted fashion rather than uniformly. This is beneficial in heterogeneous memory environments appearing with CXL. - Christophe Leroy has contributed some cleanup and consolidation work against the ARM pagetable dumping code in the series "mm: ptdump: Refactor CONFIG_DEBUG_WX and check_wx_pages debugfs attribute". - Luis Chamberlain has added some additional xarray selftesting in the series "test_xarray: advanced API multi-index tests". - Muhammad Usama Anjum has reworked the selftest code to make its human-readable output conform to the TAP ("Test Anything Protocol") format. Amongst other things, this opens up the use of third-party tools to parse and process out selftesting results. - Ryan Roberts has added fork()-time PTE batching of THP ptes in the series "mm/memory: optimize fork() with PTE-mapped THP". Mainly targeted at arm64, this significantly speeds up fork() when the process has a large number of pte-mapped folios. - David Hildenbrand also gets in on the THP pte batching game in his series "mm/memory: optimize unmap/zap with PTE-mapped THP". It implements batching during munmap() and other pte teardown situations. The microbenchmark improvements are nice. - And in the series "Transparent Contiguous PTEs for User Mappings" Ryan Roberts further utilizes arm's pte's contiguous bit ("contpte mappings"). Kernel build times on arm64 improved nicely. Ryan's series "Address some contpte nits" provides some followup work. - In the series "mm/hugetlb: Restore the reservation" Breno Leitao has fixed an obscure hugetlb race which was causing unnecessary page faults. He has also added a reproducer under the selftest code. - In the series "selftests/mm: Output cleanups for the compaction test", Mark Brown did what the title claims. - Kinsey Ho has added the series "mm/mglru: code cleanup and refactoring". - Even more zswap material from Nhat Pham. The series "fix and extend zswap kselftests" does as claimed. - In the series "Introduce cpu_dcache_is_aliasing() to fix DAX regression" Mathieu Desnoyers has cleaned up and fixed rather a mess in our handling of DAX on archiecctures which have virtually aliasing data caches. The arm architecture is the main beneficiary. - Lokesh Gidra's series "per-vma locks in userfaultfd" provides dramatic improvements in worst-case mmap_lock hold times during certain userfaultfd operations. - Some page_owner enhancements and maintenance work from Oscar Salvador in his series "page_owner: print stacks and their outstanding allocations" "page_owner: Fixup and cleanup" - Uladzislau Rezki has contributed some vmalloc scalability improvements in his series "Mitigate a vmap lock contention". It realizes a 12x improvement for a certain microbenchmark. - Some kexec/crash cleanup work from Baoquan He in the series "Split crash out from kexec and clean up related config items". - Some zsmalloc maintenance work from Chengming Zhou in the series "mm/zsmalloc: fix and optimize objects/page migration" "mm/zsmalloc: some cleanup for get/set_zspage_mapping()" - Zi Yan has taught the MM to perform compaction on folios larger than order=0. This a step along the path to implementaton of the merging of large anonymous folios. The series is named "Enable >0 order folio memory compaction". - Christoph Hellwig has done quite a lot of cleanup work in the pagecache writeback code in his series "convert write_cache_pages() to an iterator". - Some modest hugetlb cleanups and speedups in Vishal Moola's series "Handle hugetlb faults under the VMA lock". - Zi Yan has changed the page splitting code so we can split huge pages into sizes other than order-0 to better utilize large folios. The series is named "Split a folio to any lower order folios". - David Hildenbrand has contributed the series "mm: remove total_mapcount()", a cleanup. - Matthew Wilcox has sought to improve the performance of bulk memory freeing in his series "Rearrange batched folio freeing". - Gang Li's series "hugetlb: parallelize hugetlb page init on boot" provides large improvements in bootup times on large machines which are configured to use large numbers of hugetlb pages. - Matthew Wilcox's series "PageFlags cleanups" does that. - Qi Zheng's series "minor fixes and supplement for ptdesc" does that also. S390 is affected. - Cleanups to our pagemap utility functions from Peter Xu in his series "mm/treewide: Replace pXd_large() with pXd_leaf()". - Nico Pache has fixed a few things with our hugepage selftests in his series "selftests/mm: Improve Hugepage Test Handling in MM Selftests". - Also, of course, many singleton patches to many things. Please see the individual changelogs for details. * tag 'mm-stable-2024-03-13-20-04' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (435 commits) mm/zswap: remove the memcpy if acomp is not sleepable crypto: introduce: acomp_is_async to expose if comp drivers might sleep memtest: use {READ,WRITE}_ONCE in memory scanning mm: prohibit the last subpage from reusing the entire large folio mm: recover pud_leaf() definitions in nopmd case selftests/mm: skip the hugetlb-madvise tests on unmet hugepage requirements selftests/mm: skip uffd hugetlb tests with insufficient hugepages selftests/mm: dont fail testsuite due to a lack of hugepages mm/huge_memory: skip invalid debugfs new_order input for folio split mm/huge_memory: check new folio order when split a folio mm, vmscan: retry kswapd's priority loop with cache_trim_mode off on failure mm: add an explicit smp_wmb() to UFFDIO_CONTINUE mm: fix list corruption in put_pages_list mm: remove folio from deferred split list before uncharging it filemap: avoid unnecessary major faults in filemap_fault() mm,page_owner: drop unnecessary check mm,page_owner: check for null stack_record before bumping its refcount mm: swap: fix race between free_swap_and_cache() and swapoff() mm/treewide: align up pXd_leaf() retval across archs mm/treewide: drop pXd_large() ... |
||
![]() |
9187210eee |
Networking changes for 6.9.
Core & protocols ---------------- - Large effort by Eric to lower rtnl_lock pressure and remove locks: - Make commonly used parts of rtnetlink (address, route dumps etc.) lockless, protected by RCU instead of rtnl_lock. - Add a netns exit callback which already holds rtnl_lock, allowing netns exit to take rtnl_lock once in the core instead of once for each driver / callback. - Remove locks / serialization in the socket diag interface. - Remove 6 calls to synchronize_rcu() while holding rtnl_lock. - Remove the dev_base_lock, depend on RCU where necessary. - Support busy polling on a per-epoll context basis. Poll length and budget parameters can be set independently of system defaults. - Introduce struct net_hotdata, to make sure read-mostly global config variables fit in as few cache lines as possible. - Add optional per-nexthop statistics to ease monitoring / debug of ECMP imbalance problems. - Support TCP_NOTSENT_LOWAT in MPTCP. - Ensure that IPv6 temporary addresses' preferred lifetimes are long enough, compared to other configured lifetimes, and at least 2 sec. - Support forwarding of ICMP Error messages in IPSec, per RFC 4301. - Add support for the independent control state machine for bonding per IEEE 802.1AX-2008 5.4.15 in addition to the existing coupled control state machine. - Add "network ID" to MCTP socket APIs to support hosts with multiple disjoint MCTP networks. - Re-use the mono_delivery_time skbuff bit for packets which user space wants to be sent at a specified time. Maintain the timing information while traversing veth links, bridge etc. - Take advantage of MSG_SPLICE_PAGES for RxRPC DATA and ACK packets. - Simplify many places iterating over netdevs by using an xarray instead of a hash table walk (hash table remains in place, for use on fastpaths). - Speed up scanning for expired routes by keeping a dedicated list. - Speed up "generic" XDP by trying harder to avoid large allocations. - Support attaching arbitrary metadata to netconsole messages. Things we sprinkled into general kernel code -------------------------------------------- - Enforce VM_IOREMAP flag and range in ioremap_page_range and introduce VM_SPARSE kind and vm_area_[un]map_pages (used by bpf_arena). - Rework selftest harness to enable the use of the full range of ksft exit code (pass, fail, skip, xfail, xpass). Netfilter --------- - Allow userspace to define a table that is exclusively owned by a daemon (via netlink socket aliveness) without auto-removing this table when the userspace program exits. Such table gets marked as orphaned and a restarting management daemon can re-attach/regain ownership. - Speed up element insertions to nftables' concatenated-ranges set type. Compact a few related data structures. BPF --- - Add BPF token support for delegating a subset of BPF subsystem functionality from privileged system-wide daemons such as systemd through special mount options for userns-bound BPF fs to a trusted & unprivileged application. - Introduce bpf_arena which is sparse shared memory region between BPF program and user space where structures inside the arena can have pointers to other areas of the arena, and pointers work seamlessly for both user-space programs and BPF programs. - Introduce may_goto instruction that is a contract between the verifier and the program. The verifier allows the program to loop assuming it's behaving well, but reserves the right to terminate it. - Extend the BPF verifier to enable static subprog calls in spin lock critical sections. - Support registration of struct_ops types from modules which helps projects like fuse-bpf that seeks to implement a new struct_ops type. - Add support for retrieval of cookies for perf/kprobe multi links. - Support arbitrary TCP SYN cookie generation / validation in the TC layer with BPF to allow creating SYN flood handling in BPF firewalls. - Add code generation to inline the bpf_kptr_xchg() helper which improves performance when stashing/popping the allocated BPF objects. Wireless -------- - Add SPP (signaling and payload protected) AMSDU support. - Support wider bandwidth OFDMA, as required for EHT operation. Driver API ---------- - Major overhaul of the Energy Efficient Ethernet internals to support new link modes (2.5GE, 5GE), share more code between drivers (especially those using phylib), and encourage more uniform behavior. Convert and clean up drivers. - Define an API for querying per netdev queue statistics from drivers. - IPSec: account in global stats for fully offloaded sessions. - Create a concept of Ethernet PHY Packages at the Device Tree level, to allow parameterizing the existing PHY package code. - Enable Rx hashing (RSS) on GTP protocol fields. Misc ---- - Improvements and refactoring all over networking selftests. - Create uniform module aliases for TC classifiers, actions, and packet schedulers to simplify creating modprobe policies. - Address all missing MODULE_DESCRIPTION() warnings in networking. - Extend the Netlink descriptions in YAML to cover message encapsulation or "Netlink polymorphism", where interpretation of nested attributes depends on link type, classifier type or some other "class type". Drivers ------- - Ethernet high-speed NICs: - Add a new driver for Marvell's Octeon PCI Endpoint NIC VF. - Intel (100G, ice, idpf): - support E825-C devices - nVidia/Mellanox: - support devices with one port and multiple PCIe links - Broadcom (bnxt): - support n-tuple filters - support configuring the RSS key - Wangxun (ngbe/txgbe): - implement irq_domain for TXGBE's sub-interrupts - Pensando/AMD: - support XDP - optimize queue submission and wakeup handling (+17% bps) - optimize struct layout, saving 28% of memory on queues - Ethernet NICs embedded and virtual: - Google cloud vNIC: - refactor driver to perform memory allocations for new queue config before stopping and freeing the old queue memory - Synopsys (stmmac): - obey queueMaxSDU and implement counters required by 802.1Qbv - Renesas (ravb): - support packet checksum offload - suspend to RAM and runtime PM support - Ethernet switches: - nVidia/Mellanox: - support for nexthop group statistics - Microchip: - ksz8: implement PHY loopback - add support for KSZ8567, a 7-port 10/100Mbps switch - PTP: - New driver for RENESAS FemtoClock3 Wireless clock generator. - Support OCP PTP cards designed and built by Adva. - CAN: - Support recvmsg() flags for own, local and remote traffic on CAN BCM sockets. - Support for esd GmbH PCIe/402 CAN device family. - m_can: - Rx/Tx submission coalescing - wake on frame Rx - WiFi: - Intel (iwlwifi): - enable signaling and payload protected A-MSDUs - support wider-bandwidth OFDMA - support for new devices - bump FW API to 89 for AX devices; 90 for BZ/SC devices - MediaTek (mt76): - mt7915: newer ADIE version support - mt7925: radio temperature sensor support - Qualcomm (ath11k): - support 6 GHz station power modes: Low Power Indoor (LPI), Standard Power) SP and Very Low Power (VLP) - QCA6390 & WCN6855: support 2 concurrent station interfaces - QCA2066 support - Qualcomm (ath12k): - refactoring in preparation for Multi-Link Operation (MLO) support - 1024 Block Ack window size support - firmware-2.bin support - support having multiple identical PCI devices (firmware needs to have ATH12K_FW_FEATURE_MULTI_QRTR_ID) - QCN9274: support split-PHY devices - WCN7850: enable Power Save Mode in station mode - WCN7850: P2P support - RealTek: - rtw88: support for more rtw8811cu and rtw8821cu devices - rtw89: support SCAN_RANDOM_SN and SET_SCAN_DWELL - rtlwifi: speed up USB firmware initialization - rtwl8xxxu: - RTL8188F: concurrent interface support - Channel Switch Announcement (CSA) support in AP mode - Broadcom (brcmfmac): - per-vendor feature support - per-vendor SAE password setup - DMI nvram filename quirk for ACEPC W5 Pro Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmXv0mgACgkQMUZtbf5S IrtgMxAAuRd+WJW++SENr4KxIWhYO1q6Xcxnai43wrNkan9swD24icG8TYALt4f3 yoT6idQvWReAb5JNlh9rUQz8R7E0nJXlvEFn5MtJwcthx2C6wFo/XkJlddlRrT+j c2xGILwLjRhW65LaC0MZ2ECbEERkFz8xcGfK2SWzUgh6KYvPjcRfKFxugpM7xOQK P/Wnqhs4fVRS/Mj/bCcXcO+yhwC121Q3qVeQVjGS0AzEC65hAW87a/kc2BfgcegD EyI9R7mf6criQwX+0awubjfoIdr4oW/8oDVNvUDczkJkbaEVaLMQk9P5x/0XnnVS UHUchWXyI80Q8Rj12uN1/I0h3WtwNQnCRBuLSmtm6GLfCAwbLvp2nGWDnaXiqryW DVKUIHGvqPKjkOOMOVfSvfB3LvkS3xsFVVYiQBQCn0YSs/gtu4CoF2Nty9CiLPbK tTuxUnLdPDZDxU//l0VArZmP8p2JM7XQGJ+JH8GFH4SBTyBR23e0iyPSoyaxjnYn RReDnHMVsrS1i7GPhbqDJWn+uqMSs7N149i0XmmyeqwQHUVSJN3J2BApP2nCaDfy H2lTuYly5FfEezt61NvCE4qr/VsWeEjm1fYlFQ9dFn4pGn+HghyCpw+xD1ZN56DN lujemau5B3kk1UTtAT4ypPqvuqjkRFqpNV2LzsJSk/Js+hApw8Y= =oY52 -----END PGP SIGNATURE----- Merge tag 'net-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core & protocols: - Large effort by Eric to lower rtnl_lock pressure and remove locks: - Make commonly used parts of rtnetlink (address, route dumps etc) lockless, protected by RCU instead of rtnl_lock. - Add a netns exit callback which already holds rtnl_lock, allowing netns exit to take rtnl_lock once in the core instead of once for each driver / callback. - Remove locks / serialization in the socket diag interface. - Remove 6 calls to synchronize_rcu() while holding rtnl_lock. - Remove the dev_base_lock, depend on RCU where necessary. - Support busy polling on a per-epoll context basis. Poll length and budget parameters can be set independently of system defaults. - Introduce struct net_hotdata, to make sure read-mostly global config variables fit in as few cache lines as possible. - Add optional per-nexthop statistics to ease monitoring / debug of ECMP imbalance problems. - Support TCP_NOTSENT_LOWAT in MPTCP. - Ensure that IPv6 temporary addresses' preferred lifetimes are long enough, compared to other configured lifetimes, and at least 2 sec. - Support forwarding of ICMP Error messages in IPSec, per RFC 4301. - Add support for the independent control state machine for bonding per IEEE 802.1AX-2008 5.4.15 in addition to the existing coupled control state machine. - Add "network ID" to MCTP socket APIs to support hosts with multiple disjoint MCTP networks. - Re-use the mono_delivery_time skbuff bit for packets which user space wants to be sent at a specified time. Maintain the timing information while traversing veth links, bridge etc. - Take advantage of MSG_SPLICE_PAGES for RxRPC DATA and ACK packets. - Simplify many places iterating over netdevs by using an xarray instead of a hash table walk (hash table remains in place, for use on fastpaths). - Speed up scanning for expired routes by keeping a dedicated list. - Speed up "generic" XDP by trying harder to avoid large allocations. - Support attaching arbitrary metadata to netconsole messages. Things we sprinkled into general kernel code: - Enforce VM_IOREMAP flag and range in ioremap_page_range and introduce VM_SPARSE kind and vm_area_[un]map_pages (used by bpf_arena). - Rework selftest harness to enable the use of the full range of ksft exit code (pass, fail, skip, xfail, xpass). Netfilter: - Allow userspace to define a table that is exclusively owned by a daemon (via netlink socket aliveness) without auto-removing this table when the userspace program exits. Such table gets marked as orphaned and a restarting management daemon can re-attach/regain ownership. - Speed up element insertions to nftables' concatenated-ranges set type. Compact a few related data structures. BPF: - Add BPF token support for delegating a subset of BPF subsystem functionality from privileged system-wide daemons such as systemd through special mount options for userns-bound BPF fs to a trusted & unprivileged application. - Introduce bpf_arena which is sparse shared memory region between BPF program and user space where structures inside the arena can have pointers to other areas of the arena, and pointers work seamlessly for both user-space programs and BPF programs. - Introduce may_goto instruction that is a contract between the verifier and the program. The verifier allows the program to loop assuming it's behaving well, but reserves the right to terminate it. - Extend the BPF verifier to enable static subprog calls in spin lock critical sections. - Support registration of struct_ops types from modules which helps projects like fuse-bpf that seeks to implement a new struct_ops type. - Add support for retrieval of cookies for perf/kprobe multi links. - Support arbitrary TCP SYN cookie generation / validation in the TC layer with BPF to allow creating SYN flood handling in BPF firewalls. - Add code generation to inline the bpf_kptr_xchg() helper which improves performance when stashing/popping the allocated BPF objects. Wireless: - Add SPP (signaling and payload protected) AMSDU support. - Support wider bandwidth OFDMA, as required for EHT operation. Driver API: - Major overhaul of the Energy Efficient Ethernet internals to support new link modes (2.5GE, 5GE), share more code between drivers (especially those using phylib), and encourage more uniform behavior. Convert and clean up drivers. - Define an API for querying per netdev queue statistics from drivers. - IPSec: account in global stats for fully offloaded sessions. - Create a concept of Ethernet PHY Packages at the Device Tree level, to allow parameterizing the existing PHY package code. - Enable Rx hashing (RSS) on GTP protocol fields. Misc: - Improvements and refactoring all over networking selftests. - Create uniform module aliases for TC classifiers, actions, and packet schedulers to simplify creating modprobe policies. - Address all missing MODULE_DESCRIPTION() warnings in networking. - Extend the Netlink descriptions in YAML to cover message encapsulation or "Netlink polymorphism", where interpretation of nested attributes depends on link type, classifier type or some other "class type". Drivers: - Ethernet high-speed NICs: - Add a new driver for Marvell's Octeon PCI Endpoint NIC VF. - Intel (100G, ice, idpf): - support E825-C devices - nVidia/Mellanox: - support devices with one port and multiple PCIe links - Broadcom (bnxt): - support n-tuple filters - support configuring the RSS key - Wangxun (ngbe/txgbe): - implement irq_domain for TXGBE's sub-interrupts - Pensando/AMD: - support XDP - optimize queue submission and wakeup handling (+17% bps) - optimize struct layout, saving 28% of memory on queues - Ethernet NICs embedded and virtual: - Google cloud vNIC: - refactor driver to perform memory allocations for new queue config before stopping and freeing the old queue memory - Synopsys (stmmac): - obey queueMaxSDU and implement counters required by 802.1Qbv - Renesas (ravb): - support packet checksum offload - suspend to RAM and runtime PM support - Ethernet switches: - nVidia/Mellanox: - support for nexthop group statistics - Microchip: - ksz8: implement PHY loopback - add support for KSZ8567, a 7-port 10/100Mbps switch - PTP: - New driver for RENESAS FemtoClock3 Wireless clock generator. - Support OCP PTP cards designed and built by Adva. - CAN: - Support recvmsg() flags for own, local and remote traffic on CAN BCM sockets. - Support for esd GmbH PCIe/402 CAN device family. - m_can: - Rx/Tx submission coalescing - wake on frame Rx - WiFi: - Intel (iwlwifi): - enable signaling and payload protected A-MSDUs - support wider-bandwidth OFDMA - support for new devices - bump FW API to 89 for AX devices; 90 for BZ/SC devices - MediaTek (mt76): - mt7915: newer ADIE version support - mt7925: radio temperature sensor support - Qualcomm (ath11k): - support 6 GHz station power modes: Low Power Indoor (LPI), Standard Power) SP and Very Low Power (VLP) - QCA6390 & WCN6855: support 2 concurrent station interfaces - QCA2066 support - Qualcomm (ath12k): - refactoring in preparation for Multi-Link Operation (MLO) support - 1024 Block Ack window size support - firmware-2.bin support - support having multiple identical PCI devices (firmware needs to have ATH12K_FW_FEATURE_MULTI_QRTR_ID) - QCN9274: support split-PHY devices - WCN7850: enable Power Save Mode in station mode - WCN7850: P2P support - RealTek: - rtw88: support for more rtw8811cu and rtw8821cu devices - rtw89: support SCAN_RANDOM_SN and SET_SCAN_DWELL - rtlwifi: speed up USB firmware initialization - rtwl8xxxu: - RTL8188F: concurrent interface support - Channel Switch Announcement (CSA) support in AP mode - Broadcom (brcmfmac): - per-vendor feature support - per-vendor SAE password setup - DMI nvram filename quirk for ACEPC W5 Pro" * tag 'net-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2255 commits) nexthop: Fix splat with CONFIG_DEBUG_PREEMPT=y nexthop: Fix out-of-bounds access during attribute validation nexthop: Only parse NHA_OP_FLAGS for dump messages that require it nexthop: Only parse NHA_OP_FLAGS for get messages that require it bpf: move sleepable flag from bpf_prog_aux to bpf_prog bpf: hardcode BPF_PROG_PACK_SIZE to 2MB * num_possible_nodes() selftests/bpf: Add kprobe multi triggering benchmarks ptp: Move from simple ida to xarray vxlan: Remove generic .ndo_get_stats64 vxlan: Do not alloc tstats manually devlink: Add comments to use netlink gen tool nfp: flower: handle acti_netdevs allocation failure net/packet: Add getsockopt support for PACKET_COPY_THRESH net/netlink: Add getsockopt support for NETLINK_LISTEN_ALL_NSID selftests/bpf: Add bpf_arena_htab test. selftests/bpf: Add bpf_arena_list test. selftests/bpf: Add unit tests for bpf_arena_alloc/free_pages bpf: Add helper macro bpf_addr_space_cast() libbpf: Recognize __arena global variables. bpftool: Recognize arena map type ... |
||
![]() |
d08c407f71 |
A large set of updates and features for timers and timekeeping:
- The hierarchical timer pull model When timer wheel timers are armed they are placed into the timer wheel of a CPU which is likely to be busy at the time of expiry. This is done to avoid wakeups on potentially idle CPUs. This is wrong in several aspects: 1) The heuristics to select the target CPU are wrong by definition as the chance to get the prediction right is close to zero. 2) Due to #1 it is possible that timers are accumulated on a single target CPU 3) The required computation in the enqueue path is just overhead for dubious value especially under the consideration that the vast majority of timer wheel timers are either canceled or rearmed before they expire. The timer pull model avoids the above by removing the target computation on enqueue and queueing timers always on the CPU on which they get armed. This is achieved by having separate wheels for CPU pinned timers and global timers which do not care about where they expire. As long as a CPU is busy it handles both the pinned and the global timers which are queued on the CPU local timer wheels. When a CPU goes idle it evaluates its own timer wheels: - If the first expiring timer is a pinned timer, then the global timers can be ignored as the CPU will wake up before they expire. - If the first expiring timer is a global timer, then the expiry time is propagated into the timer pull hierarchy and the CPU makes sure to wake up for the first pinned timer. The timer pull hierarchy organizes CPUs in groups of eight at the lowest level and at the next levels groups of eight groups up to the point where no further aggregation of groups is required, i.e. the number of levels is log8(NR_CPUS). The magic number of eight has been established by experimention, but can be adjusted if needed. In each group one busy CPU acts as the migrator. It's only one CPU to avoid lock contention on remote timer wheels. The migrator CPU checks in its own timer wheel handling whether there are other CPUs in the group which have gone idle and have global timers to expire. If there are global timers to expire, the migrator locks the remote CPU timer wheel and handles the expiry. Depending on the group level in the hierarchy this handling can require to walk the hierarchy downwards to the CPU level. Special care is taken when the last CPU goes idle. At this point the CPU is the systemwide migrator at the top of the hierarchy and it therefore cannot delegate to the hierarchy. It needs to arm its own timer device to expire either at the first expiring timer in the hierarchy or at the first CPU local timer, which ever expires first. This completely removes the overhead from the enqueue path, which is e.g. for networking a true hotpath and trades it for a slightly more complex idle path. This has been in development for a couple of years and the final series has been extensively tested by various teams from silicon vendors and ran through extensive CI. There have been slight performance improvements observed on network centric workloads and an Intel team confirmed that this allows them to power down a die completely on a mult-die socket for the first time in a mostly idle scenario. There is only one outstanding ~1.5% regression on a specific overloaded netperf test which is currently investigated, but the rest is either positive or neutral performance wise and positive on the power management side. - Fixes for the timekeeping interpolation code for cross-timestamps: cross-timestamps are used for PTP to get snapshots from hardware timers and interpolated them back to clock MONOTONIC. The changes address a few corner cases in the interpolation code which got the math and logic wrong. - Simplifcation of the clocksource watchdog retry logic to automatically adjust to handle larger systems correctly instead of having more incomprehensible command line parameters. - Treewide consolidation of the VDSO data structures. - The usual small improvements and cleanups all over the place. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmXuAN0THHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoVKXEADIR45rjR1Xtz32js7B53Y65O4WNoOQ 6/ycWcswuGzg/h4QUpPSJ6gOGVmKSWwZi4n0P/VadCiXGSPPm0aUKsoRUt9DZsPY mtj2wjCSXKXiyhTl9OtrZME86ZAIGO1dQXa/sOHsiP5PCjgQkD0b5CYi1+B6eHDt 1/Uo2Tb9g8VAPppq20V5Uo93GrPf642oyi3FCFrR1M112Uuak5DmqHJYiDpreNcG D5SgI+ykSiaUaVyHifvqijoJk0rYXkqEC6evl02477lJ/X0vVo2/M8XPS95BxHST s5Iruo4rP+qeAy8QvhZpoPX59fO0m/AgA7cf77XXAtOpVdLH+bs4ILsEbouAIOtv lsmRkcYt+TpvrZFHPAxks+6g3afuROiDtxD5sXXpVWxvofi8FwWqubdlqdsbw9MP ZCTNyzNyKL47QeDwBfSynYUL1RSyqsphtIwk4oeQklH9rwMAnW21hi30z15hQ0pQ FOVkmcwi79JNvl/G+jRkDzw7r8/zcHshWdSjyUM04CDjjnCDjQOFWSIjEPwbQjjz S4HXpJKJW963dBgs9Z84/Ctw1GwoBk1qedDWDJE1257Qvmo/Wpe/7GddWcazOGnN RRFMzGPbOqBDbjtErOKGU+iCisgNEvz2XK+TI16uRjWde7DxZpiTVYgNDrZ+/Pyh rQ23UBms6ZRR+A== =iQlu -----END PGP SIGNATURE----- Merge tag 'timers-core-2024-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "A large set of updates and features for timers and timekeeping: - The hierarchical timer pull model When timer wheel timers are armed they are placed into the timer wheel of a CPU which is likely to be busy at the time of expiry. This is done to avoid wakeups on potentially idle CPUs. This is wrong in several aspects: 1) The heuristics to select the target CPU are wrong by definition as the chance to get the prediction right is close to zero. 2) Due to #1 it is possible that timers are accumulated on a single target CPU 3) The required computation in the enqueue path is just overhead for dubious value especially under the consideration that the vast majority of timer wheel timers are either canceled or rearmed before they expire. The timer pull model avoids the above by removing the target computation on enqueue and queueing timers always on the CPU on which they get armed. This is achieved by having separate wheels for CPU pinned timers and global timers which do not care about where they expire. As long as a CPU is busy it handles both the pinned and the global timers which are queued on the CPU local timer wheels. When a CPU goes idle it evaluates its own timer wheels: - If the first expiring timer is a pinned timer, then the global timers can be ignored as the CPU will wake up before they expire. - If the first expiring timer is a global timer, then the expiry time is propagated into the timer pull hierarchy and the CPU makes sure to wake up for the first pinned timer. The timer pull hierarchy organizes CPUs in groups of eight at the lowest level and at the next levels groups of eight groups up to the point where no further aggregation of groups is required, i.e. the number of levels is log8(NR_CPUS). The magic number of eight has been established by experimention, but can be adjusted if needed. In each group one busy CPU acts as the migrator. It's only one CPU to avoid lock contention on remote timer wheels. The migrator CPU checks in its own timer wheel handling whether there are other CPUs in the group which have gone idle and have global timers to expire. If there are global timers to expire, the migrator locks the remote CPU timer wheel and handles the expiry. Depending on the group level in the hierarchy this handling can require to walk the hierarchy downwards to the CPU level. Special care is taken when the last CPU goes idle. At this point the CPU is the systemwide migrator at the top of the hierarchy and it therefore cannot delegate to the hierarchy. It needs to arm its own timer device to expire either at the first expiring timer in the hierarchy or at the first CPU local timer, which ever expires first. This completely removes the overhead from the enqueue path, which is e.g. for networking a true hotpath and trades it for a slightly more complex idle path. This has been in development for a couple of years and the final series has been extensively tested by various teams from silicon vendors and ran through extensive CI. There have been slight performance improvements observed on network centric workloads and an Intel team confirmed that this allows them to power down a die completely on a mult-die socket for the first time in a mostly idle scenario. There is only one outstanding ~1.5% regression on a specific overloaded netperf test which is currently investigated, but the rest is either positive or neutral performance wise and positive on the power management side. - Fixes for the timekeeping interpolation code for cross-timestamps: cross-timestamps are used for PTP to get snapshots from hardware timers and interpolated them back to clock MONOTONIC. The changes address a few corner cases in the interpolation code which got the math and logic wrong. - Simplifcation of the clocksource watchdog retry logic to automatically adjust to handle larger systems correctly instead of having more incomprehensible command line parameters. - Treewide consolidation of the VDSO data structures. - The usual small improvements and cleanups all over the place" * tag 'timers-core-2024-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (62 commits) timer/migration: Fix quick check reporting late expiry tick/sched: Fix build failure for CONFIG_NO_HZ_COMMON=n vdso/datapage: Quick fix - use asm/page-def.h for ARM64 timers: Assert no next dyntick timer look-up while CPU is offline tick: Assume timekeeping is correctly handed over upon last offline idle call tick: Shut down low-res tick from dying CPU tick: Split nohz and highres features from nohz_mode tick: Move individual bit features to debuggable mask accesses tick: Move got_idle_tick away from common flags tick: Assume the tick can't be stopped in NOHZ_MODE_INACTIVE mode tick: Move broadcast cancellation up to CPUHP_AP_TICK_DYING tick: Move tick cancellation up to CPUHP_AP_TICK_DYING tick: Start centralizing tick related CPU hotplug operations tick/sched: Don't clear ts::next_tick again in can_stop_idle_tick() tick/sched: Rename tick_nohz_stop_sched_tick() to tick_nohz_full_stop_tick() tick: Use IS_ENABLED() whenever possible tick/sched: Remove useless oneshot ifdeffery tick/nohz: Remove duplicate between lowres and highres handlers tick/nohz: Remove duplicate between tick_nohz_switch_to_nohz() and tick_setup_sched_timer() hrtimer: Select housekeeping CPU during migration ... |
||
![]() |
d7bca9199a |
mm: Introduce vmap_page_range() to map pages in PCI address space
ioremap_page_range() should be used for ranges within vmalloc range only.
The vmalloc ranges are allocated by get_vm_area(). PCI has "resource"
allocator that manages PCI_IOBASE, IO_SPACE_LIMIT address range, hence
introduce vmap_page_range() to be used exclusively to map pages
in PCI address space.
Fixes:
|
||
![]() |
199cc14cb4 |
LoongArch: Add kernel livepatching support
The arch-specified function ftrace_regs_set_instruction_pointer() has been implemented in arch/loongarch/include/asm/ftrace.h, so here only implement arch_stack_walk_reliable() function. Here are the test logs: [root@linux fedora]# cat /proc/cmdline BOOT_IMAGE=/vmlinuz-6.8.0-rc2 root=/dev/sda3 [root@linux fedora]# modprobe livepatch-sample [root@linux fedora]# cat /proc/cmdline this has been live patched [root@linux fedora]# echo 0 > /sys/kernel/livepatch/livepatch_sample/enabled [root@linux fedora]# rmmod livepatch_sample [root@linux fedora]# cat /proc/cmdline BOOT_IMAGE=/vmlinuz-6.8.0-rc2 root=/dev/sda3 [root@linux fedora]# dmesg -t | tail -5 livepatch: enabling patch 'livepatch_sample' livepatch: 'livepatch_sample': starting patching transition livepatch: 'livepatch_sample': patching complete livepatch: 'livepatch_sample': starting unpatching transition livepatch: 'livepatch_sample': unpatching complete Signed-off-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
cb8a2ef084 |
LoongArch: Add ORC stack unwinder support
The kernel CONFIG_UNWINDER_ORC option enables the ORC unwinder, which is similar in concept to a DWARF unwinder. The difference is that the format of the ORC data is much simpler than DWARF, which in turn allows the ORC unwinder to be much simpler and faster. The ORC data consists of unwind tables which are generated by objtool. After analyzing all the code paths of a .o file, it determines information about the stack state at each instruction address in the file and outputs that information to the .orc_unwind and .orc_unwind_ip sections. The per-object ORC sections are combined at link time and are sorted and post-processed at boot time. The unwinder uses the resulting data to correlate instruction addresses with their stack states at run time. Most of the logic are similar with x86, in order to get ra info before ra is saved into stack, add ra_reg and ra_offset into orc_entry. At the same time, modify some arch-specific code to silence the objtool warnings. Co-developed-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Jinyang He <hejinyang@loongson.cn> Co-developed-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
ea034d0b07 |
loongarch, crash: wrap crash dumping code into crash related ifdefs
Now crash codes under kernel/ folder has been split out from kexec code, crash dumping can be separated from kexec reboot in config items on loongarch with some adjustments. Here use IS_ENABLED(CONFIG_CRASH_RESERVE) check to decide if compiling in the crashkernel reservation code. Link: https://lkml.kernel.org/r/20240124051254.67105-15-bhe@redhat.com Signed-off-by: Baoquan He <bhe@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Pingfan Liu <piliu@redhat.com> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Michael Kelley <mhklinux@outlook.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
![]() |
9fa304b9f8 |
LoongArch: Call early_init_fdt_scan_reserved_mem() earlier
The unflatten_and_copy_device_tree() function contains a call to
memblock_alloc(). This means that memblock is allocating memory before
any of the reserved memory regions are set aside in the arch_mem_init()
function which calls early_init_fdt_scan_reserved_mem(). Therefore,
there is a possibility for memblock to allocate from any of the
reserved memory regions.
Hence, move the call to early_init_fdt_scan_reserved_mem() to be earlier
in the init sequence, so that the reserved memory regions are set aside
before any allocations are done using memblock.
Cc: stable@vger.kernel.org
Fixes:
|
||
![]() |
752cd08da3 |
LoongArch: Update cpu_sibling_map when disabling nonboot CPUs
Update cpu_sibling_map when disabling nonboot CPUs by defining & calling clear_cpu_sibling_map(), otherwise we get such errors on SMT systems: jump label: negative count! WARNING: CPU: 6 PID: 45 at kernel/jump_label.c:263 __static_key_slow_dec_cpuslocked+0xec/0x100 CPU: 6 PID: 45 Comm: cpuhp/6 Not tainted 6.8.0-rc5+ #1340 pc 90000000004c302c ra 90000000004c302c tp 90000001005bc000 sp 90000001005bfd20 a0 000000000000001b a1 900000000224c278 a2 90000001005bfb58 a3 900000000224c280 a4 900000000224c278 a5 90000001005bfb50 a6 0000000000000001 a7 0000000000000001 t0 ce87a4763eb5234a t1 ce87a4763eb5234a t2 0000000000000000 t3 0000000000000000 t4 0000000000000006 t5 0000000000000000 t6 0000000000000064 t7 0000000000001964 t8 000000000009ebf6 u0 9000000001f2a068 s9 0000000000000000 s0 900000000246a2d8 s1 ffffffffffffffff s2 ffffffffffffffff s3 90000000021518c0 s4 0000000000000040 s5 9000000002151058 s6 9000000009828e40 s7 00000000000000b4 s8 0000000000000006 ra: 90000000004c302c __static_key_slow_dec_cpuslocked+0xec/0x100 ERA: 90000000004c302c __static_key_slow_dec_cpuslocked+0xec/0x100 CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) PRMD: 00000004 (PPLV0 +PIE -PWE) EUEN: 00000000 (-FPE -SXE -ASXE -BTE) ECFG: 00071c1c (LIE=2-4,10-12 VS=7) ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0) PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV) CPU: 6 PID: 45 Comm: cpuhp/6 Not tainted 6.8.0-rc5+ #1340 Stack : 0000000000000000 900000000203f258 900000000179afc8 90000001005bc000 90000001005bf980 0000000000000000 90000001005bf988 9000000001fe0be0 900000000224c280 900000000224c278 90000001005bf8c0 0000000000000001 0000000000000001 ce87a4763eb5234a 0000000007f38000 90000001003f8cc0 0000000000000000 0000000000000006 0000000000000000 4c206e6f73676e6f 6f4c203a656d616e 000000000009ec99 0000000007f38000 0000000000000000 900000000214b000 9000000001fe0be0 0000000000000004 0000000000000000 0000000000000107 0000000000000009 ffffffffffafdabe 00000000000000b4 0000000000000006 90000000004c302c 9000000000224528 00005555939a0c7c 00000000000000b0 0000000000000004 0000000000000000 0000000000071c1c ... Call Trace: [<9000000000224528>] show_stack+0x48/0x1a0 [<900000000179afc8>] dump_stack_lvl+0x78/0xa0 [<9000000000263ed0>] __warn+0x90/0x1a0 [<90000000017419b8>] report_bug+0x1b8/0x280 [<900000000179c564>] do_bp+0x264/0x420 [<90000000004c302c>] __static_key_slow_dec_cpuslocked+0xec/0x100 [<90000000002b4d7c>] sched_cpu_deactivate+0x2fc/0x300 [<9000000000266498>] cpuhp_invoke_callback+0x178/0x8a0 [<9000000000267f70>] cpuhp_thread_fun+0xf0/0x240 [<90000000002a117c>] smpboot_thread_fn+0x1dc/0x2e0 [<900000000029a720>] kthread+0x140/0x160 [<9000000000222288>] ret_from_kernel_thread+0xc/0xa4 Cc: stable@vger.kernel.org Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
1001db6c42 |
LoongArch: Disable IRQ before init_fn() for nonboot CPUs
Disable IRQ before init_fn() for nonboot CPUs when hotplug, in order to silence such warnings (and also avoid potential errors due to unexpected interrupts): WARNING: CPU: 1 PID: 0 at kernel/rcu/tree.c:4503 rcu_cpu_starting+0x214/0x280 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.6.17+ #1198 pc 90000000048e3334 ra 90000000047bd56c tp 900000010039c000 sp 900000010039fdd0 a0 0000000000000001 a1 0000000000000006 a2 900000000802c040 a3 0000000000000000 a4 0000000000000001 a5 0000000000000004 a6 0000000000000000 a7 90000000048e3f4c t0 0000000000000001 t1 9000000005c70968 t2 0000000004000000 t3 000000000005e56e t4 00000000000002e4 t5 0000000000001000 t6 ffffffff80000000 t7 0000000000040000 t8 9000000007931638 u0 0000000000000006 s9 0000000000000004 s0 0000000000000001 s1 9000000006356ac0 s2 9000000007244000 s3 0000000000000001 s4 0000000000000001 s5 900000000636f000 s6 7fffffffffffffff s7 9000000002123940 s8 9000000001ca55f8 ra: 90000000047bd56c tlb_init+0x24c/0x528 ERA: 90000000048e3334 rcu_cpu_starting+0x214/0x280 CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) PRMD: 00000000 (PPLV0 -PIE -PWE) EUEN: 00000000 (-FPE -SXE -ASXE -BTE) ECFG: 00071000 (LIE=12 VS=7) ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0) PRID: 0014c010 (Loongson-64bit, Loongson-3A5000) CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.6.17+ #1198 Stack : 0000000000000000 9000000006375000 9000000005b61878 900000010039c000 900000010039fa30 0000000000000000 900000010039fa38 900000000619a140 9000000006456888 9000000006456880 900000010039f950 0000000000000001 0000000000000001 cb0cb028ec7e52e1 0000000002b90000 9000000100348700 0000000000000000 0000000000000001 ffffffff916d12f1 0000000000000003 0000000000040000 9000000007930370 0000000002b90000 0000000000000004 9000000006366000 900000000619a140 0000000000000000 0000000000000004 0000000000000000 0000000000000009 ffffffffffc681f2 9000000002123940 9000000001ca55f8 9000000006366000 90000000047a4828 00007ffff057ded8 00000000000000b0 0000000000000000 0000000000000000 0000000000071000 ... Call Trace: [<90000000047a4828>] show_stack+0x48/0x1a0 [<9000000005b61874>] dump_stack_lvl+0x84/0xcc [<90000000047f60ac>] __warn+0x8c/0x1e0 [<9000000005b0ab34>] report_bug+0x1b4/0x280 [<9000000005b63110>] do_bp+0x2d0/0x480 [<90000000047a2e20>] handle_bp+0x120/0x1c0 [<90000000048e3334>] rcu_cpu_starting+0x214/0x280 [<90000000047bd568>] tlb_init+0x248/0x528 [<90000000047a4c44>] per_cpu_trap_init+0x124/0x160 [<90000000047a19f4>] cpu_probe+0x494/0xa00 [<90000000047b551c>] start_secondary+0x3c/0xc0 [<9000000005b66134>] smpboot_entry+0x50/0x58 Cc: stable@vger.kernel.org Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
8d87d2cd1d |
LoongArch: vdso: Use generic union vdso_data_store
There is already a generic union definition for vdso_data_store in vdso datapage header. Use this definition to prevent code duplication. Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20240219153939.75719-9-anna-maria@linutronix.de |
||
![]() |
4551b30525 |
LoongArch: Change acpi_core_pic[NR_CPUS] to acpi_core_pic[MAX_CORE_PIC]
With default config, the value of NR_CPUS is 64. When HW platform has more then 64 cpus, system will crash on these platforms. MAX_CORE_PIC is the maximum cpu number in MADT table (max physical number) which can exceed the supported maximum cpu number (NR_CPUS, max logical number), but kernel should not crash. Kernel should boot cpus with NR_CPUS, let the remainder cpus stay in BIOS. The potential crash reason is that the array acpi_core_pic[NR_CPUS] can be overflowed when parsing MADT table, and it is obvious that CORE_PIC should be corresponding to physical core rather than logical core, so it is better to define the array as acpi_core_pic[MAX_CORE_PIC]. With the patch, system can boot up 64 vcpus with qemu parameter -smp 128, otherwise system will crash with the following message. [ 0.000000] CPU 0 Unable to handle kernel paging request at virtual address 0000420000004259, era == 90000000037a5f0c, ra == 90000000037a46ec [ 0.000000] Oops[#1]: [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-rc2+ #192 [ 0.000000] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022 [ 0.000000] pc 90000000037a5f0c ra 90000000037a46ec tp 9000000003c90000 sp 9000000003c93d60 [ 0.000000] a0 0000000000000019 a1 9000000003d93bc0 a2 0000000000000000 a3 9000000003c93bd8 [ 0.000000] a4 9000000003c93a74 a5 9000000083c93a67 a6 9000000003c938f0 a7 0000000000000005 [ 0.000000] t0 0000420000004201 t1 0000000000000000 t2 0000000000000001 t3 0000000000000001 [ 0.000000] t4 0000000000000003 t5 0000000000000000 t6 0000000000000030 t7 0000000000000063 [ 0.000000] t8 0000000000000014 u0 ffffffffffffffff s9 0000000000000000 s0 9000000003caee98 [ 0.000000] s1 90000000041b0480 s2 9000000003c93da0 s3 9000000003c93d98 s4 9000000003c93d90 [ 0.000000] s5 9000000003caa000 s6 000000000a7fd000 s7 000000000f556b60 s8 000000000e0a4330 [ 0.000000] ra: 90000000037a46ec platform_init+0x214/0x250 [ 0.000000] ERA: 90000000037a5f0c efi_runtime_init+0x30/0x94 [ 0.000000] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) [ 0.000000] PRMD: 00000000 (PPLV0 -PIE -PWE) [ 0.000000] EUEN: 00000000 (-FPE -SXE -ASXE -BTE) [ 0.000000] ECFG: 00070800 (LIE=11 VS=7) [ 0.000000] ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0) [ 0.000000] BADV: 0000420000004259 [ 0.000000] PRID: 0014c010 (Loongson-64bit, Loongson-3A5000) [ 0.000000] Modules linked in: [ 0.000000] Process swapper (pid: 0, threadinfo=(____ptrval____), task=(____ptrval____)) [ 0.000000] Stack : 9000000003c93a14 9000000003800898 90000000041844f8 90000000037a46ec [ 0.000000] 000000000a7fd000 0000000008290000 0000000000000000 0000000000000000 [ 0.000000] 0000000000000000 0000000000000000 00000000019d8000 000000000f556b60 [ 0.000000] 000000000a7fd000 000000000f556b08 9000000003ca7700 9000000003800000 [ 0.000000] 9000000003c93e50 9000000003800898 9000000003800108 90000000037a484c [ 0.000000] 000000000e0a4330 000000000f556b60 000000000a7fd000 000000000f556b08 [ 0.000000] 9000000003ca7700 9000000004184000 0000000000200000 000000000e02b018 [ 0.000000] 000000000a7fd000 90000000037a0790 9000000003800108 0000000000000000 [ 0.000000] 0000000000000000 000000000e0a4330 000000000f556b60 000000000a7fd000 [ 0.000000] 000000000f556b08 000000000eaae298 000000000eaa5040 0000000000200000 [ 0.000000] ... [ 0.000000] Call Trace: [ 0.000000] [<90000000037a5f0c>] efi_runtime_init+0x30/0x94 [ 0.000000] [<90000000037a46ec>] platform_init+0x214/0x250 [ 0.000000] [<90000000037a484c>] setup_arch+0x124/0x45c [ 0.000000] [<90000000037a0790>] start_kernel+0x90/0x670 [ 0.000000] [<900000000378b0d8>] kernel_entry+0xd8/0xdc Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
5056c596c3 |
LoongArch/smp: Call rcutree_report_cpu_starting() at tlb_init()
Machines which have more than 8 nodes fail to boot SMP after commit |
||
![]() |
24fdd51899 |
LoongArch changes for v6.8
1, Raise minimum clang version to 18.0.0; 2, Enable initial Rust support for LoongArch; 3, Add built-in dtb support for LoongArch; 4, Use generic interface to support crashkernel=X,[high,low]; 5, Some bug fixes and other small changes; 6, Update the default config file. -----BEGIN PGP SIGNATURE----- iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmWnW9cWHGNoZW5odWFj YWlAa2VybmVsLm9yZwAKCRAChivD8uImel3CD/0Wnd2VOhoPubJkCXd+v7SdPDFB +BlkevAdmKQXkxNVXHRwfirsEBnUdQTfSN/5hMd69ZWUTayYq3WFxOcaPs27AAyn cXmGAzxfCjanSj+zxK8Gcmef5kppx3PRSbFdnWgc42Povu0xTOH3M31HXx5WXGtv hZK439DspNGHlF1Bsbs3J8xbS76jc/HDZAqnIjLuefQUaWM8nhsYxJIwVeGKUX1T IyEgBwhHhsY9ho/86yk8VXgordAN4dnMVmAHbR63HqjLo/8sck4IiPNxWKFCHex8 vgxp0zGxfBBts284EfSofDQHrSrrWl4+e2fW2QJ81BBDSS0wPCs4TAnzH+x9X7Wb MJuh8WIJqhfXdPFxs5fdnUeykEm1V/oWFfkWORk4jbQkpY9aZbk/iv6uxsmRhmhv 2WPWvjF+7B2zSXtMcjgm71ymb/nU95W2FZO02GlwTnbGJRKA2xLkjn9rCXoHWjd3 IlxgIgZJ1vkPvFPS/sbekaTUEG+6/qTPGGa2Ol3Q5ZTTLk9serfDa8ay1xCZeOny +fRBgLsuQAOGO2pvxfXjs+uvboZNUHeKrAi7XeR61GcbNpQDkjuwNJXQMiMQ+f66 jWM6H+hV+6sQ/W43KVrGCyBqTX4J9PSN/gX/Cq0PL74Yheop6neYXZTl5uDNYDe9 WYxiS9j/FoYgj8lxYQ== =GzFR -----END PGP SIGNATURE----- Merge tag 'loongarch-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch updates from Huacai Chen: - Raise minimum clang version to 18.0.0 - Enable initial Rust support for LoongArch - Add built-in dtb support for LoongArch - Use generic interface to support crashkernel=X,[high,low] - Some bug fixes and other small changes - Update the default config file. * tag 'loongarch-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (22 commits) MAINTAINERS: Add BPF JIT for LOONGARCH entry LoongArch: Update Loongson-3 default config file LoongArch: BPF: Prevent out-of-bounds memory access LoongArch: BPF: Support 64-bit pointers to kfuncs LoongArch: Fix definition of ftrace_regs_set_instruction_pointer() LoongArch: Use generic interface to support crashkernel=X,[high,low] LoongArch: Fix and simplify fcsr initialization on execve() LoongArch: Let cores_io_master cover the largest NR_CPUS LoongArch: Change SHMLBA from SZ_64K to PAGE_SIZE LoongArch: Add a missing call to efi_esrt_init() LoongArch: Parsing CPU-related information from DTS LoongArch: dts: DeviceTree for Loongson-2K2000 LoongArch: dts: DeviceTree for Loongson-2K1000 LoongArch: dts: DeviceTree for Loongson-2K0500 LoongArch: Allow device trees be built into the kernel dt-bindings: interrupt-controller: loongson,liointc: Fix dtbs_check warning for interrupt-names dt-bindings: interrupt-controller: loongson,liointc: Fix dtbs_check warning for reg-names dt-bindings: loongarch: Add Loongson SoC boards compatibles dt-bindings: loongarch: Add CPU bindings for LoongArch LoongArch: Enable initial Rust support ... |
||
![]() |
80955ae955 |
Driver core changes for 6.8-rc1
Here are the set of driver core and kernfs changes for 6.8-rc1. Nothing major in here this release cycle, just lots of small cleanups and some tweaks on kernfs that in the very end, got reverted and will come back in a safer way next release cycle. Included in here are: - more driver core 'const' cleanups and fixes - fw_devlink=rpm is now the default behavior - kernfs tiny changes to remove some string functions - cpu handling in the driver core is updated to work better on many systems that add topologies and cpus after booting - other minor changes and cleanups All of the cpu handling patches have been acked by the respective maintainers and are coming in here in one series. Everything has been in linux-next for a while with no reported issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZaeOrg8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ymtcwCffzvKKkSY9qAp6+0v2WQNkZm1JWoAoJCPYUwF If6wEoPLWvRfKx4gIoq9 =D96r -----END PGP SIGNATURE----- Merge tag 'driver-core-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here are the set of driver core and kernfs changes for 6.8-rc1. Nothing major in here this release cycle, just lots of small cleanups and some tweaks on kernfs that in the very end, got reverted and will come back in a safer way next release cycle. Included in here are: - more driver core 'const' cleanups and fixes - fw_devlink=rpm is now the default behavior - kernfs tiny changes to remove some string functions - cpu handling in the driver core is updated to work better on many systems that add topologies and cpus after booting - other minor changes and cleanups All of the cpu handling patches have been acked by the respective maintainers and are coming in here in one series. Everything has been in linux-next for a while with no reported issues" * tag 'driver-core-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (51 commits) Revert "kernfs: convert kernfs_idr_lock to an irq safe raw spinlock" kernfs: convert kernfs_idr_lock to an irq safe raw spinlock class: fix use-after-free in class_register() PM: clk: make pm_clk_add_notifier() take a const pointer EDAC: constantify the struct bus_type usage kernfs: fix reference to renamed function driver core: device.h: fix Excess kernel-doc description warning driver core: class: fix Excess kernel-doc description warning driver core: mark remaining local bus_type variables as const driver core: container: make container_subsys const driver core: bus: constantify subsys_register() calls driver core: bus: make bus_sort_breadthfirst() take a const pointer kernfs: d_obtain_alias(NULL) will do the right thing... driver core: Better advertise dev_err_probe() kernfs: Convert kernfs_path_from_node_locked() from strlcpy() to strscpy() kernfs: Convert kernfs_name_locked() from strlcpy() to strscpy() kernfs: Convert kernfs_walk_ns() from strlcpy() to strscpy() initramfs: Expose retained initrd as sysfs file fs/kernfs/dir: obey S_ISGID kernel/cgroup: use kernfs_create_dir_ns() ... |
||
![]() |
09d1c6a80f |
Generic:
- Use memdup_array_user() to harden against overflow. - Unconditionally advertise KVM_CAP_DEVICE_CTRL for all architectures. - Clean up Kconfigs that all KVM architectures were selecting - New functionality around "guest_memfd", a new userspace API that creates an anonymous file and returns a file descriptor that refers to it. guest_memfd files are bound to their owning virtual machine, cannot be mapped, read, or written by userspace, and cannot be resized. guest_memfd files do however support PUNCH_HOLE, which can be used to switch a memory area between guest_memfd and regular anonymous memory. - New ioctl KVM_SET_MEMORY_ATTRIBUTES allowing userspace to specify per-page attributes for a given page of guest memory; right now the only attribute is whether the guest expects to access memory via guest_memfd or not, which in Confidential SVMs backed by SEV-SNP, TDX or ARM64 pKVM is checked by firmware or hypervisor that guarantees confidentiality (AMD PSP, Intel TDX module, or EL2 in the case of pKVM). x86: - Support for "software-protected VMs" that can use the new guest_memfd and page attributes infrastructure. This is mostly useful for testing, since there is no pKVM-like infrastructure to provide a meaningfully reduced TCB. - Fix a relatively benign off-by-one error when splitting huge pages during CLEAR_DIRTY_LOG. - Fix a bug where KVM could incorrectly test-and-clear dirty bits in non-leaf TDP MMU SPTEs if a racing thread replaces a huge SPTE with a non-huge SPTE. - Use more generic lockdep assertions in paths that don't actually care about whether the caller is a reader or a writer. - let Xen guests opt out of having PV clock reported as "based on a stable TSC", because some of them don't expect the "TSC stable" bit (added to the pvclock ABI by KVM, but never set by Xen) to be set. - Revert a bogus, made-up nested SVM consistency check for TLB_CONTROL. - Advertise flush-by-ASID support for nSVM unconditionally, as KVM always flushes on nested transitions, i.e. always satisfies flush requests. This allows running bleeding edge versions of VMware Workstation on top of KVM. - Sanity check that the CPU supports flush-by-ASID when enabling SEV support. - On AMD machines with vNMI, always rely on hardware instead of intercepting IRET in some cases to detect unmasking of NMIs - Support for virtualizing Linear Address Masking (LAM) - Fix a variety of vPMU bugs where KVM fail to stop/reset counters and other state prior to refreshing the vPMU model. - Fix a double-overflow PMU bug by tracking emulated counter events using a dedicated field instead of snapshotting the "previous" counter. If the hardware PMC count triggers overflow that is recognized in the same VM-Exit that KVM manually bumps an event count, KVM would pend PMIs for both the hardware-triggered overflow and for KVM-triggered overflow. - Turn off KVM_WERROR by default for all configs so that it's not inadvertantly enabled by non-KVM developers, which can be problematic for subsystems that require no regressions for W=1 builds. - Advertise all of the host-supported CPUID bits that enumerate IA32_SPEC_CTRL "features". - Don't force a masterclock update when a vCPU synchronizes to the current TSC generation, as updating the masterclock can cause kvmclock's time to "jump" unexpectedly, e.g. when userspace hotplugs a pre-created vCPU. - Use RIP-relative address to read kvm_rebooting in the VM-Enter fault paths, partly as a super minor optimization, but mostly to make KVM play nice with position independent executable builds. - Guard KVM-on-HyperV's range-based TLB flush hooks with an #ifdef on CONFIG_HYPERV as a minor optimization, and to self-document the code. - Add CONFIG_KVM_HYPERV to allow disabling KVM support for HyperV "emulation" at build time. ARM64: - LPA2 support, adding 52bit IPA/PA capability for 4kB and 16kB base granule sizes. Branch shared with the arm64 tree. - Large Fine-Grained Trap rework, bringing some sanity to the feature, although there is more to come. This comes with a prefix branch shared with the arm64 tree. - Some additional Nested Virtualization groundwork, mostly introducing the NV2 VNCR support and retargetting the NV support to that version of the architecture. - A small set of vgic fixes and associated cleanups. Loongarch: - Optimization for memslot hugepage checking - Cleanup and fix some HW/SW timer issues - Add LSX/LASX (128bit/256bit SIMD) support RISC-V: - KVM_GET_REG_LIST improvement for vector registers - Generate ISA extension reg_list using macros in get-reg-list selftest - Support for reporting steal time along with selftest s390: - Bugfixes Selftests: - Fix an annoying goof where the NX hugepage test prints out garbage instead of the magic token needed to run the test. - Fix build errors when a header is delete/moved due to a missing flag in the Makefile. - Detect if KVM bugged/killed a selftest's VM and print out a helpful message instead of complaining that a random ioctl() failed. - Annotate the guest printf/assert helpers with __printf(), and fix the various bugs that were lurking due to lack of said annotation. There are two non-KVM patches buried in the middle of guest_memfd support: fs: Rename anon_inode_getfile_secure() and anon_inode_getfd_secure() mm: Add AS_UNMOVABLE to mark mapping as completely unmovable The first is small and mostly suggested-by Christian Brauner; the second a bit less so but it was written by an mm person (Vlastimil Babka). -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmWcMWkUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroO15gf/WLmmg3SET6Uzw9iEq2xo28831ZA+ 6kpILfIDGKozV5safDmMvcInlc/PTnqOFrsKyyN4kDZ+rIJiafJdg/loE0kPXBML wdR+2ix5kYI1FucCDaGTahskBDz8Lb/xTpwGg9BFLYFNmuUeHc74o6GoNvr1uliE 4kLZL2K6w0cSMPybUD+HqGaET80ZqPwecv+s1JL+Ia0kYZJONJifoHnvOUJ7DpEi rgudVdgzt3EPjG0y1z6MjvDBXTCOLDjXajErlYuZD3Ej8N8s59Dh2TxOiDNTLdP4 a4zjRvDmgyr6H6sz+upvwc7f4M4p+DBvf+TkWF54mbeObHUYliStqURIoA== =66Ws -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "Generic: - Use memdup_array_user() to harden against overflow. - Unconditionally advertise KVM_CAP_DEVICE_CTRL for all architectures. - Clean up Kconfigs that all KVM architectures were selecting - New functionality around "guest_memfd", a new userspace API that creates an anonymous file and returns a file descriptor that refers to it. guest_memfd files are bound to their owning virtual machine, cannot be mapped, read, or written by userspace, and cannot be resized. guest_memfd files do however support PUNCH_HOLE, which can be used to switch a memory area between guest_memfd and regular anonymous memory. - New ioctl KVM_SET_MEMORY_ATTRIBUTES allowing userspace to specify per-page attributes for a given page of guest memory; right now the only attribute is whether the guest expects to access memory via guest_memfd or not, which in Confidential SVMs backed by SEV-SNP, TDX or ARM64 pKVM is checked by firmware or hypervisor that guarantees confidentiality (AMD PSP, Intel TDX module, or EL2 in the case of pKVM). x86: - Support for "software-protected VMs" that can use the new guest_memfd and page attributes infrastructure. This is mostly useful for testing, since there is no pKVM-like infrastructure to provide a meaningfully reduced TCB. - Fix a relatively benign off-by-one error when splitting huge pages during CLEAR_DIRTY_LOG. - Fix a bug where KVM could incorrectly test-and-clear dirty bits in non-leaf TDP MMU SPTEs if a racing thread replaces a huge SPTE with a non-huge SPTE. - Use more generic lockdep assertions in paths that don't actually care about whether the caller is a reader or a writer. - let Xen guests opt out of having PV clock reported as "based on a stable TSC", because some of them don't expect the "TSC stable" bit (added to the pvclock ABI by KVM, but never set by Xen) to be set. - Revert a bogus, made-up nested SVM consistency check for TLB_CONTROL. - Advertise flush-by-ASID support for nSVM unconditionally, as KVM always flushes on nested transitions, i.e. always satisfies flush requests. This allows running bleeding edge versions of VMware Workstation on top of KVM. - Sanity check that the CPU supports flush-by-ASID when enabling SEV support. - On AMD machines with vNMI, always rely on hardware instead of intercepting IRET in some cases to detect unmasking of NMIs - Support for virtualizing Linear Address Masking (LAM) - Fix a variety of vPMU bugs where KVM fail to stop/reset counters and other state prior to refreshing the vPMU model. - Fix a double-overflow PMU bug by tracking emulated counter events using a dedicated field instead of snapshotting the "previous" counter. If the hardware PMC count triggers overflow that is recognized in the same VM-Exit that KVM manually bumps an event count, KVM would pend PMIs for both the hardware-triggered overflow and for KVM-triggered overflow. - Turn off KVM_WERROR by default for all configs so that it's not inadvertantly enabled by non-KVM developers, which can be problematic for subsystems that require no regressions for W=1 builds. - Advertise all of the host-supported CPUID bits that enumerate IA32_SPEC_CTRL "features". - Don't force a masterclock update when a vCPU synchronizes to the current TSC generation, as updating the masterclock can cause kvmclock's time to "jump" unexpectedly, e.g. when userspace hotplugs a pre-created vCPU. - Use RIP-relative address to read kvm_rebooting in the VM-Enter fault paths, partly as a super minor optimization, but mostly to make KVM play nice with position independent executable builds. - Guard KVM-on-HyperV's range-based TLB flush hooks with an #ifdef on CONFIG_HYPERV as a minor optimization, and to self-document the code. - Add CONFIG_KVM_HYPERV to allow disabling KVM support for HyperV "emulation" at build time. ARM64: - LPA2 support, adding 52bit IPA/PA capability for 4kB and 16kB base granule sizes. Branch shared with the arm64 tree. - Large Fine-Grained Trap rework, bringing some sanity to the feature, although there is more to come. This comes with a prefix branch shared with the arm64 tree. - Some additional Nested Virtualization groundwork, mostly introducing the NV2 VNCR support and retargetting the NV support to that version of the architecture. - A small set of vgic fixes and associated cleanups. Loongarch: - Optimization for memslot hugepage checking - Cleanup and fix some HW/SW timer issues - Add LSX/LASX (128bit/256bit SIMD) support RISC-V: - KVM_GET_REG_LIST improvement for vector registers - Generate ISA extension reg_list using macros in get-reg-list selftest - Support for reporting steal time along with selftest s390: - Bugfixes Selftests: - Fix an annoying goof where the NX hugepage test prints out garbage instead of the magic token needed to run the test. - Fix build errors when a header is delete/moved due to a missing flag in the Makefile. - Detect if KVM bugged/killed a selftest's VM and print out a helpful message instead of complaining that a random ioctl() failed. - Annotate the guest printf/assert helpers with __printf(), and fix the various bugs that were lurking due to lack of said annotation" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (185 commits) x86/kvm: Do not try to disable kvmclock if it was not enabled KVM: x86: add missing "depends on KVM" KVM: fix direction of dependency on MMU notifiers KVM: introduce CONFIG_KVM_COMMON KVM: arm64: Add missing memory barriers when switching to pKVM's hyp pgd KVM: arm64: vgic-its: Avoid potential UAF in LPI translation cache RISC-V: KVM: selftests: Add get-reg-list test for STA registers RISC-V: KVM: selftests: Add steal_time test support RISC-V: KVM: selftests: Add guest_sbi_probe_extension RISC-V: KVM: selftests: Move sbi_ecall to processor.c RISC-V: KVM: Implement SBI STA extension RISC-V: KVM: Add support for SBI STA registers RISC-V: KVM: Add support for SBI extension registers RISC-V: KVM: Add SBI STA info to vcpu_arch RISC-V: KVM: Add steal-update vcpu request RISC-V: KVM: Add SBI STA extension skeleton RISC-V: paravirt: Implement steal-time support RISC-V: Add SBI STA extension definitions RISC-V: paravirt: Add skeleton for pv-time support RISC-V: KVM: Fix indentation in kvm_riscv_vcpu_set_reg_csr() ... |
||
![]() |
78de91b458 |
LoongArch: Use generic interface to support crashkernel=X,[high,low]
LoongArch already supports two crashkernel regions in kexec-tools, so we
can directly use the common interface to support crashkernel=X,[high,low]
after commit
|
||
![]() |
c239665130 |
LoongArch: Fix and simplify fcsr initialization on execve()
There has been a lingering bug in LoongArch Linux systems causing some
GCC tests to intermittently fail (see Closes link). I've made a minimal
reproducer:
zsh% cat measure.s
.align 4
.globl _start
_start:
movfcsr2gr $a0, $fcsr0
bstrpick.w $a0, $a0, 16, 16
beqz $a0, .ok
break 0
.ok:
li.w $a7, 93
syscall 0
zsh% cc mesaure.s -o measure -nostdlib
zsh% echo $((1.0/3))
0.33333333333333331
zsh% while ./measure; do ; done
This while loop should not stop as POSIX is clear that execve must set
fenv to the default, where FCSR should be zero. But in fact it will
just stop after running for a while (normally less than 30 seconds).
Note that "$((1.0/3))" is needed to reproduce this issue because it
raises FE_INVALID and makes fcsr0 non-zero.
The problem is we are currently relying on SET_PERSONALITY2() to reset
current->thread.fpu.fcsr. But SET_PERSONALITY2() is executed before
start_thread which calls lose_fpu(0). We can see if kernel preempt is
enabled, we may switch to another thread after SET_PERSONALITY2() but
before lose_fpu(0). Then bad thing happens: during the thread switch
the value of the fcsr0 register is stored into current->thread.fpu.fcsr,
making it dirty again.
The issue can be fixed by setting current->thread.fpu.fcsr after
lose_fpu(0) because lose_fpu() clears TIF_USEDFPU, then the thread
switch won't touch current->thread.fpu.fcsr.
The only other architecture setting FCSR in SET_PERSONALITY2() is MIPS.
I've ran a similar test on MIPS with mainline kernel and it turns out
MIPS is buggy, too. Anyway MIPS do this for supporting different FP
flavors (NaN encodings, etc.) which do not exist on LoongArch. So for
LoongArch, we can simply remove the current->thread.fpu.fcsr setting
from SET_PERSONALITY2() and do it in start_thread(), after lose_fpu(0).
The while loop failing with the mainline kernel has survived one hour
after this change on LoongArch.
Fixes:
|
||
![]() |
ce68ff3528 |
LoongArch: Let cores_io_master cover the largest NR_CPUS
Now loongson_system_configuration::cores_io_master only covers 64 cpus, if NR_CPUS > 64 there will be memory corruption. So let cores_io_master cover the largest NR_CPUS (256). Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
d23b77953f |
LoongArch: Change SHMLBA from SZ_64K to PAGE_SIZE
LoongArch has hardware page coloring for L1 Cache, so we don't have cache aliases. But SFB (Store Fill Buffer) still has aliases. So we define SHMLBA to SZ_64K previously. But there are losts of applications use PAGE_SIZE rather than SHMLBA to mmap() file pages and shared pages. Of course we can fix them one by one, but not easy. On the other hand, we can simply disable SFB for 4KB page size to fix cache alias (there will be performance decrease, but acceptable), and in future we will fix SFB in hardware. So we can safely define SHMLBA to PAGE_SIZE (use the generic shmparam.h) to make life easier. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
9499daeade |
LoongArch: Add a missing call to efi_esrt_init()
ESRT (EFI System Resource Table) is needed for UEFI's "Capsule Update" feature. But ESRT initialization is missing on LoongArch now, so add a call to efi_esrt_init() at the end of efi_init(). Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
44a01f1f72 |
LoongArch: Parsing CPU-related information from DTS
Generally, we can get cpu-related information, such as model name, from /proc/cpuinfo. For FDT-based systems, we need to parse the relevant information from DTS. BTW, set loongson_sysconf.cores_per_package to num_processors if SMBIOS doesn't provide a valid number (usually FDT-based systems). Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Hongliang Wang <wanghongliang@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
5f346a6e59 |
LoongArch: Allow device trees be built into the kernel
During the upstream progress of those DT-based drivers, DT properties are changed a lot so very different from those in existing bootloaders. It is inevitably that some existing systems do not provide a standard, canonical device tree to the kernel at boot time. So let's provide a device tree table in the kernel, keyed by the dts filename, containing the relevant DTBs. We can use the built-in dts files as references. Each SoC has only one built-in dts file which describes all possible device information of that SoC, so the dts files are good examples during development. And as a reference, our built-in dts file only enables the most basic bootable combinations (so it is generic enough), acts as an alternative in case the dts in the bootloader is unexpected. Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
c299010061 |
asm-generic cleanups for 6.8
A series from Baoquan He cleans up the asm-generic/io.h to remove the ioremap_uc() definition from everything except x86, which still needs it for pre-PAT systems. This series notably contains a patch from Jiaxun Yang that converts MIPS to use asm-generic/io.h like every other architecture does, enabling future cleanups. Some of my own patches fix -Wmissing-prototype warnings in architecture specific code across several architectures. This is now needed as the warning is enabled by default. There are still some remaining warnings in minor platforms, but the series should catch most of the widely used ones make them more consistent with one another. David McKay fixes a bug in __generic_cmpxchg_local() when this is used on 64-bit architectures. This could currently only affect parisc64 and sparc64. Additional cleanups address from Linus Walleij, Uwe Kleine-König, Thomas Huth, and Kefeng Wang help reduce unnecessary inconsistencies between architectures. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmWeak8ACgkQYKtH/8kJ UidSiQ/+LL1WTO9d3Zx5HI0GGGjaIYpYs6jUNSf9Y5GPQiOrvjfEWj7CU11/4vxl GlQRpRyncYm8Eiz0Qu+aNxZFiiMah8Uful75yfbX8P1L4EPTbAYNDjkyNJrTjIAK jPK4sl8awIrapOeFUz++PsEj22R/4Is4f0mo+CqoCkL5RKlHe5oFdXzcwjmds4yK CvU6Ldn+M7FZ3EItMdjXaB3D3HS9uictFiO5JByZY8p+IcqgNRI/iHNnZIMsltJ+ XjDi0DG+x4jCj6teElSchw7AofE4OcNSP3xbR1PLKv6+xBLGYaAGZhNuPTz88eV/ Gj0loDQrrR5McGUfDBRHK9zN2Jd0O/FKnfh9kLOt1FLFyGPvC78Q/2HkpVCjbBr2 Pr1aqhLDHA+tGNSsThsV8RUa8/tiEnxAki43tfBFS3SEKhtQsTm2g1z4miwbE3p0 BJIrSgTqrP/SBq7a9z/thPrkzdZcNuA9FUETTbaMeUlJS51n1V9E5A1t7sOG7jaI vV/gbuR6FjvD49mTyQiOSCt3V4ygRqgN1Q+C4QM8WLqq2keUq0AhGodquv8F78in J3x2j2r27lHY7jKf8B0dua/JXAsF20u8qD6yDQ9ymkjt/MWhGXBgK0jpT7RTIuMS e2jmTywUVD4UohAcx3inkOojUhIJ5KDB0I4Pzv4zWcHNbyFNKcY= =4VQl -----END PGP SIGNATURE----- Merge tag 'asm-generic-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic cleanups from Arnd Bergmann: "A series from Baoquan He cleans up the asm-generic/io.h to remove the ioremap_uc() definition from everything except x86, which still needs it for pre-PAT systems. This series notably contains a patch from Jiaxun Yang that converts MIPS to use asm-generic/io.h like every other architecture does, enabling future cleanups. Some of my own patches fix -Wmissing-prototype warnings in architecture specific code across several architectures. This is now needed as the warning is enabled by default. There are still some remaining warnings in minor platforms, but the series should catch most of the widely used ones make them more consistent with one another. David McKay fixes a bug in __generic_cmpxchg_local() when this is used on 64-bit architectures. This could currently only affect parisc64 and sparc64. Additional cleanups address from Linus Walleij, Uwe Kleine-König, Thomas Huth, and Kefeng Wang help reduce unnecessary inconsistencies between architectures" * tag 'asm-generic-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: asm-generic: Fix 32 bit __generic_cmpxchg_local Hexagon: Make pfn accessors statics inlines ARC: mm: Make virt_to_pfn() a static inline mips: remove extraneous asm-generic/iomap.h include sparc: Use $(kecho) to announce kernel images being ready arm64: vdso32: Define BUILD_VDSO32_64 to correct prototypes csky: fix arch_jump_label_transform_static override arch: add do_page_fault prototypes arch: add missing prepare_ftrace_return() prototypes arch: vdso: consolidate gettime prototypes arch: include linux/cpu.h for trap_init() prototype arch: fix asm-offsets.c building with -Wmissing-prototypes arch: consolidate arch_irq_work_raise prototypes hexagon: Remove CONFIG_HEXAGON_ARCH_VERSION from uapi header asm/io: remove unnecessary xlate_dev_mem_ptr() and unxlate_dev_mem_ptr() mips: io: remove duplicated codes arch/*/io.h: remove ioremap_uc in some architectures mips: add <asm-generic/io.h> including |
||
![]() |
78273df7f6 |
header cleanups for 6.8
The goal is to get sched.h down to a type only header, so the main thing happening in this patchset is splitting out various _types.h headers and dependency fixups, as well as moving some things out of sched.h to better locations. This is prep work for the memory allocation profiling patchset which adds new sched.h interdepencencies. Testing - it's been in -next, and fixes from pretty much all architectures have percolated in - nothing major. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmWfBwwACgkQE6szbY3K bnZPwBAAmuRojXaeWxi01IPIOehSGDe68vw44PR9glEMZvxdnZuPOdvE4/+245/L bRKU2WBCjBUokUbV9msIShwRkFTZAmEMPNfPAAsFMA+VXeDYHKB+ZRdwTggNAQ+I SG6fZgh5m0HsewCDxU8oqVHkjVq4fXn0cy+aL6xLEd9gu67GoBzX2pDieS2Kvy6j jnyoKTxFwb+LTQgph0P4EIpq5I2umAsdLwdSR8EJ+8e9NiNvMo1pI00Lx/ntAnFZ JftWUJcMy3TQ5u1GkyfQN9y/yThX1bZK5GvmHS9SJ2Dkacaus5d+xaKCHtRuFS1I 7C6b8PsNgRczUMumBXus44HdlNfNs1yU3lvVxFvBIPE1qC9pYRHrkWIXXIocXLLC oxTEJ6B2G3BQZVQgLIA4fOaxMVhmvKffi/aEZLi9vN9VVosd1a6XNKI6KbyRnXFp GSs9qDqszhn5I3GYNlDNQTc/8UsRlhPFgS6nS0By6QnvxtGi9QkU2tBRBsXvqwCy cLoCYIhc2tvugHvld70dz26umiJ4rnmxGlobStNoigDvIKAIUt1UmIdr1so8P8eH xehnL9ZcOX6xnANDL0AqMFFHV6I58CJynhFdUoXfVQf/DWLGX48mpi9LVNsYBzsI CAwVOAQ0UjGrpdWmJ9ueY/ABYqg9vRjzaDEXQ+MhAYO55CLaVsg= =3tyT -----END PGP SIGNATURE----- Merge tag 'header_cleanup-2024-01-10' of https://evilpiepirate.org/git/bcachefs Pull header cleanups from Kent Overstreet: "The goal is to get sched.h down to a type only header, so the main thing happening in this patchset is splitting out various _types.h headers and dependency fixups, as well as moving some things out of sched.h to better locations. This is prep work for the memory allocation profiling patchset which adds new sched.h interdepencencies" * tag 'header_cleanup-2024-01-10' of https://evilpiepirate.org/git/bcachefs: (51 commits) Kill sched.h dependency on rcupdate.h kill unnecessary thread_info.h include Kill unnecessary kernel.h include preempt.h: Kill dependency on list.h rseq: Split out rseq.h from sched.h LoongArch: signal.c: add header file to fix build error restart_block: Trim includes lockdep: move held_lock to lockdep_types.h sem: Split out sem_types.h uidgid: Split out uidgid_types.h seccomp: Split out seccomp_types.h refcount: Split out refcount_types.h uapi/linux/resource.h: fix include x86/signal: kill dependency on time.h syscall_user_dispatch.h: split out *_types.h mm_types_task.h: Trim dependencies Split out irqflags_types.h ipc: Kill bogus dependency on spinlock.h shm: Slim down dependencies workqueue: Split out workqueue_types.h ... |
||
![]() |
a7e4c6cf5b |
EFI updates for v6.8
- Fix a syzbot reported issue in efivarfs where concurrent accesses to the file system resulted in list corruption - Add support for accessing EFI variables via the TEE subsystem (and a trusted application in the secure world) instead of via EFI runtime firmware running in the OS's execution context - Avoid linker tricks to discover the image base on LoongArch -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQQQm/3uucuRGn1Dmh0wbglWLn0tXAUCZYVaHQAKCRAwbglWLn0t XPm/AQDzX9A6TND00eOLYYWw91kybHnzrVd8GRKOv2EIxGz33AEAgW6nXIJlBRax MBq6S/sXdyknuCC3sO7H9FexdD4BzQM= =MZUx -----END PGP SIGNATURE----- Merge tag 'efi-next-for-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI updates from Ard Biesheuvel: - Fix a syzbot reported issue in efivarfs where concurrent accesses to the file system resulted in list corruption - Add support for accessing EFI variables via the TEE subsystem (and a trusted application in the secure world) instead of via EFI runtime firmware running in the OS's execution context - Avoid linker tricks to discover the image base on LoongArch * tag 'efi-next-for-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi: memmap: fix kernel-doc warnings efi/loongarch: Directly position the loaded image file efivarfs: automatically update super block flag efi: Add tee-based EFI variable driver efi: Add EFI_ACCESS_DENIED status code efi: expose efivar generic ops register function efivarfs: Move efivarfs list into superblock s_fs_info efivarfs: Free s_fs_info on unmount efivarfs: Move efivar availability check into FS context init efivarfs: force RO when remounting if SetVariable is not supported |
||
![]() |
fb46e22a9e |
Many singleton patches against the MM code. The patch series which
are included in this merge do the following: - Peng Zhang has done some mapletree maintainance work in the series "maple_tree: add mt_free_one() and mt_attr() helpers" "Some cleanups of maple tree" - In the series "mm: use memmap_on_memory semantics for dax/kmem" Vishal Verma has altered the interworking between memory-hotplug and dax/kmem so that newly added 'device memory' can more easily have its memmap placed within that newly added memory. - Matthew Wilcox continues folio-related work (including a few fixes) in the patch series "Add folio_zero_tail() and folio_fill_tail()" "Make folio_start_writeback return void" "Fix fault handler's handling of poisoned tail pages" "Convert aops->error_remove_page to ->error_remove_folio" "Finish two folio conversions" "More swap folio conversions" - Kefeng Wang has also contributed folio-related work in the series "mm: cleanup and use more folio in page fault" - Jim Cromie has improved the kmemleak reporting output in the series "tweak kmemleak report format". - In the series "stackdepot: allow evicting stack traces" Andrey Konovalov to permits clients (in this case KASAN) to cause eviction of no longer needed stack traces. - Charan Teja Kalla has fixed some accounting issues in the page allocator's atomic reserve calculations in the series "mm: page_alloc: fixes for high atomic reserve caluculations". - Dmitry Rokosov has added to the samples/ dorectory some sample code for a userspace memcg event listener application. See the series "samples: introduce cgroup events listeners". - Some mapletree maintanance work from Liam Howlett in the series "maple_tree: iterator state changes". - Nhat Pham has improved zswap's approach to writeback in the series "workload-specific and memory pressure-driven zswap writeback". - DAMON/DAMOS feature and maintenance work from SeongJae Park in the series "mm/damon: let users feed and tame/auto-tune DAMOS" "selftests/damon: add Python-written DAMON functionality tests" "mm/damon: misc updates for 6.8" - Yosry Ahmed has improved memcg's stats flushing in the series "mm: memcg: subtree stats flushing and thresholds". - In the series "Multi-size THP for anonymous memory" Ryan Roberts has added a runtime opt-in feature to transparent hugepages which improves performance by allocating larger chunks of memory during anonymous page faults. - Matthew Wilcox has also contributed some cleanup and maintenance work against eh buffer_head code int he series "More buffer_head cleanups". - Suren Baghdasaryan has done work on Andrea Arcangeli's series "userfaultfd move option". UFFDIO_MOVE permits userspace heap compaction algorithms to move userspace's pages around rather than UFFDIO_COPY'a alloc/copy/free. - Stefan Roesch has developed a "KSM Advisor", in the series "mm/ksm: Add ksm advisor". This is a governor which tunes KSM's scanning aggressiveness in response to userspace's current needs. - Chengming Zhou has optimized zswap's temporary working memory use in the series "mm/zswap: dstmem reuse optimizations and cleanups". - Matthew Wilcox has performed some maintenance work on the writeback code, both code and within filesystems. The series is "Clean up the writeback paths". - Andrey Konovalov has optimized KASAN's handling of alloc and free stack traces for secondary-level allocators, in the series "kasan: save mempool stack traces". - Andrey also performed some KASAN maintenance work in the series "kasan: assorted clean-ups". - David Hildenbrand has gone to town on the rmap code. Cleanups, more pte batching, folio conversions and more. See the series "mm/rmap: interface overhaul". - Kinsey Ho has contributed some maintenance work on the MGLRU code in the series "mm/mglru: Kconfig cleanup". - Matthew Wilcox has contributed lruvec page accounting code cleanups in the series "Remove some lruvec page accounting functions". -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZZyF2wAKCRDdBJ7gKXxA jjWjAP42LHvGSjp5M+Rs2rKFL0daBQsrlvy6/jCHUequSdWjSgEAmOx7bc5fbF27 Oa8+DxGM9C+fwqZ/7YxU2w/WuUmLPgU= =0NHs -----END PGP SIGNATURE----- Merge tag 'mm-stable-2024-01-08-15-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Many singleton patches against the MM code. The patch series which are included in this merge do the following: - Peng Zhang has done some mapletree maintainance work in the series 'maple_tree: add mt_free_one() and mt_attr() helpers' 'Some cleanups of maple tree' - In the series 'mm: use memmap_on_memory semantics for dax/kmem' Vishal Verma has altered the interworking between memory-hotplug and dax/kmem so that newly added 'device memory' can more easily have its memmap placed within that newly added memory. - Matthew Wilcox continues folio-related work (including a few fixes) in the patch series 'Add folio_zero_tail() and folio_fill_tail()' 'Make folio_start_writeback return void' 'Fix fault handler's handling of poisoned tail pages' 'Convert aops->error_remove_page to ->error_remove_folio' 'Finish two folio conversions' 'More swap folio conversions' - Kefeng Wang has also contributed folio-related work in the series 'mm: cleanup and use more folio in page fault' - Jim Cromie has improved the kmemleak reporting output in the series 'tweak kmemleak report format'. - In the series 'stackdepot: allow evicting stack traces' Andrey Konovalov to permits clients (in this case KASAN) to cause eviction of no longer needed stack traces. - Charan Teja Kalla has fixed some accounting issues in the page allocator's atomic reserve calculations in the series 'mm: page_alloc: fixes for high atomic reserve caluculations'. - Dmitry Rokosov has added to the samples/ dorectory some sample code for a userspace memcg event listener application. See the series 'samples: introduce cgroup events listeners'. - Some mapletree maintanance work from Liam Howlett in the series 'maple_tree: iterator state changes'. - Nhat Pham has improved zswap's approach to writeback in the series 'workload-specific and memory pressure-driven zswap writeback'. - DAMON/DAMOS feature and maintenance work from SeongJae Park in the series 'mm/damon: let users feed and tame/auto-tune DAMOS' 'selftests/damon: add Python-written DAMON functionality tests' 'mm/damon: misc updates for 6.8' - Yosry Ahmed has improved memcg's stats flushing in the series 'mm: memcg: subtree stats flushing and thresholds'. - In the series 'Multi-size THP for anonymous memory' Ryan Roberts has added a runtime opt-in feature to transparent hugepages which improves performance by allocating larger chunks of memory during anonymous page faults. - Matthew Wilcox has also contributed some cleanup and maintenance work against eh buffer_head code int he series 'More buffer_head cleanups'. - Suren Baghdasaryan has done work on Andrea Arcangeli's series 'userfaultfd move option'. UFFDIO_MOVE permits userspace heap compaction algorithms to move userspace's pages around rather than UFFDIO_COPY'a alloc/copy/free. - Stefan Roesch has developed a 'KSM Advisor', in the series 'mm/ksm: Add ksm advisor'. This is a governor which tunes KSM's scanning aggressiveness in response to userspace's current needs. - Chengming Zhou has optimized zswap's temporary working memory use in the series 'mm/zswap: dstmem reuse optimizations and cleanups'. - Matthew Wilcox has performed some maintenance work on the writeback code, both code and within filesystems. The series is 'Clean up the writeback paths'. - Andrey Konovalov has optimized KASAN's handling of alloc and free stack traces for secondary-level allocators, in the series 'kasan: save mempool stack traces'. - Andrey also performed some KASAN maintenance work in the series 'kasan: assorted clean-ups'. - David Hildenbrand has gone to town on the rmap code. Cleanups, more pte batching, folio conversions and more. See the series 'mm/rmap: interface overhaul'. - Kinsey Ho has contributed some maintenance work on the MGLRU code in the series 'mm/mglru: Kconfig cleanup'. - Matthew Wilcox has contributed lruvec page accounting code cleanups in the series 'Remove some lruvec page accounting functions'" * tag 'mm-stable-2024-01-08-15-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (361 commits) mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER mm, treewide: introduce NR_PAGE_ORDERS selftests/mm: add separate UFFDIO_MOVE test for PMD splitting selftests/mm: skip test if application doesn't has root privileges selftests/mm: conform test to TAP format output selftests: mm: hugepage-mmap: conform to TAP format output selftests/mm: gup_test: conform test to TAP format output mm/selftests: hugepage-mremap: conform test to TAP format output mm/vmstat: move pgdemote_* out of CONFIG_NUMA_BALANCING mm: zsmalloc: return -ENOSPC rather than -EINVAL in zs_malloc while size is too large mm/memcontrol: remove __mod_lruvec_page_state() mm/khugepaged: use a folio more in collapse_file() slub: use a folio in __kmalloc_large_node slub: use folio APIs in free_large_kmalloc() slub: use alloc_pages_node() in alloc_slab_page() mm: remove inc/dec lruvec page state functions mm: ratelimit stat flush from workingset shrinker kasan: stop leaking stack trace handles mm/mglru: remove CONFIG_TRANSPARENT_HUGEPAGE mm/mglru: add dummy pmd_dirty() ... |
||
![]() |
c968b99f86 |
LoongArch: signal.c: add header file to fix build error
loongarch's signal.c uses rseq_signal_deliver() so it should
pull in the appropriate header to prevent a build error:
../arch/loongarch/kernel/signal.c: In function 'handle_signal':
../arch/loongarch/kernel/signal.c:1034:9: error: implicit declaration of function 'rseq_signal_deliver' [-Werror=implicit-function-declaration]
1034 | rseq_signal_deliver(ksig, regs);
| ^~~~~~~~~~~~~~~~~~~
Fixes:
|
||
![]() |
a721aeac8b | sync mm-stable with mm-hotfixes-stable to pick up depended-upon changes | ||
![]() |
174a0c565c |
efi/loongarch: Directly position the loaded image file
The use of the 'kernel_offset' variable to position the image file that has been loaded by UEFI or GRUB is unnecessary, because we can directly position the loaded image file through using the image_base field of the efi_loaded_image struct provided by UEFI. Replace kernel_offset with image_base to position the image file that has been loaded by UEFI or GRUB. Signed-off-by: Wang Yao <wangyao@lemote.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> |
||
![]() |
118e10cd89 |
LoongArch: KVM: Add LASX (256bit SIMD) support
This patch adds LASX (256bit SIMD) support for LoongArch KVM. There will be LASX exception in KVM when guest use the LASX instructions. KVM will enable LASX and restore the vector registers for guest and then return to guest to continue running. Reviewed-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
db1ecca22e |
LoongArch: KVM: Add LSX (128bit SIMD) support
This patch adds LSX (128bit SIMD) support for LoongArch KVM. There will be LSX exception in KVM when guest use the LSX instructions. KVM will enable LSX and restore the vector registers for guest and then return to guest to continue running. Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
a62aa88ba1 |
17 hotfixes. 8 are cc:stable and the other 9 pertain to post-6.6 issues.
-----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZXxs8wAKCRDdBJ7gKXxA junbAQCdItfHHinkWziciOrb0387wW+5WZ1ohqRFW8pGYLuasQEArpKmw13bvX7z e+ec9K1Ek9MlIsO2RwORR4KHH4MAbwA= =YpZh -----END PGP SIGNATURE----- Merge tag 'mm-hotfixes-stable-2023-12-15-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "17 hotfixes. 8 are cc:stable and the other 9 pertain to post-6.6 issues" * tag 'mm-hotfixes-stable-2023-12-15-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm/mglru: reclaim offlined memcgs harder mm/mglru: respect min_ttl_ms with memcgs mm/mglru: try to stop at high watermarks mm/mglru: fix underprotected page cache mm/shmem: fix race in shmem_undo_range w/THP Revert "selftests: error out if kernel header files are not yet built" crash_core: fix the check for whether crashkernel is from high memory x86, kexec: fix the wrong ifdeffery CONFIG_KEXEC sh, kexec: fix the incorrect ifdeffery and dependency of CONFIG_KEXEC mips, kexec: fix the incorrect ifdeffery and dependency of CONFIG_KEXEC m68k, kexec: fix the incorrect ifdeffery and build dependency of CONFIG_KEXEC loongarch, kexec: change dependency of object files mm/damon/core: make damon_start() waits until kdamond_fn() starts selftests/mm: cow: print ksft header before printing anything else mm: fix VMA heap bounds checking riscv: fix VMALLOC_START definition kexec: drop dependency on ARCH_SUPPORTS_KEXEC from CRASH_DUMP |
||
![]() |
655fc6cd45 |
loongarch, kexec: change dependency of object files
Patch series "kexec: fix the incorrect ifdeffery and dependency of CONFIG_KEXEC". The select of KEXEC for CRASH_DUMP in kernel/Kconfig.kexec will be dropped, then compiling errors will be triggered if below config items are set: === CONFIG_CRASH_CORE=y CONFIG_KEXEC_CORE=y CONFIG_CRASH_DUMP=y === E.g on mips, below link error are seen: -------------------------------------------------------------------- mipsel-linux-ld: kernel/kexec_core.o: in function `kimage_free': kernel/kexec_core.c:(.text+0x2200): undefined reference to `machine_kexec_cleanup' mipsel-linux-ld: kernel/kexec_core.o: in function `__crash_kexec': kernel/kexec_core.c:(.text+0x2480): undefined reference to `machine_crash_shutdown' mipsel-linux-ld: kernel/kexec_core.c:(.text+0x2488): undefined reference to `machine_kexec' mipsel-linux-ld: kernel/kexec_core.o: in function `kernel_kexec': kernel/kexec_core.c:(.text+0x29b8): undefined reference to `machine_shutdown' mipsel-linux-ld: kernel/kexec_core.c:(.text+0x29c0): undefined reference to `machine_kexec' -------------------------------------------------------------------- Here, change the incorrect dependency of building kexec_core related object files, and the ifdeffery on architectures from CONFIG_KEXEC to CONFIG_KEXEC_CORE. Testing: ======== Passed on mips and loognarch with the LKP reproducer. This patch (of 5): Currently, in arch/loongarch/kernel/Makefile, building machine_kexec.o relocate_kernel.o depends on CONFIG_KEXEC. Whereas, since we will drop the select of KEXEC for CRASH_DUMP in kernel/Kconfig.kexec, compiling error will be triggered if below config items are set: === CONFIG_CRASH_CORE=y CONFIG_KEXEC_CORE=y CONFIG_CRASH_DUMP=y === --------------------------------------------------------------- loongarch64-linux-ld: kernel/kexec_core.o: in function `.L209': >> kexec_core.c:(.text+0x1660): undefined reference to `machine_kexec_cleanup' loongarch64-linux-ld: kernel/kexec_core.o: in function `.L287': >> kexec_core.c:(.text+0x1c5c): undefined reference to `machine_crash_shutdown' >> loongarch64-linux-ld: kexec_core.c:(.text+0x1c64): undefined reference to `machine_kexec' loongarch64-linux-ld: kernel/kexec_core.o: in function `.L2^B5': >> kexec_core.c:(.text+0x2090): undefined reference to `machine_shutdown' loongarch64-linux-ld: kexec_core.c:(.text+0x20a0): undefined reference to `machine_kexec' --------------------------------------------------------------- Here, change the dependency of machine_kexec.o relocate_kernel.o to CONFIG_KEXEC_CORE can fix above building error. Link: https://lkml.kernel.org/r/20231208073036.7884-1-bhe@redhat.com Link: https://lkml.kernel.org/r/20231208073036.7884-2-bhe@redhat.com Signed-off-by: Baoquan He <bhe@redhat.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202311300946.kHE9Iu71-lkp@intel.com/ Cc: Eric DeVolder <eric_devolder@yahoo.com> Cc: Ignat Korchagin <ignat@cloudflare.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
![]() |
ff6c3d81f2 |
NUMA: optimize detection of memory with no node id assigned by firmware
Sanity check that makes sure the nodes cover all memory loops over numa_meminfo to count the pages that have node id assigned by the firmware, then loops again over memblock.memory to find the total amount of memory and in the end checks that the difference between the total memory and memory that covered by nodes is less than some threshold. Worse, the loop over numa_meminfo calls __absent_pages_in_range() that also partially traverses memblock.memory. It's much simpler and more efficient to have a single traversal of memblock.memory that verifies that amount of memory not covered by nodes is less than a threshold. Introduce memblock_validate_numa_coverage() that does exactly that and use it instead of numa_meminfo_cover_memory(). Link: https://lkml.kernel.org/r/20231026020329.327329-1-zhiguangni01@gmail.com Signed-off-by: Liam Ni <zhiguangni01@gmail.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Bibo Mao <maobibo@loongson.cn> Cc: Binbin Zhou <zhoubinbin@loongson.cn> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Feiyang Chen <chenfeiyang@loongson.cn> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: WANG Xuerui <kernel@xen0n.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
![]() |
97ceddbc94 |
LoongArch: Set unwind stack type to unknown rather than set error flag
During unwinding, unwind_done() is used as an end condition. Normally it unwind to the user stack and then set the stack type to unknown, which is a normal exit. When something unexpected happens in unwind process and we cannot unwind anymore, we should set the error flag, and also set the stack type to unknown to indicate that the unwind process can not continue. The error flag emphasizes that the unwind process produce an unexpected error. There is no unexpected things when we unwind the PT_REGS in the top of IRQ stack and find out that is an user mode PT_REGS. Thus, we should not set error flag and just set stack type to unknown. Reported-by: Hengqi Chen <hengqi.chen@gmail.com> Acked-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
13f9f0361c |
LoongArch: convert to use arch_cpu_is_hotpluggable()
Convert loongarch to use the arch_cpu_is_hotpluggable() helper rather than arch_register_cpu(). Also remove the export as nothing should be using arch_register_cpu() outside of the core kernel/acpi code. Reviewed-by: Gavin Shan <gshan@redhat.com> Signed-off-by: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/E1r5R4B-00Ct0G-Kk@rmk-PC.armlinux.org.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
![]() |
0d122fb600 |
LoongArch: Use the __weak version of arch_unregister_cpu()
LoongArch provides its own arch_unregister_cpu(). This clears the hotpluggable flag, then unregisters the CPU. It isn't necessary to clear the hotpluggable flag when unregistering a cpu. unregister_cpu() writes NULL to the percpu cpu_sys_devices pointer, meaning cpu_is_hotpluggable() will return false, as get_cpu_device() has returned NULL. Remove arch_unregister_cpu() and use the __weak version. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Signed-off-by: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/E1r5R46-00Ct0A-GJ@rmk-PC.armlinux.org.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
![]() |
db3ba29a83 |
LoongArch: Switch over to GENERIC_CPU_DEVICES
Now that GENERIC_CPU_DEVICES calls arch_register_cpu(), which can be overridden by the arch code, switch over to this to allow common code to choose when the register_cpu() call is made. This allows topology_init() to be removed. This is an intermediate step to the logic being moved to drivers/acpi, where GENERIC_CPU_DEVICES will do the work when booting with acpi=off. This is a subtle change. Originally: - on boot, topology_init() would have marked present CPUs that io_master() is true for as hotplug-incapable. - if a CPU is hotplugged that is an io_master(), it can later be hot-unplugged. The new behaviour is that any CPU that io_master() is true for will now always be marked as hotplug-incapable, thus even if it was hotplugged, it can no longer be hot-unplugged. This patch also has the effect of moving the registration of CPUs from subsys to driver core initialisation, prior to any initcalls running. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Signed-off-by: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/E1r5R41-00Ct04-Bg@rmk-PC.armlinux.org.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
![]() |
29d93102fd |
Loongarch: remove arch_*register_cpu() exports
arch_register_cpu() and arch_unregister_cpu() are not used by anything that can be a module - they are used by drivers/base/cpu.c and drivers/acpi/acpi_processor.c, neither of which can be a module. Remove the exports. Reviewed-by: Gavin Shan <gshan@redhat.com> Signed-off-by: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/E1r5R2w-00Csyn-E2@rmk-PC.armlinux.org.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
![]() |
4d86896793 |
arch: fix asm-offsets.c building with -Wmissing-prototypes
When -Wmissing-prototypes is enabled, the some asm-offsets.c files fail to build, even when this warning is disabled in the Makefile for normal files: arch/sparc/kernel/asm-offsets.c:22:5: error: no previous prototype for 'sparc32_foo' [-Werror=missing-prototypes] arch/sparc/kernel/asm-offsets.c:48:5: error: no previous prototype for 'foo' [-Werror=missing-prototypes] Address this by making use of the same trick as x86, marking these functions as 'static __used' to avoid the need for a prototype by not drop them in dead-code elimination. Suggested-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Link: https://lore.kernel.org/lkml/CAK7LNARfEmFk0Du4Hed19eX_G6tUC5wG0zP+L1AyvdpOF4ybXQ@mail.gmail.com/ Signed-off-by: Arnd Bergmann <arnd@arndb.de> |
||
![]() |
d43f37b734 |
LoongArch: Implement constant timer shutdown interface
When a cpu is hot-unplugged, it is put in idle state and the function arch_cpu_idle_dead() is called. The timer interrupt for this processor should be disabled, otherwise there will be pending timer interrupt for the unplugged cpu, so that vcpu is prevented from giving up scheduling when system is running in vm mode. This patch implements the timer shutdown interface so that the constant timer will be properly disabled when a CPU is hot-unplugged. Reviewed-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
902d75cdf0 |
LoongArch: Silence the boot warning about 'nokaslr'
The kernel parameter 'nokaslr' is handled before start_kernel(), so we don't need early_param() to mark it technically. But it can cause a boot warning as follows: Unknown kernel command line parameters "nokaslr", will be passed to user space. When we use 'init=/bin/bash', 'nokaslr' which passed to user space will even cause a kernel panic. So we use early_param() to mark 'nokaslr', simply print a notice and silence the boot warning (also fix a potential panic). This logic is similar to RISC-V. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
aa0cbc1b50 |
LoongArch: Record pc instead of offset in la_abs relocation
To clarify, the previous version functioned flawlessly. However, it's worth noting that the LLVM's LoongArch backend currently lacks support for cross-section label calculations. With this patch, we enable the use of clang to compile relocatable kernels. Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: WANG Rui <wangrui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
4eeee6636a |
LoongArch changes for v6.7
1, Support PREEMPT_DYNAMIC with static keys; 2, Relax memory ordering for atomic operations; 3, Support BPF CPU v4 instructions for LoongArch; 4, Some build and runtime warning fixes. -----BEGIN PGP SIGNATURE----- iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmVQWXgWHGNoZW5odWFj YWlAa2VybmVsLm9yZwAKCRAChivD8uImepDTEACS808EsgSNIM1+JwldhdqKOErt XDWlLuIddVpenInx8F+9GnZJzKBU+wl+Ow5ejcVarjcecIJDv5UhoVrbhpeOHkfv RszRXQR4p/ZNSFvdraYDjjJ9UX6bp5rq7vMUC2d9bLazMauAfwf7T/HJ5qj9OYZi RLlcwaKo2UQHYsT7nJicjh0qpH1YpZQBYTaUUCwzilzB6vAIOTf6X12vFmhtM/i+ 5RIPnesMA1IQSm2ywUODpDHCs7Pirvy8aJvx0CsYdi3xl1yg3pUS6u69Ms61uWlw 29yYhNbWmVnDikTVLTNISDb/jwto5SAVB2KQKBhF1trF4ZBNE6r7sP4m2tfllYo9 KXK9tm0U8McS5o46Qd5er6eEnxL7mEeAsc12tNKUYOMe3SIkmHJmj/rZQOtpsiBg zqQsYkGUfO2VAwMWiGke8dxPZElOYwZ3UCOpbEpXEXy3NW71VJTIuQFGmsYKJhdy 3xaAtQxdffE5yUTt2j3Y8Mex2b2oSUBSF263imsZjzWOOxd480iaoejtamf1V779 bElevzZjMDmbiQ7kiVSf96TWc7iYcSv33jhP4DorKIqnPseYPfrXEeD1xY7JV+IU kkvSlO0hAJzVMmQgu5n0PPT1wrVpuvwtbsfcRobIkr1vktZyLaKHRq7rh4R5HTRL ZUUm6c0kUDywGT+J4A== =bmFe -----END PGP SIGNATURE----- Merge tag 'loongarch-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch updates from Huacai Chen: - support PREEMPT_DYNAMIC with static keys - relax memory ordering for atomic operations - support BPF CPU v4 instructions for LoongArch - some build and runtime warning fixes * tag 'loongarch-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: selftests/bpf: Enable cpu v4 tests for LoongArch LoongArch: BPF: Support signed mod instructions LoongArch: BPF: Support signed div instructions LoongArch: BPF: Support 32-bit offset jmp instructions LoongArch: BPF: Support unconditional bswap instructions LoongArch: BPF: Support sign-extension mov instructions LoongArch: BPF: Support sign-extension load instructions LoongArch: Add more instruction opcodes and emit_* helpers LoongArch/smp: Call rcutree_report_cpu_starting() earlier LoongArch: Relax memory ordering for atomic operations LoongArch: Mark __percpu functions as always inline LoongArch: Disable module from accessing external data directly LoongArch: Support PREEMPT_DYNAMIC with static keys |
||
![]() |
a2ccf46333 |
LoongArch/smp: Call rcutree_report_cpu_starting() earlier
rcutree_report_cpu_starting() must be called before cpu_probe() to avoid the following lockdep splat that triggered by calling __alloc_pages() when CONFIG_PROVE_RCU_LIST=y: ============================= WARNING: suspicious RCU usage 6.6.0+ #980 Not tainted ----------------------------- kernel/locking/lockdep.c:3761 RCU-list traversed in non-reader section!! other info that might help us debug this: RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 1 1 lock held by swapper/1/0: #0: 900000000c82ef98 (&pcp->lock){+.+.}-{2:2}, at: get_page_from_freelist+0x894/0x1790 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.6.0+ #980 Stack : 0000000000000001 9000000004f79508 9000000004893670 9000000100310000 90000001003137d0 0000000000000000 90000001003137d8 9000000004f79508 0000000000000000 0000000000000001 0000000000000000 90000000048a3384 203a656d616e2065 ca43677b3687e616 90000001002c3480 0000000000000008 000000000000009d 0000000000000000 0000000000000001 80000000ffffe0b8 000000000000000d 0000000000000033 0000000007ec0000 13bbf50562dad831 9000000005140748 0000000000000000 9000000004f79508 0000000000000004 0000000000000000 9000000005140748 90000001002bad40 0000000000000000 90000001002ba400 0000000000000000 9000000003573ec8 0000000000000000 00000000000000b0 0000000000000004 0000000000000000 0000000000070000 ... Call Trace: [<9000000003573ec8>] show_stack+0x38/0x150 [<9000000004893670>] dump_stack_lvl+0x74/0xa8 [<900000000360d2bc>] lockdep_rcu_suspicious+0x14c/0x190 [<900000000361235c>] __lock_acquire+0xd0c/0x2740 [<90000000036146f4>] lock_acquire+0x104/0x2c0 [<90000000048a955c>] _raw_spin_lock_irqsave+0x5c/0x90 [<900000000381cd5c>] rmqueue_bulk+0x6c/0x950 [<900000000381fc0c>] get_page_from_freelist+0xd4c/0x1790 [<9000000003821c6c>] __alloc_pages+0x1bc/0x3e0 [<9000000003583b40>] tlb_init+0x150/0x2a0 [<90000000035742a0>] per_cpu_trap_init+0xf0/0x110 [<90000000035712fc>] cpu_probe+0x3dc/0x7a0 [<900000000357ed20>] start_secondary+0x40/0xb0 [<9000000004897138>] smpboot_entry+0x54/0x58 raw_smp_processor_id() is required in order to avoid calling into lockdep before RCU has declared the CPU to be watched for readers. See also commit |
||
![]() |
1f24458a10 |
TTY/Serial changes for 6.7-rc1
Here is the big set of tty/serial driver changes for 6.7-rc1. Included in here are: - console/vgacon cleanups and removals from Arnd - tty core and n_tty cleanups from Jiri - lots of 8250 driver updates and cleanups - sc16is7xx serial driver updates - dt binding updates - first set of port lock wrapers from Thomas for the printk fixes coming in future releases - other small serial and tty core cleanups and updates All of these have been in linux-next for a while with no reported issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZUTbaw8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+yk9+gCeKdoRb8FDwGCO/GaoHwR4EzwQXhQAoKXZRmN5 LTtw9sbfGIiBdOTtgLPb =6PJr -----END PGP SIGNATURE----- Merge tag 'tty-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty and serial updates from Greg KH: "Here is the big set of tty/serial driver changes for 6.7-rc1. Included in here are: - console/vgacon cleanups and removals from Arnd - tty core and n_tty cleanups from Jiri - lots of 8250 driver updates and cleanups - sc16is7xx serial driver updates - dt binding updates - first set of port lock wrapers from Thomas for the printk fixes coming in future releases - other small serial and tty core cleanups and updates All of these have been in linux-next for a while with no reported issues" * tag 'tty-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (193 commits) serdev: Replace custom code with device_match_acpi_handle() serdev: Simplify devm_serdev_device_open() function serdev: Make use of device_set_node() tty: n_gsm: add copyright Siemens Mobility GmbH tty: n_gsm: fix race condition in status line change on dead connections serial: core: Fix runtime PM handling for pending tx vgacon: fix mips/sibyte build regression dt-bindings: serial: drop unsupported samsung bindings tty: serial: samsung: drop earlycon support for unsupported platforms tty: 8250: Add note for PX-835 tty: 8250: Fix IS-200 PCI ID comment tty: 8250: Add Brainboxes Oxford Semiconductor-based quirks tty: 8250: Add support for Intashield IX cards tty: 8250: Add support for additional Brainboxes PX cards tty: 8250: Fix up PX-803/PX-857 tty: 8250: Fix port count of PX-257 tty: 8250: Add support for Intashield IS-100 tty: 8250: Add support for Brainboxes UP cards tty: 8250: Add support for additional Brainboxes UC cards tty: 8250: Remove UC-257 and UC-431 ... |
||
![]() |
8f6f76a6a2 |
As usual, lots of singleton and doubleton patches all over the tree and
there's little I can say which isn't in the individual changelogs. The lengthier patch series are - "kdump: use generic functions to simplify crashkernel reservation in arch", from Baoquan He. This is mainly cleanups and consolidation of the "crashkernel=" kernel parameter handling. - After much discussion, David Laight's "minmax: Relax type checks in min() and max()" is here. Hopefully reduces some typecasting and the use of min_t() and max_t(). - A group of patches from Oleg Nesterov which clean up and slightly fix our handling of reads from /proc/PID/task/... and which remove task_struct.therad_group. -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZUQP9wAKCRDdBJ7gKXxA jmOAAQDh8sxagQYocoVsSm28ICqXFeaY9Co1jzBIDdNesAvYVwD/c2DHRqJHEiS4 63BNcG3+hM9nwGJHb5lyh5m79nBMRg0= =On4u -----END PGP SIGNATURE----- Merge tag 'mm-nonmm-stable-2023-11-02-14-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: "As usual, lots of singleton and doubleton patches all over the tree and there's little I can say which isn't in the individual changelogs. The lengthier patch series are - 'kdump: use generic functions to simplify crashkernel reservation in arch', from Baoquan He. This is mainly cleanups and consolidation of the 'crashkernel=' kernel parameter handling - After much discussion, David Laight's 'minmax: Relax type checks in min() and max()' is here. Hopefully reduces some typecasting and the use of min_t() and max_t() - A group of patches from Oleg Nesterov which clean up and slightly fix our handling of reads from /proc/PID/task/... and which remove task_struct.thread_group" * tag 'mm-nonmm-stable-2023-11-02-14-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (64 commits) scripts/gdb/vmalloc: disable on no-MMU scripts/gdb: fix usage of MOD_TEXT not defined when CONFIG_MODULES=n .mailmap: add address mapping for Tomeu Vizoso mailmap: update email address for Claudiu Beznea tools/testing/selftests/mm/run_vmtests.sh: lower the ptrace permissions .mailmap: map Benjamin Poirier's address scripts/gdb: add lx_current support for riscv ocfs2: fix a spelling typo in comment proc: test ProtectionKey in proc-empty-vm test proc: fix proc-empty-vm test with vsyscall fs/proc/base.c: remove unneeded semicolon do_io_accounting: use sig->stats_lock do_io_accounting: use __for_each_thread() ocfs2: replace BUG_ON() at ocfs2_num_free_extents() with ocfs2_error() ocfs2: fix a typo in a comment scripts/show_delta: add __main__ judgement before main code treewide: mark stuff as __ro_after_init fs: ocfs2: check status values proc: test /proc/${pid}/statm compiler.h: move __is_constexpr() to compiler.h ... |
||
![]() |
ef12ea629e |
LoongArch KVM changes for v6.7
Add LoongArch's KVM support. Loongson 3A5000/3A6000 supports hardware assisted virtualization. With cpu virtualization, there are separate hw-supported user mode and kernel mode in guest mode. With memory virtualization, there are two-level hw mmu table for guest mode and host mode. Also there is separate hw cpu timer with consant frequency in guest mode, so that vm can migrate between hosts with different freq. Currently, we are able to boot LoongArch Linux Guests. Few key aspects of KVM LoongArch added by this series are: 1. Enable kvm hardware function when kvm module is loaded. 2. Implement VM and vcpu related ioctl interface such as vcpu create, vcpu run etc. GET_ONE_REG/SET_ONE_REG ioctl commands are use to get general registers one by one. 3. Hardware access about MMU, timer and csr are emulated in kernel. 4. Hardwares such as mmio and iocsr device are emulated in user space such as IPI, irqchips, pci devices etc. -----BEGIN PGP SIGNATURE----- iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmUaKb4WHGNoZW5odWFj YWlAa2VybmVsLm9yZwAKCRAChivD8uImeuj+EACFIsWmHb8I3q4feviWBIdRcve7 xzpseO/r2Xfx5IdU6/iAKIW1YNVpHU3c8+nYS20K73uGJnMJnAZ/hmPFf+0pJ4AB qZL4aBClRynH4YsdGUeYwBfU7VMKg66ijiaZkqFEa7lSePA81gwYY2MYao58J7Gw qMe9IPnxcU/a+UqxTrFs2/G4aSb4SR0cp69GFSIGXZ7uKBc4LCMZ0ujzo/NY/V4G NkoJi8I85frF1OQbJaeJ/lGkC1Dx3Qbv7AbXYObA8K/ka1c5lr9RKraha2BOFPif mHYDa7lE6qj6e5LX9QVcpkuDnxs2yaA0Zny/p2I2o7AS0xUU/eFGGdnu1bkMqrsw 42ZuKL5Dtp/Wl5W2EEIkCVmZar+3UOeiWwGXPTk4A4fQHEGayXzJvrdUa/7f+O6Y JWYKSVjo6ZHg0ArKFtA13N54KximsrOKTpaB4fuUIP5SXMB/+JVn77/93Oyq6XSY 9+7yUTzjeMqz+4o8DLPhBIyJPvZ6cxAfx2i4htHiQg7e9ir0rZE3jsOtvJ2pgo3K 1QtGSJWH9dQMeOFnBfpDIvx3aDBcXPayXHNdaL3t/gCwJCpRuH1g6HeDTYXCJsQc qu6FDoeQn5+5IeEzoZawapQIJ2y7iXTk5SoysybKcKmnp8yYKB6olCqW8cxbWeMs HSqTUusPMsnJGlJMTw== =jaPm -----END PGP SIGNATURE----- Merge tag 'loongarch-kvm-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD LoongArch KVM changes for v6.7 Add LoongArch's KVM support. Loongson 3A5000/3A6000 supports hardware assisted virtualization. With cpu virtualization, there are separate hw-supported user mode and kernel mode in guest mode. With memory virtualization, there are two-level hw mmu table for guest mode and host mode. Also there is separate hw cpu timer with consant frequency in guest mode, so that vm can migrate between hosts with different freq. Currently, we are able to boot LoongArch Linux Guests. Few key aspects of KVM LoongArch added by this series are: 1. Enable kvm hardware function when kvm module is loaded. 2. Implement VM and vcpu related ioctl interface such as vcpu create, vcpu run etc. GET_ONE_REG/SET_ONE_REG ioctl commands are use to get general registers one by one. 3. Hardware access about MMU, timer and csr are emulated in kernel. 4. Hardwares such as mmio and iocsr device are emulated in user space such as IPI, irqchips, pci devices etc. |
||
![]() |
278be83601 |
LoongArch: Disable WUC for pgprot_writecombine() like ioremap_wc()
Currently the code disables WUC only disables it for ioremap_wc(), which is only used when mapping writecombine pages like ioremap() (mapped to the kernel space). But for VRAM mapped in TTM/GEM, it is mapped with a crafted pgprot by the pgprot_writecombine() function, in which case WUC isn't disabled now. Disable WUC for pgprot_writecombine() (fallback to SUC) if needed, like ioremap_wc(). This improves the AMDGPU driver's stability (solves some misrendering) on Loongson-3A5000/3A6000 machines. Signed-off-by: Icenowy Zheng <uwu@icenowy.me> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
00c2ca84c6 |
LoongArch: Use SYM_CODE_* to annotate exception handlers
As described in include/linux/linkage.h, FUNC -- C-like functions (proper stack frame etc.) CODE -- non-C code (e.g. irq handlers with different, special stack etc.) SYM_FUNC_{START, END} -- use for global functions SYM_CODE_{START, END} -- use for non-C (special) functions So use SYM_CODE_* to annotate exception handlers. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
b8466fe82b |
efi: move screen_info into efi init code
After the vga console no longer relies on global screen_info, there are only two remaining use cases: - on the x86 architecture, it is used for multiple boot methods (bzImage, EFI, Xen, kexec) to commucate the initial VGA or framebuffer settings to a number of device drivers. - on other architectures, it is only used as part of the EFI stub, and only for the three sysfb framebuffers (simpledrm, simplefb, efifb). Remove the duplicate data structure definitions by moving it into the efi-init.c file that sets it up initially for the EFI case, leaving x86 as an exception that retains its own definition for non-EFI boots. The added #ifdefs here are optional, I added them to further limit the reach of screen_info to configurations that have at least one of the users enabled. Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Acked-by: Helge Deller <deller@gmx.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20231017093947.3627976-1-arnd@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
![]() |
8a736ddfc8 |
vgacon: rework screen_info #ifdef checks
On non-x86 architectures, the screen_info variable is generally only used for the VGA console where supported, and in some cases the EFI framebuffer or vga16fb. Now that we have a definite list of which architectures actually use it for what, use consistent #ifdef checks so the global variable is only defined when it is actually used on those architectures. Loongarch and riscv have no support for vgacon or vga16fb, but they support EFI firmware, so only that needs to be checked, and the initialization can be removed because that is handled by EFI. IA64 has both vgacon and EFI, though EFI apparently never uses a framebuffer here. Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Khalid Aziz <khalid@gonehiking.org> Acked-by: Helge Deller <deller@gmx.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20231009211845.3136536-3-arnd@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
![]() |
a9e1a3d84e |
crash_core: change the prototype of function parse_crashkernel()
Add two parameters 'low_size' and 'high' to function parse_crashkernel(), later crashkernel=,high|low parsing will be added. Make adjustments in all call sites of parse_crashkernel() in arch. Link: https://lkml.kernel.org/r/20230914033142.676708-3-bhe@redhat.com Signed-off-by: Baoquan He <bhe@redhat.com> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chen Jiahao <chenjiahao16@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
![]() |
39fdf4be72 |
LoongArch: KVM: Implement vcpu world switch
Implement LoongArch vcpu world switch, including vcpu enter guest and vcpu exit from guest, both operations need to save or restore the host and guest registers. Reviewed-by: Bibo Mao <maobibo@loongson.cn> Tested-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
b1dc55a3d6 |
LoongArch: Add support for 64_PCREL relocation type
When build and update kernel with the latest upstream binutils and loongson3_defconfig, module loader fails with: kmod: zsmalloc: Unknown relocation type 109 kmod: fuse: Unknown relocation type 109 kmod: fuse: Unknown relocation type 109 kmod: radeon: Unknown relocation type 109 kmod: nf_tables: Unknown relocation type 109 kmod: nf_tables: Unknown relocation type 109 This is because the latest upstream binutils replaces a pair of ADD64 and SUB64 with 64_PCREL, so add support for 64_PCREL relocation type. Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ecb802d02eeb Cc: <stable@vger.kernel.org> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
c1c2ce2d3b |
LoongArch: Add support for 32_PCREL relocation type
When build and update kernel with the latest upstream binutils and loongson3_defconfig, module loader fails with: kmod: zsmalloc: Unsupport relocation type 99, please add its support. kmod: fuse: Unsupport relocation type 99, please add its support. kmod: ipmi_msghandler: Unsupport relocation type 99, please add its support. kmod: ipmi_msghandler: Unsupport relocation type 99, please add its support. kmod: pstore: Unsupport relocation type 99, please add its support. kmod: drm_display_helper: Unsupport relocation type 99, please add its support. kmod: drm_display_helper: Unsupport relocation type 99, please add its support. kmod: drm_display_helper: Unsupport relocation type 99, please add its support. kmod: fuse: Unsupport relocation type 99, please add its support. kmod: fat: Unsupport relocation type 99, please add its support. This is because the latest upstream binutils replaces a pair of ADD32 and SUB32 with 32_PCREL, so add support for 32_PCREL relocation type. Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ecb802d02eeb Cc: <stable@vger.kernel.org> Co-developed-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
2761498876 |
LoongArch: Define relocation types for ABI v2.10
The relocation types from 101 to 109 are used by GNU binutils >= 2.41, add their definitions to use them in later patches. Link: https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=include/elf/loongarch.h#l230 Cc: <stable@vger.kernel.org> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
1943feecf8 |
LoongArch: numa: Fix high_memory calculation
For 64bit kernel without HIGHMEM, high_memory is the virtual address of
the highest physical address in the system. But __va(get_num_physpages()
<< PAGE_SHIFT) is not what we want for high_memory because there may be
holes in the physical address space. On the other hand, max_low_pfn is
calculated from memblock_end_of_DRAM(), which is exactly corresponding
to the highest physical address, so use it for high_memory calculation.
Cc: <stable@vger.kernel.org>
Fixes:
|
||
![]() |
b795fb9f58 |
LoongArch: Set all reserved memblocks on Node#0 at initialization
After commit
|
||
![]() |
d0b933ae7a |
LoongArch: Remove dead code in relocate_new_kernel
The initial aim is to silence the following objtool warning: arch/loongarch/kernel/relocate_kernel.o: warning: objtool: relocate_new_kernel+0x74: unreachable instruction There are two adjacent "b" instructions, the second one is unreachable, it is dead code, just remove it. Co-developed-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Jinyang He <hejinyang@loongson.cn> Co-developed-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
c718a0bad7 |
LoongArch: Fix some build warnings with W=1
There are some building warnings when building LoongArch kernel with W=1 as following, this patch fixes them. arch/loongarch/kernel/acpi.c:284:13: warning: no previous prototype for ‘acpi_numa_arch_fixup’ [-Wmissing-prototypes] 284 | void __init acpi_numa_arch_fixup(void) {} | ^~~~~~~~~~~~~~~~~~~~ arch/loongarch/kernel/time.c:32:13: warning: no previous prototype for ‘constant_timer_interrupt’ [-Wmissing-prototypes] 32 | irqreturn_t constant_timer_interrupt(int irq, void *data) | ^~~~~~~~~~~~~~~~~~~~~~~~ arch/loongarch/kernel/traps.c:496:25: warning: no previous prototype for 'do_fpe' [-Wmissing-prototypes] 496 | asmlinkage void noinstr do_fpe(struct pt_regs *regs | ^~~~~~ arch/loongarch/kernel/traps.c:813:22: warning: variable ‘opcode’ set but not used [-Wunused-but-set-variable] 813 | unsigned int opcode; | ^~~~~~ arch/loongarch/kernel/signal.c:895:14: warning: no previous prototype for ‘get_sigframe’ [-Wmissing-prototypes] 895 | void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs, | ^~~~~~~~~~~~ arch/loongarch/kernel/syscall.c:21:40: warning: initialized field overwritten [-Woverride-init] 21 | #define __SYSCALL(nr, call) [nr] = (call), | ^ arch/loongarch/kernel/syscall.c:40:14: warning: no previous prototype for ‘do_syscall’ [-Wmissing-prototypes] 40 | void noinstr do_syscall(struct pt_regs *regs) | ^~~~~~~~~~ arch/loongarch/kernel/smp.c:502:17: warning: no previous prototype for ‘start_secondary’ [-Wmissing-prototypes] 502 | asmlinkage void start_secondary(void) | ^~~~~~~~~~~~~~~ arch/loongarch/kernel/process.c:309:15: warning: no previous prototype for ‘arch_align_stack’ [-Wmissing-prototypes] 309 | unsigned long arch_align_stack(unsigned long sp) | ^~~~~~~~~~~~~~~~ arch/loongarch/kernel/topology.c:13:5: warning: no previous prototype for ‘arch_register_cpu’ [-Wmissing-prototypes] 13 | int arch_register_cpu(int cpu) | ^~~~~~~~~~~~~~~~~ arch/loongarch/kernel/topology.c:27:6: warning: no previous prototype for ‘arch_unregister_cpu’ [-Wmissing-prototypes] 27 | void arch_unregister_cpu(int cpu) | ^~~~~~~~~~~~~~~~~~~ arch/loongarch/kernel/module-sections.c:103:5: warning: no previous prototype for ‘module_frob_arch_sections’ [-Wmissing-prototypes] 103 | int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs, | ^~~~~~~~~~~~~~~~~~~~~~~~~ arch/loongarch/mm/hugetlbpage.c:56:5: warning: no previous prototype for ‘is_aligned_hugepage_range’ [-Wmissing-prototypes] 56 | int is_aligned_hugepage_range(unsigned long addr, unsigned long len) | ^~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
68ffa230da |
LoongArch: Fix lockdep static memory detection
Since commit |
||
![]() |
12952b6bbd |
LoongArch changes for v6.6
1, Allow usage of LSX/LASX in the kernel; 2, Add SIMD-optimized RAID5/RAID6 routines; 3, Add Loongson Binary Translation (LBT) extension support; 4, Add basic KGDB & KDB support; 5, Add building with kcov coverage; 6, Add KFENCE (Kernel Electric-Fence) support; 7, Add KASAN (Kernel Address Sanitizer) support; 8, Some bug fixes and other small changes; 9, Update the default config file. -----BEGIN PGP SIGNATURE----- iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmT5TfMWHGNoZW5odWFj YWlAa2VybmVsLm9yZwAKCRAChivD8uImeqd3EACjqCaHNlp33kwufSPpGuQw9a8I F7JW1KzBOoWELch5nFRjfQClROBWRmM4jN5YnxENBQ5K2F1K6gfxdkfjew+KV2mn ki9ByamCfFVJDZXo9wavUD2LBrVakEFmLT+SyXBxdWwJ3fDivHjF6A0qs9ltp7dq Bttq4bkw1mZsU6MnViRwPKVROtNUVrd9mwYSTq0iXviVEbWhPHQQTxRizNra9Z6X 7XWxO0ODHl0WVvdOJU+F16mBRS3Bs1g/HHAIDc41yrYEHFFOeFCEUAQSF/4Nj5wj BAfAB8WOa9+vPH8fTnrpCt2RtGJmkz71TM49DdXB7jpGaWIyc4WDi9MXeeBiJ0wE vQg8IECc9POC1sH4/6BMwq2qkrWRj2PYFYof0fP66iWNjmodtNUf7GOVHy8MTQan xHWizJFAdY/u/bwbF9tRQ+EVeot/844CkjtZxkgTfV8shN6kCMEVAamwBItZ7TXN g/oc1ORM6nsKHBDQF3r2LSY0Gbf3OSfMJVL8SLEQ9hAhgGhotmJ36B4bdvyO7T0Q gNn//U+p4IIMFRKRxreEz9P0KjTOJrHAAxNzu1oZebhGZd5WI+i0PHYkkBDKZTXc 7qaEdM2cX8Wd0ePIXOHQnSItwYO7ilrviHyeCM8wd/g2/W/00jvnpF3J+2rk7eJO rcfAr8+V5ylYBQzp6Q== =NXy2 -----END PGP SIGNATURE----- Merge tag 'loongarch-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch updates from Huacai Chen: - Allow usage of LSX/LASX in the kernel, and use them for SIMD-optimized RAID5/RAID6 routines - Add Loongson Binary Translation (LBT) extension support - Add basic KGDB & KDB support - Add building with kcov coverage - Add KFENCE (Kernel Electric-Fence) support - Add KASAN (Kernel Address Sanitizer) support - Some bug fixes and other small changes - Update the default config file * tag 'loongarch-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (25 commits) LoongArch: Update Loongson-3 default config file LoongArch: Add KASAN (Kernel Address Sanitizer) support LoongArch: Simplify the processing of jumping new kernel for KASLR kasan: Add (pmd|pud)_init for LoongArch zero_(pud|p4d)_populate process kasan: Add __HAVE_ARCH_SHADOW_MAP to support arch specific mapping LoongArch: Add KFENCE (Kernel Electric-Fence) support LoongArch: Get partial stack information when providing regs parameter LoongArch: mm: Add page table mapped mode support for virt_to_page() kfence: Defer the assignment of the local variable addr LoongArch: Allow building with kcov coverage LoongArch: Provide kaslr_offset() to get kernel offset LoongArch: Add basic KGDB & KDB support LoongArch: Add Loongson Binary Translation (LBT) extension support raid6: Add LoongArch SIMD recovery implementation raid6: Add LoongArch SIMD syndrome calculation LoongArch: Add SIMD-optimized XOR routines LoongArch: Allow usage of LSX/LASX in the kernel LoongArch: Define symbol 'fault' as a local label in fpu.S LoongArch: Adjust {copy, clear}_user exception handler behavior LoongArch: Use static defined zero page rather than allocated ... |
||
![]() |
5aa4ac64e6 |
LoongArch: Add KASAN (Kernel Address Sanitizer) support
1/8 of kernel addresses reserved for shadow memory. But for LoongArch, There are a lot of holes between different segments and valid address space (256T available) is insufficient to map all these segments to kasan shadow memory with the common formula provided by kasan core, saying (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET So LoongArch has a arch-specific mapping formula, different segments are mapped individually, and only limited space lengths of these specific segments are mapped to shadow. At early boot stage the whole shadow region populated with just one physical page (kasan_early_shadow_page). Later, this page is reused as readonly zero shadow for some memory that kasan currently don't track. After mapping the physical memory, pages for shadow memory are allocated and mapped. Functions like memset()/memcpy()/memmove() do a lot of memory accesses. If bad pointer passed to one of these function it is important to be caught. Compiler's instrumentation cannot do this since these functions are written in assembly. KASan replaces memory functions with manually instrumented variants. Original functions declared as weak symbols so strong definitions in mm/kasan/kasan.c could replace them. Original functions have aliases with '__' prefix in names, so we could call non-instrumented variant if needed. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
9fbcc07679 |
LoongArch: Simplify the processing of jumping new kernel for KASLR
Modified relocate_kernel() doesn't return new kernel's entry point but the random_offset. In this way we share the start_kernel() processing with the normal kernel, which avoids calling 'jr a0' directly and allows some other operations (e.g, kasan_early_init) before start_kernel() when KASLR (CONFIG_RANDOMIZE_BASE) is turned on. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
95bb5b617b |
LoongArch: Get partial stack information when providing regs parameter
Currently, arch_stack_walk() can only get the full stack information including NMI. This is because the implementation of arch_stack_walk() is forced to ignore the information passed by the regs parameter and use the current stack information instead. For some detection systems like KFENCE, only partial stack information is needed. In particular, the stack frame where the interrupt occurred. To support KFENCE, this patch modifies the implementation of the arch_stack_walk() function so that if this function is called with the regs argument passed, it retains all the stack information in regs and uses it to provide accurate information. Before this patch: [ 1.531195 ] ================================================================== [ 1.531442 ] BUG: KFENCE: out-of-bounds read in stack_trace_save_regs+0x48/0x6c [ 1.531442 ] [ 1.531900 ] Out-of-bounds read at 0xffff800012267fff (1B left of kfence-#12): [ 1.532046 ] stack_trace_save_regs+0x48/0x6c [ 1.532169 ] kfence_report_error+0xa4/0x528 [ 1.532276 ] kfence_handle_page_fault+0x124/0x270 [ 1.532388 ] no_context+0x50/0x94 [ 1.532453 ] do_page_fault+0x1a8/0x36c [ 1.532524 ] tlb_do_page_fault_0+0x118/0x1b4 [ 1.532623 ] test_out_of_bounds_read+0xa0/0x1d8 [ 1.532745 ] kunit_generic_run_threadfn_adapter+0x1c/0x28 [ 1.532854 ] kthread+0x124/0x130 [ 1.532922 ] ret_from_kernel_thread+0xc/0xa4 <snip> After this patch: [ 1.320220 ] ================================================================== [ 1.320401 ] BUG: KFENCE: out-of-bounds read in test_out_of_bounds_read+0xa8/0x1d8 [ 1.320401 ] [ 1.320898 ] Out-of-bounds read at 0xffff800012257fff (1B left of kfence-#10): [ 1.321134 ] test_out_of_bounds_read+0xa8/0x1d8 [ 1.321264 ] kunit_generic_run_threadfn_adapter+0x1c/0x28 [ 1.321392 ] kthread+0x124/0x130 [ 1.321459 ] ret_from_kernel_thread+0xc/0xa4 <snip> Suggested-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Enze Li <lienze@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
e14dd07696 |
LoongArch: Add basic KGDB & KDB support
KGDB is intended to be used as a source level debugger for the Linux kernel. It is used along with gdb to debug a Linux kernel. GDB can be used to "break in" to the kernel to inspect memory, variables and regs similar to the way an application developer would use GDB to debug an application. KDB is a frontend of KGDB which is similar to GDB. By now, in addition to the generic KGDB features, the LoongArch KGDB implements the following features: - Hardware breakpoints/watchpoints; - Software single-step support for KDB. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> # Framework & CoreFeature Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> # BreakPoint & SingleStep Signed-off-by: Hui Li <lihui@loongson.cn> # Some Minor Improvements Signed-off-by: Randy Dunlap <rdunlap@infradead.org> # Some Build Error Fixes Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
bd3c579848 |
LoongArch: Add Loongson Binary Translation (LBT) extension support
Loongson Binary Translation (LBT) is used to accelerate binary translation, which contains 4 scratch registers (scr0 to scr3), x86/ARM eflags (eflags) and x87 fpu stack pointer (ftop). This patch support kernel to save/restore these registers, handle the LBT exception and maintain sigcontext. Signed-off-by: Qi Hu <huqi@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
2478e4b759 |
LoongArch: Allow usage of LSX/LASX in the kernel
Allow usage of LSX/LASX in the kernel by extending kernel_fpu_begin() and kernel_fpu_end(). Reviewed-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
8f58c571bf |
LoongArch: Define symbol 'fault' as a local label in fpu.S
The initial aim is to silence the following objtool warnings: arch/loongarch/kernel/fpu.o: warning: objtool: _save_fp_context() falls through to next function fault() arch/loongarch/kernel/fpu.o: warning: objtool: _restore_fp_context() falls through to next function fault() arch/loongarch/kernel/fpu.o: warning: objtool: _save_lsx_context() falls through to next function fault() arch/loongarch/kernel/fpu.o: warning: objtool: _restore_lsx_context() falls through to next function fault() arch/loongarch/kernel/fpu.o: warning: objtool: _save_lasx_context() falls through to next function fault() arch/loongarch/kernel/fpu.o: warning: objtool: _restore_lasx_context() falls through to next function fault() Currently, SYM_FUNC_START()/SYM_FUNC_END() defines the symbol 'fault' as SYM_T_FUNC which is STT_FUNC, the objtool warnings are generated through the following code: tools/objtool/include/objtool/check.h: static inline struct symbol *insn_func(struct instruction *insn) { struct symbol *sym = insn->sym; if (sym && sym->type != STT_FUNC) sym = NULL; return sym; } tools/objtool/check.c: static int validate_branch(struct objtool_file *file, struct symbol *func, struct instruction *insn, struct insn_state state) { ... if (func && insn_func(insn) && func != insn_func(insn)->pfunc) { ... WARN("%s() falls through to next function %s()", func->name, insn_func(insn)->name); return 1; } ... } We can see that the fixup can be a local label in the following code: arch/loongarch/include/asm/asm-extable.h: .pushsection __ex_table, "a"; \ .balign 4; \ .long ((insn) - .); \ .long ((fixup) - .); \ .short (type); \ .short (data); \ .popsection; .macro _asm_extable, insn, fixup __ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_FIXUP, 0) .endm Like arch/loongarch/lib/*.S, just define the symbol 'fault' as a local label in fpu.S. Before: $ readelf -s arch/loongarch/kernel/fpu.o | awk -F: /fault/'{print $2}' 000000000000053c 8 FUNC GLOBAL DEFAULT 1 fault After: $ readelf -s arch/loongarch/kernel/fpu.o | awk -F: /fault/'{print $2}' 000000000000053c 0 NOTYPE LOCAL DEFAULT 1 .L_fpu_fault Co-developed-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
0921af6ccf |
LoongArch: Use static defined zero page rather than allocated
On LoongArch system, there is only one page needed for zero page (no cache synonyms), and there is no COLOR_ZERO_PAGE, so zero_page_mask is useless and the macro __HAVE_COLOR_ZERO_PAGE is not necessary. Like other popular architectures, It is simpler to define the zero page in kernel BSS code segment rather than dynamically allocate. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
2bb20d2926 |
LoongArch: mm: Introduce unified function populate_kernel_pte()
Function pcpu_populate_pte() and fixmap_pte() are similar, they populate one page from kernel address space. And there is confusion between pgd and p4d in the function fixmap_pte(), such as pgd_none() always returns zero. This patch introduces a unified function populate_kernel_pte() and then replaces pcpu_populate_pte() and fixmap_pte(). Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
f33efa905c |
LoongArch: Code improvements in function pcpu_populate_pte()
Do some code improvements in function pcpu_populate_pte(): 1. Add memory allocation failure handling; 2. Replace pgd_populate() with p4d_populate(), it will be useful if there are four-level page tables. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
8e1e49550d |
TTY/Serial driver changes for 6.6-rc1
Here is the big set of tty and serial driver changes for 6.6-rc1. Lots of cleanups in here this cycle, and some driver updates. Short summary is: - Jiri's continued work to make the tty code and apis be a bit more sane with regards to modern kernel coding style and types - cpm_uart driver updates - n_gsm updates and fixes - meson driver updates - sc16is7xx driver updates - 8250 driver updates for different hardware types - qcom-geni driver fixes - tegra serial driver change - stm32 driver updates - synclink_gt driver cleanups - tty structure size reduction All of these have been in linux-next this week with no reported issues. The last bit of cleanups from Jiri and the tty structure size reduction came in last week, a bit late but as they were just style changes and size reductions, I figured they should get into this merge cycle so that others can work on top of them with no merge conflicts. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZPH+jA8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ykKyACgldt6QeenTN+6dXIHS/eQHtTKZwMAn3arSeXI QrUUnLFjOWyoX87tbMBQ =LVw0 -----END PGP SIGNATURE----- Merge tag 'tty-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial driver updates from Greg KH: "Here is the big set of tty and serial driver changes for 6.6-rc1. Lots of cleanups in here this cycle, and some driver updates. Short summary is: - Jiri's continued work to make the tty code and apis be a bit more sane with regards to modern kernel coding style and types - cpm_uart driver updates - n_gsm updates and fixes - meson driver updates - sc16is7xx driver updates - 8250 driver updates for different hardware types - qcom-geni driver fixes - tegra serial driver change - stm32 driver updates - synclink_gt driver cleanups - tty structure size reduction All of these have been in linux-next this week with no reported issues. The last bit of cleanups from Jiri and the tty structure size reduction came in last week, a bit late but as they were just style changes and size reductions, I figured they should get into this merge cycle so that others can work on top of them with no merge conflicts" * tag 'tty-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (199 commits) tty: shrink the size of struct tty_struct by 40 bytes tty: n_tty: deduplicate copy code in n_tty_receive_buf_real_raw() tty: n_tty: extract ECHO_OP processing to a separate function tty: n_tty: unify counts to size_t tty: n_tty: use u8 for chars and flags tty: n_tty: simplify chars_in_buffer() tty: n_tty: remove unsigned char casts from character constants tty: n_tty: move newline handling to a separate function tty: n_tty: move canon handling to a separate function tty: n_tty: use MASK() for masking out size bits tty: n_tty: make n_tty_data::num_overrun unsigned tty: n_tty: use time_is_before_jiffies() in n_tty_receive_overrun() tty: n_tty: use 'num' for writes' counts tty: n_tty: use output character directly tty: n_tty: make flow of n_tty_receive_buf_common() a bool Revert "tty: serial: meson: Add a earlycon for the T7 SoC" Documentation: devices.txt: Fix minors for ttyCPM* Documentation: devices.txt: Remove ttySIOC* Documentation: devices.txt: Remove ttyIOC* serial: 8250_bcm7271: improve bcm7271 8250 port ... |
||
![]() |
461f35f014 |
drm for 6.6-rc1
core: - fix gfp flags in drmm_kmalloc gpuva: - add new generic GPU VA manager (for nouveau initially) syncobj: - add new DRM_IOCTL_SYNCOBJ_EVENTFD ioctl dma-buf: - acquire resv lock for mmap() in exporters - support dma-buf self import automatically - docs fixes backlight: - fix fbdev interactions atomic: - improve logging prime: - remove struct gem_prim_mmap plus driver updates gem: - drm_exec: add locking over multiple GEM objects - fix lockdep checking fbdev: - make fbdev userspace interfaces optional - use linux device instead of fbdev device - use deferred i/o helper macros in various drivers - Make FB core selectable without drivers - Remove obsolete flags FBINFO_DEFAULT and FBINFO_FLAG_DEFAULT - Add helper macros and Kconfig tokens for DMA-allocated framebuffer ttm: - support init_on_free - swapout fixes panel: - panel-edp: Support AUO B116XAB01.4 - Support Visionox R66451 plus DT bindings - ld9040: Backlight support, magic improved, Kconfig fix - Convert to of_device_get_match_data() - Fix Kconfig dependencies - simple: Set bpc value to fix warning; Set connector type for AUO T215HVN01; Support Innolux G156HCE-L01 plus DT bindings - ili9881: Support TDO TL050HDV35 LCD panel plus DT bindings - startek: Support KD070FHFID015 MIPI-DSI panel plus DT bindings - sitronix-st7789v: Support Inanbo T28CP45TN89 plus DT bindings; Support EDT ET028013DMA plus DT bindings; Various cleanups - edp: Add timings for N140HCA-EAC - Allow panels and touchscreens to power sequence together - Fix Innolux G156HCE-L01 LVDS clock bridge: - debugfs for chains support - dw-hdmi: Improve support for YUV420 bus format CEC suspend/resume, update EDID on HDMI detect - dw-mipi-dsi: Fix enable/disable of DSI controller - lt9611uxc: Use MODULE_FIRMWARE() - ps8640: Remove broken EDID code - samsung-dsim: Fix command transfer - tc358764: Handle HS/VS polarity; Use BIT() macro; Various cleanups - adv7511: Fix low refresh rate - anx7625: Switch to macros instead of hardcoded values locking fixes - tc358767: fix hardware delays - sitronix-st7789v: Support panel orientation; Support rotation property; Add support for Jasonic JT240MHQS-HWT-EK-E3 plus DT bindings amdgpu: - SDMA 6.1.0 support - HDP 6.1 support - SMUIO 14.0 support - PSP 14.0 support - IH 6.1 support - Lots of checkpatch cleanups - GFX 9.4.3 updates - Add USB PD and IFWI flashing documentation - GPUVM updates - RAS fixes - DRR fixes - FAMS fixes - Virtual display fixes - Soft IH fixes - SMU13 fixes - Rework PSP firmware loading for other IPs - Kernel doc fixes - DCN 3.0.1 fixes - LTTPR fixes - DP MST fixes - DCN 3.1.6 fixes - SMU 13.x fixes - PSP 13.x fixes - SubVP fixes - GC 9.4.3 fixes - Display bandwidth calculation fixes - VCN4 secure submission fixes - Allow building DC on RISC-V - Add visible FB info to bo_print_info - HBR3 fixes - GFX9 MCBP fix - GMC10 vmhub index fix - GMC11 vmhub index fix - Create a new doorbell manager - SR-IOV fixes - initial freesync panel replay support - revert zpos properly until igt regression is fixeed - use TTM to manage doorbell BAR - Expose both current and average power via hwmon if supported amdkfd: - Cleanup CRIU dma-buf handling - Use KIQ to unmap HIQ - GFX 9.4.3 debugger updates - GFX 9.4.2 debugger fixes - Enable cooperative groups fof gfx11 - SVM fixes - Convert older APUs to use dGPU path like newer APUs - Drop IOMMUv2 path as it is no longer used - TBA fix for aldebaran i915: - ICL+ DSI modeset sequence - HDCP improvements - MTL display fixes and cleanups - HSW/BDW PSR1 restored - Init DDI ports in VBT order - General display refactors - Start using plane scale factor for relative data rate - Use shmem for dpt objects - Expose RPS thresholds in sysfs - Apply GuC SLPC min frequency softlimit correctly - Extend Wa_14015795083 to TGL, RKL, DG1 and ADL - Fix a VMA UAF for multi-gt platform - Do not use stolen on MTL due to HW bug - Check HuC and GuC version compatibility on MTL - avoid infinite GPU waits due to premature release of request memory - Fixes and updates for GSC memory allocation - Display SDVO fixes - Take stolen handling out of FBC code - Make i915_coherent_map_type GT-centric - Simplify shmem_create_from_object map_type msm: - SM6125 MDSS support - DPU: SM6125 DPU support - DSI: runtime PM support, burst mode support - DSI PHY: SM6125 support in 14nm DSI PHY driver - GPU: prepare for a7xx - fix a690 firmware - disable relocs on a6xx and newer radeon: - Lots of checkpatch cleanups ast: - improve device-model detection - Represent BMV as virtual connector - Report DP connection status nouveau: - add new exec/bind interface to support Vulkan - document some getparam ioctls - improve VRAM detection - various fixes/cleanups - workraound DPCD issues ivpu: - MMU updates - debugfs support - Support vpu4 virtio: - add sync object support atmel-hlcdc: - Support inverted pixclock polarity etnaviv: - runtime PM cleanups - hang handling fixes exynos: - use fbdev DMA helpers - fix possible NULL ptr dereference komeda: - always attach encoder omapdrm: - use fbdev DMA helpers ingenic: - kconfig regmap fixes loongson: - support display controller mediatek: - Small mtk-dpi cleanups - DisplayPort: support eDP and aux-bus - Fix coverity issues - Fix potential memory leak if vmap() fail mgag200: - minor fixes mxsfb: - support disabling overlay planes panfrost: - fix sync in IRQ handling ssd130x: - Support per-controller default resolution plus DT bindings - Reduce memory-allocation overhead - Improve intermediate buffer size computation - Fix allocation of temporary buffers - Fix pitch computation - Fix shadow plane allocation tegra: - use fbdev DMA helpers - Convert to devm_platform_ioremap_resource() - support bridge/connector - enable PM tidss: - Support TI AM625 plus DT bindings - Implement new connector model plus driver updates vkms: - improve write back support - docs fixes - support gamma LUT zynqmp-dpsub: - misc fixes -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmTukSYACgkQDHTzWXnE hr6vnQ/+J7vBVkBr8JsaEV/twcZwzbNdpivsIagd8U83GQB50nDReVXbNx+Wo0/C WiGlrC6Sw3NVOGbkigd5IQ7fb5C/7RnBmzMi/iS7Qnk2uEqLqgV00VxfGwdm6wgr 0gNB8zuu2xYphHz2K8LzwnmeQRdN+YUQpUa2wNzLO88IEkTvq5vx2rJEn5p9/3hp OxbbPBzpDRRPlkNFfVQCN8todbKdsPc4am81Eqgv7BJf21RFgQodPGW5koCYuv0w 3m+PJh1KkfYAL974EsLr/pkY7yhhiZ6SlFLX8ssg4FyZl/Vthmc9bl14jRq/pqt4 GBp8yrPq1XjrwXR8wv3MiwNEdANQ+KD9IoGlzLxqVgmEFRE+g4VzZZXeC3AIrTVP FPg4iLUrDrmj9RpJmbVqhq9X2jZs+EtRAFkJPrPbq2fItAD2a2dW4X3ISSnnTqDI 6O2dVwuLCU6OfWnvN4bPW9p8CqRgR8Itqv1SI8qXooDy307YZu1eTUf5JAVwG/SW xbDEFVFlMPyFLm+KN5dv1csJKK21vWi9gLg8phK8mTWYWnqMEtJqbxbRzmdBEFmE pXKVu01P6ZqgBbaETpCljlOaEDdJnvO4W+o70MgBtpR2IWFMbMNO+iS0EmLZ6Vgj 9zYZctpL+dMuHV0Of1GMkHFRHTMYEzW4tuctLIQfG13y4WzyczY= =CwV9 -----END PGP SIGNATURE----- Merge tag 'drm-next-2023-08-30' of git://anongit.freedesktop.org/drm/drm Pull drm updates from Dave Airlie: "The drm core grew a new generic gpu virtual address manager, and new execution locking helpers. These are used by nouveau now to provide uAPI support for the userspace Vulkan driver. AMD had a bunch of new IP core support, loads of refactoring around fbdev, but mostly just the usual amount of stuff across the board. core: - fix gfp flags in drmm_kmalloc gpuva: - add new generic GPU VA manager (for nouveau initially) syncobj: - add new DRM_IOCTL_SYNCOBJ_EVENTFD ioctl dma-buf: - acquire resv lock for mmap() in exporters - support dma-buf self import automatically - docs fixes backlight: - fix fbdev interactions atomic: - improve logging prime: - remove struct gem_prim_mmap plus driver updates gem: - drm_exec: add locking over multiple GEM objects - fix lockdep checking fbdev: - make fbdev userspace interfaces optional - use linux device instead of fbdev device - use deferred i/o helper macros in various drivers - Make FB core selectable without drivers - Remove obsolete flags FBINFO_DEFAULT and FBINFO_FLAG_DEFAULT - Add helper macros and Kconfig tokens for DMA-allocated framebuffer ttm: - support init_on_free - swapout fixes panel: - panel-edp: Support AUO B116XAB01.4 - Support Visionox R66451 plus DT bindings - ld9040: - Backlight support - magic improved - Kconfig fix - Convert to of_device_get_match_data() - Fix Kconfig dependencies - simple: - Set bpc value to fix warning - Set connector type for AUO T215HVN01 - Support Innolux G156HCE-L01 plus DT bindings - ili9881: Support TDO TL050HDV35 LCD panel plus DT bindings - startek: Support KD070FHFID015 MIPI-DSI panel plus DT bindings - sitronix-st7789v: - Support Inanbo T28CP45TN89 plus DT bindings - Support EDT ET028013DMA plus DT bindings - Various cleanups - edp: Add timings for N140HCA-EAC - Allow panels and touchscreens to power sequence together - Fix Innolux G156HCE-L01 LVDS clock bridge: - debugfs for chains support - dw-hdmi: - Improve support for YUV420 bus format - CEC suspend/resume - update EDID on HDMI detect - dw-mipi-dsi: Fix enable/disable of DSI controller - lt9611uxc: Use MODULE_FIRMWARE() - ps8640: Remove broken EDID code - samsung-dsim: Fix command transfer - tc358764: - Handle HS/VS polarity - Use BIT() macro - Various cleanups - adv7511: Fix low refresh rate - anx7625: - Switch to macros instead of hardcoded values - locking fixes - tc358767: fix hardware delays - sitronix-st7789v: - Support panel orientation - Support rotation property - Add support for Jasonic JT240MHQS-HWT-EK-E3 plus DT bindings amdgpu: - SDMA 6.1.0 support - HDP 6.1 support - SMUIO 14.0 support - PSP 14.0 support - IH 6.1 support - Lots of checkpatch cleanups - GFX 9.4.3 updates - Add USB PD and IFWI flashing documentation - GPUVM updates - RAS fixes - DRR fixes - FAMS fixes - Virtual display fixes - Soft IH fixes - SMU13 fixes - Rework PSP firmware loading for other IPs - Kernel doc fixes - DCN 3.0.1 fixes - LTTPR fixes - DP MST fixes - DCN 3.1.6 fixes - SMU 13.x fixes - PSP 13.x fixes - SubVP fixes - GC 9.4.3 fixes - Display bandwidth calculation fixes - VCN4 secure submission fixes - Allow building DC on RISC-V - Add visible FB info to bo_print_info - HBR3 fixes - GFX9 MCBP fix - GMC10 vmhub index fix - GMC11 vmhub index fix - Create a new doorbell manager - SR-IOV fixes - initial freesync panel replay support - revert zpos properly until igt regression is fixeed - use TTM to manage doorbell BAR - Expose both current and average power via hwmon if supported amdkfd: - Cleanup CRIU dma-buf handling - Use KIQ to unmap HIQ - GFX 9.4.3 debugger updates - GFX 9.4.2 debugger fixes - Enable cooperative groups fof gfx11 - SVM fixes - Convert older APUs to use dGPU path like newer APUs - Drop IOMMUv2 path as it is no longer used - TBA fix for aldebaran i915: - ICL+ DSI modeset sequence - HDCP improvements - MTL display fixes and cleanups - HSW/BDW PSR1 restored - Init DDI ports in VBT order - General display refactors - Start using plane scale factor for relative data rate - Use shmem for dpt objects - Expose RPS thresholds in sysfs - Apply GuC SLPC min frequency softlimit correctly - Extend Wa_14015795083 to TGL, RKL, DG1 and ADL - Fix a VMA UAF for multi-gt platform - Do not use stolen on MTL due to HW bug - Check HuC and GuC version compatibility on MTL - avoid infinite GPU waits due to premature release of request memory - Fixes and updates for GSC memory allocation - Display SDVO fixes - Take stolen handling out of FBC code - Make i915_coherent_map_type GT-centric - Simplify shmem_create_from_object map_type msm: - SM6125 MDSS support - DPU: SM6125 DPU support - DSI: runtime PM support, burst mode support - DSI PHY: SM6125 support in 14nm DSI PHY driver - GPU: prepare for a7xx - fix a690 firmware - disable relocs on a6xx and newer radeon: - Lots of checkpatch cleanups ast: - improve device-model detection - Represent BMV as virtual connector - Report DP connection status nouveau: - add new exec/bind interface to support Vulkan - document some getparam ioctls - improve VRAM detection - various fixes/cleanups - workraound DPCD issues ivpu: - MMU updates - debugfs support - Support vpu4 virtio: - add sync object support atmel-hlcdc: - Support inverted pixclock polarity etnaviv: - runtime PM cleanups - hang handling fixes exynos: - use fbdev DMA helpers - fix possible NULL ptr dereference komeda: - always attach encoder omapdrm: - use fbdev DMA helpers ingenic: - kconfig regmap fixes loongson: - support display controller mediatek: - Small mtk-dpi cleanups - DisplayPort: support eDP and aux-bus - Fix coverity issues - Fix potential memory leak if vmap() fail mgag200: - minor fixes mxsfb: - support disabling overlay planes panfrost: - fix sync in IRQ handling ssd130x: - Support per-controller default resolution plus DT bindings - Reduce memory-allocation overhead - Improve intermediate buffer size computation - Fix allocation of temporary buffers - Fix pitch computation - Fix shadow plane allocation tegra: - use fbdev DMA helpers - Convert to devm_platform_ioremap_resource() - support bridge/connector - enable PM tidss: - Support TI AM625 plus DT bindings - Implement new connector model plus driver updates vkms: - improve write back support - docs fixes - support gamma LUT zynqmp-dpsub: - misc fixes" * tag 'drm-next-2023-08-30' of git://anongit.freedesktop.org/drm/drm: (1327 commits) drm/gpuva_mgr: remove unused prev pointer in __drm_gpuva_sm_map() drm/tests/drm_kunit_helpers: Place correct function name in the comment header drm/nouveau: uapi: don't pass NO_PREFETCH flag implicitly drm/nouveau: uvmm: fix unset region pointer on remap drm/nouveau: sched: avoid job races between entities drm/i915: Fix HPD polling, reenabling the output poll work as needed drm: Add an HPD poll helper to reschedule the poll work drm/i915: Fix TLB-Invalidation seqno store drm/ttm/tests: Fix type conversion in ttm_pool_test drm/msm/a6xx: Bail out early if setting GPU OOB fails drm/msm/a6xx: Move LLC accessors to the common header drm/msm/a6xx: Introduce a6xx_llc_read drm/ttm/tests: Require MMU when testing drm/panel: simple: Fix Innolux G156HCE-L01 LVDS clock Revert "Revert "drm/amdgpu/display: change pipe policy for DCN 2.0"" drm/amdgpu: Add memory vendor information drm/amd: flush any delayed gfxoff on suspend entry drm/amdgpu: skip fence GFX interrupts disable/enable for S0ix drm/amdgpu: Remove gfxoff check in GFX v9.4.3 drm/amd/pm: Update pci link speed for smu v13.0.6 ... |
||
![]() |
d68b4b6f30 |
- An extensive rework of kexec and crash Kconfig from Eric DeVolder
("refactor Kconfig to consolidate KEXEC and CRASH options"). - kernel.h slimming work from Andy Shevchenko ("kernel.h: Split out a couple of macros to args.h"). - gdb feature work from Kuan-Ying Lee ("Add GDB memory helper commands"). - vsprintf inclusion rationalization from Andy Shevchenko ("lib/vsprintf: Rework header inclusions"). - Switch the handling of kdump from a udev scheme to in-kernel handling, by Eric DeVolder ("crash: Kernel handling of CPU and memory hot un/plug"). - Many singleton patches to various parts of the tree -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZO2GpAAKCRDdBJ7gKXxA juW3AQD1moHzlSN6x9I3tjm5TWWNYFoFL8af7wXDJspp/DWH/AD/TO0XlWWhhbYy QHy7lL0Syha38kKLMXTM+bN6YQHi9AU= =WJQa -----END PGP SIGNATURE----- Merge tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - An extensive rework of kexec and crash Kconfig from Eric DeVolder ("refactor Kconfig to consolidate KEXEC and CRASH options") - kernel.h slimming work from Andy Shevchenko ("kernel.h: Split out a couple of macros to args.h") - gdb feature work from Kuan-Ying Lee ("Add GDB memory helper commands") - vsprintf inclusion rationalization from Andy Shevchenko ("lib/vsprintf: Rework header inclusions") - Switch the handling of kdump from a udev scheme to in-kernel handling, by Eric DeVolder ("crash: Kernel handling of CPU and memory hot un/plug") - Many singleton patches to various parts of the tree * tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (81 commits) document while_each_thread(), change first_tid() to use for_each_thread() drivers/char/mem.c: shrink character device's devlist[] array x86/crash: optimize CPU changes crash: change crash_prepare_elf64_headers() to for_each_possible_cpu() crash: hotplug support for kexec_load() x86/crash: add x86 crash hotplug support crash: memory and CPU hotplug sysfs attributes kexec: exclude elfcorehdr from the segment digest crash: add generic infrastructure for crash hotplug support crash: move a few code bits to setup support of crash hotplug kstrtox: consistently use _tolower() kill do_each_thread() nilfs2: fix WARNING in mark_buffer_dirty due to discarded buffer reuse scripts/bloat-o-meter: count weak symbol sizes treewide: drop CONFIG_EMBEDDED lockdep: fix static memory detection even more lib/vsprintf: declare no_hash_pointers in sprintf.h lib/vsprintf: split out sprintf() and friends kernel/fork: stop playing lockless games for exe_file replacement adfs: delete unused "union adfs_dirtail" definition ... |
||
![]() |
9730870b48 |
LoongArch: Fix hw_breakpoint_control() for watchpoints
In hw_breakpoint_control(), encode_ctrl_reg() has already encoded the MWPnCFG3_LoadEn/MWPnCFG3_StoreEn bits in info->ctrl. We don't need to add (1 << MWPnCFG3_LoadEn | 1 << MWPnCFG3_StoreEn) unconditionally. Otherwise we can't set read watchpoint and write watchpoint separately. Cc: stable@vger.kernel.org Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
656f9aec07 |
LoongArch: Ensure FP/SIMD registers in the core dump file is up to date
This is a port of commit
|
||
![]() |
c337c849ab |
LoongArch: Put the body of play_dead() into arch_cpu_idle_dead()
The initial aim is to silence the following objtool warning: arch/loongarch/kernel/process.o: warning: objtool: arch_cpu_idle_dead() falls through to next function start_thread() According to tools/objtool/Documentation/objtool.txt, this is because the last instruction of arch_cpu_idle_dead() is a call to a noreturn function play_dead(). In order to silence the warning, one simple way is to add the noreturn function play_dead() to objtool's hard-coded global_noreturns array, that is to say, just put "NORETURN(play_dead)" into tools/objtool/noreturns.h, it works well. But I noticed that play_dead() is only defined once and only called by arch_cpu_idle_dead(), so put the body of play_dead() into the caller arch_cpu_idle_dead(), then remove the noreturn function play_dead() is an alternative way which can reduce the overhead of the function call at the same time. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
a038ae7148 |
LoongArch: Return earlier in die() if notify_die() returns NOTIFY_STOP
After the call to oops_exit(), it should not panic or execute the crash kernel if the oops is to be suppressed. Suggested-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
6933c11fb5 |
LoongArch: Do not kill the task in die() if notify_die() returns NOTIFY_STOP
If notify_die() returns NOTIFY_STOP, honor the return value from the
handler chain invocation in die() and return without killing the task
as, through a debugger, the fault may have been fixed. It makes sense
even if ignoring the event will make the system unstable: by allowing
access through a debugger it has been compromised already anyway. It
makes our port consistent with x86, arm64, riscv and csky.
Commit
|
||
![]() |
55b46ff939 |
LoongArch: Replace #include <asm/export.h> with #include <linux/export.h>
Commit
|
||
![]() |
347aa8dec2 |
LoongArch: Remove unneeded #include <asm/export.h>
There is no EXPORT_SYMBOL() line there, hence #include <asm/export.h> is unneeded. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
fdebffeba8 |
Linux 6.5-rc7
-----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmTiDvweHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG7doH/2Poj73npPKPVfT3 RF8AsgvAj2pLby67rdvwTdFX+exS63SsuwtGGRfAHfGabiwmNN+oT2dLb0aY15bp nskHnFpcqQ/pfZ2i2rUCenzBX8S9QULvPidLKRaf1FcSdOzqd97Bw5oDPMtzqy/R Pm+Dzs//7fYvtm69nt6hKW4d6wXxNcg7Fk/QgoJ5Ax9vGvDuZmWXH0ZgBf/5kH04 TTPQNtVX57lf+FHugkhFEn4JbYXvN168b+LuX2PHwOeG/8AIS69Hc0vgvhHNAycT mmpUI1gWA2jfrJ2RCyyezF/6wy9Ocsp+CbPjfwjuRUxOk0XIm1+cp9Mlz/cRbMsZ f0tOTpk= =xrJp -----END PGP SIGNATURE----- BackMerge tag 'v6.5-rc7' into drm-next Linux 6.5-rc7 This is needed for the CI stuff and the msm pull has fixes in it. Signed-off-by: Dave Airlie <airlied@redhat.com> |
||
![]() |
8d539b84f1 |
nmi_backtrace: allow excluding an arbitrary CPU
The APIs that allow backtracing across CPUs have always had a way to exclude the current CPU. This convenience means callers didn't need to find a place to allocate a CPU mask just to handle the common case. Let's extend the API to take a CPU ID to exclude instead of just a boolean. This isn't any more complex for the API to handle and allows the hardlockup detector to exclude a different CPU (the one it already did a trace for) without needing to find space for a CPU mask. Arguably, this new API also encourages safer behavior. Specifically if the caller wants to avoid tracing the current CPU (maybe because they already traced the current CPU) this makes it more obvious to the caller that they need to make sure that the current CPU ID can't change. [akpm@linux-foundation.org: fix trigger_allbutcpu_cpu_backtrace() stub] Link: https://lkml.kernel.org/r/20230804065935.v4.1.Ia35521b91fc781368945161d7b28538f9996c182@changeid Signed-off-by: Douglas Anderson <dianders@chromium.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: kernel test robot <lkp@intel.com> Cc: Lecopzer Chen <lecopzer.chen@mediatek.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Pingfan Liu <kernelfans@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
![]() |
fe3015748a |
Merge 6.5-rc4 into tty-next
We need the serial/tty fixes in here as well for testing and future development. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
![]() |
83da30d73b |
LoongArch: Fix CMDLINE_EXTEND and CMDLINE_BOOTLOADER handling
On FDT systems these command line processing are already taken care of by early_init_dt_scan_chosen(). Add similar handling to the ACPI (non- FDT) code path to allow these config options to work for ACPI (non-FDT) systems too. Signed-off-by: Zhihong Dong <donmor3000@hotmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |
||
![]() |
bcb48185ed |
tty: sysrq: switch sysrq handlers from int to u8
The passed parameter to sysrq handlers is a key (a character). So change the type from 'int' to 'u8'. Let it specifically be 'u8' for two reasons: * unsigned: unsigned values come from the upper layers (devices) and the tty layer assumes unsigned on most places, and * 8-bit: as that what's supposed to be one day in all the layers built on the top of tty. (Currently, we use mostly 'unsigned char' and somewhere still only 'char'. (But that also translates to the former thanks to -funsigned-char.)) Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: "David S. Miller" <davem@davemloft.net> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Daniel Thompson <daniel.thompson@linaro.org> Cc: Douglas Anderson <dianders@chromium.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Len Brown <len.brown@intel.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Zqiang <qiang.zhang1211@gmail.com> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> # DRM Acked-by: WANG Xuerui <git@xen0n.name> # loongarch Acked-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Daniel Thompson <daniel.thompson@linaro.org> Link: https://lore.kernel.org/r/20230712081811.29004-3-jirislaby@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
![]() |
6c7f27441d |
drm-misc-next for v6.6:
UAPI Changes: * fbdev: * Make fbdev userspace interfaces optional; only leaves the framebuffer console active * prime: * Support dma-buf self-import for all drivers automatically: improves support for many userspace compositors Cross-subsystem Changes: * backlight: * Fix interaction with fbdev in several drivers * base: Convert struct platform.remove to return void; part of a larger, tree-wide effort * dma-buf: Acquire reservation lock for mmap() in exporters; part of an on-going effort to simplify locking around dma-bufs * fbdev: * Use Linux device instead of fbdev device in many places * Use deferred-I/O helper macros in various drivers * i2c: Convert struct i2c from .probe_new to .probe; part of a larger, tree-wide effort * video: * Avoid including <linux/screen_info.h> Core Changes: * atomic: * Improve logging * prime: * Remove struct drm_driver.gem_prime_mmap plus driver updates: all drivers now implement this callback with drm_gem_prime_mmap() * gem: * Support execution contexts: provides locking over multiple GEM objects * ttm: * Support init_on_free * Swapout fixes Driver Changes: * accel: * ivpu: MMU updates; Support debugfs * ast: * Improve device-model detection * Cleanups * bridge: * dw-hdmi: Improve support for YUV420 bus format * dw-mipi-dsi: Fix enable/disable of DSI controller * lt9611uxc: Use MODULE_FIRMWARE() * ps8640: Remove broken EDID code * samsung-dsim: Fix command transfer * tc358764: Handle HS/VS polarity; Use BIT() macro; Various cleanups * Cleanups * ingenic: * Kconfig REGMAP fixes * loongson: * Support display controller * mgag200: * Minor fixes * mxsfb: * Support disabling overlay planes * nouveau: * Improve VRAM detection * Various fixes and cleanups * panel: * panel-edp: Support AUO B116XAB01.4 * Support Visionox R66451 plus DT bindings * Cleanups * ssd130x: * Support per-controller default resolution plus DT bindings * Reduce memory-allocation overhead * Cleanups * tidss: * Support TI AM625 plus DT bindings * Implement new connector model plus driver updates * vkms * Improve write-back support * Documentation fixes -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmSvvRAACgkQaA3BHVML eiNpGQgAs8jq1XjN9t8jZsdgXnoCbkZyVUI2NO0HwoVwpRCLgbXp5AX5qq2oRciE TBhe4Fceh/ZsYqHTZQahnguxgRKM5JgXwbI4Z0iiOVcqasNbycaKAqipxJJ7kdo1 qPhGCbgQFVX7oIq2xjfXehh6O0SYX+R9r88X8dMJxMYv/pcLwOHG74kS040WOcQq uATgcnobOf/D8ZmlqvfKGAeTUoFo/RSR2Uhlauka58qgeUbicrTELZT2barY9d+k as6U5vv4wx2zMklTkjrlkMpAT1ZpbB9d3jGHwL27VEnjlfd3wV2bdH7Dzn9qZRf/ gn0ALg/b3u5yBWk/k7YBvijXyNcH6Q== =bBuG -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-2023-07-13' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v6.6: UAPI Changes: * fbdev: * Make fbdev userspace interfaces optional; only leaves the framebuffer console active * prime: * Support dma-buf self-import for all drivers automatically: improves support for many userspace compositors Cross-subsystem Changes: * backlight: * Fix interaction with fbdev in several drivers * base: Convert struct platform.remove to return void; part of a larger, tree-wide effort * dma-buf: Acquire reservation lock for mmap() in exporters; part of an on-going effort to simplify locking around dma-bufs * fbdev: * Use Linux device instead of fbdev device in many places * Use deferred-I/O helper macros in various drivers * i2c: Convert struct i2c from .probe_new to .probe; part of a larger, tree-wide effort * video: * Avoid including <linux/screen_info.h> Core Changes: * atomic: * Improve logging * prime: * Remove struct drm_driver.gem_prime_mmap plus driver updates: all drivers now implement this callback with drm_gem_prime_mmap() * gem: * Support execution contexts: provides locking over multiple GEM objects * ttm: * Support init_on_free * Swapout fixes Driver Changes: * accel: * ivpu: MMU updates; Support debugfs * ast: * Improve device-model detection * Cleanups * bridge: * dw-hdmi: Improve support for YUV420 bus format * dw-mipi-dsi: Fix enable/disable of DSI controller * lt9611uxc: Use MODULE_FIRMWARE() * ps8640: Remove broken EDID code * samsung-dsim: Fix command transfer * tc358764: Handle HS/VS polarity; Use BIT() macro; Various cleanups * Cleanups * ingenic: * Kconfig REGMAP fixes * loongson: * Support display controller * mgag200: * Minor fixes * mxsfb: * Support disabling overlay planes * nouveau: * Improve VRAM detection * Various fixes and cleanups * panel: * panel-edp: Support AUO B116XAB01.4 * Support Visionox R66451 plus DT bindings * Cleanups * ssd130x: * Support per-controller default resolution plus DT bindings * Reduce memory-allocation overhead * Cleanups * tidss: * Support TI AM625 plus DT bindings * Implement new connector model plus driver updates * vkms * Improve write-back support * Documentation fixes Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20230713090830.GA23281@linux-uq9g |
||
![]() |
8b0d13545b |
efi: Do not include <linux/screen_info.h> from EFI header
The header file <linux/efi.h> does not need anything from <linux/screen_info.h>. Declare struct screen_info and remove the include statements. Update a number of source files that require struct screen_info's definition. v2: * update loongarch (Jingfeng) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Sui Jingfeng <suijingfeng@loongson.cn> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Link: https://patchwork.freedesktop.org/patch/msgid/20230706104852.27451-2-tzimmermann@suse.de |
||
![]() |
cccf0c2ee5 |
Tracing updates for 6.5:
- Add new feature to have function graph tracer record the return value. Adds a new option: funcgraph-retval ; when set, will show the return value of a function in the function graph tracer. - Also add the option: funcgraph-retval-hex where if it is not set, and the return value is an error code, then it will return the decimal of the error code, otherwise it still reports the hex value. - Add the file /sys/kernel/tracing/osnoise/per_cpu/cpu<cpu>/timerlat_fd That when a application opens it, it becomes the task that the timer lat tracer traces. The application can also read this file to find out how it's being interrupted. - Add the file /sys/kernel/tracing/available_filter_functions_addrs that works just the same as available_filter_functions but also shows the addresses of the functions like kallsyms, except that it gives the address of where the fentry/mcount jump/nop is. This is used by BPF to make it easier to attach BPF programs to ftrace hooks. - Replace strlcpy with strscpy in the tracing boot code. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZJy6ixQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qnzRAPsEI2YgjaJSHnuPoGRHbrNil6pq66wY LYaLizGI4Jv9BwEAqdSdcYcMiWo1SFBAO8QxEDM++BX3zrRyVgW8ahaTNgs= =TF0C -----END PGP SIGNATURE----- Merge tag 'trace-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing updates from Steven Rostedt: - Add new feature to have function graph tracer record the return value. Adds a new option: funcgraph-retval ; when set, will show the return value of a function in the function graph tracer. - Also add the option: funcgraph-retval-hex where if it is not set, and the return value is an error code, then it will return the decimal of the error code, otherwise it still reports the hex value. - Add the file /sys/kernel/tracing/osnoise/per_cpu/cpu<cpu>/timerlat_fd That when a application opens it, it becomes the task that the timer lat tracer traces. The application can also read this file to find out how it's being interrupted. - Add the file /sys/kernel/tracing/available_filter_functions_addrs that works just the same as available_filter_functions but also shows the addresses of the functions like kallsyms, except that it gives the address of where the fentry/mcount jump/nop is. This is used by BPF to make it easier to attach BPF programs to ftrace hooks. - Replace strlcpy with strscpy in the tracing boot code. * tag 'trace-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Fix warnings when building htmldocs for function graph retval riscv: ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL tracing/boot: Replace strlcpy with strscpy tracing/timerlat: Add user-space interface tracing/osnoise: Skip running osnoise if all instances are off tracing/osnoise: Switch from PF_NO_SETAFFINITY to migrate_disable ftrace: Show all functions with addresses in available_filter_functions_addrs selftests/ftrace: Add funcgraph-retval test case LoongArch: ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL x86/ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL arm64: ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL tracing: Add documentation for funcgraph-retval and funcgraph-retval-hex function_graph: Support recording and printing the return value of function fgraph: Add declaration of "struct fgraph_ret_regs" |
||
![]() |
112e7e2151 |
LoongArch changes for v6.5
1, Preliminary ClangBuiltLinux enablement; 2, Add support to clone a time namespace; 3, Add vector extensions support; 4, Add SMT (Simultaneous Multi-Threading) support; 5, Support dbar with different hints; 6, Introduce hardware page table walker; 7, Add jump-label implementation; 8, Add rethook and uprobes support; 9, Some bug fixes and other small changes. -----BEGIN PGP SIGNATURE----- iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmSdmI4WHGNoZW5odWFj YWlAa2VybmVsLm9yZwAKCRAChivD8uImeoeDEACd+KFZnQrX6fwpohuxWgQ46YSk bmKRnVCVg6jg/yL99WTHloaubMbyncgNL7YNvCRmuXcTQdjFP1zFb3q3ZqxTkaOT Kg9EO4R5H8U2Wolz3uqcvTBbPxv6bwDor1gBWzVo8RTO4S7gYt5tLS7pvLiYPWzp Jhgko2AHE/Y02Qqg00ARLIzDDLMm9vR5Gdmpj2jhl8wMHMNaqMW5E0r7XaiFoav0 G1PjIAk+0LIj9QHYUm5e0kcXHh0KzgUMG0LUDawbAanZn2r1KhAk0ouwXX/eu9J7 NQQRDi1Z02pTI5X3V1VAf+0O7RGnGGnWb/r2K76nr7lZNp88RJfbLtgBM01pzw2U NTnIAP7cAomNDaBglzAuS17yeFTS953nxaQlR8/t3UefP+fHmiIJyOOlxrXFMwVM jOfW4JAIkcl5DD/8l9lU1t91+zyKMrjsv8IrlGPW1sLsr/UOQjBajdHwnH8veC41 mL+xjiMb51g33JDRDAl6mloxXms9LvcNzbnKSwqVO1i23GkN1iHJmbq66Teut07C 7AH2qM3tSAuiXmNF1ntvodK1ukLS8moOiP8ZuuHfS13rr7gxPyRAvYdBiDaNEhF2 gCYYpIcaly+5rSf6wvDXgl/s3tS4o07AzDfpttyH5jYnY80nVj7CgS8vUU/mg1lW QpapBnBHU8wrz+eBqQ== =3gt3 -----END PGP SIGNATURE----- Merge tag 'loongarch-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch updates from Huacai Chen: - preliminary ClangBuiltLinux enablement - add support to clone a time namespace - add vector extensions support - add SMT (Simultaneous Multi-Threading) support - support dbar with different hints - introduce hardware page table walker - add jump-label implementation - add rethook and uprobes support - some bug fixes and other small changes * tag 'loongarch-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (28 commits) LoongArch: Remove five DIE_* definitions in kdebug.h LoongArch: Add uprobes support LoongArch: Use larch_insn_gen_break() for kprobes LoongArch: Add larch_insn_gen_break() to generate break insns LoongArch: Check for AMO instructions in insns_not_supported() LoongArch: Move three functions from kprobes.c to inst.c LoongArch: Replace kretprobe with rethook LoongArch: Add jump-label implementation LoongArch: Select HAVE_DEBUG_KMEMLEAK to support kmemleak LoongArch: Export some arch-specific pm interfaces LoongArch: Introduce hardware page table walker LoongArch: Support dbar with different hints LoongArch: Add SMT (Simultaneous Multi-Threading) support LoongArch: Add vector extensions support LoongArch: Add support to clone a time namespace Makefile: Add loongarch target flag for Clang compilation LoongArch: Mark Clang LTO as working LoongArch: Include KBUILD_CPPFLAGS in CHECKFLAGS invocation LoongArch: vDSO: Use CLANG_FLAGS instead of filtering out '--target=' LoongArch: Tweak CFLAGS for Clang compatibility ... |
||
![]() |
19bc6cb640 |
LoongArch: Add uprobes support
Uprobes is the user-space counterpart to kprobes, this patch adds uprobes support for LoongArch. Here is a simple example with CONFIG_UPROBE_EVENTS=y: # cat test.c #include <stdio.h> int add(int a, int b) { return a + b; } int main() { return add(2, 7); } # gcc test.c -o /tmp/test # nm /tmp/test | grep add 0000000120004194 T add # cd /sys/kernel/debug/tracing # echo > uprobe_events # echo "p:myuprobe /tmp/test:0x4194 %r4 %r5" > uprobe_events # echo "r:myuretprobe /tmp/test:0x4194 %r4" >> uprobe_events # echo 1 > events/uprobes/enable # echo 1 > tracing_on # /tmp/test # cat trace ... # TASK-PID CPU# ||||| TIMESTAMP FUNCTION # | | | ||||| | | test-1060 [001] DNZff 1015.770620: myuprobe: (0x120004194) arg1=0x2 arg2=0x7 test-1060 [001] DNZff 1015.770930: myuretprobe: (0x1200041f0 <- 0x120004194) arg1=0x9 Tested-by: Jeff Xie <xiehuan09@gmail.com> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> |