mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
synced 2025-08-26 21:52:20 +00:00
loongarch-next
48992 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
![]() |
0a9ee9ce49 |
- Make sure sanity checks down in the mutex lock path happen on the correct
type of task so that they don't trigger falsely - Use the write unsafe user access pairs when writing a futex value to prevent an error on PowerPC which does user read and write accesses differently -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmihlaAACgkQEsHwGGHe VUrV7g/+Kx4n1TQuGnk4kd5h5q0uls8mgeFddYjv6BgVxcaWq7Tzv7XMJ5hvWEqp P/+Zt43Sv9sd7i+PhoFD2Lr+EDYx8c0Lp08/LH0zsgKIA2Ai8ntJHJcb3se3Kxr5 yV23d0tkrCijcB58OL1xncm96Lp3XoXyTz8b0ahKDNG9mS8F/9XK9GgmG9OqKXDg 7T8Vx5NKt0YrvAWwvsQQlTUTcQ4a4O/UMwJmgEbvqHn0WwQISRxx/TE6wYuIwWAj pbrN5kzDsZ6tA07h48NWnkFOEeqsQgbbKDkWvYYRYBrVzEATQBpBWfSQ0HsaqPmc 1Mk5Zs+J5UFhHx7Yw348JVqw5Fl4VDT4Oi4AoIzjBym3c73nrNfZzESRsf4dES5Q DBsgTb0tjEZcR7MrWWErYu1LXw1qP5Ib39qLDVIvQQ4HomctSUuXVTIRL9qJvaCK aCPt2Ivkhj3wItZSeTfzLTXbWE9lP4AhuBpJ4ALHRbOaCRLNfK9ZzLfOUyKePUvx s3j7iPubfS5/lw192z5weLzEE4e8E7wIxSkIQNKLQFI/kr5YwKfwEO5Zm+UMfH5j m+Hl7YKS0nT2IlFbRel2cSkw4MDaEJjgahMzbp+D0p+xV2H4KjY4nLoarwsuoP8D GxLAOmRW1nzqj3QHIWsBF9iBxkO89lshWOgGxhUbhtywNqxSV6M= =q1tX -----END PGP SIGNATURE----- Merge tag 'locking_urgent_for_v6.17_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Borislav Petkov: - Make sure sanity checks down in the mutex lock path happen on the correct type of task so that they don't trigger falsely - Use the write unsafe user access pairs when writing a futex value to prevent an error on PowerPC which does user read and write accesses differently * tag 'locking_urgent_for_v6.17_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking: Fix __clear_task_blocked_on() warning from __ww_mutex_wound() path futex: Use user_write_access_begin/_end() in futex_put_value() |
||
![]() |
63467137ec |
Including fixes from Netfilter and IPsec.
Current release - regressions: - netfilter: nft_set_pipapo: - don't return bogus extension pointer - fix null deref for empty set Current release - new code bugs: - core: prevent deadlocks when enabling NAPIs with mixed kthread config - eth: netdevsim: Fix wild pointer access in nsim_queue_free(). Previous releases - regressions: - page_pool: allow enabling recycling late, fix false positive warning - sched: ets: use old 'nbands' while purging unused classes - xfrm: - restore GSO for SW crypto - bring back device check in validate_xmit_xfrm - tls: handle data disappearing from under the TLS ULP - ptp: prevent possible ABBA deadlock in ptp_clock_freerun() - eth: bnxt: fill data page pool with frags if PAGE_SIZE > BNXT_RX_PAGE_SIZE - eth: hv_netvsc: fix panic during namespace deletion with VF Previous releases - always broken: - netfilter: fix refcount leak on table dump - vsock: do not allow binding to VMADDR_PORT_ANY - sctp: linearize cloned gso packets in sctp_rcv - eth: hibmcge: fix the division by zero issue - eth: microchip: fix KSZ8863 reset problem Signed-off-by: Paolo Abeni <pabeni@redhat.com> -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmidxhgSHHBhYmVuaUBy ZWRoYXQuY29tAAoJECkkeY3MjxOkPckP/AhJ0kARPgo72OhElbW6KvkHhXVNUTJb 6m15j9Z8ybQPBupSorxUEBx7pxczv/GBloN1xoJqUY/p7B6kLO2HDOpaReNFjZvF J9hPl6ZF6CWzwgfcUfwI3UB1zhALKyHDClclfd8FBoKjAYXCrZXuPv/AV4oqYsA1 y7g24zpI76Cu+M+Nf5YhrlIVSmQ1/DXX8gifdcHFYnAKmCn7KxNv2lwvm2/yE2lL 9/Xl/D1cG/BiAaCUUXR4BP8RU5gdW72+lM3qmC/QFO/7jcBaoE89Y9Anona8p0PQ dqDerHd0GDUH9QA6bht3asCS+IW+Zo2gf25o53OzlYIMAxDmEZLUBCwetJhvNJBq DUQ6agzfNRxsCnlOc4JhMOqNr7rdU7d+9c9KuZWA/m8KRWdlvTYGJd/qzSlTWOhq s9228dl+4oTb9Mnq8Bqafi42+TImeOyFRW9ZgF8ptjlF0l/lyv6moIrRCmVXppRZ awABNDdG+i004XmAOAeOhjbUT7clLkLr+KEnsfH16qCa2o3dM6rlhvWYp2sucVJf SyRvMdz5VqMLgruefpQS/DuK52UklpRawgvgngzU6UDYQUaxQKToeusMjRU7xUnW hVI1y7/oNH6+r7Zr/iLTLKRR007B+RVC7VSbeMpxmAW+n6puMb+z7RnrJlnFapGM qqqtk2/jItuK =Ydk1 -----END PGP SIGNATURE----- Merge tag 'net-6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from Netfilter and IPsec. Current release - regressions: - netfilter: nft_set_pipapo: - don't return bogus extension pointer - fix null deref for empty set Current release - new code bugs: - core: prevent deadlocks when enabling NAPIs with mixed kthread config - eth: netdevsim: Fix wild pointer access in nsim_queue_free(). Previous releases - regressions: - page_pool: allow enabling recycling late, fix false positive warning - sched: ets: use old 'nbands' while purging unused classes - xfrm: - restore GSO for SW crypto - bring back device check in validate_xmit_xfrm - tls: handle data disappearing from under the TLS ULP - ptp: prevent possible ABBA deadlock in ptp_clock_freerun() - eth: - bnxt: fill data page pool with frags if PAGE_SIZE > BNXT_RX_PAGE_SIZE - hv_netvsc: fix panic during namespace deletion with VF Previous releases - always broken: - netfilter: fix refcount leak on table dump - vsock: do not allow binding to VMADDR_PORT_ANY - sctp: linearize cloned gso packets in sctp_rcv - eth: - hibmcge: fix the division by zero issue - microchip: fix KSZ8863 reset problem" * tag 'net-6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits) net: usb: asix_devices: add phy_mask for ax88772 mdio bus net: kcm: Fix race condition in kcm_unattach() selftests: net/forwarding: test purge of active DWRR classes net/sched: ets: use old 'nbands' while purging unused classes bnxt: fill data page pool with frags if PAGE_SIZE > BNXT_RX_PAGE_SIZE netdevsim: Fix wild pointer access in nsim_queue_free(). net: mctp: Fix bad kfree_skb in bind lookup test netfilter: nf_tables: reject duplicate device on updates ipvs: Fix estimator kthreads preferred affinity netfilter: nft_set_pipapo: fix null deref for empty set selftests: tls: test TCP stealing data from under the TLS socket tls: handle data disappearing from under the TLS ULP ptp: prevent possible ABBA deadlock in ptp_clock_freerun() ixgbe: prevent from unwanted interface name changes devlink: let driver opt out of automatic phys_port_name generation net: prevent deadlocks when enabling NAPIs with mixed kthread config net: update NAPI threaded config even for disabled NAPIs selftests: drv-net: don't assume device has only 2 queues docs: Fix name for net.ipv4.udp_child_hash_entries riscv: dts: thead: Add APB clocks for TH1520 GMACs ... |
||
![]() |
21924af67d |
locking: Fix __clear_task_blocked_on() warning from __ww_mutex_wound() path
The __clear_task_blocked_on() helper added a number of sanity
checks ensuring we hold the mutex wait lock and that the task
we are clearing blocked_on pointer (if set) matches the mutex.
However, there is an edge case in the _ww_mutex_wound() logic
where we need to clear the blocked_on pointer for the task that
owns the mutex, not the task that is waiting on the mutex.
For this case the sanity checks aren't valid, so handle this
by allowing a NULL lock to skip the additional checks.
K Prateek Nayak and Maarten Lankhorst also pointed out that in
this case where we don't hold the owner's mutex wait_lock, we
need to be a bit more careful using READ_ONCE/WRITE_ONCE in both
the __clear_task_blocked_on() and __set_task_blocked_on()
implementations to avoid accidentally tripping WARN_ONs if two
instances race. So do that here as well.
This issue was easier to miss, I realized, as the test-ww_mutex
driver only exercises the wait-die class of ww_mutexes. I've
sent a patch[1] to address this so the logic will be easier to
test.
[1]: https://lore.kernel.org/lkml/20250801023358.562525-2-jstultz@google.com/
Fixes:
|
||
![]() |
c0a23bbc98 |
ipvs: Fix estimator kthreads preferred affinity
The estimator kthreads' affinity are defined by sysctl overwritten
preferences and applied through a plain call to the scheduler's affinity
API.
However since the introduction of managed kthreads preferred affinity,
such a practice shortcuts the kthreads core code which eventually
overwrites the target to the default unbound affinity.
Fix this with using the appropriate kthread's API.
Fixes:
|
||
![]() |
dfb36e4a8d |
futex: Use user_write_access_begin/_end() in futex_put_value()
Commit |
||
![]() |
61399e0c54 |
rcu: Fix racy re-initialization of irq_work causing hangs
RCU re-initializes the deferred QS irq work everytime before attempting
to queue it. However there are situations where the irq work is
attempted to be queued even though it is already queued. In that case
re-initializing messes-up with the irq work queue that is about to be
handled.
The chances for that to happen are higher when the architecture doesn't
support self-IPIs and irq work are then all lazy, such as with the
following sequence:
1) rcu_read_unlock() is called when IRQs are disabled and there is a
grace period involving blocked tasks on the node. The irq work
is then initialized and queued.
2) The related tasks are unblocked and the CPU quiescent state
is reported. rdp->defer_qs_iw_pending is reset to DEFER_QS_IDLE,
allowing the irq work to be requeued in the future (note the previous
one hasn't fired yet).
3) A new grace period starts and the node has blocked tasks.
4) rcu_read_unlock() is called when IRQs are disabled again. The irq work
is re-initialized (but it's queued! and its node is cleared) and
requeued. Which means it's requeued to itself.
5) The irq work finally fires with the tick. But since it was requeued
to itself, it loops and hangs.
Fix this with initializing the irq work only once before the CPU boots.
Fixes:
|
||
![]() |
b96ddbc5c8 |
- Remove an obsolete comment and fix spelling
-----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmiXmjUACgkQEsHwGGHe VUohIg/9FA8vG1+KBjC6njb8M4tmKiIpS2/OLfdurdp0MFCnxgsAifHbB4Lfh0Aq RQO/d3kQ7MrdWS8vuR7AHBTq6ceGuKwIpcikJOvloohlTcHslxpxXDedKdTeSx4L CgV88uZbiYppZwFPafC/KOEkMKb+gd/WhnzHQmhIzMk/Cqtv55qaATTdk4Qmai02 rPcLoS1Zhcs+t/4z9OleELZE9dcNTK2ZpOhHe+ygWZjWQizXAA8hJCyr8zc8UPhu +JLj5y2QofFa9XqADk54HzGidR/KKM9or8dsg2X1i4vO8UL26cKHry4NOJOUCblV AJptZzFZ3BU0kOyr0RVgT/dCSupl/ujGyiB9B3IuvxYqPAAt11KTbRrY0SoDd2w1 7XGkm4PQiUikA0VFBKjsoeaJg+portp/TJ4O9r7etRgy+qgguLzZZzaZw1zlGEon juPlcTpdEwZND7XtnjtG9hIdJlih8NuJwKWSlmRuoWtzyCR5eGEnxOaD8pgTflr8 MR5mQlkOgoXQjzjn6D+bmlB4qdst8aP3A8OY3uuj52xfGoaNp7FsP4RL//nGpdqg p/sdAOhGGyhs6c3/XrhRT/8YXog6cEJXelXXL1jzIjuiQX3yfR1//nWl7+ftKDvB EOR30na1iZ6MDGK7xJiS5B2wyc1my7AQj/mi/+5KopP8APMmqhg= =lrRr -----END PGP SIGNATURE----- Merge tag 'smp_urgent_for_v6.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull smp fixes from Borislav Petkov: - Remove an obsolete comment and fix spelling * tag 'smp_urgent_for_v6.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: cpu: Remove obsolete comment from takedown_cpu() smp: Fix spelling in on_each_cpu_cond_mask()'s doc-comment |
||
![]() |
7d2fed1f3c |
- Fix a wrong ioremap size in mvebu-gicp
- Remove yet another compile-test case for a driver which needs an additional dependency - Fix a lock inversion scenario in the IRQ unit test suite - Remove an impossible flag situation in gic-v5 - Do not iounmap resources in gic-v5 which are managed by devm - Make sure stale, left-over interrupts in mvebu-gicp are cleared on driver init - Fix a reference counting mishap in msi-lib - Fix a dereference-before-null-ptr-check case in the riscv-imsic irqchip driver -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmiXkZYACgkQEsHwGGHe VUoiaw/7B/0wjf6CZdrzFW+gSBgXMA4VghVcEUySQTeE66SL5dhOM5Pw2w8z5ow5 tn2/cMZg7aAyQUt9XlPzWhHwU5Pu06Jtd32LtDX0sqf54byYybQHtoUBnNcPY7FR Anf7jr0WnXNHe+qQcccfVDtEYzF9R/HIkmp63f348wP3aS5RTbFPWk2cOPdAXOAY TeWordoDXjtzix+Ro8zk2WaD0h9oDLdHgg4eww3FUVNBvKiXIhkV7bb70t85f+gA f0eF6LJ0318e6UHwVhU+OgYzdD9uXMNTrZDKeZq/xYYIytVlCfwvKDMWFZC11ltC 89BMyghLSRkI/oV/2/gXGICRmGp7OEn6HzD/vpPv30Zfeyj0e8O/rat2uZCifrbL 9RJ4sXMJCOJUoHD3t/e7i1TDqsmVF9CdTbgwqQt6ANtypJrkVkIBqO4QvcNu8qQ5 c6lt5Y7ob+owpIhUoBmxCUaZz19wZAwRcOIkAZwoWXTrvfYjD28AveQOqHpOBvvQ WQY3pvGkvgY9vmWbIeshWhZzb+kX5Wn5WI4C7Ul5cng2WUfo1pkI6U8u9dmv0D7y LBVjnj/rXTWR0G9OyI3R9WqrGmrCnKOPMpv98lsETctBZxTrSStbOWe4dOBTe1Zh Jq1KRWZ4UE5SauOr8R/59y21E4HulVVH2WK7TUhM8paPkeYbw3o= =VUGL -----END PGP SIGNATURE----- Merge tag 'irq_urgent_for_v6.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Borislav Petkov: - Fix a wrong ioremap size in mvebu-gicp - Remove yet another compile-test case for a driver which needs an additional dependency - Fix a lock inversion scenario in the IRQ unit test suite - Remove an impossible flag situation in gic-v5 - Do not iounmap resources in gic-v5 which are managed by devm - Make sure stale, left-over interrupts in mvebu-gicp are cleared on driver init - Fix a reference counting mishap in msi-lib - Fix a dereference-before-null-ptr-check case in the riscv-imsic irqchip driver * tag 'irq_urgent_for_v6.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/mvebu-gicp: Use resource_size() for ioremap() irqchip: Build IMX_MU_MSI only on ARM genirq/test: Resolve irq lock inversion warnings irqchip/gic-v5: Remove IRQD_RESEND_WHEN_IN_PROGRESS for ITS IRQs irqchip/gic-v5: iwb: Fix iounmap probe failure path irqchip/mvebu-gicp: Clear pending interrupts on init irqchip/msi-lib: Fix fwnode refcount in msi_lib_irq_domain_select() irqchip/riscv-imsic: Don't dereference before NULL pointer check |
||
![]() |
8e8f6b635f |
- Prevent a futex hash leak due to different mm lifetimes
-----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmiXjOUACgkQEsHwGGHe VUpEog//eQ60cSh6pzFJ6yypmPmp1/Tk7XHQH9s9V4JzrsbFoTCwqm2h3NE27Pfu INNfdiZ76Paf5fRkl/pITGwW11Svn0w42xWwM8BDeZMv/yAq8dXa/QKaABVJa+Hd PujrWUno3H/qck0O50Fq9Y6nbE0lHBczHxKsaGdrARxra91JpAezsgwkN7jVO9Kk 6/2Gb9Hk2buWvG+eLmM4JwNIvaxgbMttw93tfA7DthPyQCI0dCPINmJ22fXhVZLI tVmkid9MqGOjz4789z0AN+pF+VfEcejGSy29zzCk5NrrgfgK0QSoZ0JpvUP5vtUh Opoez017+3sKe3REk+0j+PGdttmE48Zhl7WDgkAIZqOOEwWiVBoqbk9gCIGiJKan x9BRjcP3p1TH1RsS6OsHA+tbf+ZlGhOKQNRNeWmisteiOcDiuRYY8NE7F5Q3/mBQ N0KnlzAo2m2uTwJ4r5uvEOAIcCvB+EtNn2SYBkCxMpTCRzT65/WEQjgqmLDHR6cP LSFOfo91E210TwU/ZospjXxT3NhntoWRQVbvbbO5QS4gr3Sq6MCIofmIrjfJNqq6 AoVnrM+8QAOp+pOaoPwSIcwp68uhI4L6SXAZtP0+xScwv6UUUy1KUv9TMMNbZ4/4 lh9JYYIdfh3rtOlZmdK4+KoGBQ19YZ/qc9tXB8/oqadrQWFBOic= =Xpyt -----END PGP SIGNATURE----- Merge tag 'locking_urgent_for_v6.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fix from Borislav Petkov: - Prevent a futex hash leak due to different mm lifetimes * tag 'locking_urgent_for_v6.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Move futex cleanup to __mmdrop() |
||
![]() |
c30a13538d |
bpf-fixes
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmiWKasACgkQ6rmadz2v bToTMhAAmAvsOqfaoYR0LfmuwPZj9F0bqg9voRRgHNnf3j1Iz0L2po65pdbJGQPs mf+KivbeWsPic8AD0mnbgPLgPXNYsunxsQ+k5LfShbtECZ+0QqUmT27EAIhsHWTr ghWmMZ6E3kq7bfsTaNAtJNjiovHCr+swqpZoJpAlwz1CxNhESDIjdDzGtarPNxHw s3ioI9gvPuH4x5vq7SblyfgeYJWNk0ZELISmdNEud1nBrdR91Ji2Iqxwkky0qhKI 4kQzN9MmjsIl4dIa1DMlQsMpKkC/sHtoFW2VEUkuHUwP+wa/YIn7iAvD7xc2k3nS 2fLzlpfZnSIdu+2piu9A7RoU4VI/XyC+tud6WmDYGN/xKgA0N7BmoBQjhgptj+uH BitpTGmFPoDuRiQDKHiGPTP6Wc4djt0Ipp6hFr89Q2ywCUCrylRafuJDpOn6xwzG VZH6yK1cMk2kIa8jArGjMtvcWiMbaMn6GwxjQaPC+Syhy6dmAWVmcyEKYXki8GAc N+gl0bBGHEGhcCCuyzxFAOLKx81CDp5C+gfoCzt3lDAXztrd1PeaZV+n9c6jtgVT VPO9TahYqnsfuLPBVDTCGSvVsTP2Fh2fIcdsEJvcF1EISBiwYBw6FGpDi4WlbOfm pisRchdT0YV6vg0V+f2U6mL5UbZr7v5w5PAfh7zXCqvfppoPD8U= =Qt6J -----END PGP SIGNATURE----- Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Pull bpf fixes from Alexei Starovoitov: - Fix memory leak of bpf_scc_info objects (Eduard Zingerman) - Fix a regression in the 'perf' tool caused by moving UID filtering to BPF (Ilya Leoshkevich) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: perf bpf-filter: Enable events manually libbpf: Add the ability to suppress perf event enablement bpf: Fix memory leak of bpf_scc_info objects |
||
![]() |
da274853fe |
cpu: Remove obsolete comment from takedown_cpu()
takedown_cpu() has a comment about "all preempt/rcu users must observe !cpu_active()" which is kind of meaningless in this function. This comment was originally introduced by commit |
||
![]() |
5b65258229 |
genirq/test: Resolve irq lock inversion warnings
irq_shutdown_and_deactivate() is normally called with the descriptor lock
held, and interrupts disabled. Nested a few levels down, it grabs the
global irq_resend_lock. Lockdep rightfully complains when interrupts are
not disabled:
CPU0 CPU1
---- ----
lock(irq_resend_lock);
local_irq_disable();
lock(&irq_desc_lock_class);
lock(irq_resend_lock);
<Interrupt>
lock(&irq_desc_lock_class);
...
_raw_spin_lock+0x2b/0x40
clear_irq_resend+0x14/0x70
irq_shutdown_and_deactivate+0x29/0x80
irq_shutdown_depth_test+0x1ce/0x600
kunit_try_run_case+0x90/0x120
Grab the descriptor lock and disable interrupts, to resolve the
problem.
Fixes:
|
||
![]() |
a530a36bb5 |
Kbuild updates for v6.17
- Fix a shortcut key issue in menuconfig - Fix missing rebuild of kheaders - Sort the symbol dump generated by gendwarfsyms - Support zboot extraction in scripts/extract-vmlinux - Migrate gconfig to GTK 3 - Add TAR variable to allow overriding the default tar command - Hand over Kbuild maintainership -----BEGIN PGP SIGNATURE----- iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmiSr38VHG1hc2FoaXJv eUBrZXJuZWwub3JnAAoJED2LAQed4NsGcTIP/RCVr/OEJgVXg8dOBNhNuhGoidnM 2uqRcaza68tOSegpFGcfd9bhO1TCR/O5SL117TS8UEx4f9ge7gk/+XVZC8i1268m 9+V6eJd8QI34nB/EezMnrhvFmn2kC0UMuSldZYQJ2cReLjtapBP2xBtWnxi+Zyyw +FjdHwQln7E8UaB/gMqh9KVVOytX4NIUUZEA/78nd4eJaJbLxJ/5ztAxGLB//bXI Rr6bjAeOmIfRWS9QWnGzNzHmzp4SSmU+/gdLXyaWlmoVjeut8O+BJXvQRNfswk8K JXmk6uZUx6CNheCca2RaM0i6XAArkpOQc/7v7Ul/rSriTxdxAVUghjk0fNrXJGvQ kBjewOTUXg8f4xhuPAL3nkWmCh0s0tV3Q0EneAaJuUck5yRkW0bxxKa74h6ji2q8 8RsNS5Mq0yYiR1gmCqhEmTN6/wDamSpGkHT0k6ukipixjCr5DP2QFP/xT3d7qSwc W6msliXgBmY/ZrGoJXy4zjmu5vxKfAes+7JLBDxUKjdItC818qwSPf+nbvVIdJqb K4/hcNDuYPKd/8N8YapPjbGfTBk9Xqp74ez4xg5XNIBPS+cE5k/mePletbCzkzC5 vzjoNgVUmpPGPdDaBk+S4jSEzWUi575aQx590OfdoXsBt4CQVHHk4PEM1Qh5cWnZ m6tx1oqfruovU6Gx =MtyJ -----END PGP SIGNATURE----- Merge tag 'kbuild-v6.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: "This is the last pull request from me. I'm grateful to have been able to continue as a maintainer for eight years. From the next cycle, Nathan and Nicolas will maintain Kbuild. - Fix a shortcut key issue in menuconfig - Fix missing rebuild of kheaders - Sort the symbol dump generated by gendwarfsyms - Support zboot extraction in scripts/extract-vmlinux - Migrate gconfig to GTK 3 - Add TAR variable to allow overriding the default tar command - Hand over Kbuild maintainership" * tag 'kbuild-v6.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (92 commits) MAINTAINERS: hand over Kbuild maintenance kheaders: make it possible to override TAR kbuild: userprogs: use correct linker when mixing clang and GNU ld kconfig: lxdialog: replace strcpy() with strncpy() in inputbox.c kconfig: lxdialog: replace strcpy with snprintf in print_autowrap kconfig: gconf: refactor text_insert_help() kconfig: gconf: remove unneeded variable in text_insert_msg kconfig: gconf: use hyphens in signals kconfig: gconf: replace GtkImageMenuItem with GtkMenuItem kconfig: gconf: Fix Back button behavior kconfig: gconf: fix single view to display dependent symbols correctly scripts: add zboot support to extract-vmlinux gendwarfksyms: order -T symtypes output by name gendwarfksyms: use preferred form of sizeof for allocation kconfig: qconf: confine {begin,end}Group to constructor and destructor kconfig: qconf: fix ConfigList::updateListAllforAll() kconfig: add a function to dump all menu entries in a tree-like format kconfig: gconf: show GTK version in About dialog kconfig: gconf: replace GtkHPaned and GtkVPaned with GtkPaned kconfig: gconf: replace GdkColor with GdkRGBA ... |
||
![]() |
adf12a394c |
Perf fixes for perf_mmap() reference counting to prevent potential
reference count leaks which are caused by: - VMA splits, which change the offset or size of a mapping, which causes perf_mmap_close() to ignore the unmap or unmap the wrong buffer. - Several internal issues of perf_mmap(), which can cause reference count leaks in the perf mmap, corrupt accounting or cause leaks in perf drivers. The main fix is to prevent VMA splits by implementing the [may_]split() callback for vm operations. The other issues are addressed by rearranging code, early returns on failure and invocation of cleanups. Also provide a selftest to validate the fixes. The reference counting should be converted to refcount_t, but that requires larger refactoring of the code and will be done once these fixes are upstream. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmiSd0gTHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoWCbD/9niCHArXIWfdIp1K2ZwM2+tsCU7Ntl XnkfnaPoCVpQQXcAN11WIEK6DlwaiY2sfOco1cpEqr5px0Hv5qzM+OGm0r3KQBpr 9d1Ox8+tqUOIxw5zsKFXpw6/WX4zzdxGHjzU/T0H+fPP3jB+hj/Q3hOk4u10+f3v 7f3Q4sOfkmOauQez2HEaDwUip6lLZFEaf8IK0tYEkOJxcStwsC2TnLvmlEmOA0Yx PnAXOicrpbe9d8KNq6VxU0OtV6XAT+YJtf9T5cTNR1NhIkqyaMwbdzkuh9RZgxAE oRblaAHubAUMmv2DgYOTUGoYivsXY13XOtjfXdLmxt19HmkSOyaCFO8nJgjAPOL7 gxGXS7zKxhNac7bfVgBANPUHOOWtV30H5CqYOzxaPlQs8gzOsl8l+NDZuwVlP4P6 CMdN3rz3eMpnMpuzy0mmUJhowytKDA8N81yamCP5L9hWWZVfp4boZfIXMMLtJdQa nv/T2HxLL8HweFrI6Wd7YDhXMKhsNDAqJvtSv0z+5U+PWWd9rcOFsgS9sUHIiJuB pLvNwLxPntzF6qw4qIp1W1AHfLz2VF/tR8WyINpEZe4oafP1TccI+aLQdIJ/vVqp gQ0bCTiZb16IGsHruu4L9C0fe40TdSuiwEK5X9Opk4aP11oagsqQ+GxzssvQZnZc Jx2XqouabWBBvQ== =B9L/ -----END PGP SIGNATURE----- Merge tag 'perf-fixes-27504' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git Pull perf fixes from Thomas Gleixner: "Perf fixes for perf_mmap() reference counting to prevent potential reference count leaks which are caused by: - VMA splits, which change the offset or size of a mapping, which causes perf_mmap_close() to ignore the unmap or unmap the wrong buffer. - Several internal issues of perf_mmap(), which can cause reference count leaks in the perf mmap, corrupt accounting or cause leaks in perf drivers. The main fix is to prevent VMA splits by implementing the [may_]split() callback for vm operations. The other issues are addressed by rearranging code, early returns on failure and invocation of cleanups. Also provide a selftest to validate the fixes. The reference counting should be converted to refcount_t, but that requires larger refactoring of the code and will be done once these fixes are upstream" * tag 'perf-fixes-27504' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git: selftests/perf_events: Add a mmap() correctness test perf/core: Prevent VMA split of buffer mappings perf/core: Handle buffer mapping fail correctly in perf_mmap() perf/core: Exit early on perf_mmap() fail perf/core: Don't leak AUX buffer refcount on allocation failure perf/core: Preserve AUX buffer allocation failure result |
||
![]() |
73d210e9fa |
kheaders: make it possible to override TAR
Commit
|
||
![]() |
b024d7b56c |
perf/core: Prevent VMA split of buffer mappings
The perf mmap code is careful about mmap()'ing the user page with the
ringbuffer and additionally the auxiliary buffer, when the event supports
it. Once the first mapping is established, subsequent mapping have to use
the same offset and the same size in both cases. The reference counting for
the ringbuffer and the auxiliary buffer depends on this being correct.
Though perf does not prevent that a related mapping is split via mmap(2),
munmap(2) or mremap(2). A split of a VMA results in perf_mmap_open() calls,
which take reference counts, but then the subsequent perf_mmap_close()
calls are not longer fulfilling the offset and size checks. This leads to
reference count leaks.
As perf already has the requirement for subsequent mappings to match the
initial mapping, the obvious consequence is that VMA splits, caused by
resizing of a mapping or partial unmapping, have to be prevented.
Implement the vm_operations_struct::may_split() callback and return
unconditionally -EINVAL.
That ensures that the mapping offsets and sizes cannot be changed after the
fact. Remapping to a different fixed address with the same size is still
possible as it takes the references for the new mapping and drops those of
the old mapping.
Fixes:
|
||
![]() |
f74b9f4ba6 |
perf/core: Handle buffer mapping fail correctly in perf_mmap()
After successful allocation of a buffer or a successful attachment to an
existing buffer perf_mmap() tries to map the buffer read only into the page
table. If that fails, the already set up page table entries are zapped, but
the other perf specific side effects of that failure are not handled. The
calling code just cleans up the VMA and does not invoke perf_mmap_close().
This leaks reference counts, corrupts user->vm accounting and also results
in an unbalanced invocation of event::event_mapped().
Cure this by moving the event::event_mapped() invocation before the
map_range() call so that on map_range() failure perf_mmap_close() can be
invoked without causing an unbalanced event::event_unmapped() call.
perf_mmap_close() undoes the reference counts and eventually frees buffers.
Fixes:
|
||
![]() |
07091aade3 |
perf/core: Exit early on perf_mmap() fail
When perf_mmap() fails to allocate a buffer, it still invokes the
event_mapped() callback of the related event. On X86 this might increase
the perf_rdpmc_allowed reference counter. But nothing undoes this as
perf_mmap_close() is never called in this case, which causes another
reference count leak.
Return early on failure to prevent that.
Fixes:
|
||
![]() |
5468c0fbcc |
perf/core: Don't leak AUX buffer refcount on allocation failure
Failure of the AUX buffer allocation leaks the reference count.
Set the reference count to 1 only when the allocation succeeds.
Fixes:
|
||
![]() |
54473e0ef8 |
perf/core: Preserve AUX buffer allocation failure result
A recent overhaul sets the return value to 0 unconditionally after the
allocations, which causes reference count leaks and corrupts the user->vm
accounting.
Preserve the AUX buffer allocation failure return value, so that the
subsequent code works correctly.
Fixes:
|
||
![]() |
da23ea194d |
Significant patch series in this pull request:
- The 4 patch series "mseal cleanups" from Lorenzo Stoakes erforms some mseal cleaning with no intended functional change. - The 3 patch series "Optimizations for khugepaged" from David Hildenbrand improves khugepaged throughput by batching PTE operations for large folios. This gain is mainly for arm64. - The 8 patch series "x86: enable EXECMEM_ROX_CACHE for ftrace and kprobes" from Mike Rapoport provides a bugfix, additional debug code and cleanups to the execmem code. - The 7 patch series "mm/shmem, swap: bugfix and improvement of mTHP swap in" from Kairui Song provides bugfixes, cleanups and performance improvememnts to the mTHP swapin code. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaI+6HQAKCRDdBJ7gKXxA jv7lAQCAKE5dUhdZ0pOYbhBKTlDapQh2KqHrlV3QFcxXgknEoQD/c3gG01rY3fLh Cnf5l9+cdyfKxFniO48sUPx6IpriRg8= =HT5/ -----END PGP SIGNATURE----- Merge tag 'mm-stable-2025-08-03-12-35' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull more MM updates from Andrew Morton: "Significant patch series in this pull request: - "mseal cleanups" (Lorenzo Stoakes) Some mseal cleaning with no intended functional change. - "Optimizations for khugepaged" (David Hildenbrand) Improve khugepaged throughput by batching PTE operations for large folios. This gain is mainly for arm64. - "x86: enable EXECMEM_ROX_CACHE for ftrace and kprobes" (Mike Rapoport) A bugfix, additional debug code and cleanups to the execmem code. - "mm/shmem, swap: bugfix and improvement of mTHP swap in" (Kairui Song) Bugfixes, cleanups and performance improvememnts to the mTHP swapin code" * tag 'mm-stable-2025-08-03-12-35' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (38 commits) mm: mempool: fix crash in mempool_free() for zero-minimum pools mm: correct type for vmalloc vm_flags fields mm/shmem, swap: fix major fault counting mm/shmem, swap: rework swap entry and index calculation for large swapin mm/shmem, swap: simplify swapin path and result handling mm/shmem, swap: never use swap cache and readahead for SWP_SYNCHRONOUS_IO mm/shmem, swap: tidy up swap entry splitting mm/shmem, swap: tidy up THP swapin checks mm/shmem, swap: avoid redundant Xarray lookup during swapin x86/ftrace: enable EXECMEM_ROX_CACHE for ftrace allocations x86/kprobes: enable EXECMEM_ROX_CACHE for kprobes allocations execmem: drop writable parameter from execmem_fill_trapping_insns() execmem: add fallback for failures in vmalloc(VM_ALLOW_HUGE_VMAP) execmem: move execmem_force_rw() and execmem_restore_rox() before use execmem: rework execmem_cache_free() execmem: introduce execmem_alloc_rw() execmem: drop unused execmem_update_copy() mm: fix a UAF when vma->mm is freed after vma->vm_refcnt got dropped mm/rmap: add anon_vma lifetime debug check mm: remove mm/io-mapping.c ... |
||
![]() |
35a813e010 |
printk changes for 6.17
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmiQpykACgkQUqAMR0iA lPJcrg/9Hez6+zO7LECCn5VkuK5oJWR5CyCfwx14ki8UF38djQGU2frckI5837rE MnVoEBexZunK5SXy4MAy7bTCitzR+lMqNtP5uq9J2ovlSPtNlfuJRDr7uGQLDtSS M5KZ1qsZnhgwLYeNhfVVToHgp+OwIQb2GcgYmYc8k03fUI1NQpdxIM46DzoTj+06 x6qgrNsmmJbm8E73VWBByJAEFoq9ugjny8Rt+tYMi/CmhgZpp0ZyF1r5dYfYX/KS VS8UQY//aZOFhNsQUAXwP7Ym00CYRgTg7Na+MHivYLXmYGH2gF6tWQhX/eEgHKcJ RTmUbLFx70fdBbjJMxv2k8vyMk2sy6sTfJHPqM/NS/Fb0tSPBXQJG/EexzfoqiBc wcjgOPkeALIosVdFdTqXxjoIGOP8rqsU4t6Y6WFjJlWK04SBVjxBUofytRdQSxkG 5Sb0rFVGKrKIkXaVkt4byPa1/BDpfNhfKMYPtQ56pv2VNUgzfye4prUAZHE5pLnK 8nixeeMtKDFFCBpn6rG5wZW7k2mK5FrWGZUfdfxdK1gWQ1y0kqGy5wa3lNZLcxlH l3AtOYoDeWM2DjDVO6WCj8ambEWkbjbGg7tC9TI3F0NvRJSYytTb6npMqb3Gwhcb U4NgT+Ho0GJ/5BLUye8HMfhvrGoCfRCeptHtEFXAK7pzKyjc0+c= =Mocd -----END PGP SIGNATURE----- Merge tag 'printk-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk updates from Petr Mladek: - Add new "hash_pointers=[auto|always|never]" boot parameter to force the hashing even with "slab_debug" enabled - Allow to stop CPU, after losing nbcon console ownership during panic(), even without proper NMI - Allow to use the printk kthread immediately even for the 1st registered nbcon - Compiler warning removal * tag 'printk-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: printk: nbcon: Allow reacquire during panic printk: Allow to use the printk kthread immediately even for 1st nbcon slab: Decouple slab_debug and no_hash_pointers vsprintf: Use __diag macros to disable '-Wsuggest-attribute=format' compiler-gcc.h: Introduce __diag_GCC_all |
||
![]() |
99b773d720 |
sched/psi: Fix psi_seq initialization
With the seqcount moved out of the group into a global psi_seq,
re-initializing the seqcount on group creation is causing seqcount
corruption.
Fixes:
|
||
![]() |
3db488c8ed | Merge branch 'rework/fixes' into for-linus | ||
![]() |
e991acf1bc |
Significant patch series in this pull request:
- The 2 patch series "squashfs: Remove page->mapping references" from Matthew Wilcox gets us closer to being able to remove page->mapping. - The 5 patch series "relayfs: misc changes" from Jason Xing does some maintenance and minor feature addition work in relayfs. - The 5 patch series "kdump: crashkernel reservation from CMA" from Jiri Bohac switches us from static preallocation of the kdump crashkernel's working memory over to dynamic allocation. So the difficulty of a-priori estimation of the second kernel's needs is removed and the first kernel obtains extra memory. - The 5 patch series "generalize panic_print's dump function to be used by other kernel parts" from Feng Tang implements some consolidation and rationalizatio of the various ways in which a faiing kernel splats information at the operator. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaI+82gAKCRDdBJ7gKXxA jj4JAP9xb+w9DrBY6sa+7KTPIb+aTqQ7Zw3o9O2m+riKQJv6jAEA6aEwRnDA0451 fDT5IqVlCWGvnVikdZHSnvhdD7TGsQ0= =rT71 -----END PGP SIGNATURE----- Merge tag 'mm-nonmm-stable-2025-08-03-12-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: "Significant patch series in this pull request: - "squashfs: Remove page->mapping references" (Matthew Wilcox) gets us closer to being able to remove page->mapping - "relayfs: misc changes" (Jason Xing) does some maintenance and minor feature addition work in relayfs - "kdump: crashkernel reservation from CMA" (Jiri Bohac) switches us from static preallocation of the kdump crashkernel's working memory over to dynamic allocation. So the difficulty of a-priori estimation of the second kernel's needs is removed and the first kernel obtains extra memory - "generalize panic_print's dump function to be used by other kernel parts" (Feng Tang) implements some consolidation and rationalization of the various ways in which a failing kernel splats information at the operator * tag 'mm-nonmm-stable-2025-08-03-12-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (80 commits) tools/getdelays: add backward compatibility for taskstats version kho: add test for kexec handover delaytop: enhance error logging and add PSI feature description samples: Kconfig: fix spelling mistake "instancess" -> "instances" fat: fix too many log in fat_chain_add() scripts/spelling.txt: add notifer||notifier to spelling.txt xen/xenbus: fix typo "notifer" net: mvneta: fix typo "notifer" drm/xe: fix typo "notifer" cxl: mce: fix typo "notifer" KVM: x86: fix typo "notifer" MAINTAINERS: add maintainers for delaytop ucount: use atomic_long_try_cmpxchg() in atomic_long_inc_below() ucount: fix atomic_long_inc_below() argument type kexec: enable CMA based contiguous allocation stackdepot: make max number of pools boot-time configurable lib/xxhash: remove unused functions init/Kconfig: restore CONFIG_BROKEN help text lib/raid6: update recov_rvv.c zero page usage docs: update docs after introducing delaytop ... |
||
![]() |
3c4a063b1f |
tracing cleanups for v6.17:
- Remove unneeded goto out statements Over time, the logic was restructured but left a "goto out" where the out label simply did a "return ret;". Instead of jumping to this out label, simply return immediately and remove the out label. - Add guard(ring_buffer_nest) Some calls to the tracing ring buffer can happen when the ring buffer is already being written to at the same context (for example, a trace_printk() in between a ring_buffer_lock_reserve() and a ring_buffer_unlock_commit()). In order to not trigger the recursion detection, these functions use ring_buffer_nest_start() and ring_buffer_nest_end(). Create a guard() for these functions so that their use cases can be simplified and not need to use goto for the release. - Clean up the tracing code with guard() and __free() logic There were several locations that were prime candidates for using guard() and __free() helpers. Switch them over to use them. - Fix output of function argument traces for unsigned int values The function tracer with "func-args" option set will record up to 6 argument registers and then use BTF to format them for human consumption when the trace file is read. There's several arguments that are "unsigned long" and even "unsigned int" that are either and address or a mask. It is easier to understand if they were printed using hexadecimal instead of decimal. The old method just printed all non-pointer values as signed integers, which made it even worse for unsigned integers. For instance, instead of: __local_bh_disable_ip(ip=-2127311112, cnt=256) <-handle_softirqs Show: __local_bh_disable_ip(ip=0xffffffff8133cef8, cnt=0x100) <-handle_softirqs -----BEGIN PGP SIGNATURE----- iIoEABYKADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCaI9pOBQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qkhoAQD+moa8M+WWUS9T9utwREytolfyNKEO dW0dPVzquX3L6gEAnc7zNla4QZJsdU1bHyhpDTn/Zhu11aMrzoxcBcdrSwI= =x79z -----END PGP SIGNATURE----- Merge tag 'trace-v6.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull more tracing updates from Steven Rostedt: - Remove unneeded goto out statements Over time, the logic was restructured but left a "goto out" where the out label simply did a "return ret;". Instead of jumping to this out label, simply return immediately and remove the out label. - Add guard(ring_buffer_nest) Some calls to the tracing ring buffer can happen when the ring buffer is already being written to at the same context (for example, a trace_printk() in between a ring_buffer_lock_reserve() and a ring_buffer_unlock_commit()). In order to not trigger the recursion detection, these functions use ring_buffer_nest_start() and ring_buffer_nest_end(). Create a guard() for these functions so that their use cases can be simplified and not need to use goto for the release. - Clean up the tracing code with guard() and __free() logic There were several locations that were prime candidates for using guard() and __free() helpers. Switch them over to use them. - Fix output of function argument traces for unsigned int values The function tracer with "func-args" option set will record up to 6 argument registers and then use BTF to format them for human consumption when the trace file is read. There are several arguments that are "unsigned long" and even "unsigned int" that are either and address or a mask. It is easier to understand if they were printed using hexadecimal instead of decimal. The old method just printed all non-pointer values as signed integers, which made it even worse for unsigned integers. For instance, instead of: __local_bh_disable_ip(ip=-2127311112, cnt=256) <-handle_softirqs show: __local_bh_disable_ip(ip=0xffffffff8133cef8, cnt=0x100) <-handle_softirqs" * tag 'trace-v6.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Have unsigned int function args displayed as hexadecimal ring-buffer: Convert ring_buffer_write() to use guard(preempt_notrace) tracing: Use __free(kfree) in trace.c to remove gotos tracing: Add guard() around locks and mutexes in trace.c tracing: Add guard(ring_buffer_nest) tracing: Remove unneeded goto out logic |
||
![]() |
8877fcb70f |
This is a small set of changes for modules, primarily to extend module users
to use the module data structures in combination with the already no-op stub module functions, even when support for modules is disabled in the kernel configuration. This change follows the kernel's coding style for conditional compilation and allows kunit code to drop all CONFIG_MODULES ifdefs, which is also part of the changes. This should allow others part of the kernel to do the same cleanup. Note that this had a conflict with sysctl changes [1] but should be fixed now as I rebased on top. The remaining changes include a fix for module name length handling which could potentially lead to the removal of an incorrect module, and various cleanups. The module name fix and related cleanup has been in linux-next since Thursday (July 31) while the rest of the changes for a bit more than 3 weeks. Note that this currently has conflicts in next with kbuild's tree [2]. Link: https://lore.kernel.org/all/20250714175916.774e6d79@canb.auug.org.au/ [1] Link: https://lore.kernel.org/all/20250801132941.6815d93d@canb.auug.org.au/ [2] -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE73Ua4R8Pc+G5xjxTQJ6jxB8ZUfsFAmiPQgkACgkQQJ6jxB8Z UfuPTA//XrRguJFBhQh6cUWqVleTNQJuhjiPsOSO5S52aVET4wsrnRNeM2eM5oqw 0+6ELvhIJINQ1LjpOP8D67d8P5Ds1/qM1pbQIkQsoKiEj6E7Q4dXH5N0uyf/BzO3 HaosLG9cpqcomlSorYEiYoPjqy9EChQzsi+YAYWAB+fW6bvU/AdUHTRH88m3ppBJ Y22BTTPOKKyj5/QgfY+kwH8TTnrzCzY8aoOqW7uimLI5h4c9dFQ2PigRJnoMfDG1 11w5VshOTzZJvNFrUk5GVSirwlxdJDbW6dKfG0DD5+eNWK5dfIEc+/EcuhaGoPvO Euwv8VQubdxHTAG6kzHI0MtxAVQUM1gyz8zHiu18eW++GTtnTFs6m8E6H9AC176G nDkUh3qSxJN2HHgxtS9VUExEEZpYqtWeB9Zts8K3oSWvTaQenHWpVHPADkxzS4JU Jvkjq8SiKo+RqHxaOKfyf1RfOtYe5tjMCLrP7zX39d1+cwGxuc6mip/omY9HFDgn op132fYdt24JSHoioJDzRz9mTfvj3nICEmgX4D4WDQx5lP27CUcLugPnBNHPp0fu 5hL+ajy8M8nq4zm/42Y+F7VS74TIA6mSnJKs9dMCknUWueD6HrDEU9xHi1YMpUMZ cBUSpU+P94dCIScwEzkp926vDnHyxCHLbpF1Jsq5qNNdj7AelHk= =4bGB -----END PGP SIGNATURE----- Merge tag 'modules-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux Pull module updates from Daniel Gomez: "This is a small set of changes for modules, primarily to extend module users to use the module data structures in combination with the already no-op stub module functions, even when support for modules is disabled in the kernel configuration. This change follows the kernel's coding style for conditional compilation and allows kunit code to drop all CONFIG_MODULES ifdefs, which is also part of the changes. This should allow others part of the kernel to do the same cleanup. The remaining changes include a fix for module name length handling which could potentially lead to the removal of an incorrect module, and various cleanups" * tag 'modules-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux: module: Rename MAX_PARAM_PREFIX_LEN to __MODULE_NAME_LEN tracing: Replace MAX_PARAM_PREFIX_LEN with MODULE_NAME_LEN module: Restore the moduleparam prefix length check module: Remove unnecessary +1 from last_unloaded_module::name size module: Prevent silent truncation of module name in delete_module(2) kunit: test: Drop CONFIG_MODULE ifdeffery module: make structure definitions always visible module: move 'struct module_use' to internal.h |
||
![]() |
838955f64a |
execmem: introduce execmem_alloc_rw()
Some callers of execmem_alloc() require the memory to be temporarily writable even when it is allocated from ROX cache. These callers use execemem_make_temp_rw() right after the call to execmem_alloc(). Wrap this sequence in execmem_alloc_rw() API. Link: https://lkml.kernel.org/r/20250713071730.4117334-3-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Daniel Gomez <da.gomez@samsung.com> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
![]() |
881388f343 |
mm: add process info to bad rss-counter warning
Enhance the debugging information in check_mm() by including the process name and PID when reporting bad rss-counter states. This helps identify which process is associated with the memory accounting issue. Link: https://lkml.kernel.org/r/20250723100901.1909683-1-liuqiye2025@163.com Signed-off-by: Xuanye Liu <liuqiye2025@163.com> Acked-by: SeongJae Park <sj@kernel.org> Cc: Ben Segall <bsegall@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mel Gorman <mgorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
![]() |
58b4fba81a |
ucount: use atomic_long_try_cmpxchg() in atomic_long_inc_below()
Use atomic_long_try_cmpxchg() instead of atomic_long_cmpxchg (*ptr, old, new) == old in atomic_long_inc_below(). x86 CMPXCHG instruction returns success in ZF flag, so this change saves a compare after cmpxchg (and related move instruction in front of cmpxchg). Also, atomic_long_try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg fails, enabling further code simplifications. No functional change intended. Link: https://lkml.kernel.org/r/20250721174610.28361-2-ubizjak@gmail.com Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Reviewed-by: Alexey Gladkov <legion@kernel.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Alexey Gladkov <legion@kernel.org> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: MengEn Sun <mengensun@tencent.com> Cc: "Thomas Weißschuh" <linux@weissschuh.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
![]() |
f8cd9193b6 |
ucount: fix atomic_long_inc_below() argument type
The type of u argument of atomic_long_inc_below() should be long to avoid
unwanted truncation to int.
The patch fixes the wrong argument type of an internal function to
prevent unwanted argument truncation. It fixes an internal locking
primitive; it should not have any direct effect on userspace.
Mark said
: AFAICT there's no problem in practice because atomic_long_inc_below()
: is only used by inc_ucount(), and it looks like the value is
: constrained between 0 and INT_MAX.
:
: In inc_ucount() the limit value is taken from
: user_namespace::ucount_max[], and AFAICT that's only written by
: sysctls, to the table setup by setup_userns_sysctls(), where
: UCOUNT_ENTRY() limits the value between 0 and INT_MAX.
:
: This is certainly a cleanup, but there might be no functional issue in
: practice as above.
Link: https://lkml.kernel.org/r/20250721174610.28361-1-ubizjak@gmail.com
Fixes:
|
||
![]() |
07d2490297 |
kexec: enable CMA based contiguous allocation
When booting a new kernel with kexec_file, the kernel picks a target location that the kernel should live at, then allocates random pages, checks whether any of those patches magically happens to coincide with a target address range and if so, uses them for that range. For every page allocated this way, it then creates a page list that the relocation code - code that executes while all CPUs are off and we are just about to jump into the new kernel - copies to their final memory location. We can not put them there before, because chances are pretty good that at least some page in the target range is already in use by the currently running Linux environment. Copying is happening from a single CPU at RAM rate, which takes around 4-50 ms per 100 MiB. All of this is inefficient and error prone. To successfully kexec, we need to quiesce all devices of the outgoing kernel so they don't scribble over the new kernel's memory. We have seen cases where that does not happen properly (*cough* GIC *cough*) and hence the new kernel was corrupted. This started a month long journey to root cause failing kexecs to eventually see memory corruption, because the new kernel was corrupted severely enough that it could not emit output to tell us about the fact that it was corrupted. By allocating memory for the next kernel from a memory range that is guaranteed scribbling free, we can boot the next kernel up to a point where it is at least able to detect corruption and maybe even stop it before it becomes severe. This increases the chance for successful kexecs. Since kexec got introduced, Linux has gained the CMA framework which can perform physically contiguous memory mappings, while keeping that memory available for movable memory when it is not needed for contiguous allocations. The default CMA allocator is for DMA allocations. This patch adds logic to the kexec file loader to attempt to place the target payload at a location allocated from CMA. If successful, it uses that memory range directly instead of creating copy instructions during the hot phase. To ensure that there is a safety net in case anything goes wrong with the CMA allocation, it also adds a flag for user space to force disable CMA allocations. Using CMA allocations has two advantages: 1) Faster by 4-50 ms per 100 MiB. There is no more need to copy in the hot phase. 2) More robust. Even if by accident some page is still in use for DMA, the new kernel image will be safe from that access because it resides in a memory region that is considered allocated in the old kernel and has a chance to reinitialize that component. Link: https://lkml.kernel.org/r/20250610085327.51817-1-graf@amazon.com Signed-off-by: Alexander Graf <graf@amazon.com> Acked-by: Baoquan He <bhe@redhat.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Zhongkun He <hezhongkun.hzk@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
![]() |
1b30d44417 |
bpf: Fix memory leak of bpf_scc_info objects
env->scc_info array contains references to bpf_scc_info objects
allocated lazily in verifier.c:scc_visit_alloc().
env->scc_cnt was supposed to track env->scc_info array size
in order to free referenced objects in verifier.c:free_states().
Fix initialization of env->scc_cnt that was omitted in
verifier.c:compute_scc().
To reproduce the bug:
- build with CONFIG_DEBUG_KMEMLEAK
- boot and load bpf program with loops, e.g.:
./veristat -q pyperf180.bpf.o
- initiate memleak scan and check results:
echo scan > /sys/kernel/debug/kmemleak
cat /sys/kernel/debug/kmemleak
Fixes:
|
||
![]() |
e703b7e247 |
futex: Move futex cleanup to __mmdrop()
Futex hash allocations are done in mm_init() and the cleanup happens in
__mmput(). That works most of the time, but there are mm instances which
are instantiated via mm_alloc() and freed via mmdrop(), which causes the
futex hash to be leaked.
Move the cleanup to __mmdrop().
Fixes:
|
||
![]() |
83e6384374 |
smp: Fix spelling in on_each_cpu_cond_mask()'s doc-comment
"boolean" is spelt as "blooean". Fix that. Signed-off-by: Roman Kisel <romank@linux.microsoft.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250722161818.6139-1-romank@linux.microsoft.com |
||
![]() |
a6923c06a3 |
bpf-fixes
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmiNNksACgkQ6rmadz2v bTrKRhAAnju4bbFRHU88Y68p6Meq/jxgjxHZAkTqZA0Nvbu2cItPRL7XHAAhTWE7 OBEIm3UKCH4gs4fY8rDHiIgnnaQavXUmvXZblOIOjxnqRKJpU3px+wwJvGFq5Enq WP6UZV8tj+O2tNfNNYS+mgQvvIpUISHGpKimvx7ede3e1U3cJBkppbT3gooMHYuc 5s1QtYHWaPY/1DpkHgqJ2UPGcbT9/HSPGMHRNaHKjQTcNcLcrj7RRjchgXqcc7Vs hVijvVrLiuK0MyU42ritmaqvjjgD6hKPZguRQe2/hAtrOo0Alf+4mXkMgam7simN iHfGc7nhw1xAFTPj4WXahja89G00FdDN5NR37Rgurm/i2fY7BuXAkMjiMiwGB3C3 jk2wG3RSifYeC2rxhkYJdqcx8Cz6m+pjgyJ2o9Jy5dn426VXg/kzkUXpl6u5jaPZ SmKoo9Xu1r7xqTaUc9kk8pJI5Xt9vD5oQjF2KQuPZXxNidiwW6k2OGbW+wF26nEi Q6pfDu3pvHAd/UE6cD5yFe97o3Cc2XfGwI/Sv2k99UVPvNcvfAvVo9fsItHBhCPn zHkihW2S0zmbBlhcrB+PrLclNgLleP9JukFN+5scc0a9lbQxIm6v2TNKGlBfDQtO I+Kn266oqT4BEgnQGlCQquINnQAdmS8VMnnunGOu6+rwPUtkI7E= =XLHS -----END PGP SIGNATURE----- Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Pull bpf fixes from Alexei Starovoitov: - Fix kCFI failures in JITed BPF code on arm64 (Sami Tolvanen, Puranjay Mohan, Mark Rutland, Maxwell Bland) - Disallow tail calls between BPF programs that use different cgroup local storage maps to prevent out-of-bounds access (Daniel Borkmann) - Fix unaligned access in flow_dissector and netfilter BPF programs (Paul Chaignon) - Avoid possible use of uninitialized mod_len in libbpf (Achill Gilgenast) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: Test for unaligned flow_dissector ctx access bpf: Improve ctx access verifier error message bpf: Check netfilter ctx accesses are aligned bpf: Check flow_dissector ctx accesses are aligned arm64/cfi,bpf: Support kCFI + BPF on arm64 cfi: Move BPF CFI types and helpers to generic code cfi: add C CFI type macro libbpf: Avoid possible use of uninitialized mod_len bpf: Fix oob access in cgroup local storage bpf: Move cgroup iterator helpers to bpf.h bpf: Move bpf map owner out of common struct bpf: Add cookie object to bpf maps |
||
![]() |
3ca824369b |
tracing: Have unsigned int function args displayed as hexadecimal
Most function arguments that are passed in as unsigned int or unsigned long are better displayed as hexadecimal than normal integer. For example, the functions: static void __create_object(unsigned long ptr, size_t size, int min_count, gfp_t gfp, unsigned int objflags); static bool stack_access_ok(struct unwind_state *state, unsigned long _addr, size_t len); void __local_bh_disable_ip(unsigned long ip, unsigned int cnt); Show up in the trace as: __create_object(ptr=-131387050520576, size=4096, min_count=1, gfp=3264, objflags=0) <-kmem_cache_alloc_noprof stack_access_ok(state=0xffffc9000233fc98, _addr=-60473102566256, len=8) <-unwind_next_frame __local_bh_disable_ip(ip=-2127311112, cnt=256) <-handle_softirqs Instead, by displaying unsigned as hexadecimal, they look more like this: __create_object(ptr=0xffff8881028d2080, size=0x280, min_count=1, gfp=0x82820, objflags=0x0) <-kmem_cache_alloc_node_noprof stack_access_ok(state=0xffffc90000003938, _addr=0xffffc90000003930, len=0x8) <-unwind_next_frame __local_bh_disable_ip(ip=0xffffffff8133cef8, cnt=0x100) <-handle_softirqs Which is much easier to understand as most unsigned longs are usually just pointers. Even the "unsigned int cnt" in __local_bh_disable_ip() looks better as hexadecimal as a lot of flags are passed as unsigned. Changes since v2: https://lore.kernel.org/20250801111453.01502861@gandalf.local.home - Use btf_int_encoding() instead of open coding it (Martin KaFai Lau) Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Douglas Raillard <douglas.raillard@arm.com> Cc: Martin KaFai Lau <martin.lau@linux.dev> Link: https://lore.kernel.org/20250801165601.7770d65c@gandalf.local.home Acked-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> |
||
![]() |
821c9e515d |
virtio, vhost: features, fixes
vhost can now support legacy threading if enabled in Kconfig vsock memory allocation strategies for large buffers have been improved, reducing pressure on kmalloc vhost now supports the in-order feature guest bits missed the merge window fixes, cleanups all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQFDBAABCgAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmiMvQEPHG1zdEByZWRo YXQuY29tAAoJECgfDbjSjVRpgr8IAKUrIjqqTYXLkbCWn6tK8T+LxZ6LkMkyHA1v AJ+y5fKDeLsT5QpusD1XRjXJVqXBwQEsTN0pNVuhWHlcCpUeOFEHuJaf/QMncbc3 deFlUfMa3ihniUxBuyhojlWURsf94uTC906lCFXlIsfSKH2CW6/SjKvqR0SH5PhN 5WaqRYiSFFwDlyG2Ul4e5temP/er2KuZfYyvcYCU8VdSEp6bjvqCHd9ztFIVuByp fFWsrHce6IqR8ixOOzavEjzfd8WAN3LGzXntj5KEaX3fZ6HxCZCMv+rNVqvJmLps cSrTgIUo60nCiZb8klUCS1YTEEvmdmJg3UmmddIpIhcsCYJSbOU= =2dxm -----END PGP SIGNATURE----- Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio updates from Michael Tsirkin: - vhost can now support legacy threading if enabled in Kconfig - vsock memory allocation strategies for large buffers have been improved, reducing pressure on kmalloc - vhost now supports the in-order feature. guest bits missed the merge window. - fixes, cleanups all over the place * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (30 commits) vsock/virtio: Allocate nonlinear SKBs for handling large transmit buffers vsock/virtio: Rename virtio_vsock_skb_rx_put() vhost/vsock: Allocate nonlinear SKBs for handling large receive buffers vsock/virtio: Move SKB allocation lower-bound check to callers vsock/virtio: Rename virtio_vsock_alloc_skb() vsock/virtio: Resize receive buffers so that each SKB fits in a 4K page vsock/virtio: Move length check to callers of virtio_vsock_skb_rx_put() vsock/virtio: Validate length in packet header before skb_put() vhost/vsock: Avoid allocating arbitrarily-sized SKBs vhost_net: basic in_order support vhost: basic in order support vhost: fail early when __vhost_add_used() fails vhost: Reintroduce kthread API and add mode selection vdpa: Fix IDR memory leak in VDUSE module exit vdpa/mlx5: Fix release of uninitialized resources on error path vhost-scsi: Fix check for inline_sg_cnt exceeding preallocated limit virtio: virtio_dma_buf: fix missing parameter documentation vhost: Fix typos vhost: vringh: Remove unused functions vhost: vringh: Remove unused iotlb functions ... |
||
![]() |
0bd0a41a51 |
pci-v6.17-changes
-----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmiL3OkUHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vz9bhAAqiD9REYlNUgGX/bEBgCVPFdtjjTz FpSLzG23vWd2J0FEy04qtQWH9j71IXnM+yMybzsMe9SsPt2HhczzSCIMpPj0FZNN ccOf3gA/KqPux7FORrS3mpM8OO4ICt3XZhCji3nNg5iW5XlH+NrQKPVxRlvBB0rP +7RxSjDClUdZ97QSSmp1uZ7Qh1qyV0Ht0qjPMwecrnB2kApt4ZaMphAaKPEjX/4f RgZPFqbIpRWt9e87Z8ADr5c2jokZAzIV0zauQ2fhbjBkTcXIXL3yOzUbR+ngBWDD oq21rXJBUCQheA7J6j2SKabgF9AZaI5NI9ERld5vJ1inXSZCyuyKopN1AzuKZquG N+jyYJqZC99ePvMLbTWs/spU58J03A6TOwaJNE3ISRgbnxFkhvLl7h68XuTDonZm hYGloXXUj+i+rh7/eJIDDWa9MTpEvl2p1zc6EDIZ/umlnHwg9rGlGQVARMCs6Ist EiJQEtjMMlXiBJMkFhpxesOdyonGkxAL9WtT6MoEOFF7dqgsTqSKiDUPa+6MHV+I tsTB630J3ROsWGfQD1uJI2BrCm+op4j6faamH6UMqCrUU0TUZMHiRR3qVWbM6qgU /WL1gZ96uy5I7UoE0+gH+wMhMClO2BnsxffocToDE5wOYpGDd5BwPEoY8ej8U2lu CBMCkMor1jDtS8Y= =ipv3 -----END PGP SIGNATURE----- Merge tag 'pci-v6.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull PCI updates from Bjorn Helgaas: "Enumeration: - Allow built-in drivers, not just modular drivers, to use async initial probing (Lukas Wunner) - Support Immediate Readiness even on devices with no PM Capability (Sean Christopherson) - Consolidate definition of PCIE_RESET_CONFIG_WAIT_MS (100ms), the required delay between a reset and sending config requests to a device (Niklas Cassel) - Add pci_is_display() to check for "Display" base class and use it in ALSA hda, vfio, vga_switcheroo, vt-d (Mario Limonciello) - Allow 'isolated PCI functions' (multi-function devices without a function 0) for LoongArch, similar to s390 and jailhouse (Huacai Chen) Power control: - Add ability to enable optional slot clock for cases where the PCIe host controller and the slot are supplied by different clocks (Marek Vasut) PCIe native device hotplug: - Fix runtime PM ref imbalance on Hot-Plug Capable ports caused by misinterpreting a config read failure after a device has been removed (Lukas Wunner) - Avoid creating a useless PCIe port service device for pciehp if the slot is handled by the ACPI hotplug driver (Lukas Wunner) - Ignore ACPI hotplug slots when calculating depth of pciehp hotplug ports (Lukas Wunner) Virtualization: - Save VF resizable BAR state and restore it after reset (Michał Winiarski) - Allow IOV resources (VF BARs) to be resized (Michał Winiarski) - Add pci_iov_vf_bar_set_size() so drivers can control VF BAR size (Michał Winiarski) Endpoint framework: - Add RC-to-EP doorbell support using platform MSI controller, including a test case (Frank Li) - Allow BAR assignment via configfs so platforms have flexibility in determining BAR usage (Jerome Brunet) Native PCIe controller drivers: - Convert amazon,al-alpine-v[23]-pcie, apm,xgene-pcie, axis,artpec6-pcie, marvell,armada-3700-pcie, st,spear1340-pcie to DT schema format (Rob Herring) - Use dev_fwnode() instead of of_fwnode_handle() to remove OF dependency in altera (fixes an unused variable), designware-host, mediatek, mediatek-gen3, mobiveil, plda, xilinx, xilinx-dma, xilinx-nwl (Jiri Slaby, Arnd Bergmann) - Convert aardvark, altera, brcmstb, designware-host, iproc, mediatek, mediatek-gen3, mobiveil, plda, rcar-host, vmd, xilinx, xilinx-dma, xilinx-nwl from using pci_msi_create_irq_domain() to using msi_create_parent_irq_domain() instead; this makes the interrupt controller per-PCI device, allows dynamic allocation of vectors after initialization, and allows support of IMS (Nam Cao) APM X-Gene PCIe controller driver: - Rewrite MSI handling to MSI CPU affinity, drop useless CPU hotplug bits, use device-managed memory allocations, and clean things up (Marc Zyngier) - Probe xgene-msi as a standard platform driver rather than a subsys_initcall (Marc Zyngier) Broadcom STB PCIe controller driver: - Add optional DT 'num-lanes' property and if present, use it to override the Maximum Link Width advertised in Link Capabilities (Jim Quinlan) Cadence PCIe controller driver: - Use PCIe Message routing types from the PCI core rather than defining private ones (Hans Zhang) Freescale i.MX6 PCIe controller driver: - Add IMX8MQ_EP third 64-bit BAR in epc_features (Richard Zhu) - Add IMX8MM_EP and IMX8MP_EP fixed 256-byte BAR 4 in epc_features (Richard Zhu) - Configure LUT for MSI/IOMMU in Endpoint mode so Root Complex can trigger doorbel on Endpoint (Frank Li) - Remove apps_reset (LTSSM_EN) from imx_pcie_{assert,deassert}_core_reset(), which fixes a hotplug regression on i.MX8MM (Richard Zhu) - Delay Endpoint link start until configfs 'start' written (Richard Zhu) Intel VMD host bridge driver: - Add Intel Panther Lake (PTL)-H/P/U Vendor ID (George D Sworo) Qualcomm PCIe controller driver: - Add DT binding and driver support for SA8255p, which supports ECAM for Configuration Space access (Mayank Rana) - Update DT binding and driver to describe PHYs and per-Root Port resets in a Root Port stanza and deprecate describing them in the host bridge; this makes it possible to support multiple Root Ports in the future (Krishna Chaitanya Chundru) - Add Qualcomm QCS615 to SM8150 DT binding (Ziyue Zhang) - Add Qualcomm QCS8300 to SA8775p DT binding (Ziyue Zhang) - Drop TBU and ref clocks from Qualcomm SM8150 and SC8180x DT bindings (Konrad Dybcio) - Document 'link_down' reset in Qualcomm SA8775P DT binding (Ziyue Zhang) - Add required PCIE_RESET_CONFIG_WAIT_MS delay after Link up IRQ (Niklas Cassel) Rockchip PCIe controller driver: - Drop unused PCIe Message routing and code definitions (Hans Zhang) - Remove several unused header includes (Hans Zhang) - Use standard PCIe config register definitions instead of rockchip-specific redefinitions (Geraldo Nascimento) - Set Target Link Speed to 5.0 GT/s before retraining so we have a chance to train at a higher speed (Geraldo Nascimento) Rockchip DesignWare PCIe controller driver: - Prevent race between link training and register update via DBI by inhibiting link training after hot reset and link down (Wilfred Mallawa) - Add required PCIE_RESET_CONFIG_WAIT_MS delay after Link up IRQ (Niklas Cassel) Sophgo PCIe controller driver: - Add DT binding and driver for Sophgo SG2044 PCIe controller driver in Root Complex mode (Inochi Amaoto) Synopsys DesignWare PCIe controller driver: - Add required PCIE_RESET_CONFIG_WAIT_MS after waiting for Link up on Ports that support > 5.0 GT/s. Slower Ports still rely on the not-quite-correct PCIE_LINK_WAIT_SLEEP_MS 90ms default delay while waiting for the Link (Niklas Cassel)" * tag 'pci-v6.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (116 commits) dt-bindings: PCI: qcom,pcie-sa8775p: Document 'link_down' reset dt-bindings: PCI: Remove 83xx-512x-pci.txt dt-bindings: PCI: Convert amazon,al-alpine-v[23]-pcie to DT schema dt-bindings: PCI: Convert marvell,armada-3700-pcie to DT schema dt-bindings: PCI: Convert apm,xgene-pcie to DT schema dt-bindings: PCI: Convert axis,artpec6-pcie to DT schema dt-bindings: PCI: Convert st,spear1340-pcie to DT schema PCI: Move is_pciehp check out of pciehp_is_native() PCI: pciehp: Use is_pciehp instead of is_hotplug_bridge PCI/portdrv: Use is_pciehp instead of is_hotplug_bridge PCI/ACPI: Fix runtime PM ref imbalance on Hot-Plug Capable ports selftests: pci_endpoint: Add doorbell test case misc: pci_endpoint_test: Add doorbell test case PCI: endpoint: pci-epf-test: Add doorbell test support PCI: endpoint: Add pci_epf_align_inbound_addr() helper for inbound address alignment PCI: endpoint: pci-ep-msi: Add checks for MSI parent and mutability PCI: endpoint: Add RC-to-EP doorbell support using platform MSI controller PCI: dwc: Add Sophgo SG2044 PCIe controller driver in Root Complex mode PCI: vmd: Switch to msi_create_parent_irq_domain() PCI: vmd: Convert to lock guards ... |
||
![]() |
db5f0c3e3e |
ring-buffer: Convert ring_buffer_write() to use guard(preempt_notrace)
The function ring_buffer_write() has a goto out to only do a preempt_enable_notrace(). This can be replaced by a guard. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250801203858.205479143@kernel.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> |
||
![]() |
12d5189615 |
tracing: Use __free(kfree) in trace.c to remove gotos
There's a couple of locations that have goto out in trace.c for the only purpose of freeing a variable that was allocated. These can be replaced with __free(kfree). Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250801203858.040892777@kernel.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> |
||
![]() |
debe57fbe1 |
tracing: Add guard() around locks and mutexes in trace.c
There's several locations in trace.c that can be simplified by using guards around raw_spin_lock_irqsave, mutexes and preempt disabling. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250801203857.879085376@kernel.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> |
||
![]() |
788fa4b47c |
tracing: Add guard(ring_buffer_nest)
Some calls to the tracing ring buffer can happen when the ring buffer is already being written to by the same context (for example, a trace_printk() in between a ring_buffer_lock_reserve() and a ring_buffer_unlock_commit()). In order to not trigger the recursion detection, these functions use ring_buffer_nest_start() and ring_buffer_nest_end(). Create a guard() for these functions so that their use cases can be simplified and not need to use goto for the release. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250801203857.710501021@kernel.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> |
||
![]() |
c89504a703 |
tracing: Remove unneeded goto out logic
Several places in the trace.c file there's a goto out where the out is simply a return. There's no reason to jump to the out label if it's not doing any more logic but simply returning from the function. Replace the goto outs with a return and remove the out labels. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20250801203857.538726745@kernel.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> |
||
![]() |
d6f38c1239 |
tracing changes for 6.17
- Deprecate auto-mounting tracefs to /sys/kernel/debug/tracing When tracefs was first introduced back in 2014, the directory /sys/kernel/tracing was added and is the designated location to mount tracefs. To keep backward compatibility, tracefs was auto-mounted in /sys/kernel/debug/tracing as well. All distros now mount tracefs on /sys/kernel/tracing. Having it seen in two different locations has lead to various issues and inconsistencies. The VFS folks have to also maintain debugfs_create_automount() for this single user. It's been over 10 years. Tooling and scripts should start replacing the debugfs location with the tracefs one. The reason tracefs was created in the first place was to allow access to the tracing facilities without the need to configure debugfs into the kernel. Using tracefs should now be more robust. A new config is created: CONFIG_TRACEFS_AUTOMOUNT_DEPRECATED which is default y, so that the kernel is still built with the automount. This config allows those that want to remove the automount from debugfs to do so. When tracefs is accessed from /sys/kernel/debug/tracing, the following printk is triggerd: pr_warn("NOTICE: Automounting of tracing to debugfs is deprecated and will be removed in 2030\n"); This gives users another 5 years to fix their scripts. - Use queue_rcu_work() instead of call_rcu() for freeing event filters The number of filters to be free can be many depending on the number of events within an event system. Freeing them from softirq context can potentially cause undesired latency. Use the RCU workqueue to free them instead. - Remove pointless memory barriers in latency code Memory barriers were added to some of the latency code a long time ago with the idea of "making them visible", but that's not what memory barriers are for. They are to synchronize access between different variables. There was no synchronization here making them pointless. - Remove "__attribute__()" from the type field of event format When LLVM is used to compile the kernel with CONFIG_DEBUG_INFO_BTF=y and PAHOLE_HAS_BTF_TAG=y, some of the format fields get expanded with the following: field:const char * filename; offset:24; size:8; signed:0; Turns into: field:const char __attribute__((btf_type_tag("user"))) * filename; offset:24; size:8; signed:0; This confuses parsers. Add code to strip these tags from the strings. - Add eprobe config option CONFIG_EPROBE_EVENTS Eprobes were added back in 5.15 but were only enabled when another probe was enabled (kprobe, fprobe, uprobe, etc). The eprobes had no config option of their own. Add one as they should be a separate entity. It's default y to keep with the old kernels but still has dependencies on TRACING and HAVE_REGS_AND_STACK_ACCESS_API. - Add eprobe documentation When eprobes were added back in 5.15 no documentation was added to describe them. This needs to be rectified. - Replace open coded cpumask_next_wrap() in move_to_next_cpu() - Have preemptirq_delay_run() use off-stack CPU mask - Remove obsolete comment about pelt_cfs event DECLARE_TRACE() appends "_tp" to trace events now, but the comment above pelt_cfs still mentioned appending it manually. - Remove EVENT_FILE_FL_SOFT_MODE flag The SOFT_MODE flag was required when the soft enabling and disabling of trace events was first introduced. But there was a bug with this approach as it only worked for a single instance. When multiple users required soft disabling and disabling the code was changed to have a ref count. The SOFT_MODE flag is now set iff the ref count is non zero. This is redundant and just reading the ref count is good enough. - Fix typo in comment -----BEGIN PGP SIGNATURE----- iIoEABYKADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCaIt5ZRQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qvriAPsEbOEgMrPF1Tdj1mHLVajYTxI8ft5J aX5bfM2cDDRVcgEA57JHOXp4d05dj555/hgAUuCWuFp/E0Anp45EnFTedgQ= =wKZW -----END PGP SIGNATURE----- Merge tag 'trace-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing updates from Steven Rostedt: - Deprecate auto-mounting tracefs to /sys/kernel/debug/tracing When tracefs was first introduced back in 2014, the directory /sys/kernel/tracing was added and is the designated location to mount tracefs. To keep backward compatibility, tracefs was auto-mounted in /sys/kernel/debug/tracing as well. All distros now mount tracefs on /sys/kernel/tracing. Having it seen in two different locations has lead to various issues and inconsistencies. The VFS folks have to also maintain debugfs_create_automount() for this single user. It's been over 10 years. Tooling and scripts should start replacing the debugfs location with the tracefs one. The reason tracefs was created in the first place was to allow access to the tracing facilities without the need to configure debugfs into the kernel. Using tracefs should now be more robust. A new config is created: CONFIG_TRACEFS_AUTOMOUNT_DEPRECATED which is default y, so that the kernel is still built with the automount. This config allows those that want to remove the automount from debugfs to do so. When tracefs is accessed from /sys/kernel/debug/tracing, the following printk is triggerd: pr_warn("NOTICE: Automounting of tracing to debugfs is deprecated and will be removed in 2030\n"); This gives users another 5 years to fix their scripts. - Use queue_rcu_work() instead of call_rcu() for freeing event filters The number of filters to be free can be many depending on the number of events within an event system. Freeing them from softirq context can potentially cause undesired latency. Use the RCU workqueue to free them instead. - Remove pointless memory barriers in latency code Memory barriers were added to some of the latency code a long time ago with the idea of "making them visible", but that's not what memory barriers are for. They are to synchronize access between different variables. There was no synchronization here making them pointless. - Remove "__attribute__()" from the type field of event format When LLVM is used to compile the kernel with CONFIG_DEBUG_INFO_BTF=y and PAHOLE_HAS_BTF_TAG=y, some of the format fields get expanded with the following: field:const char * filename; offset:24; size:8; signed:0; Turns into: field:const char __attribute__((btf_type_tag("user"))) * filename; offset:24; size:8; signed:0; This confuses parsers. Add code to strip these tags from the strings. - Add eprobe config option CONFIG_EPROBE_EVENTS Eprobes were added back in 5.15 but were only enabled when another probe was enabled (kprobe, fprobe, uprobe, etc). The eprobes had no config option of their own. Add one as they should be a separate entity. It's default y to keep with the old kernels but still has dependencies on TRACING and HAVE_REGS_AND_STACK_ACCESS_API. - Add eprobe documentation When eprobes were added back in 5.15 no documentation was added to describe them. This needs to be rectified. - Replace open coded cpumask_next_wrap() in move_to_next_cpu() - Have preemptirq_delay_run() use off-stack CPU mask - Remove obsolete comment about pelt_cfs event DECLARE_TRACE() appends "_tp" to trace events now, but the comment above pelt_cfs still mentioned appending it manually. - Remove EVENT_FILE_FL_SOFT_MODE flag The SOFT_MODE flag was required when the soft enabling and disabling of trace events was first introduced. But there was a bug with this approach as it only worked for a single instance. When multiple users required soft disabling and disabling the code was changed to have a ref count. The SOFT_MODE flag is now set iff the ref count is non zero. This is redundant and just reading the ref count is good enough. - Fix typo in comment * tag 'trace-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: Documentation: tracing: Add documentation about eprobes tracing: Have eprobes have their own config option tracing: Remove "__attribute__()" from the type field of event format tracing: Deprecate auto-mounting tracefs in debugfs tracing: Fix comment in trace_module_remove_events() tracing: Remove EVENT_FILE_FL_SOFT_MODE flag tracing: Remove pointless memory barriers tracing/sched: Remove obsolete comment on suffixes kernel: trace: preemptirq_delay_test: use offstack cpu mask tracing: Use queue_rcu_work() to free filters tracing: Replace opencoded cpumask_next_wrap() in move_to_next_cpu() |
||
![]() |
c6439bfaab |
Deferred unwind changes for 6.17
This is the core infrastructure for the deferred unwinder that is required for sframes[1]. Several other patch series is based on this work although those patch series are not dependent on each other. In order to simplify the development, having this core series upstream will allow the other series to be worked on in parallel. The other series are: - The two patches to implement x86: https://lore.kernel.org/linux-trace-kernel/20250717004958.260781923@kernel.org/ https://lore.kernel.org/linux-trace-kernel/20250717004958.432327787@kernel.org/ - The s390 work: https://lore.kernel.org/linux-trace-kernel/20250710163522.3195293-1-jremus@linux.ibm.com/ - The perf work: https://lore.kernel.org/linux-trace-kernel/20250718164119.089692174@kernel.org/ - The ftrace work: https://lore.kernel.org/linux-trace-kernel/20250424192612.505622711@goodmis.org/ - The sframe work: https://lore.kernel.org/linux-trace-kernel/20250717012848.927473176@kernel.org/ And more is on the way. The core infrastructure adds the following in kernel APIs: - int unwind_user_faultable(struct unwind_stacktrace *trace); Performs a user space stack trace that may fault user pages in. - int unwind_deferred_init(struct unwind_work *work, unwind_callback_t func); Allows a tracer to register with the unwind deferred infrastructure. - int unwind_deferred_request(struct unwind_work *work, u64 *cookie); Used when a tracer request a deferred trace. Can be called from interrupt or NMI context. - void unwind_deferred_cancel(struct unwind_work *work); Called by a tracer to unregister from the deferred unwind infrastructure. - void unwind_deferred_task_exit(struct task_struct *task); Called by task exit code to flush any pending unwind requests. - void unwind_task_init(struct task_struct *task); Called by do_fork() to initialize the task struct for the deferred unwinder. - void unwind_task_free(struct task_struct *task); Called by do_exit() to free up any resources used by the deferred unwinder. None of the above is actually compiled unless an architecture enables it, which none currently do. [1] https://sourceware.org/binutils/wiki/sframe -----BEGIN PGP SIGNATURE----- iIoEABYKADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCaIt9IhQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qqqzAQCMT/6qmSq7O746JF0MuGC6fTZnSbAc XGz4JigEqLTRewEA2kaJmD7PBsSRzFdiK2gvyKn95l+PZyWtE9MjTsqeSAc= =Lsbm -----END PGP SIGNATURE----- Merge tag 'trace-deferred-unwind-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull initial deferred unwind infrastructure from Steven Rostedt: "This is the core infrastructure for the deferred unwinder that is required for sframes[1]. Several other patch series are based on this work although those patch series are not dependent on each other. In order to simplify the development, having this core series upstream will allow the other series to be worked on in parallel. The other series are: - The two patches to implement x86 support [2] [3] - The s390 work [4] - The perf work [5] - The ftrace work [6] - The sframe work [7] And more is on the way. The core infrastructure adds the following in kernel APIs: - int unwind_user_faultable(struct unwind_stacktrace *trace); Performs a user space stack trace that may fault user pages in. - int unwind_deferred_init(struct unwind_work *work, unwind_callback_t func); Allows a tracer to register with the unwind deferred infrastructure. - int unwind_deferred_request(struct unwind_work *work, u64 *cookie); Used when a tracer request a deferred trace. Can be called from interrupt or NMI context. - void unwind_deferred_cancel(struct unwind_work *work); Called by a tracer to unregister from the deferred unwind infrastructure. - void unwind_deferred_task_exit(struct task_struct *task); Called by task exit code to flush any pending unwind requests. - void unwind_task_init(struct task_struct *task); Called by do_fork() to initialize the task struct for the deferred unwinder. - void unwind_task_free(struct task_struct *task); Called by do_exit() to free up any resources used by the deferred unwinder. None of the above is actually compiled unless an architecture enables it, which none currently do" Link: https://sourceware.org/binutils/wiki/sframe [1] Link: https://lore.kernel.org/linux-trace-kernel/20250717004958.260781923@kernel.org/ [2] Link: https://lore.kernel.org/linux-trace-kernel/20250717004958.432327787@kernel.org/ [3] Link: https://lore.kernel.org/linux-trace-kernel/20250710163522.3195293-1-jremus@linux.ibm.com/ [4] Link: https://lore.kernel.org/linux-trace-kernel/20250718164119.089692174@kernel.org/ [5] Link: https://lore.kernel.org/linux-trace-kernel/20250424192612.505622711@goodmis.org/ [6] Link: https://lore.kernel.org/linux-trace-kernel/20250717012848.927473176@kernel.org/ [7] * tag 'trace-deferred-unwind-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: unwind: Finish up unwind when a task exits unwind deferred: Use SRCU unwind_deferred_task_work() unwind: Add USED bit to only have one conditional on way back to user space unwind deferred: Add unwind_completed mask to stop spurious callbacks unwind deferred: Use bitmask to determine which callbacks to call unwind_user/deferred: Make unwind deferral requests NMI-safe unwind_user/deferred: Add deferred unwinding interface unwind_user/deferred: Add unwind cache unwind_user/deferred: Add unwind_user_faultable() unwind_user: Add user space unwinding API with frame pointer support |
||
![]() |
f914876eec |
bpf: Improve ctx access verifier error message
We've already had two "error during ctx access conversion" warnings triggered by syzkaller. Let's improve the error message by dumping the cnt variable so that we can more easily differentiate between the different error cases. Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/cc94316c30dd76fae4a75a664b61a2dbfe68e205.1754039605.git.paul.chaignon@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> |
||
![]() |
32d89a405a |
vhost: Use ERR_CAST inlined function instead of ERR_PTR(PTR_ERR(...))
cocci warning: ./kernel/vhost_task.c:148:9-16: WARNING: ERR_CAST can be used with tsk Use ERR_CAST inlined function instead of ERR_PTR(PTR_ERR(...)). Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn> Message-Id: <1a8499a5da53e4f72cf21aca044ae4b26db8b2ad.1749020055.git.xiaopei01@kylinos.cn> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> |
||
![]() |
f1befc82ad |
cfi: Move BPF CFI types and helpers to generic code
Instead of duplicating the same code for each architecture, move the CFI type hash variables for BPF function types and related helper functions to generic CFI code, and allow architectures to override the function definitions if needed. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20250801001004.1859976-7-samitolvanen@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> |
||
![]() |
f2d282e1df |
bitmap-for-6.17
Bits-related patched for 6.17: - find_random_bit() series (Yury); - GENMASK() consolidation (Vincent); - random cleanups (Shaopeng, Ben, Yury) -----BEGIN PGP SIGNATURE----- iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmiLkz0ACgkQsUSA/Tof vsj7MgwAvRSyevYSm9cm1Y99098M/7gWeJUeLAIy0GJdFaBQIcMkXkRGXJ9A0ZHb RoCFG4eiukHIDHRzJjncXUNTk0zVCbEifUF43BdnJrhjTePlou5SNVh6xhJfQ1Ai ENB4Q+nAZyIm43cUnoDhR24ne3pgJcY+oe6e7sQTRFF/6+nB4RDHmjIAMVsYgH30 w8iPBxNXXULAZNDgOPA3J5bACEnZPfOAhtoiNBC9s4MsE4o+Q8E9FVhReI2tiIhk t98kVZu7TFyrGcCdLz8EgbcG4KPFBmwOwOv8S1Mzgy46MwS//dd7MZA7y3MqTvJ/ VEMoTMAK14/VrgDxu/vdBsUJt/T1wPc+ZbUt/rNb530oSDkvjIo+4ihg1nfswqhn u+fj65wAHRW7CSkgpHn3bM/wvxmtIaE6AoY6jWwyuZ1zGIEV+5iPBo56kkmpJlYj GlnbiTHkNR/jGa1GwB3PDG2kzoqXVLz6EeFdZncUX53MGa90g0+5/k0ld+oBJTDh 7QbkZlW1 =uj9U -----END PGP SIGNATURE----- Merge tag 'bitmap-for-6.17' of https://github.com/norov/linux Pull bitmap updates from Yury Norov: - find_random_bit() series (Yury) - GENMASK() consolidation (Vincent) - random cleanups (Shaopeng, Ben, Yury) * tag 'bitmap-for-6.17' of https://github.com/norov/linux: bitfield: Ensure the return values of helper functions are checked test_bits: add tests for __GENMASK() and __GENMASK_ULL() bits: unify the non-asm GENMASK*() bits: split the definition of the asm and non-asm GENMASK*() cpumask: Remove unnecessary cpumask_nth_andnot() watchdog: fix opencoded cpumask_next_wrap() in watchdog_next_cpu() clocksource: Improve randomness in clocksource_verify_choose_cpus() cpumask: introduce cpumask_random() bitmap: generalize node_random() |