mirror_ubuntu-kernels/include
Johannes Weiner fd919a85cd mm: page_isolation: prepare for hygienic freelists
Page isolation currently sets MIGRATE_ISOLATE on a block, then drops
zone->lock and scans the block for straddling buddies to split up. 
Because this happens non-atomically wrt the page allocator, it's possible
for allocations to get a buddy whose first block is a regular pcp
migratetype but whose tail is isolated.  This means that in certain cases
memory can still be allocated after isolation.  It will also trigger the
freelist type hygiene warnings in subsequent patches.

start_isolate_page_range()
  isolate_single_pageblock()
    set_migratetype_isolate(tail)
      lock zone->lock
      move_freepages_block(tail) // nop
      set_pageblock_migratetype(tail)
      unlock zone->lock
                                                     __rmqueue_smallest()
                                                       del_page_from_freelist(head)
                                                       expand(head, head_mt)
                                                         WARN(head_mt != tail_mt)
    start_pfn = ALIGN_DOWN(MAX_ORDER_NR_PAGES)
    for (pfn = start_pfn, pfn < end_pfn)
      if (PageBuddy())
        split_free_page(head)

Introduce a variant of move_freepages_block() provided by the allocator
specifically for page isolation; it moves free pages, converts the block,
and handles the splitting of straddling buddies while holding zone->lock.

The allocator knows that pageblocks and buddies are always naturally
aligned, which means that buddies can only straddle blocks if they're
actually >pageblock_order.  This means the search-and-split part can be
simplified compared to what page isolation used to do.

Also tighten up the page isolation code around the expectations of which
pages can be large, and how they are freed.

Based on extensive discussions with and invaluable input from Zi Yan.

[hannes@cmpxchg.org: work around older gcc warning]
  Link: https://lkml.kernel.org/r/20240321142426.GB777580@cmpxchg.org
Link: https://lkml.kernel.org/r/20240320180429.678181-10-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 20:56:04 -07:00
..
acpi mm: change inlined allocation helpers to account at the call site 2024-04-25 20:55:59 -07:00
asm-generic mm: change inlined allocation helpers to account at the call site 2024-04-25 20:55:59 -07:00
clocksource
crypto mm: change inlined allocation helpers to account at the call site 2024-04-25 20:55:59 -07:00
drm drm fixes for 6.9-rc1 2024-03-21 19:04:31 -07:00
dt-bindings Char/Misc and other driver subsystem updates for 6.9-rc1 2024-03-21 13:21:31 -07:00
keys
kunit
kvm KVM: arm64: Fix host-programmed guest events in nVHE 2024-03-26 01:51:44 -07:00
linux mm: page_isolation: prepare for hygienic freelists 2024-04-25 20:56:04 -07:00
math-emu
media media updates for v6.9-rc1 2024-03-15 11:36:54 -07:00
memory
misc
net mm: change inlined allocation helpers to account at the call site 2024-04-25 20:55:59 -07:00
pcmcia
ras PCI/AER: Generalize TLP Header Log reading 2024-03-08 15:26:46 -06:00
rdma fix missing vmalloc.h includes 2024-04-25 20:55:49 -07:00
rv
scsi scsi: sd: Fix TCG OPAL unlock on system resume 2024-03-25 15:46:12 -04:00
soc Char/Misc and other driver subsystem updates for 6.9-rc1 2024-03-21 13:21:31 -07:00
sound ASoC: Fixes for v6.9 2024-04-05 08:48:12 +02:00
target
trace mm: free up PG_slab 2024-04-25 20:56:00 -07:00
uapi vhost-vdpa: change ioctl # for VDPA_GET_VRING_SIZE 2024-04-08 04:11:04 -04:00
ufs scsi: ufs: core: Add config_scsi_dev vops comment 2024-03-10 18:10:24 -04:00
vdso vdso: Use CONFIG_PAGE_SHIFT in vdso/datapage.h 2024-04-03 21:50:04 +02:00
video
xen