* [PATCH v9 0/3] kho: add support for deferred struct page init
@ 2026-04-23 12:25 Michal Clapinski
2026-04-23 12:25 ` [PATCH v9 1/3] kho: fix deferred initialization of scratch areas Michal Clapinski
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Michal Clapinski @ 2026-04-23 12:25 UTC (permalink / raw)
To: Evangelos Petrongonas, Pasha Tatashin, Mike Rapoport,
Pratyush Yadav, Alexander Graf, Samiullah Khawaja, kexec,
linux-mm
Cc: linux-kernel, Andrew Morton, Vlastimil Babka, Suren Baghdasaryan,
Michal Hocko, Brendan Jackman, Johannes Weiner, Zi Yan,
Michal Clapinski
When CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, struct page
initialization is deferred to parallel kthreads that run later in
the boot process.
Currently, KHO is incompatible with DEFERRED_STRUCT_PAGE_INIT.
This series fixes that incompatibility.
---
v9:
- moved init_pageblock_migratetype from memmap_init_reserved_range
to __init_page_from_nid
- reduced number of ifdefs
- new commit to test this feature
v8:
- moved overriding the migratetype from init_pageblock_migratetype
to callsites
v7:
- reimplemented the initialization of kho scratch again
v6:
- reimplemented the initialization of kho scratch
v5:
- rebased
v4:
- added a new commit to fix deferred init of kho scratch
- switched to ulong when referring to pfn
v3:
- changed commit msg
- don't invoke early_pfn_to_nid if CONFIG_DEFERRED_STRUCT_PAGE_INIT=n
v2:
- updated a comment
I took Evangelos's test code:
https://git.infradead.org/?p=users/vpetrog/linux.git;a=shortlog;h=refs/heads/kho-deferred-struct-page-init
and then modified it into this monster test that does 2 allocations:
at core_initcall (early) and at module_init (late). Then kexec, then
2 more allocations at these points, then restore the original 2, then
kexec, then restore the other 2. Basically I test preservation of early
and late allocations both on cold and on warm boot.
Tested it both with and without DEFERRED.
https://github.com/mclapinski/linux/commits/deferred_test/
This series probably doesn't apply cleanly onto anything currently.
It's based on mm-new with
"memblock: move reserve_bootmem_range() to memblock.c and make it static"
cherry-picked from rppt/memblock.
Evangelos Petrongonas (1):
kho: make preserved pages compatible with deferred struct page init
Michal Clapinski (2):
kho: fix deferred initialization of scratch areas
selftests: kho: test with deferred struct page init
include/linux/memblock.h | 21 +++++++++-
kernel/liveupdate/Kconfig | 2 -
kernel/liveupdate/kexec_handover.c | 52 ++++++++++++-------------
mm/memblock.c | 56 +++++++++++----------------
mm/mm_init.c | 30 +++++++-------
tools/testing/selftests/kho/vmtest.sh | 4 ++
6 files changed, 88 insertions(+), 77 deletions(-)
--
2.54.0.rc2.533.g4f5dca5207-goog
^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v9 1/3] kho: fix deferred initialization of scratch areas
  2026-04-23 12:25 [PATCH v9 0/3] kho: add support for deferred struct page init Michal Clapinski
@ 2026-04-23 12:25 ` Michal Clapinski
  2026-04-23 16:43   ` Pratyush Yadav
                     ` (2 more replies)
  2026-04-23 12:25 ` [PATCH v9 2/3] kho: make preserved pages compatible with deferred struct page init Michal Clapinski
  2026-04-23 12:25 ` [PATCH v9 3/3] selftests: kho: test " Michal Clapinski
  2 siblings, 3 replies; 10+ messages in thread
From: Michal Clapinski @ 2026-04-23 12:25 UTC (permalink / raw)
  To: Evangelos Petrongonas, Pasha Tatashin, Mike Rapoport,
	Pratyush Yadav, Alexander Graf, Samiullah Khawaja, kexec,
	linux-mm
  Cc: linux-kernel, Andrew Morton, Vlastimil Babka, Suren Baghdasaryan,
	Michal Hocko, Brendan Jackman, Johannes Weiner, Zi Yan,
	Michal Clapinski

Currently, if CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled,
kho_release_scratch() will initialize the struct pages and set migratetype
of KHO scratch. Unless the whole scratch fits below first_deferred_pfn,
some of that will be overwritten either by deferred_init_pages() or
memmap_init_reserved_range().

To fix it, make memmap_init_range(), deferred_init_memmap_chunk() and
__init_page_from_nid() recognize KHO scratch regions and set
migratetype of pageblocks in those regions to MIGRATE_CMA.

Co-developed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Michal Clapinski <mclapinski@google.com>
---
 include/linux/memblock.h           | 21 +++++++++--
 kernel/liveupdate/kexec_handover.c | 25 -------------
 mm/memblock.c                      | 56 ++++++++++++------------------
 mm/mm_init.c                       | 30 +++++++++-------
 4 files changed, 58 insertions(+), 74 deletions(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index b0f750d22a7b..5afcd99aa8c1 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -613,11 +613,28 @@ static inline void memtest_report_meminfo(struct seq_file *m) { }
 #ifdef CONFIG_MEMBLOCK_KHO_SCRATCH
 void memblock_set_kho_scratch_only(void);
 void memblock_clear_kho_scratch_only(void);
-void memmap_init_kho_scratch_pages(void);
+bool memblock_is_kho_scratch_memory(phys_addr_t addr);
+
+static inline enum migratetype kho_scratch_migratetype(unsigned long pfn,
+						       enum migratetype mt)
+{
+	if (memblock_is_kho_scratch_memory(PFN_PHYS(pfn)))
+		return MIGRATE_CMA;
+	return mt;
+}
 #else
 static inline void memblock_set_kho_scratch_only(void) { }
 static inline void memblock_clear_kho_scratch_only(void) { }
-static inline void memmap_init_kho_scratch_pages(void) {}
+static inline bool memblock_is_kho_scratch_memory(phys_addr_t addr)
+{
+	return false;
+}
+
+static inline enum migratetype kho_scratch_migratetype(unsigned long pfn,
+						       enum migratetype mt)
+{
+	return mt;
+}
 #endif
 
 #endif /* _LINUX_MEMBLOCK_H */
diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
index 18509d8082ea..a507366a2cf9 100644
--- a/kernel/liveupdate/kexec_handover.c
+++ b/kernel/liveupdate/kexec_handover.c
@@ -1576,35 +1576,10 @@ static __init int kho_init(void)
 }
 fs_initcall(kho_init);
 
-static void __init kho_release_scratch(void)
-{
-	phys_addr_t start, end;
-	u64 i;
-
-	memmap_init_kho_scratch_pages();
-
-	/*
-	 * Mark scratch mem as CMA before we return it. That way we
-	 * ensure that no kernel allocations happen on it. That means
-	 * we can reuse it as scratch memory again later.
-	 */
-	__for_each_mem_range(i, &memblock.memory, NULL, NUMA_NO_NODE,
-			     MEMBLOCK_KHO_SCRATCH, &start, &end, NULL) {
-		ulong start_pfn = pageblock_start_pfn(PFN_DOWN(start));
-		ulong end_pfn = pageblock_align(PFN_UP(end));
-		ulong pfn;
-
-		for (pfn = start_pfn; pfn < end_pfn; pfn += pageblock_nr_pages)
-			init_pageblock_migratetype(pfn_to_page(pfn),
-						   MIGRATE_CMA, false);
-	}
-}
-
 void __init kho_memory_init(void)
 {
 	if (kho_in.scratch_phys) {
 		kho_scratch = phys_to_virt(kho_in.scratch_phys);
-		kho_release_scratch();
 
 		if (kho_mem_retrieve(kho_get_fdt()))
 			kho_in.fdt_phys = 0;
diff --git a/mm/memblock.c b/mm/memblock.c
index a6a1c91e276d..01a962681726 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1026,40 +1026,6 @@ int __init_memblock memblock_physmem_add(phys_addr_t base, phys_addr_t size)
 }
 #endif
 
-#ifdef CONFIG_MEMBLOCK_KHO_SCRATCH
-__init void memblock_set_kho_scratch_only(void)
-{
-	kho_scratch_only = true;
-}
-
-__init void memblock_clear_kho_scratch_only(void)
-{
-	kho_scratch_only = false;
-}
-
-__init void memmap_init_kho_scratch_pages(void)
-{
-	phys_addr_t start, end;
-	unsigned long pfn;
-	int nid;
-	u64 i;
-
-	if (!IS_ENABLED(CONFIG_DEFERRED_STRUCT_PAGE_INIT))
-		return;
-
-	/*
-	 * Initialize struct pages for free scratch memory.
-	 * The struct pages for reserved scratch memory will be set up in
-	 * memmap_init_reserved_pages()
-	 */
-	__for_each_mem_range(i, &memblock.memory, NULL, NUMA_NO_NODE,
-			     MEMBLOCK_KHO_SCRATCH, &start, &end, &nid) {
-		for (pfn = PFN_UP(start); pfn < PFN_DOWN(end); pfn++)
-			init_deferred_page(pfn, nid);
-	}
-}
-#endif
-
 /**
  * memblock_setclr_flag - set or clear flag for a memory region
  * @type: memblock type to set/clear flag for
@@ -2533,6 +2499,28 @@ int reserve_mem_release_by_name(const char *name)
 	return 1;
 }
 
+#ifdef CONFIG_MEMBLOCK_KHO_SCRATCH
+__init void memblock_set_kho_scratch_only(void)
+{
+	kho_scratch_only = true;
+}
+
+__init void memblock_clear_kho_scratch_only(void)
+{
+	kho_scratch_only = false;
+}
+
+bool __init_memblock memblock_is_kho_scratch_memory(phys_addr_t addr)
+{
+	int i = memblock_search(&memblock.memory, addr);
+
+	if (i == -1)
+		return false;
+
+	return memblock_is_kho_scratch(&memblock.memory.regions[i]);
+}
+#endif
+
 #ifdef CONFIG_KEXEC_HANDOVER
 
 static int __init reserved_mem_preserve(void)
diff --git a/mm/mm_init.c b/mm/mm_init.c
index f9f8e1af921c..eddc0f03a779 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -692,9 +692,11 @@ void __meminit __init_page_from_nid(unsigned long pfn, int nid)
 	}
 	__init_single_page(pfn_to_page(pfn), pfn, zid, nid);
 
-	if (pageblock_aligned(pfn))
-		init_pageblock_migratetype(pfn_to_page(pfn), MIGRATE_MOVABLE,
-					   false);
+	if (pageblock_aligned(pfn)) {
+		enum migratetype mt =
+			kho_scratch_migratetype(pfn, MIGRATE_MOVABLE);
+		init_pageblock_migratetype(pfn_to_page(pfn), mt, false);
+	}
 }
 
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
@@ -927,7 +929,8 @@ void __meminit memmap_init_range(unsigned long size, int nid, unsigned long zone
 static void __init memmap_init_zone_range(struct zone *zone,
 					  unsigned long start_pfn,
 					  unsigned long end_pfn,
-					  unsigned long *hole_pfn)
+					  unsigned long *hole_pfn,
+					  enum migratetype mt)
 {
 	unsigned long zone_start_pfn = zone->zone_start_pfn;
 	unsigned long zone_end_pfn = zone_start_pfn + zone->spanned_pages;
@@ -940,8 +943,7 @@ static void __init memmap_init_zone_range(struct zone *zone,
 		return;
 
 	memmap_init_range(end_pfn - start_pfn, nid, zone_id, start_pfn,
-			  zone_end_pfn, MEMINIT_EARLY, NULL, MIGRATE_MOVABLE,
-			  false);
+			  zone_end_pfn, MEMINIT_EARLY, NULL, mt, false);
 
 	if (*hole_pfn < start_pfn)
 		init_unavailable_range(*hole_pfn, start_pfn, zone_id, nid);
@@ -957,6 +959,8 @@ static void __init memmap_init(void)
 
 	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) {
 		struct pglist_data *node = NODE_DATA(nid);
+		enum migratetype mt =
+			kho_scratch_migratetype(start_pfn, MIGRATE_MOVABLE);
 
 		for (j = 0; j < MAX_NR_ZONES; j++) {
 			struct zone *zone = node->node_zones + j;
@@ -965,7 +969,7 @@ static void __init memmap_init(void)
 				continue;
 
 			memmap_init_zone_range(zone, start_pfn, end_pfn,
-					       &hole_pfn);
+					       &hole_pfn, mt);
 			zone_id = j;
 		}
 	}
@@ -1970,7 +1974,7 @@ unsigned long __init node_map_pfn_alignment(void)
 
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
 static void __init deferred_free_pages(unsigned long pfn,
-				       unsigned long nr_pages)
+				       unsigned long nr_pages, enum migratetype mt)
 {
 	struct page *page;
 	unsigned long i;
@@ -1983,8 +1987,7 @@ static void __init deferred_free_pages(unsigned long pfn,
 	/* Free a large naturally-aligned chunk if possible */
 	if (nr_pages == MAX_ORDER_NR_PAGES && IS_MAX_ORDER_ALIGNED(pfn)) {
 		for (i = 0; i < nr_pages; i += pageblock_nr_pages)
-			init_pageblock_migratetype(page + i, MIGRATE_MOVABLE,
-						   false);
+			init_pageblock_migratetype(page + i, mt, false);
 		__free_pages_core(page, MAX_PAGE_ORDER, MEMINIT_EARLY);
 		return;
 	}
@@ -1994,8 +1997,7 @@ static void __init deferred_free_pages(unsigned long pfn,
 
 	for (i = 0; i < nr_pages; i++, page++, pfn++) {
 		if (pageblock_aligned(pfn))
-			init_pageblock_migratetype(page, MIGRATE_MOVABLE,
-						   false);
+			init_pageblock_migratetype(page, mt, false);
 		__free_pages_core(page, 0, MEMINIT_EARLY);
 	}
 }
@@ -2053,6 +2055,8 @@ deferred_init_memmap_chunk(unsigned long start_pfn, unsigned long end_pfn,
 	for_each_free_mem_range(i, nid, 0, &start, &end, NULL) {
 		unsigned long spfn = PFN_UP(start);
 		unsigned long epfn = PFN_DOWN(end);
+		enum migratetype mt =
+			kho_scratch_migratetype(spfn, MIGRATE_MOVABLE);
 
 		if (spfn >= end_pfn)
 			break;
@@ -2065,7 +2069,7 @@ deferred_init_memmap_chunk(unsigned long start_pfn, unsigned long end_pfn,
 		unsigned long chunk_end = min(mo_pfn, epfn);
 
 		nr_pages += deferred_init_pages(zone, spfn, chunk_end);
-		deferred_free_pages(spfn, chunk_end - spfn);
+		deferred_free_pages(spfn, chunk_end - spfn, mt);
 
 		spfn = chunk_end;
 
-- 
2.54.0.rc2.533.g4f5dca5207-goog

^ permalink raw reply related	[flat|nested] 10+ messages in thread
* Re: [PATCH v9 1/3] kho: fix deferred initialization of scratch areas
  2026-04-23 12:25 ` [PATCH v9 1/3] kho: fix deferred initialization of scratch areas Michal Clapinski
@ 2026-04-23 16:43   ` Pratyush Yadav
  2026-04-23 17:42   ` Pasha Tatashin
  2026-04-24  8:12   ` Mike Rapoport
  2 siblings, 0 replies; 10+ messages in thread
From: Pratyush Yadav @ 2026-04-23 16:43 UTC (permalink / raw)
  To: Michal Clapinski
  Cc: Evangelos Petrongonas, Pasha Tatashin, Mike Rapoport,
	Pratyush Yadav, Alexander Graf, Samiullah Khawaja, kexec, linux-mm,
	linux-kernel, Andrew Morton, Vlastimil Babka, Suren Baghdasaryan,
	Michal Hocko, Brendan Jackman, Johannes Weiner, Zi Yan

On Thu, Apr 23 2026, Michal Clapinski wrote:

> Currently, if CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled,
> kho_release_scratch() will initialize the struct pages and set migratetype
> of KHO scratch. Unless the whole scratch fits below first_deferred_pfn,
> some of that will be overwritten either by deferred_init_pages() or
> memmap_init_reserved_range().
>
> To fix it, make memmap_init_range(), deferred_init_memmap_chunk() and
> __init_page_from_nid() recognize KHO scratch regions and set
> migratetype of pageblocks in those regions to MIGRATE_CMA.
>
> Co-developed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> Signed-off-by: Michal Clapinski <mclapinski@google.com>

Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org>

[...]

-- 
Regards,
Pratyush Yadav

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH v9 1/3] kho: fix deferred initialization of scratch areas
  2026-04-23 12:25 ` [PATCH v9 1/3] kho: fix deferred initialization of scratch areas Michal Clapinski
  2026-04-23 16:43   ` Pratyush Yadav
@ 2026-04-23 17:42   ` Pasha Tatashin
  2026-04-24  8:12   ` Mike Rapoport
  2 siblings, 0 replies; 10+ messages in thread
From: Pasha Tatashin @ 2026-04-23 17:42 UTC (permalink / raw)
  To: Michal Clapinski
  Cc: Evangelos Petrongonas, Pasha Tatashin, Mike Rapoport,
	Pratyush Yadav, Alexander Graf, Samiullah Khawaja, kexec, linux-mm,
	linux-kernel, Andrew Morton, Vlastimil Babka, Suren Baghdasaryan,
	Michal Hocko, Brendan Jackman, Johannes Weiner, Zi Yan

On 04-23 14:25, Michal Clapinski wrote:
> Currently, if CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled,
> kho_release_scratch() will initialize the struct pages and set migratetype
> of KHO scratch. Unless the whole scratch fits below first_deferred_pfn,
> some of that will be overwritten either by deferred_init_pages() or
> memmap_init_reserved_range().
>
> To fix it, make memmap_init_range(), deferred_init_memmap_chunk() and
> __init_page_from_nid() recognize KHO scratch regions and set
> migratetype of pageblocks in those regions to MIGRATE_CMA.
>
> Co-developed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> Signed-off-by: Michal Clapinski <mclapinski@google.com>

Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH v9 1/3] kho: fix deferred initialization of scratch areas
  2026-04-23 12:25 ` [PATCH v9 1/3] kho: fix deferred initialization of scratch areas Michal Clapinski
  2026-04-23 16:43   ` Pratyush Yadav
  2026-04-23 17:42   ` Pasha Tatashin
@ 2026-04-24  8:12   ` Mike Rapoport
  2 siblings, 0 replies; 10+ messages in thread
From: Mike Rapoport @ 2026-04-24 8:12 UTC (permalink / raw)
  To: Michal Clapinski
  Cc: Evangelos Petrongonas, Pasha Tatashin, Pratyush Yadav,
	Alexander Graf, Samiullah Khawaja, kexec, linux-mm, linux-kernel,
	Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Brendan Jackman, Johannes Weiner, Zi Yan

On Thu, Apr 23, 2026 at 02:25:36PM +0200, Michal Clapinski wrote:
> Currently, if CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled,
> kho_release_scratch() will initialize the struct pages and set migratetype
> of KHO scratch. Unless the whole scratch fits below first_deferred_pfn,
> some of that will be overwritten either by deferred_init_pages() or
> memmap_init_reserved_range().
>
> To fix it, make memmap_init_range(), deferred_init_memmap_chunk() and
> __init_page_from_nid() recognize KHO scratch regions and set
> migratetype of pageblocks in those regions to MIGRATE_CMA.
>
> Co-developed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> Signed-off-by: Michal Clapinski <mclapinski@google.com>

Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

[...]

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 10+ messages in thread
* [PATCH v9 2/3] kho: make preserved pages compatible with deferred struct page init
  2026-04-23 12:25 [PATCH v9 0/3] kho: add support for deferred struct page init Michal Clapinski
  2026-04-23 12:25 ` [PATCH v9 1/3] kho: fix deferred initialization of scratch areas Michal Clapinski
@ 2026-04-23 12:25 ` Michal Clapinski
  2026-04-23 12:25 ` [PATCH v9 3/3] selftests: kho: test " Michal Clapinski
  2 siblings, 0 replies; 10+ messages in thread
From: Michal Clapinski @ 2026-04-23 12:25 UTC (permalink / raw)
  To: Evangelos Petrongonas, Pasha Tatashin, Mike Rapoport,
	Pratyush Yadav, Alexander Graf, Samiullah Khawaja, kexec,
	linux-mm
  Cc: linux-kernel, Andrew Morton, Vlastimil Babka, Suren Baghdasaryan,
	Michal Hocko, Brendan Jackman, Johannes Weiner, Zi Yan,
	Michal Clapinski

From: Evangelos Petrongonas <epetron@amazon.de>

When CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, struct page
initialization is deferred to parallel kthreads that run later in the
boot process.

During KHO restoration, kho_preserved_memory_reserve() writes metadata
for each preserved memory region. However, if the struct page has not
been initialized, this write targets uninitialized memory, potentially
leading to errors like:

  BUG: unable to handle page fault for address: ...

Fix this by introducing kho_get_preserved_page(), which ensures all
struct pages in a preserved region are initialized by calling
init_deferred_page() which is a no-op when the struct page is already
initialized.

Signed-off-by: Evangelos Petrongonas <epetron@amazon.de>
Co-developed-by: Michal Clapinski <mclapinski@google.com>
Signed-off-by: Michal Clapinski <mclapinski@google.com>
Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
 kernel/liveupdate/Kconfig          |  2 --
 kernel/liveupdate/kexec_handover.c | 27 ++++++++++++++++++++++++++-
 2 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/kernel/liveupdate/Kconfig b/kernel/liveupdate/Kconfig
index 1a8513f16ef7..c13af38ba23a 100644
--- a/kernel/liveupdate/Kconfig
+++ b/kernel/liveupdate/Kconfig
@@ -1,12 +1,10 @@
 # SPDX-License-Identifier: GPL-2.0-only
 menu "Live Update and Kexec HandOver"
-	depends on !DEFERRED_STRUCT_PAGE_INIT
 
 config KEXEC_HANDOVER
 	bool "kexec handover"
 	depends on ARCH_SUPPORTS_KEXEC_HANDOVER && ARCH_SUPPORTS_KEXEC_FILE
-	depends on !DEFERRED_STRUCT_PAGE_INIT
 	select MEMBLOCK_KHO_SCRATCH
 	select KEXEC_FILE
 	select LIBFDT
diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
index a507366a2cf9..d5718bef6d4d 100644
--- a/kernel/liveupdate/kexec_handover.c
+++ b/kernel/liveupdate/kexec_handover.c
@@ -473,6 +473,31 @@ struct page *kho_restore_pages(phys_addr_t phys, unsigned long nr_pages)
 }
 EXPORT_SYMBOL_GPL(kho_restore_pages);
 
+/*
+ * With CONFIG_DEFERRED_STRUCT_PAGE_INIT, struct pages in higher memory regions
+ * may not be initialized yet at the time KHO deserializes preserved memory.
+ * KHO uses the struct page to store metadata and a later initialization would
+ * overwrite it. Ensure all the struct pages in the preservation are
+ * initialized. kho_preserved_memory_reserve() marks the reservation as noinit
+ * to make sure they don't get re-initialized later.
+ */
+static struct page *__init kho_get_preserved_page(phys_addr_t phys,
+						  unsigned int order)
+{
+	unsigned long pfn = PHYS_PFN(phys);
+	int nid;
+
+	if (!IS_ENABLED(CONFIG_DEFERRED_STRUCT_PAGE_INIT))
+		return pfn_to_page(pfn);
+
+	nid = early_pfn_to_nid(pfn);
+	for (unsigned long i = 0; i < (1UL << order); i++)
+		init_deferred_page(pfn + i, nid);
+
+	return pfn_to_page(pfn);
+}
+
 static int __init kho_preserved_memory_reserve(phys_addr_t phys,
 					       unsigned int order)
 {
@@ -481,7 +506,7 @@ static int __init kho_preserved_memory_reserve(phys_addr_t phys,
 	u64 sz;
 
 	sz = 1 << (order + PAGE_SHIFT);
-	page = phys_to_page(phys);
+	page = kho_get_preserved_page(phys, order);
 
 	/* Reserve the memory preserved in KHO in memblock */
 	memblock_reserve(phys, sz);
-- 
2.54.0.rc2.533.g4f5dca5207-goog

^ permalink raw reply related	[flat|nested] 10+ messages in thread
* [PATCH v9 3/3] selftests: kho: test with deferred struct page init
  2026-04-23 12:25 [PATCH v9 0/3] kho: add support for deferred struct page init Michal Clapinski
  2026-04-23 12:25 ` [PATCH v9 1/3] kho: fix deferred initialization of scratch areas Michal Clapinski
  2026-04-23 12:25 ` [PATCH v9 2/3] kho: make preserved pages compatible with deferred struct page init Michal Clapinski
@ 2026-04-23 12:25 ` Michal Clapinski
  2026-04-23 16:43   ` Pratyush Yadav
                     ` (2 more replies)
  2 siblings, 3 replies; 10+ messages in thread
From: Michal Clapinski @ 2026-04-23 12:25 UTC (permalink / raw)
  To: Evangelos Petrongonas, Pasha Tatashin, Mike Rapoport,
	Pratyush Yadav, Alexander Graf, Samiullah Khawaja, kexec,
	linux-mm
  Cc: linux-kernel, Andrew Morton, Vlastimil Babka, Suren Baghdasaryan,
	Michal Hocko, Brendan Jackman, Johannes Weiner, Zi Yan,
	Michal Clapinski

Enable DEFERRED_STRUCT_PAGE_INIT which depends on SMP.
Also enable additional debugging options.

Signed-off-by: Michal Clapinski <mclapinski@google.com>
---
 tools/testing/selftests/kho/vmtest.sh | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/testing/selftests/kho/vmtest.sh b/tools/testing/selftests/kho/vmtest.sh
index 49fdac8e8b15..0014bd76e88d 100755
--- a/tools/testing/selftests/kho/vmtest.sh
+++ b/tools/testing/selftests/kho/vmtest.sh
@@ -59,10 +59,14 @@ function build_kernel() {
 	tee "$kconfig" > "$kho_config" <<EOF
 CONFIG_BLK_DEV_INITRD=y
 CONFIG_KEXEC_HANDOVER=y
+CONFIG_KEXEC_HANDOVER_DEBUG=y
 CONFIG_KEXEC_HANDOVER_DEBUGFS=y
 CONFIG_TEST_KEXEC_HANDOVER=y
 CONFIG_DEBUG_KERNEL=y
 CONFIG_DEBUG_VM=y
+CONFIG_DEBUG_VM_PGFLAGS=y
+CONFIG_SMP=y
+CONFIG_DEFERRED_STRUCT_PAGE_INIT=y
 $arch_kconfig
 EOF

-- 
2.54.0.rc2.533.g4f5dca5207-goog

^ permalink raw reply related	[flat|nested] 10+ messages in thread
* Re: [PATCH v9 3/3] selftests: kho: test with deferred struct page init
  2026-04-23 12:25 ` [PATCH v9 3/3] selftests: kho: test " Michal Clapinski
@ 2026-04-23 16:43   ` Pratyush Yadav
  2026-04-23 17:44   ` Pasha Tatashin
  2026-04-24  8:13   ` Mike Rapoport
  2 siblings, 0 replies; 10+ messages in thread
From: Pratyush Yadav @ 2026-04-23 16:43 UTC (permalink / raw)
  To: Michal Clapinski
  Cc: Evangelos Petrongonas, Pasha Tatashin, Mike Rapoport,
	Pratyush Yadav, Alexander Graf, Samiullah Khawaja, kexec, linux-mm,
	linux-kernel, Andrew Morton, Vlastimil Babka, Suren Baghdasaryan,
	Michal Hocko, Brendan Jackman, Johannes Weiner, Zi Yan

On Thu, Apr 23 2026, Michal Clapinski wrote:

> Enable DEFERRED_STRUCT_PAGE_INIT which depends on SMP.
> Also enable additional debugging options.
>
> Signed-off-by: Michal Clapinski <mclapinski@google.com>

Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org>

[...]

-- 
Regards,
Pratyush Yadav

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH v9 3/3] selftests: kho: test with deferred struct page init
  2026-04-23 12:25 ` [PATCH v9 3/3] selftests: kho: test " Michal Clapinski
  2026-04-23 16:43   ` Pratyush Yadav
@ 2026-04-23 17:44   ` Pasha Tatashin
  2026-04-24  8:13   ` Mike Rapoport
  2 siblings, 0 replies; 10+ messages in thread
From: Pasha Tatashin @ 2026-04-23 17:44 UTC (permalink / raw)
  To: Michal Clapinski
  Cc: Evangelos Petrongonas, Pasha Tatashin, Mike Rapoport,
	Pratyush Yadav, Alexander Graf, Samiullah Khawaja, kexec, linux-mm,
	linux-kernel, Andrew Morton, Vlastimil Babka, Suren Baghdasaryan,
	Michal Hocko, Brendan Jackman, Johannes Weiner, Zi Yan

On 04-23 14:25, Michal Clapinski wrote:
> Enable DEFERRED_STRUCT_PAGE_INIT which depends on SMP.
> Also enable additional debugging options.
>
> Signed-off-by: Michal Clapinski <mclapinski@google.com>

Acked-by: Pasha Tatashin <pasha.tatashin@soleen.com>

> ---
>  tools/testing/selftests/kho/vmtest.sh | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/tools/testing/selftests/kho/vmtest.sh b/tools/testing/selftests/kho/vmtest.sh
> index 49fdac8e8b15..0014bd76e88d 100755
> --- a/tools/testing/selftests/kho/vmtest.sh
> +++ b/tools/testing/selftests/kho/vmtest.sh
> @@ -59,10 +59,14 @@ function build_kernel() {
> 	tee "$kconfig" > "$kho_config" <<EOF
> CONFIG_BLK_DEV_INITRD=y
> CONFIG_KEXEC_HANDOVER=y
> +CONFIG_KEXEC_HANDOVER_DEBUG=y
> CONFIG_KEXEC_HANDOVER_DEBUGFS=y
> CONFIG_TEST_KEXEC_HANDOVER=y
> CONFIG_DEBUG_KERNEL=y
> CONFIG_DEBUG_VM=y
> +CONFIG_DEBUG_VM_PGFLAGS=y
> +CONFIG_SMP=y
> +CONFIG_DEFERRED_STRUCT_PAGE_INIT=y
> $arch_kconfig
> EOF
>
> -- 
> 2.54.0.rc2.533.g4f5dca5207-goog
>

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH v9 3/3] selftests: kho: test with deferred struct page init
  2026-04-23 12:25 ` [PATCH v9 3/3] selftests: kho: test " Michal Clapinski
  2026-04-23 16:43 ` Pratyush Yadav
  2026-04-23 17:44 ` Pasha Tatashin
@ 2026-04-24  8:13 ` Mike Rapoport
  2 siblings, 0 replies; 10+ messages in thread
From: Mike Rapoport @ 2026-04-24  8:13 UTC (permalink / raw)
  To: Michal Clapinski
  Cc: Evangelos Petrongonas, Pasha Tatashin, Pratyush Yadav,
	Alexander Graf, Samiullah Khawaja, kexec, linux-mm,
	linux-kernel, Andrew Morton, Vlastimil Babka,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
	Johannes Weiner, Zi Yan

On Thu, Apr 23, 2026 at 02:25:38PM +0200, Michal Clapinski wrote:
> Enable DEFERRED_STRUCT_PAGE_INIT which depends on SMP.
> Also enable additional debugging options.
>
> Signed-off-by: Michal Clapinski <mclapinski@google.com>

Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

> ---
>  tools/testing/selftests/kho/vmtest.sh | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/tools/testing/selftests/kho/vmtest.sh b/tools/testing/selftests/kho/vmtest.sh
> index 49fdac8e8b15..0014bd76e88d 100755
> --- a/tools/testing/selftests/kho/vmtest.sh
> +++ b/tools/testing/selftests/kho/vmtest.sh
> @@ -59,10 +59,14 @@ function build_kernel() {
>  	tee "$kconfig" > "$kho_config" <<EOF
>  CONFIG_BLK_DEV_INITRD=y
>  CONFIG_KEXEC_HANDOVER=y
> +CONFIG_KEXEC_HANDOVER_DEBUG=y
>  CONFIG_KEXEC_HANDOVER_DEBUGFS=y
>  CONFIG_TEST_KEXEC_HANDOVER=y
>  CONFIG_DEBUG_KERNEL=y
>  CONFIG_DEBUG_VM=y
> +CONFIG_DEBUG_VM_PGFLAGS=y
> +CONFIG_SMP=y
> +CONFIG_DEFERRED_STRUCT_PAGE_INIT=y
>  $arch_kconfig
>  EOF
>
> -- 
> 2.54.0.rc2.533.g4f5dca5207-goog
>

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 10+ messages in thread
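As a quick sanity check when reusing the kconfig fragment from the diff above outside of vmtest.sh, the heredoc contents can be written to a file and grepped for the options the series depends on. This is a minimal sketch, not part of the selftest; the fragment filename is arbitrary, and which options count as "required" here is an assumption based on the commit message (DEFERRED_STRUCT_PAGE_INIT depends on SMP):

```shell
#!/bin/sh
# Write out the same fragment vmtest.sh appends via its heredoc
# (filename "kho_config.frag" is chosen just for this example).
frag="kho_config.frag"
cat > "$frag" <<EOF
CONFIG_BLK_DEV_INITRD=y
CONFIG_KEXEC_HANDOVER=y
CONFIG_KEXEC_HANDOVER_DEBUG=y
CONFIG_KEXEC_HANDOVER_DEBUGFS=y
CONFIG_TEST_KEXEC_HANDOVER=y
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_VM=y
CONFIG_DEBUG_VM_PGFLAGS=y
CONFIG_SMP=y
CONFIG_DEFERRED_STRUCT_PAGE_INIT=y
EOF

# Check that every option the test scenario needs is set to =y.
missing=0
for opt in KEXEC_HANDOVER TEST_KEXEC_HANDOVER SMP DEFERRED_STRUCT_PAGE_INIT; do
	grep -q "^CONFIG_${opt}=y" "$frag" || { echo "missing CONFIG_${opt}"; missing=1; }
done
[ "$missing" -eq 0 ] && echo "all required options present"
```

Note that a fragment only requests options; the kernel's own Kconfig dependency resolution decides what actually lands in the final .config, so checking the built .config the same way is the authoritative step.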
end of thread, other threads:[~2026-04-24  8:13 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-23 12:25 [PATCH v9 0/3] kho: add support for deferred struct page init Michal Clapinski
2026-04-23 12:25 ` [PATCH v9 1/3] kho: fix deferred initialization of scratch areas Michal Clapinski
2026-04-23 16:43   ` Pratyush Yadav
2026-04-23 17:42   ` Pasha Tatashin
2026-04-24  8:12   ` Mike Rapoport
2026-04-23 12:25 ` [PATCH v9 2/3] kho: make preserved pages compatible with deferred struct page init Michal Clapinski
2026-04-23 12:25 ` [PATCH v9 3/3] selftests: kho: test " Michal Clapinski
2026-04-23 16:43   ` Pratyush Yadav
2026-04-23 17:44   ` Pasha Tatashin
2026-04-24  8:13   ` Mike Rapoport