The Linux Kernel Mailing List
 help / color / mirror / Atom feed
From: Muchun Song <muchun.song@linux.dev>
To: Oscar Salvador <osalvador@suse.de>
Cc: Muchun Song <songmuchun@bytedance.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@kernel.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Madhavan Srinivasan <maddy@linux.ibm.com>,
	Lorenzo Stoakes <ljs@kernel.org>,
	"Liam R . Howlett" <Liam.Howlett@oracle.com>,
	Vlastimil Babka <vbabka@kernel.org>,
	Mike Rapoport <rppt@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>,
	Nicholas Piggin <npiggin@gmail.com>,
	Christophe Leroy <chleroy@kernel.org>,
	Ackerley Tng <ackerleytng@google.com>,
	Frank van der Linden <fvdl@google.com>,
	aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com,
	linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 01/69] mm/hugetlb: Fix boot panic with CONFIG_DEBUG_VM and HVO bootmem pages
Date: Thu, 14 May 2026 16:13:15 +0800	[thread overview]
Message-ID: <D282FD33-5A85-40F0-9B4A-3B4414DE5486@linux.dev> (raw)
In-Reply-To: <agV-8Ng-E0tZ9G5k@localhost.localdomain>



> On May 14, 2026, at 15:51, Oscar Salvador <osalvador@suse.de> wrote:
> 
> On Wed, May 13, 2026 at 09:04:29PM +0800, Muchun Song wrote:
>> Commit 622026e87c40 ("mm/hugetlb: remove fake head pages") switched
>> HVO to reuse per-zone shared tail pages from zone->vmemmap_tails[].
>> 
>> Those shared tail pages were initialized in hugetlb_vmemmap_init(), but
>> bootmem HugeTLB folios are prepared earlier from gather_bootmem_prealloc().
>> With hugetlb_free_vmemmap=on, prep_and_add_bootmem_folios() can access
>> pageblock flags on bootmem HugeTLB pages whose mirrored tail struct pages
>> already point to the shared tail page. On CONFIG_DEBUG_VM kernels,
>> get_pfnblock_bitmap_bitidx() then dereferences the still-uninitialized
>> shared tail page and can panic during boot.
>> 
>> Initialize zone->vmemmap_tails[] from gather_bootmem_prealloc(), before
>> bootmem HugeTLB folios are processed, and drop the later initialization
>> from hugetlb_vmemmap_init().
>> 
>> This bug only affects CONFIG_DEBUG_VM kernels, where the relevant
>> assertion is evaluated.
>> 
>> Fixes: 622026e87c40 ("mm/hugetlb: remove fake head pages")
>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> 
> For the correctness of the change
> 
> Acked-by: Oscar Salvador <osalvador@suse.de>

Thanks.

> 
> but I have a couple of comments:
> 
>> ---
>> mm/hugetlb.c         | 19 +++++++++++++++++++
>> mm/hugetlb_vmemmap.c | 17 -----------------
>> 2 files changed, 19 insertions(+), 17 deletions(-)
>> 
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index 31b34ca0f402..d22683ab30a1 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -3382,6 +3382,25 @@ static void __init gather_bootmem_prealloc(void)
>> .max_threads = num_node_state(N_MEMORY),
>> .numa_aware = true,
>> };
>> +#ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
>> + 	struct zone *zone;
>> +
>> + 	for_each_zone(zone) {
>> + 		for (int i = 0; i < NR_VMEMMAP_TAILS; i++) {
>> + 			struct page *tail, *p;
>> + 			unsigned int order;
>> +
>> + 			tail = zone->vmemmap_tails[i];
>> + 			if (!tail)
>> + 				continue;
>> +
>> + 			order = i + VMEMMAP_TAIL_MIN_ORDER;
>> + 			p = page_to_virt(tail);
>> + 			for (int j = 0; j < PAGE_SIZE / sizeof(struct page); j++)
>> + 				init_compound_tail(p + j, NULL, order, zone);
>> + 		}
>> + 	}
> 
> This deserves a comment why do we need to initialize those pages here, no need for
> a fat one but a hint, because everybody else looking at this will wonder why do not
> have it in hugetlb_vmemmap_init(), as the name suggests.

Make sense.

> 
> 
>> +#endif
>> 
>> 	padata_do_multithreaded(&job);
>> }
>> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
>> index 4a077d231d3a..62e61af18c9a 100644
>> --- a/mm/hugetlb_vmemmap.c
>> +++ b/mm/hugetlb_vmemmap.c
>> @@ -870,27 +870,10 @@ static const struct ctl_table hugetlb_vmemmap_sysctls[] = {
>> static int __init hugetlb_vmemmap_init(void)
> 
> At this point, hugetlb_vmemmap_init only initialites the sysctls for
> each hstate, should the name reflect that better?

A good point. I can add a new cleanup patch.

> 
> Also, vmemmap_get_tail() still thinks that tail pages are initialized
> here from this comment:

Good catch. I'll fix it.

> 
> "/*
>   * Only allocate the page, but do not initialize it.
>   *
>   * Any initialization done here will be overwritten by memmap_init().
>   *
>   * hugetlb_vmemmap_init() will take care of initialization after
>   * memmap_init().
>   */"
> 
> so we might want to adjust that as well, and I am not sure if we have
> more comments in the tree referencing hugetlb_vmemmap_init() as the init
> point for those pages that need correction.

Only this one.

> 
> 
> Having said that, and this is just a question, we cannot really make
> gather_bootmem_prealloc() also do the initialization of the sysfs right?
> I see that hugetlb sysfs gets initialized from hugetlb_init() a few
> calls after, so.. meh.

hugetlb_vmemmap_init() is introduced by me. I want to hide some symbols
from mm/hugetlb_vmemmap.c at that time. So I used late_initcall() for that.

> 
> Anyway, maybe just convert hugetlb_vmemmap_init to hugetlb_vmemmap_sysfs_create
> (pick a better name), so it does not look like having two entry points
> for vmemmap init stuff.

This bug fix is a temporary change, the whole zone initialization will
removed later in patch 30. Then, there will be only one place for vmemmap init :)

I’m curious about your thoughts on the practice of minimizing external
symbols. Personally, I’d prefer to move these initializations into
their respective files:

    - hugetlb_sysfs_init();
    - hugetlb_cgroup_file_init();
    - hugetlb_sysctl_init();

Of course, that’s just my personal preference.

Muchun,
Thanks.


> 
> -- 
> Oscar Salvador
> SUSE Labs



  reply	other threads:[~2026-05-14  8:14 UTC|newest]

Thread overview: 84+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-13 13:04 [PATCH v2 00/69] mm: Generalize HVO for HugeTLB and device DAX Muchun Song
2026-05-13 13:04 ` [PATCH v2 01/69] mm/hugetlb: Fix boot panic with CONFIG_DEBUG_VM and HVO bootmem pages Muchun Song
2026-05-14  7:51   ` Oscar Salvador
2026-05-14  8:13     ` Muchun Song [this message]
2026-05-13 13:04 ` [PATCH v2 02/69] mm/hugetlb_vmemmap: Fix __hugetlb_vmemmap_optimize_folios() Muchun Song
2026-05-14  7:56   ` Oscar Salvador
2026-05-14  8:19     ` Muchun Song
2026-05-13 13:04 ` [PATCH v2 03/69] powerpc/mm: Fix wrong addr_pfn tracking in compound vmemmap population Muchun Song
2026-05-14  8:00   ` Oscar Salvador
2026-05-13 13:04 ` [PATCH v2 04/69] mm/hugetlb: Initialize gigantic bootmem hugepage struct pages earlier Muchun Song
2026-05-14  8:05   ` Oscar Salvador
2026-05-14  8:16     ` Muchun Song
2026-05-13 13:04 ` [PATCH v2 05/69] mm/mm_init: Simplify deferred_free_pages() migratetype init Muchun Song
2026-05-14  8:12   ` Oscar Salvador
2026-05-13 13:04 ` [PATCH v2 06/69] mm/sparse: Panic on memmap and usemap allocation failure Muchun Song
2026-05-14  8:15   ` Oscar Salvador
2026-05-13 13:04 ` [PATCH v2 07/69] mm/sparse: Move subsection_map_init() into sparse_init() Muchun Song
2026-05-14  8:19   ` Oscar Salvador
2026-05-13 13:04 ` [PATCH v2 08/69] mm/mm_init: Defer sparse_init() until after zone initialization Muchun Song
2026-05-13 13:04 ` [PATCH v2 09/69] mm/mm_init: Defer hugetlb reservation " Muchun Song
2026-05-13 13:04 ` [PATCH v2 10/69] mm/mm_init: Remove set_pageblock_order() call from sparse_init() Muchun Song
2026-05-13 13:04 ` [PATCH v2 11/69] mm/sparse: Move sparse_vmemmap_init_nid_late() into sparse_init_nid() Muchun Song
2026-05-13 13:04 ` [PATCH v2 12/69] mm/hugetlb_cma: Validate hugetlb CMA range by zone at reserve time Muchun Song
2026-05-13 13:04 ` [PATCH v2 13/69] mm/hugetlb: Refactor early boot gigantic hugepage allocation Muchun Song
2026-05-13 13:04 ` [PATCH v2 14/69] mm/hugetlb: Free cross-zone bootmem gigantic pages after allocation Muchun Song
2026-05-13 13:04 ` [PATCH v2 15/69] mm/hugetlb_vmemmap: Move bootmem HVO setup to early init Muchun Song
2026-05-13 13:04 ` [PATCH v2 16/69] mm/hugetlb: Remove obsolete bootmem cross-zone checks Muchun Song
2026-05-13 13:04 ` [PATCH v2 17/69] mm/sparse-vmemmap: Remove sparse_vmemmap_init_nid_late() Muchun Song
2026-05-13 13:04 ` [PATCH v2 18/69] mm/hugetlb: Remove unused bootmem cma field Muchun Song
2026-05-13 13:04 ` [PATCH v2 19/69] mm/mm_init: Make __init_page_from_nid() static Muchun Song
2026-05-13 13:04 ` [PATCH v2 20/69] mm/sparse-vmemmap: Drop VMEMMAP_POPULATE_PAGEREF Muchun Song
2026-05-13 13:04 ` [PATCH v2 21/69] mm: Rename vmemmap optimization macros around folio semantics Muchun Song
2026-05-13 13:04 ` [PATCH v2 22/69] mm/sparse: Drop power-of-2 size requirement for struct mem_section Muchun Song
2026-05-13 13:04 ` [PATCH v2 23/69] mm/sparse-vmemmap: track compound page order in " Muchun Song
2026-05-13 13:04 ` [PATCH v2 24/69] mm/mm_init: Skip initializing shared vmemmap tail pages Muchun Song
2026-05-13 13:04 ` [PATCH v2 25/69] mm/sparse-vmemmap: Initialize shared tail vmemmap pages on allocation Muchun Song
2026-05-13 13:04 ` [PATCH v2 26/69] mm/sparse-vmemmap: Support section-based vmemmap accounting Muchun Song
2026-05-13 13:04 ` [PATCH v2 27/69] mm/sparse-vmemmap: Support section-based vmemmap optimization Muchun Song
2026-05-13 13:04 ` [PATCH v2 28/69] mm/hugetlb: Use generic vmemmap optimization macros Muchun Song
2026-05-13 13:04 ` [PATCH v2 29/69] mm/sparse: Mark memblocks present earlier Muchun Song
2026-05-13 13:04 ` [PATCH v2 30/69] mm/hugetlb: Switch HugeTLB to section-based vmemmap optimization Muchun Song
2026-05-13 13:04 ` [PATCH v2 31/69] mm/sparse: Remove section_map_size() Muchun Song
2026-05-13 13:05 ` [PATCH v2 32/69] mm/mm_init: Factor out pfn_to_zone() as a shared helper Muchun Song
2026-05-13 13:05 ` [PATCH v2 33/69] mm/sparse: Remove SPARSEMEM_VMEMMAP_PREINIT Muchun Song
2026-05-13 13:05 ` [PATCH v2 34/69] mm/sparse: Inline usemap allocation into sparse_init_nid() Muchun Song
2026-05-13 13:05 ` [PATCH v2 35/69] mm/hugetlb: Remove HUGE_BOOTMEM_HVO Muchun Song
2026-05-13 13:05 ` [PATCH v2 36/69] mm/hugetlb: Remove HUGE_BOOTMEM_CMA Muchun Song
2026-05-13 13:05 ` [PATCH v2 37/69] mm/sparse-vmemmap: Factor out shared vmemmap page allocation Muchun Song
2026-05-13 13:05 ` [PATCH v2 38/69] mm/sparse-vmemmap: Introduce CONFIG_SPARSEMEM_VMEMMAP_OPTIMIZATION Muchun Song
2026-05-13 13:05 ` [PATCH v2 39/69] mm/sparse-vmemmap: Switch DAX to vmemmap_shared_tail_page() Muchun Song
2026-05-13 13:05 ` [PATCH v2 40/69] powerpc/mm: " Muchun Song
2026-05-13 13:05 ` [PATCH v2 41/69] mm/sparse-vmemmap: Drop the extra tail page from DAX reservation Muchun Song
2026-05-13 13:05 ` [PATCH v2 42/69] mm/sparse-vmemmap: Switch DAX to section-based vmemmap optimization Muchun Song
2026-05-13 13:05 ` [PATCH v2 43/69] mm/sparse-vmemmap: Unify DAX and HugeTLB population paths Muchun Song
2026-05-13 13:05 ` [PATCH v2 44/69] mm/sparse-vmemmap: Remove the unused ptpfn argument Muchun Song
2026-05-13 13:05 ` [PATCH v2 45/69] powerpc/mm: Make vmemmap_populate_compound_pages() static Muchun Song
2026-05-13 13:05 ` [PATCH v2 46/69] mm/sparse-vmemmap: Map shared vmemmap tail pages read-only Muchun Song
2026-05-13 13:20 ` [PATCH v2 47/69] powerpc/mm: " Muchun Song
2026-05-13 13:20   ` [PATCH v2 48/69] mm/sparse-vmemmap: Inline vmemmap_populate_address() into its caller Muchun Song
2026-05-13 13:20   ` [PATCH v2 49/69] mm/hugetlb_vmemmap: Remove vmemmap_wrprotect_hvo() Muchun Song
2026-05-13 13:20   ` [PATCH v2 50/69] mm/sparse: Simplify section_nr_vmemmap_pages() Muchun Song
2026-05-13 13:20   ` [PATCH v2 51/69] mm/sparse-vmemmap: Introduce vmemmap_nr_struct_pages() Muchun Song
2026-05-13 13:20   ` [PATCH v2 52/69] powerpc/mm: Drop powerpc vmemmap_can_optimize() Muchun Song
2026-05-13 13:20   ` [PATCH v2 53/69] mm/sparse-vmemmap: Drop vmemmap_can_optimize() Muchun Song
2026-05-13 13:20   ` [PATCH v2 54/69] mm/sparse-vmemmap: Drop @pgmap from vmemmap population APIs Muchun Song
2026-05-13 13:20   ` [PATCH v2 55/69] mm/sparse: Decouple section activation from ZONE_DEVICE Muchun Song
2026-05-13 13:20   ` [PATCH v2 56/69] mm: Redefine HVO as Hugepage Vmemmap Optimization Muchun Song
2026-05-13 13:20   ` [PATCH v2 57/69] mm/sparse-vmemmap: Consolidate HVO enable checks Muchun Song
2026-05-13 13:20   ` [PATCH v2 58/69] mm/hugetlb: Make HVO optimizable checks depend on generic logic Muchun Song
2026-05-13 13:20   ` [PATCH v2 59/69] mm/sparse-vmemmap: Localize init_compound_tail() Muchun Song
2026-05-13 13:20   ` [PATCH v2 60/69] mm/mm_init: Check zone consistency on optimized vmemmap sections Muchun Song
2026-05-13 13:20   ` [PATCH v2 61/69] mm/hugetlb: Drop boot-time HVO handling for gigantic folios Muchun Song
2026-05-13 13:20   ` [PATCH v2 62/69] mm/hugetlb: Simplify hugetlb_folio_init_vmemmap() Muchun Song
2026-05-13 13:20   ` [PATCH v2 63/69] mm/hugetlb: Initialize the full bootmem hugepage in hugetlb code Muchun Song
2026-05-13 13:20   ` [PATCH v2 64/69] mm/mm_init: Factor out compound page initialization Muchun Song
2026-05-13 13:20   ` [PATCH v2 65/69] mm/mm_init: Make __init_single_page() static Muchun Song
2026-05-13 13:20   ` [PATCH v2 66/69] mm/cma: Move CMA pageblock initialization into cma_activate_area() Muchun Song
2026-05-13 13:20   ` [PATCH v2 67/69] mm/cma: Move init_cma_pageblock() into cma.c Muchun Song
2026-05-13 13:20   ` [PATCH v2 68/69] mm/mm_init: Initialize pageblock migratetype in memmap init helpers Muchun Song
2026-05-13 13:20   ` [PATCH v2 69/69] Documentation/mm: Rewrite vmemmap_dedup.rst for unified HVO Muchun Song
2026-05-13 17:46 ` [PATCH v2 00/69] mm: Generalize HVO for HugeTLB and device DAX Andrew Morton
2026-05-13 18:26   ` Oscar Salvador
2026-05-14  2:37     ` Muchun Song
2026-05-14  2:34   ` Muchun Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=D282FD33-5A85-40F0-9B4A-3B4414DE5486@linux.dev \
    --to=muchun.song@linux.dev \
    --cc=Liam.Howlett@oracle.com \
    --cc=ackerleytng@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=chleroy@kernel.org \
    --cc=david@kernel.org \
    --cc=fvdl@google.com \
    --cc=joao.m.martins@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=ljs@kernel.org \
    --cc=maddy@linux.ibm.com \
    --cc=mhocko@suse.com \
    --cc=mpe@ellerman.id.au \
    --cc=npiggin@gmail.com \
    --cc=osalvador@suse.de \
    --cc=rppt@kernel.org \
    --cc=songmuchun@bytedance.com \
    --cc=surenb@google.com \
    --cc=vbabka@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox