From: Muchun Song <muchun.song@linux.dev>
To: Usama Arif <usama.arif@linux.dev>
Cc: Oscar Salvador <osalvador@suse.de>,
David Hildenbrand <david@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Madhavan Srinivasan <maddy@linux.ibm.com>,
Michael Ellerman <mpe@ellerman.id.au>,
Mike Rapoport <rppt@kernel.org>, Lorenzo Stoakes <ljs@kernel.org>,
"Liam R. Howlett" <liam@infradead.org>,
Vlastimil Babka <vbabka@kernel.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Nicholas Piggin <npiggin@gmail.com>,
"Christophe Leroy (CS GROUP)" <chleroy@kernel.org>,
"Ritesh Harjani (IBM)" <ritesh.list@gmail.com>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
linuxppc-dev@lists.ozlabs.org,
Muchun Song <songmuchun@bytedance.com>
Subject: Re: [PATCH v3 15/19] mm/hugetlb_vmemmap: Move bootmem HVO setup to early init
Date: Wed, 3 Jun 2026 20:24:09 +0800
Message-ID: <f9dd874a-b637-4740-9a63-8da66de323ca@linux.dev>
In-Reply-To: <20260603120246.1572177-1-usama.arif@linux.dev>

On 2026/6/3 20:02, Usama Arif wrote:
> On Tue, 2 Jun 2026 18:10:35 +0800 Muchun Song <songmuchun@bytedance.com> wrote:
>
>> Bootmem HugeTLB pages currently defer HVO setup to
>> hugetlb_vmemmap_init_late(), because the optimization needs zone
>> information.
>>
>> Now that zone initialization is available earlier, the bootmem HVO setup
>> can be done directly from hugetlb_vmemmap_init_early(). This lets
>> gigantic HugeTLB pages apply HVO as soon as they are allocated.
>>
>> Bootmem gigantic pages that span multiple zones are now filtered out
>> when they are allocated, so the remaining bootmem gigantic pages seen by
>> later hugetlb initialization are already zone-valid. As a result,
>> hugetlb_vmemmap_init_late() no longer needs to handle bootmem HVO setup.
>>
>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>> ---
>> mm/hugetlb_vmemmap.c | 67 +++++++++-----------------------------------
>> 1 file changed, 13 insertions(+), 54 deletions(-)
>>
>> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
>> index ea6af85bfec1..464578ee246e 100644
>> --- a/mm/hugetlb_vmemmap.c
>> +++ b/mm/hugetlb_vmemmap.c
>> @@ -745,6 +745,8 @@ static bool vmemmap_should_optimize_bootmem_page(struct huge_bootmem_page *m)
>> return true;
>> }
>>
>> +static struct zone *pfn_to_zone(unsigned nid, unsigned long pfn);
>> +
>> /*
>> * Initialize memmap section for a gigantic page, HVO-style.
>> */
>> @@ -752,6 +754,7 @@ void __init hugetlb_vmemmap_init_early(int nid)
>> {
>> unsigned long psize, paddr, section_size;
>> unsigned long ns, i, pnum, pfn, nr_pages;
>> + unsigned long start, end;
>> struct huge_bootmem_page *m = NULL;
>> void *map;
>>
>> @@ -761,6 +764,8 @@ void __init hugetlb_vmemmap_init_early(int nid)
>> section_size = (1UL << PA_SECTION_SHIFT);
>>
>> list_for_each_entry(m, &huge_boot_pages[nid], list) {
>> + struct zone *zone;
>> +
>> if (!vmemmap_should_optimize_bootmem_page(m))
>> continue;
>>
>> @@ -769,6 +774,14 @@ void __init hugetlb_vmemmap_init_early(int nid)
>> paddr = virt_to_phys(m);
>> pfn = PHYS_PFN(paddr);
>> map = pfn_to_page(pfn);
>> + start = (unsigned long)map;
>> + end = start + hugetlb_vmemmap_size(m->hstate);
>> + zone = pfn_to_zone(nid, pfn);
>> +
>> + if (vmemmap_populate_hvo(start, end, huge_page_order(m->hstate),
>> + zone, HUGETLB_VMEMMAP_RESERVE_SIZE))
>> + panic("Failed to allocate memmap for HugeTLB page\n");
> The replaced hugetlb_vmemmap_init_late() path used to fall back to
> vmemmap_populate() if vmemmap_populate_hvo() returned an error and
> just lost the HVO optimization for that page.
>
> The new path panics on any non-zero return. Is the panic intended,
> given that vmemmap_populate_hvo() returns -ENOMEM on allocation
> failure and HVO is normally treated as an optimization rather than a
> hard requirement?

Yes, the panic is intentional. See patch 6 of this series:

  mm/sparse: Panic on memmap and usemap allocation failure

We already panic on OOM for memmap allocation anyway.
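
To make the difference between the two error policies concrete, here is a
minimal standalone sketch in C. It is not kernel code: mock_populate_hvo(),
setup_with_fallback(), setup_or_fail() and enum memmap_result are
hypothetical names invented for illustration, and the sizes are passed in as
plain parameters. Where the sketch returns an error, the patch calls panic().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's real API. */
enum memmap_result { MEMMAP_HVO, MEMMAP_FULL };

/* Mock of an HVO populate step: returns -ENOMEM when asked to fail. */
static int mock_populate_hvo(int simulate_oom)
{
	return simulate_oom ? -ENOMEM : 0;
}

/*
 * Old policy (removed late-init path): HVO is best effort. On failure,
 * fall back to a full memmap and account the full size instead.
 */
static enum memmap_result setup_with_fallback(int simulate_oom,
					      size_t *accounted,
					      size_t reserve_size,
					      size_t full_size)
{
	assert(reserve_size <= full_size);

	if (mock_populate_hvo(simulate_oom) < 0) {
		*accounted = full_size;
		return MEMMAP_FULL;
	}

	*accounted = reserve_size;
	return MEMMAP_HVO;
}

/*
 * New policy (early-init path): any non-zero return is fatal. The sketch
 * propagates the error; the patch calls panic() at this point.
 */
static int setup_or_fail(int simulate_oom, size_t *accounted,
			 size_t reserve_size)
{
	int ret = mock_populate_hvo(simulate_oom);

	if (ret)
		return ret;

	*accounted = reserve_size;
	return 0;
}
```

With the fallback policy a simulated OOM still yields a usable (but
unoptimized) memmap; with the new policy the same failure is reported as an
error and nothing is accounted.
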
Muchun,
Thanks.
>
>> + memmap_boot_pages_add(DIV_ROUND_UP(HUGETLB_VMEMMAP_RESERVE_SIZE, PAGE_SIZE));
>>
>> pnum = pfn_to_section_nr(pfn);
>> ns = psize / section_size;
>> @@ -800,60 +813,6 @@ static struct zone *pfn_to_zone(unsigned nid, unsigned long pfn)
>>
>> void __init hugetlb_vmemmap_init_late(int nid)
>> {
>> - struct huge_bootmem_page *m, *tm;
>> - unsigned long phys, nr_pages, start, end;
>> - unsigned long pfn, nr_mmap;
>> - struct zone *zone = NULL;
>> - struct hstate *h;
>> - void *map;
>> -
>> - if (!READ_ONCE(vmemmap_optimize_enabled))
>> - return;
>> -
>> - list_for_each_entry_safe(m, tm, &huge_boot_pages[nid], list) {
>> - if (!(m->flags & HUGE_BOOTMEM_HVO))
>> - continue;
>> -
>> - phys = virt_to_phys(m);
>> - h = m->hstate;
>> - pfn = PHYS_PFN(phys);
>> - nr_pages = pages_per_huge_page(h);
>> - map = pfn_to_page(pfn);
>> - start = (unsigned long)map;
>> - end = start + nr_pages * sizeof(struct page);
>> -
>> - if (!hugetlb_bootmem_page_zones_valid(nid, m)) {
>> - /*
>> - * Oops, the hugetlb page spans multiple zones.
>> - * Remove it from the list, and populate it normally.
>> - */
>> - list_del(&m->list);
>> -
>> - vmemmap_populate(start, end, nid, NULL);
>> - nr_mmap = end - start;
>> - memmap_boot_pages_add(DIV_ROUND_UP(nr_mmap, PAGE_SIZE));
>> -
>> - memblock_phys_free(phys, huge_page_size(h));
>> - continue;
>> - }
>> -
>> - if (!zone || !zone_spans_pfn(zone, pfn))
>> - zone = pfn_to_zone(nid, pfn);
>> - if (WARN_ON_ONCE(!zone))
>> - continue;
>> -
>> - if (vmemmap_populate_hvo(start, end, huge_page_order(h), zone,
>> - HUGETLB_VMEMMAP_RESERVE_SIZE) < 0) {
>> - /* Fallback if HVO population fails */
>> - vmemmap_populate(start, end, nid, NULL);
>> - nr_mmap = end - start;
>> - } else {
>> - m->flags |= HUGE_BOOTMEM_ZONES_VALID;
>> - nr_mmap = HUGETLB_VMEMMAP_RESERVE_SIZE;
>> - }
>> -
>> - memmap_boot_pages_add(DIV_ROUND_UP(nr_mmap, PAGE_SIZE));
>> - }
>> }
>> #endif
>>
>> --
>> 2.54.0
>>
>>