From: Kiryl Shutsemau <kas@kernel.org>
To: Muchun Song <muchun.song@linux.dev>
Cc: "David Hildenbrand (Red Hat)" <david@kernel.org>,
	 Andrew Morton <akpm@linux-foundation.org>,
	Matthew Wilcox <willy@infradead.org>,
	 Usama Arif <usamaarif642@gmail.com>,
	Frank van der Linden <fvdl@google.com>,
	 Oscar Salvador <osalvador@suse.de>,
	Mike Rapoport <rppt@kernel.org>,
	 Vlastimil Babka <vbabka@suse.cz>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	 Zi Yan <ziy@nvidia.com>, Baoquan He <bhe@redhat.com>,
	Michal Hocko <mhocko@suse.com>,
	 Johannes Weiner <hannes@cmpxchg.org>,
	Jonathan Corbet <corbet@lwn.net>,
	kernel-team@meta.com,  linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org
Subject: Re: [PATCHv3 10/15] mm/hugetlb: Remove fake head pages
Date: Mon, 19 Jan 2026 15:15:22 +0000
Message-ID: <aW5JqibNe4CVBa07@thinkstation>
In-Reply-To: <0F1C93F3-9A1A-4929-9157-589CF8C0588D@linux.dev>

On Sat, Jan 17, 2026 at 10:38:48AM +0800, Muchun Song wrote:
> 
> 
> > On Jan 16, 2026, at 23:52, Kiryl Shutsemau <kas@kernel.org> wrote:
> > 
> > On Fri, Jan 16, 2026 at 10:38:02AM +0800, Muchun Song wrote:
> >> 
> >> 
> >>> On Jan 16, 2026, at 01:23, Kiryl Shutsemau <kas@kernel.org> wrote:
> >>> 
> >>> On Thu, Jan 15, 2026 at 05:49:43PM +0100, David Hildenbrand (Red Hat) wrote:
> >>>> On 1/15/26 15:45, Kiryl Shutsemau wrote:
> >>>>> HugeTLB Vmemmap Optimization (HVO) reduces memory usage by freeing most
> >>>>> vmemmap pages for huge pages and remapping the freed range to a single
> >>>>> page containing the struct page metadata.
> >>>>> 
> >>>>> With the new mask-based compound_info encoding (for power-of-2 struct
> >>>>> page sizes), all tail pages of the same order are now identical
> >>>>> regardless of which compound page they belong to. This means the tail
> >>>>> pages can be truly shared without fake heads.
> >>>>> 
> >>>>> Allocate a single page of initialized tail struct pages per NUMA node
> >>>>> per order in the vmemmap_tails[] array in pglist_data. All huge pages
> >>>>> of that order on the node share this tail page, mapped read-only into
> >>>>> their vmemmap. The head page remains unique per huge page.
> >>>>> 
> >>>>> This eliminates fake heads while maintaining the same memory savings,
> >>>>> and simplifies compound_head() by removing fake head detection.
> >>>>> 
> >>>>> Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
> >>>>> ---
> >>>>> include/linux/mmzone.h | 16 ++++++++++++++-
> >>>>> mm/hugetlb_vmemmap.c   | 44 ++++++++++++++++++++++++++++++++++++++++--
> >>>>> mm/sparse-vmemmap.c    | 44 ++++++++++++++++++++++++++++++++++--------
> >>>>> 3 files changed, 93 insertions(+), 11 deletions(-)
> >>>>> 
> >>>>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> >>>>> index 322ed4c42cfc..2ee3eb610291 100644
> >>>>> --- a/include/linux/mmzone.h
> >>>>> +++ b/include/linux/mmzone.h
> >>>>> @@ -82,7 +82,11 @@
> >>>>>  * currently expect (see CONFIG_HAVE_GIGANTIC_FOLIOS): with hugetlb, we expect
> >>>>>  * no folios larger than 16 GiB on 64bit and 1 GiB on 32bit.
> >>>>>  */
> >>>>> -#define MAX_FOLIO_ORDER get_order(IS_ENABLED(CONFIG_64BIT) ? SZ_16G : SZ_1G)
> >>>>> +#ifdef CONFIG_64BIT
> >>>>> +#define MAX_FOLIO_ORDER (34 - PAGE_SHIFT)
> >>>>> +#else
> >>>>> +#define MAX_FOLIO_ORDER (30 - PAGE_SHIFT)
> >>>>> +#endif
> >>>> 
> >>>> Where do these magic values stem from, and how do they relate to the
> >>>> comment above that clearly spells out 16G vs. 1G?
> >>> 
> >>> This doesn't change the resulting value: 1UL << 34 is 16GiB, 1UL << 30
> >>> is 1G. Subtract PAGE_SHIFT to get the order.
> >>> 
> >>> The change allows the value to be used to define NR_VMEMMAP_TAILS, which
> >>> is used to specify the size of the vmemmap_tails array.
> >> 
> >> How about allocating the ->vmemmap_tails array dynamically? If
> >> sizeof(struct page) is not a power of two, then we could optimize away
> >> this array. Besides, the original MAX_FOLIO_ORDER could work as well.
> > 
> > This is tricky.
> > 
> > We need the vmemmap_tails array to be around early, in
> > hugetlb_vmemmap_init_early(). At that point, slab is not functional
> > yet.
> 
> I mean a zero-size array at the end of pg_data_t; no slab is needed.

For !NUMA, the struct is in BSS. See contig_page_data.

Dynamic array won't fly there.
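
The point about BSS can be illustrated with a small user-space sketch. It
is not kernel code: PAGE_SHIFT is assumed to be 12, NR_TAILS_SKETCH is a
hypothetical stand-in for NR_VMEMMAP_TAILS (whose real definition may
differ), and the two structs are simplified stand-ins for pg_data_t with
void * in place of the real element type. It shows that a bound built from
(34 - PAGE_SHIFT) is a compile-time constant that can size an array inside
a statically allocated struct, while a flexible array member in a static
object gets no storage of its own.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed for illustration: 4 KiB pages on a 64-bit build. */
#define PAGE_SHIFT      12
#define MAX_FOLIO_ORDER (34 - PAGE_SHIFT)

/* Hypothetical stand-in for NR_VMEMMAP_TAILS; the real definition may differ. */
#define NR_TAILS_SKETCH (MAX_FOLIO_ORDER + 1)

/* 1 << 34 bytes is 16 GiB, so the shift form matches the documented limit. */
static_assert((1ULL << (MAX_FOLIO_ORDER + PAGE_SHIFT)) == (16ULL << 30),
	      "MAX_FOLIO_ORDER should describe a 16 GiB folio");

/* Fixed-size variant: the array bound is an integer constant expression. */
struct node_fixed {
	int node_id;
	void *tails[NR_TAILS_SKETCH];
};

/* Flexible-array variant: storage for tails[] must come from an allocator. */
struct node_flex {
	int node_id;
	void *tails[];
};

/* Both objects are zero-initialized statics, similar to a struct in BSS. */
static struct node_fixed fixed_node;
static struct node_flex flex_node;

/* Number of usable slots in the statically allocated fixed array. */
size_t fixed_tail_slots(void)
{
	return sizeof(fixed_node.tails) / sizeof(fixed_node.tails[0]);
}

/* Bytes the static object reserves from the start of tails[] onward. */
size_t flex_tail_bytes(void)
{
	return sizeof(flex_node) - offsetof(struct node_flex, tails);
}
```

With the assumed values, fixed_tail_slots() reports 23 slots, whereas
flex_tail_bytes() reports 0 on common ABIs: the static object with the
flexible member has nowhere to put the tails without a separate
allocation.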

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

Thread overview: 34+ messages
2026-01-15 14:45 [PATCHv3 00/15] mm: Eliminate fake head pages from vmemmap optimization Kiryl Shutsemau
2026-01-15 14:45 ` [PATCHv3 01/15] x86/vdso32: Prepare for <linux/pgtable.h> inclusion Kiryl Shutsemau
2026-01-15 14:45 ` [PATCHv3 02/15] mm: Move MAX_FOLIO_ORDER definition to mmzone.h Kiryl Shutsemau
2026-01-15 16:35   ` David Hildenbrand (Red Hat)
2026-01-15 16:48     ` David Hildenbrand (Red Hat)
2026-01-15 17:26       ` Kiryl Shutsemau
2026-01-15 17:45         ` David Hildenbrand (Red Hat)
2026-01-15 14:45 ` [PATCHv3 03/15] mm: Change the interface of prep_compound_tail() Kiryl Shutsemau
2026-01-15 14:45 ` [PATCHv3 04/15] mm: Rename the 'compound_head' field in the 'struct page' to 'compound_info' Kiryl Shutsemau
2026-01-15 14:45 ` [PATCHv3 05/15] mm: Move set/clear_compound_head() next to compound_head() Kiryl Shutsemau
2026-01-15 14:45 ` [PATCHv3 06/15] mm: Rework compound_head() for power-of-2 sizeof(struct page) Kiryl Shutsemau
2026-01-15 14:45 ` [PATCHv3 07/15] mm: Make page_zonenum() use head page Kiryl Shutsemau
2026-01-15 14:45 ` [PATCHv3 08/15] mm/sparse: Check memmap alignment for compound_info_has_mask() Kiryl Shutsemau
2026-01-15 14:45 ` [PATCHv3 09/15] mm/hugetlb: Refactor code around vmemmap_walk Kiryl Shutsemau
2026-01-19 10:04   ` Muchun Song
2026-01-19 15:26     ` Kiryl Shutsemau
2026-01-15 14:45 ` [PATCHv3 10/15] mm/hugetlb: Remove fake head pages Kiryl Shutsemau
2026-01-15 16:49   ` David Hildenbrand (Red Hat)
2026-01-15 17:23     ` Kiryl Shutsemau
2026-01-15 17:41       ` David Hildenbrand (Red Hat)
2026-01-15 18:58         ` Kiryl Shutsemau
2026-01-15 19:33           ` David Hildenbrand (Red Hat)
2026-01-15 19:46             ` David Hildenbrand (Red Hat)
2026-01-16  2:38       ` Muchun Song
2026-01-16 15:52         ` Kiryl Shutsemau
2026-01-17  2:38           ` Muchun Song
2026-01-19 15:15             ` Kiryl Shutsemau [this message]
2026-01-20  2:50               ` Muchun Song
2026-01-16 16:18   ` [PATCHv3.1 " Kiryl Shutsemau
2026-01-15 14:45 ` [PATCHv3 11/15] mm: Drop fake head checks Kiryl Shutsemau
2026-01-15 14:45 ` [PATCHv3 12/15] hugetlb: Remove VMEMMAP_SYNCHRONIZE_RCU Kiryl Shutsemau
2026-01-15 14:45 ` [PATCHv3 13/15] mm/hugetlb: Remove hugetlb_optimize_vmemmap_key static key Kiryl Shutsemau
2026-01-15 14:46 ` [PATCHv3 14/15] mm: Remove the branch from compound_head() Kiryl Shutsemau
2026-01-15 14:46 ` [PATCHv3 15/15] hugetlb: Update vmemmap_dedup.rst Kiryl Shutsemau
