From: Kiryl Shutsemau <kas@kernel.org>
To: "David Hildenbrand (Red Hat)" <david@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Muchun Song <muchun.song@linux.dev>,
	Matthew Wilcox <willy@infradead.org>,
	Usama Arif <usamaarif642@gmail.com>,
	Frank van der Linden <fvdl@google.com>,
	Oscar Salvador <osalvador@suse.de>,
	Mike Rapoport <rppt@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Zi Yan <ziy@nvidia.com>, Baoquan He <bhe@redhat.com>,
	Michal Hocko <mhocko@suse.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Jonathan Corbet <corbet@lwn.net>,
	kernel-team@meta.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org
Subject: Re: [PATCHv3 10/15] mm/hugetlb: Remove fake head pages
Date: Thu, 15 Jan 2026 18:58:39 +0000
Message-ID: <aWk1tZyFZOOkF0AH@thinkstation>
In-Reply-To: <b10e3b2a-b298-4d27-b8ce-63327864c220@kernel.org>

On Thu, Jan 15, 2026 at 06:41:44PM +0100, David Hildenbrand (Red Hat) wrote:
> On 1/15/26 18:23, Kiryl Shutsemau wrote:
> > On Thu, Jan 15, 2026 at 05:49:43PM +0100, David Hildenbrand (Red Hat) wrote:
> > > On 1/15/26 15:45, Kiryl Shutsemau wrote:
> > > > HugeTLB Vmemmap Optimization (HVO) reduces memory usage by freeing most
> > > > vmemmap pages for huge pages and remapping the freed range to a single
> > > > page containing the struct page metadata.
> > > >
> > > > With the new mask-based compound_info encoding (for power-of-2 struct
> > > > page sizes), all tail pages of the same order are now identical
> > > > regardless of which compound page they belong to. This means the tail
> > > > pages can be truly shared without fake heads.
> > > >
> > > > Allocate a single page of initialized tail struct pages per NUMA node
> > > > per order in the vmemmap_tails[] array in pglist_data. All huge pages
> > > > of that order on the node share this tail page, mapped read-only into
> > > > their vmemmap. The head page remains unique per huge page.
> > > >
> > > > This eliminates fake heads while maintaining the same memory savings,
> > > > and simplifies compound_head() by removing fake head detection.
> > > >
> > > > Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
> > > > ---
> > > > include/linux/mmzone.h | 16 ++++++++++++++-
> > > > mm/hugetlb_vmemmap.c | 44 ++++++++++++++++++++++++++++++++++++++++--
> > > > mm/sparse-vmemmap.c | 44 ++++++++++++++++++++++++++++++++++--------
> > > > 3 files changed, 93 insertions(+), 11 deletions(-)
> > > >
> > > > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> > > > index 322ed4c42cfc..2ee3eb610291 100644
> > > > --- a/include/linux/mmzone.h
> > > > +++ b/include/linux/mmzone.h
> > > > @@ -82,7 +82,11 @@
> > > > * currently expect (see CONFIG_HAVE_GIGANTIC_FOLIOS): with hugetlb, we expect
> > > > * no folios larger than 16 GiB on 64bit and 1 GiB on 32bit.
> > > > */
> > > > -#define MAX_FOLIO_ORDER get_order(IS_ENABLED(CONFIG_64BIT) ? SZ_16G : SZ_1G)
> > > > +#ifdef CONFIG_64BIT
> > > > +#define MAX_FOLIO_ORDER (34 - PAGE_SHIFT)
> > > > +#else
> > > > +#define MAX_FOLIO_ORDER (30 - PAGE_SHIFT)
> > > > +#endif
> > >
> > > Where do these magic values stem from, and how do they relate to the
> > > comment above that clearly spells out 16G vs. 1G?
> >
> > This doesn't change the resulting value: 1UL << 34 is 16 GiB, 1UL << 30
> > is 1 GiB. Subtract PAGE_SHIFT to get the order.
> >
> > The change allows the value to be used to define NR_VMEMMAP_TAILS, which
> > is used to specify the size of the vmemmap_tails array.
>
> get_order(IS_ENABLED(CONFIG_64BIT) ? SZ_16G : SZ_1G) should evaluate to a
> constant by the compiler.
>
> See __builtin_constant_p handling in get_order().
>
> If that is not working then we have to figure out why.

asm-offsets.s compilation fails:

	../include/linux/mmzone.h:1574:16: error: fields must have a constant size:
	'variable length array in structure' extension will never be supported
	1574 | unsigned long vmemmap_tails[NR_VMEMMAP_TAILS];

Here's what the preprocessor dump of vmemmap_tails looks like:

	unsigned long vmemmap_tails[(get_order(1 ? (0x400000000ULL) : 0x40000000) - (( __builtin_constant_p(2 * ((1UL) << 12) / sizeof(struct page)) ? ((2 * ((1UL) << 12) / sizeof(struct page)) < 2 ? 0 : 63 - __builtin_clzll(2 * ((1UL) << 12) / sizeof(struct page))) : (sizeof(2 * ((1UL) << 12) / sizeof(struct page)) <= 4) ? __ilog2_u32(2 * ((1UL) << 12) / sizeof(struct page)) : __ilog2_u64(2 * ((1UL) << 12) / sizeof(struct page)) )) + 1)];

And here's get_order():

static inline __attribute__((__gnu_inline__)) __attribute__((__unused__)) __attribute__((no_instrument_function)) __attribute__((__always_inline__)) __attribute__((__const__)) int get_order(unsigned long size)
{
	if (__builtin_constant_p(size)) {
		if (!size)
			return 64 - 12;
		if (size < (1UL << 12))
			return 0;
		return ( __builtin_constant_p((size) - 1) ? (((size) - 1) < 2 ? 0 : 63 - __builtin_clzll((size) - 1)) : (sizeof((size) - 1) <= 4) ? __ilog2_u32((size) - 1) : __ilog2_u64((size) - 1) ) - 12 + 1;
	}
	size--;
	size >>= 12;
	return fls64(size);
}

I am not sure why it is not a compile-time constant. I have not dug
deeper.

Switching to ilog2(IS_ENABLED(CONFIG_64BIT) ? SZ_16G : SZ_1G) - PAGE_SHIFT works,
but I personally find my variant more readable.
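
One possible explanation, which I have not verified against the kernel
build: get_order() is an inline function, and in C a function call is
never an integer constant expression, even when the optimizer folds it,
while an array bound inside a struct needs one. ilog2() is a macro, so
its expansion stays an expression the compiler can fold as a constant.
A minimal standalone sketch of the difference, assuming 4 KiB pages; the
EX_* and ex_* names are hypothetical stand-ins, not kernel code:

```c
#include <assert.h>

/* Hypothetical stand-ins for PAGE_SHIFT and SZ_16G; 4 KiB pages assumed. */
#define EX_PAGE_SHIFT	12
#define EX_SZ_16G	0x400000000ULL

/* Macro form: plain arithmetic on literals, an integer constant expression. */
#define EX_MAX_FOLIO_ORDER	(34 - EX_PAGE_SHIFT)

/*
 * Function form: computes the same order at run time, but a call to it
 * is not a constant expression in C, inline or not.
 */
static inline int ex_get_order(unsigned long long size)
{
	int order = 0;

	/* Count the bits needed for the number of pages minus one. */
	size = (size - 1) >> EX_PAGE_SHIFT;
	while (size) {
		order++;
		size >>= 1;
	}
	return order;
}

struct ex_pgdat {
	/* Accepted: the array bound is a constant expression. */
	unsigned long tails[EX_MAX_FOLIO_ORDER + 1];
	/* Rejected if uncommented: the bound would be a function call. */
	/* unsigned long bad[ex_get_order(EX_SZ_16G) + 1]; */
};

/* 16 GiB with 4 KiB pages is order 22. */
static_assert(EX_MAX_FOLIO_ORDER == 22, "unexpected order");
```

Uncommenting the second array member should make the build fail with an
error similar to the one above.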

Do you want me to dig deeper to check whether get_order() can be made
to work?

> Was this only a specific config where you ran into compile-time problems?

I am not aware of any particular config dependency. It seems to happen
everywhere.

--
Kiryl Shutsemau / Kirill A. Shutemov