From: Kiryl Shutsemau <kas@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>,
	Muchun Song <muchun.song@linux.dev>
Cc: David Hildenbrand <david@kernel.org>,
	Oscar Salvador <osalvador@suse.de>,
	Mike Rapoport <rppt@kernel.org>, Vlastimil Babka <vbabka@suse.cz>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Matthew Wilcox <willy@infradead.org>, Zi Yan <ziy@nvidia.com>,
	Baoquan He <bhe@redhat.com>, Michal Hocko <mhocko@suse.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Jonathan Corbet <corbet@lwn.net>,
	Usama Arif <usamaarif642@gmail.com>,
	kernel-team@meta.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	Kiryl Shutsemau <kas@kernel.org>
Subject: [PATCH 00/11] mm/hugetlb: Eliminate fake head pages from vmemmap optimization
Date: Fri,  5 Dec 2025 19:43:36 +0000
Message-ID: <20251205194351.1646318-1-kas@kernel.org>

This series removes "fake head pages" from the HugeTLB vmemmap
optimization (HVO) by changing how tail pages encode their relationship
to the head page.

It simplifies compound_head() and page_ref_add_unless(). Both are in the
hot path.

Background
==========

HVO reduces memory overhead by freeing vmemmap pages for HugeTLB pages
and remapping the freed virtual addresses to a single physical page.
Previously, all tail page vmemmap entries were remapped to the first
vmemmap page (containing the head struct page), creating "fake heads" -
tail pages that appear to have PG_head set when accessed through the
deduplicated vmemmap.
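
To make the savings concrete, here is a rough sketch of the arithmetic
behind HVO. The 4 KiB base page size and the 64-byte struct page are
assumptions (typical for x86-64), and the helper names are made up for
illustration; none of this comes from the series itself.

```c
#include <assert.h>

/* Assumed values, typical for x86-64; not taken from the series. */
#define BASE_PAGE_SIZE		4096UL	/* bytes per base page */
#define STRUCT_PAGE_SIZE	64UL	/* assumed sizeof(struct page) */

/* Number of vmemmap pages needed to describe one huge page. */
static unsigned long vmemmap_pages(unsigned long huge_page_size)
{
	unsigned long nr_struct_pages = huge_page_size / BASE_PAGE_SIZE;

	return nr_struct_pages * STRUCT_PAGE_SIZE / BASE_PAGE_SIZE;
}

/* vmemmap pages that can be freed if one is kept per huge page. */
static unsigned long vmemmap_pages_freed(unsigned long huge_page_size)
{
	return vmemmap_pages(huge_page_size) - 1;
}
```

Under these assumptions a 2 MiB HugeTLB page is described by 8 vmemmap
pages, 7 of which can be freed, and a 1 GiB page by 4096, of which 4095
can be freed.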

This required special handling in compound_head() to detect and work
around fake heads, adding complexity and overhead to a very hot path.

New Approach
============

For architectures/configs where sizeof(struct page) is a power of 2 (the
common case), this series changes how the position of the head page is
encoded in the tail pages.

Instead of storing a pointer to the head page, the ->compound_info
(renamed from ->compound_head) now stores a mask.

The mask can be applied to any tail page's virtual address to compute
the head page address. Critically, all tail pages of the same order now
have identical compound_info values, regardless of which compound page
they belong to.
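
A minimal userspace sketch of the idea, not the code from patch 4. It
assumes a 64-byte stand-in for struct page, a head that is naturally
aligned to the size of its struct page range, and bit 0 of the stored
value marking a tail. struct fake_page, tail_mask(), prep_compound() and
head_of() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-in for struct page. The 64-byte size and the field
 * layout are assumptions for illustration, not the kernel definition.
 */
struct fake_page {
	uintptr_t flags;
	uintptr_t compound_info;	/* tails: mask with bit 0 set */
	unsigned char pad[64 - 2 * sizeof(uintptr_t)];
};

_Static_assert(sizeof(struct fake_page) == 64, "size must be a power of 2");

/* Value stored in every tail of a compound page of the given order. */
static uintptr_t tail_mask(unsigned int order)
{
	uintptr_t span = (uintptr_t)sizeof(struct fake_page) << order;

	/* Clear the offset bits within the span, set bit 0 as tail marker. */
	return ~(span - 1) | 1;
}

/* Mark head[1..2^order-1] as tails. The head must be span-aligned. */
static void prep_compound(struct fake_page *head, unsigned int order)
{
	uintptr_t span = (uintptr_t)sizeof(struct fake_page) << order;
	unsigned long i;

	assert(((uintptr_t)head & (span - 1)) == 0);

	head->compound_info = 0;
	for (i = 1; i < (1UL << order); i++)
		head[i].compound_info = tail_mask(order);
}

/* Recover the head from any page of the compound page. */
static struct fake_page *head_of(struct fake_page *page)
{
	uintptr_t info = page->compound_info;

	if (!(info & 1))
		return page;	/* not a tail */

	/* Bit 0 of a page address is always 0, so the marker drops out. */
	return (struct fake_page *)((uintptr_t)page & info);
}
```

Since tail_mask() depends only on the order, the tails of two different
compound pages of the same order hold the same value, which is the
property the shared tail page relies on.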

This enables a key optimization: instead of remapping tail vmemmap
entries to the head page (creating fake heads), we remap them to a
shared, pre-initialized vmemmap_tail page per hstate. The head page
gets its own dedicated vmemmap page, eliminating fake heads entirely.

Benefits
========

1. Smaller generated code. On defconfig, I see a ~15K reduction of text
   in vmlinux:

   add/remove: 6/33 grow/shrink: 54/262 up/down: 6130/-21922 (-15792)

2. Simplified compound_head(): No fake head detection needed. The
   function is now branchless for power-of-2 struct page sizes.

3. Eliminated race condition: The old scheme required synchronize_rcu()
   to coordinate between HVO remapping and speculative PFN walkers that
   might write to fake heads. With the head page always in writable
   memory, this synchronization is unnecessary.

4. Removed static key: hugetlb_optimize_vmemmap_key is no longer needed
   since compound_head() no longer has HVO-specific branches.

5. Cleaner architecture: The vmemmap layout is now straightforward -
   head page has its own vmemmap, tails share a read-only template.
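
As a rough illustration of item 2, one way a mask-based lookup can be
made branchless is sketched below. This is an assumption about the
general technique, not the code from patch 10; it assumes bit 0 of the
stored value marks a tail, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Straightforward form: one branch on the tail marker in bit 0. */
static uintptr_t head_addr_branchy(uintptr_t page_addr, uintptr_t info)
{
	if (info & 1)
		return page_addr & info;
	return page_addr;
}

/* Same result without a branch. */
static uintptr_t head_addr_branchless(uintptr_t page_addr, uintptr_t info)
{
	uintptr_t is_tail = info & 1;

	/*
	 * is_tail - 1 is 0 for a tail and all-ones otherwise, so the
	 * effective mask is info for a tail and ~0 for any other page.
	 */
	uintptr_t mask = info | (is_tail - 1);

	return page_addr & mask;
}
```

For a non-tail page the word may hold arbitrary data with bit 0 clear;
OR-ing it with all-ones makes the final AND a no-op, so the page address
is returned unchanged.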

I had hoped to see a performance improvement, but my testing thus far has
shown either no change or only a slight improvement within the noise.

Series Organization
===================

Patches 1-3: Preparatory refactoring
  - Change prep_compound_tail() interface to take order
  - Rename compound_head field to compound_info
  - Move set/clear_compound_head() near compound_head()

Patch 4: Core encoding change
  - Implement mask-based encoding for power-of-2 struct page

Patches 5-6: HVO restructuring
  - Refactor vmemmap_walk to support separate head/tail pages
  - Introduce per-hstate vmemmap_tail, eliminate fake heads

Patches 7-9: Cleanup
  - Remove fake head checks from compound_head(), PageTail(), etc.
  - Remove VMEMMAP_SYNCHRONIZE_RCU and synchronize_rcu() calls
  - Remove hugetlb_optimize_vmemmap_key static key

Patch 10: Optimization
  - Implement branchless compound_head() for power-of-2 case

Patch 11: Documentation
  - Update vmemmap_dedup.rst to reflect new architecture

Kiryl Shutsemau (11):
  mm: Change the interface of prep_compound_tail()
  mm: Rename the 'compound_head' field in the 'struct page' to
    'compound_info'
  mm: Move set/clear_compound_head() to compound_head()
  mm: Rework compound_head() for power-of-2 sizeof(struct page)
  mm/hugetlb: Refactor code around vmemmap_walk
  mm/hugetlb: Remove fake head pages
  mm: Drop fake head checks and fix a race condition
  hugetlb: Remove VMEMMAP_SYNCHRONIZE_RCU
  mm/hugetlb: Remove hugetlb_optimize_vmemmap_key static key
  mm: Remove the branch from compound_head()
  hugetlb: Update vmemmap_dedup.rst

 .../admin-guide/kdump/vmcoreinfo.rst          |   2 +-
 Documentation/mm/vmemmap_dedup.rst            |  62 ++---
 include/linux/hugetlb.h                       |   3 +
 include/linux/mm_types.h                      |  20 +-
 include/linux/page-flags.h                    | 163 +++++-------
 include/linux/page_ref.h                      |   8 +-
 include/linux/types.h                         |   2 +-
 kernel/vmcore_info.c                          |   2 +-
 mm/hugetlb.c                                  |   8 +-
 mm/hugetlb_vmemmap.c                          | 245 ++++++++----------
 mm/hugetlb_vmemmap.h                          |   4 +-
 mm/internal.h                                 |  11 +-
 mm/mm_init.c                                  |   2 +-
 mm/page_alloc.c                               |   4 +-
 mm/slab.h                                     |   2 +-
 mm/util.c                                     |  15 +-
 16 files changed, 242 insertions(+), 311 deletions(-)

-- 
2.51.2


Thread overview: 41+ messages
2025-12-05 19:43 Kiryl Shutsemau [this message]
2025-12-05 19:43 ` [PATCH 01/11] mm: Change the interface of prep_compound_tail() Kiryl Shutsemau
2025-12-05 21:49   ` Usama Arif
2025-12-05 22:10     ` Kiryl Shutsemau
2025-12-05 22:15       ` Usama Arif
2025-12-05 19:43 ` [PATCH 02/11] mm: Rename the 'compound_head' field in the 'struct page' to 'compound_info' Kiryl Shutsemau
2025-12-05 19:43 ` [PATCH 03/11] mm: Move set/clear_compound_head() to compound_head() Kiryl Shutsemau
2025-12-05 19:43 ` [PATCH 04/11] mm: Rework compound_head() for power-of-2 sizeof(struct page) Kiryl Shutsemau
2025-12-06  0:25   ` Usama Arif
2025-12-06 16:29     ` Kiryl Shutsemau
2025-12-06 17:36       ` Usama Arif
2025-12-05 19:43 ` [PATCH 05/11] mm/hugetlb: Refactor code around vmemmap_walk Kiryl Shutsemau
2025-12-06 16:42   ` Usama Arif
2025-12-08 10:30     ` Kiryl Shutsemau
2025-12-05 19:43 ` [PATCH 06/11] mm/hugetlb: Remove fake head pages Kiryl Shutsemau
2025-12-06 17:03   ` Usama Arif
2025-12-08 10:40     ` Kiryl Shutsemau
2025-12-05 19:43 ` [PATCH 07/11] mm: Drop fake head checks and fix a race condition Kiryl Shutsemau
2025-12-06 17:27   ` Usama Arif
2025-12-08 10:48     ` Kiryl Shutsemau
2025-12-05 19:43 ` [PATCH 08/11] hugetlb: Remove VMEMMAP_SYNCHRONIZE_RCU Kiryl Shutsemau
2025-12-05 19:43 ` [PATCH 09/11] mm/hugetlb: Remove hugetlb_optimize_vmemmap_key static key Kiryl Shutsemau
2025-12-05 19:43 ` [PATCH 10/11] mm: Remove the branch from compound_head() Kiryl Shutsemau
2025-12-05 19:43 ` [PATCH 11/11] hugetlb: Update vmemmap_dedup.rst Kiryl Shutsemau
2025-12-05 20:16 ` [PATCH 00/11] mm/hugetlb: Eliminate fake head pages from vmemmap optimization David Hildenbrand (Red Hat)
2025-12-05 20:33   ` Kiryl Shutsemau
2025-12-05 20:44     ` David Hildenbrand (Red Hat)
2025-12-05 20:54       ` Kiryl Shutsemau
2025-12-05 21:34         ` David Hildenbrand (Red Hat)
2025-12-05 21:41           ` Kiryl Shutsemau
2025-12-06 17:47             ` Usama Arif
2025-12-08  9:53               ` David Hildenbrand (Red Hat)
2025-12-08  8:51             ` David Hildenbrand (Red Hat)
2025-12-09  6:22 ` Muchun Song
2025-12-09 14:44   ` Kiryl Shutsemau
2025-12-10  3:39     ` Muchun Song
2025-12-11  3:45       ` Muchun Song
2025-12-11 15:08       ` Kiryl Shutsemau
2025-12-12  6:45         ` Muchun Song
2025-12-09 18:20 ` Frank van der Linden
2025-12-11 15:02   ` Kiryl Shutsemau
