From: Lance Yang <lance.yang@linux.dev>
To: npache@redhat.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	yuzhao@google.com, usamaarif642@gmail.com, lance.yang@linux.dev,
	baohua@kernel.org, dev.jain@arm.com, ryan.roberts@arm.com,
	liam@infradead.org, baolin.wang@linux.alibaba.com,
	ziy@nvidia.com, ljs@kernel.org, david@kernel.org,
	akpm@linux-foundation.org
Subject: Re: [RFC] mm: restrict zero-page remapping to underused THP splits
Date: Sat,  9 May 2026 11:21:57 +0800
Message-ID: <20260509032157.61333-1-lance.yang@linux.dev>
In-Reply-To: <20260508170509.640851-1-npache@redhat.com>


On Fri, May 08, 2026 at 11:05:09AM -0600, Nico Pache wrote:
>Since commit b1f202060afe ("mm: remap unused subpages to shared zeropage
>when splitting isolated thp"), splitting an anonymous THP remaps all
>zero-filled subpages to the shared zeropage via TTU_USE_SHARED_ZEROPAGE.
>This flag is set unconditionally for every anonymous folio split,
>including splits triggered by KSM.
>
>When KSM is enabled with THP=always, this causes two regressions:
>
>1. use_zero_pages=1: KSM calls try_to_merge_one_page() which triggers
>   split_huge_page(). The split remaps all 512 zero-filled subpages to
>   the shared zeropage at once, freeing the entire 2MB THP when KSM only
>   intended to process a single 4KB page. This bypasses KSM's
>   pages_to_scan rate limiting, causing ~1GB to be freed almost
>   instantly.
>
>2. use_zero_pages=0: The same split side-effect occurs through the
>   stable/unstable tree merge paths. Each pages_to_scan iteration
>   triggers an expensive split_huge_page() that silently frees 2MB,
>   while the scanner wastes cycles on tree searches for zero-filled
>   pages that were already freed as a side-effect.
>
>Fix this by restricting TTU_USE_SHARED_ZEROPAGE to only the deferred
>split shrinker path (deferred_split_scan), which is the only caller that
>intentionally splits underused THPs to reclaim zero-filled subpages.
>Introduce folio_split_underused() as a dedicated entry point that
>passes is_underused_thp=true through __folio_split(), and use it from
>deferred_split_scan(). All other split callers (KSM, compaction, etc.)
>no longer get the zero-page remapping side-effect.
>
>Reviewers notes: this patch is one of two potential approaches. This patch
>turns off the zero-page freeing that has been done since the noted commit,
>in all the other callers, only leaving the underused shrinker to do such
>behavior. We can also take the opposite approach of with something like
>split_huge_page_no_zeropage() and call this within KSM.
>
>Fixes: b1f202060afe ("mm: remap unused subpages to shared zeropage when splitting isolated thp")
>Signed-off-by: Nico Pache <npache@redhat.com>
>---
> include/linux/huge_mm.h |  2 +-
> mm/huge_memory.c        | 17 ++++++++++++-----
> 2 files changed, 13 insertions(+), 6 deletions(-)
>
>diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
>index 2949e5acff35..4ae1b52d7411 100644
>--- a/include/linux/huge_mm.h
>+++ b/include/linux/huge_mm.h
>@@ -378,7 +378,7 @@ int folio_check_splittable(struct folio *folio, unsigned int new_order,
> 			   enum split_type split_type);
> int folio_split(struct folio *folio, unsigned int new_order, struct page *page,
> 		struct list_head *list);
>-
>+int folio_split_underused(struct folio *folio);
> static inline int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> 		unsigned int new_order)
> {
>diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>index 970e077019b7..91f7fad72c8a 100644
>--- a/mm/huge_memory.c
>+++ b/mm/huge_memory.c
>@@ -4045,7 +4045,8 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
>  */
> static int __folio_split(struct folio *folio, unsigned int new_order,
> 		struct page *split_at, struct page *lock_at,
>-		struct list_head *list, enum split_type split_type)
>+		struct list_head *list, enum split_type split_type,
>+		bool is_underused_thp)
> {
> 	XA_STATE(xas, &folio->mapping->i_pages, folio->index);
> 	struct folio *end_folio = folio_next(folio);
>@@ -4174,7 +4175,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
> 	if (nr_shmem_dropped)
> 		shmem_uncharge(mapping->host, nr_shmem_dropped);
> 
>-	if (!ret && is_anon && !folio_is_device_private(folio))
>+	if (!ret && is_anon && !folio_is_device_private(folio) && is_underused_thp)
> 		ttu_flags = TTU_USE_SHARED_ZEROPAGE;
> 
> 	remap_page(folio, 1 << old_order, ttu_flags);
>@@ -4309,7 +4310,7 @@ int __split_huge_page_to_list_to_order(struct page *page, struct list_head *list
> 	struct folio *folio = page_folio(page);
> 
> 	return __folio_split(folio, new_order, &folio->page, page, list,
>-			     SPLIT_TYPE_UNIFORM);
>+			     SPLIT_TYPE_UNIFORM, false);
> }
> 
> /**
>@@ -4340,7 +4341,13 @@ int folio_split(struct folio *folio, unsigned int new_order,
> 		struct page *split_at, struct list_head *list)
> {
> 	return __folio_split(folio, new_order, split_at, &folio->page, list,
>-			     SPLIT_TYPE_NON_UNIFORM);
>+			     SPLIT_TYPE_NON_UNIFORM, false);
>+}
>+
>+int folio_split_underused(struct folio *folio)
>+{
>+	return __folio_split(folio, 0, &folio->page, &folio->page,
>+			     NULL, SPLIT_TYPE_NON_UNIFORM, true);

IIUC, it should be SPLIT_TYPE_UNIFORM, not SPLIT_TYPE_NON_UNIFORM ...

deferred_split_scan() used split_folio(), so for the underused case it
split the whole THP uniformly down to order-0 pages. The shared zeropage
remapping happens later, via remove_migration_ptes(), after the split.

With SPLIT_TYPE_NON_UNIFORM and split_at == &folio->page, most of an
order-9 THP can stay as larger folios.

Then try_to_map_unused_to_zeropage() rejects those folios:

	if (PageCompound(page) || PageHWPoison(page))
		return false;

So the underused shrinker would no longer remap/free many zero-filled
subpages ...
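
To put a rough number on it, here is a userspace sketch (not kernel code;
the helper names are made up for illustration). It assumes the non-uniform
split to order-0 halves the folio containing split_at at each step and
leaves the other half as one larger folio, which is how I read the
folio_split() comment:

```c
/*
 * Userspace model only. Both helpers are hypothetical and just count
 * how many order-0 folios exist after splitting one folio to order-0.
 */

/* Uniform split: every subpage becomes its own order-0 folio. */
unsigned int order0_after_uniform(unsigned int old_order)
{
	return 1u << old_order;
}

/*
 * Non-uniform split at a single page (assumed behaviour): each step
 * halves the folio that holds split_at and keeps the other half as a
 * single folio of order (order - 1).
 */
unsigned int order0_after_non_uniform(unsigned int old_order)
{
	unsigned int order0 = 1;	/* the folio that ends up holding split_at */
	unsigned int order;

	for (order = old_order; order > 0; order--) {
		/* Only the final halving leaves an order-0 buddy behind. */
		if (order - 1 == 0)
			order0++;
	}
	return order0;
}
```

Under that assumption an order-9 THP ends up with only 2 order-0 pages
after the non-uniform split, versus 512 after the uniform one. The other
510 subpages stay in order-1..order-8 folios, which the PageCompound()
check above skips.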

> }
> 
> /**
>@@ -4559,7 +4566,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
> 		}
> 		if (!folio_trylock(folio))
> 			goto requeue;
>-		if (!split_folio(folio)) {
>+		if (!folio_split_underused(folio)) {
> 			did_split = true;
> 			if (underused)
> 				count_vm_event(THP_UNDERUSED_SPLIT_PAGE);
>-- 
>2.54.0
>
>

