From: Lance Yang <lance.yang@linux.dev>
To: npache@redhat.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, yuzhao@google.com,
 usamaarif642@gmail.com, lance.yang@linux.dev, baohua@kernel.org,
 dev.jain@arm.com, ryan.roberts@arm.com, liam@infradead.org,
 baolin.wang@linux.alibaba.com, ziy@nvidia.com, ljs@kernel.org,
 david@kernel.org, akpm@linux-foundation.org
Subject: Re: [RFC] mm: restrict zero-page remapping to underused THP splits
Date: Sat, 9 May 2026 11:21:57 +0800
Message-Id: <20260509032157.61333-1-lance.yang@linux.dev>
In-Reply-To: <20260508170509.640851-1-npache@redhat.com>
References: <20260508170509.640851-1-npache@redhat.com>

On Fri, May 08, 2026 at 11:05:09AM -0600, Nico Pache wrote:
>Since commit b1f202060afe ("mm: remap unused subpages to shared zeropage
>when splitting isolated thp"), splitting an anonymous THP remaps all
>zero-filled subpages to the shared zeropage via TTU_USE_SHARED_ZEROPAGE.
>This flag is set unconditionally for every anonymous folio split,
>including splits triggered by KSM.
>
>When KSM is enabled with THP=always, this causes two regressions:
>
>1. use_zero_pages=1: KSM calls try_to_merge_one_page() which triggers
>   split_huge_page(). The split remaps all 512 zero-filled subpages to
>   the shared zeropage at once, freeing the entire 2MB THP when KSM only
>   intended to process a single 4KB page. This bypasses KSM's
>   pages_to_scan rate limiting, causing ~1GB to be freed almost
>   instantly.
>
>2. use_zero_pages=0: The same split side-effect occurs through the
>   stable/unstable tree merge paths. Each pages_to_scan iteration
>   triggers an expensive split_huge_page() that silently frees 2MB,
>   while the scanner wastes cycles on tree searches for zero-filled
>   pages that were already freed as a side-effect.
>
>Fix this by restricting TTU_USE_SHARED_ZEROPAGE to only the deferred
>split shrinker path (deferred_split_scan), which is the only caller that
>intentionally splits underused THPs to reclaim zero-filled subpages.
>Introduce folio_split_underused() as a dedicated entry point that
>passes is_underused_thp=true through __folio_split(), and use it from
>deferred_split_scan(). All other split callers (KSM, compaction, etc.)
>no longer get the zero-page remapping side-effect.
>
>Reviewer notes: this patch is one of two potential approaches. It turns
>off the zero-page freeing that all the other callers have done since the
>noted commit, leaving only the underused shrinker with that behavior. We
>could also take the opposite approach with something like
>split_huge_page_no_zeropage() and call it within KSM.
>
>Fixes: b1f202060afe ("mm: remap unused subpages to shared zeropage when splitting isolated thp")
>Signed-off-by: Nico Pache <npache@redhat.com>
>---
> include/linux/huge_mm.h |  2 +-
> mm/huge_memory.c        | 17 ++++++++++++-----
> 2 files changed, 13 insertions(+), 6 deletions(-)
>
>diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
>index 2949e5acff35..4ae1b52d7411 100644
>--- a/include/linux/huge_mm.h
>+++ b/include/linux/huge_mm.h
>@@ -378,7 +378,7 @@ int folio_check_splittable(struct folio *folio, unsigned int new_order,
> 			enum split_type split_type);
> int folio_split(struct folio *folio, unsigned int new_order, struct page *page,
> 		struct list_head *list);
>-
>+int folio_split_underused(struct folio *folio);
> static inline int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> 					unsigned int new_order)
> {
>diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>index 970e077019b7..91f7fad72c8a 100644
>--- a/mm/huge_memory.c
>+++ b/mm/huge_memory.c
>@@ -4045,7 +4045,8 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
>  */
> static int __folio_split(struct folio *folio, unsigned int new_order,
> 		struct page *split_at, struct page *lock_at,
>-		struct list_head *list, enum split_type split_type)
>+		struct list_head *list, enum split_type split_type,
>+		bool is_underused_thp)
> {
> 	XA_STATE(xas, &folio->mapping->i_pages, folio->index);
> 	struct folio *end_folio = folio_next(folio);
>@@ -4174,7 +4175,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
> 	if (nr_shmem_dropped)
> 		shmem_uncharge(mapping->host, nr_shmem_dropped);
>
>-	if (!ret && is_anon && !folio_is_device_private(folio))
>+	if (!ret && is_anon && !folio_is_device_private(folio) && is_underused_thp)
> 		ttu_flags = TTU_USE_SHARED_ZEROPAGE;
>
> 	remap_page(folio, 1 << old_order, ttu_flags);
>@@ -4309,7 +4310,7 @@ int __split_huge_page_to_list_to_order(struct page *page, struct list_head *list
> 	struct folio *folio = page_folio(page);
>
> 	return __folio_split(folio, new_order, &folio->page, page, list,
>-			SPLIT_TYPE_UNIFORM);
>+			SPLIT_TYPE_UNIFORM, false);
> }
>
> /**
>@@ -4340,7 +4341,13 @@ int folio_split(struct folio *folio, unsigned int new_order,
> 		struct page *split_at, struct list_head *list)
> {
> 	return __folio_split(folio, new_order, split_at, &folio->page, list,
>-			SPLIT_TYPE_NON_UNIFORM);
>+			SPLIT_TYPE_NON_UNIFORM, false);
>+}
>+
>+int folio_split_underused(struct folio *folio)
>+{
>+	return __folio_split(folio, 0, &folio->page, &folio->page,
>+			NULL, SPLIT_TYPE_NON_UNIFORM, true);

IIUC, it should be SPLIT_TYPE_UNIFORM, not SPLIT_TYPE_NON_UNIFORM ...

deferred_split_scan() uses split_folio(), so for the underused case it
splits the whole THP uniformly down to order-0 pages. The shared
zeropage remapping happens later, via remove_migration_ptes(), after
the split.

With SPLIT_TYPE_NON_UNIFORM and split_at == &folio->page, most of an
order-9 THP can stay as larger folios. Then
try_to_map_unused_to_zeropage() rejects those folios:

	if (PageCompound(page) || PageHWPoison(page))
		return false;

So the underused shrinker would no longer remap/free many zero-filled
subpages ...
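To illustrate, this is my rough mental model of the non-uniform split
for an order-9 anon folio with split_at == &folio->page (a sketch of my
understanding, not code from this patch):

	/*
	 * A non-uniform split keeps halving the piece that contains
	 * split_at, so the order-9 folio roughly ends up as:
	 *
	 *   [0][0][1][2][3][4][5][6][7][8]   <- resulting folio orders
	 *    ^split_at
	 *
	 * Only the two order-0 pieces are non-compound; every larger
	 * folio fails the PageCompound() check above and is never
	 * remapped to the shared zeropage.
	 */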
> }
>
> /**
>@@ -4559,7 +4566,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
> 		}
> 		if (!folio_trylock(folio))
> 			goto requeue;
>-		if (!split_folio(folio)) {
>+		if (!folio_split_underused(folio)) {
> 			did_split = true;
> 			if (underused)
> 				count_vm_event(THP_UNDERUSED_SPLIT_PAGE);
>--
>2.54.0
>
>
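FWIW, a sketch of what I mean, on top of this patch (untested, only the
split type changes):

	int folio_split_underused(struct folio *folio)
	{
		/*
		 * Uniform split down to order 0, matching the old
		 * split_folio() behavior, with the shared-zeropage
		 * remap still enabled for this path only.
		 */
		return __folio_split(folio, 0, &folio->page, &folio->page,
				NULL, SPLIT_TYPE_UNIFORM, true);
	}

With that, deferred_split_scan() should keep remapping/freeing
zero-filled subpages as before, while KSM-triggered splits no longer
get the side-effect.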