* [PATCH mm-unstable v1 0/3] MM: Tighten control over zero-page remapping
From: Nico Pache @ 2026-06-09 11:46 UTC (permalink / raw)
To: linux-kernel, linux-mm, Usama Arif, Yu Zhao
Cc: aarcange, Nico Pache, Andrew Morton, David Hildenbrand, Xu Xin,
Chengming Zhou, Lorenzo Stoakes, Zi Yan, Baolin Wang,
Liam R. Howlett, Ryan Roberts, Dev Jain, Barry Song, Lance Yang,
Matthew Brost, Joshua Hahn, Rakie Kim, Byungchul Park,
Gregory Price, Ying Huang, Alistair Popple
This patch series fixes a bug in KSM where the pages_to_scan limit is
bypassed and the policy (use_zero_pages) is ignored. It also gives users
further control of the zero-page remapping that occurs when a folio is
split.
Commit b1f202060afe ("mm: remap unused subpages to shared zeropage when
splitting isolated thp") added unconditional zero-page remapping when a
folio is split. This was presented as part of the underutilized shrinker,
but it has a larger effect: since this commit, every split performs the
zero-page scanning and remapping, even if the user has the underutilized
shrinker disabled.
This unconditional zero-page remapping is also problematic for KSM. When
KSM tries to merge a page, it splits the folio, and the split remaps all
zero-filled pages to the shared zero-page, bypassing the use_zero_pages
policy. When the user has use_zero_pages disabled, KSM then wastes cycles
scanning these already-freed pages, making little progress and degrading
KSM's performance.
In this series we also gate the zero-page remapping behind
split_underused_thp (the setting controlled by the underutilized shrinker's
shrink_underused sysfs knob). This provides users with more control over
this behavior, while also preventing the KSM bug when it is enabled.
RFC: https://lore.kernel.org/all/20260508170509.640851-1-npache@redhat.com/
Nico Pache (3):
mm/ksm: export ksm_is_running() to check KSM merge state
mm/migrate.c: Prevent folio splitting from interacting with KSM
mm/huge_memory.c: Skip zero-page remapping when underused THP shrinker
is disabled
include/linux/ksm.h | 6 ++++++
mm/huge_memory.c | 2 +-
mm/ksm.c | 6 ++++++
mm/migrate.c | 9 +++++++++
4 files changed, 22 insertions(+), 1 deletion(-)
base-commit: be18cf77e1e749c6469ff44df00eb026f7c0a365
--
2.54.0
* [PATCH mm-unstable v1 1/3] mm/ksm: export ksm_is_running() to check KSM merge state
From: Nico Pache @ 2026-06-09 11:46 UTC (permalink / raw)
To: linux-kernel, linux-mm, Usama Arif, Yu Zhao
Cc: aarcange, Nico Pache, Andrew Morton, David Hildenbrand, Xu Xin,
Chengming Zhou, Lorenzo Stoakes, Zi Yan, Baolin Wang,
Liam R. Howlett, Ryan Roberts, Dev Jain, Barry Song, Lance Yang,
Matthew Brost, Joshua Hahn, Rakie Kim, Byungchul Park,
Gregory Price, Ying Huang, Alistair Popple
Add ksm_is_running() which returns true when KSM is actively merging
(ksm_run & KSM_RUN_MERGE).
This will be used by try_to_map_unused_to_zeropage() to skip zero-page
remapping in VM_MERGEABLE VMAs when KSM is active.
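For illustration only, the check can be modeled in userspace as follows. This is a sketch, not kernel code: the KSM_RUN_* values are assumed to mirror the definitions in mm/ksm.c, and the ksm_run variable here is a plain stand-in for the kernel state that is written through the KSM "run" sysfs file. It shows that only the MERGE bit matters and that returning the masked value from a bool function normalizes it to true or false.

```c
#include <stdbool.h>

/* Values assumed to mirror mm/ksm.c; treat them as illustrative. */
#define KSM_RUN_STOP    0
#define KSM_RUN_MERGE   1
#define KSM_RUN_UNMERGE 2
#define KSM_RUN_OFFLINE 4

/* Userspace stand-in for the kernel's ksm_run state. */
static unsigned long ksm_run = KSM_RUN_STOP;

/* True only while the MERGE bit is set; other bits do not count. */
static bool ksm_is_running(void)
{
	return ksm_run & KSM_RUN_MERGE;
}
```

In this model, setting ksm_run to KSM_RUN_UNMERGE makes ksm_is_running() return false, so a caller gated on it would treat an unmerging KSM as "not running".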
Signed-off-by: Nico Pache <npache@redhat.com>
---
include/linux/ksm.h | 6 ++++++
mm/ksm.c | 6 ++++++
2 files changed, 12 insertions(+)
diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index d39d0d5483a2..c1048b690a92 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -17,6 +17,7 @@
#ifdef CONFIG_KSM
int ksm_madvise(struct vm_area_struct *vma, unsigned long start,
unsigned long end, int advice, vm_flags_t *vm_flags);
+bool ksm_is_running(void);
vma_flags_t ksm_vma_flags(struct mm_struct *mm, const struct file *file,
vma_flags_t vma_flags);
int ksm_enable_merge_any(struct mm_struct *mm);
@@ -144,6 +145,11 @@ static inline int ksm_madvise(struct vm_area_struct *vma, unsigned long start,
return 0;
}
+static inline bool ksm_is_running(void)
+{
+ return false;
+}
+
static inline struct folio *ksm_might_need_to_copy(struct folio *folio,
struct vm_area_struct *vma, unsigned long addr)
{
diff --git a/mm/ksm.c b/mm/ksm.c
index 7d5b76478f0b..edc2b961ff59 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -3015,6 +3015,12 @@ int ksm_madvise(struct vm_area_struct *vma, unsigned long start,
}
EXPORT_SYMBOL_GPL(ksm_madvise);
+bool ksm_is_running(void)
+{
+ return ksm_run & KSM_RUN_MERGE;
+}
+EXPORT_SYMBOL_GPL(ksm_is_running);
+
int __ksm_enter(struct mm_struct *mm)
{
struct ksm_mm_slot *mm_slot;
--
2.54.0
* [PATCH mm-unstable v1 2/3] mm/migrate.c: Prevent folio splitting from interacting with KSM
From: Nico Pache @ 2026-06-09 11:46 UTC (permalink / raw)
To: linux-kernel, linux-mm, Usama Arif, Yu Zhao
Cc: aarcange, Nico Pache, Andrew Morton, David Hildenbrand, Xu Xin,
Chengming Zhou, Lorenzo Stoakes, Zi Yan, Baolin Wang,
Liam R. Howlett, Ryan Roberts, Dev Jain, Barry Song, Lance Yang,
Matthew Brost, Joshua Hahn, Rakie Kim, Byungchul Park,
Gregory Price, Ying Huang, Alistair Popple
Since commit b1f202060afe ("mm: remap unused subpages to shared zeropage
when splitting isolated thp"), splitting an anonymous THP remaps all
zero-filled subpages to the shared zeropage via TTU_USE_SHARED_ZEROPAGE.
This flag is set unconditionally for every anonymous folio split,
including splits triggered by KSM.
When KSM is enabled with THP=always, this causes two regressions:
1. use_zero_pages=1: KSM calls try_to_merge_one_page() which triggers
split_huge_page(). The split remaps all 512 zero-filled subpages to
the shared zeropage at once, freeing the entire 2MB THP when KSM only
intended to process a single 4KB page. This bypasses KSM's
pages_to_scan rate limiting, causing ~1GB to be freed almost
instantly.
2. use_zero_pages=0: The same split side-effect occurs through the
stable/unstable tree merge paths. Each pages_to_scan iteration
triggers an expensive split_huge_page() that silently frees 2MB,
while the scanner wastes cycles on tree searches for zero-filled
pages that were already freed as a side-effect.
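As a rough back-of-the-envelope check of the sizes mentioned above, the amplification can be computed directly. This sketch assumes 4KB base pages and 2MB PMD-sized THPs (the sizes used in the description); the function names are made up for illustration, and the "worst case" assumes every scanned page belongs to a distinct, fully zero-filled THP, which is an upper bound rather than a measured figure.

```c
#define BASE_PAGE_SIZE (4UL * 1024)         /* 4KB base page (assumed) */
#define THP_SIZE       (2UL * 1024 * 1024)  /* 2MB PMD-sized THP (assumed) */

/* Number of base pages covered by one THP: 2MB / 4KB. */
static unsigned long subpages_per_thp(void)
{
	return THP_SIZE / BASE_PAGE_SIZE;
}

/* Bytes KSM intends to process in one batch: one base page per scan. */
static unsigned long intended_bytes(unsigned long pages_to_scan)
{
	return pages_to_scan * BASE_PAGE_SIZE;
}

/*
 * Upper bound on bytes freed in one batch if each scanned page splits
 * a distinct, fully zero-filled THP and the whole THP is remapped.
 */
static unsigned long worst_case_bytes_freed(unsigned long pages_to_scan)
{
	return pages_to_scan * THP_SIZE;
}
```

Under these assumptions, a batch of 512 scanned pages is meant to cover 2MB but could free up to 1GB, a factor of 512 beyond what pages_to_scan suggests.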
Fix this by skipping the shared-zeropage remapping in
try_to_map_unused_to_zeropage() when KSM is running and the VMA has
VM_MERGEABLE.
Fixes: b1f202060afe ("mm: remap unused subpages to shared zeropage when splitting isolated thp")
Signed-off-by: Nico Pache <npache@redhat.com>
---
mm/migrate.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/mm/migrate.c b/mm/migrate.c
index d9b23909d716..f410f972fc5e 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -304,6 +304,15 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
if (PageCompound(page) || PageHWPoison(page))
return false;
+ /*
+ * Let KSM handle the zero-filled page deduplication according to its
+ * own rate limit (pages_to_scan) and policy (use_zero_pages). Without
+ * this, a KSM-triggered THP split would remap all zero-filled subpages
+ * to the shared zeropage as a side effect.
+ */
+ if (ksm_is_running() && (pvmw->vma->vm_flags & VM_MERGEABLE))
+ return false;
+
VM_BUG_ON_PAGE(!PageAnon(page), page);
VM_BUG_ON_PAGE(!PageLocked(page), page);
VM_BUG_ON_PAGE(pte_present(old_pte), page);
--
2.54.0
* [PATCH mm-unstable v1 3/3] mm/huge_memory.c: Skip zero-page remapping when underused THP shrinker is disabled
From: Nico Pache @ 2026-06-09 11:46 UTC (permalink / raw)
To: linux-kernel, linux-mm, Usama Arif, Yu Zhao
Cc: aarcange, Nico Pache, David Hildenbrand, Andrew Morton, Xu Xin,
Chengming Zhou, Lorenzo Stoakes, Zi Yan, Baolin Wang,
Liam R. Howlett, Ryan Roberts, Dev Jain, Barry Song, Lance Yang,
Matthew Brost, Joshua Hahn, Rakie Kim, Byungchul Park,
Gregory Price, Ying Huang, Alistair Popple
The zero-page remapping in __folio_split() was added to support the
underused THP shrinker. Gate TTU_USE_SHARED_ZEROPAGE on split_underused_thp
so that, when the shrinker is disabled via shrink_underused, splits no
longer scan subpages for zero content unnecessarily.
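The resulting flag decision can be modeled in userspace like this. It is only a sketch of the condition in the diff below: the TTU_USE_SHARED_ZEROPAGE value is a placeholder (the kernel's real flag value differs), and split_ttu_flags() is a hypothetical helper name that does not exist in the kernel.

```c
#include <stdbool.h>

/* Placeholder bit; the kernel's enum ttu_flags uses a different value. */
#define TTU_USE_SHARED_ZEROPAGE 0x1

/*
 * Hypothetical helper mirroring the gated condition: the flag is only
 * requested for a successful split of an anonymous, non-device-private
 * folio while the underused THP shrinker is enabled.
 */
static int split_ttu_flags(int ret, bool is_anon, bool device_private,
			   bool split_underused_thp)
{
	if (!ret && is_anon && !device_private && split_underused_thp)
		return TTU_USE_SHARED_ZEROPAGE;
	return 0;
}
```

With the shrinker disabled, the helper returns 0 regardless of the other inputs, which corresponds to remap_page() being called without the zero-page flag.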
Suggested-by: David Hildenbrand <david@kernel.org>
Signed-off-by: Nico Pache <npache@redhat.com>
---
mm/huge_memory.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 94bd656eeaf8..37748d92f277 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -4103,7 +4103,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
if (nr_shmem_dropped)
shmem_uncharge(mapping->host, nr_shmem_dropped);
- if (!ret && is_anon && !folio_is_device_private(folio))
+ if (!ret && is_anon && !folio_is_device_private(folio) && split_underused_thp)
ttu_flags = TTU_USE_SHARED_ZEROPAGE;
remap_page(folio, 1 << old_order, ttu_flags);
--
2.54.0
* Re: [PATCH mm-unstable v1 2/3] mm/migrate.c: Prevent folio splitting from interacting with KSM
From: xu.xin16 @ 2026-06-09 12:12 UTC (permalink / raw)
To: npache
Cc: linux-kernel, linux-mm, usamaarif642, yuzhao, aarcange, npache,
akpm, david, chengming.zhou, ljs, ziy, baolin.wang, liam,
ryan.roberts, dev.jain, baohua, lance.yang, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, apopple
>Since commit b1f202060afe ("mm: remap unused subpages to shared zeropage
>when splitting isolated thp"), splitting an anonymous THP remaps all
>zero-filled subpages to the shared zeropage via TTU_USE_SHARED_ZEROPAGE.
>This flag is set unconditionally for every anonymous folio split,
>including splits triggered by KSM.
>
>When KSM is enabled with THP=always, this causes two regressions:
>
>1. use_zero_pages=1: KSM calls try_to_merge_one_page() which triggers
> split_huge_page(). The split remaps all 512 zero-filled subpages to
> the shared zeropage at once, freeing the entire 2MB THP when KSM only
> intended to process a single 4KB page. This bypasses KSM's
> pages_to_scan rate limiting, causing ~1GB to be freed almost
> instantly.
>
Why do you see it as regressions?
AFAIU, KSM and THP do often conflict with each other. THP tries hard to collapse
a huge page (which may contain many zero pages). If KSM is enabled and part of
that huge page is mergeable, it can easily be split by KSM, rendering THP's
efforts futile.
Therefore, in our actual production environment, we typically avoid making the
same region both KSM mergeable and THP always.
>2. use_zero_pages=0: The same split side-effect occurs through the
> stable/unstable tree merge paths. Each pages_to_scan iteration
> triggers an expensive split_huge_page() that silently frees 2MB,
> while the scanner wastes cycles on tree searches for zero-filled
> pages that were already freed as a side-effect.
>
>Fix this by restricting TTU_USE_SHARED_ZEROPAGE being set in the case that
>KSM is running and the VMA has VM_MERGEABLE.
* Re: [PATCH mm-unstable v1 2/3] mm/migrate.c: Prevent folio splitting from interacting with KSM
From: Nico Pache @ 2026-06-09 12:57 UTC (permalink / raw)
To: xu.xin16
Cc: linux-kernel, linux-mm, usamaarif642, yuzhao, aarcange, akpm,
david, chengming.zhou, ljs, ziy, baolin.wang, liam, ryan.roberts,
dev.jain, baohua, lance.yang, matthew.brost, joshua.hahnjy,
rakie.kim, byungchul, gourry, ying.huang, apopple
On Tue, Jun 9, 2026 at 6:12 AM <xu.xin16@zte.com.cn> wrote:
>
> >Since commit b1f202060afe ("mm: remap unused subpages to shared zeropage
> >when splitting isolated thp"), splitting an anonymous THP remaps all
> >zero-filled subpages to the shared zeropage via TTU_USE_SHARED_ZEROPAGE.
> >This flag is set unconditionally for every anonymous folio split,
> >including splits triggered by KSM.
> >
> >When KSM is enabled with THP=always, this causes two regressions:
> >
> >1. use_zero_pages=1: KSM calls try_to_merge_one_page() which triggers
> > split_huge_page(). The split remaps all 512 zero-filled subpages to
> > the shared zeropage at once, freeing the entire 2MB THP when KSM only
> > intended to process a single 4KB page. This bypasses KSM's
> > pages_to_scan rate limiting, causing ~1GB to be freed almost
> > instantly.
> >
>
> Why do you see it as regressions?
Since the zero-page remapping was introduced our test has shown the
following behavior changes:
With use_zero_pages=0, the merge rate drops from 60MB/s to ~6 MB/s
even after raising pages_to_scan. The KSM merging is now much slower
and CPU utilization has increased.
With use_zero_pages=1, ~1 GB is freed almost instantly, and it no
longer respects the pages_to_scan behavior.
Even with just this patch (1 & 2) or the RFC linked in the cover
letter, the issue no longer occurs.
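For anyone wanting to check the relevant settings while reproducing this, a small helper along these lines could read the knobs. This is a sketch, not the test used for the numbers above: read_ulong_knob() is a made-up name, and it simply parses one unsigned integer from files such as /sys/kernel/mm/ksm/run, /sys/kernel/mm/ksm/pages_to_scan and /sys/kernel/mm/ksm/use_zero_pages.

```c
#include <stdio.h>

/*
 * Read a single unsigned integer from a sysfs-style file.
 * Returns 0 on success and -1 if the file cannot be opened or parsed.
 */
static int read_ulong_knob(const char *path, unsigned long *out)
{
	FILE *f = fopen(path, "r");
	int parsed;

	if (!f)
		return -1;
	parsed = (fscanf(f, "%lu", out) == 1);
	fclose(f);
	return parsed ? 0 : -1;
}
```

A caller would pass one of the paths above together with a pointer to an unsigned long and treat a -1 return as "KSM not available or file unreadable".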
>
> AFAIU, KSM and THP do often conflict with each other. THP tries hard to collapse
> a huge page (which may contain many zero pages). If KSM is enabled and part of
> that huge page is mergeable, it can easily be split by KSM, rendering THP's
> efforts futile.
>
> Therefore, in our actual production environment, we typically avoid making the
> same region both KSM mergeable and THP always.
THP=always is a global setting used in many production environments,
so these features now interact very poorly together.
Let me know if that answers your question!
Cheers,
-- Nico
>
>
> >2. use_zero_pages=0: The same split side-effect occurs through the
> > stable/unstable tree merge paths. Each pages_to_scan iteration
> > triggers an expensive split_huge_page() that silently frees 2MB,
> > while the scanner wastes cycles on tree searches for zero-filled
> > pages that were already freed as a side-effect.
> >
> >Fix this by restricting TTU_USE_SHARED_ZEROPAGE being set in the case that
> >KSM is running and the VMA has VM_MERGEABLE.
>
* Re: [PATCH mm-unstable v1 2/3] mm/migrate.c: Prevent folio splitting from interacting with KSM
From: David Hildenbrand (Arm) @ 2026-06-09 12:59 UTC (permalink / raw)
To: Nico Pache, xu.xin16
Cc: linux-kernel, linux-mm, usamaarif642, yuzhao, aarcange, akpm,
chengming.zhou, ljs, ziy, baolin.wang, liam, ryan.roberts,
dev.jain, baohua, lance.yang, matthew.brost, joshua.hahnjy,
rakie.kim, byungchul, gourry, ying.huang, apopple
On 6/9/26 14:57, Nico Pache wrote:
> On Tue, Jun 9, 2026 at 6:12 AM <xu.xin16@zte.com.cn> wrote:
>>
>>> Since commit b1f202060afe ("mm: remap unused subpages to shared zeropage
>>> when splitting isolated thp"), splitting an anonymous THP remaps all
>>> zero-filled subpages to the shared zeropage via TTU_USE_SHARED_ZEROPAGE.
>>> This flag is set unconditionally for every anonymous folio split,
>>> including splits triggered by KSM.
>>>
>>> When KSM is enabled with THP=always, this causes two regressions:
>>>
>>> 1. use_zero_pages=1: KSM calls try_to_merge_one_page() which triggers
>>> split_huge_page(). The split remaps all 512 zero-filled subpages to
>>> the shared zeropage at once, freeing the entire 2MB THP when KSM only
>>> intended to process a single 4KB page. This bypasses KSM's
>>> pages_to_scan rate limiting, causing ~1GB to be freed almost
>>> instantly.
>>>
>>
>> Why do you see it as regressions?
>
> Since the zero-page remapping was introduced our test has shown the
> following behavior changes:
>
> With use_zero_pages=0, the merge rate drops from 60MB/s to ~6 MB/s
> even after raising pages_to_scan. The KSM merging is now much slower
> and CPU utilization has increased.
>
> With use_zero_pages=1, ~1 GB is freed almost instantly, and it no
> longer respects the pages_to_scan behavior.
>
> Even with just this patch (1 & 2) or the RFC linked in the cover
> letter, the issue no longer occurs.
>
>>
>> AFAIU, KSM and THP do often conflict with each other. THP tries hard to collapse
>> a huge page (which may contain many zero pages). If KSM is enabled and part of
>> that huge page is mergeable, it can easily be split by KSM, rendering THP's
>> efforts futile.
>>
>> Therefore, in our actual production environment, we typically avoid making the
>> same region both KSM mergeable and THP always.
>
> THP=always is a global setting used in many production environments,
> so these features now interact very poorly together.
Red Hat documents, though, that both things in combination are shaky:
"As KSM can reduce the occurrence of transparent huge pages, you may want to
disable it before enabling THP." [1]
(for RHEL 6, but nothing should have changed in that regard)
But yeah, having KSM and THP enabled at the same time is not uncommon.
[1]
https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/6/html/virtualization_tuning_and_optimization_guide/sect-virtualization_tuning_optimization_guide-memory-huge_pages
--
Cheers,
David
* Re: [PATCH mm-unstable v1 2/3] mm/migrate.c: Prevent folio splitting from interacting with KSM
From: Lance Yang @ 2026-06-09 13:06 UTC (permalink / raw)
To: xu.xin16
Cc: npache, linux-kernel, linux-mm, usamaarif642, yuzhao, aarcange,
akpm, david, chengming.zhou, ljs, ziy, baolin.wang, liam,
ryan.roberts, dev.jain, baohua, lance.yang, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, apopple
On Tue, Jun 09, 2026 at 08:12:02PM +0800, xu.xin16@zte.com.cn wrote:
>>Since commit b1f202060afe ("mm: remap unused subpages to shared zeropage
>>when splitting isolated thp"), splitting an anonymous THP remaps all
>>zero-filled subpages to the shared zeropage via TTU_USE_SHARED_ZEROPAGE.
>>This flag is set unconditionally for every anonymous folio split,
>>including splits triggered by KSM.
>>
>>When KSM is enabled with THP=always, this causes two regressions:
>>
>>1. use_zero_pages=1: KSM calls try_to_merge_one_page() which triggers
>> split_huge_page(). The split remaps all 512 zero-filled subpages to
>> the shared zeropage at once, freeing the entire 2MB THP when KSM only
>> intended to process a single 4KB page. This bypasses KSM's
>> pages_to_scan rate limiting, causing ~1GB to be freed almost
>> instantly.
>>
>
>Why do you see it as regressions?
>
>AFAIU, KSM and THP do often conflict with each other. THP tries hard to collapse
>a huge page (which may contain many zero pages). If KSM is enabled and part of
>that huge page is mergeable, it can easily be split by KSM, rendering THP's
>efforts futile.
>
>Therefore, in our actual production environment, we typically avoid making the
>same region both KSM mergeable and THP always.
Right, some setups may choose to avoid using KSM and THP always on the
same region. But that is not something the kernel can assume :)
David noted in the RFC that QEMU may use both MADV_HUGEPAGE and
MADV_MERGEABLE, while KSM can be enabled later system-wide.
And I think Nico means something different from KSM splitting THPs in
general.
KSM has been able to split THP before. The new part from b1f202060afe is
that a KSM-triggered split can also remap zero-filled subpages to the
shared zeropage, outside KSM's own use_zero_pages/pages_to_scan controls.
Maybe the changelog could spell that out :)
>
>>2. use_zero_pages=0: The same split side-effect occurs through the
>> stable/unstable tree merge paths. Each pages_to_scan iteration
>> triggers an expensive split_huge_page() that silently frees 2MB,
>> while the scanner wastes cycles on tree searches for zero-filled
>> pages that were already freed as a side-effect.
>>
>>Fix this by restricting TTU_USE_SHARED_ZEROPAGE being set in the case that
>>KSM is running and the VMA has VM_MERGEABLE.
>
* Re: [PATCH mm-unstable v1 2/3] mm/migrate.c: Prevent folio splitting from interacting with KSM
From: Nico Pache @ 2026-06-09 13:42 UTC (permalink / raw)
To: Lance Yang
Cc: xu.xin16, linux-kernel, linux-mm, usamaarif642, yuzhao, aarcange,
akpm, david, chengming.zhou, ljs, ziy, baolin.wang, liam,
ryan.roberts, dev.jain, baohua, matthew.brost, joshua.hahnjy,
rakie.kim, byungchul, gourry, ying.huang, apopple
On Tue, Jun 9, 2026 at 7:06 AM Lance Yang <lance.yang@linux.dev> wrote:
>
>
> On Tue, Jun 09, 2026 at 08:12:02PM +0800, xu.xin16@zte.com.cn wrote:
> >>Since commit b1f202060afe ("mm: remap unused subpages to shared zeropage
> >>when splitting isolated thp"), splitting an anonymous THP remaps all
> >>zero-filled subpages to the shared zeropage via TTU_USE_SHARED_ZEROPAGE.
> >>This flag is set unconditionally for every anonymous folio split,
> >>including splits triggered by KSM.
> >>
> >>When KSM is enabled with THP=always, this causes two regressions:
> >>
> >>1. use_zero_pages=1: KSM calls try_to_merge_one_page() which triggers
> >> split_huge_page(). The split remaps all 512 zero-filled subpages to
> >> the shared zeropage at once, freeing the entire 2MB THP when KSM only
> >> intended to process a single 4KB page. This bypasses KSM's
> >> pages_to_scan rate limiting, causing ~1GB to be freed almost
> >> instantly.
> >>
> >
> >Why do you see it as regressions?
> >
> >AFAIU, KSM and THP do often conflict with each other. THP tries hard to collapse
> >a huge page (which may contain many zero pages). If KSM is enabled and part of
> >that huge page is mergeable, it can easily be split by KSM, rendering THP's
> >efforts futile.
> >
> >Therefore, in our actual production environment, we typically avoid making the
> >same region both KSM mergeable and THP always.
>
> Right, some setups may choose to avoid using KSM and THP always on the
> same region. But that is not something the kernel can assume :)
>
> David noted in the RFC that QEMU may use both MADV_HUGEPAGE and
> MADV_MERGEABLE, while KSM can be enabled later system-wide.
>
> And I think Nico means something different from KSM splitting THPs in
> general.
>
> KSM has been able to split THP before. The new part from b1f202060afe is
> that a KSM-triggered split can also remap zero-filled subpages to the
> shared zeropage, outside KSM's own use_zero_pages/pages_to_scan controls.
>
> Maybe the changelog could spell that out :)
Yeah, maybe I didn't properly explain that :p
After some thought, I still think the alternative approach I mentioned
in the RFC may be better.
i.e., prevent the zero-page merging that results from KSM splitting a
folio. The check we add here is more general and will skip this
zero-page merging for all MERGEABLE mappings, not just those causing
the issue (the KSM splitting). As a result, the remapping is also skipped
needlessly for migrations, etc. that touch MERGEABLE mappings.
If we use this approach we also don't need the first patch of the series.
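To make the difference in scope concrete, here is a small userspace model of the two predicates. It is purely illustrative: struct split_ctx and the split_by_ksm field are hypothetical and stand for "this split was initiated by KSM", which nothing in the posted series defines, and neither function is kernel code.

```c
#include <stdbool.h>

/* Hypothetical description of one remap attempt. */
struct split_ctx {
	bool ksm_running;    /* models ksm_is_running() */
	bool vma_mergeable;  /* models vma->vm_flags & VM_MERGEABLE */
	bool split_by_ksm;   /* hypothetical: the split was started by KSM */
};

/* Check as posted in patch 2/3: keyed on KSM state and the VMA flag. */
static bool skip_zeropage_posted(const struct split_ctx *c)
{
	return c->ksm_running && c->vma_mergeable;
}

/* Alternative: keyed only on who triggered the split. */
static bool skip_zeropage_alternative(const struct split_ctx *c)
{
	return c->split_by_ksm;
}
```

For an operation in a MERGEABLE VMA that KSM did not trigger while KSM is running, the posted check skips the remapping and the alternative does not, which is the needless skip described above.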
Cheers,
-- Nico
>
> >
> >>2. use_zero_pages=0: The same split side-effect occurs through the
> >> stable/unstable tree merge paths. Each pages_to_scan iteration
> >> triggers an expensive split_huge_page() that silently frees 2MB,
> >> while the scanner wastes cycles on tree searches for zero-filled
> >> pages that were already freed as a side-effect.
> >>
> >>Fix this by restricting TTU_USE_SHARED_ZEROPAGE being set in the case that
> >>KSM is running and the VMA has VM_MERGEABLE.
> >
>
* Re: [PATCH mm-unstable v1 2/3] mm/migrate.c: Prevent folio splitting from interacting with KSM
From: xu.xin16 @ 2026-06-09 13:47 UTC (permalink / raw)
To: npache
Cc: linux-kernel, linux-mm, usamaarif642, yuzhao, aarcange, akpm,
david, chengming.zhou, ljs, ziy, baolin.wang, liam, ryan.roberts,
dev.jain, baohua, lance.yang, matthew.brost, joshua.hahnjy,
rakie.kim, byungchul, gourry, ying.huang, apopple
> > >Since commit b1f202060afe ("mm: remap unused subpages to shared zeropage
> > >when splitting isolated thp"), splitting an anonymous THP remaps all
> > >zero-filled subpages to the shared zeropage via TTU_USE_SHARED_ZEROPAGE.
> > >This flag is set unconditionally for every anonymous folio split,
> > >including splits triggered by KSM.
> > >
> > >When KSM is enabled with THP=always, this causes two regressions:
> > >
> > >1. use_zero_pages=1: KSM calls try_to_merge_one_page() which triggers
> > > split_huge_page(). The split remaps all 512 zero-filled subpages to
> > > the shared zeropage at once, freeing the entire 2MB THP when KSM only
> > > intended to process a single 4KB page. This bypasses KSM's
> > > pages_to_scan rate limiting, causing ~1GB to be freed almost
> > > instantly.
> > >
> >
> > Why do you see it as regressions?
>
> Since the zero-page remapping was introduced our test has shown the
> following behavior changes:
>
> With use_zero_pages=0, the merge rate drops from 60MB/s to ~6 MB/s
> even after raising pages_to_scan. The KSM merging is now much slower
> and CPU utilization has increased.
>
> With use_zero_pages=1, ~1 GB is freed almost instantly, and it no
> longer respects the pages_to_scan behavior.
>
> Even with just this patch (1 & 2) or the RFC linked in the cover
> letter, the issue no longer occurs.
Understood. You're saying that the additional processing action in split_huge_page
(remap unused subpages to shared zeropage) increases the scanning cost of ksmd.
However, I still wouldn't simply classify this as a performance regression,
because commit b1f202060afe increases memory savings through this action — so
it saves memory at the cost of additional CPU overhead.
If you want to address the increased overhead on ksmd, I think we could add a
check for the shared zeropage in cmp_and_merge_page, and skip merging when a
shared zeropage is detected.
>>
>> AFAIU, KSM and THP do often conflict with each other. THP tries hard to collapse
>> a huge page (which may contain many zero pages). If KSM is enabled and part of
>> that huge page is mergeable, it can easily be split by KSM, rendering THP's
>> efforts futile.
>>
>> Therefore, in our actual production environment, we typically avoid making the
>> same region both KSM mergeable and THP always.
>
>THP=always is a global setting used in many production environments,
>so these features now interact very poorly together.
>
Actually, I have long thought about submitting a patch: add a new interface
'skip_huge_page' under KSM's sysfs, allowing users to choose not to split huge pages.
* Re: [PATCH mm-unstable v1 2/3] mm/migrate.c: Prevent folio splitting from interacting with KSM
From: xu.xin16 @ 2026-06-09 13:49 UTC (permalink / raw)
To: npache, david
Cc: lance.yang, linux-kernel, linux-mm, usamaarif642, yuzhao,
aarcange, akpm, david, chengming.zhou, ljs, ziy, baolin.wang,
liam, ryan.roberts, dev.jain, baohua, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, apopple,
wang.yaxin
> > >AFAIU, KSM and THP do often conflict with each other. THP tries hard to collapse
> > >a huge page (which may contain many zero pages). If KSM is enabled and part of
> > >that huge page is mergeable, it can easily be split by KSM, rendering THP's
> > >efforts futile.
> > >
> > >Therefore, in our actual production environment, we typically avoid making the
> > >same region both KSM mergeable and THP always.
> >
> > Right, some setups may choose to avoid using KSM and THP always on the
> > same region. But that is not something the kernel can assume :)
> >
> > David noted in the RFC that QEMU may use both MADV_HUGEPAGE and
> > MADV_MERGEABLE, while KSM can be enabled later system-wide.
> >
> > And I think Nico means something different from KSM splitting THPs in
> > general.
> >
> > KSM has been able to split THP before. The new part from b1f202060afe is
> > that a KSM-triggered split can also remap zero-filled subpages to the
> > shared zeropage, outside KSM's own use_zero_pages/pages_to_scan controls.
> >
> > Maybe the changelog could spell that out :)
>
> Yeah, maybe I didn't properly explain that :p
>
> After some thought, I still think the alternative approach I mentioned
> in the RFC may be better.
>
> i.e., prevent the zero-page merging that results from KSM splitting a
> folio. The check we add here is more general and will skip this
> zero-page merging for all MERGEABLE mappings, not just those causing
> the issue (the KSM splitting). As a result, the remapping is also skipped
> needlessly for migrations, etc. that touch MERGEABLE mappings.
>
> If we use this approach we also don't need the first patch of the series.
Yeah, that's also what I wanted to suggest.
* Re: [PATCH mm-unstable v1 2/3] mm/migrate.c: Prevent folio splitting from interacting with KSM
From: Zi Yan @ 2026-06-09 14:07 UTC (permalink / raw)
To: xu.xin16
Cc: npache, linux-kernel, linux-mm, usamaarif642, yuzhao, aarcange,
akpm, david, chengming.zhou, ljs, baolin.wang, liam, ryan.roberts,
dev.jain, baohua, lance.yang, matthew.brost, joshua.hahnjy,
rakie.kim, byungchul, gourry, ying.huang, apopple
On 9 Jun 2026, at 9:47, xu.xin16@zte.com.cn wrote:
>>>> Since commit b1f202060afe ("mm: remap unused subpages to shared zeropage
>>>> when splitting isolated thp"), splitting an anonymous THP remaps all
>>>> zero-filled subpages to the shared zeropage via TTU_USE_SHARED_ZEROPAGE.
>>>> This flag is set unconditionally for every anonymous folio split,
>>>> including splits triggered by KSM.
>>>>
>>>> When KSM is enabled with THP=always, this causes two regressions:
>>>>
>>>> 1. use_zero_pages=1: KSM calls try_to_merge_one_page() which triggers
>>>> split_huge_page(). The split remaps all 512 zero-filled subpages to
>>>> the shared zeropage at once, freeing the entire 2MB THP when KSM only
>>>> intended to process a single 4KB page. This bypasses KSM's
>>>> pages_to_scan rate limiting, causing ~1GB to be freed almost
>>>> instantly.
>>>>
>>>
>>> Why do you see it as regressions?
>>
>> Since the zero-page remapping was introduced our test has shown the
>> following behavior changes:
>>
>> With use_zero_pages=0, the merge rate drops from 60MB/s to ~6 MB/s
>> even after raising pages_to_scan. The KSM merging is now much slower
>> and CPU utilization has increased.
>>
>> With use_zero_pages=1, ~1 GB is freed almost instantly, and it no
>> longer respects the pages_to_scan behavior.
>>
>> Even with just this patch (1 & 2) or the RFC linked in the cover
>> letter, the issue no longer occurs.
>
> Understood. You're saying that the additional processing action in split_huge_page
> (remap unused subpages to shared zeropage) increases the scanning cost of ksmd.
>
> However, I still wouldn't simply classify this as a performance regression,
> because commit b1f202060afe increases memory savings through this action — so
> it saves memory at the cost of additional CPU overhead.
>
> If you want to address the increased overhead on ksmd, I think we could add a
> check for the shared zeropage in cmp_and_merge_page, and skip merging when a
> shared zeropage is detected.
>
>>>
>>> AFAIU, KSM and THP do often conflict with each other. THP tries hard to collapse
>>> a huge page (which may contain many zero pages). If KSM is enabled and part of
>>> that huge page is mergeable, it can easily be split by KSM, rendering THP's
>>> efforts futile.
>>>
>>> Therefore, in our actual production environment, we typically avoid making the
>>> same region both KSM mergeable and THP always.
>>
>> THP=always is a global setting used in many production environments,
>> so these features now interact very poorly together.
>>
>
> Actually, I have long thought about submitting a patch: add a new interface
> 'skip_huge_page' under KSM's sysfs, allowing users to choose not to split huge pages.
Just thinking out loud: alternatively, skip huge pages all the time unless
memory pressure is present. Basically, treat KSM as a way of reducing memory
pressure by merging pages.
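To make that idea concrete, here is a minimal userspace sketch of the policy. Every name in it (struct scan_candidate, ksm_should_scan(), the memory_pressure flag) is hypothetical and made up for illustration; nothing like this exists in mm/ksm.c today, and how "memory pressure" would actually be detected is left open.

```c
#include <stdbool.h>

/* Hypothetical input; no such struct exists in the kernel. */
struct scan_candidate {
	bool is_large_folio;	/* page is part of a THP / large folio */
};

/*
 * Hypothetical policy: leave huge pages alone while memory is plentiful,
 * and only let KSM split and merge them once there is memory pressure.
 */
static bool ksm_should_scan(const struct scan_candidate *c,
			    bool memory_pressure)
{
	if (!c->is_large_folio)
		return true;		/* small pages are always candidates */
	return memory_pressure;		/* huge pages only under pressure */
}
```

With this shape, THP-backed regions keep their huge pages in the common case, and KSM only trades them for merged pages when reclaiming memory actually matters.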
Best Regards,
Yan, Zi
* Re: [PATCH mm-unstable v1 1/3] mm/ksm: export ksm_is_running() to check KSM merge state
2026-06-09 11:46 ` [PATCH mm-unstable v1 1/3] mm/ksm: export ksm_is_running() to check KSM merge state Nico Pache
@ 2026-06-09 14:13 ` Lorenzo Stoakes
0 siblings, 0 replies; 15+ messages in thread
From: Lorenzo Stoakes @ 2026-06-09 14:13 UTC (permalink / raw)
To: Nico Pache
Cc: linux-kernel, linux-mm, Usama Arif, Yu Zhao, aarcange,
Andrew Morton, David Hildenbrand, Xu Xin, Chengming Zhou, Zi Yan,
Baolin Wang, Liam R. Howlett, Ryan Roberts, Dev Jain, Barry Song,
Lance Yang, Matthew Brost, Joshua Hahn, Rakie Kim, Byungchul Park,
Gregory Price, Ying Huang, Alistair Popple
On Tue, Jun 09, 2026 at 05:46:13AM -0600, Nico Pache wrote:
> Add ksm_is_running() which returns true when KSM is actively merging
> (ksm_run & KSM_RUN_MERGE).
>
> This will be used by try_to_map_unused_to_zeropage() to skip zero-page
> remapping in VM_MERGEABLE VMAs when KSM is active.
>
> Signed-off-by: Nico Pache <npache@redhat.com>
> ---
> include/linux/ksm.h | 6 ++++++
> mm/ksm.c | 6 ++++++
> 2 files changed, 12 insertions(+)
>
> diff --git a/include/linux/ksm.h b/include/linux/ksm.h
> index d39d0d5483a2..c1048b690a92 100644
> --- a/include/linux/ksm.h
> +++ b/include/linux/ksm.h
> @@ -17,6 +17,7 @@
> #ifdef CONFIG_KSM
> int ksm_madvise(struct vm_area_struct *vma, unsigned long start,
> unsigned long end, int advice, vm_flags_t *vm_flags);
> +bool ksm_is_running(void);
> vma_flags_t ksm_vma_flags(struct mm_struct *mm, const struct file *file,
> vma_flags_t vma_flags);
> int ksm_enable_merge_any(struct mm_struct *mm);
> @@ -144,6 +145,11 @@ static inline int ksm_madvise(struct vm_area_struct *vma, unsigned long start,
> return 0;
> }
>
> +static inline bool ksm_is_running(void)
> +{
> + return false;
> +}
> +
> static inline struct folio *ksm_might_need_to_copy(struct folio *folio,
> struct vm_area_struct *vma, unsigned long addr)
> {
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 7d5b76478f0b..edc2b961ff59 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -3015,6 +3015,12 @@ int ksm_madvise(struct vm_area_struct *vma, unsigned long start,
> }
> EXPORT_SYMBOL_GPL(ksm_madvise);
>
> +bool ksm_is_running(void)
> +{
> + return ksm_run & KSM_RUN_MERGE;
> +}
> +EXPORT_SYMBOL_GPL(ksm_is_running);
Why are we exporting this? What modules need this? We should never export unless
we have a very specific reason to.
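As a rough userspace sketch of what I mean: the same predicate, just without the EXPORT_SYMBOL_GPL() line, given the only caller proposed in this series is mm/migrate.c, which is built in. The ksm_run variable below is a plain global standing in for the kernel one, and the KSM_RUN_* values are my assumption of what mm/ksm.c defines.

```c
#include <stdbool.h>

/* Assumed to mirror the run-state values in mm/ksm.c. */
#define KSM_RUN_STOP	0
#define KSM_RUN_MERGE	1
#define KSM_RUN_UNMERGE	2

/* Userspace stand-in for the kernel's ksm_run variable. */
static unsigned long ksm_run = KSM_RUN_STOP;

/* Same predicate as in the patch; note there is no export after it. */
bool ksm_is_running(void)
{
	return ksm_run & KSM_RUN_MERGE;
}
```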
> +
> int __ksm_enter(struct mm_struct *mm)
> {
> struct ksm_mm_slot *mm_slot;
> --
> 2.54.0
>
* Re: [PATCH mm-unstable v1 2/3] mm/migrate.c: Prevent folio splitting from interacting with KSM
2026-06-09 13:49 ` xu.xin16
@ 2026-06-09 14:14 ` Lorenzo Stoakes
0 siblings, 0 replies; 15+ messages in thread
From: Lorenzo Stoakes @ 2026-06-09 14:14 UTC (permalink / raw)
To: xu.xin16
Cc: npache, david, lance.yang, linux-kernel, linux-mm, usamaarif642,
yuzhao, aarcange, akpm, chengming.zhou, ziy, baolin.wang, liam,
ryan.roberts, dev.jain, baohua, matthew.brost, joshua.hahnjy,
rakie.kim, byungchul, gourry, ying.huang, apopple, wang.yaxin
On Tue, Jun 09, 2026 at 09:49:47PM +0800, xu.xin16@zte.com.cn wrote:
> > > >AFAIU, KSM and THP do often conflict with each other. THP tries hard to collapse
> > > >a huge page (which may contain many zero pages). If KSM is enabled and part of
> > > >that huge page is mergeable, it can easily be split by KSM, rendering THP's
> > > >efforts futile.
> > > >
> > > >Therefore, in our actual production environment, we typically avoid making the
> > > >same region both KSM mergeable and THP always.
> > >
> > > Right, some setups may choose to avoid using KSM and THP always on the
> > > same region. But that is not something the kernel can assume :)
> > >
> > > David noted in the RFC that QEMU may use both MADV_HUGEPAGE and
> > > MADV_MERGEABLE, while KSM can be enabled later system-wide.
> > >
> > > > And I think Nico means something different from KSM splitting THPs in
> > > general.
> > >
> > > > KSM has been able to split THP before. The new part from b1f202060afe is
> > > that a KSM-triggered split can also remap zero-filled subpages to the
> > > shared zeropage, outside KSM's own use_zero_pages/pages_to_scan controls.
> > >
> > > Maybe the changelog could spell that out :)
> >
> > Yeah, maybe I didn't properly explain that :p
> >
> > After some thought I still think the alternative approach I mentioned
> > in the RFC may be better.
> >
> > ie) prevent the zero-page merging that results from KSM splitting a
> > folio. The check we add here is more general and will skip this
> > zero-page merging with all MERGEABLE mappings, not just those causing
> > the issue (the KSM splitting). The result is that even migrations, etc
> > that are also MERGEABLE will be skipped needlessly.
> >
> > If we use this approach we also don't need the first patch of the series.
>
> Yeah, That's also what I want to suggest.
I guess I'll hold off on review until the new version is posted then.
This feels like something we should have worked out in the RFC though?
Thanks, Lorenzo
* Re: [PATCH mm-unstable v1 2/3] mm/migrate.c: Prevent folio splitting from interacting with KSM
2026-06-09 11:46 ` [PATCH mm-unstable v1 2/3] mm/migrate.c: Prevent folio splitting from interacting with KSM Nico Pache
2026-06-09 12:12 ` xu.xin16
@ 2026-06-09 14:26 ` Lorenzo Stoakes
1 sibling, 0 replies; 15+ messages in thread
From: Lorenzo Stoakes @ 2026-06-09 14:26 UTC (permalink / raw)
To: Nico Pache
Cc: linux-kernel, linux-mm, Usama Arif, Yu Zhao, aarcange,
Andrew Morton, David Hildenbrand, Xu Xin, Chengming Zhou, Zi Yan,
Baolin Wang, Liam R. Howlett, Ryan Roberts, Dev Jain, Barry Song,
Lance Yang, Matthew Brost, Joshua Hahn, Rakie Kim, Byungchul Park,
Gregory Price, Ying Huang, Alistair Popple
On Tue, Jun 09, 2026 at 05:46:14AM -0600, Nico Pache wrote:
> Since commit b1f202060afe ("mm: remap unused subpages to shared zeropage
> when splitting isolated thp"), splitting an anonymous THP remaps all
> zero-filled subpages to the shared zeropage via TTU_USE_SHARED_ZEROPAGE.
> This flag is set unconditionally for every anonymous folio split,
> including splits triggered by KSM.
>
> When KSM is enabled with THP=always, this causes two regressions:
>
> 1. use_zero_pages=1: KSM calls try_to_merge_one_page() which triggers
> split_huge_page(). The split remaps all 512 zero-filled subpages to
> the shared zeropage at once, freeing the entire 2MB THP when KSM only
> intended to process a single 4KB page. This bypasses KSM's
> pages_to_scan rate limiting, causing ~1GB to be freed almost
> instantly.
>
> 2. use_zero_pages=0: The same split side-effect occurs through the
> stable/unstable tree merge paths. Each pages_to_scan iteration
> triggers an expensive split_huge_page() that silently frees 2MB,
> while the scanner wastes cycles on tree searches for zero-filled
> pages that were already freed as a side-effect.
>
> Fix this by restricting TTU_USE_SHARED_ZEROPAGE being set in the case that
> KSM is running and the VMA has VM_MERGEABLE.
>
> Fixes: b1f202060afe ("mm: remap unused subpages to shared zeropage when splitting isolated thp")
> Signed-off-by: Nico Pache <npache@redhat.com>
> ---
> mm/migrate.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index d9b23909d716..f410f972fc5e 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -304,6 +304,15 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
> if (PageCompound(page) || PageHWPoison(page))
> return false;
>
> + /*
> + * Let KSM handle the zero-filled page deduplication according to its
> + * own rate limit (pages_to_scan) and policy (use_zero_pages). Without
> + * this, a KSM-triggered THP split would remap all zero-filled subpages
> + * to the shared zeropage as a side effect.
> + */
> + if (ksm_is_running() && (pvmw->vma->vm_flags & VM_MERGEABLE))
Please use the new VMA flag API.
This would be vma_test(pvmw->vma, VMA_MERGEABLE_BIT).
> + return false;
> +
> VM_BUG_ON_PAGE(!PageAnon(page), page);
> VM_BUG_ON_PAGE(!PageLocked(page), page);
> VM_BUG_ON_PAGE(pte_present(old_pte), page);
> --
> 2.54.0
>
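To illustrate the shape of the check with the flag API, here is a userspace model only. The struct, the bit position, and the vma_test() signature below are stand-ins I am assuming for the sake of a compilable example; the real definitions live in the kernel headers, and skip_zeropage_remap() is a hypothetical name for the early-return condition.

```c
#include <stdbool.h>

/* Userspace stand-ins; the bit position here is a placeholder. */
enum { VMA_MERGEABLE_BIT = 0 };

struct vma_model {
	unsigned long flags;
};

/* Stand-in for the vma_test() helper named above; signature assumed. */
static bool vma_test(const struct vma_model *vma, int bit)
{
	return vma->flags & (1UL << bit);
}

/* Returns true when the zero-page remap should be skipped. */
static bool skip_zeropage_remap(const struct vma_model *vma, bool ksm_running)
{
	return ksm_running && vma_test(vma, VMA_MERGEABLE_BIT);
}
```

The remap is skipped only when both conditions hold, so non-mergeable VMAs and systems with KSM stopped keep the current behaviour.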
Thanks, Lorenzo
Thread overview: 15+ messages
2026-06-09 11:46 [PATCH mm-unstable v1 0/3] MM: Tighten control over zero-page remapping Nico Pache
2026-06-09 11:46 ` [PATCH mm-unstable v1 1/3] mm/ksm: export ksm_is_running() to check KSM merge state Nico Pache
2026-06-09 14:13 ` Lorenzo Stoakes
2026-06-09 11:46 ` [PATCH mm-unstable v1 2/3] mm/migrate.c: Prevent folio splitting from interacting with KSM Nico Pache
2026-06-09 12:12 ` xu.xin16
2026-06-09 12:57 ` Nico Pache
2026-06-09 12:59 ` David Hildenbrand (Arm)
2026-06-09 13:47 ` xu.xin16
2026-06-09 14:07 ` Zi Yan
2026-06-09 13:06 ` Lance Yang
2026-06-09 13:42 ` Nico Pache
2026-06-09 13:49 ` xu.xin16
2026-06-09 14:14 ` Lorenzo Stoakes
2026-06-09 14:26 ` Lorenzo Stoakes
2026-06-09 11:46 ` [PATCH mm-unstable v1 3/3] mm/huge_memory.c: Skip zero-page remapping when underused THP shrinker is disabled Nico Pache