public inbox for mm-commits@vger.kernel.org
* + ksm-optimize-rmap_walk_ksm-by-passing-a-suitable-address-range.patch added to mm-unstable branch
@ 2026-04-05 21:01 Andrew Morton
  0 siblings, 0 replies; only message in thread
From: Andrew Morton @ 2026-04-05 21:01 UTC (permalink / raw)
  To: mm-commits, yang.yang29, wang.yaxin, michel, ljs, hughd, david,
	chengming.zhou, xu.xin16, akpm


The patch titled
     Subject: ksm: optimize rmap_walk_ksm by passing a suitable address range
has been added to the -mm mm-unstable branch.  Its filename is
     ksm-optimize-rmap_walk_ksm-by-passing-a-suitable-address-range.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/ksm-optimize-rmap_walk_ksm-by-passing-a-suitable-address-range.patch

This patch will later appear in the mm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via various
branches at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there most days

------------------------------------------------------
From: xu xin <xu.xin16@zte.com.cn>
Subject: ksm: optimize rmap_walk_ksm by passing a suitable address range
Date: Thu, 12 Feb 2026 19:30:45 +0800 (CST)

Problem
=======
When available memory is extremely tight, causing KSM pages to be swapped
out, or when there is significant memory fragmentation and THP triggers
memory compaction, the system will invoke the rmap_walk_ksm function to
perform reverse mapping.  However, we observed that this function becomes
particularly time-consuming when a large number of VMAs (e.g., 20,000)
share the same anon_vma.  Through debug trace analysis, we found that most
of the latency occurs within anon_vma_interval_tree_foreach, leading to an
excessively long hold time on the anon_vma lock (even reaching 500ms or
more), which in turn causes upper-layer applications (waiting for the
anon_vma lock) to be blocked for extended periods.

Root Cause
==========
Further investigation revealed that 99.9% of the iterations inside the
anon_vma_interval_tree_foreach loop are skipped by the first check, "if
(addr < vma->vm_start || addr >= vma->vm_end)", indicating that the vast
majority of loop iterations do no useful work.  This inefficiency arises
because the pgoff_start and pgoff_end parameters passed to
anon_vma_interval_tree_foreach span the entire address space from 0 to
ULONG_MAX, so the interval tree cannot prune any VMAs and the loop
degenerates to visiting every VMA on the anon_vma.

Solution
========
In fact, we can significantly improve performance by passing a more precise
range based on the given addr. Since the original pages merged by KSM
correspond to anonymous VMAs, the page offset can be calculated as
pgoff = address >> PAGE_SHIFT. Therefore, we can optimize the call by
defining:

	pgoff = rmap_item->address >> PAGE_SHIFT;

Performance
===========
In our real embedded Linux environment, the measured metrics were as
follows:

1) Time_ms: Maximum time the anon_vma lock is held in a single
   rmap_walk_ksm.

2) Nr_iteration_total: Maximum number of iterations of the
   anon_vma_interval_tree_foreach loop.

3) Skip_addr_out_of_range: Maximum number of iterations skipped by the
   first check (addr outside vma->vm_start..vma->vm_end) in the
   anon_vma_interval_tree_foreach loop.

4) Skip_mm_mismatch: Maximum number of iterations skipped by the second
   check (rmap_item->mm != vma->vm_mm) in the
   anon_vma_interval_tree_foreach loop.

The result is as follows:

         Time_ms      Nr_iteration_total    Skip_addr_out_of_range   Skip_mm_mismatch
Before:  228.65       22169                 22168                    0
After :   0.396        3                     0                       2

The referenced reproducer of rmap_walk_ksm can be found at:
https://lore.kernel.org/all/20260206151424734QIyWL_pA-1QeJPbJlUxsO@zte.com.cn/

Link: https://lkml.kernel.org/r/20260212193045556CbzCX8p9gDu73tQ2nvHEI@zte.com.cn
Co-developed-by: Wang Yaxin <wang.yaxin@zte.com.cn>
Signed-off-by: Wang Yaxin <wang.yaxin@zte.com.cn>
Signed-off-by: xu xin <xu.xin16@zte.com.cn>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Hugh Dickins <hughd@google.com>
Cc: Yang Yang <yang.yang29@zte.com.cn>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Michel Lespinasse <michel@lespinasse.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/ksm.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/mm/ksm.c~ksm-optimize-rmap_walk_ksm-by-passing-a-suitable-address-range
+++ a/mm/ksm.c
@@ -3173,6 +3173,7 @@ again:
 	hlist_for_each_entry(rmap_item, &stable_node->hlist, hlist) {
 		/* Ignore the stable/unstable/sqnr flags */
 		const unsigned long addr = rmap_item->address & PAGE_MASK;
+		const pgoff_t pgoff = rmap_item->address >> PAGE_SHIFT;
 		struct anon_vma *anon_vma = rmap_item->anon_vma;
 		struct anon_vma_chain *vmac;
 		struct vm_area_struct *vma;
@@ -3186,8 +3187,12 @@ again:
 			anon_vma_lock_read(anon_vma);
 		}
 
+		/*
+		 * Currently KSM folios are order-0 normal pages, so pgoff_end
+		 * should be the same as pgoff_start.
+		 */
 		anon_vma_interval_tree_foreach(vmac, &anon_vma->rb_root,
-					       0, ULONG_MAX) {
+					       pgoff, pgoff) {
 
 			cond_resched();
 			vma = vmac->vma;
_

Patches currently in -mm which might be from xu.xin16@zte.com.cn are

ksm-optimize-rmap_walk_ksm-by-passing-a-suitable-address-range.patch

