public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/5] KSM: Optimizations for rmap_walk_ksm
@ 2026-05-03 12:35 xu.xin16
  2026-05-03 12:39 ` [PATCH v4 1/5] mm/rmap: add tracepoint for rmap_walk xu.xin16
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: xu.xin16 @ 2026-05-03 12:35 UTC (permalink / raw)
  To: akpm, david, ljs, hughd; +Cc: linux-mm, linux-kernel, michel, xu.xin16

From: xu xin <xu.xin16@zte.com.cn>

Deep investigation revealed that rmap_walk_ksm's 99.9% of iterations inside
the anon_vma_interval_tree_foreach loop are skipped due to the first check
"if (addr < vma->vm_start || addr >= vma->vm_end)), indicating that a large
number of loop iterations are ineffective. This inefficiency arises because
the pgoff_start and pgoff_end parameters passed to
anon_vma_interval_tree_foreach span the entire address space from 0 to
ULONG_MAX, resulting in very poor loop efficiency.

An initial immature thought was using the "rmap_item->address >> PAGE_SHIFT"
to be the pgoff passed into anon_vma_interval_tree_foreach(). But this is
flawed because when a range has been mremap-moved, when its anon folio
indexes and anon_vma pgoff correspond to the original user address,
not to the current user address, which was pointed out at:

  https://lore.kernel.org/all/02e1b8df-d568-8cbb-b8f6-46d5476d9d75@google.com/

According to the implementation of anon_vma_interval_tree_foreach —
it essentially iterates to find a suitable VMA such that the provided pgoff falls
within the VMA's range [vm_pgoff, vm_pgoff + vma_pages(v) - 1].

So the solution is to add vm_pgoff field in ksm_rmap_item and use vm_pgoff instead of
address >> PAGE_SHIFT.


Changes in v4:
 - Add a tracepoint for rmap_walk
 - Provide a testbench for rmap_walk
 - Add vm_pgoff field in ksm_rmap_item
 - use vm_pgoff instead of address >> PAGE_SHIFT (Suggested by David and Lorenzo)

Changes in v3:
- Fix some typos in commit description
- Replace "pgoff_start" and 'pgoff_end' by 'pgoff'.

Changes in v2:
- Use const variable to initialize 'addr'  "pgoff_start" and 'pgoff_end'
- Let pgoff_end = pgoff_start, since KSM folios are always order-0 (Suggested by David)

xu xin (5):
  mm/rmap: add tracepoint for rmap_walk
  tools/testing: add rmap walk latency benchmark for KSM, anonymous and
    file pages
  ksm: add vm_pgoff into ksm_rmap_item
  ksm: Optimize rmap_walk_ksm by passing a suitable address range
  ksm: add mremap selftests for ksm_rmap_walk

 MAINTAINERS                          |   3 +
 include/trace/events/rmap.h          |  49 +++
 mm/ksm.c                             |  48 ++-
 mm/rmap.c                            |  14 +
 tools/testing/rmap/Makefile          |  11 +
 tools/testing/rmap/rmap_benchmark.c  | 488 +++++++++++++++++++++++++++
 tools/testing/selftests/mm/rmap.c    |  79 +++++
 tools/testing/selftests/mm/vm_util.c |  38 +++
 tools/testing/selftests/mm/vm_util.h |   2 +
 9 files changed, 724 insertions(+), 8 deletions(-)
 create mode 100644 include/trace/events/rmap.h
 create mode 100644 tools/testing/rmap/Makefile
 create mode 100644 tools/testing/rmap/rmap_benchmark.c

-- 
2.25.1

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2026-05-03 14:59 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-03 12:35 [PATCH v4 0/5] KSM: Optimizations for rmap_walk_ksm xu.xin16
2026-05-03 12:39 ` [PATCH v4 1/5] mm/rmap: add tracepoint for rmap_walk xu.xin16
2026-05-03 12:42 ` [PATCH v4 2/5] tools/testing: add rmap walk latency benchmark for KSM, anonymous and file pages xu.xin16
2026-05-03 12:48 ` [PATCH v4 3/5] ksm: add vm_pgoff into ksm_rmap_item xu.xin16
2026-05-03 12:50 ` [PATCH v4 4/5] ksm: Optimize rmap_walk_ksm by passing a suitable address range xu.xin16
2026-05-03 12:51 ` [PATCH v4 5/5] ksm: add mremap selftests for ksm_rmap_walk xu.xin16
2026-05-03 14:59 ` [PATCH v4 0/5] KSM: Optimizations for rmap_walk_ksm Andrew Morton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox