From: "David Hildenbrand (arm)" <david@kernel.org>
To: Hugh Dickins <hughd@google.com>, xu.xin16@zte.com.cn
Cc: akpm@linux-foundation.org, chengming.zhou@linux.dev,
wang.yaxin@zte.com.cn, yang.yang29@zte.com.cn,
Michel Lespinasse <michel@lespinasse.org>,
Lorenzo Stoakes <ljs@kernel.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 2/2] ksm: Optimize rmap_walk_ksm by passing a suitable address range
Date: Mon, 6 Apr 2026 11:21:41 +0200
Message-ID: <8481d0dd-2471-4acc-a429-cbc02451a812@kernel.org>
In-Reply-To: <02e1b8df-d568-8cbb-b8f6-46d5476d9d75@google.com>

On 4/5/26 06:44, Hugh Dickins wrote:
> On Thu, 12 Feb 2026, xu.xin16@zte.com.cn wrote:
>
>> From: xu xin <xu.xin16@zte.com.cn>
>>
>> Problem
>> =======
>> When available memory is extremely tight, causing KSM pages to be swapped
>> out, or when there is significant memory fragmentation and THP triggers
>> memory compaction, the system will invoke the rmap_walk_ksm function to
>> perform reverse mapping. However, we observed that this function becomes
>> particularly time-consuming when a large number of VMAs (e.g., 20,000)
>> share the same anon_vma. Through debug trace analysis, we found that most
>> of the latency occurs within anon_vma_interval_tree_foreach, leading to an
>> excessively long hold time on the anon_vma lock (even reaching 500ms or
>> more), which in turn causes upper-layer applications (waiting for the
>> anon_vma lock) to be blocked for extended periods.
>>
>> Root Cause
>> ==========
>> Further investigation revealed that 99.9% of iterations inside the
>> anon_vma_interval_tree_foreach loop are skipped due to the first check
>> "if (addr < vma->vm_start || addr >= vma->vm_end)", indicating that a large
>> number of loop iterations are ineffective. This inefficiency arises because
>> the pgoff_start and pgoff_end parameters passed to
>> anon_vma_interval_tree_foreach span the entire address space from 0 to
>> ULONG_MAX, resulting in very poor loop efficiency.
>>
>> Solution
>> ========
>> In fact, we can significantly improve performance by passing a more precise
>> range based on the given addr. Since the original pages merged by KSM
>> correspond to anonymous VMAs, the page offset can be calculated as
>> pgoff = address >> PAGE_SHIFT. Therefore, we can optimize the call by
>> defining:
>>
>> pgoff = rmap_item->address >> PAGE_SHIFT;
>>
>> Performance
>> ===========
>> In our real embedded Linux environment, the measured metrics were as
>> follows:
>>
>> 1) Time_ms: Maximum time the anon_vma lock is held in a single rmap_walk_ksm.
>> 2) Nr_iteration_total: Maximum number of iterations in one
>> anon_vma_interval_tree_foreach loop.
>> 3) Skip_addr_out_of_range: Maximum number of iterations skipped by the first
>> check (vma->vm_start and vma->vm_end) in one anon_vma_interval_tree_foreach loop.
>> 4) Skip_mm_mismatch: Maximum number of iterations skipped by the second check
>> (rmap_item->mm == vma->vm_mm) in one anon_vma_interval_tree_foreach loop.
>>
>> The result is as follows:
>>
>>          Time_ms  Nr_iteration_total  Skip_addr_out_of_range  Skip_mm_mismatch
>> Before:  228.65   22169               22168                   0
>> After :  0.396    3                   0                       2
>>
>> The referenced reproducer of rmap_walk_ksm can be found at:
>> https://lore.kernel.org/all/20260206151424734QIyWL_pA-1QeJPbJlUxsO@zte.com.cn/
>>
>> Co-developed-by: Wang Yaxin <wang.yaxin@zte.com.cn>
>> Signed-off-by: Wang Yaxin <wang.yaxin@zte.com.cn>
>> Signed-off-by: xu xin <xu.xin16@zte.com.cn>
>
> This is a very attractive speedup, but I believe it's flawed: in the
> special case when a range has been mremap-moved, when its anon folio
> indexes and anon_vma pgoff correspond to the original user address,
> not to the current user address.
[as discussed in earlier versions of this patch set]
mremap() breaks KSM in the range to be moved.
See prep_move_vma()->ksm_madvise(MADV_UNMERGEABLE)
So I am not sure the scenario you describe can trigger.
But I'm just scrolling by, as I'm still busy celebrating Easter :)
--
Cheers,
David