From: Lokesh Gidra <lokeshgidra@google.com>
To: Harry Yoo <harry.yoo@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Zi Yan <ziy@nvidia.com>, Barry Song <21cnbao@gmail.com>,
"open list:MEMORY MANAGEMENT" <linux-mm@kvack.org>,
Peter Xu <peterx@redhat.com>,
Suren Baghdasaryan <surenb@google.com>,
Kalesh Singh <kaleshsingh@google.com>,
android-mm <android-mm@google.com>,
linux-kernel <linux-kernel@vger.kernel.org>,
Jann Horn <jannh@google.com>, Rik van Riel <riel@surriel.com>,
Vlastimil Babka <vbabka@suse.cz>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
David Hildenbrand <david@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [DISCUSSION] Unconditionally lock folios when calling rmap_walk()
Date: Sat, 23 Aug 2025 21:18:11 -0700 [thread overview]
Message-ID: <CA+EESO7j4dY3KjBWybTG6uQmXJ8kyhBrid3rTk5XAP7poZOhYQ@mail.gmail.com> (raw)
In-Reply-To: <CA+EESO4Z6wtX7ZMdDHQRe5jAAS_bQ-POq5+4aDx5jh2DvY6UHg@mail.gmail.com>
On Fri, Aug 22, 2025 at 10:29 AM Lokesh Gidra <lokeshgidra@google.com> wrote:
>
> Hi all,
>
> Currently, some callers of rmap_walk() conditionally avoid try-locking
> non-ksm anon folios. This necessitates serialization through anon_vma
> write-lock elsewhere when folio->mapping and/or folio->index (fields
> involved in rmap_walk()) are to be updated. This hurts scalability due
> to coarse granularity of the lock. For instance, when multiple threads
> invoke userfaultfd’s MOVE ioctl simultaneously to move distinct pages
> from the same src VMA, they all contend for the corresponding
> anon_vma’s lock. Field traces for arm64 android devices reveal over
> 30ms of uninterruptible sleep in the main UI thread, leading to janky
> user interactions.
>
> Among all rmap_walk() callers that don’t lock anon folios,
> folio_referenced() is the most critical (others are
> page_idle_clear_pte_refs(), damon_folio_young(), and
> damon_folio_mkold()). The relevant code in folio_referenced() is:
>
> if (!is_locked && (!folio_test_anon(folio) || folio_test_ksm(folio))) {
> we_locked = folio_trylock(folio);
> if (!we_locked)
> return 1;
> }
>
> It’s unclear why locking anon_vma exclusively (when updating
> folio->mapping, like in uffd MOVE) is beneficial over walking rmap
> with folio locked. It’s in the reclaim path, so should not be a
> critical path that necessitates some special treatment, unless I’m
> missing something.
>
> Therefore, I propose simplifying the locking mechanism by ensuring the
> folio is locked before calling rmap_walk(). This helps avoid locking
> anon_vma when updating folio->mapping, which, for instance, will help
> eliminate the uninterruptible sleep observed in the field traces
> mentioned earlier. Furthermore, it enables us to simplify the code in
> folio_lock_anon_vma_read() by removing the re-check to ensure that the
> field hasn’t changed under us.
Hi Harry,
Your comment [1] in the other thread was quite useful and also needed
to be responded to. So bringing it here for continuing discussion.
It seems from your comment that you misunderstood my proposal. I am
not suggesting replacing anon_vma lock with folio lock during rmap
walk. Clearly, it is essential for all the reasons that you
enumerated. My proposal is to lock anon folios during rmap_walk(),
like file and KSM folios.
This helps in improving scalability (and also simplifying code in
folio_lock_anon_vma_read()) as then we can serialize on folio lock
instead of anon_vma lock when moving the folio to a different root
anon_vma in folio_move_anon_rmap() [2].
[1] https://lore.kernel.org/all/aKhIL3OguViS9myH@hyeyoo/
[2] https://lore.kernel.org/all/e5d41fbe-a91b-9491-7b93-733f67e75a54@redhat.com/
>
> Thanks,
> Lokesh
next prev parent reply other threads:[~2025-08-24 4:18 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-22 17:29 [DISCUSSION] Unconditionally lock folios when calling rmap_walk() Lokesh Gidra
2025-08-22 17:44 ` Matthew Wilcox
2025-08-22 18:08 ` Lorenzo Stoakes
2025-08-24 4:18 ` Lokesh Gidra [this message]
2025-08-24 5:31 ` Harry Yoo
2025-08-24 6:45 ` Lokesh Gidra
2025-08-25 10:52 ` Harry Yoo
2025-08-28 11:32 ` Lorenzo Stoakes
2025-08-25 15:19 ` David Hildenbrand
2025-08-25 18:46 ` Lokesh Gidra
2025-08-28 12:04 ` Lorenzo Stoakes
2025-08-29 0:23 ` Lokesh Gidra
2025-08-29 8:42 ` David Hildenbrand
2025-08-29 9:04 ` Lorenzo Stoakes
2025-09-02 18:59 ` Lokesh Gidra
2025-09-02 19:01 ` David Hildenbrand
2025-09-02 19:04 ` Lokesh Gidra
2025-08-29 9:01 ` Lorenzo Stoakes
2025-08-26 15:52 ` Lorenzo Stoakes
2025-08-26 22:23 ` Lokesh Gidra
2025-08-28 11:31 ` Lorenzo Stoakes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CA+EESO7j4dY3KjBWybTG6uQmXJ8kyhBrid3rTk5XAP7poZOhYQ@mail.gmail.com \
--to=lokeshgidra@google.com \
--cc=21cnbao@gmail.com \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=android-mm@google.com \
--cc=david@redhat.com \
--cc=harry.yoo@oracle.com \
--cc=jannh@google.com \
--cc=kaleshsingh@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=peterx@redhat.com \
--cc=riel@surriel.com \
--cc=surenb@google.com \
--cc=vbabka@suse.cz \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).