From: Matthew Wilcox <willy@infradead.org>
To: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: linux-mm@kvack.org, liam.howlett@oracle.com, surenb@google.com,
ldufour@linux.ibm.com, michel@lespinasse.org, vbabka@suse.cz,
linux-kernel@vger.kernel.org
Subject: Re: [QUESTION] about the maple tree and current status of mmap_lock scalability
Date: Thu, 29 Dec 2022 16:51:37 +0000 [thread overview]
Message-ID: <Y63FmaNoLAcdsLaU@casper.infradead.org> (raw)
In-Reply-To: <Y62ipKlWGEbJZKXv@hyeyoo>
On Thu, Dec 29, 2022 at 11:22:28PM +0900, Hyeonggon Yoo wrote:
> On Wed, Dec 28, 2022 at 08:50:36PM +0000, Matthew Wilcox wrote:
> > The long term goal is even larger than this. Ideally, the VMA tree
> > would be protected by a spinlock rather than a mutex.
>
> You mean replacing mmap_lock rwsem with a spinlock?
> How is that possible if readers can take it for page fault?
The mmap_lock is taken for many, many things. So the plan was to
have a spinlock in the maple tree (indeed, there's still one there;
it's just in a union with the lockdep_map_p). VMA readers would walk
the tree protected only by RCU; VMA writers would take the spinlock
while modifying the tree. The work Suren, Liam & I are engaged in
still uses the mmap semaphore for writers, but we do walk the tree
under RCU protection.
> > While I've read the RCUVM paper, I wouldn't say it was particularly an
> > inspiration. The Maple Tree is independent of the VM; it's a general
> > purpose B-tree.
>
> My intention was to ask how to synchronize with other VMA operations
> after the tree traversal with RCU. (Because it's unreasonable to handle
> page fault in RCU read-side critical section)
>
> Per-VMA locks seem to solve it by taking the VMA lock in read mode within
> the RCU read-side critical section.
Right, but it's a little more complex than that. The real "lock" on
the VMA is actually a sequence count. https://lwn.net/Articles/906852/
does a good job of explaining it, but the VMA lock is really there as
a convenient way for the writer to wait for readers to be sufficiently
"finished" with handling the page fault that any conflicting changes
will be correctly retired.
https://www.infradead.org/~willy/linux/store-free-page-faults.html
outlines how I intend to proceed from Suren's current scheme (where
RCU is only used to protect the tree walk) to using RCU for the
entire page fault.
Thread overview: 14+ messages
2022-12-28 12:48 [QUESTION] about the maple tree and current status of mmap_lock scalability Hyeonggon Yoo
2022-12-28 17:10 ` Suren Baghdasaryan
2022-12-29 11:33 ` Hyeonggon Yoo
2022-12-28 20:50 ` Matthew Wilcox
2022-12-29 14:22 ` Hyeonggon Yoo
2022-12-29 16:51 ` Matthew Wilcox [this message]
2022-12-29 17:10 ` Lorenzo Stoakes
2022-12-29 17:21 ` Suren Baghdasaryan
2022-12-29 17:31 ` Matthew Wilcox
2023-01-02 12:04 ` Hyeonggon Yoo
2023-01-02 14:37 ` Matthew Wilcox
2023-02-20 14:26 ` Hyeonggon Yoo
2023-02-20 14:43 ` Matthew Wilcox
2023-02-22 11:38 ` Hyeonggon Yoo