From: Lorenzo Stoakes <ljs@kernel.org>
To: Barry Song <baohua@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>,
Matthew Wilcox <willy@infradead.org>,
akpm@linux-foundation.org, linux-mm@kvack.org, david@kernel.org,
liam@infradead.org, vbabka@kernel.org, rppt@kernel.org,
mhocko@suse.com, jack@suse.cz, pfalcato@suse.de,
wanglian@kylinos.cn, chentao@kylinos.cn, lianux.mm@gmail.com,
kunwu.chan@gmail.com, liyangouwen1@oppo.com, chrisl@kernel.org,
kasong@tencent.com, shikemeng@huaweicloud.com,
nphamcs@gmail.com, bhe@redhat.com, youngjun.park@lge.com,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, loongarch@lists.linux.dev,
linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org,
linux-s390@vger.kernel.org, Nanzhe Zhao <nzzhao@126.com>
Subject: Re: [PATCH v2 0/5] mm: reduce mmap_lock contention and improve page fault performance
Date: Wed, 20 May 2026 11:07:18 +0100 [thread overview]
Message-ID: <ag2GKrvRHzP4KhOl@lucifer> (raw)
In-Reply-To: <CAGsJ_4zN5ezh9vvvQDQdMF2KuuDGvkhNjTZWc0y0Lsa-P4Aahw@mail.gmail.com>
On Wed, May 20, 2026 at 05:07:16PM +0800, Barry Song wrote:
> On Wed, May 20, 2026 at 3:50 PM Lorenzo Stoakes <ljs@kernel.org> wrote:
> >
> > On Wed, May 20, 2026 at 05:18:52AM +0800, Barry Song wrote:
> > > On Tue, May 19, 2026 at 8:53 PM Lorenzo Stoakes <ljs@kernel.org> wrote:
> > > >
> > > > On Mon, May 18, 2026 at 12:56:59PM -0700, Suren Baghdasaryan wrote:
> > > >
> > > > > >
> > > > > > I think we either need to fix `fork()`, or keep the current
> > > > > > behavior of dropping the VMA lock before performing I/O.
> > > > >
> > > > > I see. So, this problem arises from the fact that we are changing the
> > > > > pagefaults requiring I/O operation to hold VMA lock...
> > > > > And you want to lock VMA on fork only if vma_is_anonymous(vma) ||
> > > > > is_cow_mapping(vma->vm_flags). So, we will be blocking page faults for
> > > > > anonymous and COW VMAs only while holding mmap_write_lock, preventing
> > > > > any VMA modification. On the surface, that looks ok to me but I might
> > > > > be missing some corner cases. If nobody sees any obvious issues, I
> > > > > think it's worth a try.
> > > >
> > > > Not sure if you noticed but I did raise concerns ;)
> > > >
> > > > I wonder if you've confused the fault path and fork here, as I think Barry has
> > > > been a little unclear on that.
> > >
> > > I think I’ve been absolutely clear :-)
> >
> > On this point sure, I would argue less so around the fork stuff but I responded
> > on that specifically elsewhere so let's keep things moving :>)
> >
> > > We should either stick to the current behavior - drop
> > > the VMA lock before doing I/O, or change fork() so that it
> > > does not wait on vma_start_write().
> >
> > Again, as I said elsewhere, I think there might be a 3rd way possibly. It's a
> > big mistake to assume that there are only specific solutions to problems in the
> > kernel then to present a false dichotomy.
>
> I recalled that when we discussed this part in my slides:
>
> ‘For simplicity, rather than using a whitelist mechanism for
> per-VMA retry, we could use a blacklist instead: default to
> always retry via the VMA lock, and only allow mmap_lock-based
> page-fault retry for specific cases such as
> __vmf_anon_prepare().’
Yeah that's an itneresting approach actually, sorry if I missed that.
>
> Suren mentioned introducing a FALLBACK flag. With the
> FALLBACK flag, we would retry via mmap_lock; with the RETRY
> flag, we would retry via the VMA lock.
Yeah, and honestly I'm beginning to wonder if we don't just have to pay the
complexity tax anyway and eat the fact we have to deal with that.
But as per Josef's comment re: this whole mechanism, simply not waiting for
file-backed I think is another option (but I don't recall where we left that
conversation actually?)
Anyway I want to make sure any complexity we add is necessary so will take a
look through patches and have a think (and obviously others will have their own
opinions!)
>
> Not sure whether this could really be called a ‘third way,’
> but it seems more like a shift from a whitelist model to a
> blacklist model, without changing the fundamental design, but
> it does change where we would need to touch the source code.
Right yeah, good to have more options.
>
> >
> > We absolutely hear you on this being a problem and it WILL be addressed one way
> > or another.
>
> Thanks. This is a bit of light in what has felt like a fairly
> dark situation. I really appreciate your thoughtful and
> responsible approach.
Yes, sorry, I maybe was a bit too harsh in my tone here, I didn't really intend
to be negative as to addresisng the problem as a whole.
Moreso I've been concerned about the fork approach, and that is what's led to me
being shall we say 'emphatic' about it :)
But of course I sometimes make mistakes in quite how my tone comes across, so
apologies if it came across overly negatively - I am negative (on a technical
level) about the fork approach, but not the fact we should address this.
To be clear - I'm very glad you've brought this up, it's important, as much as
it's painful that we have this issue in the first place! :)
>
> >
> > Of the two approaches, as I said elsewhere, I prefer what you've done in this
> > series to anything touching fork.
> >
> > But give me time to look through the series please (I'd also suggest RFC'ing
> > when it's something kinda fundamental that might generate converastion, makes
> > life a bit easier on the review side :)
>
> Thanks! Sure, I’m happy to wait and there’s no urgency.
>
> Last year you made quite a significant contribution to the work
> when I tried to remove mmap_lock in madvise. I really
> appreciated it. Now we’re back to the same lock again, just in
> different places.
Yeah :) one day maybe we can get rid of it altogether (maybe I'm dreaming :)
>
> Best Regards
> Barry
Cheers, Lorenzo
next prev parent reply other threads:[~2026-05-20 10:07 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-30 4:04 [PATCH v2 0/5] mm: reduce mmap_lock contention and improve page fault performance Barry Song (Xiaomi)
2026-04-30 4:04 ` [PATCH v2 1/5] mm/filemap: Retry fault by VMA lock if the lock was released for I/O Barry Song (Xiaomi)
2026-04-30 4:04 ` [PATCH v2 2/5] mm/swapin: Retry swapin " Barry Song (Xiaomi)
2026-04-30 4:04 ` [PATCH v2 3/5] mm: Move folio_lock_or_retry() and drop __folio_lock_or_retry() Barry Song (Xiaomi)
2026-04-30 4:04 ` [PATCH v2 4/5] mm: Don't retry page fault if folio is uptodate during swap-in Barry Song (Xiaomi)
2026-04-30 12:35 ` Matthew Wilcox
2026-05-01 16:11 ` Matthew Wilcox
2026-04-30 4:04 ` [PATCH v2 5/5] mm/filemap: Avoid retrying page faults on uptodate folios in filemap faults Barry Song (Xiaomi)
2026-04-30 12:37 ` [PATCH v2 0/5] mm: reduce mmap_lock contention and improve page fault performance Matthew Wilcox
2026-04-30 22:49 ` Barry Song
2026-05-01 14:56 ` Matthew Wilcox
2026-05-01 17:44 ` Barry Song
2026-05-01 17:57 ` Matthew Wilcox
2026-05-01 18:25 ` Barry Song
2026-05-01 19:39 ` Matthew Wilcox
2026-05-03 20:39 ` Barry Song
2026-05-03 13:13 ` Jan Kara
2026-05-03 19:55 ` Barry Song
2026-05-04 13:03 ` Jan Kara
2026-05-04 13:35 ` Barry Song
2026-05-04 14:15 ` Barry Song
2026-05-17 8:45 ` Barry Song
2026-05-18 9:46 ` Lorenzo Stoakes
2026-05-18 11:25 ` Barry Song
2026-05-18 16:17 ` Matthew Wilcox
2026-05-18 20:50 ` Barry Song
2026-05-18 19:56 ` Suren Baghdasaryan
2026-05-18 21:14 ` Barry Song
2026-05-19 12:45 ` Lorenzo Stoakes
2026-05-19 14:17 ` Liam R. Howlett
2026-05-19 22:01 ` Barry Song
2026-05-19 12:53 ` Lorenzo Stoakes
2026-05-19 21:18 ` Barry Song
2026-05-20 7:50 ` Lorenzo Stoakes
2026-05-20 9:07 ` Barry Song
2026-05-20 10:07 ` Lorenzo Stoakes [this message]
2026-05-20 5:51 ` Suren Baghdasaryan
2026-05-20 10:33 ` David Hildenbrand (Arm)
2026-05-20 12:55 ` Lorenzo Stoakes
2026-05-19 12:43 ` Lorenzo Stoakes
2026-05-18 9:53 ` David Hildenbrand (Arm)
2026-05-19 13:42 ` Lorenzo Stoakes
2026-05-18 21:21 ` Yang Shi
2026-05-19 11:07 ` Barry Song
2026-05-19 13:34 ` Lorenzo Stoakes
2026-05-19 18:50 ` Yang Shi
2026-05-19 20:53 ` Yang Shi
2026-05-19 13:12 ` Lorenzo Stoakes
2026-05-19 13:39 ` Lorenzo Stoakes
2026-05-19 18:41 ` Yang Shi
2026-05-19 21:02 ` Yang Shi
2026-05-20 8:11 ` Lorenzo Stoakes
2026-05-01 15:52 ` Lorenzo Stoakes
2026-05-01 16:06 ` Matthew Wilcox
2026-05-01 17:09 ` Lorenzo Stoakes
2026-05-01 17:59 ` Barry Song
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ag2GKrvRHzP4KhOl@lucifer \
--to=ljs@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=baohua@kernel.org \
--cc=bhe@redhat.com \
--cc=chentao@kylinos.cn \
--cc=chrisl@kernel.org \
--cc=david@kernel.org \
--cc=jack@suse.cz \
--cc=kasong@tencent.com \
--cc=kunwu.chan@gmail.com \
--cc=liam@infradead.org \
--cc=lianux.mm@gmail.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-riscv@lists.infradead.org \
--cc=linux-s390@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=liyangouwen1@oppo.com \
--cc=loongarch@lists.linux.dev \
--cc=mhocko@suse.com \
--cc=nphamcs@gmail.com \
--cc=nzzhao@126.com \
--cc=pfalcato@suse.de \
--cc=rppt@kernel.org \
--cc=shikemeng@huaweicloud.com \
--cc=surenb@google.com \
--cc=vbabka@kernel.org \
--cc=wanglian@kylinos.cn \
--cc=willy@infradead.org \
--cc=youngjun.park@lge.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox