From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Barry Song <baohua@kernel.org>,
Matthew Wilcox <willy@infradead.org>,
surenb@google.com
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, ljs@kernel.org,
liam@infradead.org, vbabka@kernel.org, rppt@kernel.org,
mhocko@suse.com, jack@suse.cz, pfalcato@suse.de,
wanglian@kylinos.cn, chentao@kylinos.cn, lianux.mm@gmail.com,
kunwu.chan@gmail.com, liyangouwen1@oppo.com, chrisl@kernel.org,
kasong@tencent.com, shikemeng@huaweicloud.com, nphamcs@gmail.com,
bhe@redhat.com, youngjun.park@lge.com,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, loongarch@lists.linux.dev,
linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org,
linux-s390@vger.kernel.org, Nanzhe Zhao <nzzhao@126.com>
Subject: Re: [PATCH v2 0/5] mm: reduce mmap_lock contention and improve page fault performance
Date: Mon, 18 May 2026 11:53:37 +0200
Message-ID: <b65722ee-6476-4038-bfbb-44a32b3544fd@kernel.org>
In-Reply-To: <CAGsJ_4ysMcrmDLSOwBkf7qwCQrcDWeEMXkHDajTJFMLKUk0bSQ@mail.gmail.com>
On 5/17/26 10:45, Barry Song wrote:
> On Sat, May 2, 2026 at 1:58 AM Matthew Wilcox <willy@infradead.org> wrote:
>>
>> On Sat, May 02, 2026 at 01:44:34AM +0800, Barry Song wrote:
>>>
>>> It doesn’t have to involve unmapping or applying mprotect to
>>> the entire VMA—just a portion of it is sufficient.
>>
>> Yes, but that still fails to answer "does this actually happen". How much
>> performance is all this complexity in the page fault handler buying us?
>> If you don't answer this question, I'm just going to go in and rip it
>> all out.
>>
>
> Hi Matthew (and Lorenzo, Jan, and anyone else who may be
> waiting for answers),
>
> As promised during LSF/MM/BPF, we conducted thorough
> testing on Android phones to determine whether performing
> I/O in `filemap_fault()` can block `vma_start_write()`.
> I wanted to give a quick update on this question.
>
> Nanzhe at Xiaomi created tracing scripts and ran various
> applications on Android devices with I/O performed under
> the VMA lock in `filemap_fault()`. We found that:
>
> 1. There are very few cases where unmap() is blocked by
> page faults. I assume this is due to buggy user code
> or poor synchronization between reads and unmap().
> So I assume it is not a problem.
>
> 2. We observed many cases where `vma_start_write()`
> is blocked by page-fault I/O in some applications.
> The blocking occurs in the `dup_mmap()` path during
> fork().
>
> With Suren's commit fb49c455323ff ("fork: lock VMAs of
> the parent process when forking"), we now always hold
> `vma_write_lock()` for each VMA. Note that the
> `mmap_lock` write lock is also held, which could lead to
> chained waiting if page-fault I/O is performed without
> releasing the VMA lock.
>
> My gut feeling is that Suren's commit may be overshooting,
> so my rough idea is that we might want to do something like
> the following (we haven't tested it yet and it might be
> wrong):
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 2311ae7c2ff4..5ddaf297f31a 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1762,7 +1762,13 @@ __latent_entropy int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm)
>         for_each_vma(vmi, mpnt) {
>                 struct file *file;
>
> -               retval = vma_start_write_killable(mpnt);
> +               /*
> +                * For anonymous or writable private VMAs, prevent
> +                * concurrent CoW faults.
> +                */
> +               if (!mpnt->vm_file || (!(mpnt->vm_flags & VM_SHARED) &&
> +                   (mpnt->vm_flags & VM_WRITE)))
> +                       retval = vma_start_write_killable(mpnt);
Likely is_cow_mapping() is what you would want to check to handle VMAs that
could have anonymous pages in them.
--
Cheers,
David
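
[Editorial sketch, not part of David's message.] To make the difference
between the quoted condition and is_cow_mapping() concrete, here is a small
userspace model. It assumes that is_cow_mapping() tests for "VM_MAYWRITE set
and VM_SHARED clear", and that the three flag values below mirror the
kernel's definitions; both are my understanding of include/linux/mm.h, so
please check them against the tree you are working on. quoted_patch_check()
and demo_checks() are hypothetical helper names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned long vm_flags_t;

/* Local copies of the flag values (assumed to match the kernel's). */
#define VM_WRITE    0x00000002UL
#define VM_SHARED   0x00000008UL
#define VM_MAYWRITE 0x00000020UL

/* Model of is_cow_mapping(): private mapping that may become writable. */
static bool is_cow_mapping(vm_flags_t flags)
{
	return (flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

/* Hypothetical helper: the condition from the quoted diff, with
 * has_file standing in for mpnt->vm_file != NULL. */
static bool quoted_patch_check(bool has_file, vm_flags_t flags)
{
	return !has_file || (!(flags & VM_SHARED) && (flags & VM_WRITE));
}

/* A few cases showing where the two checks agree and where they differ. */
static void demo_checks(void)
{
	/* Private file mapping, currently read-only but VM_MAYWRITE:
	 * it may already hold anonymous pages from an earlier writable
	 * phase. The quoted check skips it, is_cow_mapping() does not. */
	assert(!quoted_patch_check(true, VM_MAYWRITE));
	assert(is_cow_mapping(VM_MAYWRITE));

	/* Private, writable file mapping: both checks take the lock. */
	assert(quoted_patch_check(true, VM_WRITE | VM_MAYWRITE));
	assert(is_cow_mapping(VM_WRITE | VM_MAYWRITE));

	/* Shared, writable file mapping: neither check takes the lock. */
	assert(!quoted_patch_check(true, VM_SHARED | VM_WRITE | VM_MAYWRITE));
	assert(!is_cow_mapping(VM_SHARED | VM_WRITE | VM_MAYWRITE));
}
```

The first case is the one the suggestion appears to be about: a MAP_PRIVATE
file mapping that is not writable right now can still contain anonymous
pages, and testing VM_WRITE alone would miss it.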