From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, "David Hildenbrand" <david@redhat.com>,
"Jiri Slaby" <jirislaby@kernel.org>,
"Holger Hoffstätte" <holger@applied-asynchrony.com>,
"Jacob Young" <jacobly.alt@gmail.com>,
"Suren Baghdasaryan" <surenb@google.com>,
"Linus Torvalds" <torvalds@linux-foundation.org>
Subject: [PATCH 6.4 8/8] fork: lock VMAs of the parent process when forking, again
Date: Sun, 9 Jul 2023 13:14:14 +0200
Message-ID: <20230709111345.543400132@linuxfoundation.org>
In-Reply-To: <20230709111345.297026264@linuxfoundation.org>
From: Suren Baghdasaryan <surenb@google.com>

commit fb49c455323ff8319a123dd312be9082c49a23a5 upstream.

When forking a child process, the parent write-protects anonymous pages
and COW-shares them with the child being forked using copy_present_pte().
We must not take any concurrent page faults on the source vmas while they
are being processed, as we expect both the vma and the ptes behind it
to be stable. For example, anon_vma_fork() expects the parent's
vma->anon_vma not to change during the vma copy.

A concurrent page fault on a page newly marked read-only by the page
copy might trigger wp_page_copy() and an anon_vma_prepare(vma) on the
source vma. This defeats the decision to skip anon_vma_clone(), which
was made because the parent vma originally had no anon_vma, so we might
now end up copying a pte entry for a page that has one.

Before the per-vma lock based changes, the mmap_lock guaranteed
exclusion with concurrent page faults. But now we need to do a
vma_start_write() to make sure no concurrent faults happen on this vma
while it is being processed.
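
To illustrate why a write-locked vma keeps the lockless fault path out, here is a minimal single-threaded model of the sequence-count idea behind per-vma locks. It is a sketch under stated assumptions, not the kernel code: every toy_* name is hypothetical, the per-vma rwsem and all memory ordering are omitted, and only the comparison of a vma sequence number against an mm-wide one is modeled.

```c
#include <stdbool.h>

/* Toy stand-ins for mm_struct and vm_area_struct (hypothetical names);
 * only the two sequence counters are modeled. */
struct toy_mm {
	int mm_lock_seq;
};

struct toy_vma {
	struct toy_mm *mm;
	int vm_lock_seq;
};

static void toy_vma_init(struct toy_vma *vma, struct toy_mm *mm)
{
	vma->mm = mm;
	vma->vm_lock_seq = -1;	/* differs from the mm sequence: not write-locked */
}

/* Writer side; the caller is assumed to hold mmap_lock for write. */
static void toy_vma_start_write(struct toy_vma *vma)
{
	vma->vm_lock_seq = vma->mm->mm_lock_seq;
}

/* Fault side; false means "fall back to the mmap_lock path". */
static bool toy_vma_start_read(const struct toy_vma *vma)
{
	return vma->vm_lock_seq != vma->mm->mm_lock_seq;
}

/* Dropping mmap_lock for write: bumping the mm sequence releases
 * every write-locked vma at once. */
static void toy_mmap_write_unlock(struct toy_mm *mm)
{
	mm->mm_lock_seq++;
}
```

In this model, a vma marked by toy_vma_start_write() in the fork loop rejects toy_vma_start_read() until toy_mmap_write_unlock() runs, so a fault arriving in between has to wait on the mmap_lock that fork already holds.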

This fix can potentially regress some fork-heavy workloads. Kernel
build time did not show a noticeable regression on a 56-core machine, while
a stress test mapping 10000 VMAs and forking 5000 times in a tight loop
shows a ~5% regression. If such a fork time regression is unacceptable,
disabling CONFIG_PER_VMA_LOCK should restore the previous performance. Further
optimizations are possible if this regression proves to be problematic.
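
The stress test itself is not part of this message. A minimal userspace sketch of that kind of workload could look like the following; map_vmas() and fork_loop() are hypothetical helper names, and alternating page protections is only one assumed way to keep neighbouring anonymous mappings from being merged into a single vma.

```c
#define _GNU_SOURCE
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Map up to `count` one-page anonymous regions; returns how many succeeded. */
static int map_vmas(int count)
{
	long page = sysconf(_SC_PAGESIZE);
	int i;

	for (i = 0; i < count; i++) {
		/* Alternate protections so adjacent mappings stay separate vmas. */
		int prot = (i & 1) ? PROT_READ : (PROT_READ | PROT_WRITE);
		char *p = mmap(NULL, (size_t)page, prot,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED)
			break;
		if (prot & PROT_WRITE)
			p[0] = 1;	/* fault the page in so fork has a pte to copy */
	}
	return i;
}

/* Fork and reap `iterations` children; returns the number of completed rounds. */
static int fork_loop(int iterations)
{
	int i;

	for (i = 0; i < iterations; i++) {
		pid_t pid = fork();

		if (pid < 0)
			break;
		if (pid == 0)
			_exit(0);	/* child exits immediately */
		if (waitpid(pid, NULL, 0) < 0)
			break;
	}
	return i;
}
```

Calling map_vmas(10000) followed by fork_loop(5000) from main(), and timing the run with CONFIG_PER_VMA_LOCK enabled and disabled, would approximate the comparison described above; the absolute numbers will depend on the machine.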
Suggested-by: David Hildenbrand <david@redhat.com>
Reported-by: Jiri Slaby <jirislaby@kernel.org>
Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
Reported-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-asynchrony.com/
Reported-by: Jacob Young <jacobly.alt@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624
Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first")
Cc: stable@vger.kernel.org
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
kernel/fork.c | 1 +
1 file changed, 1 insertion(+)
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -696,6 +696,7 @@ static __latent_entropy int dup_mmap(str
 	for_each_vma(old_vmi, mpnt) {
 		struct file *file;
 
+		vma_start_write(mpnt);
 		if (mpnt->vm_flags & VM_DONTCOPY) {
 			vm_stat_account(mm, mpnt->vm_flags, -vma_pages(mpnt));
 			continue;