From: Ahmed Elaidy <elaidya225@gmail.com>
To: stable@vger.kernel.org
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
lorenzo.stoakes@oracle.com, avagin@gmail.com,
Pedro Falcato <pfalcato@suse.de>,
Vlastimil Babka <vbabka@suse.cz>,
"David Hildenbrand (Red Hat)" <david@kernel.org>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
Barry Song <baohua@kernel.org>, Dev Jain <dev.jain@arm.com>,
Jann Horn <jannh@google.com>, Jonathan Corbet <corbet@lwn.net>,
Lance Yang <lance.yang@linux.dev>,
Liam Howlett <liam.howlett@oracle.com>,
"Masami Hiramatsu (Google)" <mhiramat@kernel.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Michal Hocko <mhocko@suse.com>, Mike Rapoport <rppt@kernel.org>,
Nico Pache <npache@redhat.com>,
Ryan Roberts <ryan.roberts@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Suren Baghdasaryan <surenb@google.com>, Zi Yan <ziy@nvidia.com>,
Ahmed Elaidy <elaidya225@gmail.com>
Subject: [PATCH v1 5/9] mm: introduce copy-on-fork VMAs and make VM_MAYBE_GUARD one
Date: Sat, 25 Apr 2026 00:12:39 +0300 [thread overview]
Message-ID: <20260424211315.1072123-6-elaidya225@gmail.com> (raw)
In-Reply-To: <20260424211315.1072123-1-elaidya225@gmail.com>
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Gather all the VMA flags whose presence implies that page tables must be
copied on fork into a single bitmap - VM_COPY_ON_FORK - and use this
rather than specifying individual flags in vma_needs_copy().
We also add VM_MAYBE_GUARD to this list, as it being set on a VMA implies
that there may be metadata contained in the page tables (that is - guard
markers) which would will not and cannot be propagated upon fork.
This was already being done manually previously in vma_needs_copy(), but
this makes it very explicit, alongside VM_PFNMAP, VM_MIXEDMAP and
VM_UFFD_WP all of which imply the same.
Note that VM_STICKY flags ought generally to be marked VM_COPY_ON_FORK too
- because equally a flag being VM_STICKY indicates that the VMA contains
metadat that is not propagated by being faulted in - i.e. that the VMA
metadata does not fully describe the VMA alone, and thus we must propagate
whatever metadata there is on a fork.
However, for maximum flexibility, we do not make this necessarily the case
here.
Link: https://lkml.kernel.org/r/5d41b24e7bc622cda0af92b6d558d7f4c0d1bc8c.1763460113.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit ab04b530e7e8bd5cf9fb0c1ad20e0deee8f569ec)
Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
---
include/linux/mm.h | 26 ++++++++++++++++++++++++++
mm/memory.c | 18 ++++--------------
tools/testing/vma/vma_internal.h | 26 ++++++++++++++++++++++++++
3 files changed, 56 insertions(+), 14 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ebc5fb0af5ce..2bad4bf67d0f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -538,6 +538,32 @@ extern unsigned int kobjsize(const void *objp);
*/
#define VM_IGNORE_MERGE (VM_SOFTDIRTY | VM_STICKY)
+/*
+ * Flags which should result in page tables being copied on fork. These are
+ * flags which indicate that the VMA maps page tables which cannot be
+ * reconsistuted upon page fault, so necessitate page table copying upon
+ *
+ * VM_PFNMAP / VM_MIXEDMAP - These contain kernel-mapped data which cannot be
+ * reasonably reconstructed on page fault.
+ *
+ * VM_UFFD_WP - Encodes metadata about an installed uffd
+ * write protect handler, which cannot be
+ * reconstructed on page fault.
+ *
+ * We always copy pgtables when dst_vma has uffd-wp
+ * enabled even if it's file-backed
+ * (e.g. shmem). Because when uffd-wp is enabled,
+ * pgtable contains uffd-wp protection information,
+ * that's something we can't retrieve from page cache,
+ * and skip copying will lose those info.
+ *
+ * VM_MAYBE_GUARD - Could contain page guard region markers which
+ * by design are a property of the page tables
+ * only and thus cannot be reconstructed on page
+ * fault.
+ */
+#define VM_COPY_ON_FORK (VM_PFNMAP | VM_MIXEDMAP | VM_UFFD_WP | VM_MAYBE_GUARD)
+
/*
* mapping from the currently active vm_flags protection bits (the
* low four bits) to a page protection mask..
diff --git a/mm/memory.c b/mm/memory.c
index dde20cd5fa5b..1dd68a0e2be4 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1463,25 +1463,15 @@ copy_p4d_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
static bool
vma_needs_copy(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
{
+ if (src_vma->vm_flags & VM_COPY_ON_FORK)
+ return true;
/*
- * Always copy pgtables when dst_vma has uffd-wp enabled even if it's
- * file-backed (e.g. shmem). Because when uffd-wp is enabled, pgtable
- * contains uffd-wp protection information, that's something we can't
- * retrieve from page cache, and skip copying will lose those info.
+ * The presence of an anon_vma indicates an anonymous VMA has page
+ * tables which naturally cannot be reconstituted on page fault.
*/
- if (userfaultfd_wp(dst_vma))
- return true;
-
- if (src_vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP))
- return true;
-
if (src_vma->anon_vma)
return true;
- /* Guard regions have modified page tables that require copying. */
- if (src_vma->vm_flags & VM_MAYBE_GUARD)
- return true;
-
/*
* Don't copy ptes where a page fault will fill them correctly. Fork
* becomes much lighter when there are big shared or private readonly
diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_internal.h
index 19914c28977e..6ee803873e00 100644
--- a/tools/testing/vma/vma_internal.h
+++ b/tools/testing/vma/vma_internal.h
@@ -145,6 +145,32 @@ extern unsigned long dac_mmap_min_addr;
*/
#define VM_IGNORE_MERGE (VM_SOFTDIRTY | VM_STICKY)
+/*
+ * Flags which should result in page tables being copied on fork. These are
+ * flags which indicate that the VMA maps page tables which cannot be
+ * reconsistuted upon page fault, so necessitate page table copying upon
+ *
+ * VM_PFNMAP / VM_MIXEDMAP - These contain kernel-mapped data which cannot be
+ * reasonably reconstructed on page fault.
+ *
+ * VM_UFFD_WP - Encodes metadata about an installed uffd
+ * write protect handler, which cannot be
+ * reconstructed on page fault.
+ *
+ * We always copy pgtables when dst_vma has uffd-wp
+ * enabled even if it's file-backed
+ * (e.g. shmem). Because when uffd-wp is enabled,
+ * pgtable contains uffd-wp protection information,
+ * that's something we can't retrieve from page cache,
+ * and skip copying will lose those info.
+ *
+ * VM_MAYBE_GUARD - Could contain page guard region markers which
+ * by design are a property of the page tables
+ * only and thus cannot be reconstructed on page
+ * fault.
+ */
+#define VM_COPY_ON_FORK (VM_PFNMAP | VM_MIXEDMAP | VM_UFFD_WP | VM_MAYBE_GUARD)
+
#define FIRST_USER_ADDRESS 0UL
#define USER_PGTABLES_CEILING 0UL
--
2.53.0
next prev parent reply other threads:[~2026-04-24 21:13 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-24 21:12 [PATCH 6.18.y v1 0/9] mm: backport sticky VMA flags and soft-dirty fix Ahmed Elaidy
2026-04-24 21:12 ` [PATCH v1 1/9] mm: introduce VM_MAYBE_GUARD and make visible in /proc/$pid/smaps Ahmed Elaidy
2026-04-24 21:12 ` [PATCH v1 2/9] mm: add atomic VMA flags and set VM_MAYBE_GUARD as such Ahmed Elaidy
2026-04-24 21:12 ` [PATCH v1 3/9] mm: update vma_modify_flags() to handle residual flags, document Ahmed Elaidy
2026-04-24 21:12 ` [PATCH v1 4/9] mm: implement sticky VMA flags Ahmed Elaidy
2026-04-24 21:12 ` Ahmed Elaidy [this message]
2026-04-24 21:12 ` [PATCH v1 6/9] mm: set the VM_MAYBE_GUARD flag on guard region install Ahmed Elaidy
2026-04-24 21:12 ` [PATCH v1 7/9] tools/testing/vma: add VMA sticky userland tests Ahmed Elaidy
2026-04-24 21:12 ` [PATCH v1 8/9] mm: propagate VM_SOFTDIRTY on merge Ahmed Elaidy
2026-04-24 21:12 ` [PATCH v1 9/9] testing/selftests/mm: add soft-dirty merge self-test Ahmed Elaidy
2026-04-24 21:55 ` [PATCH 6.18.y v1 0/9] mm: backport sticky VMA flags and soft-dirty fix Andrei Vagin
2026-04-24 22:11 ` [PATCH v2] mm: fix VM_SOFTDIRTY propagation on VMA merge Ahmed Elaidy
2026-05-04 16:42 ` Andrei Vagin
2026-05-04 19:54 ` Ahmed Elaidy
2026-05-04 19:54 ` [PATCH 6.18.y v3] " Ahmed Elaidy
2026-05-04 19:58 ` [PATCH v2] " Ahmed Elaidy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260424211315.1072123-6-elaidya225@gmail.com \
--to=elaidya225@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=avagin@gmail.com \
--cc=baohua@kernel.org \
--cc=baolin.wang@linux.alibaba.com \
--cc=corbet@lwn.net \
--cc=david@kernel.org \
--cc=dev.jain@arm.com \
--cc=jannh@google.com \
--cc=lance.yang@linux.dev \
--cc=liam.howlett@oracle.com \
--cc=linux-mm@kvack.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=mathieu.desnoyers@efficios.com \
--cc=mhiramat@kernel.org \
--cc=mhocko@suse.com \
--cc=npache@redhat.com \
--cc=pfalcato@suse.de \
--cc=rostedt@goodmis.org \
--cc=rppt@kernel.org \
--cc=ryan.roberts@arm.com \
--cc=stable@vger.kernel.org \
--cc=surenb@google.com \
--cc=vbabka@suse.cz \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox