From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, Lorenzo Stoakes <ljs@kernel.org>,
Pedro Falcato <pfalcato@suse.de>,
Vlastimil Babka <vbabka@suse.cz>,
"David Hildenbrand (Red Hat)" <david@kernel.org>,
Andrei Vagin <avagin@gmail.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
Barry Song <baohua@kernel.org>, Dev Jain <dev.jain@arm.com>,
Jann Horn <jannh@google.com>, Jonathan Corbet <corbet@lwn.net>,
Lance Yang <lance.yang@linux.dev>,
Liam Howlett <liam.howlett@oracle.com>,
"Masami Hiramatsu (Google)" <mhiramat@kernel.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Michal Hocko <mhocko@suse.com>, Mike Rapoport <rppt@kernel.org>,
Nico Pache <npache@redhat.com>,
Ryan Roberts <ryan.roberts@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Suren Baghdasaryan <surenb@google.com>, Zi Yan <ziy@nvidia.com>,
Andrew Morton <akpm@linux-foundation.org>,
Ahmed Elaidy <elaidya225@gmail.com>
Subject: [PATCH 6.18 38/60] mm: introduce copy-on-fork VMAs and make VM_MAYBE_GUARD one
Date: Thu, 25 Jun 2026 14:03:23 +0100
Message-ID: <20260625125651.220918535@linuxfoundation.org>
In-Reply-To: <20260625125645.554579168@linuxfoundation.org>
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
commit ab04b530e7e8bd5cf9fb0c1ad20e0deee8f569ec upstream.
Gather all the VMA flags whose presence implies that page tables must be
copied on fork into a single bitmap - VM_COPY_ON_FORK - and use this
rather than specifying individual flags in vma_needs_copy().
We also add VM_MAYBE_GUARD to this list, as it being set on a VMA implies
that there may be metadata contained in the page tables (that is - guard
markers) which would not otherwise be propagated upon fork.
This was already being done manually previously in vma_needs_copy(), but
this makes it very explicit, alongside VM_PFNMAP, VM_MIXEDMAP and
VM_UFFD_WP all of which imply the same.
Note that VM_STICKY flags ought generally to be marked VM_COPY_ON_FORK too
- because equally a flag being VM_STICKY indicates that the VMA contains
metadata that is not propagated by being faulted in - i.e. that the VMA
metadata does not fully describe the VMA alone, and thus we must propagate
whatever metadata there is on a fork.
However, for maximum flexibility, we do not make this necessarily the case
here.
Link: https://lkml.kernel.org/r/5d41b24e7bc622cda0af92b6d558d7f4c0d1bc8c.1763460113.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
include/linux/mm.h | 26 ++++++++++++++++++++++++++
mm/memory.c | 18 ++++--------------
tools/testing/vma/vma_internal.h | 26 ++++++++++++++++++++++++++
3 files changed, 56 insertions(+), 14 deletions(-)
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -539,6 +539,32 @@ extern unsigned int kobjsize(const void
#define VM_IGNORE_MERGE (VM_SOFTDIRTY | VM_STICKY)
/*
+ * Flags which should result in page tables being copied on fork. These are
+ * flags which indicate that the VMA maps page tables which cannot be
+ * reconsistuted upon page fault, so necessitate page table copying upon
+ *
+ * VM_PFNMAP / VM_MIXEDMAP - These contain kernel-mapped data which cannot be
+ * reasonably reconstructed on page fault.
+ *
+ * VM_UFFD_WP - Encodes metadata about an installed uffd
+ * write protect handler, which cannot be
+ * reconstructed on page fault.
+ *
+ * We always copy pgtables when dst_vma has uffd-wp
+ * enabled even if it's file-backed
+ * (e.g. shmem). Because when uffd-wp is enabled,
+ * pgtable contains uffd-wp protection information,
+ * that's something we can't retrieve from page cache,
+ * and skip copying will lose those info.
+ *
+ * VM_MAYBE_GUARD - Could contain page guard region markers which
+ * by design are a property of the page tables
+ * only and thus cannot be reconstructed on page
+ * fault.
+ */
+#define VM_COPY_ON_FORK (VM_PFNMAP | VM_MIXEDMAP | VM_UFFD_WP | VM_MAYBE_GUARD)
+
+/*
* mapping from the currently active vm_flags protection bits (the
* low four bits) to a page protection mask..
*/
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1479,25 +1479,15 @@ copy_p4d_range(struct vm_area_struct *ds
static bool
vma_needs_copy(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
{
+ if (src_vma->vm_flags & VM_COPY_ON_FORK)
+ return true;
/*
- * Always copy pgtables when dst_vma has uffd-wp enabled even if it's
- * file-backed (e.g. shmem). Because when uffd-wp is enabled, pgtable
- * contains uffd-wp protection information, that's something we can't
- * retrieve from page cache, and skip copying will lose those info.
+ * The presence of an anon_vma indicates an anonymous VMA has page
+ * tables which naturally cannot be reconstituted on page fault.
*/
- if (userfaultfd_wp(dst_vma))
- return true;
-
- if (src_vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP))
- return true;
-
if (src_vma->anon_vma)
return true;
- /* Guard regions have modified page tables that require copying. */
- if (src_vma->vm_flags & VM_MAYBE_GUARD)
- return true;
-
/*
* Don't copy ptes where a page fault will fill them correctly. Fork
* becomes much lighter when there are big shared or private readonly
--- a/tools/testing/vma/vma_internal.h
+++ b/tools/testing/vma/vma_internal.h
@@ -145,6 +145,32 @@ extern unsigned long dac_mmap_min_addr;
*/
#define VM_IGNORE_MERGE (VM_SOFTDIRTY | VM_STICKY)
+/*
+ * Flags which should result in page tables being copied on fork. These are
+ * flags which indicate that the VMA maps page tables which cannot be
+ * reconsistuted upon page fault, so necessitate page table copying upon
+ *
+ * VM_PFNMAP / VM_MIXEDMAP - These contain kernel-mapped data which cannot be
+ * reasonably reconstructed on page fault.
+ *
+ * VM_UFFD_WP - Encodes metadata about an installed uffd
+ * write protect handler, which cannot be
+ * reconstructed on page fault.
+ *
+ * We always copy pgtables when dst_vma has uffd-wp
+ * enabled even if it's file-backed
+ * (e.g. shmem). Because when uffd-wp is enabled,
+ * pgtable contains uffd-wp protection information,
+ * that's something we can't retrieve from page cache,
+ * and skip copying will lose those info.
+ *
+ * VM_MAYBE_GUARD - Could contain page guard region markers which
+ * by design are a property of the page tables
+ * only and thus cannot be reconstructed on page
+ * fault.
+ */
+#define VM_COPY_ON_FORK (VM_PFNMAP | VM_MIXEDMAP | VM_UFFD_WP | VM_MAYBE_GUARD)
+
#define FIRST_USER_ADDRESS 0UL
#define USER_PGTABLES_CEILING 0UL
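
For readers who want to see the resulting decision in isolation, the
following is a minimal userspace sketch of the logic that vma_needs_copy()
applies after this patch. It is not the kernel code: the flag bit positions
are placeholder values chosen for illustration, and struct mock_vma and
mock_vma_needs_copy() are hypothetical stand-ins for struct vm_area_struct
and the real function, which also receives dst_vma.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Placeholder bit positions, for illustration only. The real values are
 * defined in include/linux/mm.h and differ from these.
 */
#define VM_PFNMAP	(1UL << 0)
#define VM_MIXEDMAP	(1UL << 1)
#define VM_UFFD_WP	(1UL << 2)
#define VM_MAYBE_GUARD	(1UL << 3)

/* The single mask introduced by the patch. */
#define VM_COPY_ON_FORK (VM_PFNMAP | VM_MIXEDMAP | VM_UFFD_WP | VM_MAYBE_GUARD)

/* Hypothetical, heavily reduced stand-in for struct vm_area_struct. */
struct mock_vma {
	unsigned long vm_flags;
	void *anon_vma;		/* non-NULL once anonymous pages exist */
};

/* Returns true if page tables must be copied to the child on fork. */
static bool mock_vma_needs_copy(const struct mock_vma *src_vma)
{
	/* Any flag in the mask means the page tables carry state. */
	if (src_vma->vm_flags & VM_COPY_ON_FORK)
		return true;

	/* Anonymous memory cannot be reconstituted on page fault. */
	if (src_vma->anon_vma)
		return true;

	/* Otherwise a page fault in the child can refill the entries. */
	return false;
}
```

With this shape, a VMA that has neither an anon_vma nor any flag in the mask
is skipped at fork time, while setting VM_MAYBE_GUARD alone is enough to
force the copy.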