From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, Lorenzo Stoakes <ljs@kernel.org>,
Pedro Falcato <pfalcato@suse.de>,
Vlastimil Babka <vbabka@suse.cz>,
"David Hildenbrand (Red Hat)" <david@kernel.org>,
Lance Yang <lance.yang@linux.dev>,
Andrei Vagin <avagin@gmail.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
Barry Song <baohua@kernel.org>, Dev Jain <dev.jain@arm.com>,
Jann Horn <jannh@google.com>, Jonathan Corbet <corbet@lwn.net>,
Liam Howlett <liam.howlett@oracle.com>,
"Masami Hiramatsu (Google)" <mhiramat@kernel.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Michal Hocko <mhocko@suse.com>, Mike Rapoport <rppt@kernel.org>,
Nico Pache <npache@redhat.com>,
Ryan Roberts <ryan.roberts@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Suren Baghdasaryan <surenb@google.com>, Zi Yan <ziy@nvidia.com>,
Andrew Morton <akpm@linux-foundation.org>,
Ahmed Elaidy <elaidya225@gmail.com>
Subject: [PATCH 6.18 34/60] mm: introduce VM_MAYBE_GUARD and make visible in /proc/$pid/smaps
Date: Thu, 25 Jun 2026 14:03:19 +0100 [thread overview]
Message-ID: <20260625125650.574635767@linuxfoundation.org> (raw)
In-Reply-To: <20260625125645.554579168@linuxfoundation.org>
6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
commit 5dba5cc2e0ffa76f2f6c8922a04469dc9602c396 upstream.
Patch series "introduce VM_MAYBE_GUARD and make it sticky", v4.
Currently, guard regions are not visible to users except through
/proc/$pid/pagemap, with no explicit visibility at the VMA level.
This makes the feature less useful, as it isn't entirely apparent which
VMAs may have these entries present, especially when performing actions
which walk through memory regions such as those performed by CRIU.
This series addresses this issue by introducing the VM_MAYBE_GUARD flag
which fulfils this role, updating the smaps logic to display an entry for
these.
The semantics of this flag are that a guard region MAY be present if set
(we cannot be sure, as we can't efficiently track whether an
MADV_GUARD_REMOVE finally removes all the guard regions in a VMA) - but if
not set the VMA definitely does NOT have any guard regions present.
It's problematic to establish this flag without further action, because
that means that VMAs with guard regions in them become non-mergeable with
adjacent VMAs for no especially good reason.
To work around this, this series also introduces the concept of 'sticky'
VMA flags - that is flags which:
a. if set in one VMA and not in another still permit those VMAs to be
merged (if otherwise compatible).
b. When they are merged, the resultant VMA must have the flag set.
The VMA logic is updated to propagate these flags correctly.
Additionally, VM_MAYBE_GUARD being an explicit VMA flag allows us to solve
an issue with file-backed guard regions - previously these established an
anon_vma object for file-backed mappings solely to have vma_needs_copy()
correctly propagate guard region mappings to child processes.
We introduce a new flag alias VM_COPY_ON_FORK (which currently only
specifies VM_MAYBE_GUARD) and update vma_needs_copy() to check explicitly
for this flag and to copy page tables if it is present, which resolves
this issue.
Additionally, we add the ability for allow-listed VMA flags to be
atomically writable with only mmap/VMA read locks held.
The only flag we allow so far is VM_MAYBE_GUARD, which we carefully ensure
does not cause any races by being allowed to do so.
This allows us to maintain guard region installation as a read-locked
operation and not endure the overhead of obtaining a write lock here.
Finally we introduce extensive VMA userland tests to assert that the
sticky VMA logic behaves correctly as well as guard region self tests to
assert that smaps visibility is correctly implemented.
This patch (of 9):
Currently, if a user needs to determine if guard regions are present in a
range, they have to scan all VMAs (or have knowledge of which ones might
have guard regions).
Since commit 8e2f2aeb8b48 ("fs/proc/task_mmu: add guard region bit to
pagemap") and the related commit a516403787e0 ("fs/proc: extend the
PAGEMAP_SCAN ioctl to report guard regions"), users can use either
/proc/$pid/pagemap or the PAGEMAP_SCAN functionality to perform this
operation at a virtual address level.
This is not ideal, and it gives no visibility at a /proc/$pid/smaps level
that guard regions exist in ranges.
This patch remedies the situation by establishing a new VMA flag,
VM_MAYBE_GUARD, to indicate that a VMA may contain guard regions (it is
uncertain because we cannot reasonably determine whether a
MADV_GUARD_REMOVE call has removed all of the guard regions in a VMA, and
additionally VMAs may change across merge/split).
We utilise 0x800 for this flag which makes it available to 32-bit
architectures also, a flag that was previously used by VM_DENYWRITE, which
was removed in commit 8d0920bde5eb ("mm: remove VM_DENYWRITE") and hasn't
bee reused yet.
We also update the smaps logic and documentation to identify these VMAs.
Another major use of this functionality is that we can use it to identify
that we ought to copy page tables on fork.
We do not actually implement usage of this flag in mm/madvise.c yet as we
need to allow some VMA flags to be applied atomically under mmap/VMA read
lock in order to avoid the need to acquire a write lock for this purpose.
Link: https://lkml.kernel.org/r/cover.1763460113.git.ljs@kernel.org
Link: https://lkml.kernel.org/r/cf8ef821eba29b6c5b5e138fffe95d6dcabdedb9.1763460113.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ahmed Elaidy <elaidya225@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
Documentation/filesystems/proc.rst | 5 +++--
fs/proc/task_mmu.c | 1 +
include/linux/mm.h | 3 +++
include/trace/events/mmflags.h | 1 +
mm/memory.c | 4 ++++
tools/testing/vma/vma_internal.h | 1 +
6 files changed, 13 insertions(+), 2 deletions(-)
--- a/Documentation/filesystems/proc.rst
+++ b/Documentation/filesystems/proc.rst
@@ -553,7 +553,7 @@ otherwise.
kernel flags associated with the particular virtual memory area in two letter
encoded manner. The codes are the following:
- == =======================================
+ == =============================================================
rd readable
wr writeable
ex executable
@@ -591,7 +591,8 @@ encoded manner. The codes are the follow
sl sealed
lf lock on fault pages
dp always lazily freeable mapping
- == =======================================
+ gu maybe contains guard regions (if not set, definitely doesn't)
+ == =============================================================
Note that there is no guarantee that every flag and associated mnemonic will
be present in all further kernel releases. Things get changed, the flags may
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1159,6 +1159,7 @@ static void show_smap_vma_flags(struct s
[ilog2(VM_MAYSHARE)] = "ms",
[ilog2(VM_GROWSDOWN)] = "gd",
[ilog2(VM_PFNMAP)] = "pf",
+ [ilog2(VM_MAYBE_GUARD)] = "gu",
[ilog2(VM_LOCKED)] = "lo",
[ilog2(VM_IO)] = "io",
[ilog2(VM_SEQ_READ)] = "sr",
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -269,6 +269,8 @@ extern struct rw_semaphore nommu_region_
extern unsigned int kobjsize(const void *objp);
#endif
+#define VM_MAYBE_GUARD_BIT 11
+
/*
* vm_flags in vm_area_struct, see mm_types.h.
* When changing, update also include/trace/events/mmflags.h
@@ -294,6 +296,7 @@ extern unsigned int kobjsize(const void
#define VM_UFFD_MISSING 0
#endif /* CONFIG_MMU */
#define VM_PFNMAP 0x00000400 /* Page-ranges managed without "struct page", just pure PFN */
+#define VM_MAYBE_GUARD BIT(VM_MAYBE_GUARD_BIT) /* The VMA maybe contains guard regions. */
#define VM_UFFD_WP 0x00001000 /* wrprotect pages tracking */
#define VM_LOCKED 0x00002000
--- a/include/trace/events/mmflags.h
+++ b/include/trace/events/mmflags.h
@@ -213,6 +213,7 @@ IF_HAVE_PG_ARCH_3(arch_3)
{VM_UFFD_MISSING, "uffd_missing" }, \
IF_HAVE_UFFD_MINOR(VM_UFFD_MINOR, "uffd_minor" ) \
{VM_PFNMAP, "pfnmap" }, \
+ {VM_MAYBE_GUARD, "maybe_guard" }, \
{VM_UFFD_WP, "uffd_wp" }, \
{VM_LOCKED, "locked" }, \
{VM_IO, "io" }, \
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1494,6 +1494,10 @@ vma_needs_copy(struct vm_area_struct *ds
if (src_vma->anon_vma)
return true;
+ /* Guard regions have modified page tables that require copying. */
+ if (src_vma->vm_flags & VM_MAYBE_GUARD)
+ return true;
+
/*
* Don't copy ptes where a page fault will fill them correctly. Fork
* becomes much lighter when there are big shared or private readonly
--- a/tools/testing/vma/vma_internal.h
+++ b/tools/testing/vma/vma_internal.h
@@ -56,6 +56,7 @@ extern unsigned long dac_mmap_min_addr;
#define VM_MAYEXEC 0x00000040
#define VM_GROWSDOWN 0x00000100
#define VM_PFNMAP 0x00000400
+#define VM_MAYBE_GUARD 0x00000800
#define VM_LOCKED 0x00002000
#define VM_IO 0x00004000
#define VM_SEQ_READ 0x00008000 /* App will access data sequentially */
next prev parent reply other threads:[~2026-06-25 13:07 UTC|newest]
Thread overview: 65+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-25 13:02 [PATCH 6.18 00/60] 6.18.37-rc1 review Greg Kroah-Hartman
2026-06-25 13:02 ` [PATCH 6.18 01/60] io_uring/net: Avoid msghdr on op_connect/op_bind async data Greg Kroah-Hartman
2026-06-25 13:02 ` [PATCH 6.18 02/60] net: stmmac: fix stm32 (and potentially others) resume regression Greg Kroah-Hartman
2026-06-25 13:02 ` [PATCH 6.18 03/60] fuse: re-lock request before replacing page cache folio Greg Kroah-Hartman
2026-06-25 13:02 ` [PATCH 6.18 04/60] Revert "NFSD: Defer sub-object cleanup in export put callbacks" Greg Kroah-Hartman
2026-06-25 13:02 ` [PATCH 6.18 05/60] debugobjects: Allow to refill the pool before SYSTEM_SCHEDULING Greg Kroah-Hartman
2026-06-25 13:02 ` [PATCH 6.18 06/60] debugobjects: Use LD_WAIT_CONFIG instead of LD_WAIT_SLEEP Greg Kroah-Hartman
2026-06-25 13:02 ` [PATCH 6.18 07/60] debugobjects: Do not fill_pool() if pi_blocked_on Greg Kroah-Hartman
2026-06-25 13:02 ` [PATCH 6.18 08/60] debugobjects: Dont call fill_pool() in early boot hardirq context Greg Kroah-Hartman
2026-06-25 13:02 ` [PATCH 6.18 09/60] RDMA/bnxt_re: zero shared page before exposing to userspace Greg Kroah-Hartman
2026-06-25 13:02 ` [PATCH 6.18 10/60] i2c: stub: Reject I2C block transfers with invalid length Greg Kroah-Hartman
2026-06-25 13:02 ` [PATCH 6.18 11/60] net: qualcomm: rmnet: fix endpoint use-after-free in rmnet_dellink() Greg Kroah-Hartman
2026-06-25 13:02 ` [PATCH 6.18 12/60] agp/amd64: Fix broken error propagation in agp_amd64_probe() Greg Kroah-Hartman
2026-06-25 13:02 ` [PATCH 6.18 13/60] ACPI: scan: Use async schedule function in acpi_scan_clear_dep_fn() Greg Kroah-Hartman
2026-06-25 13:02 ` [PATCH 6.18 14/60] rose: fix dev_put() leak in rose_loopback_timer() Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 15/60] rose: hold loopback neighbour reference across timer callback Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 16/60] rose: fix race between loopback timer and module removal Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 17/60] rose: clear neighbour pointer after rose_neigh_put() in state machines Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 18/60] rose: guard rose_neigh_put() against NULL in timer expiry Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 19/60] rose: fix netdev double-hold in rose_rx_call_request() Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 20/60] rose: fix notifier unregistered too early in rose_exit() Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 21/60] rose: set SOCK_DESTROY in rose_kill_by_device() for prompt cleanup Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 22/60] rose: disconnect orphaned STATE_2 sockets when device is gone Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 23/60] rose: fix netdev double-hold in rose_make_new() Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 24/60] rose: release netdev ref and destroy orphaned incoming sockets Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 25/60] rose: drop CALL_REQUEST in loopback timer when device is not running Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 26/60] rose: cancel neighbour timers in rose_neigh_put() before freeing Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 27/60] rose: clear neighbour pointer in rose_kill_by_device() Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 28/60] rose: dont free fd-owned sockets when reaping in the heartbeat Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 29/60] regulator: core: fix locking in regulator_resolve_supply() error path Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 30/60] hv: utils: handle and propagate errors in kvp_register Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 31/60] Drivers: hv: vmbus: Improve the logic of reserving fb_mmio on Gen2 VMs Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 32/60] firmware: samsung: acpm: Fix cross-thread RX length corruption Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 33/60] sctp: disable BH before calling udp_tunnel_xmit_skb() Greg Kroah-Hartman
2026-06-25 13:03 ` Greg Kroah-Hartman [this message]
2026-06-25 13:03 ` [PATCH 6.18 35/60] mm: add atomic VMA flags and set VM_MAYBE_GUARD as such Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 36/60] mm: update vma_modify_flags() to handle residual flags, document Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 37/60] mm: implement sticky VMA flags Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 38/60] mm: introduce copy-on-fork VMAs and make VM_MAYBE_GUARD one Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 39/60] mm: set the VM_MAYBE_GUARD flag on guard region install Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 40/60] mm: propagate VM_SOFTDIRTY on merge Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 41/60] testing/selftests/mm: add soft-dirty merge self-test Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 42/60] net: export netif_open for self_test usage Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 43/60] net: net_failover: Fix the deadlock in slave register Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 44/60] iio: light: veml6075: add bounds check to veml6075_it_ms index Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 45/60] iio: adc: ti-ads1298: add bounds check to pga_settings index Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 46/60] Input: rmi4 - fix register descriptor address calculation Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 47/60] Input: rmi4 - refactor register descriptor parsing Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 48/60] Input: rmi4 - fix type overflow in register counts Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 49/60] Input: rmi4 - fix num_subpackets overflow in register descriptor Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 50/60] Input: rmi4 - fix memory leak in rmi_set_attn_data() Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 51/60] Input: rmi4 - iterative IRQ handler Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 52/60] Input: rmi4 - fix bit count in bitmap_copy() Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 53/60] crypto: qat - remove unused character device and IOCTLs Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 54/60] vc_screen: fix null-ptr-deref in vcs_notifier() during concurrent vcs_write Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 55/60] serial: qcom_geni: Fix RX DMA stall when SE_DMA_RX_LEN_IN is zero Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 56/60] drivers/base/memory: set mem->altmap after successful device registration Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 57/60] ksmbd: reject non-VALID session in compound request branch Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 58/60] media: vidtv: fix NULL pointer dereference in vidtv_mux_push_si Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 59/60] virtiofs: fix UAF on submount umount Greg Kroah-Hartman
2026-06-25 13:03 ` [PATCH 6.18 60/60] mm: do not copy page tables unnecessarily for VM_UFFD_WP Greg Kroah-Hartman
2026-06-25 13:33 ` [PATCH 6.18 00/60] 6.18.37-rc1 review Florian Fainelli
2026-06-25 15:27 ` Brett A C Sheffield
2026-06-25 17:11 ` Peter Schneider
2026-06-26 0:04 ` Shuah Khan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260625125650.574635767@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=akpm@linux-foundation.org \
--cc=avagin@gmail.com \
--cc=baohua@kernel.org \
--cc=baolin.wang@linux.alibaba.com \
--cc=corbet@lwn.net \
--cc=david@kernel.org \
--cc=dev.jain@arm.com \
--cc=elaidya225@gmail.com \
--cc=jannh@google.com \
--cc=lance.yang@linux.dev \
--cc=liam.howlett@oracle.com \
--cc=ljs@kernel.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=mhiramat@kernel.org \
--cc=mhocko@suse.com \
--cc=npache@redhat.com \
--cc=patches@lists.linux.dev \
--cc=pfalcato@suse.de \
--cc=rostedt@goodmis.org \
--cc=rppt@kernel.org \
--cc=ryan.roberts@arm.com \
--cc=stable@vger.kernel.org \
--cc=surenb@google.com \
--cc=vbabka@suse.cz \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.