From: Kalesh Singh <kaleshsingh@google.com>
To: akpm@linux-foundation.org, minchan@kernel.org,
lorenzo.stoakes@oracle.com, david@redhat.com,
Liam.Howlett@oracle.com, rppt@kernel.org, pfalcato@suse.de
Cc: rostedt@goodmis.org, hughd@google.com, kernel-team@android.com,
android-mm@google.com, Kalesh Singh <kaleshsingh@google.com>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
Kees Cook <kees@kernel.org>, Vlastimil Babka <vbabka@suse.cz>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>, Jann Horn <jannh@google.com>,
Masami Hiramatsu <mhiramat@kernel.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Valentin Schneider <vschneid@redhat.com>,
Shuah Khan <shuah@kernel.org>,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org,
linux-kselftest@vger.kernel.org
Subject: [PATCH v4 3/5] mm: Introduce max_vma_count() to abstract the max map count sysctl
Date: Tue, 28 Oct 2025 14:24:34 -0700
Message-ID: <20251028212528.681081-4-kaleshsingh@google.com>
In-Reply-To: <20251028212528.681081-1-kaleshsingh@google.com>

Introduce a new helper function, max_vma_count(), to act as the
canonical accessor for the maximum VMA count limit.

The global variable sysctl_max_map_count is read directly in multiple
files to check the VMA limit. This direct usage exposes an
implementation detail and makes the code harder to read and maintain.

Abstract the global variable behind the more aptly named
max_vma_count() function. As a result, sysctl_max_map_count can be
made static to mm/mmap.c, improving encapsulation.

All call sites are converted to use the new helper, with each limit
check rewritten in terms of the number of VMA slots remaining
(max_vma_count() minus the current map count).

Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
---
Changes in v4:
- Introduce max_vma_count() to abstract the max map count sysctl,
replacing the previously proposed vma_count_remaining() helper,
since the remaining count can now be negative: some paths are
allowed to exceed the limit.
- Convert all callers to use the new helper.
Changes in v3:
- Move vma_count_remaining() out of #if CONFIG_SYSCTL to fix build
failure
- Use READ_ONCE() for sysctl_max_map_count, per David, Lorenzo
- Remove use of ternary op in vma_count_remaining(), per Lorenzo
- Rebase on mm-new to fix conflicts in vma_internal.h and
mm/internal.h
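
Note for reviewers (not part of the patch): the converted checks all
take the form "max_vma_count() - map_count < N". The userspace sketch
below is one way to convince yourself that each rewritten comparison
matches the one it replaces. The names limit, slots_left() and
check_forms() are hypothetical, introduced only for this illustration,
and the local max_vma_count() is a plain stand-in that omits
READ_ONCE(). It assumes int arithmetic well away from overflow.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the sysctl-backed limit. */
static int limit = 65530; /* DEFAULT_MAX_MAP_COUNT: USHRT_MAX - 5 */

/* Stand-in accessor; the kernel version wraps the read in READ_ONCE(). */
static int max_vma_count(void)
{
	return limit;
}

/* Signed number of VMA slots left; negative once the limit is exceeded. */
static int slots_left(int map_count)
{
	return max_vma_count() - map_count;
}

/* Assert that each old comparison agrees with its replacement. */
static void check_forms(int max)
{
	limit = max;
	for (int count = 0; count <= max + 8; count++) {
		/* do_mmap(), do_brk_flags(): may exceed the limit by one */
		assert((count > limit) == (slots_left(count) < 0));
		/* split_vma(), vms_gather_munmap_vmas() */
		assert((count >= limit) == (slots_left(count) < 1));
		/* prep_move_vma() */
		assert((count >= limit - 3) == (slots_left(count) < 4));
		/* check_mremap_params() */
		assert((count + 2 >= limit - 3) == (slots_left(count) < 6));
	}
}
```

Calling check_forms() with a few limits (for example 0, 5 and 65530)
from a small test main() should run to completion without tripping an
assertion.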
include/linux/mm.h | 2 --
mm/internal.h | 3 +++
mm/mmap.c | 9 ++++++++-
mm/mremap.c | 7 ++++---
mm/nommu.c | 2 +-
mm/util.c | 1 -
mm/vma.c | 10 +++++-----
tools/testing/vma/vma_internal.h | 6 ++++++
8 files changed, 27 insertions(+), 13 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index aada935c4950..5db9d95043f6 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -205,8 +205,6 @@ static inline void __mm_zero_struct_page(struct page *page)
#define MAPCOUNT_ELF_CORE_MARGIN (5)
#define DEFAULT_MAX_MAP_COUNT (USHRT_MAX - MAPCOUNT_ELF_CORE_MARGIN)
-extern int sysctl_max_map_count;
-
extern unsigned long sysctl_user_reserve_kbytes;
extern unsigned long sysctl_admin_reserve_kbytes;
diff --git a/mm/internal.h b/mm/internal.h
index 116a1ba85e66..eba30ff7c8dc 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1702,4 +1702,7 @@ static inline int io_remap_pfn_range_complete(struct vm_area_struct *vma,
return remap_pfn_range_complete(vma, addr, pfn, size, prot);
}
+/* mmap.c */
+int max_vma_count(void);
+
#endif /* __MM_INTERNAL_H */
diff --git a/mm/mmap.c b/mm/mmap.c
index 78843a2fae42..5a967a307099 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -383,7 +383,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
* sysctl_max_map_count limit by one. This behavior is preserved to
* avoid breaking existing applications.
*/
- if (mm->map_count > sysctl_max_map_count)
+ if (max_vma_count() - mm->map_count < 0)
return -ENOMEM;
/*
@@ -1504,6 +1504,13 @@ struct vm_area_struct *_install_special_mapping(
&special_mapping_vmops);
}
+static int sysctl_max_map_count __read_mostly = DEFAULT_MAX_MAP_COUNT;
+
+int max_vma_count(void)
+{
+ return READ_ONCE(sysctl_max_map_count);
+}
+
#ifdef CONFIG_SYSCTL
#if defined(HAVE_ARCH_PICK_MMAP_LAYOUT) || \
defined(CONFIG_ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT)
diff --git a/mm/mremap.c b/mm/mremap.c
index a7f531c17b79..02c38fd957e4 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -1040,7 +1040,7 @@ static unsigned long prep_move_vma(struct vma_remap_struct *vrm)
* We'd prefer to avoid failure later on in do_munmap:
* which may split one vma into three before unmapping.
*/
- if (current->mm->map_count >= sysctl_max_map_count - 3)
+ if (max_vma_count() - current->mm->map_count < 4)
return -ENOMEM;
if (vma->vm_ops && vma->vm_ops->may_split) {
@@ -1811,9 +1811,10 @@ static unsigned long check_mremap_params(struct vma_remap_struct *vrm)
* split in 3 before unmapping it.
* That means 2 more maps (1 for each) to the ones we already hold.
* Check whether current map count plus 2 still leads us to 4 maps below
- * the threshold, otherwise return -ENOMEM here to be more safe.
+ * the threshold. In other words, is the current map count + 6 at or
+ * below the threshold? Otherwise return -ENOMEM here to be more safe.
*/
- if ((current->mm->map_count + 2) >= sysctl_max_map_count - 3)
+ if (max_vma_count() - current->mm->map_count < 6)
return -ENOMEM;
return 0;
diff --git a/mm/nommu.c b/mm/nommu.c
index c3a23b082adb..ae2b20cc324a 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1317,7 +1317,7 @@ static int split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
return -ENOMEM;
mm = vma->vm_mm;
- if (mm->map_count >= sysctl_max_map_count)
+ if (max_vma_count() - mm->map_count < 1)
return -ENOMEM;
region = kmem_cache_alloc(vm_region_jar, GFP_KERNEL);
diff --git a/mm/util.c b/mm/util.c
index 97cae40c0209..eb1bcfc1d48d 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -752,7 +752,6 @@ EXPORT_SYMBOL(folio_mc_copy);
int sysctl_overcommit_memory __read_mostly = OVERCOMMIT_GUESS;
static int sysctl_overcommit_ratio __read_mostly = 50;
static unsigned long sysctl_overcommit_kbytes __read_mostly;
-int sysctl_max_map_count __read_mostly = DEFAULT_MAX_MAP_COUNT;
unsigned long sysctl_user_reserve_kbytes __read_mostly = 1UL << 17; /* 128MB */
unsigned long sysctl_admin_reserve_kbytes __read_mostly = 1UL << 13; /* 8MB */
diff --git a/mm/vma.c b/mm/vma.c
index d0bb3127280e..768d216beed3 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -493,8 +493,8 @@ void unmap_region(struct ma_state *mas, struct vm_area_struct *vma,
}
/*
- * __split_vma() bypasses sysctl_max_map_count checking. We use this where it
- * has already been checked or doesn't make sense to fail.
+ * __split_vma() bypasses max_vma_count() checks. We use this where
+ * it has already been checked or doesn't make sense to fail.
* VMA Iterator will point to the original VMA.
*/
static __must_check int
@@ -594,7 +594,7 @@ __split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
static int split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
unsigned long addr, int new_below)
{
- if (vma->vm_mm->map_count >= sysctl_max_map_count)
+ if (max_vma_count() - vma->vm_mm->map_count < 1)
return -ENOMEM;
return __split_vma(vmi, vma, addr, new_below);
@@ -1347,7 +1347,7 @@ static int vms_gather_munmap_vmas(struct vma_munmap_struct *vms,
* its limit temporarily, to help free resources as expected.
*/
if (vms->end < vms->vma->vm_end &&
- vms->vma->vm_mm->map_count >= sysctl_max_map_count) {
+ max_vma_count() - vms->vma->vm_mm->map_count < 1) {
error = -ENOMEM;
goto map_count_exceeded;
}
@@ -2819,7 +2819,7 @@ int do_brk_flags(struct vma_iterator *vmi, struct vm_area_struct *vma,
* typically extends the existing brk VMA rather than creating a new one.
* See also the comment in do_mmap().
*/
- if (mm->map_count > sysctl_max_map_count)
+ if (max_vma_count() - mm->map_count < 0)
return -ENOMEM;
if (security_vm_enough_memory_mm(mm, len >> PAGE_SHIFT))
diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_internal.h
index d873667704e8..41d354a699c5 100644
--- a/tools/testing/vma/vma_internal.h
+++ b/tools/testing/vma/vma_internal.h
@@ -1491,4 +1491,10 @@ static inline int do_munmap(struct mm_struct *, unsigned long, size_t,
return 0;
}
+/* Helper to get max vma count */
+static int max_vma_count(void)
+{
+ return sysctl_max_map_count;
+}
+
#endif /* __MM_VMA_INTERNAL_H */
--
2.51.1.851.g4ebd6896fd-goog