linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/4] madvise anon_name cleanups
@ 2025-06-24 13:03 Vlastimil Babka
  2025-06-24 13:03 ` [PATCH v2 1/4] mm, madvise: simplify anon_name handling Vlastimil Babka
                   ` (3 more replies)
  0 siblings, 4 replies; 25+ messages in thread
From: Vlastimil Babka @ 2025-06-24 13:03 UTC (permalink / raw)
  To: Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
	David Hildenbrand, Jann Horn, Mike Rapoport, Suren Baghdasaryan,
	Michal Hocko, Colin Cross
  Cc: linux-mm, linux-kernel, Vlastimil Babka

While reviewing Lorenzo's madvise cleanups I've noticed that we can
handle anon_name in madvise code much better, so sending that as patch
1. Initially I wanted to do first move the existing logic from
madvise_vma_behavior() to madvise_update_vma() as a separate patch
before the actual simplification but that would require adding
anon_vma_name_put() in error handling paths only to be removed again, so
it's a single patch to avoid churn.

It's also an opportunity to move some mm code from prctl under mm,
hence patch 2. After code moving preparation in patch 3, also unify
madvise lock handling for madvise_set_anon_name() in patch 4.

Based on mm-new.

Taking the RFC off as concerns from RFC should be addressed, but not
without risk as patches 3+4 are new. Due to rewrite of patch 1 I didn't
keep Suren's R-b (but thanks!).

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
Changes in v2:
- Refactor madvise_update_vma() to select between vma_modify_flags() and
  vma_modify_flags_name() based on Lorenzo's suggestion. This should
  also address Suren's concerns.
- Reduce the code move from kernel/sys.c to code handling
  PR_SET_VMA_ANON_NAME, per Lorenzo.
- Added patches 3, 4 to unify mm locking.
- Link to v1: https://patch.msgid.link/20250623-anon_name_cleanup-v1-0-04c94384046f@suse.cz

---
Vlastimil Babka (4):
      mm, madvise: simplify anon_name handling
      mm, madvise: extract mm code from prctl_set_vma() to mm/madvise.c
      mm, madvise: move madvise_set_anon_name() down the file
      mm, madvise: use standard madvise locking in madvise_set_anon_name()

 include/linux/mm.h |  14 ++---
 kernel/sys.c       |  50 +----------------
 mm/madvise.c       | 154 ++++++++++++++++++++++++++++++++++-------------------
 3 files changed, 106 insertions(+), 112 deletions(-)
---
base-commit: 4216fd45fc9156da0ee33fcb25cc0a5265049e32
change-id: 20250623-anon_name_cleanup-e89b687038ed

Best regards,
-- 
Vlastimil Babka <vbabka@suse.cz>



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v2 1/4] mm, madvise: simplify anon_name handling
  2025-06-24 13:03 [PATCH v2 0/4] madvise anon_name cleanups Vlastimil Babka
@ 2025-06-24 13:03 ` Vlastimil Babka
  2025-06-24 13:58   ` David Hildenbrand
                     ` (2 more replies)
  2025-06-24 13:03 ` [PATCH v2 2/4] mm, madvise: extract mm code from prctl_set_vma() to mm/madvise.c Vlastimil Babka
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 25+ messages in thread
From: Vlastimil Babka @ 2025-06-24 13:03 UTC (permalink / raw)
  To: Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
	David Hildenbrand, Jann Horn, Mike Rapoport, Suren Baghdasaryan,
	Michal Hocko, Colin Cross
  Cc: linux-mm, linux-kernel, Vlastimil Babka

Since the introduction in 9a10064f5625 ("mm: add a field to store names
for private anonymous memory") the code to set anon_name on a vma has
been using madvise_update_vma() to call replace_anon_vma_name(). Since
the former is called also by a number of other madvise behaviours that
do not set a new anon_name, they have been passing the existing
anon_name of the vma to make replace_vma_anon_name() a no-op.

This is rather wasteful as it needs anon_vma_name_eq() to determine the
no-op situations, and checks for when replace_vma_anon_name() is allowed
(the vma is anon/shmem) duplicate the checks already done earlier in
madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm:
fix use-after-free when anon vma name is used after vma is freed")
adding anon_name refcount get/put operations exactly to the cases that
actually do not change anon_name - just so the replace_vma_anon_name()
can keep safely determining it has nothing to do.

The recent madvise cleanups made this suboptimal handling very obvious,
but happily also allow for an easy fix. madvise_update_vma() now has the
complete information whether it's been called to set a new anon_name, so
stop passing it the existing vma's name and doing the refcount get/put
in its only caller madvise_vma_behavior().

In madvise_update_vma() itself, limit calling of replace_anon_vma_name()
only to cases where we are setting a new name, otherwise we know it's a
no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and
can remove the duplicate checks for vma being anon/shmem that were done
already in madvise_vma_behavior().

Additionally, by using vma_modify_flags() when not modifying the
anon_name, avoid explicitly passing the existing vma's anon_name and
storing a pointer to it in struct madv_behavior or a local variable.
This prevents the danger of accessing a freed anon_name after vma
merging, previously fixed by commit 942341dcc574.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/madvise.c | 37 +++++++++++++------------------------
 1 file changed, 13 insertions(+), 24 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 4491bf080f55d6d1aeffb2ff0b8fdd28904af950..fca0e9b3e844ad766e83ac04cc0d7f4099c74005 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -176,25 +176,29 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
 }
 #endif /* CONFIG_ANON_VMA_NAME */
 /*
- * Update the vm_flags on region of a vma, splitting it or merging it as
- * necessary.  Must be called with mmap_lock held for writing;
- * Caller should ensure anon_name stability by raising its refcount even when
- * anon_name belongs to a valid vma because this function might free that vma.
+ * Update the vm_flags and/or anon_name on region of a vma, splitting it or
+ * merging it as necessary. Must be called with mmap_lock held for writing.
  */
 static int madvise_update_vma(vm_flags_t new_flags,
 		struct madvise_behavior *madv_behavior)
 {
-	int error;
 	struct vm_area_struct *vma = madv_behavior->vma;
 	struct madvise_behavior_range *range = &madv_behavior->range;
 	struct anon_vma_name *anon_name = madv_behavior->anon_name;
+	bool set_new_anon_name = madv_behavior->behavior == __MADV_SET_ANON_VMA_NAME;
 	VMA_ITERATOR(vmi, madv_behavior->mm, range->start);
 
-	if (new_flags == vma->vm_flags && anon_vma_name_eq(anon_vma_name(vma), anon_name))
+	if (new_flags == vma->vm_flags && (!set_new_anon_name ||
+			anon_vma_name_eq(anon_vma_name(vma), anon_name)))
 		return 0;
 
-	vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma,
+	if (set_new_anon_name)
+		vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma,
 			range->start, range->end, new_flags, anon_name);
+	else
+		vma = vma_modify_flags(&vmi, madv_behavior->prev, vma,
+			range->start, range->end, new_flags);
+
 	if (IS_ERR(vma))
 		return PTR_ERR(vma);
 
@@ -203,11 +207,8 @@ static int madvise_update_vma(vm_flags_t new_flags,
 	/* vm_flags is protected by the mmap_lock held in write mode. */
 	vma_start_write(vma);
 	vm_flags_reset(vma, new_flags);
-	if (!vma->vm_file || vma_is_anon_shmem(vma)) {
-		error = replace_anon_vma_name(vma, anon_name);
-		if (error)
-			return error;
-	}
+	if (set_new_anon_name)
+		return replace_anon_vma_name(vma, anon_name);
 
 	return 0;
 }
@@ -1313,7 +1314,6 @@ static int madvise_vma_behavior(struct madvise_behavior *madv_behavior)
 	int behavior = madv_behavior->behavior;
 	struct vm_area_struct *vma = madv_behavior->vma;
 	vm_flags_t new_flags = vma->vm_flags;
-	bool set_new_anon_name = behavior == __MADV_SET_ANON_VMA_NAME;
 	struct madvise_behavior_range *range = &madv_behavior->range;
 	int error;
 
@@ -1403,18 +1403,7 @@ static int madvise_vma_behavior(struct madvise_behavior *madv_behavior)
 	/* This is a write operation.*/
 	VM_WARN_ON_ONCE(madv_behavior->lock_mode != MADVISE_MMAP_WRITE_LOCK);
 
-	/*
-	 * madvise_update_vma() might cause a VMA merge which could put an
-	 * anon_vma_name, so we must hold an additional reference on the
-	 * anon_vma_name so it doesn't disappear from under us.
-	 */
-	if (!set_new_anon_name) {
-		madv_behavior->anon_name = anon_vma_name(vma);
-		anon_vma_name_get(madv_behavior->anon_name);
-	}
 	error = madvise_update_vma(new_flags, madv_behavior);
-	if (!set_new_anon_name)
-		anon_vma_name_put(madv_behavior->anon_name);
 out:
 	/*
 	 * madvise() returns EAGAIN if kernel resources, such as

-- 
2.50.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 2/4] mm, madvise: extract mm code from prctl_set_vma() to mm/madvise.c
  2025-06-24 13:03 [PATCH v2 0/4] madvise anon_name cleanups Vlastimil Babka
  2025-06-24 13:03 ` [PATCH v2 1/4] mm, madvise: simplify anon_name handling Vlastimil Babka
@ 2025-06-24 13:03 ` Vlastimil Babka
  2025-06-24 14:04   ` David Hildenbrand
  2025-06-24 15:28   ` Lorenzo Stoakes
  2025-06-24 13:03 ` [PATCH v2 3/4] mm, madvise: move madvise_set_anon_name() down the file Vlastimil Babka
  2025-06-24 13:03 ` [PATCH v2 4/4] mm, madvise: use standard madvise locking in madvise_set_anon_name() Vlastimil Babka
  3 siblings, 2 replies; 25+ messages in thread
From: Vlastimil Babka @ 2025-06-24 13:03 UTC (permalink / raw)
  To: Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
	David Hildenbrand, Jann Horn, Mike Rapoport, Suren Baghdasaryan,
	Michal Hocko, Colin Cross
  Cc: linux-mm, linux-kernel, Vlastimil Babka

Setting anon_name is done via madvise_set_anon_name() and behaves a lot
of like other madvise operations. However, apparently because madvise()
has lacked the 4th argument and prctl() not, the userspace entry point
has been implemented via prctl(PR_SET_VMA, ...) and handled first by
prctl_set_vma().

Currently prctl_set_vma() lives in kernel/sys.c but setting the
vma->anon_name is mm-specific code so extract it to a new
set_anon_vma_name() function under mm. mm/madvise.c seems to be the most
straightforward place as that's where madvise_set_anon_name() lives.
Stop declaring the latter in mm.h and instead declare
set_anon_vma_name().

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 include/linux/mm.h | 14 +++++++-------
 kernel/sys.c       | 50 +-------------------------------------------------
 mm/madvise.c       | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++--
 3 files changed, 58 insertions(+), 58 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0e0549f3d681f6c7a78e8dfa341a810e5a8f96c1..ef40f68c1183d4c95016575a4ee0171e12df9ba4 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4059,14 +4059,14 @@ unsigned long wp_shared_mapping_range(struct address_space *mapping,
 #endif
 
 #ifdef CONFIG_ANON_VMA_NAME
-int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
-			  unsigned long len_in,
-			  struct anon_vma_name *anon_name);
+int set_anon_vma_name(unsigned long addr, unsigned long size,
+		      const char __user *uname);
 #else
-static inline int
-madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
-		      unsigned long len_in, struct anon_vma_name *anon_name) {
-	return 0;
+static inline
+int set_anon_vma_name(unsigned long addr, unsigned long size,
+		      const char __user *uname)
+{
+	return -EINVAL;
 }
 #endif
 
diff --git a/kernel/sys.c b/kernel/sys.c
index adc0de0aa364aebb23999f621717a5d32599921c..b153fb345ada28ea1a33386a32bcce9cb1b23475 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2343,54 +2343,14 @@ int __weak arch_lock_shadow_stack_status(struct task_struct *t, unsigned long st
 
 #define PR_IO_FLUSHER (PF_MEMALLOC_NOIO | PF_LOCAL_THROTTLE)
 
-#ifdef CONFIG_ANON_VMA_NAME
-
-#define ANON_VMA_NAME_MAX_LEN		80
-#define ANON_VMA_NAME_INVALID_CHARS	"\\`$[]"
-
-static inline bool is_valid_name_char(char ch)
-{
-	/* printable ascii characters, excluding ANON_VMA_NAME_INVALID_CHARS */
-	return ch > 0x1f && ch < 0x7f &&
-		!strchr(ANON_VMA_NAME_INVALID_CHARS, ch);
-}
-
 static int prctl_set_vma(unsigned long opt, unsigned long addr,
 			 unsigned long size, unsigned long arg)
 {
-	struct mm_struct *mm = current->mm;
-	const char __user *uname;
-	struct anon_vma_name *anon_name = NULL;
 	int error;
 
 	switch (opt) {
 	case PR_SET_VMA_ANON_NAME:
-		uname = (const char __user *)arg;
-		if (uname) {
-			char *name, *pch;
-
-			name = strndup_user(uname, ANON_VMA_NAME_MAX_LEN);
-			if (IS_ERR(name))
-				return PTR_ERR(name);
-
-			for (pch = name; *pch != '\0'; pch++) {
-				if (!is_valid_name_char(*pch)) {
-					kfree(name);
-					return -EINVAL;
-				}
-			}
-			/* anon_vma has its own copy */
-			anon_name = anon_vma_name_alloc(name);
-			kfree(name);
-			if (!anon_name)
-				return -ENOMEM;
-
-		}
-
-		mmap_write_lock(mm);
-		error = madvise_set_anon_name(mm, addr, size, anon_name);
-		mmap_write_unlock(mm);
-		anon_vma_name_put(anon_name);
+		error = set_anon_vma_name(addr, size, (const char __user *)arg);
 		break;
 	default:
 		error = -EINVAL;
@@ -2399,14 +2359,6 @@ static int prctl_set_vma(unsigned long opt, unsigned long addr,
 	return error;
 }
 
-#else /* CONFIG_ANON_VMA_NAME */
-static int prctl_set_vma(unsigned long opt, unsigned long start,
-			 unsigned long size, unsigned long arg)
-{
-	return -EINVAL;
-}
-#endif /* CONFIG_ANON_VMA_NAME */
-
 static inline unsigned long get_current_mdwe(void)
 {
 	unsigned long ret = 0;
diff --git a/mm/madvise.c b/mm/madvise.c
index fca0e9b3e844ad766e83ac04cc0d7f4099c74005..7e8819b5e9a0f183213ffe19d7e52bd5fda5f49d 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -134,8 +134,8 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
 	return 0;
 }
 
-int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
-			  unsigned long len_in, struct anon_vma_name *anon_name)
+static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
+		unsigned long len_in, struct anon_vma_name *anon_name)
 {
 	unsigned long end;
 	unsigned long len;
@@ -2096,3 +2096,51 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
 out:
 	return ret;
 }
+
+#ifdef CONFIG_ANON_VMA_NAME
+
+#define ANON_VMA_NAME_MAX_LEN		80
+#define ANON_VMA_NAME_INVALID_CHARS	"\\`$[]"
+
+static inline bool is_valid_name_char(char ch)
+{
+	/* printable ascii characters, excluding ANON_VMA_NAME_INVALID_CHARS */
+	return ch > 0x1f && ch < 0x7f &&
+		!strchr(ANON_VMA_NAME_INVALID_CHARS, ch);
+}
+
+int set_anon_vma_name(unsigned long addr, unsigned long size,
+		      const char __user *uname)
+{
+	struct anon_vma_name *anon_name = NULL;
+	struct mm_struct *mm = current->mm;
+	int error;
+
+	if (uname) {
+		char *name, *pch;
+
+		name = strndup_user(uname, ANON_VMA_NAME_MAX_LEN);
+		if (IS_ERR(name))
+			return PTR_ERR(name);
+
+		for (pch = name; *pch != '\0'; pch++) {
+			if (!is_valid_name_char(*pch)) {
+				kfree(name);
+				return -EINVAL;
+			}
+		}
+		/* anon_vma has its own copy */
+		anon_name = anon_vma_name_alloc(name);
+		kfree(name);
+		if (!anon_name)
+			return -ENOMEM;
+	}
+
+	mmap_write_lock(mm);
+	error = madvise_set_anon_name(mm, addr, size, anon_name);
+	mmap_write_unlock(mm);
+	anon_vma_name_put(anon_name);
+
+	return error;
+}
+#endif

-- 
2.50.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 3/4] mm, madvise: move madvise_set_anon_name() down the file
  2025-06-24 13:03 [PATCH v2 0/4] madvise anon_name cleanups Vlastimil Babka
  2025-06-24 13:03 ` [PATCH v2 1/4] mm, madvise: simplify anon_name handling Vlastimil Babka
  2025-06-24 13:03 ` [PATCH v2 2/4] mm, madvise: extract mm code from prctl_set_vma() to mm/madvise.c Vlastimil Babka
@ 2025-06-24 13:03 ` Vlastimil Babka
  2025-06-24 14:05   ` David Hildenbrand
                     ` (2 more replies)
  2025-06-24 13:03 ` [PATCH v2 4/4] mm, madvise: use standard madvise locking in madvise_set_anon_name() Vlastimil Babka
  3 siblings, 3 replies; 25+ messages in thread
From: Vlastimil Babka @ 2025-06-24 13:03 UTC (permalink / raw)
  To: Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
	David Hildenbrand, Jann Horn, Mike Rapoport, Suren Baghdasaryan,
	Michal Hocko, Colin Cross
  Cc: linux-mm, linux-kernel, Vlastimil Babka

Preparatory change so that we can use madvise_lock()/unlock() in the
function without forward declarations or more thorough shuffling.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/madvise.c | 64 ++++++++++++++++++++++++++++++------------------------------
 1 file changed, 32 insertions(+), 32 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 7e8819b5e9a0f183213ffe19d7e52bd5fda5f49d..cae064479cdf908707c45b941bd03d43d095eab6 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -133,38 +133,6 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
 
 	return 0;
 }
-
-static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
-		unsigned long len_in, struct anon_vma_name *anon_name)
-{
-	unsigned long end;
-	unsigned long len;
-	struct madvise_behavior madv_behavior = {
-		.mm = mm,
-		.behavior = __MADV_SET_ANON_VMA_NAME,
-		.lock_mode = MADVISE_MMAP_WRITE_LOCK,
-		.anon_name = anon_name,
-	};
-
-	if (start & ~PAGE_MASK)
-		return -EINVAL;
-	len = (len_in + ~PAGE_MASK) & PAGE_MASK;
-
-	/* Check to see whether len was rounded up from small -ve to zero */
-	if (len_in && !len)
-		return -EINVAL;
-
-	end = start + len;
-	if (end < start)
-		return -EINVAL;
-
-	if (end == start)
-		return 0;
-
-	madv_behavior.range.start = start;
-	madv_behavior.range.end = end;
-	return madvise_walk_vmas(&madv_behavior);
-}
 #else /* CONFIG_ANON_VMA_NAME */
 static int replace_anon_vma_name(struct vm_area_struct *vma,
 				 struct anon_vma_name *anon_name)
@@ -2109,6 +2077,38 @@ static inline bool is_valid_name_char(char ch)
 		!strchr(ANON_VMA_NAME_INVALID_CHARS, ch);
 }
 
+static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
+		unsigned long len_in, struct anon_vma_name *anon_name)
+{
+	unsigned long end;
+	unsigned long len;
+	struct madvise_behavior madv_behavior = {
+		.mm = mm,
+		.behavior = __MADV_SET_ANON_VMA_NAME,
+		.lock_mode = MADVISE_MMAP_WRITE_LOCK,
+		.anon_name = anon_name,
+	};
+
+	if (start & ~PAGE_MASK)
+		return -EINVAL;
+	len = (len_in + ~PAGE_MASK) & PAGE_MASK;
+
+	/* Check to see whether len was rounded up from small -ve to zero */
+	if (len_in && !len)
+		return -EINVAL;
+
+	end = start + len;
+	if (end < start)
+		return -EINVAL;
+
+	if (end == start)
+		return 0;
+
+	madv_behavior.range.start = start;
+	madv_behavior.range.end = end;
+	return madvise_walk_vmas(&madv_behavior);
+}
+
 int set_anon_vma_name(unsigned long addr, unsigned long size,
 		      const char __user *uname)
 {

-- 
2.50.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 4/4] mm, madvise: use standard madvise locking in madvise_set_anon_name()
  2025-06-24 13:03 [PATCH v2 0/4] madvise anon_name cleanups Vlastimil Babka
                   ` (2 preceding siblings ...)
  2025-06-24 13:03 ` [PATCH v2 3/4] mm, madvise: move madvise_set_anon_name() down the file Vlastimil Babka
@ 2025-06-24 13:03 ` Vlastimil Babka
  2025-06-24 14:06   ` David Hildenbrand
                     ` (2 more replies)
  3 siblings, 3 replies; 25+ messages in thread
From: Vlastimil Babka @ 2025-06-24 13:03 UTC (permalink / raw)
  To: Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
	David Hildenbrand, Jann Horn, Mike Rapoport, Suren Baghdasaryan,
	Michal Hocko, Colin Cross
  Cc: linux-mm, linux-kernel, Vlastimil Babka

Use madvise_lock()/madvise_unlock() in madvise_set_anon_name() in the
same way as in do_madvise(). This narrows the lock scope a bit and
reuses existing functionality. get_lock_mode() already picks the correct
MADVISE_MMAP_WRITE_LOCK mode for __MADV_SET_ANON_VMA_NAME so we can just
remove the explicit assignment.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/madvise.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index cae064479cdf908707c45b941bd03d43d095eab6..ee02ccd0315a146cdb3001cd189e03be9e48a2ea 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -2082,10 +2082,10 @@ static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
 {
 	unsigned long end;
 	unsigned long len;
+	int error;
 	struct madvise_behavior madv_behavior = {
 		.mm = mm,
 		.behavior = __MADV_SET_ANON_VMA_NAME,
-		.lock_mode = MADVISE_MMAP_WRITE_LOCK,
 		.anon_name = anon_name,
 	};
 
@@ -2106,7 +2106,14 @@ static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
 
 	madv_behavior.range.start = start;
 	madv_behavior.range.end = end;
-	return madvise_walk_vmas(&madv_behavior);
+
+	error = madvise_lock(&madv_behavior);
+	if (error)
+		return error;
+	error = madvise_walk_vmas(&madv_behavior);
+	madvise_unlock(&madv_behavior);
+
+	return error;
 }
 
 int set_anon_vma_name(unsigned long addr, unsigned long size,
@@ -2136,9 +2143,7 @@ int set_anon_vma_name(unsigned long addr, unsigned long size,
 			return -ENOMEM;
 	}
 
-	mmap_write_lock(mm);
 	error = madvise_set_anon_name(mm, addr, size, anon_name);
-	mmap_write_unlock(mm);
 	anon_vma_name_put(anon_name);
 
 	return error;

-- 
2.50.0



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 1/4] mm, madvise: simplify anon_name handling
  2025-06-24 13:03 ` [PATCH v2 1/4] mm, madvise: simplify anon_name handling Vlastimil Babka
@ 2025-06-24 13:58   ` David Hildenbrand
  2025-06-24 14:28     ` Suren Baghdasaryan
  2025-06-24 16:38     ` Vlastimil Babka
  2025-06-24 15:26   ` Lorenzo Stoakes
  2025-06-24 17:01   ` Vlastimil Babka
  2 siblings, 2 replies; 25+ messages in thread
From: David Hildenbrand @ 2025-06-24 13:58 UTC (permalink / raw)
  To: Vlastimil Babka, Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
	Jann Horn, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Colin Cross
  Cc: linux-mm, linux-kernel

On 24.06.25 15:03, Vlastimil Babka wrote:
> Since the introduction in 9a10064f5625 ("mm: add a field to store names
> for private anonymous memory") the code to set anon_name on a vma has
> been using madvise_update_vma() to call replace_anon_vma_name(). Since
> the former is called also by a number of other madvise behaviours that
> do not set a new anon_name, they have been passing the existing
> anon_name of the vma to make replace_vma_anon_name() a no-op.
> 
> This is rather wasteful as it needs anon_vma_name_eq() to determine the
> no-op situations, and checks for when replace_vma_anon_name() is allowed
> (the vma is anon/shmem) duplicate the checks already done earlier in
> madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm:
> fix use-after-free when anon vma name is used after vma is freed")
> adding anon_name refcount get/put operations exactly to the cases that
> actually do not change anon_name - just so the replace_vma_anon_name()
> can keep safely determining it has nothing to do.
> 
> The recent madvise cleanups made this suboptimal handling very obvious,
> but happily also allow for an easy fix. madvise_update_vma() now has the
> complete information whether it's been called to set a new anon_name, so
> stop passing it the existing vma's name and doing the refcount get/put
> in its only caller madvise_vma_behavior().
> 
> In madvise_update_vma() itself, limit calling of replace_anon_vma_name()
> only to cases where we are setting a new name, otherwise we know it's a
> no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and
> can remove the duplicate checks for vma being anon/shmem that were done
> already in madvise_vma_behavior().
> 
> Additionally, by using vma_modify_flags() when not modifying the
> anon_name, avoid explicitly passing the existing vma's anon_name and
> storing a pointer to it in struct madv_behavior or a local variable.
> This prevents the danger of accessing a freed anon_name after vma
> merging, previously fixed by commit 942341dcc574.
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>   mm/madvise.c | 37 +++++++++++++------------------------
>   1 file changed, 13 insertions(+), 24 deletions(-)
> 
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 4491bf080f55d6d1aeffb2ff0b8fdd28904af950..fca0e9b3e844ad766e83ac04cc0d7f4099c74005 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -176,25 +176,29 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
>   }
>   #endif /* CONFIG_ANON_VMA_NAME */
>   /*
> - * Update the vm_flags on region of a vma, splitting it or merging it as
> - * necessary.  Must be called with mmap_lock held for writing;
> - * Caller should ensure anon_name stability by raising its refcount even when
> - * anon_name belongs to a valid vma because this function might free that vma.
> + * Update the vm_flags and/or anon_name on region of a vma, splitting it or
> + * merging it as necessary. Must be called with mmap_lock held for writing.
>    */
>   static int madvise_update_vma(vm_flags_t new_flags,
>   		struct madvise_behavior *madv_behavior)
>   {
> -	int error;
>   	struct vm_area_struct *vma = madv_behavior->vma;
>   	struct madvise_behavior_range *range = &madv_behavior->range;
>   	struct anon_vma_name *anon_name = madv_behavior->anon_name;
> +	bool set_new_anon_name = madv_behavior->behavior == __MADV_SET_ANON_VMA_NAME;
>   	VMA_ITERATOR(vmi, madv_behavior->mm, range->start);
>   
> -	if (new_flags == vma->vm_flags && anon_vma_name_eq(anon_vma_name(vma), anon_name))
> +	if (new_flags == vma->vm_flags && (!set_new_anon_name ||
> +			anon_vma_name_eq(anon_vma_name(vma), anon_name)))
>   		return 0;
>   
> -	vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma,
> +	if (set_new_anon_name)
> +		vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma,
>   			range->start, range->end, new_flags, anon_name);
> +	else
> +		vma = vma_modify_flags(&vmi, madv_behavior->prev, vma,
> +			range->start, range->end, new_flags);
> +
>   	if (IS_ERR(vma))
>   		return PTR_ERR(vma);
>   
> @@ -203,11 +207,8 @@ static int madvise_update_vma(vm_flags_t new_flags,
>   	/* vm_flags is protected by the mmap_lock held in write mode. */
>   	vma_start_write(vma);
>   	vm_flags_reset(vma, new_flags);
> -	if (!vma->vm_file || vma_is_anon_shmem(vma)) {
> -		error = replace_anon_vma_name(vma, anon_name);
> -		if (error)
> -			return error;
> -	}
> +	if (set_new_anon_name)
> +		return replace_anon_vma_name(vma, anon_name);

Took me a second to find where this is already checked (-> 
madvise_vma_behavior()).

:)

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 2/4] mm, madvise: extract mm code from prctl_set_vma() to mm/madvise.c
  2025-06-24 13:03 ` [PATCH v2 2/4] mm, madvise: extract mm code from prctl_set_vma() to mm/madvise.c Vlastimil Babka
@ 2025-06-24 14:04   ` David Hildenbrand
  2025-06-24 14:31     ` Suren Baghdasaryan
  2025-06-24 15:28   ` Lorenzo Stoakes
  1 sibling, 1 reply; 25+ messages in thread
From: David Hildenbrand @ 2025-06-24 14:04 UTC (permalink / raw)
  To: Vlastimil Babka, Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
	Jann Horn, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Colin Cross
  Cc: linux-mm, linux-kernel

On 24.06.25 15:03, Vlastimil Babka wrote:
> Setting anon_name is done via madvise_set_anon_name() and behaves a lot
> of like other madvise operations. However, apparently because madvise()
> has lacked the 4th argument and prctl() not, the userspace entry point
> has been implemented via prctl(PR_SET_VMA, ...) and handled first by
> prctl_set_vma().
> 
> Currently prctl_set_vma() lives in kernel/sys.c but setting the
> vma->anon_name is mm-specific code so extract it to a new
> set_anon_vma_name() function under mm. mm/madvise.c seems to be the most
> straightforward place as that's where madvise_set_anon_name() lives.
> Stop declaring the latter in mm.h and instead declare
> set_anon_vma_name().
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>   include/linux/mm.h | 14 +++++++-------
>   kernel/sys.c       | 50 +-------------------------------------------------
>   mm/madvise.c       | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++--
>   3 files changed, 58 insertions(+), 58 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 0e0549f3d681f6c7a78e8dfa341a810e5a8f96c1..ef40f68c1183d4c95016575a4ee0171e12df9ba4 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -4059,14 +4059,14 @@ unsigned long wp_shared_mapping_range(struct address_space *mapping,
>   #endif
>   
>   #ifdef CONFIG_ANON_VMA_NAME
> -int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> -			  unsigned long len_in,
> -			  struct anon_vma_name *anon_name);
> +int set_anon_vma_name(unsigned long addr, unsigned long size,
> +		      const char __user *uname);
>   #else
> -static inline int
> -madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> -		      unsigned long len_in, struct anon_vma_name *anon_name) {
> -	return 0;
> +static inline
> +int set_anon_vma_name(unsigned long addr, unsigned long size,
> +		      const char __user *uname)
> +{
> +	return -EINVAL;
>   }
>   #endif
>   
> diff --git a/kernel/sys.c b/kernel/sys.c
> index adc0de0aa364aebb23999f621717a5d32599921c..b153fb345ada28ea1a33386a32bcce9cb1b23475 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -2343,54 +2343,14 @@ int __weak arch_lock_shadow_stack_status(struct task_struct *t, unsigned long st
>   
>   #define PR_IO_FLUSHER (PF_MEMALLOC_NOIO | PF_LOCAL_THROTTLE)
>   
> -#ifdef CONFIG_ANON_VMA_NAME
> -
> -#define ANON_VMA_NAME_MAX_LEN		80
> -#define ANON_VMA_NAME_INVALID_CHARS	"\\`$[]"
> -
> -static inline bool is_valid_name_char(char ch)
> -{
> -	/* printable ascii characters, excluding ANON_VMA_NAME_INVALID_CHARS */
> -	return ch > 0x1f && ch < 0x7f &&
> -		!strchr(ANON_VMA_NAME_INVALID_CHARS, ch);
> -}
> -
>   static int prctl_set_vma(unsigned long opt, unsigned long addr,
>   			 unsigned long size, unsigned long arg)
>   {
> -	struct mm_struct *mm = current->mm;
> -	const char __user *uname;
> -	struct anon_vma_name *anon_name = NULL;
>   	int error;
>   
>   	switch (opt) {
>   	case PR_SET_VMA_ANON_NAME:
> -		uname = (const char __user *)arg;
> -		if (uname) {
> -			char *name, *pch;
> -
> -			name = strndup_user(uname, ANON_VMA_NAME_MAX_LEN);
> -			if (IS_ERR(name))
> -				return PTR_ERR(name);
> -
> -			for (pch = name; *pch != '\0'; pch++) {
> -				if (!is_valid_name_char(*pch)) {
> -					kfree(name);
> -					return -EINVAL;
> -				}
> -			}
> -			/* anon_vma has its own copy */
> -			anon_name = anon_vma_name_alloc(name);
> -			kfree(name);
> -			if (!anon_name)
> -				return -ENOMEM;
> -
> -		}
> -
> -		mmap_write_lock(mm);
> -		error = madvise_set_anon_name(mm, addr, size, anon_name);
> -		mmap_write_unlock(mm);
> -		anon_vma_name_put(anon_name);
> +		error = set_anon_vma_name(addr, size, (const char __user *)arg);

At first I thought whether passing current->mm as an argument might make 
it clearer on what we actually operate. But then, "anon_vma" might give 
a good hint.

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 3/4] mm, madvise: move madvise_set_anon_name() down the file
  2025-06-24 13:03 ` [PATCH v2 3/4] mm, madvise: move madvise_set_anon_name() down the file Vlastimil Babka
@ 2025-06-24 14:05   ` David Hildenbrand
  2025-06-24 14:33     ` Suren Baghdasaryan
  2025-06-24 16:46     ` Vlastimil Babka
  2025-06-24 15:36   ` Lorenzo Stoakes
  2025-06-24 17:02   ` Vlastimil Babka
  2 siblings, 2 replies; 25+ messages in thread
From: David Hildenbrand @ 2025-06-24 14:05 UTC (permalink / raw)
  To: Vlastimil Babka, Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
	Jann Horn, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Colin Cross
  Cc: linux-mm, linux-kernel

On 24.06.25 15:03, Vlastimil Babka wrote:
> Preparatory change so that we can use madvise_lock()/unlock() in the
> function without forward declarations or more thorough shuffling.
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>   mm/madvise.c | 64 ++++++++++++++++++++++++++++++------------------------------
>   1 file changed, 32 insertions(+), 32 deletions(-)
> 
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 7e8819b5e9a0f183213ffe19d7e52bd5fda5f49d..cae064479cdf908707c45b941bd03d43d095eab6 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -133,38 +133,6 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
>   
>   	return 0;
>   }
> -
> -static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> -		unsigned long len_in, struct anon_vma_name *anon_name)
> -{
> -	unsigned long end;
> -	unsigned long len;
> -	struct madvise_behavior madv_behavior = {
> -		.mm = mm,
> -		.behavior = __MADV_SET_ANON_VMA_NAME,
> -		.lock_mode = MADVISE_MMAP_WRITE_LOCK,
> -		.anon_name = anon_name,
> -	};
> -
> -	if (start & ~PAGE_MASK)
> -		return -EINVAL;
> -	len = (len_in + ~PAGE_MASK) & PAGE_MASK;
> -
> -	/* Check to see whether len was rounded up from small -ve to zero */
> -	if (len_in && !len)
> -		return -EINVAL;
> -
> -	end = start + len;
> -	if (end < start)
> -		return -EINVAL;
> -
> -	if (end == start)
> -		return 0;
> -
> -	madv_behavior.range.start = start;
> -	madv_behavior.range.end = end;
> -	return madvise_walk_vmas(&madv_behavior);
> -}
>   #else /* CONFIG_ANON_VMA_NAME */
>   static int replace_anon_vma_name(struct vm_area_struct *vma,
>   				 struct anon_vma_name *anon_name)
> @@ -2109,6 +2077,38 @@ static inline bool is_valid_name_char(char ch)
>   		!strchr(ANON_VMA_NAME_INVALID_CHARS, ch);
>   }
>   
> +static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> +		unsigned long len_in, struct anon_vma_name *anon_name)
> +{
> +	unsigned long end;
> +	unsigned long len;
> +	struct madvise_behavior madv_behavior = {
> +		.mm = mm,
> +		.behavior = __MADV_SET_ANON_VMA_NAME,
> +		.lock_mode = MADVISE_MMAP_WRITE_LOCK,
> +		.anon_name = anon_name,
> +	};
> +
> +	if (start & ~PAGE_MASK)
> +		return -EINVAL;
> +	len = (len_in + ~PAGE_MASK) & PAGE_MASK;
> +
> +	/* Check to see whether len was rounded up from small -ve to zero */
> +	if (len_in && !len)
> +		return -EINVAL;
> +
> +	end = start + len;
> +	if (end < start)
> +		return -EINVAL;
> +
> +	if (end == start)
> +		return 0;
> +
> +	madv_behavior.range.start = start;
> +	madv_behavior.range.end = end;
> +	return madvise_walk_vmas(&madv_behavior);
> +}
> +
>   int set_anon_vma_name(unsigned long addr, unsigned long size,
>   		      const char __user *uname)
>   {
> 

Personally, I would squash that into #4, given that #4 is pretty small ;)

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 4/4] mm, madvise: use standard madvise locking in madvise_set_anon_name()
  2025-06-24 13:03 ` [PATCH v2 4/4] mm, madvise: use standard madvise locking in madvise_set_anon_name() Vlastimil Babka
@ 2025-06-24 14:06   ` David Hildenbrand
  2025-06-24 14:35     ` Suren Baghdasaryan
  2025-06-24 15:45   ` Lorenzo Stoakes
  2025-06-24 17:03   ` Vlastimil Babka
  2 siblings, 1 reply; 25+ messages in thread
From: David Hildenbrand @ 2025-06-24 14:06 UTC (permalink / raw)
  To: Vlastimil Babka, Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
	Jann Horn, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Colin Cross
  Cc: linux-mm, linux-kernel

On 24.06.25 15:03, Vlastimil Babka wrote:
> Use madvise_lock()/madvise_unlock() in madvise_set_anon_name() in the
> same way as in do_madvise(). This narrows the lock scope a bit and
> reuses existing functionality. get_lock_mode() already picks the correct
> MADVISE_MMAP_WRITE_LOCK mode for __MADV_SET_ANON_VMA_NAME so we can just
> remove the explicit assignment.
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 1/4] mm, madvise: simplify anon_name handling
  2025-06-24 13:58   ` David Hildenbrand
@ 2025-06-24 14:28     ` Suren Baghdasaryan
  2025-06-24 16:41       ` Vlastimil Babka
  2025-06-24 16:38     ` Vlastimil Babka
  1 sibling, 1 reply; 25+ messages in thread
From: Suren Baghdasaryan @ 2025-06-24 14:28 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Vlastimil Babka, Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
	Jann Horn, Mike Rapoport, Michal Hocko, Colin Cross, linux-mm,
	linux-kernel

On Tue, Jun 24, 2025 at 6:58 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 24.06.25 15:03, Vlastimil Babka wrote:
> > Since the introduction in 9a10064f5625 ("mm: add a field to store names
> > for private anonymous memory") the code to set anon_name on a vma has
> > been using madvise_update_vma() to call replace_anon_vma_name(). Since
> > the former is called also by a number of other madvise behaviours that
> > do not set a new anon_name, they have been passing the existing
> > anon_name of the vma to make replace_vma_anon_name() a no-op.

s/replace_vma_anon_name/replace_anon_vma_name

> >
> > This is rather wasteful as it needs anon_vma_name_eq() to determine the
> > no-op situations, and checks for when replace_vma_anon_name() is allowed

s/replace_vma_anon_name/replace_anon_vma_name

> > (the vma is anon/shmem) duplicate the checks already done earlier in
> > madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm:
> > fix use-after-free when anon vma name is used after vma is freed")
> > adding anon_name refcount get/put operations exactly to the cases that
> > actually do not change anon_name - just so the replace_vma_anon_name()

s/replace_vma_anon_name/replace_anon_vma_name

> > can keep safely determining it has nothing to do.
> >
> > The recent madvise cleanups made this suboptimal handling very obvious,
> > but happily also allow for an easy fix. madvise_update_vma() now has the
> > complete information whether it's been called to set a new anon_name, so
> > stop passing it the existing vma's name and doing the refcount get/put
> > in its only caller madvise_vma_behavior().
> >
> > In madvise_update_vma() itself, limit calling of replace_anon_vma_name()
> > only to cases where we are setting a new name, otherwise we know it's a
> > no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and
> > can remove the duplicate checks for vma being anon/shmem that were done
> > already in madvise_vma_behavior().
> >
> > Additionally, by using vma_modify_flags() when not modifying the
> > anon_name, avoid explicitly passing the existing vma's anon_name and
> > storing a pointer to it in struct madv_behavior or a local variable.
> > This prevents the danger of accessing a freed anon_name after vma
> > merging, previously fixed by commit 942341dcc574.
> >
> > Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> > ---
> >   mm/madvise.c | 37 +++++++++++++------------------------
> >   1 file changed, 13 insertions(+), 24 deletions(-)
> >
> > diff --git a/mm/madvise.c b/mm/madvise.c
> > index 4491bf080f55d6d1aeffb2ff0b8fdd28904af950..fca0e9b3e844ad766e83ac04cc0d7f4099c74005 100644
> > --- a/mm/madvise.c
> > +++ b/mm/madvise.c
> > @@ -176,25 +176,29 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
> >   }
> >   #endif /* CONFIG_ANON_VMA_NAME */
> >   /*
> > - * Update the vm_flags on region of a vma, splitting it or merging it as
> > - * necessary.  Must be called with mmap_lock held for writing;
> > - * Caller should ensure anon_name stability by raising its refcount even when
> > - * anon_name belongs to a valid vma because this function might free that vma.
> > + * Update the vm_flags and/or anon_name on region of a vma, splitting it or
> > + * merging it as necessary. Must be called with mmap_lock held for writing.
> >    */
> >   static int madvise_update_vma(vm_flags_t new_flags,
> >               struct madvise_behavior *madv_behavior)
> >   {
> > -     int error;
> >       struct vm_area_struct *vma = madv_behavior->vma;
> >       struct madvise_behavior_range *range = &madv_behavior->range;
> >       struct anon_vma_name *anon_name = madv_behavior->anon_name;
> > +     bool set_new_anon_name = madv_behavior->behavior == __MADV_SET_ANON_VMA_NAME;
> >       VMA_ITERATOR(vmi, madv_behavior->mm, range->start);
> >
> > -     if (new_flags == vma->vm_flags && anon_vma_name_eq(anon_vma_name(vma), anon_name))
> > +     if (new_flags == vma->vm_flags && (!set_new_anon_name ||
> > +                     anon_vma_name_eq(anon_vma_name(vma), anon_name)))
> >               return 0;
> >
> > -     vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma,
> > +     if (set_new_anon_name)
> > +             vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma,
> >                       range->start, range->end, new_flags, anon_name);
> > +     else
> > +             vma = vma_modify_flags(&vmi, madv_behavior->prev, vma,
> > +                     range->start, range->end, new_flags);
> > +
> >       if (IS_ERR(vma))
> >               return PTR_ERR(vma);
> >
> > @@ -203,11 +207,8 @@ static int madvise_update_vma(vm_flags_t new_flags,
> >       /* vm_flags is protected by the mmap_lock held in write mode. */
> >       vma_start_write(vma);
> >       vm_flags_reset(vma, new_flags);
> > -     if (!vma->vm_file || vma_is_anon_shmem(vma)) {
> > -             error = replace_anon_vma_name(vma, anon_name);
> > -             if (error)
> > -                     return error;
> > -     }
> > +     if (set_new_anon_name)
> > +             return replace_anon_vma_name(vma, anon_name);
>
> Took me a second to find where this is already checked (->
> madvise_vma_behavior()).
>
> :)
>
> Acked-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Suren Baghdasaryan <surenb@google.com>

>
> --
> Cheers,
>
> David / dhildenb
>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 2/4] mm, madvise: extract mm code from prctl_set_vma() to mm/madvise.c
  2025-06-24 14:04   ` David Hildenbrand
@ 2025-06-24 14:31     ` Suren Baghdasaryan
  0 siblings, 0 replies; 25+ messages in thread
From: Suren Baghdasaryan @ 2025-06-24 14:31 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Vlastimil Babka, Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
	Jann Horn, Mike Rapoport, Michal Hocko, Colin Cross, linux-mm,
	linux-kernel

On Tue, Jun 24, 2025 at 7:04 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 24.06.25 15:03, Vlastimil Babka wrote:
> > Setting anon_name is done via madvise_set_anon_name() and behaves a lot
> > of like other madvise operations. However, apparently because madvise()
> > has lacked the 4th argument and prctl() not, the userspace entry point
> > has been implemented via prctl(PR_SET_VMA, ...) and handled first by
> > prctl_set_vma().
> >
> > Currently prctl_set_vma() lives in kernel/sys.c but setting the
> > vma->anon_name is mm-specific code so extract it to a new
> > set_anon_vma_name() function under mm. mm/madvise.c seems to be the most
> > straightforward place as that's where madvise_set_anon_name() lives.
> > Stop declaring the latter in mm.h and instead declare
> > set_anon_vma_name().
> >
> > Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> > ---
> >   include/linux/mm.h | 14 +++++++-------
> >   kernel/sys.c       | 50 +-------------------------------------------------
> >   mm/madvise.c       | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++--
> >   3 files changed, 58 insertions(+), 58 deletions(-)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 0e0549f3d681f6c7a78e8dfa341a810e5a8f96c1..ef40f68c1183d4c95016575a4ee0171e12df9ba4 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -4059,14 +4059,14 @@ unsigned long wp_shared_mapping_range(struct address_space *mapping,
> >   #endif
> >
> >   #ifdef CONFIG_ANON_VMA_NAME
> > -int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> > -                       unsigned long len_in,
> > -                       struct anon_vma_name *anon_name);
> > +int set_anon_vma_name(unsigned long addr, unsigned long size,
> > +                   const char __user *uname);
> >   #else
> > -static inline int
> > -madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> > -                   unsigned long len_in, struct anon_vma_name *anon_name) {
> > -     return 0;
> > +static inline
> > +int set_anon_vma_name(unsigned long addr, unsigned long size,
> > +                   const char __user *uname)
> > +{
> > +     return -EINVAL;
> >   }
> >   #endif
> >
> > diff --git a/kernel/sys.c b/kernel/sys.c
> > index adc0de0aa364aebb23999f621717a5d32599921c..b153fb345ada28ea1a33386a32bcce9cb1b23475 100644
> > --- a/kernel/sys.c
> > +++ b/kernel/sys.c
> > @@ -2343,54 +2343,14 @@ int __weak arch_lock_shadow_stack_status(struct task_struct *t, unsigned long st
> >
> >   #define PR_IO_FLUSHER (PF_MEMALLOC_NOIO | PF_LOCAL_THROTTLE)
> >
> > -#ifdef CONFIG_ANON_VMA_NAME
> > -
> > -#define ANON_VMA_NAME_MAX_LEN                80
> > -#define ANON_VMA_NAME_INVALID_CHARS  "\\`$[]"
> > -
> > -static inline bool is_valid_name_char(char ch)
> > -{
> > -     /* printable ascii characters, excluding ANON_VMA_NAME_INVALID_CHARS */
> > -     return ch > 0x1f && ch < 0x7f &&
> > -             !strchr(ANON_VMA_NAME_INVALID_CHARS, ch);
> > -}
> > -
> >   static int prctl_set_vma(unsigned long opt, unsigned long addr,
> >                        unsigned long size, unsigned long arg)
> >   {
> > -     struct mm_struct *mm = current->mm;
> > -     const char __user *uname;
> > -     struct anon_vma_name *anon_name = NULL;
> >       int error;
> >
> >       switch (opt) {
> >       case PR_SET_VMA_ANON_NAME:
> > -             uname = (const char __user *)arg;
> > -             if (uname) {
> > -                     char *name, *pch;
> > -
> > -                     name = strndup_user(uname, ANON_VMA_NAME_MAX_LEN);
> > -                     if (IS_ERR(name))
> > -                             return PTR_ERR(name);
> > -
> > -                     for (pch = name; *pch != '\0'; pch++) {
> > -                             if (!is_valid_name_char(*pch)) {
> > -                                     kfree(name);
> > -                                     return -EINVAL;
> > -                             }
> > -                     }
> > -                     /* anon_vma has its own copy */
> > -                     anon_name = anon_vma_name_alloc(name);
> > -                     kfree(name);
> > -                     if (!anon_name)
> > -                             return -ENOMEM;
> > -
> > -             }
> > -
> > -             mmap_write_lock(mm);
> > -             error = madvise_set_anon_name(mm, addr, size, anon_name);
> > -             mmap_write_unlock(mm);
> > -             anon_vma_name_put(anon_name);
> > +             error = set_anon_vma_name(addr, size, (const char __user *)arg);
>
> At first I thought whether passing current->mm as an argument might make
> it clearer on what we actually operate. But then, "anon_vma" might give
> a good hint.
>
> Acked-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Suren Baghdasaryan <surenb@google.com>

>
> --
> Cheers,
>
> David / dhildenb
>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 3/4] mm, madvise: move madvise_set_anon_name() down the file
  2025-06-24 14:05   ` David Hildenbrand
@ 2025-06-24 14:33     ` Suren Baghdasaryan
  2025-06-24 16:46     ` Vlastimil Babka
  1 sibling, 0 replies; 25+ messages in thread
From: Suren Baghdasaryan @ 2025-06-24 14:33 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Vlastimil Babka, Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
	Jann Horn, Mike Rapoport, Michal Hocko, Colin Cross, linux-mm,
	linux-kernel

On Tue, Jun 24, 2025 at 7:05 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 24.06.25 15:03, Vlastimil Babka wrote:
> > Preparatory change so that we can use madvise_lock()/unlock() in the
> > function without forward declarations or more thorough shuffling.

If you respin the series please add a note that there is no functional
change here.

> >
> > Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> > ---
> >   mm/madvise.c | 64 ++++++++++++++++++++++++++++++------------------------------
> >   1 file changed, 32 insertions(+), 32 deletions(-)
> >
> > diff --git a/mm/madvise.c b/mm/madvise.c
> > index 7e8819b5e9a0f183213ffe19d7e52bd5fda5f49d..cae064479cdf908707c45b941bd03d43d095eab6 100644
> > --- a/mm/madvise.c
> > +++ b/mm/madvise.c
> > @@ -133,38 +133,6 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
> >
> >       return 0;
> >   }
> > -
> > -static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> > -             unsigned long len_in, struct anon_vma_name *anon_name)
> > -{
> > -     unsigned long end;
> > -     unsigned long len;
> > -     struct madvise_behavior madv_behavior = {
> > -             .mm = mm,
> > -             .behavior = __MADV_SET_ANON_VMA_NAME,
> > -             .lock_mode = MADVISE_MMAP_WRITE_LOCK,
> > -             .anon_name = anon_name,
> > -     };
> > -
> > -     if (start & ~PAGE_MASK)
> > -             return -EINVAL;
> > -     len = (len_in + ~PAGE_MASK) & PAGE_MASK;
> > -
> > -     /* Check to see whether len was rounded up from small -ve to zero */
> > -     if (len_in && !len)
> > -             return -EINVAL;
> > -
> > -     end = start + len;
> > -     if (end < start)
> > -             return -EINVAL;
> > -
> > -     if (end == start)
> > -             return 0;
> > -
> > -     madv_behavior.range.start = start;
> > -     madv_behavior.range.end = end;
> > -     return madvise_walk_vmas(&madv_behavior);
> > -}
> >   #else /* CONFIG_ANON_VMA_NAME */
> >   static int replace_anon_vma_name(struct vm_area_struct *vma,
> >                                struct anon_vma_name *anon_name)
> > @@ -2109,6 +2077,38 @@ static inline bool is_valid_name_char(char ch)
> >               !strchr(ANON_VMA_NAME_INVALID_CHARS, ch);
> >   }
> >
> > +static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> > +             unsigned long len_in, struct anon_vma_name *anon_name)
> > +{
> > +     unsigned long end;
> > +     unsigned long len;
> > +     struct madvise_behavior madv_behavior = {
> > +             .mm = mm,
> > +             .behavior = __MADV_SET_ANON_VMA_NAME,
> > +             .lock_mode = MADVISE_MMAP_WRITE_LOCK,
> > +             .anon_name = anon_name,
> > +     };
> > +
> > +     if (start & ~PAGE_MASK)
> > +             return -EINVAL;
> > +     len = (len_in + ~PAGE_MASK) & PAGE_MASK;
> > +
> > +     /* Check to see whether len was rounded up from small -ve to zero */
> > +     if (len_in && !len)
> > +             return -EINVAL;
> > +
> > +     end = start + len;
> > +     if (end < start)
> > +             return -EINVAL;
> > +
> > +     if (end == start)
> > +             return 0;
> > +
> > +     madv_behavior.range.start = start;
> > +     madv_behavior.range.end = end;
> > +     return madvise_walk_vmas(&madv_behavior);
> > +}
> > +
> >   int set_anon_vma_name(unsigned long addr, unsigned long size,
> >                     const char __user *uname)
> >   {
> >
>
> Personally, I would squash that into #4, given that #4 is pretty small ;)
>
> Acked-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Suren Baghdasaryan <surenb@google.com>

>
> --
> Cheers,
>
> David / dhildenb
>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 4/4] mm, madvise: use standard madvise locking in madvise_set_anon_name()
  2025-06-24 14:06   ` David Hildenbrand
@ 2025-06-24 14:35     ` Suren Baghdasaryan
  0 siblings, 0 replies; 25+ messages in thread
From: Suren Baghdasaryan @ 2025-06-24 14:35 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Vlastimil Babka, Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
	Jann Horn, Mike Rapoport, Michal Hocko, Colin Cross, linux-mm,
	linux-kernel

On Tue, Jun 24, 2025 at 7:06 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 24.06.25 15:03, Vlastimil Babka wrote:
> > Use madvise_lock()/madvise_unlock() in madvise_set_anon_name() in the
> > same way as in do_madvise(). This narrows the lock scope a bit and
> > reuses existing functionality. get_lock_mode() already picks the correct
> > MADVISE_MMAP_WRITE_LOCK mode for __MADV_SET_ANON_VMA_NAME so we can just
> > remove the explicit assignment.
> >
> > Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> > ---
>
> Acked-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Suren Baghdasaryan <surenb@google.com>

>
> --
> Cheers,
>
> David / dhildenb
>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 1/4] mm, madvise: simplify anon_name handling
  2025-06-24 13:03 ` [PATCH v2 1/4] mm, madvise: simplify anon_name handling Vlastimil Babka
  2025-06-24 13:58   ` David Hildenbrand
@ 2025-06-24 15:26   ` Lorenzo Stoakes
  2025-06-24 16:42     ` Vlastimil Babka
  2025-06-24 17:01   ` Vlastimil Babka
  2 siblings, 1 reply; 25+ messages in thread
From: Lorenzo Stoakes @ 2025-06-24 15:26 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Liam R. Howlett, David Hildenbrand, Jann Horn,
	Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Colin Cross,
	linux-mm, linux-kernel

On Tue, Jun 24, 2025 at 03:03:45PM +0200, Vlastimil Babka wrote:
> Since the introduction in 9a10064f5625 ("mm: add a field to store names
> for private anonymous memory") the code to set anon_name on a vma has
> been using madvise_update_vma() to call replace_anon_vma_name(). Since
> the former is called also by a number of other madvise behaviours that
> do not set a new anon_name, they have been passing the existing
> anon_name of the vma to make replace_vma_anon_name() a no-op.
>
> This is rather wasteful as it needs anon_vma_name_eq() to determine the
> no-op situations, and checks for when replace_vma_anon_name() is allowed
> (the vma is anon/shmem) duplicate the checks already done earlier in
> madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm:
> fix use-after-free when anon vma name is used after vma is freed")
> adding anon_name refcount get/put operations exactly to the cases that
> actually do not change anon_name - just so the replace_vma_anon_name()
> can keep safely determining it has nothing to do.
>
> The recent madvise cleanups made this suboptimal handling very obvious,
> but happily also allow for an easy fix. madvise_update_vma() now has the
> complete information whether it's been called to set a new anon_name, so
> stop passing it the existing vma's name and doing the refcount get/put
> in its only caller madvise_vma_behavior().
>
> In madvise_update_vma() itself, limit calling of replace_anon_vma_name()
> only to cases where we are setting a new name, otherwise we know it's a
> no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and
> can remove the duplicate checks for vma being anon/shmem that were done
> already in madvise_vma_behavior().
>
> Additionally, by using vma_modify_flags() when not modifying the
> anon_name, avoid explicitly passing the existing vma's anon_name and
> storing a pointer to it in struct madv_behavior or a local variable.
> This prevents the danger of accessing a freed anon_name after vma
> merging, previously fixed by commit 942341dcc574.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

Cheers, LGTM so:

Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

I made sure to do some stress-ng --madvise testing with this series (+ my recent
fix) applied :P all good.

> ---
>  mm/madvise.c | 37 +++++++++++++------------------------
>  1 file changed, 13 insertions(+), 24 deletions(-)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 4491bf080f55d6d1aeffb2ff0b8fdd28904af950..fca0e9b3e844ad766e83ac04cc0d7f4099c74005 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -176,25 +176,29 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
>  }
>  #endif /* CONFIG_ANON_VMA_NAME */
>  /*
> - * Update the vm_flags on region of a vma, splitting it or merging it as
> - * necessary.  Must be called with mmap_lock held for writing;
> - * Caller should ensure anon_name stability by raising its refcount even when
> - * anon_name belongs to a valid vma because this function might free that vma.
> + * Update the vm_flags and/or anon_name on region of a vma, splitting it or
> + * merging it as necessary. Must be called with mmap_lock held for writing.
>   */
>  static int madvise_update_vma(vm_flags_t new_flags,
>  		struct madvise_behavior *madv_behavior)
>  {
> -	int error;
>  	struct vm_area_struct *vma = madv_behavior->vma;
>  	struct madvise_behavior_range *range = &madv_behavior->range;
>  	struct anon_vma_name *anon_name = madv_behavior->anon_name;
> +	bool set_new_anon_name = madv_behavior->behavior == __MADV_SET_ANON_VMA_NAME;
>  	VMA_ITERATOR(vmi, madv_behavior->mm, range->start);
>
> -	if (new_flags == vma->vm_flags && anon_vma_name_eq(anon_vma_name(vma), anon_name))
> +	if (new_flags == vma->vm_flags && (!set_new_anon_name ||
> +			anon_vma_name_eq(anon_vma_name(vma), anon_name)))
>  		return 0;
>
> -	vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma,
> +	if (set_new_anon_name)
> +		vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma,
>  			range->start, range->end, new_flags, anon_name);

I will do a follow up (doesn't really belong here I'd say as involves change to
modify code) that makes this vma_modify_name() or vma_modify_anon_name() as this
is the only caller and we don't care about flags here :)

> +	else
> +		vma = vma_modify_flags(&vmi, madv_behavior->prev, vma,
> +			range->start, range->end, new_flags);
> +
>  	if (IS_ERR(vma))
>  		return PTR_ERR(vma);
>
> @@ -203,11 +207,8 @@ static int madvise_update_vma(vm_flags_t new_flags,
>  	/* vm_flags is protected by the mmap_lock held in write mode. */
>  	vma_start_write(vma);
>  	vm_flags_reset(vma, new_flags);
> -	if (!vma->vm_file || vma_is_anon_shmem(vma)) {
> -		error = replace_anon_vma_name(vma, anon_name);
> -		if (error)
> -			return error;
> -	}
> +	if (set_new_anon_name)
> +		return replace_anon_vma_name(vma, anon_name);
>
>  	return 0;
>  }
> @@ -1313,7 +1314,6 @@ static int madvise_vma_behavior(struct madvise_behavior *madv_behavior)
>  	int behavior = madv_behavior->behavior;
>  	struct vm_area_struct *vma = madv_behavior->vma;
>  	vm_flags_t new_flags = vma->vm_flags;
> -	bool set_new_anon_name = behavior == __MADV_SET_ANON_VMA_NAME;
>  	struct madvise_behavior_range *range = &madv_behavior->range;
>  	int error;
>
> @@ -1403,18 +1403,7 @@ static int madvise_vma_behavior(struct madvise_behavior *madv_behavior)
>  	/* This is a write operation.*/
>  	VM_WARN_ON_ONCE(madv_behavior->lock_mode != MADVISE_MMAP_WRITE_LOCK);
>
> -	/*
> -	 * madvise_update_vma() might cause a VMA merge which could put an
> -	 * anon_vma_name, so we must hold an additional reference on the
> -	 * anon_vma_name so it doesn't disappear from under us.
> -	 */
> -	if (!set_new_anon_name) {
> -		madv_behavior->anon_name = anon_vma_name(vma);
> -		anon_vma_name_get(madv_behavior->anon_name);
> -	}
>  	error = madvise_update_vma(new_flags, madv_behavior);
> -	if (!set_new_anon_name)
> -		anon_vma_name_put(madv_behavior->anon_name);
>  out:
>  	/*
>  	 * madvise() returns EAGAIN if kernel resources, such as
>
> --
> 2.50.0
>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 2/4] mm, madvise: extract mm code from prctl_set_vma() to mm/madvise.c
  2025-06-24 13:03 ` [PATCH v2 2/4] mm, madvise: extract mm code from prctl_set_vma() to mm/madvise.c Vlastimil Babka
  2025-06-24 14:04   ` David Hildenbrand
@ 2025-06-24 15:28   ` Lorenzo Stoakes
  1 sibling, 0 replies; 25+ messages in thread
From: Lorenzo Stoakes @ 2025-06-24 15:28 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Liam R. Howlett, David Hildenbrand, Jann Horn,
	Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Colin Cross,
	linux-mm, linux-kernel

On Tue, Jun 24, 2025 at 03:03:46PM +0200, Vlastimil Babka wrote:
> Setting anon_name is done via madvise_set_anon_name() and behaves a lot
> of like other madvise operations. However, apparently because madvise()
> has lacked the 4th argument and prctl() not, the userspace entry point
> has been implemented via prctl(PR_SET_VMA, ...) and handled first by
> prctl_set_vma().
>
> Currently prctl_set_vma() lives in kernel/sys.c but setting the
> vma->anon_name is mm-specific code so extract it to a new
> set_anon_vma_name() function under mm. mm/madvise.c seems to be the most
> straightforward place as that's where madvise_set_anon_name() lives.
> Stop declaring the latter in mm.h and instead declare
> set_anon_vma_name().
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

LGTM, so:

Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

Thanks for doing this! :)

> ---
>  include/linux/mm.h | 14 +++++++-------
>  kernel/sys.c       | 50 +-------------------------------------------------
>  mm/madvise.c       | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++--
>  3 files changed, 58 insertions(+), 58 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 0e0549f3d681f6c7a78e8dfa341a810e5a8f96c1..ef40f68c1183d4c95016575a4ee0171e12df9ba4 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -4059,14 +4059,14 @@ unsigned long wp_shared_mapping_range(struct address_space *mapping,
>  #endif
>
>  #ifdef CONFIG_ANON_VMA_NAME
> -int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> -			  unsigned long len_in,
> -			  struct anon_vma_name *anon_name);
> +int set_anon_vma_name(unsigned long addr, unsigned long size,
> +		      const char __user *uname);
>  #else
> -static inline int
> -madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> -		      unsigned long len_in, struct anon_vma_name *anon_name) {
> -	return 0;
> +static inline
> +int set_anon_vma_name(unsigned long addr, unsigned long size,
> +		      const char __user *uname)
> +{
> +	return -EINVAL;
>  }
>  #endif
>
> diff --git a/kernel/sys.c b/kernel/sys.c
> index adc0de0aa364aebb23999f621717a5d32599921c..b153fb345ada28ea1a33386a32bcce9cb1b23475 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -2343,54 +2343,14 @@ int __weak arch_lock_shadow_stack_status(struct task_struct *t, unsigned long st
>
>  #define PR_IO_FLUSHER (PF_MEMALLOC_NOIO | PF_LOCAL_THROTTLE)
>
> -#ifdef CONFIG_ANON_VMA_NAME
> -
> -#define ANON_VMA_NAME_MAX_LEN		80
> -#define ANON_VMA_NAME_INVALID_CHARS	"\\`$[]"
> -
> -static inline bool is_valid_name_char(char ch)
> -{
> -	/* printable ascii characters, excluding ANON_VMA_NAME_INVALID_CHARS */
> -	return ch > 0x1f && ch < 0x7f &&
> -		!strchr(ANON_VMA_NAME_INVALID_CHARS, ch);
> -}
> -
>  static int prctl_set_vma(unsigned long opt, unsigned long addr,
>  			 unsigned long size, unsigned long arg)
>  {
> -	struct mm_struct *mm = current->mm;
> -	const char __user *uname;
> -	struct anon_vma_name *anon_name = NULL;
>  	int error;
>
>  	switch (opt) {
>  	case PR_SET_VMA_ANON_NAME:
> -		uname = (const char __user *)arg;
> -		if (uname) {
> -			char *name, *pch;
> -
> -			name = strndup_user(uname, ANON_VMA_NAME_MAX_LEN);
> -			if (IS_ERR(name))
> -				return PTR_ERR(name);
> -
> -			for (pch = name; *pch != '\0'; pch++) {
> -				if (!is_valid_name_char(*pch)) {
> -					kfree(name);
> -					return -EINVAL;
> -				}
> -			}
> -			/* anon_vma has its own copy */
> -			anon_name = anon_vma_name_alloc(name);
> -			kfree(name);
> -			if (!anon_name)
> -				return -ENOMEM;
> -
> -		}
> -
> -		mmap_write_lock(mm);
> -		error = madvise_set_anon_name(mm, addr, size, anon_name);
> -		mmap_write_unlock(mm);
> -		anon_vma_name_put(anon_name);
> +		error = set_anon_vma_name(addr, size, (const char __user *)arg);
>  		break;
>  	default:
>  		error = -EINVAL;
> @@ -2399,14 +2359,6 @@ static int prctl_set_vma(unsigned long opt, unsigned long addr,
>  	return error;
>  }
>
> -#else /* CONFIG_ANON_VMA_NAME */
> -static int prctl_set_vma(unsigned long opt, unsigned long start,
> -			 unsigned long size, unsigned long arg)
> -{
> -	return -EINVAL;
> -}
> -#endif /* CONFIG_ANON_VMA_NAME */
> -
>  static inline unsigned long get_current_mdwe(void)
>  {
>  	unsigned long ret = 0;
> diff --git a/mm/madvise.c b/mm/madvise.c
> index fca0e9b3e844ad766e83ac04cc0d7f4099c74005..7e8819b5e9a0f183213ffe19d7e52bd5fda5f49d 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -134,8 +134,8 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
>  	return 0;
>  }
>
> -int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> -			  unsigned long len_in, struct anon_vma_name *anon_name)
> +static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> +		unsigned long len_in, struct anon_vma_name *anon_name)
>  {
>  	unsigned long end;
>  	unsigned long len;
> @@ -2096,3 +2096,51 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
>  out:
>  	return ret;
>  }
> +
> +#ifdef CONFIG_ANON_VMA_NAME
> +
> +#define ANON_VMA_NAME_MAX_LEN		80
> +#define ANON_VMA_NAME_INVALID_CHARS	"\\`$[]"
> +
> +static inline bool is_valid_name_char(char ch)
> +{
> +	/* printable ascii characters, excluding ANON_VMA_NAME_INVALID_CHARS */
> +	return ch > 0x1f && ch < 0x7f &&
> +		!strchr(ANON_VMA_NAME_INVALID_CHARS, ch);
> +}
> +
> +int set_anon_vma_name(unsigned long addr, unsigned long size,
> +		      const char __user *uname)
> +{
> +	struct anon_vma_name *anon_name = NULL;
> +	struct mm_struct *mm = current->mm;
> +	int error;
> +
> +	if (uname) {
> +		char *name, *pch;
> +
> +		name = strndup_user(uname, ANON_VMA_NAME_MAX_LEN);
> +		if (IS_ERR(name))
> +			return PTR_ERR(name);
> +
> +		for (pch = name; *pch != '\0'; pch++) {
> +			if (!is_valid_name_char(*pch)) {
> +				kfree(name);
> +				return -EINVAL;
> +			}
> +		}
> +		/* anon_vma has its own copy */
> +		anon_name = anon_vma_name_alloc(name);
> +		kfree(name);
> +		if (!anon_name)
> +			return -ENOMEM;
> +	}
> +
> +	mmap_write_lock(mm);
> +	error = madvise_set_anon_name(mm, addr, size, anon_name);
> +	mmap_write_unlock(mm);
> +	anon_vma_name_put(anon_name);
> +
> +	return error;
> +}
> +#endif
>
> --
> 2.50.0
>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 3/4] mm, madvise: move madvise_set_anon_name() down the file
  2025-06-24 13:03 ` [PATCH v2 3/4] mm, madvise: move madvise_set_anon_name() down the file Vlastimil Babka
  2025-06-24 14:05   ` David Hildenbrand
@ 2025-06-24 15:36   ` Lorenzo Stoakes
  2025-06-24 17:02   ` Vlastimil Babka
  2 siblings, 0 replies; 25+ messages in thread
From: Lorenzo Stoakes @ 2025-06-24 15:36 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Liam R. Howlett, David Hildenbrand, Jann Horn,
	Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Colin Cross,
	linux-mm, linux-kernel

On Tue, Jun 24, 2025 at 03:03:47PM +0200, Vlastimil Babka wrote:
> Preparatory change so that we can use madvise_lock()/unlock() in the
> function without forward declarations or more thorough shuffling.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

Fine, but I think this is small enough of a move to be safely combined with 4/4
as David says.

Either way:

Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

> ---
>  mm/madvise.c | 64 ++++++++++++++++++++++++++++++------------------------------
>  1 file changed, 32 insertions(+), 32 deletions(-)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 7e8819b5e9a0f183213ffe19d7e52bd5fda5f49d..cae064479cdf908707c45b941bd03d43d095eab6 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -133,38 +133,6 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
>
>  	return 0;
>  }
> -
> -static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> -		unsigned long len_in, struct anon_vma_name *anon_name)
> -{
> -	unsigned long end;
> -	unsigned long len;
> -	struct madvise_behavior madv_behavior = {
> -		.mm = mm,
> -		.behavior = __MADV_SET_ANON_VMA_NAME,
> -		.lock_mode = MADVISE_MMAP_WRITE_LOCK,
> -		.anon_name = anon_name,
> -	};
> -
> -	if (start & ~PAGE_MASK)
> -		return -EINVAL;
> -	len = (len_in + ~PAGE_MASK) & PAGE_MASK;
> -
> -	/* Check to see whether len was rounded up from small -ve to zero */
> -	if (len_in && !len)
> -		return -EINVAL;
> -
> -	end = start + len;
> -	if (end < start)
> -		return -EINVAL;
> -
> -	if (end == start)
> -		return 0;
> -
> -	madv_behavior.range.start = start;
> -	madv_behavior.range.end = end;
> -	return madvise_walk_vmas(&madv_behavior);
> -}
>  #else /* CONFIG_ANON_VMA_NAME */
>  static int replace_anon_vma_name(struct vm_area_struct *vma,
>  				 struct anon_vma_name *anon_name)
> @@ -2109,6 +2077,38 @@ static inline bool is_valid_name_char(char ch)
>  		!strchr(ANON_VMA_NAME_INVALID_CHARS, ch);
>  }
>
> +static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> +		unsigned long len_in, struct anon_vma_name *anon_name)
> +{
> +	unsigned long end;
> +	unsigned long len;
> +	struct madvise_behavior madv_behavior = {
> +		.mm = mm,
> +		.behavior = __MADV_SET_ANON_VMA_NAME,
> +		.lock_mode = MADVISE_MMAP_WRITE_LOCK,
> +		.anon_name = anon_name,
> +	};
> +
> +	if (start & ~PAGE_MASK)
> +		return -EINVAL;
> +	len = (len_in + ~PAGE_MASK) & PAGE_MASK;
> +
> +	/* Check to see whether len was rounded up from small -ve to zero */
> +	if (len_in && !len)
> +		return -EINVAL;
> +
> +	end = start + len;
> +	if (end < start)
> +		return -EINVAL;
> +
> +	if (end == start)
> +		return 0;
> +
> +	madv_behavior.range.start = start;
> +	madv_behavior.range.end = end;
> +	return madvise_walk_vmas(&madv_behavior);
> +}
> +
>  int set_anon_vma_name(unsigned long addr, unsigned long size,
>  		      const char __user *uname)
>  {
>
> --
> 2.50.0
>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 4/4] mm, madvise: use standard madvise locking in madvise_set_anon_name()
  2025-06-24 13:03 ` [PATCH v2 4/4] mm, madvise: use standard madvise locking in madvise_set_anon_name() Vlastimil Babka
  2025-06-24 14:06   ` David Hildenbrand
@ 2025-06-24 15:45   ` Lorenzo Stoakes
  2025-06-24 16:48     ` Vlastimil Babka
  2025-06-24 17:03   ` Vlastimil Babka
  2 siblings, 1 reply; 25+ messages in thread
From: Lorenzo Stoakes @ 2025-06-24 15:45 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Liam R. Howlett, David Hildenbrand, Jann Horn,
	Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Colin Cross,
	linux-mm, linux-kernel

On Tue, Jun 24, 2025 at 03:03:48PM +0200, Vlastimil Babka wrote:
> Use madvise_lock()/madvise_unlock() in madvise_set_anon_name() in the
> same way as in do_madvise(). This narrows the lock scope a bit and
> reuses existing functionality. get_lock_mode() already picks the correct
> MADVISE_MMAP_WRITE_LOCK mode for __MADV_SET_ANON_VMA_NAME so we can just
> remove the explicit assignment.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

This is a nice idea, hadn't thought of it :) aren't we making anon vma name
behave now :P

I mean this is a _minor_ difference in functionality in that we will now
use mmap_write_lock_killable() and could fail to lock on fatal signal
whereas before we unconditionally lock but I think that's fine, and
possibly even... desirable actually?

At any rate this is all nice, thanks again!

> ---
>  mm/madvise.c | 13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index cae064479cdf908707c45b941bd03d43d095eab6..ee02ccd0315a146cdb3001cd189e03be9e48a2ea 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -2082,10 +2082,10 @@ static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
>  {
>  	unsigned long end;
>  	unsigned long len;
> +	int error;
>  	struct madvise_behavior madv_behavior = {
>  		.mm = mm,
>  		.behavior = __MADV_SET_ANON_VMA_NAME,
> -		.lock_mode = MADVISE_MMAP_WRITE_LOCK,
>  		.anon_name = anon_name,
>  	};
>
> @@ -2106,7 +2106,14 @@ static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
>
>  	madv_behavior.range.start = start;
>  	madv_behavior.range.end = end;
> -	return madvise_walk_vmas(&madv_behavior);
> +
> +	error = madvise_lock(&madv_behavior);
> +	if (error)
> +		return error;
> +	error = madvise_walk_vmas(&madv_behavior);
> +	madvise_unlock(&madv_behavior);
> +
> +	return error;
>  }
>
>  int set_anon_vma_name(unsigned long addr, unsigned long size,
> @@ -2136,9 +2143,7 @@ int set_anon_vma_name(unsigned long addr, unsigned long size,
>  			return -ENOMEM;
>  	}
>
> -	mmap_write_lock(mm);
>  	error = madvise_set_anon_name(mm, addr, size, anon_name);
> -	mmap_write_unlock(mm);
>  	anon_vma_name_put(anon_name);
>
>  	return error;
>
> --
> 2.50.0
>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 1/4] mm, madvise: simplify anon_name handling
  2025-06-24 13:58   ` David Hildenbrand
  2025-06-24 14:28     ` Suren Baghdasaryan
@ 2025-06-24 16:38     ` Vlastimil Babka
  1 sibling, 0 replies; 25+ messages in thread
From: Vlastimil Babka @ 2025-06-24 16:38 UTC (permalink / raw)
  To: David Hildenbrand, Andrew Morton, Liam R. Howlett,
	Lorenzo Stoakes, Jann Horn, Mike Rapoport, Suren Baghdasaryan,
	Michal Hocko, Colin Cross
  Cc: linux-mm, linux-kernel

On 6/24/25 15:58, David Hildenbrand wrote:
> On 24.06.25 15:03, Vlastimil Babka wrote:
>> Since the introduction in 9a10064f5625 ("mm: add a field to store names
>> for private anonymous memory") the code to set anon_name on a vma has
>> been using madvise_update_vma() to call replace_anon_vma_name(). Since
>> the former is called also by a number of other madvise behaviours that
>> do not set a new anon_name, they have been passing the existing
>> anon_name of the vma to make replace_vma_anon_name() a no-op.
>> 
>> This is rather wasteful as it needs anon_vma_name_eq() to determine the
>> no-op situations, and checks for when replace_vma_anon_name() is allowed
>> (the vma is anon/shmem) duplicate the checks already done earlier in
>> madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm:

   ^

>> fix use-after-free when anon vma name is used after vma is freed")
>> adding anon_name refcount get/put operations exactly to the cases that
>> actually do not change anon_name - just so the replace_vma_anon_name()
>> can keep safely determining it has nothing to do.
>> 
>> The recent madvise cleanups made this suboptimal handling very obvious,
>> but happily also allow for an easy fix. madvise_update_vma() now has the
>> complete information whether it's been called to set a new anon_name, so
>> stop passing it the existing vma's name and doing the refcount get/put
>> in its only caller madvise_vma_behavior().
>> 
>> In madvise_update_vma() itself, limit calling of replace_anon_vma_name()
>> only to cases where we are setting a new name, otherwise we know it's a
>> no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and
>> can remove the duplicate checks for vma being anon/shmem that were done
>> already in madvise_vma_behavior().

              ^

>> 
>> Additionally, by using vma_modify_flags() when not modifying the
>> anon_name, avoid explicitly passing the existing vma's anon_name and
>> storing a pointer to it in struct madv_behavior or a local variable.
>> This prevents the danger of accessing a freed anon_name after vma
>> merging, previously fixed by commit 942341dcc574.
>> 
>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

<snip>

> Took me a second to find where this is already checked (-> 
> madvise_vma_behavior()).

And I thought I was repeating myself too much in the changelog :)

> 
> :)

But maybe you're joking on purpose, referring to that :)

> Acked-by: David Hildenbrand <david@redhat.com>

Thanks!




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 1/4] mm, madvise: simplify anon_name handling
  2025-06-24 14:28     ` Suren Baghdasaryan
@ 2025-06-24 16:41       ` Vlastimil Babka
  0 siblings, 0 replies; 25+ messages in thread
From: Vlastimil Babka @ 2025-06-24 16:41 UTC (permalink / raw)
  To: Suren Baghdasaryan, David Hildenbrand
  Cc: Andrew Morton, Liam R. Howlett, Lorenzo Stoakes, Jann Horn,
	Mike Rapoport, Michal Hocko, Colin Cross, linux-mm, linux-kernel

On 6/24/25 16:28, Suren Baghdasaryan wrote:
> On Tue, Jun 24, 2025 at 6:58 AM David Hildenbrand <david@redhat.com> wrote:
>>
>> On 24.06.25 15:03, Vlastimil Babka wrote:
>> > Since the introduction in 9a10064f5625 ("mm: add a field to store names
>> > for private anonymous memory") the code to set anon_name on a vma has
>> > been using madvise_update_vma() to call replace_anon_vma_name(). Since
>> > the former is called also by a number of other madvise behaviours that
>> > do not set a new anon_name, they have been passing the existing
>> > anon_name of the vma to make replace_vma_anon_name() a no-op.
> 
> s/replace_vma_anon_name/replace_anon_vma_name

Hm right, thanks. The problem is that it was indeed originally called
replace_vma_anon_name() (later renamed) and I was copy/pasting while looking
at commit 9a10064f5625... will send a fixed up changelog in a reply to patch
1/4.

>>
>> Acked-by: David Hildenbrand <david@redhat.com>
> 
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>

Thanks!

>>
>> --
>> Cheers,
>>
>> David / dhildenb
>>



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 1/4] mm, madvise: simplify anon_name handling
  2025-06-24 15:26   ` Lorenzo Stoakes
@ 2025-06-24 16:42     ` Vlastimil Babka
  0 siblings, 0 replies; 25+ messages in thread
From: Vlastimil Babka @ 2025-06-24 16:42 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: Andrew Morton, Liam R. Howlett, David Hildenbrand, Jann Horn,
	Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Colin Cross,
	linux-mm, linux-kernel

On 6/24/25 17:26, Lorenzo Stoakes wrote:
> On Tue, Jun 24, 2025 at 03:03:45PM +0200, Vlastimil Babka wrote:
>> Since the introduction in 9a10064f5625 ("mm: add a field to store names
>> for private anonymous memory") the code to set anon_name on a vma has
>> been using madvise_update_vma() to call replace_anon_vma_name(). Since
>> the former is called also by a number of other madvise behaviours that
>> do not set a new anon_name, they have been passing the existing
>> anon_name of the vma to make replace_vma_anon_name() a no-op.
>>
>> This is rather wasteful as it needs anon_vma_name_eq() to determine the
>> no-op situations, and checks for when replace_vma_anon_name() is allowed
>> (the vma is anon/shmem) duplicate the checks already done earlier in
>> madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm:
>> fix use-after-free when anon vma name is used after vma is freed")
>> adding anon_name refcount get/put operations exactly to the cases that
>> actually do not change anon_name - just so the replace_vma_anon_name()
>> can keep safely determining it has nothing to do.
>>
>> The recent madvise cleanups made this suboptimal handling very obvious,
>> but happily also allow for an easy fix. madvise_update_vma() now has the
>> complete information whether it's been called to set a new anon_name, so
>> stop passing it the existing vma's name and doing the refcount get/put
>> in its only caller madvise_vma_behavior().
>>
>> In madvise_update_vma() itself, limit calling of replace_anon_vma_name()
>> only to cases where we are setting a new name, otherwise we know it's a
>> no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and
>> can remove the duplicate checks for vma being anon/shmem that were done
>> already in madvise_vma_behavior().
>>
>> Additionally, by using vma_modify_flags() when not modifying the
>> anon_name, avoid explicitly passing the existing vma's anon_name and
>> storing a pointer to it in struct madv_behavior or a local variable.
>> This prevents the danger of accessing a freed anon_name after vma
>> merging, previously fixed by commit 942341dcc574.
>>
>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> 
> Cheers, LGTM so:
> 
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> 
> I made sure to do some stress-ng --madvise testing with this series (+ my recent
> fix) applied :P all good.

Thanks for that!

>> ---
>>  mm/madvise.c | 37 +++++++++++++------------------------
>>  1 file changed, 13 insertions(+), 24 deletions(-)
>>
>> diff --git a/mm/madvise.c b/mm/madvise.c
>> index 4491bf080f55d6d1aeffb2ff0b8fdd28904af950..fca0e9b3e844ad766e83ac04cc0d7f4099c74005 100644
>> --- a/mm/madvise.c
>> +++ b/mm/madvise.c
>> @@ -176,25 +176,29 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
>>  }
>>  #endif /* CONFIG_ANON_VMA_NAME */
>>  /*
>> - * Update the vm_flags on region of a vma, splitting it or merging it as
>> - * necessary.  Must be called with mmap_lock held for writing;
>> - * Caller should ensure anon_name stability by raising its refcount even when
>> - * anon_name belongs to a valid vma because this function might free that vma.
>> + * Update the vm_flags and/or anon_name on region of a vma, splitting it or
>> + * merging it as necessary. Must be called with mmap_lock held for writing.
>>   */
>>  static int madvise_update_vma(vm_flags_t new_flags,
>>  		struct madvise_behavior *madv_behavior)
>>  {
>> -	int error;
>>  	struct vm_area_struct *vma = madv_behavior->vma;
>>  	struct madvise_behavior_range *range = &madv_behavior->range;
>>  	struct anon_vma_name *anon_name = madv_behavior->anon_name;
>> +	bool set_new_anon_name = madv_behavior->behavior == __MADV_SET_ANON_VMA_NAME;
>>  	VMA_ITERATOR(vmi, madv_behavior->mm, range->start);
>>
>> -	if (new_flags == vma->vm_flags && anon_vma_name_eq(anon_vma_name(vma), anon_name))
>> +	if (new_flags == vma->vm_flags && (!set_new_anon_name ||
>> +			anon_vma_name_eq(anon_vma_name(vma), anon_name)))
>>  		return 0;
>>
>> -	vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma,
>> +	if (set_new_anon_name)
>> +		vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma,
>>  			range->start, range->end, new_flags, anon_name);
> 
> I will do a follow up (doesn't really belong here I'd say as involves change to
> modify code) that makes this vma_modify_name() or vma_modify_anon_name() as this
> is the only caller and we don't care about flags here :)

Oh right, good idea!



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 3/4] mm, madvise: move madvise_set_anon_name() down the file
  2025-06-24 14:05   ` David Hildenbrand
  2025-06-24 14:33     ` Suren Baghdasaryan
@ 2025-06-24 16:46     ` Vlastimil Babka
  1 sibling, 0 replies; 25+ messages in thread
From: Vlastimil Babka @ 2025-06-24 16:46 UTC (permalink / raw)
  To: David Hildenbrand, Andrew Morton, Liam R. Howlett,
	Lorenzo Stoakes, Jann Horn, Mike Rapoport, Suren Baghdasaryan,
	Michal Hocko, Colin Cross
  Cc: linux-mm, linux-kernel

On 6/24/25 16:05, David Hildenbrand wrote:
> On 24.06.25 15:03, Vlastimil Babka wrote:
> 
> Personally, I would squash that into #4, given that #4 is pretty small ;)
> 
> Acked-by: David Hildenbrand <david@redhat.com>

Right, the advantages of pure move commits is that git diff can colorize
them so it's immediately obvious, and git blame -C or -M can also recognize
pure moves and ignore them. Mixing moves with changes tends to break these
heuristics. But if Andrew wants to squash, feel free.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 4/4] mm, madvise: use standard madvise locking in madvise_set_anon_name()
  2025-06-24 15:45   ` Lorenzo Stoakes
@ 2025-06-24 16:48     ` Vlastimil Babka
  0 siblings, 0 replies; 25+ messages in thread
From: Vlastimil Babka @ 2025-06-24 16:48 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: Andrew Morton, Liam R. Howlett, David Hildenbrand, Jann Horn,
	Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Colin Cross,
	linux-mm, linux-kernel

On 6/24/25 17:45, Lorenzo Stoakes wrote:
> On Tue, Jun 24, 2025 at 03:03:48PM +0200, Vlastimil Babka wrote:
>> Use madvise_lock()/madvise_unlock() in madvise_set_anon_name() in the
>> same way as in do_madvise(). This narrows the lock scope a bit and
>> reuses existing functionality. get_lock_mode() already picks the correct
>> MADVISE_MMAP_WRITE_LOCK mode for __MADV_SET_ANON_VMA_NAME so we can just
>> remove the explicit assignment.
>>
>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> 
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> 
> This is a nice idea, hadn't thought of it :) aren't we making anon vma name
> behave now :P
> 
> I mean this is a _minor_ difference in functionality in that we will now
> use mmap_write_lock_killable() and could fail to lock on fatal signal
> whereas before we unconditionally lock but I think that's fine, and
> possibly even... desirable actually?

Right! I'll add a note to the changelog. It's changing existing uapi but
hopefully not in a way that userspace would become broken. I mean even the
other madvise modes only got killable locks later tat some point after
introduction, no?

> 
> At any rate this is all nice, thanks again!
> 
>> ---
>>  mm/madvise.c | 13 +++++++++----
>>  1 file changed, 9 insertions(+), 4 deletions(-)
>>
>> diff --git a/mm/madvise.c b/mm/madvise.c
>> index cae064479cdf908707c45b941bd03d43d095eab6..ee02ccd0315a146cdb3001cd189e03be9e48a2ea 100644
>> --- a/mm/madvise.c
>> +++ b/mm/madvise.c
>> @@ -2082,10 +2082,10 @@ static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
>>  {
>>  	unsigned long end;
>>  	unsigned long len;
>> +	int error;
>>  	struct madvise_behavior madv_behavior = {
>>  		.mm = mm,
>>  		.behavior = __MADV_SET_ANON_VMA_NAME,
>> -		.lock_mode = MADVISE_MMAP_WRITE_LOCK,
>>  		.anon_name = anon_name,
>>  	};
>>
>> @@ -2106,7 +2106,14 @@ static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
>>
>>  	madv_behavior.range.start = start;
>>  	madv_behavior.range.end = end;
>> -	return madvise_walk_vmas(&madv_behavior);
>> +
>> +	error = madvise_lock(&madv_behavior);
>> +	if (error)
>> +		return error;
>> +	error = madvise_walk_vmas(&madv_behavior);
>> +	madvise_unlock(&madv_behavior);
>> +
>> +	return error;
>>  }
>>
>>  int set_anon_vma_name(unsigned long addr, unsigned long size,
>> @@ -2136,9 +2143,7 @@ int set_anon_vma_name(unsigned long addr, unsigned long size,
>>  			return -ENOMEM;
>>  	}
>>
>> -	mmap_write_lock(mm);
>>  	error = madvise_set_anon_name(mm, addr, size, anon_name);
>> -	mmap_write_unlock(mm);
>>  	anon_vma_name_put(anon_name);
>>
>>  	return error;
>>
>> --
>> 2.50.0
>>



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 1/4] mm, madvise: simplify anon_name handling
  2025-06-24 13:03 ` [PATCH v2 1/4] mm, madvise: simplify anon_name handling Vlastimil Babka
  2025-06-24 13:58   ` David Hildenbrand
  2025-06-24 15:26   ` Lorenzo Stoakes
@ 2025-06-24 17:01   ` Vlastimil Babka
  2 siblings, 0 replies; 25+ messages in thread
From: Vlastimil Babka @ 2025-06-24 17:01 UTC (permalink / raw)
  To: Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
	David Hildenbrand, Jann Horn, Mike Rapoport, Suren Baghdasaryan,
	Michal Hocko, Colin Cross
  Cc: linux-mm, linux-kernel

On 6/24/25 15:03, Vlastimil Babka wrote:
> Since the introduction in 9a10064f5625 ("mm: add a field to store names
> for private anonymous memory") the code to set anon_name on a vma has
> been using madvise_update_vma() to call replace_anon_vma_name(). Since
> the former is called also by a number of other madvise behaviours that
> do not set a new anon_name, they have been passing the existing
> anon_name of the vma to make replace_vma_anon_name() a no-op.
> 
> This is rather wasteful as it needs anon_vma_name_eq() to determine the
> no-op situations, and checks for when replace_vma_anon_name() is allowed
> (the vma is anon/shmem) duplicate the checks already done earlier in
> madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm:
> fix use-after-free when anon vma name is used after vma is freed")
> adding anon_name refcount get/put operations exactly to the cases that
> actually do not change anon_name - just so the replace_vma_anon_name()
> can keep safely determining it has nothing to do.
> 
> The recent madvise cleanups made this suboptimal handling very obvious,
> but happily also allow for an easy fix. madvise_update_vma() now has the
> complete information whether it's been called to set a new anon_name, so
> stop passing it the existing vma's name and doing the refcount get/put
> in its only caller madvise_vma_behavior().
> 
> In madvise_update_vma() itself, limit calling of replace_anon_vma_name()
> only to cases where we are setting a new name, otherwise we know it's a
> no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and
> can remove the duplicate checks for vma being anon/shmem that were done
> already in madvise_vma_behavior().
> 
> Additionally, by using vma_modify_flags() when not modifying the
> anon_name, avoid explicitly passing the existing vma's anon_name and
> storing a pointer to it in struct madv_behavior or a local variable.
> This prevents the danger of accessing a freed anon_name after vma
> merging, previously fixed by commit 942341dcc574.
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

With fixed up function names as pointed out by Suren:

Since the introduction in 9a10064f5625 ("mm: add a field to store names
for private anonymous memory") the code to set anon_name on a vma has
been using madvise_update_vma() to call replace_anon_vma_name(). Since
the former is called also by a number of other madvise behaviours that
do not set a new anon_name, they have been passing the existing
anon_name of the vma to make replace_anon_vma_name() a no-op.

This is rather wasteful as it needs anon_vma_name_eq() to determine the
no-op situations, and checks for when replace_anon_vma_name() is allowed
(the vma is anon/shmem) duplicate the checks already done earlier in
madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm:
fix use-after-free when anon vma name is used after vma is freed")
adding anon_name refcount get/put operations exactly to the cases that
actually do not change anon_name - just so the replace_anon_vma_name()
can keep safely determining it has nothing to do.

The recent madvise cleanups made this suboptimal handling very obvious,
but happily also allow for an easy fix. madvise_update_vma() now has the
complete information whether it's been called to set a new anon_name, so
stop passing it the existing vma's name and doing the refcount get/put
in its only caller madvise_vma_behavior().

In madvise_update_vma() itself, limit calling of replace_anon_vma_name()
only to cases where we are setting a new name, otherwise we know it's a
no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and
can remove the duplicate checks for vma being anon/shmem that were done
already in madvise_vma_behavior().

Additionally, by using vma_modify_flags() when not modifying the
anon_name, avoid explicitly passing the existing vma's anon_name and
storing a pointer to it in struct madv_behavior or a local variable.
This prevents the danger of accessing a freed anon_name after vma
merging, previously fixed by commit 942341dcc574.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 3/4] mm, madvise: move madvise_set_anon_name() down the file
  2025-06-24 13:03 ` [PATCH v2 3/4] mm, madvise: move madvise_set_anon_name() down the file Vlastimil Babka
  2025-06-24 14:05   ` David Hildenbrand
  2025-06-24 15:36   ` Lorenzo Stoakes
@ 2025-06-24 17:02   ` Vlastimil Babka
  2 siblings, 0 replies; 25+ messages in thread
From: Vlastimil Babka @ 2025-06-24 17:02 UTC (permalink / raw)
  To: Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
	David Hildenbrand, Jann Horn, Mike Rapoport, Suren Baghdasaryan,
	Michal Hocko, Colin Cross
  Cc: linux-mm, linux-kernel

On 6/24/25 15:03, Vlastimil Babka wrote:
> Preparatory change so that we can use madvise_lock()/unlock() in the
> function without forward declarations or more thorough shuffling.

additional paragraph here:

No functional change. Move as a separate commit helps git heuristics to
detect it properly.


> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 4/4] mm, madvise: use standard madvise locking in madvise_set_anon_name()
  2025-06-24 13:03 ` [PATCH v2 4/4] mm, madvise: use standard madvise locking in madvise_set_anon_name() Vlastimil Babka
  2025-06-24 14:06   ` David Hildenbrand
  2025-06-24 15:45   ` Lorenzo Stoakes
@ 2025-06-24 17:03   ` Vlastimil Babka
  2 siblings, 0 replies; 25+ messages in thread
From: Vlastimil Babka @ 2025-06-24 17:03 UTC (permalink / raw)
  To: Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
	David Hildenbrand, Jann Horn, Mike Rapoport, Suren Baghdasaryan,
	Michal Hocko, Colin Cross
  Cc: linux-mm, linux-kernel

On 6/24/25 15:03, Vlastimil Babka wrote:
> Use madvise_lock()/madvise_unlock() in madvise_set_anon_name() in the
> same way as in do_madvise(). This narrows the lock scope a bit and
> reuses existing functionality. get_lock_mode() already picks the correct
> MADVISE_MMAP_WRITE_LOCK mode for __MADV_SET_ANON_VMA_NAME so we can just
> remove the explicit assignment.

Additional paragraph here:

There is a user visible change in that the prctl(PR_SET_VMA,
PR_SET_VMA_ANON_NAME...) might now return -EINTR.

> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>  mm/madvise.c | 13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/madvise.c b/mm/madvise.c
> index cae064479cdf908707c45b941bd03d43d095eab6..ee02ccd0315a146cdb3001cd189e03be9e48a2ea 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -2082,10 +2082,10 @@ static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
>  {
>  	unsigned long end;
>  	unsigned long len;
> +	int error;
>  	struct madvise_behavior madv_behavior = {
>  		.mm = mm,
>  		.behavior = __MADV_SET_ANON_VMA_NAME,
> -		.lock_mode = MADVISE_MMAP_WRITE_LOCK,
>  		.anon_name = anon_name,
>  	};
>  
> @@ -2106,7 +2106,14 @@ static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
>  
>  	madv_behavior.range.start = start;
>  	madv_behavior.range.end = end;
> -	return madvise_walk_vmas(&madv_behavior);
> +
> +	error = madvise_lock(&madv_behavior);
> +	if (error)
> +		return error;
> +	error = madvise_walk_vmas(&madv_behavior);
> +	madvise_unlock(&madv_behavior);
> +
> +	return error;
>  }
>  
>  int set_anon_vma_name(unsigned long addr, unsigned long size,
> @@ -2136,9 +2143,7 @@ int set_anon_vma_name(unsigned long addr, unsigned long size,
>  			return -ENOMEM;
>  	}
>  
> -	mmap_write_lock(mm);
>  	error = madvise_set_anon_name(mm, addr, size, anon_name);
> -	mmap_write_unlock(mm);
>  	anon_vma_name_put(anon_name);
>  
>  	return error;
> 



^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2025-06-24 17:03 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-24 13:03 [PATCH v2 0/4] madvise anon_name cleanups Vlastimil Babka
2025-06-24 13:03 ` [PATCH v2 1/4] mm, madvise: simplify anon_name handling Vlastimil Babka
2025-06-24 13:58   ` David Hildenbrand
2025-06-24 14:28     ` Suren Baghdasaryan
2025-06-24 16:41       ` Vlastimil Babka
2025-06-24 16:38     ` Vlastimil Babka
2025-06-24 15:26   ` Lorenzo Stoakes
2025-06-24 16:42     ` Vlastimil Babka
2025-06-24 17:01   ` Vlastimil Babka
2025-06-24 13:03 ` [PATCH v2 2/4] mm, madvise: extract mm code from prctl_set_vma() to mm/madvise.c Vlastimil Babka
2025-06-24 14:04   ` David Hildenbrand
2025-06-24 14:31     ` Suren Baghdasaryan
2025-06-24 15:28   ` Lorenzo Stoakes
2025-06-24 13:03 ` [PATCH v2 3/4] mm, madvise: move madvise_set_anon_name() down the file Vlastimil Babka
2025-06-24 14:05   ` David Hildenbrand
2025-06-24 14:33     ` Suren Baghdasaryan
2025-06-24 16:46     ` Vlastimil Babka
2025-06-24 15:36   ` Lorenzo Stoakes
2025-06-24 17:02   ` Vlastimil Babka
2025-06-24 13:03 ` [PATCH v2 4/4] mm, madvise: use standard madvise locking in madvise_set_anon_name() Vlastimil Babka
2025-06-24 14:06   ` David Hildenbrand
2025-06-24 14:35     ` Suren Baghdasaryan
2025-06-24 15:45   ` Lorenzo Stoakes
2025-06-24 16:48     ` Vlastimil Babka
2025-06-24 17:03   ` Vlastimil Babka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).