* [PATCH v2 1/4] mm, madvise: simplify anon_name handling
2025-06-24 13:03 [PATCH v2 0/4] madvise anon_name cleanups Vlastimil Babka
@ 2025-06-24 13:03 ` Vlastimil Babka
2025-06-24 13:58 ` David Hildenbrand
` (2 more replies)
2025-06-24 13:03 ` [PATCH v2 2/4] mm, madvise: extract mm code from prctl_set_vma() to mm/madvise.c Vlastimil Babka
` (2 subsequent siblings)
3 siblings, 3 replies; 25+ messages in thread
From: Vlastimil Babka @ 2025-06-24 13:03 UTC (permalink / raw)
To: Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
David Hildenbrand, Jann Horn, Mike Rapoport, Suren Baghdasaryan,
Michal Hocko, Colin Cross
Cc: linux-mm, linux-kernel, Vlastimil Babka
Since the introduction in 9a10064f5625 ("mm: add a field to store names
for private anonymous memory") the code to set anon_name on a vma has
been using madvise_update_vma() to call replace_anon_vma_name(). Since
the former is called also by a number of other madvise behaviours that
do not set a new anon_name, they have been passing the existing
anon_name of the vma to make replace_vma_anon_name() a no-op.
This is rather wasteful as it needs anon_vma_name_eq() to determine the
no-op situations, and checks for when replace_vma_anon_name() is allowed
(the vma is anon/shmem) duplicate the checks already done earlier in
madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm:
fix use-after-free when anon vma name is used after vma is freed")
adding anon_name refcount get/put operations exactly to the cases that
actually do not change anon_name - just so the replace_vma_anon_name()
can keep safely determining it has nothing to do.
The recent madvise cleanups made this suboptimal handling very obvious,
but happily also allow for an easy fix. madvise_update_vma() now has the
complete information whether it's been called to set a new anon_name, so
stop passing it the existing vma's name and doing the refcount get/put
in its only caller madvise_vma_behavior().
In madvise_update_vma() itself, limit calling of replace_anon_vma_name()
only to cases where we are setting a new name, otherwise we know it's a
no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and
can remove the duplicate checks for vma being anon/shmem that were done
already in madvise_vma_behavior().
Additionally, by using vma_modify_flags() when not modifying the
anon_name, avoid explicitly passing the existing vma's anon_name and
storing a pointer to it in struct madv_behavior or a local variable.
This prevents the danger of accessing a freed anon_name after vma
merging, previously fixed by commit 942341dcc574.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
mm/madvise.c | 37 +++++++++++++------------------------
1 file changed, 13 insertions(+), 24 deletions(-)
diff --git a/mm/madvise.c b/mm/madvise.c
index 4491bf080f55d6d1aeffb2ff0b8fdd28904af950..fca0e9b3e844ad766e83ac04cc0d7f4099c74005 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -176,25 +176,29 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
}
#endif /* CONFIG_ANON_VMA_NAME */
/*
- * Update the vm_flags on region of a vma, splitting it or merging it as
- * necessary. Must be called with mmap_lock held for writing;
- * Caller should ensure anon_name stability by raising its refcount even when
- * anon_name belongs to a valid vma because this function might free that vma.
+ * Update the vm_flags and/or anon_name on region of a vma, splitting it or
+ * merging it as necessary. Must be called with mmap_lock held for writing.
*/
static int madvise_update_vma(vm_flags_t new_flags,
struct madvise_behavior *madv_behavior)
{
- int error;
struct vm_area_struct *vma = madv_behavior->vma;
struct madvise_behavior_range *range = &madv_behavior->range;
struct anon_vma_name *anon_name = madv_behavior->anon_name;
+ bool set_new_anon_name = madv_behavior->behavior == __MADV_SET_ANON_VMA_NAME;
VMA_ITERATOR(vmi, madv_behavior->mm, range->start);
- if (new_flags == vma->vm_flags && anon_vma_name_eq(anon_vma_name(vma), anon_name))
+ if (new_flags == vma->vm_flags && (!set_new_anon_name ||
+ anon_vma_name_eq(anon_vma_name(vma), anon_name)))
return 0;
- vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma,
+ if (set_new_anon_name)
+ vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma,
range->start, range->end, new_flags, anon_name);
+ else
+ vma = vma_modify_flags(&vmi, madv_behavior->prev, vma,
+ range->start, range->end, new_flags);
+
if (IS_ERR(vma))
return PTR_ERR(vma);
@@ -203,11 +207,8 @@ static int madvise_update_vma(vm_flags_t new_flags,
/* vm_flags is protected by the mmap_lock held in write mode. */
vma_start_write(vma);
vm_flags_reset(vma, new_flags);
- if (!vma->vm_file || vma_is_anon_shmem(vma)) {
- error = replace_anon_vma_name(vma, anon_name);
- if (error)
- return error;
- }
+ if (set_new_anon_name)
+ return replace_anon_vma_name(vma, anon_name);
return 0;
}
@@ -1313,7 +1314,6 @@ static int madvise_vma_behavior(struct madvise_behavior *madv_behavior)
int behavior = madv_behavior->behavior;
struct vm_area_struct *vma = madv_behavior->vma;
vm_flags_t new_flags = vma->vm_flags;
- bool set_new_anon_name = behavior == __MADV_SET_ANON_VMA_NAME;
struct madvise_behavior_range *range = &madv_behavior->range;
int error;
@@ -1403,18 +1403,7 @@ static int madvise_vma_behavior(struct madvise_behavior *madv_behavior)
/* This is a write operation.*/
VM_WARN_ON_ONCE(madv_behavior->lock_mode != MADVISE_MMAP_WRITE_LOCK);
- /*
- * madvise_update_vma() might cause a VMA merge which could put an
- * anon_vma_name, so we must hold an additional reference on the
- * anon_vma_name so it doesn't disappear from under us.
- */
- if (!set_new_anon_name) {
- madv_behavior->anon_name = anon_vma_name(vma);
- anon_vma_name_get(madv_behavior->anon_name);
- }
error = madvise_update_vma(new_flags, madv_behavior);
- if (!set_new_anon_name)
- anon_vma_name_put(madv_behavior->anon_name);
out:
/*
* madvise() returns EAGAIN if kernel resources, such as
--
2.50.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v2 1/4] mm, madvise: simplify anon_name handling
2025-06-24 13:03 ` [PATCH v2 1/4] mm, madvise: simplify anon_name handling Vlastimil Babka
@ 2025-06-24 13:58 ` David Hildenbrand
2025-06-24 14:28 ` Suren Baghdasaryan
2025-06-24 16:38 ` Vlastimil Babka
2025-06-24 15:26 ` Lorenzo Stoakes
2025-06-24 17:01 ` Vlastimil Babka
2 siblings, 2 replies; 25+ messages in thread
From: David Hildenbrand @ 2025-06-24 13:58 UTC (permalink / raw)
To: Vlastimil Babka, Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
Jann Horn, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
Colin Cross
Cc: linux-mm, linux-kernel
On 24.06.25 15:03, Vlastimil Babka wrote:
> Since the introduction in 9a10064f5625 ("mm: add a field to store names
> for private anonymous memory") the code to set anon_name on a vma has
> been using madvise_update_vma() to call replace_anon_vma_name(). Since
> the former is called also by a number of other madvise behaviours that
> do not set a new anon_name, they have been passing the existing
> anon_name of the vma to make replace_vma_anon_name() a no-op.
>
> This is rather wasteful as it needs anon_vma_name_eq() to determine the
> no-op situations, and checks for when replace_vma_anon_name() is allowed
> (the vma is anon/shmem) duplicate the checks already done earlier in
> madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm:
> fix use-after-free when anon vma name is used after vma is freed")
> adding anon_name refcount get/put operations exactly to the cases that
> actually do not change anon_name - just so the replace_vma_anon_name()
> can keep safely determining it has nothing to do.
>
> The recent madvise cleanups made this suboptimal handling very obvious,
> but happily also allow for an easy fix. madvise_update_vma() now has the
> complete information whether it's been called to set a new anon_name, so
> stop passing it the existing vma's name and doing the refcount get/put
> in its only caller madvise_vma_behavior().
>
> In madvise_update_vma() itself, limit calling of replace_anon_vma_name()
> only to cases where we are setting a new name, otherwise we know it's a
> no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and
> can remove the duplicate checks for vma being anon/shmem that were done
> already in madvise_vma_behavior().
>
> Additionally, by using vma_modify_flags() when not modifying the
> anon_name, avoid explicitly passing the existing vma's anon_name and
> storing a pointer to it in struct madv_behavior or a local variable.
> This prevents the danger of accessing a freed anon_name after vma
> merging, previously fixed by commit 942341dcc574.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
> mm/madvise.c | 37 +++++++++++++------------------------
> 1 file changed, 13 insertions(+), 24 deletions(-)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 4491bf080f55d6d1aeffb2ff0b8fdd28904af950..fca0e9b3e844ad766e83ac04cc0d7f4099c74005 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -176,25 +176,29 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
> }
> #endif /* CONFIG_ANON_VMA_NAME */
> /*
> - * Update the vm_flags on region of a vma, splitting it or merging it as
> - * necessary. Must be called with mmap_lock held for writing;
> - * Caller should ensure anon_name stability by raising its refcount even when
> - * anon_name belongs to a valid vma because this function might free that vma.
> + * Update the vm_flags and/or anon_name on region of a vma, splitting it or
> + * merging it as necessary. Must be called with mmap_lock held for writing.
> */
> static int madvise_update_vma(vm_flags_t new_flags,
> struct madvise_behavior *madv_behavior)
> {
> - int error;
> struct vm_area_struct *vma = madv_behavior->vma;
> struct madvise_behavior_range *range = &madv_behavior->range;
> struct anon_vma_name *anon_name = madv_behavior->anon_name;
> + bool set_new_anon_name = madv_behavior->behavior == __MADV_SET_ANON_VMA_NAME;
> VMA_ITERATOR(vmi, madv_behavior->mm, range->start);
>
> - if (new_flags == vma->vm_flags && anon_vma_name_eq(anon_vma_name(vma), anon_name))
> + if (new_flags == vma->vm_flags && (!set_new_anon_name ||
> + anon_vma_name_eq(anon_vma_name(vma), anon_name)))
> return 0;
>
> - vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma,
> + if (set_new_anon_name)
> + vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma,
> range->start, range->end, new_flags, anon_name);
> + else
> + vma = vma_modify_flags(&vmi, madv_behavior->prev, vma,
> + range->start, range->end, new_flags);
> +
> if (IS_ERR(vma))
> return PTR_ERR(vma);
>
> @@ -203,11 +207,8 @@ static int madvise_update_vma(vm_flags_t new_flags,
> /* vm_flags is protected by the mmap_lock held in write mode. */
> vma_start_write(vma);
> vm_flags_reset(vma, new_flags);
> - if (!vma->vm_file || vma_is_anon_shmem(vma)) {
> - error = replace_anon_vma_name(vma, anon_name);
> - if (error)
> - return error;
> - }
> + if (set_new_anon_name)
> + return replace_anon_vma_name(vma, anon_name);
Took me a second to find where this is already checked (->
madvise_vma_behavior()).
:)
Acked-by: David Hildenbrand <david@redhat.com>
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 1/4] mm, madvise: simplify anon_name handling
2025-06-24 13:58 ` David Hildenbrand
@ 2025-06-24 14:28 ` Suren Baghdasaryan
2025-06-24 16:41 ` Vlastimil Babka
2025-06-24 16:38 ` Vlastimil Babka
1 sibling, 1 reply; 25+ messages in thread
From: Suren Baghdasaryan @ 2025-06-24 14:28 UTC (permalink / raw)
To: David Hildenbrand
Cc: Vlastimil Babka, Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
Jann Horn, Mike Rapoport, Michal Hocko, Colin Cross, linux-mm,
linux-kernel
On Tue, Jun 24, 2025 at 6:58 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 24.06.25 15:03, Vlastimil Babka wrote:
> > Since the introduction in 9a10064f5625 ("mm: add a field to store names
> > for private anonymous memory") the code to set anon_name on a vma has
> > been using madvise_update_vma() to call replace_anon_vma_name(). Since
> > the former is called also by a number of other madvise behaviours that
> > do not set a new anon_name, they have been passing the existing
> > anon_name of the vma to make replace_vma_anon_name() a no-op.
s/replace_vma_anon_name/replace_anon_vma_name
> >
> > This is rather wasteful as it needs anon_vma_name_eq() to determine the
> > no-op situations, and checks for when replace_vma_anon_name() is allowed
s/replace_vma_anon_name/replace_anon_vma_name
> > (the vma is anon/shmem) duplicate the checks already done earlier in
> > madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm:
> > fix use-after-free when anon vma name is used after vma is freed")
> > adding anon_name refcount get/put operations exactly to the cases that
> > actually do not change anon_name - just so the replace_vma_anon_name()
s/replace_vma_anon_name/replace_anon_vma_name
> > can keep safely determining it has nothing to do.
> >
> > The recent madvise cleanups made this suboptimal handling very obvious,
> > but happily also allow for an easy fix. madvise_update_vma() now has the
> > complete information whether it's been called to set a new anon_name, so
> > stop passing it the existing vma's name and doing the refcount get/put
> > in its only caller madvise_vma_behavior().
> >
> > In madvise_update_vma() itself, limit calling of replace_anon_vma_name()
> > only to cases where we are setting a new name, otherwise we know it's a
> > no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and
> > can remove the duplicate checks for vma being anon/shmem that were done
> > already in madvise_vma_behavior().
> >
> > Additionally, by using vma_modify_flags() when not modifying the
> > anon_name, avoid explicitly passing the existing vma's anon_name and
> > storing a pointer to it in struct madv_behavior or a local variable.
> > This prevents the danger of accessing a freed anon_name after vma
> > merging, previously fixed by commit 942341dcc574.
> >
> > Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> > ---
> > mm/madvise.c | 37 +++++++++++++------------------------
> > 1 file changed, 13 insertions(+), 24 deletions(-)
> >
> > diff --git a/mm/madvise.c b/mm/madvise.c
> > index 4491bf080f55d6d1aeffb2ff0b8fdd28904af950..fca0e9b3e844ad766e83ac04cc0d7f4099c74005 100644
> > --- a/mm/madvise.c
> > +++ b/mm/madvise.c
> > @@ -176,25 +176,29 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
> > }
> > #endif /* CONFIG_ANON_VMA_NAME */
> > /*
> > - * Update the vm_flags on region of a vma, splitting it or merging it as
> > - * necessary. Must be called with mmap_lock held for writing;
> > - * Caller should ensure anon_name stability by raising its refcount even when
> > - * anon_name belongs to a valid vma because this function might free that vma.
> > + * Update the vm_flags and/or anon_name on region of a vma, splitting it or
> > + * merging it as necessary. Must be called with mmap_lock held for writing.
> > */
> > static int madvise_update_vma(vm_flags_t new_flags,
> > struct madvise_behavior *madv_behavior)
> > {
> > - int error;
> > struct vm_area_struct *vma = madv_behavior->vma;
> > struct madvise_behavior_range *range = &madv_behavior->range;
> > struct anon_vma_name *anon_name = madv_behavior->anon_name;
> > + bool set_new_anon_name = madv_behavior->behavior == __MADV_SET_ANON_VMA_NAME;
> > VMA_ITERATOR(vmi, madv_behavior->mm, range->start);
> >
> > - if (new_flags == vma->vm_flags && anon_vma_name_eq(anon_vma_name(vma), anon_name))
> > + if (new_flags == vma->vm_flags && (!set_new_anon_name ||
> > + anon_vma_name_eq(anon_vma_name(vma), anon_name)))
> > return 0;
> >
> > - vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma,
> > + if (set_new_anon_name)
> > + vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma,
> > range->start, range->end, new_flags, anon_name);
> > + else
> > + vma = vma_modify_flags(&vmi, madv_behavior->prev, vma,
> > + range->start, range->end, new_flags);
> > +
> > if (IS_ERR(vma))
> > return PTR_ERR(vma);
> >
> > @@ -203,11 +207,8 @@ static int madvise_update_vma(vm_flags_t new_flags,
> > /* vm_flags is protected by the mmap_lock held in write mode. */
> > vma_start_write(vma);
> > vm_flags_reset(vma, new_flags);
> > - if (!vma->vm_file || vma_is_anon_shmem(vma)) {
> > - error = replace_anon_vma_name(vma, anon_name);
> > - if (error)
> > - return error;
> > - }
> > + if (set_new_anon_name)
> > + return replace_anon_vma_name(vma, anon_name);
>
> Took me a second to find where this is already checked (->
> madvise_vma_behavior()).
>
> :)
>
> Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
>
> --
> Cheers,
>
> David / dhildenb
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 1/4] mm, madvise: simplify anon_name handling
2025-06-24 14:28 ` Suren Baghdasaryan
@ 2025-06-24 16:41 ` Vlastimil Babka
0 siblings, 0 replies; 25+ messages in thread
From: Vlastimil Babka @ 2025-06-24 16:41 UTC (permalink / raw)
To: Suren Baghdasaryan, David Hildenbrand
Cc: Andrew Morton, Liam R. Howlett, Lorenzo Stoakes, Jann Horn,
Mike Rapoport, Michal Hocko, Colin Cross, linux-mm, linux-kernel
On 6/24/25 16:28, Suren Baghdasaryan wrote:
> On Tue, Jun 24, 2025 at 6:58 AM David Hildenbrand <david@redhat.com> wrote:
>>
>> On 24.06.25 15:03, Vlastimil Babka wrote:
>> > Since the introduction in 9a10064f5625 ("mm: add a field to store names
>> > for private anonymous memory") the code to set anon_name on a vma has
>> > been using madvise_update_vma() to call replace_anon_vma_name(). Since
>> > the former is called also by a number of other madvise behaviours that
>> > do not set a new anon_name, they have been passing the existing
>> > anon_name of the vma to make replace_vma_anon_name() a no-op.
>
> s/replace_vma_anon_name/replace_anon_vma_name
Hm right, thanks. The problem is that it was indeed originally called
replace_vma_anon_name() (later renamed) and I was copy/pasting while looking
at commit 9a10064f5625... will send a fixed up changelog in a reply to patch
1/4.
>>
>> Acked-by: David Hildenbrand <david@redhat.com>
>
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Thanks!
>>
>> --
>> Cheers,
>>
>> David / dhildenb
>>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 1/4] mm, madvise: simplify anon_name handling
2025-06-24 13:58 ` David Hildenbrand
2025-06-24 14:28 ` Suren Baghdasaryan
@ 2025-06-24 16:38 ` Vlastimil Babka
1 sibling, 0 replies; 25+ messages in thread
From: Vlastimil Babka @ 2025-06-24 16:38 UTC (permalink / raw)
To: David Hildenbrand, Andrew Morton, Liam R. Howlett,
Lorenzo Stoakes, Jann Horn, Mike Rapoport, Suren Baghdasaryan,
Michal Hocko, Colin Cross
Cc: linux-mm, linux-kernel
On 6/24/25 15:58, David Hildenbrand wrote:
> On 24.06.25 15:03, Vlastimil Babka wrote:
>> Since the introduction in 9a10064f5625 ("mm: add a field to store names
>> for private anonymous memory") the code to set anon_name on a vma has
>> been using madvise_update_vma() to call replace_anon_vma_name(). Since
>> the former is called also by a number of other madvise behaviours that
>> do not set a new anon_name, they have been passing the existing
>> anon_name of the vma to make replace_vma_anon_name() a no-op.
>>
>> This is rather wasteful as it needs anon_vma_name_eq() to determine the
>> no-op situations, and checks for when replace_vma_anon_name() is allowed
>> (the vma is anon/shmem) duplicate the checks already done earlier in
>> madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm:
^
>> fix use-after-free when anon vma name is used after vma is freed")
>> adding anon_name refcount get/put operations exactly to the cases that
>> actually do not change anon_name - just so the replace_vma_anon_name()
>> can keep safely determining it has nothing to do.
>>
>> The recent madvise cleanups made this suboptimal handling very obvious,
>> but happily also allow for an easy fix. madvise_update_vma() now has the
>> complete information whether it's been called to set a new anon_name, so
>> stop passing it the existing vma's name and doing the refcount get/put
>> in its only caller madvise_vma_behavior().
>>
>> In madvise_update_vma() itself, limit calling of replace_anon_vma_name()
>> only to cases where we are setting a new name, otherwise we know it's a
>> no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and
>> can remove the duplicate checks for vma being anon/shmem that were done
>> already in madvise_vma_behavior().
^
>>
>> Additionally, by using vma_modify_flags() when not modifying the
>> anon_name, avoid explicitly passing the existing vma's anon_name and
>> storing a pointer to it in struct madv_behavior or a local variable.
>> This prevents the danger of accessing a freed anon_name after vma
>> merging, previously fixed by commit 942341dcc574.
>>
>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
<snip>
> Took me a second to find where this is already checked (->
> madvise_vma_behavior()).
And I thought I was repeating myself too much in the changelog :)
>
> :)
But maybe you're joking on purpose, referring to that :)
> Acked-by: David Hildenbrand <david@redhat.com>
Thanks!
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 1/4] mm, madvise: simplify anon_name handling
2025-06-24 13:03 ` [PATCH v2 1/4] mm, madvise: simplify anon_name handling Vlastimil Babka
2025-06-24 13:58 ` David Hildenbrand
@ 2025-06-24 15:26 ` Lorenzo Stoakes
2025-06-24 16:42 ` Vlastimil Babka
2025-06-24 17:01 ` Vlastimil Babka
2 siblings, 1 reply; 25+ messages in thread
From: Lorenzo Stoakes @ 2025-06-24 15:26 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, Liam R. Howlett, David Hildenbrand, Jann Horn,
Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Colin Cross,
linux-mm, linux-kernel
On Tue, Jun 24, 2025 at 03:03:45PM +0200, Vlastimil Babka wrote:
> Since the introduction in 9a10064f5625 ("mm: add a field to store names
> for private anonymous memory") the code to set anon_name on a vma has
> been using madvise_update_vma() to call replace_anon_vma_name(). Since
> the former is called also by a number of other madvise behaviours that
> do not set a new anon_name, they have been passing the existing
> anon_name of the vma to make replace_vma_anon_name() a no-op.
>
> This is rather wasteful as it needs anon_vma_name_eq() to determine the
> no-op situations, and checks for when replace_vma_anon_name() is allowed
> (the vma is anon/shmem) duplicate the checks already done earlier in
> madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm:
> fix use-after-free when anon vma name is used after vma is freed")
> adding anon_name refcount get/put operations exactly to the cases that
> actually do not change anon_name - just so the replace_vma_anon_name()
> can keep safely determining it has nothing to do.
>
> The recent madvise cleanups made this suboptimal handling very obvious,
> but happily also allow for an easy fix. madvise_update_vma() now has the
> complete information whether it's been called to set a new anon_name, so
> stop passing it the existing vma's name and doing the refcount get/put
> in its only caller madvise_vma_behavior().
>
> In madvise_update_vma() itself, limit calling of replace_anon_vma_name()
> only to cases where we are setting a new name, otherwise we know it's a
> no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and
> can remove the duplicate checks for vma being anon/shmem that were done
> already in madvise_vma_behavior().
>
> Additionally, by using vma_modify_flags() when not modifying the
> anon_name, avoid explicitly passing the existing vma's anon_name and
> storing a pointer to it in struct madv_behavior or a local variable.
> This prevents the danger of accessing a freed anon_name after vma
> merging, previously fixed by commit 942341dcc574.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cheers, LGTM so:
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
I made sure to do some stress-ng --madvise testing with this series (+ my recent
fix) applied :P all good.
> ---
> mm/madvise.c | 37 +++++++++++++------------------------
> 1 file changed, 13 insertions(+), 24 deletions(-)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 4491bf080f55d6d1aeffb2ff0b8fdd28904af950..fca0e9b3e844ad766e83ac04cc0d7f4099c74005 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -176,25 +176,29 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
> }
> #endif /* CONFIG_ANON_VMA_NAME */
> /*
> - * Update the vm_flags on region of a vma, splitting it or merging it as
> - * necessary. Must be called with mmap_lock held for writing;
> - * Caller should ensure anon_name stability by raising its refcount even when
> - * anon_name belongs to a valid vma because this function might free that vma.
> + * Update the vm_flags and/or anon_name on region of a vma, splitting it or
> + * merging it as necessary. Must be called with mmap_lock held for writing.
> */
> static int madvise_update_vma(vm_flags_t new_flags,
> struct madvise_behavior *madv_behavior)
> {
> - int error;
> struct vm_area_struct *vma = madv_behavior->vma;
> struct madvise_behavior_range *range = &madv_behavior->range;
> struct anon_vma_name *anon_name = madv_behavior->anon_name;
> + bool set_new_anon_name = madv_behavior->behavior == __MADV_SET_ANON_VMA_NAME;
> VMA_ITERATOR(vmi, madv_behavior->mm, range->start);
>
> - if (new_flags == vma->vm_flags && anon_vma_name_eq(anon_vma_name(vma), anon_name))
> + if (new_flags == vma->vm_flags && (!set_new_anon_name ||
> + anon_vma_name_eq(anon_vma_name(vma), anon_name)))
> return 0;
>
> - vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma,
> + if (set_new_anon_name)
> + vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma,
> range->start, range->end, new_flags, anon_name);
I will do a follow up (doesn't really belong here I'd say as involves change to
modify code) that makes this vma_modify_name() or vma_modify_anon_name() as this
is the only caller and we don't care about flags here :)
> + else
> + vma = vma_modify_flags(&vmi, madv_behavior->prev, vma,
> + range->start, range->end, new_flags);
> +
> if (IS_ERR(vma))
> return PTR_ERR(vma);
>
> @@ -203,11 +207,8 @@ static int madvise_update_vma(vm_flags_t new_flags,
> /* vm_flags is protected by the mmap_lock held in write mode. */
> vma_start_write(vma);
> vm_flags_reset(vma, new_flags);
> - if (!vma->vm_file || vma_is_anon_shmem(vma)) {
> - error = replace_anon_vma_name(vma, anon_name);
> - if (error)
> - return error;
> - }
> + if (set_new_anon_name)
> + return replace_anon_vma_name(vma, anon_name);
>
> return 0;
> }
> @@ -1313,7 +1314,6 @@ static int madvise_vma_behavior(struct madvise_behavior *madv_behavior)
> int behavior = madv_behavior->behavior;
> struct vm_area_struct *vma = madv_behavior->vma;
> vm_flags_t new_flags = vma->vm_flags;
> - bool set_new_anon_name = behavior == __MADV_SET_ANON_VMA_NAME;
> struct madvise_behavior_range *range = &madv_behavior->range;
> int error;
>
> @@ -1403,18 +1403,7 @@ static int madvise_vma_behavior(struct madvise_behavior *madv_behavior)
> /* This is a write operation.*/
> VM_WARN_ON_ONCE(madv_behavior->lock_mode != MADVISE_MMAP_WRITE_LOCK);
>
> - /*
> - * madvise_update_vma() might cause a VMA merge which could put an
> - * anon_vma_name, so we must hold an additional reference on the
> - * anon_vma_name so it doesn't disappear from under us.
> - */
> - if (!set_new_anon_name) {
> - madv_behavior->anon_name = anon_vma_name(vma);
> - anon_vma_name_get(madv_behavior->anon_name);
> - }
> error = madvise_update_vma(new_flags, madv_behavior);
> - if (!set_new_anon_name)
> - anon_vma_name_put(madv_behavior->anon_name);
> out:
> /*
> * madvise() returns EAGAIN if kernel resources, such as
>
> --
> 2.50.0
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 1/4] mm, madvise: simplify anon_name handling
2025-06-24 15:26 ` Lorenzo Stoakes
@ 2025-06-24 16:42 ` Vlastimil Babka
0 siblings, 0 replies; 25+ messages in thread
From: Vlastimil Babka @ 2025-06-24 16:42 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Andrew Morton, Liam R. Howlett, David Hildenbrand, Jann Horn,
Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Colin Cross,
linux-mm, linux-kernel
On 6/24/25 17:26, Lorenzo Stoakes wrote:
> On Tue, Jun 24, 2025 at 03:03:45PM +0200, Vlastimil Babka wrote:
>> Since the introduction in 9a10064f5625 ("mm: add a field to store names
>> for private anonymous memory") the code to set anon_name on a vma has
>> been using madvise_update_vma() to call replace_anon_vma_name(). Since
>> the former is called also by a number of other madvise behaviours that
>> do not set a new anon_name, they have been passing the existing
>> anon_name of the vma to make replace_vma_anon_name() a no-op.
>>
>> This is rather wasteful as it needs anon_vma_name_eq() to determine the
>> no-op situations, and checks for when replace_vma_anon_name() is allowed
>> (the vma is anon/shmem) duplicate the checks already done earlier in
>> madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm:
>> fix use-after-free when anon vma name is used after vma is freed")
>> adding anon_name refcount get/put operations exactly to the cases that
>> actually do not change anon_name - just so the replace_vma_anon_name()
>> can keep safely determining it has nothing to do.
>>
>> The recent madvise cleanups made this suboptimal handling very obvious,
>> but happily also allow for an easy fix. madvise_update_vma() now has the
>> complete information whether it's been called to set a new anon_name, so
>> stop passing it the existing vma's name and doing the refcount get/put
>> in its only caller madvise_vma_behavior().
>>
>> In madvise_update_vma() itself, limit calling of replace_anon_vma_name()
>> only to cases where we are setting a new name, otherwise we know it's a
>> no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and
>> can remove the duplicate checks for vma being anon/shmem that were done
>> already in madvise_vma_behavior().
>>
>> Additionally, by using vma_modify_flags() when not modifying the
>> anon_name, avoid explicitly passing the existing vma's anon_name and
>> storing a pointer to it in struct madv_behavior or a local variable.
>> This prevents the danger of accessing a freed anon_name after vma
>> merging, previously fixed by commit 942341dcc574.
>>
>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
>
> Cheers, LGTM so:
>
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>
> I made sure to do some stress-ng --madvise testing with this series (+ my recent
> fix) applied :P all good.
Thanks for that!
>> ---
>> mm/madvise.c | 37 +++++++++++++------------------------
>> 1 file changed, 13 insertions(+), 24 deletions(-)
>>
>> diff --git a/mm/madvise.c b/mm/madvise.c
>> index 4491bf080f55d6d1aeffb2ff0b8fdd28904af950..fca0e9b3e844ad766e83ac04cc0d7f4099c74005 100644
>> --- a/mm/madvise.c
>> +++ b/mm/madvise.c
>> @@ -176,25 +176,29 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
>> }
>> #endif /* CONFIG_ANON_VMA_NAME */
>> /*
>> - * Update the vm_flags on region of a vma, splitting it or merging it as
>> - * necessary. Must be called with mmap_lock held for writing;
>> - * Caller should ensure anon_name stability by raising its refcount even when
>> - * anon_name belongs to a valid vma because this function might free that vma.
>> + * Update the vm_flags and/or anon_name on region of a vma, splitting it or
>> + * merging it as necessary. Must be called with mmap_lock held for writing.
>> */
>> static int madvise_update_vma(vm_flags_t new_flags,
>> struct madvise_behavior *madv_behavior)
>> {
>> - int error;
>> struct vm_area_struct *vma = madv_behavior->vma;
>> struct madvise_behavior_range *range = &madv_behavior->range;
>> struct anon_vma_name *anon_name = madv_behavior->anon_name;
>> + bool set_new_anon_name = madv_behavior->behavior == __MADV_SET_ANON_VMA_NAME;
>> VMA_ITERATOR(vmi, madv_behavior->mm, range->start);
>>
>> - if (new_flags == vma->vm_flags && anon_vma_name_eq(anon_vma_name(vma), anon_name))
>> + if (new_flags == vma->vm_flags && (!set_new_anon_name ||
>> + anon_vma_name_eq(anon_vma_name(vma), anon_name)))
>> return 0;
>>
>> - vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma,
>> + if (set_new_anon_name)
>> + vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma,
>> range->start, range->end, new_flags, anon_name);
>
> I will do a follow up (doesn't really belong here I'd say as involves change to
> modify code) that makes this vma_modify_name() or vma_modify_anon_name() as this
> is the only caller and we don't care about flags here :)
Oh right, good idea!
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 1/4] mm, madvise: simplify anon_name handling
2025-06-24 13:03 ` [PATCH v2 1/4] mm, madvise: simplify anon_name handling Vlastimil Babka
2025-06-24 13:58 ` David Hildenbrand
2025-06-24 15:26 ` Lorenzo Stoakes
@ 2025-06-24 17:01 ` Vlastimil Babka
2 siblings, 0 replies; 25+ messages in thread
From: Vlastimil Babka @ 2025-06-24 17:01 UTC (permalink / raw)
To: Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
David Hildenbrand, Jann Horn, Mike Rapoport, Suren Baghdasaryan,
Michal Hocko, Colin Cross
Cc: linux-mm, linux-kernel
On 6/24/25 15:03, Vlastimil Babka wrote:
> Since the introduction in 9a10064f5625 ("mm: add a field to store names
> for private anonymous memory") the code to set anon_name on a vma has
> been using madvise_update_vma() to call replace_anon_vma_name(). Since
> the former is called also by a number of other madvise behaviours that
> do not set a new anon_name, they have been passing the existing
> anon_name of the vma to make replace_vma_anon_name() a no-op.
>
> This is rather wasteful as it needs anon_vma_name_eq() to determine the
> no-op situations, and checks for when replace_vma_anon_name() is allowed
> (the vma is anon/shmem) duplicate the checks already done earlier in
> madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm:
> fix use-after-free when anon vma name is used after vma is freed")
> adding anon_name refcount get/put operations exactly to the cases that
> actually do not change anon_name - just so the replace_vma_anon_name()
> can keep safely determining it has nothing to do.
>
> The recent madvise cleanups made this suboptimal handling very obvious,
> but happily also allow for an easy fix. madvise_update_vma() now has the
> complete information whether it's been called to set a new anon_name, so
> stop passing it the existing vma's name and doing the refcount get/put
> in its only caller madvise_vma_behavior().
>
> In madvise_update_vma() itself, limit calling of replace_anon_vma_name()
> only to cases where we are setting a new name, otherwise we know it's a
> no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and
> can remove the duplicate checks for vma being anon/shmem that were done
> already in madvise_vma_behavior().
>
> Additionally, by using vma_modify_flags() when not modifying the
> anon_name, avoid explicitly passing the existing vma's anon_name and
> storing a pointer to it in struct madv_behavior or a local variable.
> This prevents the danger of accessing a freed anon_name after vma
> merging, previously fixed by commit 942341dcc574.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
With fixed up function names as pointed out by Suren:
Since the introduction in 9a10064f5625 ("mm: add a field to store names
for private anonymous memory") the code to set anon_name on a vma has
been using madvise_update_vma() to call replace_anon_vma_name(). Since
the former is called also by a number of other madvise behaviours that
do not set a new anon_name, they have been passing the existing
anon_name of the vma to make replace_anon_vma_name() a no-op.
This is rather wasteful as it needs anon_vma_name_eq() to determine the
no-op situations, and checks for when replace_anon_vma_name() is allowed
(the vma is anon/shmem) duplicate the checks already done earlier in
madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm:
fix use-after-free when anon vma name is used after vma is freed")
adding anon_name refcount get/put operations exactly to the cases that
actually do not change anon_name - just so the replace_anon_vma_name()
can keep safely determining it has nothing to do.
The recent madvise cleanups made this suboptimal handling very obvious,
but happily also allow for an easy fix. madvise_update_vma() now has the
complete information whether it's been called to set a new anon_name, so
stop passing it the existing vma's name and doing the refcount get/put
in its only caller madvise_vma_behavior().
In madvise_update_vma() itself, limit calling of replace_anon_vma_name()
only to cases where we are setting a new name, otherwise we know it's a
no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and
can remove the duplicate checks for vma being anon/shmem that were done
already in madvise_vma_behavior().
Additionally, by using vma_modify_flags() when not modifying the
anon_name, avoid explicitly passing the existing vma's anon_name and
storing a pointer to it in struct madv_behavior or a local variable.
This prevents the danger of accessing a freed anon_name after vma
merging, previously fixed by commit 942341dcc574.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v2 2/4] mm, madvise: extract mm code from prctl_set_vma() to mm/madvise.c
2025-06-24 13:03 [PATCH v2 0/4] madvise anon_name cleanups Vlastimil Babka
2025-06-24 13:03 ` [PATCH v2 1/4] mm, madvise: simplify anon_name handling Vlastimil Babka
@ 2025-06-24 13:03 ` Vlastimil Babka
2025-06-24 14:04 ` David Hildenbrand
2025-06-24 15:28 ` Lorenzo Stoakes
2025-06-24 13:03 ` [PATCH v2 3/4] mm, madvise: move madvise_set_anon_name() down the file Vlastimil Babka
2025-06-24 13:03 ` [PATCH v2 4/4] mm, madvise: use standard madvise locking in madvise_set_anon_name() Vlastimil Babka
3 siblings, 2 replies; 25+ messages in thread
From: Vlastimil Babka @ 2025-06-24 13:03 UTC (permalink / raw)
To: Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
David Hildenbrand, Jann Horn, Mike Rapoport, Suren Baghdasaryan,
Michal Hocko, Colin Cross
Cc: linux-mm, linux-kernel, Vlastimil Babka
Setting anon_name is done via madvise_set_anon_name() and behaves a lot
of like other madvise operations. However, apparently because madvise()
has lacked the 4th argument and prctl() not, the userspace entry point
has been implemented via prctl(PR_SET_VMA, ...) and handled first by
prctl_set_vma().
Currently prctl_set_vma() lives in kernel/sys.c but setting the
vma->anon_name is mm-specific code so extract it to a new
set_anon_vma_name() function under mm. mm/madvise.c seems to be the most
straightforward place as that's where madvise_set_anon_name() lives.
Stop declaring the latter in mm.h and instead declare
set_anon_vma_name().
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
include/linux/mm.h | 14 +++++++-------
kernel/sys.c | 50 +-------------------------------------------------
mm/madvise.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++--
3 files changed, 58 insertions(+), 58 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0e0549f3d681f6c7a78e8dfa341a810e5a8f96c1..ef40f68c1183d4c95016575a4ee0171e12df9ba4 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4059,14 +4059,14 @@ unsigned long wp_shared_mapping_range(struct address_space *mapping,
#endif
#ifdef CONFIG_ANON_VMA_NAME
-int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
- unsigned long len_in,
- struct anon_vma_name *anon_name);
+int set_anon_vma_name(unsigned long addr, unsigned long size,
+ const char __user *uname);
#else
-static inline int
-madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
- unsigned long len_in, struct anon_vma_name *anon_name) {
- return 0;
+static inline
+int set_anon_vma_name(unsigned long addr, unsigned long size,
+ const char __user *uname)
+{
+ return -EINVAL;
}
#endif
diff --git a/kernel/sys.c b/kernel/sys.c
index adc0de0aa364aebb23999f621717a5d32599921c..b153fb345ada28ea1a33386a32bcce9cb1b23475 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2343,54 +2343,14 @@ int __weak arch_lock_shadow_stack_status(struct task_struct *t, unsigned long st
#define PR_IO_FLUSHER (PF_MEMALLOC_NOIO | PF_LOCAL_THROTTLE)
-#ifdef CONFIG_ANON_VMA_NAME
-
-#define ANON_VMA_NAME_MAX_LEN 80
-#define ANON_VMA_NAME_INVALID_CHARS "\\`$[]"
-
-static inline bool is_valid_name_char(char ch)
-{
- /* printable ascii characters, excluding ANON_VMA_NAME_INVALID_CHARS */
- return ch > 0x1f && ch < 0x7f &&
- !strchr(ANON_VMA_NAME_INVALID_CHARS, ch);
-}
-
static int prctl_set_vma(unsigned long opt, unsigned long addr,
unsigned long size, unsigned long arg)
{
- struct mm_struct *mm = current->mm;
- const char __user *uname;
- struct anon_vma_name *anon_name = NULL;
int error;
switch (opt) {
case PR_SET_VMA_ANON_NAME:
- uname = (const char __user *)arg;
- if (uname) {
- char *name, *pch;
-
- name = strndup_user(uname, ANON_VMA_NAME_MAX_LEN);
- if (IS_ERR(name))
- return PTR_ERR(name);
-
- for (pch = name; *pch != '\0'; pch++) {
- if (!is_valid_name_char(*pch)) {
- kfree(name);
- return -EINVAL;
- }
- }
- /* anon_vma has its own copy */
- anon_name = anon_vma_name_alloc(name);
- kfree(name);
- if (!anon_name)
- return -ENOMEM;
-
- }
-
- mmap_write_lock(mm);
- error = madvise_set_anon_name(mm, addr, size, anon_name);
- mmap_write_unlock(mm);
- anon_vma_name_put(anon_name);
+ error = set_anon_vma_name(addr, size, (const char __user *)arg);
break;
default:
error = -EINVAL;
@@ -2399,14 +2359,6 @@ static int prctl_set_vma(unsigned long opt, unsigned long addr,
return error;
}
-#else /* CONFIG_ANON_VMA_NAME */
-static int prctl_set_vma(unsigned long opt, unsigned long start,
- unsigned long size, unsigned long arg)
-{
- return -EINVAL;
-}
-#endif /* CONFIG_ANON_VMA_NAME */
-
static inline unsigned long get_current_mdwe(void)
{
unsigned long ret = 0;
diff --git a/mm/madvise.c b/mm/madvise.c
index fca0e9b3e844ad766e83ac04cc0d7f4099c74005..7e8819b5e9a0f183213ffe19d7e52bd5fda5f49d 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -134,8 +134,8 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
return 0;
}
-int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
- unsigned long len_in, struct anon_vma_name *anon_name)
+static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
+ unsigned long len_in, struct anon_vma_name *anon_name)
{
unsigned long end;
unsigned long len;
@@ -2096,3 +2096,51 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
out:
return ret;
}
+
+#ifdef CONFIG_ANON_VMA_NAME
+
+#define ANON_VMA_NAME_MAX_LEN 80
+#define ANON_VMA_NAME_INVALID_CHARS "\\`$[]"
+
+static inline bool is_valid_name_char(char ch)
+{
+ /* printable ascii characters, excluding ANON_VMA_NAME_INVALID_CHARS */
+ return ch > 0x1f && ch < 0x7f &&
+ !strchr(ANON_VMA_NAME_INVALID_CHARS, ch);
+}
+
+int set_anon_vma_name(unsigned long addr, unsigned long size,
+ const char __user *uname)
+{
+ struct anon_vma_name *anon_name = NULL;
+ struct mm_struct *mm = current->mm;
+ int error;
+
+ if (uname) {
+ char *name, *pch;
+
+ name = strndup_user(uname, ANON_VMA_NAME_MAX_LEN);
+ if (IS_ERR(name))
+ return PTR_ERR(name);
+
+ for (pch = name; *pch != '\0'; pch++) {
+ if (!is_valid_name_char(*pch)) {
+ kfree(name);
+ return -EINVAL;
+ }
+ }
+ /* anon_vma has its own copy */
+ anon_name = anon_vma_name_alloc(name);
+ kfree(name);
+ if (!anon_name)
+ return -ENOMEM;
+ }
+
+ mmap_write_lock(mm);
+ error = madvise_set_anon_name(mm, addr, size, anon_name);
+ mmap_write_unlock(mm);
+ anon_vma_name_put(anon_name);
+
+ return error;
+}
+#endif
--
2.50.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v2 2/4] mm, madvise: extract mm code from prctl_set_vma() to mm/madvise.c
2025-06-24 13:03 ` [PATCH v2 2/4] mm, madvise: extract mm code from prctl_set_vma() to mm/madvise.c Vlastimil Babka
@ 2025-06-24 14:04 ` David Hildenbrand
2025-06-24 14:31 ` Suren Baghdasaryan
2025-06-24 15:28 ` Lorenzo Stoakes
1 sibling, 1 reply; 25+ messages in thread
From: David Hildenbrand @ 2025-06-24 14:04 UTC (permalink / raw)
To: Vlastimil Babka, Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
Jann Horn, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
Colin Cross
Cc: linux-mm, linux-kernel
On 24.06.25 15:03, Vlastimil Babka wrote:
> Setting anon_name is done via madvise_set_anon_name() and behaves a lot
> of like other madvise operations. However, apparently because madvise()
> has lacked the 4th argument and prctl() not, the userspace entry point
> has been implemented via prctl(PR_SET_VMA, ...) and handled first by
> prctl_set_vma().
>
> Currently prctl_set_vma() lives in kernel/sys.c but setting the
> vma->anon_name is mm-specific code so extract it to a new
> set_anon_vma_name() function under mm. mm/madvise.c seems to be the most
> straightforward place as that's where madvise_set_anon_name() lives.
> Stop declaring the latter in mm.h and instead declare
> set_anon_vma_name().
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
> include/linux/mm.h | 14 +++++++-------
> kernel/sys.c | 50 +-------------------------------------------------
> mm/madvise.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++--
> 3 files changed, 58 insertions(+), 58 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 0e0549f3d681f6c7a78e8dfa341a810e5a8f96c1..ef40f68c1183d4c95016575a4ee0171e12df9ba4 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -4059,14 +4059,14 @@ unsigned long wp_shared_mapping_range(struct address_space *mapping,
> #endif
>
> #ifdef CONFIG_ANON_VMA_NAME
> -int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> - unsigned long len_in,
> - struct anon_vma_name *anon_name);
> +int set_anon_vma_name(unsigned long addr, unsigned long size,
> + const char __user *uname);
> #else
> -static inline int
> -madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> - unsigned long len_in, struct anon_vma_name *anon_name) {
> - return 0;
> +static inline
> +int set_anon_vma_name(unsigned long addr, unsigned long size,
> + const char __user *uname)
> +{
> + return -EINVAL;
> }
> #endif
>
> diff --git a/kernel/sys.c b/kernel/sys.c
> index adc0de0aa364aebb23999f621717a5d32599921c..b153fb345ada28ea1a33386a32bcce9cb1b23475 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -2343,54 +2343,14 @@ int __weak arch_lock_shadow_stack_status(struct task_struct *t, unsigned long st
>
> #define PR_IO_FLUSHER (PF_MEMALLOC_NOIO | PF_LOCAL_THROTTLE)
>
> -#ifdef CONFIG_ANON_VMA_NAME
> -
> -#define ANON_VMA_NAME_MAX_LEN 80
> -#define ANON_VMA_NAME_INVALID_CHARS "\\`$[]"
> -
> -static inline bool is_valid_name_char(char ch)
> -{
> - /* printable ascii characters, excluding ANON_VMA_NAME_INVALID_CHARS */
> - return ch > 0x1f && ch < 0x7f &&
> - !strchr(ANON_VMA_NAME_INVALID_CHARS, ch);
> -}
> -
> static int prctl_set_vma(unsigned long opt, unsigned long addr,
> unsigned long size, unsigned long arg)
> {
> - struct mm_struct *mm = current->mm;
> - const char __user *uname;
> - struct anon_vma_name *anon_name = NULL;
> int error;
>
> switch (opt) {
> case PR_SET_VMA_ANON_NAME:
> - uname = (const char __user *)arg;
> - if (uname) {
> - char *name, *pch;
> -
> - name = strndup_user(uname, ANON_VMA_NAME_MAX_LEN);
> - if (IS_ERR(name))
> - return PTR_ERR(name);
> -
> - for (pch = name; *pch != '\0'; pch++) {
> - if (!is_valid_name_char(*pch)) {
> - kfree(name);
> - return -EINVAL;
> - }
> - }
> - /* anon_vma has its own copy */
> - anon_name = anon_vma_name_alloc(name);
> - kfree(name);
> - if (!anon_name)
> - return -ENOMEM;
> -
> - }
> -
> - mmap_write_lock(mm);
> - error = madvise_set_anon_name(mm, addr, size, anon_name);
> - mmap_write_unlock(mm);
> - anon_vma_name_put(anon_name);
> + error = set_anon_vma_name(addr, size, (const char __user *)arg);
At first I thought whether passing current->mm as an argument might make
it clearer on what we actually operate. But then, "anon_vma" might give
a good hint.
Acked-by: David Hildenbrand <david@redhat.com>
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 2/4] mm, madvise: extract mm code from prctl_set_vma() to mm/madvise.c
2025-06-24 14:04 ` David Hildenbrand
@ 2025-06-24 14:31 ` Suren Baghdasaryan
0 siblings, 0 replies; 25+ messages in thread
From: Suren Baghdasaryan @ 2025-06-24 14:31 UTC (permalink / raw)
To: David Hildenbrand
Cc: Vlastimil Babka, Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
Jann Horn, Mike Rapoport, Michal Hocko, Colin Cross, linux-mm,
linux-kernel
On Tue, Jun 24, 2025 at 7:04 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 24.06.25 15:03, Vlastimil Babka wrote:
> > Setting anon_name is done via madvise_set_anon_name() and behaves a lot
> > of like other madvise operations. However, apparently because madvise()
> > has lacked the 4th argument and prctl() not, the userspace entry point
> > has been implemented via prctl(PR_SET_VMA, ...) and handled first by
> > prctl_set_vma().
> >
> > Currently prctl_set_vma() lives in kernel/sys.c but setting the
> > vma->anon_name is mm-specific code so extract it to a new
> > set_anon_vma_name() function under mm. mm/madvise.c seems to be the most
> > straightforward place as that's where madvise_set_anon_name() lives.
> > Stop declaring the latter in mm.h and instead declare
> > set_anon_vma_name().
> >
> > Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> > ---
> > include/linux/mm.h | 14 +++++++-------
> > kernel/sys.c | 50 +-------------------------------------------------
> > mm/madvise.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++--
> > 3 files changed, 58 insertions(+), 58 deletions(-)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 0e0549f3d681f6c7a78e8dfa341a810e5a8f96c1..ef40f68c1183d4c95016575a4ee0171e12df9ba4 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -4059,14 +4059,14 @@ unsigned long wp_shared_mapping_range(struct address_space *mapping,
> > #endif
> >
> > #ifdef CONFIG_ANON_VMA_NAME
> > -int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> > - unsigned long len_in,
> > - struct anon_vma_name *anon_name);
> > +int set_anon_vma_name(unsigned long addr, unsigned long size,
> > + const char __user *uname);
> > #else
> > -static inline int
> > -madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> > - unsigned long len_in, struct anon_vma_name *anon_name) {
> > - return 0;
> > +static inline
> > +int set_anon_vma_name(unsigned long addr, unsigned long size,
> > + const char __user *uname)
> > +{
> > + return -EINVAL;
> > }
> > #endif
> >
> > diff --git a/kernel/sys.c b/kernel/sys.c
> > index adc0de0aa364aebb23999f621717a5d32599921c..b153fb345ada28ea1a33386a32bcce9cb1b23475 100644
> > --- a/kernel/sys.c
> > +++ b/kernel/sys.c
> > @@ -2343,54 +2343,14 @@ int __weak arch_lock_shadow_stack_status(struct task_struct *t, unsigned long st
> >
> > #define PR_IO_FLUSHER (PF_MEMALLOC_NOIO | PF_LOCAL_THROTTLE)
> >
> > -#ifdef CONFIG_ANON_VMA_NAME
> > -
> > -#define ANON_VMA_NAME_MAX_LEN 80
> > -#define ANON_VMA_NAME_INVALID_CHARS "\\`$[]"
> > -
> > -static inline bool is_valid_name_char(char ch)
> > -{
> > - /* printable ascii characters, excluding ANON_VMA_NAME_INVALID_CHARS */
> > - return ch > 0x1f && ch < 0x7f &&
> > - !strchr(ANON_VMA_NAME_INVALID_CHARS, ch);
> > -}
> > -
> > static int prctl_set_vma(unsigned long opt, unsigned long addr,
> > unsigned long size, unsigned long arg)
> > {
> > - struct mm_struct *mm = current->mm;
> > - const char __user *uname;
> > - struct anon_vma_name *anon_name = NULL;
> > int error;
> >
> > switch (opt) {
> > case PR_SET_VMA_ANON_NAME:
> > - uname = (const char __user *)arg;
> > - if (uname) {
> > - char *name, *pch;
> > -
> > - name = strndup_user(uname, ANON_VMA_NAME_MAX_LEN);
> > - if (IS_ERR(name))
> > - return PTR_ERR(name);
> > -
> > - for (pch = name; *pch != '\0'; pch++) {
> > - if (!is_valid_name_char(*pch)) {
> > - kfree(name);
> > - return -EINVAL;
> > - }
> > - }
> > - /* anon_vma has its own copy */
> > - anon_name = anon_vma_name_alloc(name);
> > - kfree(name);
> > - if (!anon_name)
> > - return -ENOMEM;
> > -
> > - }
> > -
> > - mmap_write_lock(mm);
> > - error = madvise_set_anon_name(mm, addr, size, anon_name);
> > - mmap_write_unlock(mm);
> > - anon_vma_name_put(anon_name);
> > + error = set_anon_vma_name(addr, size, (const char __user *)arg);
>
> At first I thought whether passing current->mm as an argument might make
> it clearer on what we actually operate. But then, "anon_vma" might give
> a good hint.
>
> Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
>
> --
> Cheers,
>
> David / dhildenb
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 2/4] mm, madvise: extract mm code from prctl_set_vma() to mm/madvise.c
2025-06-24 13:03 ` [PATCH v2 2/4] mm, madvise: extract mm code from prctl_set_vma() to mm/madvise.c Vlastimil Babka
2025-06-24 14:04 ` David Hildenbrand
@ 2025-06-24 15:28 ` Lorenzo Stoakes
1 sibling, 0 replies; 25+ messages in thread
From: Lorenzo Stoakes @ 2025-06-24 15:28 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, Liam R. Howlett, David Hildenbrand, Jann Horn,
Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Colin Cross,
linux-mm, linux-kernel
On Tue, Jun 24, 2025 at 03:03:46PM +0200, Vlastimil Babka wrote:
> Setting anon_name is done via madvise_set_anon_name() and behaves a lot
> of like other madvise operations. However, apparently because madvise()
> has lacked the 4th argument and prctl() not, the userspace entry point
> has been implemented via prctl(PR_SET_VMA, ...) and handled first by
> prctl_set_vma().
>
> Currently prctl_set_vma() lives in kernel/sys.c but setting the
> vma->anon_name is mm-specific code so extract it to a new
> set_anon_vma_name() function under mm. mm/madvise.c seems to be the most
> straightforward place as that's where madvise_set_anon_name() lives.
> Stop declaring the latter in mm.h and instead declare
> set_anon_vma_name().
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
LGTM, so:
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Thanks for doing this! :)
> ---
> include/linux/mm.h | 14 +++++++-------
> kernel/sys.c | 50 +-------------------------------------------------
> mm/madvise.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++--
> 3 files changed, 58 insertions(+), 58 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 0e0549f3d681f6c7a78e8dfa341a810e5a8f96c1..ef40f68c1183d4c95016575a4ee0171e12df9ba4 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -4059,14 +4059,14 @@ unsigned long wp_shared_mapping_range(struct address_space *mapping,
> #endif
>
> #ifdef CONFIG_ANON_VMA_NAME
> -int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> - unsigned long len_in,
> - struct anon_vma_name *anon_name);
> +int set_anon_vma_name(unsigned long addr, unsigned long size,
> + const char __user *uname);
> #else
> -static inline int
> -madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> - unsigned long len_in, struct anon_vma_name *anon_name) {
> - return 0;
> +static inline
> +int set_anon_vma_name(unsigned long addr, unsigned long size,
> + const char __user *uname)
> +{
> + return -EINVAL;
> }
> #endif
>
> diff --git a/kernel/sys.c b/kernel/sys.c
> index adc0de0aa364aebb23999f621717a5d32599921c..b153fb345ada28ea1a33386a32bcce9cb1b23475 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -2343,54 +2343,14 @@ int __weak arch_lock_shadow_stack_status(struct task_struct *t, unsigned long st
>
> #define PR_IO_FLUSHER (PF_MEMALLOC_NOIO | PF_LOCAL_THROTTLE)
>
> -#ifdef CONFIG_ANON_VMA_NAME
> -
> -#define ANON_VMA_NAME_MAX_LEN 80
> -#define ANON_VMA_NAME_INVALID_CHARS "\\`$[]"
> -
> -static inline bool is_valid_name_char(char ch)
> -{
> - /* printable ascii characters, excluding ANON_VMA_NAME_INVALID_CHARS */
> - return ch > 0x1f && ch < 0x7f &&
> - !strchr(ANON_VMA_NAME_INVALID_CHARS, ch);
> -}
> -
> static int prctl_set_vma(unsigned long opt, unsigned long addr,
> unsigned long size, unsigned long arg)
> {
> - struct mm_struct *mm = current->mm;
> - const char __user *uname;
> - struct anon_vma_name *anon_name = NULL;
> int error;
>
> switch (opt) {
> case PR_SET_VMA_ANON_NAME:
> - uname = (const char __user *)arg;
> - if (uname) {
> - char *name, *pch;
> -
> - name = strndup_user(uname, ANON_VMA_NAME_MAX_LEN);
> - if (IS_ERR(name))
> - return PTR_ERR(name);
> -
> - for (pch = name; *pch != '\0'; pch++) {
> - if (!is_valid_name_char(*pch)) {
> - kfree(name);
> - return -EINVAL;
> - }
> - }
> - /* anon_vma has its own copy */
> - anon_name = anon_vma_name_alloc(name);
> - kfree(name);
> - if (!anon_name)
> - return -ENOMEM;
> -
> - }
> -
> - mmap_write_lock(mm);
> - error = madvise_set_anon_name(mm, addr, size, anon_name);
> - mmap_write_unlock(mm);
> - anon_vma_name_put(anon_name);
> + error = set_anon_vma_name(addr, size, (const char __user *)arg);
> break;
> default:
> error = -EINVAL;
> @@ -2399,14 +2359,6 @@ static int prctl_set_vma(unsigned long opt, unsigned long addr,
> return error;
> }
>
> -#else /* CONFIG_ANON_VMA_NAME */
> -static int prctl_set_vma(unsigned long opt, unsigned long start,
> - unsigned long size, unsigned long arg)
> -{
> - return -EINVAL;
> -}
> -#endif /* CONFIG_ANON_VMA_NAME */
> -
> static inline unsigned long get_current_mdwe(void)
> {
> unsigned long ret = 0;
> diff --git a/mm/madvise.c b/mm/madvise.c
> index fca0e9b3e844ad766e83ac04cc0d7f4099c74005..7e8819b5e9a0f183213ffe19d7e52bd5fda5f49d 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -134,8 +134,8 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
> return 0;
> }
>
> -int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> - unsigned long len_in, struct anon_vma_name *anon_name)
> +static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> + unsigned long len_in, struct anon_vma_name *anon_name)
> {
> unsigned long end;
> unsigned long len;
> @@ -2096,3 +2096,51 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
> out:
> return ret;
> }
> +
> +#ifdef CONFIG_ANON_VMA_NAME
> +
> +#define ANON_VMA_NAME_MAX_LEN 80
> +#define ANON_VMA_NAME_INVALID_CHARS "\\`$[]"
> +
> +static inline bool is_valid_name_char(char ch)
> +{
> + /* printable ascii characters, excluding ANON_VMA_NAME_INVALID_CHARS */
> + return ch > 0x1f && ch < 0x7f &&
> + !strchr(ANON_VMA_NAME_INVALID_CHARS, ch);
> +}
> +
> +int set_anon_vma_name(unsigned long addr, unsigned long size,
> + const char __user *uname)
> +{
> + struct anon_vma_name *anon_name = NULL;
> + struct mm_struct *mm = current->mm;
> + int error;
> +
> + if (uname) {
> + char *name, *pch;
> +
> + name = strndup_user(uname, ANON_VMA_NAME_MAX_LEN);
> + if (IS_ERR(name))
> + return PTR_ERR(name);
> +
> + for (pch = name; *pch != '\0'; pch++) {
> + if (!is_valid_name_char(*pch)) {
> + kfree(name);
> + return -EINVAL;
> + }
> + }
> + /* anon_vma has its own copy */
> + anon_name = anon_vma_name_alloc(name);
> + kfree(name);
> + if (!anon_name)
> + return -ENOMEM;
> + }
> +
> + mmap_write_lock(mm);
> + error = madvise_set_anon_name(mm, addr, size, anon_name);
> + mmap_write_unlock(mm);
> + anon_vma_name_put(anon_name);
> +
> + return error;
> +}
> +#endif
>
> --
> 2.50.0
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v2 3/4] mm, madvise: move madvise_set_anon_name() down the file
2025-06-24 13:03 [PATCH v2 0/4] madvise anon_name cleanups Vlastimil Babka
2025-06-24 13:03 ` [PATCH v2 1/4] mm, madvise: simplify anon_name handling Vlastimil Babka
2025-06-24 13:03 ` [PATCH v2 2/4] mm, madvise: extract mm code from prctl_set_vma() to mm/madvise.c Vlastimil Babka
@ 2025-06-24 13:03 ` Vlastimil Babka
2025-06-24 14:05 ` David Hildenbrand
` (2 more replies)
2025-06-24 13:03 ` [PATCH v2 4/4] mm, madvise: use standard madvise locking in madvise_set_anon_name() Vlastimil Babka
3 siblings, 3 replies; 25+ messages in thread
From: Vlastimil Babka @ 2025-06-24 13:03 UTC (permalink / raw)
To: Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
David Hildenbrand, Jann Horn, Mike Rapoport, Suren Baghdasaryan,
Michal Hocko, Colin Cross
Cc: linux-mm, linux-kernel, Vlastimil Babka
Preparatory change so that we can use madvise_lock()/unlock() in the
function without forward declarations or more thorough shuffling.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
mm/madvise.c | 64 ++++++++++++++++++++++++++++++------------------------------
1 file changed, 32 insertions(+), 32 deletions(-)
diff --git a/mm/madvise.c b/mm/madvise.c
index 7e8819b5e9a0f183213ffe19d7e52bd5fda5f49d..cae064479cdf908707c45b941bd03d43d095eab6 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -133,38 +133,6 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
return 0;
}
-
-static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
- unsigned long len_in, struct anon_vma_name *anon_name)
-{
- unsigned long end;
- unsigned long len;
- struct madvise_behavior madv_behavior = {
- .mm = mm,
- .behavior = __MADV_SET_ANON_VMA_NAME,
- .lock_mode = MADVISE_MMAP_WRITE_LOCK,
- .anon_name = anon_name,
- };
-
- if (start & ~PAGE_MASK)
- return -EINVAL;
- len = (len_in + ~PAGE_MASK) & PAGE_MASK;
-
- /* Check to see whether len was rounded up from small -ve to zero */
- if (len_in && !len)
- return -EINVAL;
-
- end = start + len;
- if (end < start)
- return -EINVAL;
-
- if (end == start)
- return 0;
-
- madv_behavior.range.start = start;
- madv_behavior.range.end = end;
- return madvise_walk_vmas(&madv_behavior);
-}
#else /* CONFIG_ANON_VMA_NAME */
static int replace_anon_vma_name(struct vm_area_struct *vma,
struct anon_vma_name *anon_name)
@@ -2109,6 +2077,38 @@ static inline bool is_valid_name_char(char ch)
!strchr(ANON_VMA_NAME_INVALID_CHARS, ch);
}
+static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
+ unsigned long len_in, struct anon_vma_name *anon_name)
+{
+ unsigned long end;
+ unsigned long len;
+ struct madvise_behavior madv_behavior = {
+ .mm = mm,
+ .behavior = __MADV_SET_ANON_VMA_NAME,
+ .lock_mode = MADVISE_MMAP_WRITE_LOCK,
+ .anon_name = anon_name,
+ };
+
+ if (start & ~PAGE_MASK)
+ return -EINVAL;
+ len = (len_in + ~PAGE_MASK) & PAGE_MASK;
+
+ /* Check to see whether len was rounded up from small -ve to zero */
+ if (len_in && !len)
+ return -EINVAL;
+
+ end = start + len;
+ if (end < start)
+ return -EINVAL;
+
+ if (end == start)
+ return 0;
+
+ madv_behavior.range.start = start;
+ madv_behavior.range.end = end;
+ return madvise_walk_vmas(&madv_behavior);
+}
+
int set_anon_vma_name(unsigned long addr, unsigned long size,
const char __user *uname)
{
--
2.50.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v2 3/4] mm, madvise: move madvise_set_anon_name() down the file
2025-06-24 13:03 ` [PATCH v2 3/4] mm, madvise: move madvise_set_anon_name() down the file Vlastimil Babka
@ 2025-06-24 14:05 ` David Hildenbrand
2025-06-24 14:33 ` Suren Baghdasaryan
2025-06-24 16:46 ` Vlastimil Babka
2025-06-24 15:36 ` Lorenzo Stoakes
2025-06-24 17:02 ` Vlastimil Babka
2 siblings, 2 replies; 25+ messages in thread
From: David Hildenbrand @ 2025-06-24 14:05 UTC (permalink / raw)
To: Vlastimil Babka, Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
Jann Horn, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
Colin Cross
Cc: linux-mm, linux-kernel
On 24.06.25 15:03, Vlastimil Babka wrote:
> Preparatory change so that we can use madvise_lock()/unlock() in the
> function without forward declarations or more thorough shuffling.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
> mm/madvise.c | 64 ++++++++++++++++++++++++++++++------------------------------
> 1 file changed, 32 insertions(+), 32 deletions(-)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 7e8819b5e9a0f183213ffe19d7e52bd5fda5f49d..cae064479cdf908707c45b941bd03d43d095eab6 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -133,38 +133,6 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
>
> return 0;
> }
> -
> -static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> - unsigned long len_in, struct anon_vma_name *anon_name)
> -{
> - unsigned long end;
> - unsigned long len;
> - struct madvise_behavior madv_behavior = {
> - .mm = mm,
> - .behavior = __MADV_SET_ANON_VMA_NAME,
> - .lock_mode = MADVISE_MMAP_WRITE_LOCK,
> - .anon_name = anon_name,
> - };
> -
> - if (start & ~PAGE_MASK)
> - return -EINVAL;
> - len = (len_in + ~PAGE_MASK) & PAGE_MASK;
> -
> - /* Check to see whether len was rounded up from small -ve to zero */
> - if (len_in && !len)
> - return -EINVAL;
> -
> - end = start + len;
> - if (end < start)
> - return -EINVAL;
> -
> - if (end == start)
> - return 0;
> -
> - madv_behavior.range.start = start;
> - madv_behavior.range.end = end;
> - return madvise_walk_vmas(&madv_behavior);
> -}
> #else /* CONFIG_ANON_VMA_NAME */
> static int replace_anon_vma_name(struct vm_area_struct *vma,
> struct anon_vma_name *anon_name)
> @@ -2109,6 +2077,38 @@ static inline bool is_valid_name_char(char ch)
> !strchr(ANON_VMA_NAME_INVALID_CHARS, ch);
> }
>
> +static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> + unsigned long len_in, struct anon_vma_name *anon_name)
> +{
> + unsigned long end;
> + unsigned long len;
> + struct madvise_behavior madv_behavior = {
> + .mm = mm,
> + .behavior = __MADV_SET_ANON_VMA_NAME,
> + .lock_mode = MADVISE_MMAP_WRITE_LOCK,
> + .anon_name = anon_name,
> + };
> +
> + if (start & ~PAGE_MASK)
> + return -EINVAL;
> + len = (len_in + ~PAGE_MASK) & PAGE_MASK;
> +
> + /* Check to see whether len was rounded up from small -ve to zero */
> + if (len_in && !len)
> + return -EINVAL;
> +
> + end = start + len;
> + if (end < start)
> + return -EINVAL;
> +
> + if (end == start)
> + return 0;
> +
> + madv_behavior.range.start = start;
> + madv_behavior.range.end = end;
> + return madvise_walk_vmas(&madv_behavior);
> +}
> +
> int set_anon_vma_name(unsigned long addr, unsigned long size,
> const char __user *uname)
> {
>
Personally, I would squash that into #4, given that #4 is pretty small ;)
Acked-by: David Hildenbrand <david@redhat.com>
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 3/4] mm, madvise: move madvise_set_anon_name() down the file
2025-06-24 14:05 ` David Hildenbrand
@ 2025-06-24 14:33 ` Suren Baghdasaryan
2025-06-24 16:46 ` Vlastimil Babka
1 sibling, 0 replies; 25+ messages in thread
From: Suren Baghdasaryan @ 2025-06-24 14:33 UTC (permalink / raw)
To: David Hildenbrand
Cc: Vlastimil Babka, Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
Jann Horn, Mike Rapoport, Michal Hocko, Colin Cross, linux-mm,
linux-kernel
On Tue, Jun 24, 2025 at 7:05 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 24.06.25 15:03, Vlastimil Babka wrote:
> > Preparatory change so that we can use madvise_lock()/unlock() in the
> > function without forward declarations or more thorough shuffling.
If you respin the series please add a note that there is no functional
change here.
> >
> > Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> > ---
> > mm/madvise.c | 64 ++++++++++++++++++++++++++++++------------------------------
> > 1 file changed, 32 insertions(+), 32 deletions(-)
> >
> > diff --git a/mm/madvise.c b/mm/madvise.c
> > index 7e8819b5e9a0f183213ffe19d7e52bd5fda5f49d..cae064479cdf908707c45b941bd03d43d095eab6 100644
> > --- a/mm/madvise.c
> > +++ b/mm/madvise.c
> > @@ -133,38 +133,6 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
> >
> > return 0;
> > }
> > -
> > -static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> > - unsigned long len_in, struct anon_vma_name *anon_name)
> > -{
> > - unsigned long end;
> > - unsigned long len;
> > - struct madvise_behavior madv_behavior = {
> > - .mm = mm,
> > - .behavior = __MADV_SET_ANON_VMA_NAME,
> > - .lock_mode = MADVISE_MMAP_WRITE_LOCK,
> > - .anon_name = anon_name,
> > - };
> > -
> > - if (start & ~PAGE_MASK)
> > - return -EINVAL;
> > - len = (len_in + ~PAGE_MASK) & PAGE_MASK;
> > -
> > - /* Check to see whether len was rounded up from small -ve to zero */
> > - if (len_in && !len)
> > - return -EINVAL;
> > -
> > - end = start + len;
> > - if (end < start)
> > - return -EINVAL;
> > -
> > - if (end == start)
> > - return 0;
> > -
> > - madv_behavior.range.start = start;
> > - madv_behavior.range.end = end;
> > - return madvise_walk_vmas(&madv_behavior);
> > -}
> > #else /* CONFIG_ANON_VMA_NAME */
> > static int replace_anon_vma_name(struct vm_area_struct *vma,
> > struct anon_vma_name *anon_name)
> > @@ -2109,6 +2077,38 @@ static inline bool is_valid_name_char(char ch)
> > !strchr(ANON_VMA_NAME_INVALID_CHARS, ch);
> > }
> >
> > +static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> > + unsigned long len_in, struct anon_vma_name *anon_name)
> > +{
> > + unsigned long end;
> > + unsigned long len;
> > + struct madvise_behavior madv_behavior = {
> > + .mm = mm,
> > + .behavior = __MADV_SET_ANON_VMA_NAME,
> > + .lock_mode = MADVISE_MMAP_WRITE_LOCK,
> > + .anon_name = anon_name,
> > + };
> > +
> > + if (start & ~PAGE_MASK)
> > + return -EINVAL;
> > + len = (len_in + ~PAGE_MASK) & PAGE_MASK;
> > +
> > + /* Check to see whether len was rounded up from small -ve to zero */
> > + if (len_in && !len)
> > + return -EINVAL;
> > +
> > + end = start + len;
> > + if (end < start)
> > + return -EINVAL;
> > +
> > + if (end == start)
> > + return 0;
> > +
> > + madv_behavior.range.start = start;
> > + madv_behavior.range.end = end;
> > + return madvise_walk_vmas(&madv_behavior);
> > +}
> > +
> > int set_anon_vma_name(unsigned long addr, unsigned long size,
> > const char __user *uname)
> > {
> >
>
> Personally, I would squash that into #4, given that #4 is pretty small ;)
>
> Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
>
> --
> Cheers,
>
> David / dhildenb
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 3/4] mm, madvise: move madvise_set_anon_name() down the file
2025-06-24 14:05 ` David Hildenbrand
2025-06-24 14:33 ` Suren Baghdasaryan
@ 2025-06-24 16:46 ` Vlastimil Babka
1 sibling, 0 replies; 25+ messages in thread
From: Vlastimil Babka @ 2025-06-24 16:46 UTC (permalink / raw)
To: David Hildenbrand, Andrew Morton, Liam R. Howlett,
Lorenzo Stoakes, Jann Horn, Mike Rapoport, Suren Baghdasaryan,
Michal Hocko, Colin Cross
Cc: linux-mm, linux-kernel
On 6/24/25 16:05, David Hildenbrand wrote:
> On 24.06.25 15:03, Vlastimil Babka wrote:
>
> Personally, I would squash that into #4, given that #4 is pretty small ;)
>
> Acked-by: David Hildenbrand <david@redhat.com>
Right, the advantages of pure move commits is that git diff can colorize
them so it's immediately obvious, and git blame -C or -M can also recognize
pure moves and ignore them. Mixing moves with changes tends to break these
heuristics. But if Andrew wants to squash, feel free.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 3/4] mm, madvise: move madvise_set_anon_name() down the file
2025-06-24 13:03 ` [PATCH v2 3/4] mm, madvise: move madvise_set_anon_name() down the file Vlastimil Babka
2025-06-24 14:05 ` David Hildenbrand
@ 2025-06-24 15:36 ` Lorenzo Stoakes
2025-06-24 17:02 ` Vlastimil Babka
2 siblings, 0 replies; 25+ messages in thread
From: Lorenzo Stoakes @ 2025-06-24 15:36 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, Liam R. Howlett, David Hildenbrand, Jann Horn,
Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Colin Cross,
linux-mm, linux-kernel
On Tue, Jun 24, 2025 at 03:03:47PM +0200, Vlastimil Babka wrote:
> Preparatory change so that we can use madvise_lock()/unlock() in the
> function without forward declarations or more thorough shuffling.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Fine, but I think this is small enough of a move to be safely combined with 4/4
as David says.
Either way:
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> ---
> mm/madvise.c | 64 ++++++++++++++++++++++++++++++------------------------------
> 1 file changed, 32 insertions(+), 32 deletions(-)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 7e8819b5e9a0f183213ffe19d7e52bd5fda5f49d..cae064479cdf908707c45b941bd03d43d095eab6 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -133,38 +133,6 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
>
> return 0;
> }
> -
> -static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> - unsigned long len_in, struct anon_vma_name *anon_name)
> -{
> - unsigned long end;
> - unsigned long len;
> - struct madvise_behavior madv_behavior = {
> - .mm = mm,
> - .behavior = __MADV_SET_ANON_VMA_NAME,
> - .lock_mode = MADVISE_MMAP_WRITE_LOCK,
> - .anon_name = anon_name,
> - };
> -
> - if (start & ~PAGE_MASK)
> - return -EINVAL;
> - len = (len_in + ~PAGE_MASK) & PAGE_MASK;
> -
> - /* Check to see whether len was rounded up from small -ve to zero */
> - if (len_in && !len)
> - return -EINVAL;
> -
> - end = start + len;
> - if (end < start)
> - return -EINVAL;
> -
> - if (end == start)
> - return 0;
> -
> - madv_behavior.range.start = start;
> - madv_behavior.range.end = end;
> - return madvise_walk_vmas(&madv_behavior);
> -}
> #else /* CONFIG_ANON_VMA_NAME */
> static int replace_anon_vma_name(struct vm_area_struct *vma,
> struct anon_vma_name *anon_name)
> @@ -2109,6 +2077,38 @@ static inline bool is_valid_name_char(char ch)
> !strchr(ANON_VMA_NAME_INVALID_CHARS, ch);
> }
>
> +static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> + unsigned long len_in, struct anon_vma_name *anon_name)
> +{
> + unsigned long end;
> + unsigned long len;
> + struct madvise_behavior madv_behavior = {
> + .mm = mm,
> + .behavior = __MADV_SET_ANON_VMA_NAME,
> + .lock_mode = MADVISE_MMAP_WRITE_LOCK,
> + .anon_name = anon_name,
> + };
> +
> + if (start & ~PAGE_MASK)
> + return -EINVAL;
> + len = (len_in + ~PAGE_MASK) & PAGE_MASK;
> +
> + /* Check to see whether len was rounded up from small -ve to zero */
> + if (len_in && !len)
> + return -EINVAL;
> +
> + end = start + len;
> + if (end < start)
> + return -EINVAL;
> +
> + if (end == start)
> + return 0;
> +
> + madv_behavior.range.start = start;
> + madv_behavior.range.end = end;
> + return madvise_walk_vmas(&madv_behavior);
> +}
> +
> int set_anon_vma_name(unsigned long addr, unsigned long size,
> const char __user *uname)
> {
>
> --
> 2.50.0
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 3/4] mm, madvise: move madvise_set_anon_name() down the file
2025-06-24 13:03 ` [PATCH v2 3/4] mm, madvise: move madvise_set_anon_name() down the file Vlastimil Babka
2025-06-24 14:05 ` David Hildenbrand
2025-06-24 15:36 ` Lorenzo Stoakes
@ 2025-06-24 17:02 ` Vlastimil Babka
2 siblings, 0 replies; 25+ messages in thread
From: Vlastimil Babka @ 2025-06-24 17:02 UTC (permalink / raw)
To: Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
David Hildenbrand, Jann Horn, Mike Rapoport, Suren Baghdasaryan,
Michal Hocko, Colin Cross
Cc: linux-mm, linux-kernel
On 6/24/25 15:03, Vlastimil Babka wrote:
> Preparatory change so that we can use madvise_lock()/unlock() in the
> function without forward declarations or more thorough shuffling.
additional paragraph here:
No functional change. Move as a separate commit helps git heuristics to
detect it properly.
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v2 4/4] mm, madvise: use standard madvise locking in madvise_set_anon_name()
2025-06-24 13:03 [PATCH v2 0/4] madvise anon_name cleanups Vlastimil Babka
` (2 preceding siblings ...)
2025-06-24 13:03 ` [PATCH v2 3/4] mm, madvise: move madvise_set_anon_name() down the file Vlastimil Babka
@ 2025-06-24 13:03 ` Vlastimil Babka
2025-06-24 14:06 ` David Hildenbrand
` (2 more replies)
3 siblings, 3 replies; 25+ messages in thread
From: Vlastimil Babka @ 2025-06-24 13:03 UTC (permalink / raw)
To: Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
David Hildenbrand, Jann Horn, Mike Rapoport, Suren Baghdasaryan,
Michal Hocko, Colin Cross
Cc: linux-mm, linux-kernel, Vlastimil Babka
Use madvise_lock()/madvise_unlock() in madvise_set_anon_name() in the
same way as in do_madvise(). This narrows the lock scope a bit and
reuses existing functionality. get_lock_mode() already picks the correct
MADVISE_MMAP_WRITE_LOCK mode for __MADV_SET_ANON_VMA_NAME so we can just
remove the explicit assignment.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
mm/madvise.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/mm/madvise.c b/mm/madvise.c
index cae064479cdf908707c45b941bd03d43d095eab6..ee02ccd0315a146cdb3001cd189e03be9e48a2ea 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -2082,10 +2082,10 @@ static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
{
unsigned long end;
unsigned long len;
+ int error;
struct madvise_behavior madv_behavior = {
.mm = mm,
.behavior = __MADV_SET_ANON_VMA_NAME,
- .lock_mode = MADVISE_MMAP_WRITE_LOCK,
.anon_name = anon_name,
};
@@ -2106,7 +2106,14 @@ static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
madv_behavior.range.start = start;
madv_behavior.range.end = end;
- return madvise_walk_vmas(&madv_behavior);
+
+ error = madvise_lock(&madv_behavior);
+ if (error)
+ return error;
+ error = madvise_walk_vmas(&madv_behavior);
+ madvise_unlock(&madv_behavior);
+
+ return error;
}
int set_anon_vma_name(unsigned long addr, unsigned long size,
@@ -2136,9 +2143,7 @@ int set_anon_vma_name(unsigned long addr, unsigned long size,
return -ENOMEM;
}
- mmap_write_lock(mm);
error = madvise_set_anon_name(mm, addr, size, anon_name);
- mmap_write_unlock(mm);
anon_vma_name_put(anon_name);
return error;
--
2.50.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v2 4/4] mm, madvise: use standard madvise locking in madvise_set_anon_name()
2025-06-24 13:03 ` [PATCH v2 4/4] mm, madvise: use standard madvise locking in madvise_set_anon_name() Vlastimil Babka
@ 2025-06-24 14:06 ` David Hildenbrand
2025-06-24 14:35 ` Suren Baghdasaryan
2025-06-24 15:45 ` Lorenzo Stoakes
2025-06-24 17:03 ` Vlastimil Babka
2 siblings, 1 reply; 25+ messages in thread
From: David Hildenbrand @ 2025-06-24 14:06 UTC (permalink / raw)
To: Vlastimil Babka, Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
Jann Horn, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
Colin Cross
Cc: linux-mm, linux-kernel
On 24.06.25 15:03, Vlastimil Babka wrote:
> Use madvise_lock()/madvise_unlock() in madvise_set_anon_name() in the
> same way as in do_madvise(). This narrows the lock scope a bit and
> reuses existing functionality. get_lock_mode() already picks the correct
> MADVISE_MMAP_WRITE_LOCK mode for __MADV_SET_ANON_VMA_NAME so we can just
> remove the explicit assignment.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
Acked-by: David Hildenbrand <david@redhat.com>
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 4/4] mm, madvise: use standard madvise locking in madvise_set_anon_name()
2025-06-24 14:06 ` David Hildenbrand
@ 2025-06-24 14:35 ` Suren Baghdasaryan
0 siblings, 0 replies; 25+ messages in thread
From: Suren Baghdasaryan @ 2025-06-24 14:35 UTC (permalink / raw)
To: David Hildenbrand
Cc: Vlastimil Babka, Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
Jann Horn, Mike Rapoport, Michal Hocko, Colin Cross, linux-mm,
linux-kernel
On Tue, Jun 24, 2025 at 7:06 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 24.06.25 15:03, Vlastimil Babka wrote:
> > Use madvise_lock()/madvise_unlock() in madvise_set_anon_name() in the
> > same way as in do_madvise(). This narrows the lock scope a bit and
> > reuses existing functionality. get_lock_mode() already picks the correct
> > MADVISE_MMAP_WRITE_LOCK mode for __MADV_SET_ANON_VMA_NAME so we can just
> > remove the explicit assignment.
> >
> > Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> > ---
>
> Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
>
> --
> Cheers,
>
> David / dhildenb
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 4/4] mm, madvise: use standard madvise locking in madvise_set_anon_name()
2025-06-24 13:03 ` [PATCH v2 4/4] mm, madvise: use standard madvise locking in madvise_set_anon_name() Vlastimil Babka
2025-06-24 14:06 ` David Hildenbrand
@ 2025-06-24 15:45 ` Lorenzo Stoakes
2025-06-24 16:48 ` Vlastimil Babka
2025-06-24 17:03 ` Vlastimil Babka
2 siblings, 1 reply; 25+ messages in thread
From: Lorenzo Stoakes @ 2025-06-24 15:45 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, Liam R. Howlett, David Hildenbrand, Jann Horn,
Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Colin Cross,
linux-mm, linux-kernel
On Tue, Jun 24, 2025 at 03:03:48PM +0200, Vlastimil Babka wrote:
> Use madvise_lock()/madvise_unlock() in madvise_set_anon_name() in the
> same way as in do_madvise(). This narrows the lock scope a bit and
> reuses existing functionality. get_lock_mode() already picks the correct
> MADVISE_MMAP_WRITE_LOCK mode for __MADV_SET_ANON_VMA_NAME so we can just
> remove the explicit assignment.
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
This is a nice idea, hadn't thought of it :) aren't we making anon vma name
behave now :P
I mean this is a _minor_ difference in functionality in that we will now
use mmap_write_lock_killable() and could fail to lock on fatal signal
whereas before we unconditionally lock but I think that's fine, and
possibly even... desirable actually?
At any rate this is all nice, thanks again!
> ---
> mm/madvise.c | 13 +++++++++----
> 1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index cae064479cdf908707c45b941bd03d43d095eab6..ee02ccd0315a146cdb3001cd189e03be9e48a2ea 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -2082,10 +2082,10 @@ static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> {
> unsigned long end;
> unsigned long len;
> + int error;
> struct madvise_behavior madv_behavior = {
> .mm = mm,
> .behavior = __MADV_SET_ANON_VMA_NAME,
> - .lock_mode = MADVISE_MMAP_WRITE_LOCK,
> .anon_name = anon_name,
> };
>
> @@ -2106,7 +2106,14 @@ static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
>
> madv_behavior.range.start = start;
> madv_behavior.range.end = end;
> - return madvise_walk_vmas(&madv_behavior);
> +
> + error = madvise_lock(&madv_behavior);
> + if (error)
> + return error;
> + error = madvise_walk_vmas(&madv_behavior);
> + madvise_unlock(&madv_behavior);
> +
> + return error;
> }
>
> int set_anon_vma_name(unsigned long addr, unsigned long size,
> @@ -2136,9 +2143,7 @@ int set_anon_vma_name(unsigned long addr, unsigned long size,
> return -ENOMEM;
> }
>
> - mmap_write_lock(mm);
> error = madvise_set_anon_name(mm, addr, size, anon_name);
> - mmap_write_unlock(mm);
> anon_vma_name_put(anon_name);
>
> return error;
>
> --
> 2.50.0
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 4/4] mm, madvise: use standard madvise locking in madvise_set_anon_name()
2025-06-24 15:45 ` Lorenzo Stoakes
@ 2025-06-24 16:48 ` Vlastimil Babka
0 siblings, 0 replies; 25+ messages in thread
From: Vlastimil Babka @ 2025-06-24 16:48 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Andrew Morton, Liam R. Howlett, David Hildenbrand, Jann Horn,
Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Colin Cross,
linux-mm, linux-kernel
On 6/24/25 17:45, Lorenzo Stoakes wrote:
> On Tue, Jun 24, 2025 at 03:03:48PM +0200, Vlastimil Babka wrote:
>> Use madvise_lock()/madvise_unlock() in madvise_set_anon_name() in the
>> same way as in do_madvise(). This narrows the lock scope a bit and
>> reuses existing functionality. get_lock_mode() already picks the correct
>> MADVISE_MMAP_WRITE_LOCK mode for __MADV_SET_ANON_VMA_NAME so we can just
>> remove the explicit assignment.
>>
>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
>
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>
> This is a nice idea, hadn't thought of it :) aren't we making anon vma name
> behave now :P
>
> I mean this is a _minor_ difference in functionality in that we will now
> use mmap_write_lock_killable() and could fail to lock on fatal signal
> whereas before we unconditionally lock but I think that's fine, and
> possibly even... desirable actually?
Right! I'll add a note to the changelog. It's changing existing uapi but
hopefully not in a way that userspace would become broken. I mean even the
other madvise modes only got killable locks later tat some point after
introduction, no?
>
> At any rate this is all nice, thanks again!
>
>> ---
>> mm/madvise.c | 13 +++++++++----
>> 1 file changed, 9 insertions(+), 4 deletions(-)
>>
>> diff --git a/mm/madvise.c b/mm/madvise.c
>> index cae064479cdf908707c45b941bd03d43d095eab6..ee02ccd0315a146cdb3001cd189e03be9e48a2ea 100644
>> --- a/mm/madvise.c
>> +++ b/mm/madvise.c
>> @@ -2082,10 +2082,10 @@ static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
>> {
>> unsigned long end;
>> unsigned long len;
>> + int error;
>> struct madvise_behavior madv_behavior = {
>> .mm = mm,
>> .behavior = __MADV_SET_ANON_VMA_NAME,
>> - .lock_mode = MADVISE_MMAP_WRITE_LOCK,
>> .anon_name = anon_name,
>> };
>>
>> @@ -2106,7 +2106,14 @@ static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
>>
>> madv_behavior.range.start = start;
>> madv_behavior.range.end = end;
>> - return madvise_walk_vmas(&madv_behavior);
>> +
>> + error = madvise_lock(&madv_behavior);
>> + if (error)
>> + return error;
>> + error = madvise_walk_vmas(&madv_behavior);
>> + madvise_unlock(&madv_behavior);
>> +
>> + return error;
>> }
>>
>> int set_anon_vma_name(unsigned long addr, unsigned long size,
>> @@ -2136,9 +2143,7 @@ int set_anon_vma_name(unsigned long addr, unsigned long size,
>> return -ENOMEM;
>> }
>>
>> - mmap_write_lock(mm);
>> error = madvise_set_anon_name(mm, addr, size, anon_name);
>> - mmap_write_unlock(mm);
>> anon_vma_name_put(anon_name);
>>
>> return error;
>>
>> --
>> 2.50.0
>>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 4/4] mm, madvise: use standard madvise locking in madvise_set_anon_name()
2025-06-24 13:03 ` [PATCH v2 4/4] mm, madvise: use standard madvise locking in madvise_set_anon_name() Vlastimil Babka
2025-06-24 14:06 ` David Hildenbrand
2025-06-24 15:45 ` Lorenzo Stoakes
@ 2025-06-24 17:03 ` Vlastimil Babka
2 siblings, 0 replies; 25+ messages in thread
From: Vlastimil Babka @ 2025-06-24 17:03 UTC (permalink / raw)
To: Andrew Morton, Liam R. Howlett, Lorenzo Stoakes,
David Hildenbrand, Jann Horn, Mike Rapoport, Suren Baghdasaryan,
Michal Hocko, Colin Cross
Cc: linux-mm, linux-kernel
On 6/24/25 15:03, Vlastimil Babka wrote:
> Use madvise_lock()/madvise_unlock() in madvise_set_anon_name() in the
> same way as in do_madvise(). This narrows the lock scope a bit and
> reuses existing functionality. get_lock_mode() already picks the correct
> MADVISE_MMAP_WRITE_LOCK mode for __MADV_SET_ANON_VMA_NAME so we can just
> remove the explicit assignment.
Additional paragraph here:
There is a user visible change in that the prctl(PR_SET_VMA,
PR_SET_VMA_ANON_NAME...) might now return -EINTR.
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
> mm/madvise.c | 13 +++++++++----
> 1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index cae064479cdf908707c45b941bd03d43d095eab6..ee02ccd0315a146cdb3001cd189e03be9e48a2ea 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -2082,10 +2082,10 @@ static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
> {
> unsigned long end;
> unsigned long len;
> + int error;
> struct madvise_behavior madv_behavior = {
> .mm = mm,
> .behavior = __MADV_SET_ANON_VMA_NAME,
> - .lock_mode = MADVISE_MMAP_WRITE_LOCK,
> .anon_name = anon_name,
> };
>
> @@ -2106,7 +2106,14 @@ static int madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
>
> madv_behavior.range.start = start;
> madv_behavior.range.end = end;
> - return madvise_walk_vmas(&madv_behavior);
> +
> + error = madvise_lock(&madv_behavior);
> + if (error)
> + return error;
> + error = madvise_walk_vmas(&madv_behavior);
> + madvise_unlock(&madv_behavior);
> +
> + return error;
> }
>
> int set_anon_vma_name(unsigned long addr, unsigned long size,
> @@ -2136,9 +2143,7 @@ int set_anon_vma_name(unsigned long addr, unsigned long size,
> return -ENOMEM;
> }
>
> - mmap_write_lock(mm);
> error = madvise_set_anon_name(mm, addr, size, anon_name);
> - mmap_write_unlock(mm);
> anon_vma_name_put(anon_name);
>
> return error;
>
^ permalink raw reply [flat|nested] 25+ messages in thread