From: "Yajun Deng" <yajun.deng@linux.dev>
To: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: akpm@linux-foundation.org, david@redhat.com,
lorenzo.stoakes@oracle.com, riel@surriel.com, vbabka@suse.cz,
harry.yoo@oracle.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/rmap: make num_children and num_active_vmas update in internally
Date: Sat, 06 Sep 2025 14:59:29 +0000
Message-ID: <405f6c44b4214ff466743ed94d16cb2fbea1b7f3@linux.dev>
In-Reply-To: <4ifsfk44so7ychuu57mkbhujjl4lh5bxt2ufdseskunxsle366@3p6oo7qulwef>
September 5, 2025 at 11:16 PM, "Liam R. Howlett" <Liam.Howlett@oracle.com> wrote:
>
> * Yajun Deng <yajun.deng@linux.dev> [250905 09:21]:
>
> >
> > If the anon_vma_alloc() is called, the num_children of the parent of
> > the anon_vma will be updated. But this operation occurs outside of
> > anon_vma_alloc().
> >
> > The num_active_vmas are also updated outside of anon_vma.
> >
> > Pass the parent of anon_vma to the anon_vma_alloc() and update the
> > num_children inside it.
> >
> > Introduce anon_vma_attach() and anon_vma_detach() to update
> > num_active_vmas with the anon_vma.
> >
> > Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
> > ---
> > mm/rmap.c | 63 ++++++++++++++++++++++++++++---------------------------
> > 1 file changed, 32 insertions(+), 31 deletions(-)
> >
> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index 34333ae3bd80..2a28edfa5734 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -86,15 +86,21 @@
> > static struct kmem_cache *anon_vma_cachep;
> > static struct kmem_cache *anon_vma_chain_cachep;
> >
> > -static inline struct anon_vma *anon_vma_alloc(void)
> > +static inline struct anon_vma *anon_vma_alloc(struct anon_vma *parent)
> > {
> > struct anon_vma *anon_vma;
> >
> > anon_vma = kmem_cache_alloc(anon_vma_cachep, GFP_KERNEL);
> > - if (anon_vma) {
> > - atomic_set(&anon_vma->refcount, 1);
> > - anon_vma->num_children = 0;
> > - anon_vma->num_active_vmas = 0;
> > + if (!anon_vma)
> > + return NULL;
> > +
> > + atomic_set(&anon_vma->refcount, 1);
> > + anon_vma->num_children = 0;
> > + anon_vma->num_active_vmas = 0;
> > + if (parent) {
> > + anon_vma->parent = parent;
> > + anon_vma->root = parent->root;
> > + } else {
> > anon_vma->parent = anon_vma;
> > /*
> > * Initialise the anon_vma root to point to itself. If called
> > @@ -102,6 +108,7 @@ static inline struct anon_vma *anon_vma_alloc(void)
> > */
> > anon_vma->root = anon_vma;
> > }
> > + anon_vma->parent->num_children++;
> >
> > return anon_vma;
> > }
> > @@ -146,6 +153,19 @@ static void anon_vma_chain_free(struct anon_vma_chain *anon_vma_chain)
> > kmem_cache_free(anon_vma_chain_cachep, anon_vma_chain);
> > }
> >
> > +static inline void anon_vma_attach(struct vm_area_struct *vma,
> > + struct anon_vma *anon_vma)
> > +{
> > + vma->anon_vma = anon_vma;
> > + vma->anon_vma->num_active_vmas++;
> > +}
> > +
> > +static inline void anon_vma_detach(struct vm_area_struct *vma)
> > +{
> > + vma->anon_vma->num_active_vmas--;
> > + vma->anon_vma = NULL;
> > +}
> > +
> >
> It is a bit odd that you are setting a vma value with the prefix of
> anon_vma. Surely there is a better name: vma_attach_anon() ? And since
> this is editing the vma, should it be in rmap.c or vma.h?
>
I will move them to vma.h.
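
A minimal user-space sketch of what the renamed helpers might look like.
The names vma_attach_anon() and vma_detach_anon() follow the suggestion
above and are hypothetical, and the two structs are cut-down stand-ins
for the kernel ones, keeping only the fields touched here.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption). */
struct anon_vma {
	unsigned long num_active_vmas;
};

struct vm_area_struct {
	struct anon_vma *anon_vma;
};

/* Hypothetical name: the vma is the object being modified. */
static inline void vma_attach_anon(struct vm_area_struct *vma,
				   struct anon_vma *anon_vma)
{
	vma->anon_vma = anon_vma;
	anon_vma->num_active_vmas++;
}

/* Hypothetical name: drop the active-vma count, then clear the pointer. */
static inline void vma_detach_anon(struct vm_area_struct *vma)
{
	vma->anon_vma->num_active_vmas--;
	vma->anon_vma = NULL;
}
```

The pointer assignment and the counter update stay paired, so a caller
cannot do one without the other.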
> >
> > static void anon_vma_chain_link(struct vm_area_struct *vma,
> > struct anon_vma_chain *avc,
> > struct anon_vma *anon_vma)
> > @@ -198,10 +218,9 @@ int __anon_vma_prepare(struct vm_area_struct *vma)
> > anon_vma = find_mergeable_anon_vma(vma);
> > allocated = NULL;
> > if (!anon_vma) {
> > - anon_vma = anon_vma_alloc();
> > + anon_vma = anon_vma_alloc(NULL);
> >
> I don't love passing NULL for parent, it's two if statements to do the
> same work as before - we already know that parent is NULL by this point,
> but we call a function to check it again.
>
I will add a wrapper function.
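
One possible shape for such a wrapper, as a user-space sketch only. The
names anon_vma_alloc_common(), anon_vma_alloc_root() and
anon_vma_alloc_child() are hypothetical, malloc() stands in for
kmem_cache_alloc(), and a plain int stands in for the atomic refcount.
Each caller picks the variant it needs, so no parent check is repeated.

```c
#include <assert.h>
#include <stdlib.h>

/* Cut-down model of struct anon_vma (assumption: only these fields). */
struct anon_vma {
	struct anon_vma *root;
	struct anon_vma *parent;
	int refcount;
	unsigned long num_children;
	unsigned long num_active_vmas;
};

/* Shared initialisation used by both variants. */
static struct anon_vma *anon_vma_alloc_common(void)
{
	struct anon_vma *anon_vma = malloc(sizeof(*anon_vma));

	if (!anon_vma)
		return NULL;
	anon_vma->refcount = 1;
	anon_vma->num_children = 0;
	anon_vma->num_active_vmas = 0;
	return anon_vma;
}

/* New root: parent and root point to itself, counted as its own child. */
static struct anon_vma *anon_vma_alloc_root(void)
{
	struct anon_vma *anon_vma = anon_vma_alloc_common();

	if (!anon_vma)
		return NULL;
	anon_vma->parent = anon_vma;
	anon_vma->root = anon_vma;
	anon_vma->num_children++;	/* self-parent link for new root */
	return anon_vma;
}

/* Child of an existing anon_vma: inherit the root, bump the parent. */
static struct anon_vma *anon_vma_alloc_child(struct anon_vma *parent)
{
	struct anon_vma *anon_vma = anon_vma_alloc_common();

	if (!anon_vma)
		return NULL;
	anon_vma->parent = parent;
	anon_vma->root = parent->root;
	parent->num_children++;
	return anon_vma;
}
```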
> >
> > if (unlikely(!anon_vma))
> > goto out_enomem_free_avc;
> > - anon_vma->num_children++; /* self-parent link for new root */
> > allocated = anon_vma;
> > }
> >
> > @@ -209,9 +228,8 @@ int __anon_vma_prepare(struct vm_area_struct *vma)
> > /* page_table_lock to protect against threads */
> > spin_lock(&mm->page_table_lock);
> > if (likely(!vma->anon_vma)) {
> > - vma->anon_vma = anon_vma;
> > + anon_vma_attach(vma, anon_vma);
> > anon_vma_chain_link(vma, avc, anon_vma);
> > - anon_vma->num_active_vmas++;
> > allocated = NULL;
> > avc = NULL;
> > }
> > @@ -306,10 +324,8 @@ int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
> > if (!dst->anon_vma && src->anon_vma &&
> > anon_vma->num_children < 2 &&
> > anon_vma->num_active_vmas == 0)
> > - dst->anon_vma = anon_vma;
> > + anon_vma_attach(dst, anon_vma);
> > }
> > - if (dst->anon_vma)
> > - dst->anon_vma->num_active_vmas++;
> > unlock_anon_vma_root(root);
> > return 0;
> >
> anon_vma_clone() has a goto label of enomem_failure that needs to be
> handled correctly. Looks like you have to avoid zeroing dst before
> unlink_anon_vmas(vma) there.
>
Yes, that is a bug in the patch.
> >
> > @@ -356,31 +372,22 @@ int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma)
> > return 0;
> >
> > /* Then add our own anon_vma. */
> > - anon_vma = anon_vma_alloc();
> > + anon_vma = anon_vma_alloc(pvma->anon_vma);
> > if (!anon_vma)
> > goto out_error;
> > - anon_vma->num_active_vmas++;
> > avc = anon_vma_chain_alloc(GFP_KERNEL);
> > if (!avc)
> > goto out_error_free_anon_vma;
> >
> At this point anon_vma has a parent set and the parent->num_children++,
> but vma->anon_vma != anon_vma yet. If avc fails here, we will put the
> anon_vma but leave the parent with num_children incremented, since
> unlink_anon_vmas() will not find anything.
>
Yes, it's an error.
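
A user-space model of the problem and one way it could be handled. This
is a sketch under assumptions, not the kernel code: alloc_child(),
free_child() and fork_model() are hypothetical names, malloc() replaces
the slab allocator, and the avc_fails flag simulates
anon_vma_chain_alloc() returning NULL. The point is only that once the
allocation bumps parent->num_children, the failure path has to undo it.

```c
#include <assert.h>
#include <stdlib.h>

/* Cut-down model of struct anon_vma (assumption: only these fields). */
struct anon_vma {
	struct anon_vma *root;
	struct anon_vma *parent;
	unsigned long num_children;
};

/* Allocation that already accounts the child in its parent. */
static struct anon_vma *alloc_child(struct anon_vma *parent)
{
	struct anon_vma *anon_vma = malloc(sizeof(*anon_vma));

	if (!anon_vma)
		return NULL;
	anon_vma->parent = parent;
	anon_vma->root = parent->root;
	anon_vma->num_children = 0;
	parent->num_children++;
	return anon_vma;
}

/* Hypothetical undo helper: reverse the accounting before freeing. */
static void free_child(struct anon_vma *anon_vma)
{
	anon_vma->parent->num_children--;
	free(anon_vma);
}

/* Model of the fork path; avc_fails simulates a failed avc allocation. */
static int fork_model(struct anon_vma *parent, int avc_fails,
		      struct anon_vma **out)
{
	struct anon_vma *anon_vma = alloc_child(parent);

	if (!anon_vma)
		return -1;
	if (avc_fails) {
		/* Without this undo, parent->num_children would stay too high. */
		free_child(anon_vma);
		return -1;
	}
	*out = anon_vma;
	return 0;
}
```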
> >
> > - /*
> > - * The root anon_vma's rwsem is the lock actually used when we
> > - * lock any of the anon_vmas in this anon_vma tree.
> > - */
> >
> This information is lost when adding the parent passthrough.
>
I'll add it back.
> >
> > - anon_vma->root = pvma->anon_vma->root;
> > - anon_vma->parent = pvma->anon_vma;
> > /*
> > * With refcounts, an anon_vma can stay around longer than the
> > * process it belongs to. The root anon_vma needs to be pinned until
> > * this anon_vma is freed, because the lock lives in the root.
> > */
> > get_anon_vma(anon_vma->root);
> > - /* Mark this anon_vma as the one where our new (COWed) pages go. */
> > - vma->anon_vma = anon_vma;
> > + anon_vma_attach(vma, anon_vma);
> >
> So now we are in the same situation, we know what we need to do with the
> parent, but we have to run through another if statement to get it to
> happen instead of assigning it.
>
Some existing code follows the same pattern. For example,
init_tg_rt_entry() has two callers: one passes a parent, the other does not.
> >
> > anon_vma_lock_write(anon_vma);
> > anon_vma_chain_link(vma, avc, anon_vma);
> > - anon_vma->parent->num_children++;
> > anon_vma_unlock_write(anon_vma);
> >
> > return 0;
> > @@ -419,15 +426,9 @@ void unlink_anon_vmas(struct vm_area_struct *vma)
> > list_del(&avc->same_vma);
> > anon_vma_chain_free(avc);
> > }
> > - if (vma->anon_vma) {
> > - vma->anon_vma->num_active_vmas--;
> > + if (vma->anon_vma)
> > + anon_vma_detach(vma);
> >
> > - /*
> > - * vma would still be needed after unlink, and anon_vma will be prepared
> > - * when handle fault.
> > - */
> >
> It is still worth keeping the comment here too.
>
Okay.
> >
> > - vma->anon_vma = NULL;
> > - }
> > unlock_anon_vma_root(root);
> >
> > /*
> > --
> > 2.25.1
> >
>
Thread overview: 7+ messages
2025-09-05 13:20 [PATCH] mm/rmap: make num_children and num_active_vmas update in internally Yajun Deng
2025-09-05 14:58 ` Lorenzo Stoakes
2025-09-06 14:50 ` Yajun Deng
2025-09-08 4:29 ` Lorenzo Stoakes
2025-09-05 15:16 ` Liam R. Howlett
2025-09-06 14:59 ` Yajun Deng [this message]
2025-09-08 6:54 ` [syzbot ci] " syzbot ci