public inbox for linux-mm@kvack.org
 help / color / mirror / Atom feed
From: "Yajun Deng" <yajun.deng@linux.dev>
To: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: akpm@linux-foundation.org, david@redhat.com,
	lorenzo.stoakes@oracle.com, riel@surriel.com, vbabka@suse.cz,
	harry.yoo@oracle.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/rmap: make num_children and num_active_vmas update in internally
Date: Sat, 06 Sep 2025 14:59:29 +0000	[thread overview]
Message-ID: <405f6c44b4214ff466743ed94d16cb2fbea1b7f3@linux.dev> (raw)
In-Reply-To: <4ifsfk44so7ychuu57mkbhujjl4lh5bxt2ufdseskunxsle366@3p6oo7qulwef>

September 5, 2025 at 11:16 PM, "Liam R. Howlett" <Liam.Howlett@oracle.com mailto:Liam.Howlett@oracle.com?to=%22Liam%20R.%20Howlett%22%20%3CLiam.Howlett%40oracle.com%3E > wrote:


> 
> * Yajun Deng <yajun.deng@linux.dev> [250905 09:21]:
> 
> > 
> > If the anon_vma_alloc() is called, the num_children of the parent of
> >  the anon_vma will be updated. But this operation occurs outside of
> >  anon_vma_alloc().
> >  
> >  The num_active_vmas are also updated outside of anon_vma.
> >  
> >  Pass the parent of anon_vma to the anon_vma_alloc() and update the
> >  num_children inside it.
> >  
> >  Introduce anon_vma_attach() and anon_vma_detach() to update
> >  num_active_vmas with the anon_vma.
> >  
> >  Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
> >  ---
> >  mm/rmap.c | 63 ++++++++++++++++++++++++++++---------------------------
> >  1 file changed, 32 insertions(+), 31 deletions(-)
> >  
> >  diff --git a/mm/rmap.c b/mm/rmap.c
> >  index 34333ae3bd80..2a28edfa5734 100644
> >  --- a/mm/rmap.c
> >  +++ b/mm/rmap.c
> >  @@ -86,15 +86,21 @@
> >  static struct kmem_cache *anon_vma_cachep;
> >  static struct kmem_cache *anon_vma_chain_cachep;
> >  
> >  -static inline struct anon_vma *anon_vma_alloc(void)
> >  +static inline struct anon_vma *anon_vma_alloc(struct anon_vma *parent)
> >  {
> >  struct anon_vma *anon_vma;
> >  
> >  anon_vma = kmem_cache_alloc(anon_vma_cachep, GFP_KERNEL);
> >  - if (anon_vma) {
> >  - atomic_set(&anon_vma->refcount, 1);
> >  - anon_vma->num_children = 0;
> >  - anon_vma->num_active_vmas = 0;
> >  + if (!anon_vma)
> >  + return NULL;
> >  +
> >  + atomic_set(&anon_vma->refcount, 1);
> >  + anon_vma->num_children = 0;
> >  + anon_vma->num_active_vmas = 0;
> >  + if (parent) {
> >  + anon_vma->parent = parent;
> >  + anon_vma->root = parent->root;
> >  + } else {
> >  anon_vma->parent = anon_vma;
> >  /*
> >  * Initialise the anon_vma root to point to itself. If called
> >  @@ -102,6 +108,7 @@ static inline struct anon_vma *anon_vma_alloc(void)
> >  */
> >  anon_vma->root = anon_vma;
> >  }
> >  + anon_vma->parent->num_children++;
> >  
> >  return anon_vma;
> >  }
> >  @@ -146,6 +153,19 @@ static void anon_vma_chain_free(struct anon_vma_chain *anon_vma_chain)
> >  kmem_cache_free(anon_vma_chain_cachep, anon_vma_chain);
> >  }
> >  
> >  +static inline void anon_vma_attach(struct vm_area_struct *vma,
> >  + struct anon_vma *anon_vma)
> >  +{
> >  + vma->anon_vma = anon_vma;
> >  + vma->anon_vma->num_active_vmas++;
> >  +}
> >  +
> >  +static inline void anon_vma_detach(struct vm_area_struct *vma)
> >  +{
> >  + vma->anon_vma->num_active_vmas--;
> >  + vma->anon_vma = NULL;
> >  +}
> >  +
> > 
> It is a bit odd that you are setting a vma value with the prefix of
> anon_vma. Surely there is a better name: vma_attach_anon() ? And since
> this is editing the vma, should it be in rmap.c or vma.h?
> 

I will move them to vma.h.
> > 
> > static void anon_vma_chain_link(struct vm_area_struct *vma,
> >  struct anon_vma_chain *avc,
> >  struct anon_vma *anon_vma)
> >  @@ -198,10 +218,9 @@ int __anon_vma_prepare(struct vm_area_struct *vma)
> >  anon_vma = find_mergeable_anon_vma(vma);
> >  allocated = NULL;
> >  if (!anon_vma) {
> >  - anon_vma = anon_vma_alloc();
> >  + anon_vma = anon_vma_alloc(NULL);
> > 
> I don't love passing NULL for parent, it's two if statements to do the
> same work as before - we already know that parent is NULL by this point,
> but we call a function to check it again.
> 

I will add a wapper function.
> > 
> > if (unlikely(!anon_vma))
> >  goto out_enomem_free_avc;
> >  - anon_vma->num_children++; /* self-parent link for new root */
> >  allocated = anon_vma;
> >  }
> >  
> >  @@ -209,9 +228,8 @@ int __anon_vma_prepare(struct vm_area_struct *vma)
> >  /* page_table_lock to protect against threads */
> >  spin_lock(&mm->page_table_lock);
> >  if (likely(!vma->anon_vma)) {
> >  - vma->anon_vma = anon_vma;
> >  + anon_vma_attach(vma, anon_vma);
> >  anon_vma_chain_link(vma, avc, anon_vma);
> >  - anon_vma->num_active_vmas++;
> >  allocated = NULL;
> >  avc = NULL;
> >  }
> >  @@ -306,10 +324,8 @@ int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
> >  if (!dst->anon_vma && src->anon_vma &&
> >  anon_vma->num_children < 2 &&
> >  anon_vma->num_active_vmas == 0)
> >  - dst->anon_vma = anon_vma;
> >  + anon_vma_attach(dst, anon_vma);
> >  }
> >  - if (dst->anon_vma)
> >  - dst->anon_vma->num_active_vmas++;
> >  unlock_anon_vma_root(root);
> >  return 0;
> > 
> anon_vma_clone() has a goto label of enomem_failure that needs to be
> handled correctly. Looks like you have to avoid zeroing dst before
> unlink_anon_vmas(vma) there.
> 

Yes, it's an error.
> > 
> > @@ -356,31 +372,22 @@ int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma)
> >  return 0;
> >  
> >  /* Then add our own anon_vma. */
> >  - anon_vma = anon_vma_alloc();
> >  + anon_vma = anon_vma_alloc(pvma->anon_vma);
> >  if (!anon_vma)
> >  goto out_error;
> >  - anon_vma->num_active_vmas++;
> >  avc = anon_vma_chain_alloc(GFP_KERNEL);
> >  if (!avc)
> >  goto out_error_free_anon_vma;
> > 
> At this point anon_vma has a parent set and the parent->num_children++,
> but vma->anon_vma != anon_vma yet. If avc fails here, we will put the
> anon_vma but leave the parent with num_children incremented, since
> unlink_anon_vmas() will not find anything.
> 

Yes, it's an error.
> > 
> > - /*
> >  - * The root anon_vma's rwsem is the lock actually used when we
> >  - * lock any of the anon_vmas in this anon_vma tree.
> >  - */
> > 
> This information is lost when adding the parent passthrough.
> 

I'll add it back.

> > 
> > - anon_vma->root = pvma->anon_vma->root;
> >  - anon_vma->parent = pvma->anon_vma;
> >  /*
> >  * With refcounts, an anon_vma can stay around longer than the
> >  * process it belongs to. The root anon_vma needs to be pinned until
> >  * this anon_vma is freed, because the lock lives in the root.
> >  */
> >  get_anon_vma(anon_vma->root);
> >  - /* Mark this anon_vma as the one where our new (COWed) pages go. */
> >  - vma->anon_vma = anon_vma;
> >  + anon_vma_attach(vma, anon_vma);
> > 
> So now we are in the same situation, we know what we need to do with the
> parent, but we have to run through another if statement to get it to
> happen instead of assigning it.
> 

Some code like it.
init_tg_rt_entry() has two callers. One has a parent, the other does not.

> > 
> > anon_vma_lock_write(anon_vma);
> >  anon_vma_chain_link(vma, avc, anon_vma);
> >  - anon_vma->parent->num_children++;
> >  anon_vma_unlock_write(anon_vma);
> >  
> >  return 0;
> >  @@ -419,15 +426,9 @@ void unlink_anon_vmas(struct vm_area_struct *vma)
> >  list_del(&avc->same_vma);
> >  anon_vma_chain_free(avc);
> >  }
> >  - if (vma->anon_vma) {
> >  - vma->anon_vma->num_active_vmas--;
> >  + if (vma->anon_vma)
> >  + anon_vma_detach(vma);
> >  
> >  - /*
> >  - * vma would still be needed after unlink, and anon_vma will be prepared
> >  - * when handle fault.
> >  - */
> > 
> It is still worth keeping the comment here too.
> 

Okay.

> > 
> > - vma->anon_vma = NULL;
> >  - }
> >  unlock_anon_vma_root(root);
> >  
> >  /*
> >  -- 
> >  2.25.1
> >
>


  reply	other threads:[~2025-09-06 14:59 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-05 13:20 [PATCH] mm/rmap: make num_children and num_active_vmas update in internally Yajun Deng
2025-09-05 14:58 ` Lorenzo Stoakes
2025-09-06 14:50   ` Yajun Deng
2025-09-08  4:29     ` Lorenzo Stoakes
2025-09-05 15:16 ` Liam R. Howlett
2025-09-06 14:59   ` Yajun Deng [this message]
2025-09-08  6:54 ` [syzbot ci] " syzbot ci

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=405f6c44b4214ff466743ed94d16cb2fbea1b7f3@linux.dev \
    --to=yajun.deng@linux.dev \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=harry.yoo@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lorenzo.stoakes@oracle.com \
    --cc=riel@surriel.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox