All of lore.kernel.org
 help / color / mirror / Atom feed
From: Uladzislau Rezki <urezki@gmail.com>
To: Jeongjun Park <aha310510@gmail.com>
Cc: Uladzislau Rezki <urezki@gmail.com>,
	akpm@linux-foundation.org, edumazet@google.com,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4] mm/vmalloc: fix data race in show_numa_info()
Date: Thu, 8 May 2025 18:09:23 +0200	[thread overview]
Message-ID: <aBzXMzBNie447bYs@pc636> (raw)
In-Reply-To: <CAO9qdTHWvTafDrdkyXh1k7ix0=3_HnVYuAzOhg3x-iP8dv0NmA@mail.gmail.com>

On Thu, May 08, 2025 at 10:35:37PM +0900, Jeongjun Park wrote:
> Uladzislau Rezki <urezki@gmail.com> wrote:
> >
> > On Thu, May 08, 2025 at 03:55:58PM +0900, Jeongjun Park wrote:
> > > The following data-race was found in show_numa_info():
> > >
> > > ==================================================================
> > > BUG: KCSAN: data-race in vmalloc_info_show / vmalloc_info_show
> > >
> > > read to 0xffff88800971fe30 of 4 bytes by task 8289 on cpu 0:
> > >  show_numa_info mm/vmalloc.c:4936 [inline]
> > >  vmalloc_info_show+0x5a8/0x7e0 mm/vmalloc.c:5016
> > >  seq_read_iter+0x373/0xb40 fs/seq_file.c:230
> > >  proc_reg_read_iter+0x11e/0x170 fs/proc/inode.c:299
> > > ....
> > >
> > > write to 0xffff88800971fe30 of 4 bytes by task 8287 on cpu 1:
> > >  show_numa_info mm/vmalloc.c:4934 [inline]
> > >  vmalloc_info_show+0x38f/0x7e0 mm/vmalloc.c:5016
> > >  seq_read_iter+0x373/0xb40 fs/seq_file.c:230
> > >  proc_reg_read_iter+0x11e/0x170 fs/proc/inode.c:299
> > > ....
> > >
> > > value changed: 0x0000008f -> 0x00000000
> > > ==================================================================
> > >
> > > According to this report,there is a read/write data-race because m->private
> > > is accessible to multiple CPUs. To fix this, instead of allocating the heap
> > > in proc_vmalloc_init() and passing the heap address to m->private,
> > > vmalloc_info_show() should allocate the heap.
> > >
> > > Fixes: a47a126ad5ea ("vmallocinfo: add NUMA information")
> > > Suggested-by: Eric Dumazet <edumazet@google.com>
> > > Suggested-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > > Suggested-by: Andrew Morton <akpm@linux-foundation.org>
> > > Signed-off-by: Jeongjun Park <aha310510@gmail.com>
> > > ---
> > > v4: Change the way counters array heap is allocated, per Andrew Morton's suggestion.
> > >     And fix it to call smp_rmb() in the correct location.
> > > - Link to v3: https://lore.kernel.org/all/20250507142552.9446-1-aha310510@gmail.com/
> > > v3: Following Uladzislau Rezki's suggestion, we check v->flags beforehand
> > >     to avoid printing uninitialized members of vm_struct.
> > > - Link to v2: https://lore.kernel.org/all/20250506082520.84153-1-aha310510@gmail.com/
> > > v2: Refactoring some functions and fix patch as per Eric Dumazet suggestion
> > > - Link to v1: https://lore.kernel.org/all/20250505171948.24410-1-aha310510@gmail.com/
> > > ---
> > >  mm/vmalloc.c | 61 ++++++++++++++++++++++++++++------------------------
> > >  1 file changed, 33 insertions(+), 28 deletions(-)
> > >
> > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > > index 3ed720a787ec..112df5a86106 100644
> > > --- a/mm/vmalloc.c
> > > +++ b/mm/vmalloc.c
> > > @@ -3100,7 +3100,7 @@ static void clear_vm_uninitialized_flag(struct vm_struct *vm)
> > >       /*
> > >        * Before removing VM_UNINITIALIZED,
> > >        * we should make sure that vm has proper values.
> > > -      * Pair with smp_rmb() in show_numa_info().
> > > +      * Pair with smp_rmb() in vread_iter() and vmalloc_info_show().
> > >        */
> > >       smp_wmb();
> > >       vm->flags &= ~VM_UNINITIALIZED;
> > > @@ -4914,28 +4914,29 @@ bool vmalloc_dump_obj(void *object)
> > >  #endif
> > >
> > >  #ifdef CONFIG_PROC_FS
> > > -static void show_numa_info(struct seq_file *m, struct vm_struct *v)
> > > -{
> > > -     if (IS_ENABLED(CONFIG_NUMA)) {
> > > -             unsigned int nr, *counters = m->private;
> > > -             unsigned int step = 1U << vm_area_page_order(v);
> > >
> > > -             if (!counters)
> > > -                     return;
> > > +/*
> > > + * Print number of pages allocated on each memory node.
> > > + *
> > > + * This function can only be called if CONFIG_NUMA is enabled
> > > + * and VM_UNINITIALIZED bit in v->flags is disabled.
> > > + */
> > > +static void show_numa_info(struct seq_file *m, struct vm_struct *v,
> > > +                              unsigned int *counters)
> > > +{
> > > +     unsigned int nr;
> > > +     unsigned int step = 1U << vm_area_page_order(v);
> > >
> > > -             if (v->flags & VM_UNINITIALIZED)
> > > -                     return;
> > > -             /* Pair with smp_wmb() in clear_vm_uninitialized_flag() */
> > > -             smp_rmb();
> > > +     if (!counters)
> > > +             return;
> > >
> > > -             memset(counters, 0, nr_node_ids * sizeof(unsigned int));
> > > +     memset(counters, 0, nr_node_ids * sizeof(unsigned int));
> > >
> > > -             for (nr = 0; nr < v->nr_pages; nr += step)
> > > -                     counters[page_to_nid(v->pages[nr])] += step;
> > > -             for_each_node_state(nr, N_HIGH_MEMORY)
> > > -                     if (counters[nr])
> > > -                             seq_printf(m, " N%u=%u", nr, counters[nr]);
> > > -     }
> > > +     for (nr = 0; nr < v->nr_pages; nr += step)
> > > +             counters[page_to_nid(v->pages[nr])] += step;
> > > +     for_each_node_state(nr, N_HIGH_MEMORY)
> > > +             if (counters[nr])
> > > +                     seq_printf(m, " N%u=%u", nr, counters[nr]);
> > >  }
> > >
> > >  static void show_purge_info(struct seq_file *m)
> > > @@ -4962,8 +4963,11 @@ static int vmalloc_info_show(struct seq_file *m, void *p)
> > >       struct vmap_node *vn;
> > >       struct vmap_area *va;
> > >       struct vm_struct *v;
> > > +     unsigned int *counters = NULL;
> > >
> > Setting NULL can be dropped.
> >
> > >       int i;
> > >
> > > +     counters = kmalloc(nr_node_ids * sizeof(unsigned int), GFP_KERNEL);
> > "if (IS_ENABLED(CONFIG_NUMA))"? If no NUMA, allocating and freeing sounds
> > like without any reason.
> >
> > > +
> > >       for (i = 0; i < nr_vmap_nodes; i++) {
> > >               vn = &vmap_nodes[i];
> > >
> > > @@ -4979,6 +4983,11 @@ static int vmalloc_info_show(struct seq_file *m, void *p)
> > >                       }
> > >
> > >                       v = va->vm;
> > > +                     if (v->flags & VM_UNINITIALIZED)
> > > +                             continue;
> > > +
> > > +                     /* Pair with smp_wmb() in clear_vm_uninitialized_flag() */
> > > +                     smp_rmb();
> > >
> > >                       seq_printf(m, "0x%pK-0x%pK %7ld",
> > >                               v->addr, v->addr + v->size, v->size);
> > > @@ -5013,7 +5022,9 @@ static int vmalloc_info_show(struct seq_file *m, void *p)
> > >                       if (is_vmalloc_addr(v->pages))
> > >                               seq_puts(m, " vpages");
> > >
> > > -                     show_numa_info(m, v);
> > > +                     if (IS_ENABLED(CONFIG_NUMA))
> > > +                             show_numa_info(m, v, counters);
> > > +
> > >                       seq_putc(m, '\n');
> > >               }
> > >               spin_unlock(&vn->busy.lock);
> > > @@ -5023,19 +5034,13 @@ static int vmalloc_info_show(struct seq_file *m, void *p)
> > >        * As a final step, dump "unpurged" areas.
> > >        */
> > >       show_purge_info(m);
> > > +     kfree(counters);
> > >
> > Maybe check for NULL before and IS_ENABLED(CONFIG_NUMA)?
> >
> > --
> > Uladzislau Rezki
> 
> Hmm....then I think patching vmalloc_info_show() like this should
> work.
> 
> ---
>  mm/vmalloc.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 3ed720a787ec..b60355256211 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -4962,8 +4962,12 @@ static int vmalloc_info_show(struct seq_file *m, void *p)
>     struct vmap_node *vn;
>     struct vmap_area *va;
>     struct vm_struct *v;
> +   unsigned int *counters = NULL;
>     int i;
> 
> +   if (IS_ENABLED(CONFIG_NUMA))
> +       counters = kmalloc(nr_node_ids * sizeof(unsigned int), GFP_KERNEL);
> +
>     for (i = 0; i < nr_vmap_nodes; i++) {
>         vn = &vmap_nodes[i];
> 
> @@ -5013,7 +5017,9 @@ static int vmalloc_info_show(struct seq_file *m, void *p)
>             if (is_vmalloc_addr(v->pages))
>                 seq_puts(m, " vpages");
> 
> -           show_numa_info(m, v);
> +           if (counters)
> +               show_numa_info(m, v, counters);
> +
>             seq_putc(m, '\n');
>         }
>         spin_unlock(&vn->busy.lock);
> @@ -5023,6 +5029,7 @@ static int vmalloc_info_show(struct seq_file *m, void *p)
>      * As a final step, dump "unpurged" areas.
>      */
>     show_purge_info(m);
> +   kfree(counters);
>     return 0;
>  }
> 
> --
> 
> This way, we won't allocate heap unnecessarily when
> CONFIG_NUMA is disabled, and we only need to check if
> counters are NULL or not before calling show_numa_info().
> 
Yep, makes sense to me!

>
> However, if you change it like this, counters must always be
> initialized to NULL.
>
or:

<snip>
+   unsigned int *counters;
...
...
if (IS_ENABLED(CONFIG_NUMA))
  kfree(counters);
<snip>

if NUMA, you always do kmalloc() on entry, so it points to NULL
or to a real allocated memory. Apart of that, it is probably better
to avoid of checking "counters" before invoking the show_numa_info()
and replace it also by the IS_ENABLED().

It implies that you bail out if "counters" is NULL in the
show_numa_info().

--
Uladzislau Rezki


      reply	other threads:[~2025-05-08 16:09 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-08  6:55 [PATCH v4] mm/vmalloc: fix data race in show_numa_info() Jeongjun Park
2025-05-08 12:51 ` Uladzislau Rezki
2025-05-08 13:35   ` Jeongjun Park
2025-05-08 16:09     ` Uladzislau Rezki [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aBzXMzBNie447bYs@pc636 \
    --to=urezki@gmail.com \
    --cc=aha310510@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=edumazet@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.