All of lore.kernel.org
 help / color / mirror / Atom feed
From: Baoquan He <bhe@redhat.com>
To: Uladzislau Rezki <urezki@gmail.com>
Cc: Nick Bowler <nbowler@draconx.ca>,
	Hailong Liu <hailong.liu@oppo.com>,
	linux-kernel@vger.kernel.org,
	Linux regressions mailing list <regressions@lists.linux.dev>,
	linux-mm@kvack.org, sparclinux@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: PROBLEM: kernel crashes when running xfsdump since ~6.4
Date: Wed, 26 Jun 2024 08:38:56 +0800	[thread overview]
Message-ID: <ZntjIE6msJbF8zTa@MiWiFi-R3L-srv> (raw)
In-Reply-To: <ZnsjIB2byIxSgbjc@pc636>

On 06/25/24 at 10:05pm, Uladzislau Rezki wrote:
> > > > > > /**
> > > > > >  * cpumask_next - get the next cpu in a cpumask
> > > > > >  * @n: the cpu prior to the place to search (i.e. return will be > @n)
> > > > > >  * @srcp: the cpumask pointer
> > > > > >  *
> > > > > >  * Return: >= nr_cpu_ids if no further cpus set.
> > > > > 
> > > > > Ah, I got what you mean. In the vbq case, it may not have chance to get
> > > > > a return number as nr_cpu_ids. Becuase the hashed index limits the
> > > > > range to [0, nr_cpu_ids-1], and cpu_possible(index) will guarantee it
> > > > > won't be the highest cpu number [nr_cpu_ids-1] since CPU[nr_cpu_ids-1] must
> > > > > be possible CPU.
> > > > > 
> > > > > Do I miss some corner cases?
> > > > > 
> > > > Right. We guarantee that a highest CPU is available by doing: % nr_cpu_ids.
> > > > So we do not need to use *next_wrap() variant. You do not miss anything :)
> > > > 
> > > > Hailong Liu has proposed more simpler version:
> > > > 
> > > > <snip>
> > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > > > index 11fe5ea208aa..e1e63ffb9c57 100644
> > > > --- a/mm/vmalloc.c
> > > > +++ b/mm/vmalloc.c
> > > > @@ -1994,8 +1994,9 @@ static struct xarray *
> > > >  addr_to_vb_xa(unsigned long addr)
> > > >  {
> > > >         int index = (addr / VMAP_BLOCK_SIZE) % num_possible_cpus();
> > > > +       int cpu = cpumask_nth(index, cpu_possible_mask);
> > > > 
> > > > -       return &per_cpu(vmap_block_queue, index).vmap_blocks;
> > > > +       return &per_cpu(vmap_block_queue, cpu).vmap_blocks;
> > > > <snip>
> > > > 
> > > > which just takes a next CPU if an index is not set in the cpu_possible_mask.
> > > > 
> > > > The only thing that can be updated in the patch is to replace num_possible_cpu()
> > > > by the nr_cpu_ids.
> > > > 
> > > > Any thoughts? I think we need to fix it by a minor change so it is
> > > > easier to back-port on stable kernels.
> > > 
> > > Yeah, sounds good since the regresson commit is merged in v6.3.
> > > Please feel free to post this and the hash array patch separately for
> > > formal reviewing.
> > > 
> > Agreed! The patch about hash array i will post later.
> > 
> > > By the way, when I am replying this mail, I check the cpumask_nth()
> > > again. I doubt it may take more checking then cpu_possible(), given most
> > > of systems don't have gaps in cpu_possible_mask. I could be dizzy at
> > > this moment.
> > > 
> > > static inline unsigned int cpumask_nth(unsigned int cpu, const struct cpumask *srcp)
> > > {
> > >         return find_nth_bit(cpumask_bits(srcp), small_cpumask_bits, cpumask_check(cpu));
> > > }
> > > 
> > Yep, i do not think it is a big problem based on your noted fact.
> > 
> Checked. There is a difference:
> 
> 1. Default
> 
> <snip>
> ...
> +   15.95%     6.05%  [kernel]        [k] __vmap_pages_range_noflush
> +   15.91%     1.74%  [kernel]        [k] addr_to_vb_xa <---------------
> +   15.13%    12.05%  [kernel]        [k] vunmap_p4d_range
> +   14.17%    13.38%  [kernel]        [k] __find_nth_bit <--------------
> +   10.62%     0.00%  [kernel]        [k] ret_from_fork_asm
> +   10.62%     0.00%  [kernel]        [k] ret_from_fork
> +   10.62%     0.00%  [kernel]        [k] kthread
> ...
> <snip>
> 
> 2. Check if cpu_possible() and then fallback to cpumask_nth() if not
> 
> <snip>
> ...
> +    6.84%     0.29%  [kernel]          [k] alloc_vmap_area
> +    6.80%     6.70%  [kernel]          [k] native_queued_spin_lock_slowpath
> +    4.24%     0.09%  [kernel]          [k] free_vmap_block
> +    2.41%     2.38%  [kernel]          [k] addr_to_vb_xa <-----------
> +    1.94%     1.91%  [kernel]          [k] xas_start
> ...
> <snip>
> 
> It is _worth_ to check if an index is in possible mask:
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 45e1506d58c3..af20f78c2cbf 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2542,7 +2542,10 @@ static DEFINE_PER_CPU(struct vmap_block_queue, vmap_block_queue);
>  static struct xarray *
>  addr_to_vb_xa(unsigned long addr)
>  {
> -       int index = (addr / VMAP_BLOCK_SIZE) % num_possible_cpus();
> +       int index = (addr / VMAP_BLOCK_SIZE) % nr_cpu_ids;
> +
> +       if (!cpu_possible(index))
> +               index = cpumask_nth(index, cpu_possible_mask);
> 
>         return &per_cpu(vmap_block_queue, index).vmap_blocks;
>  }
> 
> cpumask_nth() is not cheap. My measurements are based on a synthetic
> tight test and it detects a difference. In a real workloads it should
> not be visible. Having gaps is not a common case plus a "slow path"
> will be mitigated by the hit against possible mask.

Ah, this is consistent with my understanding from the code, thanks
for confirming by testing.


  reply	other threads:[~2024-06-26  0:39 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-06-20  6:19 PROBLEM: kernel crashes when running xfsdump since ~6.4 Nick Bowler
2024-06-20  6:37 ` Hailong Liu
2024-06-20 14:36   ` Nick Bowler
2024-06-20 18:02 ` Nick Bowler
2024-06-21  3:30   ` Hailong Liu
2024-06-21  7:07     ` Baoquan He
2024-06-21  9:44       ` Uladzislau Rezki
2024-06-21 10:45         ` Hailong Liu
2024-06-21 11:15         ` Hailong Liu
2024-06-24 12:18           ` Uladzislau Rezki
2024-06-25  9:26             ` Hailong Liu
2024-06-25  9:55               ` Uladzislau Rezki
2024-06-21 13:42         ` Michael Kelley
2024-06-24 12:17           ` Uladzislau Rezki
2024-06-21 14:02         ` Baoquan He
2024-06-24 12:16           ` Uladzislau Rezki
2024-06-25  3:30             ` Baoquan He
2024-06-25 10:32               ` Uladzislau Rezki
2024-06-25 11:40                 ` Baoquan He
2024-06-25 12:40                   ` Uladzislau Rezki
2024-06-25 13:02                     ` Baoquan He
2024-06-25 15:33                       ` Uladzislau Rezki
2024-06-25 15:49                         ` Baoquan He
2024-06-25 16:49                           ` Uladzislau Rezki
2024-06-25 20:05                             ` Uladzislau Rezki
2024-06-26  0:38                               ` Baoquan He [this message]
2024-06-26  5:12                               ` Hailong Liu
2024-06-26  9:15                                 ` Uladzislau Rezki
2024-06-26 10:03                                   ` Hailong Liu
2024-06-26 10:51                                     ` Baoquan He
2024-06-26 10:53                                       ` Uladzislau Rezki
2024-06-26 11:30                                       ` Hailong Liu
2024-06-26 11:45                                         ` Uladzislau Rezki
2024-06-26 10:51                                     ` Uladzislau Rezki
2024-06-26 13:34                                       ` Nick Bowler
2024-06-26 13:38                                         ` Uladzislau Rezki
2024-06-25 11:19               ` Hailong Liu
2024-06-25 12:41               ` Uladzislau Rezki
2024-06-24 12:20   ` Uladzislau Rezki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZntjIE6msJbF8zTa@MiWiFi-R3L-srv \
    --to=bhe@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=hailong.liu@oppo.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=nbowler@draconx.ca \
    --cc=regressions@lists.linux.dev \
    --cc=sparclinux@vger.kernel.org \
    --cc=urezki@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.