From: Vasily Gorbik <gor@linux.ibm.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Boqun Feng <boqun@kernel.org>,
	Frederic Weisbecker <frederic@kernel.org>,
	Neeraj Upadhyay <neeraj.upadhyay@kernel.org>,
	Joel Fernandes <joelagnelf@nvidia.com>,
	Uladzislau Rezki <urezki@gmail.com>,
	rcu@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-s390@vger.kernel.org, Samir M <samir@linux.ibm.com>,
	Srikar Dronamraju <srikar@linux.ibm.com>
Subject: Re: BUG: workqueue lockup - SRCU schedules work on not-online CPUs during size transition
Date: Wed, 29 Apr 2026 19:50:31 +0200
Message-ID: <tte9o87@ub.hpns>
In-Reply-To: <ed1fa6cd-7343-4ca3-8b9d-d699ca496f83@paulmck-laptop>

On Tue, Apr 14, 2026 at 12:24:12PM -0700, Paul E. McKenney wrote:
> On Thu, Apr 09, 2026 at 09:03:26PM -0700, Paul E. McKenney wrote:
> Please see below for the full patch, including refraining from queueing
> workqueue handlers on not-yet-online CPUs and diverting SRCU callbacks
> from not-yet-fully-online CPUs to the boot CPU's callback queue.
...
> commit ce533a60b2ef29a9b516cc717e77c6b679bc09c0
> Author: Paul E. McKenney <paulmck@kernel.org>
> Date:   Thu Apr 9 11:16:02 2026 -0700
> 
>     srcu: Don't queue workqueue handlers to never-online CPUs
>     
>     While an srcu_struct structure is in the midst of switching from CPU-0
>     to all-CPUs state, it can attempt to invoke callbacks for CPUs that
>     have never been online.  Worse yet, it can attempt to invoke callbacks
>     for CPUs that never will be online due to not being present in the
>     cpu_possible_mask.  This can cause hangs on s390, which is not set up to
>     deal with workqueue handlers being scheduled on such CPUs.  This commit
>     therefore causes Tree SRCU to refrain from queueing workqueue handlers
>     on CPUs that have not yet (and might never) come online.
>     
>     Because callbacks are not invoked on CPUs that have not been
>     online, it is an error to invoke call_srcu(), synchronize_srcu(), or
>     synchronize_srcu_expedited() on a CPU that is not yet fully online.
>     However, it turns out to be less code to redirect the callbacks
>     from too-early invocations of call_srcu() than to warn about such
>     invocations.  This commit therefore also redirects callbacks queued on
>     not-yet-fully-online CPUs to the boot CPU.
>     
>     Reported-by: Vasily Gorbik <gor@linux.ibm.com>
>     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
>     Tested-by: Vasily Gorbik <gor@linux.ibm.com>
>     Cc: Tejun Heo <tj@kernel.org>
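
To make the two behaviors in the commit message above concrete, here is a
minimal user-space sketch in C. It is not the kernel patch and does not use
kernel APIs. The names sketch_cpumask_t, sketch_callback_target() and
sketch_may_queue_work() are hypothetical, the "has been fully online" state
is modeled as a plain 64-bit mask, and the boot CPU is assumed to be CPU 0
purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NR_CPUS  64  /* illustration only: one bit per CPU */
#define SKETCH_BOOT_CPU 0   /* assumption: the boot CPU is CPU 0 */

/* Hypothetical stand-in for a cpumask: bit n set means CPU n has been
 * fully online at least once. */
typedef uint64_t sketch_cpumask_t;

static bool sketch_cpu_was_online(sketch_cpumask_t been_online, int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	return (been_online >> cpu) & 1u;
}

/* Callback redirection: a callback queued from a CPU that is not yet
 * fully online goes to the boot CPU's queue instead of its own. */
static int sketch_callback_target(sketch_cpumask_t been_online, int cpu)
{
	return sketch_cpu_was_online(been_online, cpu) ? cpu : SKETCH_BOOT_CPU;
}

/* Work queueing: only CPUs that have been online may have a handler
 * queued on them; all others are skipped. */
static bool sketch_may_queue_work(sketch_cpumask_t been_online, int cpu)
{
	return sketch_cpu_was_online(been_online, cpu);
}
```

With a mask of 0x3 (only CPUs 0 and 1 have been online), a callback queued
from CPU 5 would be redirected to CPU 0 in this model, and no handler would
be queued on CPU 5.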

I retested it on s390 and on x86 KVM with --smp 16,maxcpus=255, and all
looks good to me.

FWIW, again:

Tested-by: Vasily Gorbik <gor@linux.ibm.com>

Would you mind adding Cc: stable so it gets picked up for v7.0?
61bbcfb50514 ("srcu: Push srcu_node allocation to GP when
non-preemptible") is what made it reproducible for us.

Thank you!
