From: Boqun Feng <boqun@kernel.org>
To: Samir M <samir@linux.ibm.com>
Cc: "Paul E . McKenney" <paulmck@kernel.org>,
Boqun Feng <boqun.feng@gmail.com>,
LKML <linux-kernel@vger.kernel.org>, Tejun Heo <tj@kernel.org>,
RCU <rcu@vger.kernel.org>,
linuxppc-dev@lists.ozlabs.org,
Shrikanth Hegde <sshegde@linux.ibm.com>
Subject: Re: [mainline][BUG] Observed Workqueue lockups on offline CPUs.
Date: Mon, 27 Apr 2026 08:43:59 -0700
Message-ID: <ae-EP1BpXgnEWCt4@tardis.local>
In-Reply-To: <688280dc-78a2-4796-9eaf-e1c058836012@linux.ibm.com>
On Mon, Apr 27, 2026 at 05:00:10PM +0530, Samir M wrote:
>
Hi Samir,
> On 27/04/26 3:32 pm, Samir M wrote:
> > Hi Paul,
> >
> > I've been testing the latest upstream kernel on a PowerPC system and
> > encountered workqueue lockup issues that I've bisected to commit
> > 61bbcfb50514 ("srcu: Push srcu_node allocation to GP when
> > non-preemptible").
> > After booting, I'm seeing workqueue lockup warnings for CPUs 81-96,
> > which are offline on my system. The workqueues remain stuck for over 237
> > seconds:
> >
> > [ 243.309302][ C0] BUG: workqueue lockup - pool cpus=81 node=0 flags=0x4 nice=0 stuck for 237s!
> > [ 243.309311][ C0] BUG: workqueue lockup - pool cpus=82 node=0 flags=0x4 nice=0 stuck for 237s!
> > [ 243.309318][ C0] BUG: workqueue lockup - pool cpus=83 node=0 flags=0x4 nice=0 stuck for 237s!
> > [ 243.309326][ C0] BUG: workqueue lockup - pool cpus=84 node=0 flags=0x4 nice=0 stuck for 237s!
> > [ 243.309333][ C0] BUG: workqueue lockup - pool cpus=85 node=0 flags=0x4 nice=0 stuck for 237s!
> > [ 243.309341][ C0] BUG: workqueue lockup - pool cpus=86 node=0 flags=0x4 nice=0 stuck for 237s!
> > [ 243.309348][ C0] BUG: workqueue lockup - pool cpus=87 node=0 flags=0x4 nice=0 stuck for 237s!
> > [ 243.309355][ C0] BUG: workqueue lockup - pool cpus=88 node=0 flags=0x4 nice=0 stuck for 237s!
> > [ 243.309363][ C0] BUG: workqueue lockup - pool cpus=89 node=0 flags=0x4 nice=0 stuck for 237s!
> > [ 243.309370][ C0] BUG: workqueue lockup - pool cpus=90 node=0 flags=0x4 nice=0 stuck for 237s!
> > [ 243.309377][ C0] BUG: workqueue lockup - pool cpus=91 node=0 flags=0x4 nice=0 stuck for 237s!
> > [ 243.309384][ C0] BUG: workqueue lockup - pool cpus=92 node=0 flags=0x4 nice=0 stuck for 237s!
> > [ 243.309392][ C0] BUG: workqueue lockup - pool cpus=93 node=0 flags=0x4 nice=0 stuck for 237s!
> > [ 243.309399][ C0] BUG: workqueue lockup - pool cpus=94 node=0 flags=0x4 nice=0 stuck for 237s!
> > [ 243.309406][ C0] BUG: workqueue lockup - pool cpus=95 node=0 flags=0x4 nice=0 stuck for 237s!
> > [ 243.309413][ C0] BUG: workqueue lockup - pool cpus=96 node=0 flags=0x4 nice=0 stuck for 237s!
> >
> > Git bisect identified this as the first bad commit:
> >
> > commit 61bbcfb50514a8a94e035a7349697a3790ab4783
> > Author: Paul E. McKenney <paulmck@kernel.org>
> > Date: Fri Mar 20 20:29:20 2026 -0700
> >
> > srcu: Push srcu_node allocation to GP when non-preemptible
> >
> > When the srcutree.convert_to_big and srcutree.big_cpu_lim kernel boot
> > parameters specify initialization-time allocation of the srcu_node
> > tree for statically allocated srcu_struct structures (for example, in
> > DEFINE_SRCU() at build time instead of init_srcu_struct() at runtime),
> > init_srcu_struct_nodes() will attempt to dynamically allocate this tree
> > at the first run-time update-side use of this srcu_struct structure,
> > but while holding a raw spinlock. Because the memory allocator can
> > acquire non-raw spinlocks, this can result in lockdep splats.
> >
> > This commit therefore uses the same SRCU_SIZE_ALLOC trick that is used
> > when the first run-time update-side use of this srcu_struct structure
> > happens before srcu_init() is called. The actual allocation then takes
> > place from workqueue context at the ends of upcoming SRCU grace periods.
> >
> > [boqun: Adjust the sha1 of the Fixes tag]
> >
> > Fixes: 175b45ed343a ("srcu: Use raw spinlocks so call_srcu() can be used under preempt_disable()")
> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > Signed-off-by: Boqun Feng <boqun@kernel.org>
> >
> > kernel/rcu/srcutree.c | 7 +++++--
> > 1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > Reverting this commit resolves the issue.
> >
> > The problem appears to be that the workqueue is attempting to execute on
> > offline CPUs. The commit moves SRCU node allocation to workqueue context
> > to avoid lockdep issues with memory allocation under raw spinlocks,
> > which makes sense. However, it seems the workqueue scheduling doesn't
> > properly account for CPU online/offline state in this code path.
> >
> > My test environment:
> > - Architecture: PowerPC
> > - Kernel version: Latest upstream (7.1-rc1)
> > - CPUs 81-96 are offline at boot time
> >
> > I suspect the issue might be related to:
> > 1. Workqueue not checking CPU online status before scheduling SRCU
> > allocation work
> > 2. Missing CPU hotplug awareness in the new workqueue-based allocation
> > path
> > 3. Possible race condition with CPU hotplug events
> >
> > Would it make sense to use queue_work_on() with explicit online CPU
> > selection, or add CPU hotplug handlers for this workqueue? I'm not
> > deeply familiar with the workqueue internals, so I might be missing
> > something.
> > Please let me know if you need any additional details or if you'd like
> > me to test any patches.
> >
> > If you fix the above issue, please add the tag below.
> > Reported-by: Samir M <samir@linux.ibm.com>
> >
> >
> > Thanks,
> > Samir
>
> Hi Paul,
>
>
> I worked on fixing the issue and introduced the changes below. With these
> updates, I no longer observe any workqueue lockup messages for offline CPUs.
> Could you please review the changes and share your feedback?
>
> The commit 61bbcfb50514 ("srcu: Push srcu_node allocation to GP when
> non-preemptible") introduced workqueue lockups on systems with offline
> CPUs. The issue occurs because srcu_queue_delayed_work_on() calls
> queue_work_on() with sdp->cpu, which may be offline, causing the
> workqueue to spin indefinitely on that CPU.
>
> This patch fixes the issue by checking if the target CPU is online
> before queuing work on it. If the CPU is offline, we fall back to
> using queue_work() which will schedule the work on any available
> online CPU.
>
> Fixes: 61bbcfb50514 ("srcu: Push srcu_node allocation to GP when non-preemptible")
>
> Signed-off-by: Samir <samir@linux.ibm.com>

Thanks for the patch, but have you checked this email thread?

https://lore.kernel.org/rcu/ttd89ul@ub.hpns/

Paul had a fix [1], and TJ had a "fix" [2] on the workqueue side.

In general, I think we discovered that as long as a CPU has been onlined
once, it's OK to queue work on that CPU (even if it has since been
offlined), even without TJ's patch (whether we should do that is a
different question ;-)). Please do check whether Paul's fix works for
your case, thanks!

[1]: https://lore.kernel.org/rcu/ed1fa6cd-7343-4ca3-8b9d-d699ca496f83@paulmck-laptop/
[2]: https://lore.kernel.org/rcu/adlHKowvhn8AGXCc@slm.duckdns.org/
Regards,
Boqun
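
As a rough illustration of the rule stated above (a CPU that has been
onlined at least once can have work queued on it even if it is currently
offline), here is a minimal user-space toy model. It is only a sketch of
the stated rule, not of how the workqueue code tracks hotplug state;
every name in it (model_cpus, model_cpu_up(), model_cpu_down(),
model_ok_to_queue_on(), MODEL_NR_CPUS) is hypothetical and not a kernel
API.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 128 /* hypothetical size, large enough for CPUs 81-96 */

/* Toy per-CPU hotplug state; these names are illustrative, not kernel APIs. */
struct model_cpus {
	bool online[MODEL_NR_CPUS];      /* currently online */
	bool ever_online[MODEL_NR_CPUS]; /* has been online at least once */
};

/* Bring a CPU online and remember that it has been online. */
static void model_cpu_up(struct model_cpus *m, unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	m->online[cpu] = true;
	m->ever_online[cpu] = true;
}

/* Take a CPU offline; its "ever online" history is kept. */
static void model_cpu_down(struct model_cpus *m, unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	m->online[cpu] = false;
}

/* The rule as stated: queueing is OK once the CPU has been onlined once. */
static bool model_ok_to_queue_on(const struct model_cpus *m, unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	return m->ever_online[cpu];
}
```

In this model, a CPU that went up and then down still passes the check,
while a CPU that was never brought up (such as CPUs 81-96 in the report,
if they were offline since boot) does not.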
> ---
> kernel/rcu/srcutree.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index 0d01cd8c4b4a..55a90dd4a030 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -869,10 +869,15 @@ static void srcu_delay_timer(struct timer_list *t)
> static void srcu_queue_delayed_work_on(struct srcu_data *sdp,
> unsigned long delay)
> {
> - if (!delay) {
> + if (!delay && cpu_online(sdp->cpu)) {
> queue_work_on(sdp->cpu, rcu_gp_wq, &sdp->work);
> return;
> + } else if (!delay) {
> + /* CPU is offline, queue on any available CPU */
> + queue_work(rcu_gp_wq, &sdp->work);
> + return;
> + }
>
> timer_reduce(&sdp->delay_work, jiffies + delay);
> }
> --
>
>
> Thanks,
> Samir
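
The three-way decision made by the proposed srcu_queue_delayed_work_on()
change can be summarized in a small user-space sketch. This only models
the branching in the diff above so that it can be checked in isolation;
the enum and function names are hypothetical, and the kernel calls appear
in comments only. Note that queue_work(wq, work) is shorthand for
queue_work_on(WORK_CPU_UNBOUND, wq, work), so where the fallback work
actually runs depends on the workqueue's configuration.

```c
#include <assert.h> /* for checks when exercising the model */
#include <stdbool.h>

/* Hypothetical labels for the three outcomes; not kernel identifiers. */
enum srcu_dispatch {
	DISPATCH_ON_SDP_CPU, /* queue_work_on(sdp->cpu, rcu_gp_wq, &sdp->work) */
	DISPATCH_UNBOUND,    /* queue_work(rcu_gp_wq, &sdp->work) */
	DISPATCH_TIMER,      /* timer_reduce(&sdp->delay_work, jiffies + delay) */
};

/* Mirror the patched branching: a nonzero delay always goes to the timer. */
static enum srcu_dispatch model_dispatch(bool sdp_cpu_online,
					 unsigned long delay)
{
	if (delay)
		return DISPATCH_TIMER;
	return sdp_cpu_online ? DISPATCH_ON_SDP_CPU : DISPATCH_UNBOUND;
}
```

With a zero delay, an online sdp->cpu yields DISPATCH_ON_SDP_CPU and an
offline one yields DISPATCH_UNBOUND; the delayed path is unchanged by the
patch and does not consult the online state.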