From: Shrikanth Hegde <sshegde@linux.ibm.com>
To: paulmck@kernel.org
Cc: Tejun Heo <tj@kernel.org>, Vasily Gorbik <gor@linux.ibm.com>,
Srikar Dronamraju <srikar@linux.ibm.com>,
Boqun Feng <boqun@kernel.org>,
Frederic Weisbecker <frederic@kernel.org>,
Neeraj Upadhyay <neeraj.upadhyay@kernel.org>,
Joel Fernandes <joelagnelf@nvidia.com>,
Uladzislau Rezki <urezki@gmail.com>,
rcu@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-s390@vger.kernel.org,
Lai Jiangshan <jiangshanlai@gmail.com>,
samir@linux.ibm.com
Subject: Re: BUG: workqueue lockup - SRCU schedules work on not-online CPUs during size transition
Date: Fri, 1 May 2026 18:47:55 +0530
Message-ID: <981f0de4-fd65-475e-a626-ed7cd3594d3f@linux.ibm.com>
In-Reply-To: <c62502b6-642d-487d-a8a2-4ed7f9c7d858@paulmck-laptop>
Hi Paul.
On 4/30/26 9:40 PM, Paul E. McKenney wrote:
> On Thu, Apr 30, 2026 at 12:38:16PM +0530, Shrikanth Hegde wrote:
>> Hi Paul.
>>
>> On 4/29/26 11:31 PM, Paul E. McKenney wrote:
>
> [ . . . ]
>
> Sorry, missed one...
>
>>> ------------------------------------------------------------------------
>>>
>>> commit f8d5aaaf90f8294890802ce8dccbafd9850ac5f9
>>> Author: Paul E. McKenney <paulmck@kernel.org>
>>> Date: Thu Apr 9 11:16:02 2026 -0700
>>>
>>> srcu: Don't queue workqueue handlers to never-online CPUs
>>> While an srcu_struct structure is in the midst of switching from CPU-0
>>> to all-CPUs state, it can attempt to invoke callbacks for CPUs that
>>> have never been online. Worse yet, it can attempt to invoke callbacks
>>> for CPUs that never will be online due to not being present in the
>>
>> for CPUs that never will be online due to being present in the cpu_possible_mask?
>
> Exactly.
>
> Just because a CPU is in cpu_possible_mask doesn't mean that it will
> ever actually come online. For example, for single-threaded performance
> reasons, a given system might choose to bring online only one CPU from
> each hyperthreaded core. In that case, the other CPU in each hyperthreaded
> core could be in the cpu_possible_mask, but would never come online.
>
> Thanx, Paul
>
Nit: I was suggesting that the *not* is probably not needed in that changelog.
I agree with the explanation.
>>> cpu_possible_mask. This can cause hangs on s390, which is not set up to
>>> deal with workqueue handlers being scheduled on such CPUs. This commit
>>> therefore causes Tree SRCU to refrain from queueing workqueue handlers
>>> on CPUs that have not yet (and might never) come online.
>>> Because callbacks are not invoked on CPUs that have not been
>>> online, it is an error to invoke call_srcu(), synchronize_srcu(), or
>>> synchronize_srcu_expedited() on a CPU that is not yet fully online.
>>> However, it turns out to be less code to redirect the callbacks
>>> from too-early invocations of call_srcu() than to warn about such
>>> invocations. This commit therefore also redirects callbacks queued on
>>> not-yet-fully-online CPUs to the boot CPU.
>>> Reported-by: Vasily Gorbik <gor@linux.ibm.com>
>>> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
>>> Tested-by: Vasily Gorbik <gor@linux.ibm.com>
>>> Cc: Tejun Heo <tj@kernel.org>
Alright. With those two explanations, this LGTM.
Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>
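
For readers following along, the two behaviors described in the changelog
(skip queueing work on CPUs that have never been online, and redirect
callbacks posted for such CPUs to the boot CPU) can be illustrated with a
small userspace model. This is only a sketch under stated assumptions, not
the code from the commit: struct cpu_model and the model_*() helpers are
hypothetical names, the masks are plain 64-bit bitmaps rather than kernel
cpumasks, and "ever_online" stands in for whatever state the kernel uses to
track CPUs that have come fully online.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model: one bit per CPU, at most 64 CPUs. */
struct cpu_model {
	uint64_t possible;    /* CPUs that could ever exist (cf. cpu_possible_mask) */
	uint64_t ever_online; /* CPUs that have come fully online at least once */
	int boot_cpu;         /* CPU assumed to be online from boot */
};

/* Test one bit, rejecting CPU numbers outside the modeled range. */
static bool model_cpu_in(uint64_t mask, int cpu)
{
	return cpu >= 0 && cpu < 64 && ((mask >> cpu) & 1);
}

/* A handler may be queued only on a possible CPU that has been online. */
static bool model_may_queue_work(const struct cpu_model *m, int cpu)
{
	return model_cpu_in(m->possible, cpu) &&
	       model_cpu_in(m->ever_online, cpu);
}

/* Callbacks aimed at a never-online CPU are redirected to the boot CPU. */
static int model_callback_cpu(const struct cpu_model *m, int cpu)
{
	return model_may_queue_work(m, cpu) ? cpu : m->boot_cpu;
}
```

With possible = 0xF and ever_online = 0x5 (four possible CPUs, one thread
per two-thread core brought online, as in Paul's example), CPUs 1 and 3 are
possible but never receive queued work, and callbacks aimed at them land on
the boot CPU instead.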