From: Shrikanth Hegde <sshegde@linux.ibm.com>
To: paulmck@kernel.org
Cc: Tejun Heo <tj@kernel.org>, Vasily Gorbik <gor@linux.ibm.com>,
Srikar Dronamraju <srikar@linux.ibm.com>,
Boqun Feng <boqun@kernel.org>,
Frederic Weisbecker <frederic@kernel.org>,
Neeraj Upadhyay <neeraj.upadhyay@kernel.org>,
Joel Fernandes <joelagnelf@nvidia.com>,
Uladzislau Rezki <urezki@gmail.com>,
rcu@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-s390@vger.kernel.org,
Lai Jiangshan <jiangshanlai@gmail.com>,
samir@linux.ibm.com
Subject: Re: BUG: workqueue lockup - SRCU schedules work on not-online CPUs during size transition
Date: Fri, 1 May 2026 18:47:55 +0530
Message-ID: <981f0de4-fd65-475e-a626-ed7cd3594d3f@linux.ibm.com>
In-Reply-To: <c62502b6-642d-487d-a8a2-4ed7f9c7d858@paulmck-laptop>

Hi Paul.

On 4/30/26 9:40 PM, Paul E. McKenney wrote:
> On Thu, Apr 30, 2026 at 12:38:16PM +0530, Shrikanth Hegde wrote:
>> Hi Paul.
>>
>> On 4/29/26 11:31 PM, Paul E. McKenney wrote:
>
> [ . . . ]
>
> Sorry, missed one...
>
>>> ------------------------------------------------------------------------
>>>
>>> commit f8d5aaaf90f8294890802ce8dccbafd9850ac5f9
>>> Author: Paul E. McKenney <paulmck@kernel.org>
>>> Date: Thu Apr 9 11:16:02 2026 -0700
>>>
>>> srcu: Don't queue workqueue handlers to never-online CPUs
>>>
>>> While an srcu_struct structure is in the midst of switching from CPU-0
>>> to all-CPUs state, it can attempt to invoke callbacks for CPUs that
>>> have never been online. Worse yet, it can attempt to invoke callbacks
>>> for CPUs that never will be online due to not being present in the
>>
>> for CPUs that never will be online due to being present in the cpu_possible_mask?
>
> Exactly.
>
> Just because a CPU is in cpu_possible_mask doesn't mean that it will
> ever actually come online. For example, for single-threaded performance
> reasons, a given system might choose to bring online only one CPU from
> each hyperthreaded core. In that case, the other CPU in each hyperthreaded
> core could be in the cpu_possible_mask, but would never come online.
>
> Thanx, Paul
>

Nit: I was suggesting that the *not* is probably not needed in that changelog.
I agree with the explanation.

>>> cpu_possible_mask. This can cause hangs on s390, which is not set up to
>>> deal with workqueue handlers being scheduled on such CPUs. This commit
>>> therefore causes Tree SRCU to refrain from queueing workqueue handlers
>>> on CPUs that have not yet (and might never) come online.
>>>
>>> Because callbacks are not invoked on CPUs that have not been
>>> online, it is an error to invoke call_srcu(), synchronize_srcu(), or
>>> synchronize_srcu_expedited() on a CPU that is not yet fully online.
>>> However, it turns out to be less code to redirect the callbacks
>>> from too-early invocations of call_srcu() than to warn about such
>>> invocations. This commit therefore also redirects callbacks queued on
>>> not-yet-fully-online CPUs to the boot CPU.
>>>
>>> Reported-by: Vasily Gorbik <gor@linux.ibm.com>
>>> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
>>> Tested-by: Vasily Gorbik <gor@linux.ibm.com>
>>> Cc: Tejun Heo <tj@kernel.org>

Alright. With those two explanations, this LGTM.

Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>
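
For anyone following along, here is a small userspace model of the two
behaviors the changelog describes: skip queueing work for CPUs that have
never been online, and redirect callbacks from not-yet-online CPUs to the
boot CPU. This is only a sketch and not the code from the commit. The names
ever_online_mask, mark_cpu_online(), may_queue_work_on(), and
callback_target_cpu() are hypothetical, the masks are plain bitmaps rather
than kernel cpumasks, and CPU 0 is assumed to be the boot CPU.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS  8	/* hypothetical size for this model */
#define BOOT_CPU 0	/* assumption: CPU 0 is the boot CPU */

/* Userspace stand-ins for kernel cpumasks, one bit per CPU. */
static unsigned int possible_mask;	/* models cpu_possible_mask */
static unsigned int ever_online_mask;	/* hypothetical: CPUs that have been online */

/* Return true if @cpu is a valid CPU number whose bit is set in @mask. */
static bool cpu_in(unsigned int mask, int cpu)
{
	return cpu >= 0 && cpu < NR_CPUS && (mask & (1u << cpu)) != 0;
}

/* Record that @cpu has come online at least once. */
static void mark_cpu_online(int cpu)
{
	assert(cpu_in(possible_mask, cpu));	/* only possible CPUs can come online */
	ever_online_mask |= 1u << cpu;
}

/*
 * Being in the possible mask is not enough: a possible CPU might never
 * come online, so work is queued only for CPUs that have been online.
 */
static bool may_queue_work_on(int cpu)
{
	return cpu_in(possible_mask, cpu) && cpu_in(ever_online_mask, cpu);
}

/*
 * Pick the CPU whose callback list receives a new callback. Callbacks
 * from a CPU that has not yet been online are redirected to the boot CPU.
 */
static int callback_target_cpu(int this_cpu)
{
	if (cpu_in(ever_online_mask, this_cpu))
		return this_cpu;
	return BOOT_CPU;
}
```

With all eight CPUs possible but only CPUs 0 and 2 marked online,
may_queue_work_on(1) is false and callback_target_cpu(1) returns BOOT_CPU,
while CPU 2 keeps its own callbacks.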