From: Boqun Feng <boqun@kernel.org>
To: Tejun Heo <tj@kernel.org>
Cc: Vasily Gorbik <gor@linux.ibm.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Frederic Weisbecker <frederic@kernel.org>,
	Neeraj Upadhyay <neeraj.upadhyay@kernel.org>,
	Joel Fernandes <joelagnelf@nvidia.com>,
	Uladzislau Rezki <urezki@gmail.com>,
	rcu@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-s390@vger.kernel.org,
	Lai Jiangshan <jiangshanlai@gmail.com>
Subject: Re: BUG: workqueue lockup - SRCU schedules work on not-online CPUs during size transition
Date: Thu, 9 Apr 2026 11:10:04 -0700
Message-ID: <adfrfJGrglg0bGw_@tardis.local>
In-Reply-To: <adfmHZfABu64Kv4D@slm.duckdns.org>

On Thu, Apr 09, 2026 at 07:47:09AM -1000, Tejun Heo wrote:
> On Thu, Apr 09, 2026 at 10:40:05AM -0700, Boqun Feng wrote:
> > On Thu, Apr 09, 2026 at 10:26:49AM -0700, Boqun Feng wrote:
> > > On Thu, Apr 09, 2026 at 03:08:45PM +0200, Vasily Gorbik wrote:
> > > > Commit 61bbcfb50514 ("srcu: Push srcu_node allocation to GP when
> > > > non-preemptible") defers srcu_node tree allocation when called under
> > > > raw spinlock, putting SRCU through ~6 transitional grace periods
> > > > (SRCU_SIZE_ALLOC to SRCU_SIZE_BIG). During this transition srcu_gp_end()
> > > > uses mask = ~0, which makes srcu_schedule_cbs_snp() call queue_work_on()
> > > > for every possible CPU. Since rcu_gp_wq is WQ_PERCPU, work targets
> > > > per-CPU pools directly - pools for not-online CPUs have no workers,
> > > 
> > > [Cc workqueue]
> > > 
> > > Hmm.. I thought for offline CPUs the corresponding worker pools become
> > > unbound ones, hence there are still workers?
> > > 
> > 
> > Ah, as Paul replied in another email, the problem was because these CPUs
> > had never been onlined, so they don't even have unbound workers?
> 
> Hahaha, we do initialize worker pool for every possible CPU but the
> transition to unbound operation happens in the hot unplug callback. We

;-) ;-) ;-)

> probably need to do some of the hot unplug operation during init if the CPU

It seems that we (mostly Paul) have our own trick in RCU to track whether
a CPU has ever been onlined, see rcu_cpu_beenfullyonline(). Paul also
used it in his fix [1]. I think it wouldn't be that hard to copy it into
workqueue and let queue_work_on() use it, so that if a user queues work
on a never-onlined CPU, it can detect that (with a warning?) and do
something about it?

[1]: https://lore.kernel.org/rcu/073abb55-197a-4519-b177-f9f776624fed@paulmck-laptop/
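
To make the idea concrete, here is a rough userspace model of the check,
not kernel code. Everything in it (NR_CPUS_MODEL, the model_* helpers,
the two flag arrays, and the "redirect to the first online CPU" fallback)
is hypothetical and only illustrates a sticky "has been online" flag
being consulted at queueing time. What the real fallback should be is
an open question.

```c
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS_MODEL 8	/* hypothetical number of possible CPUs */

/* Hypothetical per-CPU state: been_online is set once and never cleared. */
static bool cpu_online_now[NR_CPUS_MODEL];
static bool cpu_been_online[NR_CPUS_MODEL];

static void model_cpu_online(int cpu)
{
	cpu_online_now[cpu] = true;
	cpu_been_online[cpu] = true;
}

static void model_cpu_offline(int cpu)
{
	/* Only the "online now" state goes away; been_online stays set. */
	cpu_online_now[cpu] = false;
}

/*
 * Model of a queue_work_on()-style check.  Returns the CPU the work
 * would end up on, or -1 if it cannot be queued anywhere.
 */
static int model_queue_work_on(int cpu)
{
	if (cpu < 0 || cpu >= NR_CPUS_MODEL)
		return -1;

	if (!cpu_been_online[cpu]) {
		/* Never-onlined CPU: warn and pick some online CPU (assumption). */
		fprintf(stderr, "warning: work queued on never-onlined CPU %d\n", cpu);
		for (int i = 0; i < NR_CPUS_MODEL; i++)
			if (cpu_online_now[i])
				return i;
		return -1;
	}

	/* Onlined at least once: its pool is either bound or already unbound. */
	return cpu;
}
```

In this model a CPU that was onlined and later unplugged still accepts
work as before, while a possible-but-never-onlined CPU triggers the
warning and the fallback path.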

Regards,
Boqun

> is possible but not online. That said, what kind of machine is it? Is the
> firmware just reporting bogus possible mask? How come the CPUs weren't
> online during boot?
> 
> Thanks.
> 
> -- 
> tejun
