The Linux Kernel Mailing List
* [PATCH v4 0/1] cpuhp: Expedite RCU when toggling system-wide SMT mode
@ 2026-05-07  5:39 Vishal Chourasia
  2026-05-07  5:39 ` [PATCH v4 1/1] " Vishal Chourasia
  0 siblings, 1 reply; 3+ messages in thread
From: Vishal Chourasia @ 2026-05-07  5:39 UTC (permalink / raw)
  To: peterz, aboorvad
  Cc: boqun.feng, frederic, joelagnelf, josh, linux-kernel,
	neeraj.upadhyay, paulmck, rcu, rostedt, srikar, sshegde, tglx,
	urezki, samir, vishalc

Hello All,

An SMT mode switch on a large-CPU-count system takes close to an hour
to complete. Initial debugging root-caused the delay to the CPU hotplug
subsystem blocking on numerous synchronize_rcu() calls. Simply enabling
system-wide RCU expediting reduced the switch time to 5-6 minutes.
Since then, several approaches have been explored; some had side
effects of their own and others did not work as expected.

Approaches explored:

1. Expediting individual CPU hotplug operations by wrapping
_cpu_up()/_cpu_down() with rcu_expedite_gp()/rcu_unexpedite_gp() [0].
Peter suggested expediting only when SMT switch is triggered via the
sysfs control interface, not for individual hotplug operations [1].

2. Replacing synchronize_rcu() calls in the CPU hotplug codepath with
their expedited variants. This is not viable because one
synchronize_rcu() is invoked inside cpus_write_lock(), which is shared
with other kernel subsystems [5].

3. Hoisting cpus_write_lock() to be taken once for the entire SMT switch
operation instead of per-CPU [3][4]. On large systems where the SMT
switch can still take 5-6 minutes, holding the lock for that duration
causes hung task splats and starves other subsystems depending on the
read lock.

4. Peter also suggested using rcu_sync_{enter|exit}(), which does not
help on its own, but could be paired with approach 2 above.

Current approach: expedite RCU grace periods around the SMT switch
operation in the sysfs control interface path, per Peter's suggestion
[1], with Aboorva's analysis confirming synchronize_rcu() as the
bottleneck [2].

[0] https://lore.kernel.org/all/20260218083915.660252-2-vishalc@linux.ibm.com
[1] https://lore.kernel.org/all/20260113090153.GS830755@noisy.programming.kicks-ass.net/
[2] https://lore.kernel.org/all/5f2ab8a44d685701fe36cdaa8042a1aef215d10d.camel@linux.vnet.ibm.com
[3] https://lore.kernel.org/all/20260119114333.GI1890602@noisy.programming.kicks-ass.net/
[4] https://lore.kernel.org/all/ba470918-0ad9-4548-9161-826948462f73@linux.ibm.com/
[5] https://lore.kernel.org/all/804E7B47-F515-4592-B12E-84AD251EB07D@nvidia.com/
[6] https://lore.kernel.org/all/e2cca734-9191-4073-ba9d-936014498645@linux.ibm.com/

Vishal Chourasia (1):
  cpuhp: Expedite RCU when toggling system-wide SMT mode

 include/linux/rcupdate.h | 8 ++++++++
 kernel/cpu.c             | 4 ++++
 kernel/rcu/rcu.h         | 4 ----
 3 files changed, 12 insertions(+), 4 deletions(-)

-- 
2.54.0




Thread overview: 3+ messages
2026-05-07  5:39 [PATCH v4 0/1] cpuhp: Expedite RCU when toggling system-wide SMT mode Vishal Chourasia
2026-05-07  5:39 ` [PATCH v4 1/1] " Vishal Chourasia
2026-05-07 19:07   ` Samir M
