public inbox for rcu@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH-next 00/23] cgroup/cpuset: Enable runtime update of nohz_full and managed_irq CPUs
@ 2026-04-21  3:03 Waiman Long
  2026-04-21  3:03 ` [PATCH 01/23] sched/isolation: Add HK_TYPE_KERNEL_NOISE_BOOT & HK_TYPE_MANAGED_IRQ_BOOT Waiman Long
                   ` (22 more replies)
  0 siblings, 23 replies; 36+ messages in thread
From: Waiman Long @ 2026-04-21  3:03 UTC (permalink / raw)
  To: Tejun Heo, Johannes Weiner, Michal Koutný, Jonathan Corbet,
	Shuah Khan, Catalin Marinas, Will Deacon, K. Y. Srinivasan,
	Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li, Guenter Roeck,
	Frederic Weisbecker, Paul E. McKenney, Neeraj Upadhyay,
	Joel Fernandes, Josh Triplett, Boqun Feng, Uladzislau Rezki,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Zqiang,
	Anna-Maria Behnsen, Ingo Molnar, Thomas Gleixner, Chen Ridong,
	Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Ben Segall, Mel Gorman, Valentin Schneider, K Prateek Nayak,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: cgroups, linux-doc, linux-kernel, linux-arm-kernel, linux-hyperv,
	linux-hwmon, rcu, netdev, linux-kselftest, Costa Shulyupin,
	Qiliang Yuan, Waiman Long

The "isolcpus=domain" CPU list implied by the HK_TYPE_DOMAIN housekeeping
cpumask is being dynamically updated whenever the set of cpuset isolated
partitions change. This patch series extends the isolated partition code
to make dynamic changes to the "nohz_full" and "isolcpus=managed_irq"
CPU lists. These CPU lists correspoond to equivalent changes in the
HK_TYPE_KERNEL_NOISE and HK_TYPE_MANAGED_IRQ housekeeping cpumasks.

To facilitate the changing of these CPU lists, the CPU hotplug code
which is doing a lot of heavy lifting is now being used. For changing
the "nohz_full" and HK_TYPE_KERNEL_NOISE cpumasks, the affected CPUs
are torn down into offline mode first. Then the housekeeping cpumask
and the corresponding cpumasks in the RCU, tick and watchdog subsystems
are modified. After that, the affected CPUs are brought online again.

It is slightly different with the "managed_irq" HK_TYPE_MANAGED_IRQ
cpumask, the cpumask is updated first and the affected CPUs are torn
down and brought up to move the managed interrupts away from the newly
isolated CPUs.

Using CPU hotplug does have its drawback when multiple isolated
partitions are being managed in a system. The CPU offline process uses
the stop_machine mechanism to stop all the CPUs except the one being torn
down. This will cause latency spike for isolated CPUs in other isolated
partitions. It is not a problem if only one isolated partition is
needed. This is an issue we need to address in the near future.

Patches 1-7 enables runtime updates to the nohz_full/managed_irq
cpumasks in sched/isolation, tick, RCU and watchdog subsystems.

Patches 8-17 modifies various subsystems that need access to the
HK_TYPE_MANAGED_IRQ cpumask and HK_TYPE_KERNEL_NOISE cpumask and its
aliases. Like the runtime modifiable HK_TYPE_DOMAIN cpumask, RCU is used
to protect access to cpumasks to avoid potential UAF problem. Patch 17
updates the housekeeping_dereference_check() to print a WARNING when
lockdep is enabled if those housekeeping cpumasks are used without the
proper lock protection.

Patch 18 introduces a new cpuhp_offline_cb() API that enables the
shutting down of a given list of CPUs, run a callback function and then
brought those CPUs up again while disallowing any concurrent CPU hotplug
activity.

Patches 19-23 updates the cpuset code, selftest and documentation to
allow change in isolated partition configuration to be reflected in
the housekeeping and other cpumasks dynamically.

As there is a slight overhead in enabling dynamic update to the nohz_full
cpumask, this new nohz_full and managed_irq runtime update feature
has to be explicitly opted in by adding a nohz_full kernel command
line parameter with or without a CPU list to indicate a desire to use
this feature. It is also because a number of subsystems have explicit
check of nohz_full at boot time to adjust their behavior which may not
be easy to modify after boot.

Waiman Long (23):
  sched/isolation: Add HK_TYPE_KERNEL_NOISE_BOOT &
    HK_TYPE_MANAGED_IRQ_BOOT
  sched/isolation: Enhance housekeeping_update() to support updating
    more than one HK cpumask
  tick/nohz: Make nohz_full parameter optional
  tick/nohz: Allow runtime changes in full dynticks CPUs
  tick: Pass timer tick job to an online HK CPU in tick_cpu_dying()
  rcu/nocbs: Allow runtime changes in RCU NOCBS cpumask
  watchdog: Sync up with runtime change of isolated CPUs
  arm64: topology: Use RCU to protect access to HK_TYPE_TICK cpumask
  workqueue: Use RCU to protect access of HK_TYPE_TIMER cpumask
  cpu: Use RCU to protect access of HK_TYPE_TIMER cpumask
  hrtimer: Use RCU to protect access of HK_TYPE_TIMER cpumask
  net: Use boot time housekeeping cpumask settings for now
  sched/core: Use RCU to protect access of HK_TYPE_KERNEL_NOISE cpumask
  hwmon/coretemp: Use RCU to protect access of HK_TYPE_MISC cpumask
  Drivers: hv: Use RCU to protect access of HK_TYPE_MANAGED_IRQ cpumask
  genirq/cpuhotplug: Use RCU to protect access of HK_TYPE_MANAGED_IRQ
    cpumask
  sched/isolation: Extend housekeeping_dereference_check() to cover
    changes in nohz_full or manged_irqs           cpumasks
  cpu/hotplug: Add a new cpuhp_offline_cb() API
  cgroup/cpuset: Improve check for calling housekeeping_update()
  cgroup/cpuset: Enable runtime update of
    HK_TYPE_{KERNEL_NOISE,MANAGED_IRQ} cpumasks
  cgroup/cpuset: Limit the side effect of using CPU hotplug on isolated
    partition
  cgroup/cpuset: Prevent offline_disabled CPUs from being used in
    isolated partition
  cgroup/cpuset: Documentation and kselftest updates

 Documentation/admin-guide/cgroup-v2.rst       |  35 ++-
 .../admin-guide/kernel-parameters.txt         |  15 +-
 arch/arm64/kernel/topology.c                  |  17 +-
 drivers/hv/channel_mgmt.c                     |  15 +-
 drivers/hv/vmbus_drv.c                        |   7 +-
 drivers/hwmon/coretemp.c                      |   6 +-
 include/linux/context_tracking.h              |   8 +-
 include/linux/cpuhplock.h                     |   9 +
 include/linux/nmi.h                           |   2 +
 include/linux/rcupdate.h                      |   2 +
 include/linux/sched/isolation.h               |  18 +-
 include/linux/tick.h                          |   2 +
 kernel/cgroup/cpuset-internal.h               |   1 +
 kernel/cgroup/cpuset.c                        | 292 +++++++++++++++++-
 kernel/context_tracking.c                     |  19 +-
 kernel/cpu.c                                  |  72 +++++
 kernel/irq/cpuhotplug.c                       |   1 +
 kernel/irq/manage.c                           |   1 +
 kernel/rcu/tree_nocb.h                        |  24 +-
 kernel/sched/core.c                           |   4 +-
 kernel/sched/isolation.c                      | 135 +++++---
 kernel/time/hrtimer.c                         |   4 +-
 kernel/time/tick-common.c                     |  16 +-
 kernel/time/tick-sched.c                      |  48 ++-
 kernel/watchdog.c                             |  24 ++
 kernel/workqueue.c                            |   6 +-
 net/core/net-sysfs.c                          |   2 +-
 .../selftests/cgroup/test_cpuset_prs.sh       |  70 ++++-
 28 files changed, 747 insertions(+), 108 deletions(-)

-- 
2.53.0


^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2026-04-21 16:17 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-21  3:03 [PATCH-next 00/23] cgroup/cpuset: Enable runtime update of nohz_full and managed_irq CPUs Waiman Long
2026-04-21  3:03 ` [PATCH 01/23] sched/isolation: Add HK_TYPE_KERNEL_NOISE_BOOT & HK_TYPE_MANAGED_IRQ_BOOT Waiman Long
2026-04-21  3:03 ` [PATCH 02/23] sched/isolation: Enhance housekeeping_update() to support updating more than one HK cpumask Waiman Long
2026-04-21  3:03 ` [PATCH 03/23] tick/nohz: Make nohz_full parameter optional Waiman Long
2026-04-21  8:32   ` Thomas Gleixner
2026-04-21 14:14     ` Waiman Long
2026-04-21  3:03 ` [PATCH 04/23] tick/nohz: Allow runtime changes in full dynticks CPUs Waiman Long
2026-04-21  8:50   ` Thomas Gleixner
2026-04-21 14:24     ` Waiman Long
2026-04-21  3:03 ` [PATCH 05/23] tick: Pass timer tick job to an online HK CPU in tick_cpu_dying() Waiman Long
2026-04-21  8:55   ` Thomas Gleixner
2026-04-21 14:22     ` Waiman Long
2026-04-21  3:03 ` [PATCH 06/23] rcu/nocbs: Allow runtime changes in RCU NOCBS cpumask Waiman Long
2026-04-21  3:03 ` [PATCH 07/23] watchdog: Sync up with runtime change of isolated CPUs Waiman Long
2026-04-21  3:03 ` [PATCH 08/23] arm64: topology: Use RCU to protect access to HK_TYPE_TICK cpumask Waiman Long
2026-04-21  3:03 ` [PATCH 09/23] workqueue: Use RCU to protect access of HK_TYPE_TIMER cpumask Waiman Long
2026-04-21  3:03 ` [PATCH 10/23] cpu: " Waiman Long
2026-04-21  8:57   ` Thomas Gleixner
2026-04-21 14:25     ` Waiman Long
2026-04-21  3:03 ` [PATCH 11/23] hrtimer: " Waiman Long
2026-04-21  8:59   ` Thomas Gleixner
2026-04-21  3:03 ` [PATCH 12/23] net: Use boot time housekeeping cpumask settings for now Waiman Long
2026-04-21  3:03 ` [PATCH 13/23] sched/core: Use RCU to protect access of HK_TYPE_KERNEL_NOISE cpumask Waiman Long
2026-04-21  3:03 ` [PATCH 14/23] hwmon/coretemp: Use RCU to protect access of HK_TYPE_MISC cpumask Waiman Long
2026-04-21  3:03 ` [PATCH 15/23] Drivers: hv: Use RCU to protect access of HK_TYPE_MANAGED_IRQ cpumask Waiman Long
2026-04-21  3:03 ` [PATCH 16/23] genirq/cpuhotplug: " Waiman Long
2026-04-21  9:02   ` Thomas Gleixner
2026-04-21 14:29     ` Waiman Long
2026-04-21  3:03 ` [PATCH 17/23] sched/isolation: Extend housekeeping_dereference_check() to cover changes in nohz_full or manged_irqs cpumasks Waiman Long
2026-04-21  3:03 ` [PATCH 18/23] cpu/hotplug: Add a new cpuhp_offline_cb() API Waiman Long
2026-04-21 16:17   ` Thomas Gleixner
2026-04-21  3:03 ` [PATCH 19/23] cgroup/cpuset: Improve check for calling housekeeping_update() Waiman Long
2026-04-21  3:03 ` [PATCH 20/23] cgroup/cpuset: Enable runtime update of HK_TYPE_{KERNEL_NOISE,MANAGED_IRQ} cpumasks Waiman Long
2026-04-21  3:03 ` [PATCH 21/23] cgroup/cpuset: Limit the side effect of using CPU hotplug on isolated partition Waiman Long
2026-04-21  3:03 ` [PATCH 22/23] cgroup/cpuset: Prevent offline_disabled CPUs from being used in " Waiman Long
2026-04-21  3:03 ` [PATCH 23/23] cgroup/cpuset: Documentation and kselftest updates Waiman Long

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox