public inbox for rcu@vger.kernel.org
* [PATCH 00/15] Implementation of Dynamic Housekeeping & Enhanced Isolation (DHEI)
@ 2026-03-25  9:09 Qiliang Yuan
From: Qiliang Yuan @ 2026-03-25  9:09 UTC
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, Thomas Gleixner, Paul E. McKenney,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes,
	Josh Triplett, Boqun Feng, Uladzislau Rezki, Mathieu Desnoyers,
	Lai Jiangshan, Zqiang, Tejun Heo, Andrew Morton, Vlastimil Babka,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
	Johannes Weiner, Zi Yan, Anna-Maria Behnsen, Ingo Molnar,
	Shuah Khan
  Cc: linux-kernel, rcu, linux-mm, linux-kselftest, Qiliang Yuan

The Linux kernel provides mechanisms like 'isolcpus' and 'nohz_full' to
reduce interference for latency-sensitive workloads. However, these are
locked behind the "Reboot Wall": they can only be configured via boot
parameters, and any change requires a system restart to take effect.

In modern cloud-native environments, CPU resources often need to be
dynamically re-partitioned to accommodate container scaling without
the performance penalty and downtime of a full system reboot. Similarly,
high-frequency trading (HFT) platforms require the ability to fine-tune
CPU isolation at runtime to minimize jitter for critical execution threads
based on shifting market demands.

This patch series introduces Dynamic Housekeeping & Enhanced Isolation
(DHEI). DHEI allows administrators to reconfigure the kernel's
housekeeping boundaries at runtime via a new sysfs interface at
/sys/kernel/housekeeping/.
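
As an illustration only: assuming the new nodes accept the kernel's usual
cpulist syntax (for example "0-3,8"), which this cover letter does not
spell out, a userspace tool could turn such a string into a bitmask before
comparing it with what a node reports. parse_cpulist() is a hypothetical
helper name, and the sketch is limited to 64 CPUs.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical userspace helper, not part of the series.
 * Parses a cpulist such as "0-3,8" (optionally newline-terminated)
 * into a 64-bit mask. Returns 0 on success, -1 on malformed input
 * or on any CPU number >= 64.
 */
int parse_cpulist(const char *s, uint64_t *mask)
{
	uint64_t m = 0;

	assert(mask != NULL);
	if (s == NULL || *s == '\0')
		return -1;

	for (;;) {
		char *end;
		unsigned long lo, hi;

		/* Every element must start with a digit. */
		if (!isdigit((unsigned char)*s))
			return -1;
		lo = hi = strtoul(s, &end, 10);
		s = end;

		/* Optional "-N" turns the element into a range. */
		if (*s == '-') {
			s++;
			if (!isdigit((unsigned char)*s))
				return -1;
			hi = strtoul(s, &end, 10);
			s = end;
		}
		if (lo > hi || hi >= 64)
			return -1;
		for (unsigned long cpu = lo; cpu <= hi; cpu++)
			m |= UINT64_C(1) << cpu;

		if (*s == ',') {
			s++;
			continue;
		}
		if (*s == '\n')
			s++;
		if (*s != '\0')
			return -1;
		break;
	}

	*mask = m;
	return 0;
}
```

A caller would read a node's contents into a buffer, pass the buffer to
parse_cpulist(), and compare the resulting mask with the one it wrote.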

Key Features:
- Fine-grained control: Separate sysfs nodes for timer, rcu, tick,
  workqueue, kthread, managed_irq, domain, and misc.
- Dynamic NOHZ_FULL: Supports enabling/disabling full dynticks mode
  on-the-fly.
- SMT Awareness: Optional 'smt_aware_mode' for core-granular isolation.
- Safety Guards: Prevents isolating all CPUs, requires at least one
  online housekeeping CPU, and enforces CAP_SYS_ADMIN capability.
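
The safety guards listed above can be modeled in userspace roughly as
follows. This is a sketch under stated assumptions, not the code from the
patches: the struct and function names are hypothetical, CPUs are limited
to 64, the CAP_SYS_ADMIN check is left out, and the SMT rule (a core must
be entirely housekeeping or entirely isolated) is one reading of
"core-granular isolation".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only. All names here are hypothetical. */
struct hk_request {
	uint64_t new_hk;       /* requested housekeeping CPUs, one bit each */
	uint64_t online;       /* CPUs currently online */
	bool smt_aware;        /* models the optional 'smt_aware_mode' */
	const uint64_t *cores; /* sibling mask of each physical core */
	int nr_cores;
};

/* Returns 0 if the request passes the modeled guards, -EINVAL otherwise. */
int hk_validate(const struct hk_request *r)
{
	assert(r != NULL);

	/*
	 * An empty mask, or a mask containing no online CPU, would leave
	 * nothing to run housekeeping work.
	 */
	if ((r->new_hk & r->online) == 0)
		return -EINVAL;

	if (r->smt_aware) {
		for (int i = 0; i < r->nr_cores; i++) {
			uint64_t hit = r->new_hk & r->cores[i];

			/* Assumed rule: never split the siblings of one core. */
			if (hit != 0 && hit != r->cores[i])
				return -EINVAL;
		}
	}
	return 0;
}
```

On a 4-core, 8-thread layout with sibling pairs {0,1}, {2,3}, {4,5}, {6,7},
this model accepts a housekeeping mask of 0x03 in SMT-aware mode and
rejects 0x01, because the latter would split core 0.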

Core Architecture:
1. Notifier-Driven Synchronization: HK_UPDATE_MASK blocking notifier chain.
2. Decoupled Memory Management: Runtime-safe cpumask allocation.
3. Subsystem Handlers: Dynamic migration for IRQ, RCU, Sched, etc.
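
To make the notifier-driven design concrete, here is a small userspace
model of a priority-ordered callback chain in which any handler can stop
the walk. It is only a sketch: apart from the HK_UPDATE_MASK event name,
every identifier is hypothetical, the numeric values are arbitrary, and
the real series uses the kernel's blocking notifier API instead of this
hand-rolled list (no locking is modeled here).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { HK_NOTIFY_OK = 0, HK_NOTIFY_STOP = 1 }; /* model-only return codes */
#define HK_UPDATE_MASK 1UL /* event name from the cover letter, value arbitrary */

struct hk_update {
	uint64_t new_mask; /* new housekeeping mask, one bit per CPU */
};

struct hk_nb {
	int (*call)(struct hk_nb *nb, unsigned long event, void *data);
	int priority; /* higher priority runs first */
	struct hk_nb *next;
};

static struct hk_nb *hk_chain;

/* Insert nb so the chain stays sorted by descending priority. */
void hk_nb_register(struct hk_nb *nb)
{
	struct hk_nb **p = &hk_chain;

	assert(nb != NULL && nb->call != NULL);
	while (*p && (*p)->priority >= nb->priority)
		p = &(*p)->next;
	nb->next = *p;
	*p = nb;
}

/* Walk the chain; the first handler returning HK_NOTIFY_STOP ends the walk. */
int hk_call_chain(unsigned long event, void *data)
{
	for (struct hk_nb *nb = hk_chain; nb; nb = nb->next) {
		if (nb->call(nb, event, data) == HK_NOTIFY_STOP)
			return HK_NOTIFY_STOP;
	}
	return HK_NOTIFY_OK;
}

/* Demo handlers standing in for subsystem callbacks. */
uint64_t demo_rcu_mask, demo_wq_mask;

int demo_rcu_cb(struct hk_nb *nb, unsigned long event, void *data)
{
	(void)nb; (void)event;
	demo_rcu_mask = ((struct hk_update *)data)->new_mask;
	return HK_NOTIFY_OK;
}

int demo_wq_cb(struct hk_nb *nb, unsigned long event, void *data)
{
	(void)nb; (void)event;
	demo_wq_mask = ((struct hk_update *)data)->new_mask;
	return HK_NOTIFY_OK;
}

/* A handler that always refuses the update. */
int demo_veto_cb(struct hk_nb *nb, unsigned long event, void *data)
{
	(void)nb; (void)event; (void)data;
	return HK_NOTIFY_STOP;
}
```

The model also shows a design concern with this approach: when a handler
in the middle of the chain stops the walk, earlier handlers have already
applied the new mask, so a real implementation needs either a rollback
step or a validation pass before any handler commits.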

The series is organized as follows:
- Patches 01-03: Core infrastructure (dynamic allocation, notifier,
  enum separation)
- Patches 04-09: Subsystem notifier handlers (genirq, RCU, scheduler,
  watchdog, workqueue, mm/compaction)
- Patch 10: tick/nohz dynamic full dynticks
- Patches 11-13: SMT-aware isolation, boot-time bridging, sysfs interface
- Patch 14: ABI documentation
- Patch 15: kselftest suite

Tested on x86_64 (8 vCPUs, SMT enabled) with all selftests passing.

As suggested by Joel Fernandes and Thomas Gleixner, this v1 provides a
stronger rationale for dynamic isolation and addresses all RFC feedback
on naming and notifier robustness.

To: Ingo Molnar <mingo@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
To: Juri Lelli <juri.lelli@redhat.com>
To: Vincent Guittot <vincent.guittot@linaro.org>
To: Dietmar Eggemann <dietmar.eggemann@arm.com>
To: Steven Rostedt <rostedt@goodmis.org>
To: Ben Segall <bsegall@google.com>
To: Mel Gorman <mgorman@suse.de>
To: Valentin Schneider <vschneid@redhat.com>
To: Thomas Gleixner <tglx@kernel.org>
To: Paul E. McKenney <paulmck@kernel.org>
To: Frederic Weisbecker <frederic@kernel.org>
To: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
To: Joel Fernandes <joelagnelf@nvidia.com>
To: Josh Triplett <josh@joshtriplett.org>
To: Boqun Feng <boqun.feng@gmail.com>
To: Uladzislau Rezki <urezki@gmail.com>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Lai Jiangshan <jiangshanlai@gmail.com>
To: Zqiang <qiang.zhang@linux.dev>
To: Tejun Heo <tj@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>
To: Vlastimil Babka <vbabka@suse.cz>
To: Suren Baghdasaryan <surenb@google.com>
To: Michal Hocko <mhocko@suse.com>
To: Brendan Jackman <jackmanb@google.com>
To: Johannes Weiner <hannes@cmpxchg.org>
To: Zi Yan <ziy@nvidia.com>
To: Anna-Maria Behnsen <anna-maria@linutronix.de>
To: Ingo Molnar <mingo@kernel.org>
To: Shuah Khan <shuah@kernel.org>
Cc: linux-kernel@vger.kernel.org
Cc: rcu@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: linux-kselftest@vger.kernel.org
Signed-off-by: Qiliang Yuan <realwujing@gmail.com>

Changes since RFC:
- Dynamic RCU NOCB rewrite: Perform full runtime offload/deoffload via
  remove_cpu()/add_cpu() for online CPUs, with lazy initialization.
- Robust Timer Migration: Added logic to dynamically migrate
  tick_do_timer_cpu when a housekeeper is isolated.
- Enhanced Isolation Safety: Hardened sysfs interface with CAP_SYS_ADMIN
  checks, 0600 permissions, and strict cpumask validations including SMT
  subset checks.
- Lifecycle Cleanups: Replaced system_state boot checks with
  slab_is_available() and added hotplug shutdown guards for clean
  power-off.
- Testing & Docs: Added comprehensive kselftest suite for isolation
  scenarios and detailed ABI documentation.
- Link to RFC: https://lore.kernel.org/all/20260206-feature-dynamic_isolcpus_dhei-v1-0-00a711eb0c74@gmail.com/

---
Qiliang Yuan (15):
      sched/isolation: Support dynamic allocation for housekeeping masks
      sched/isolation: Introduce housekeeping notifier infrastructure
      sched/isolation: Separate housekeeping types in enum hk_type
      genirq: Support dynamic migration for managed interrupts
      rcu: Support runtime NOCB initialization and dynamic offloading
      sched/core: Dynamically update scheduler domain housekeeping mask
      watchdog: Allow runtime toggle of lockup detector affinity
      workqueue: Support dynamic housekeeping mask updates
      mm/compaction: Support dynamic housekeeping mask updates for kcompactd
      tick/nohz: Transition to dynamic full dynticks state management
      sched/isolation: Implement SMT-aware isolation and safety guards
      sched/isolation: Bridge boot-time parameters with dynamic isolation
      sched/isolation: Implement sysfs interface for dynamic housekeeping
      Documentation: isolation: Document DHEI sysfs interfaces
      selftests: dhei: Add functional tests for dynamic housekeeping

 .../ABI/testing/sysfs-kernel-housekeeping          |  22 ++
 include/linux/sched/isolation.h                    |  40 +++-
 kernel/irq/manage.c                                |  49 +++++
 kernel/rcu/rcu.h                                   |   4 +
 kernel/rcu/tree.c                                  |  76 +++++++
 kernel/rcu/tree.h                                  |   2 +-
 kernel/rcu/tree_nocb.h                             |  27 ++-
 kernel/sched/core.c                                |  28 +++
 kernel/sched/isolation.c                           | 236 ++++++++++++++++++++-
 kernel/time/tick-sched.c                           | 130 +++++++++---
 kernel/watchdog.c                                  |  25 +++
 kernel/workqueue.c                                 |  42 ++++
 mm/compaction.c                                    |  27 +++
 tools/testing/selftests/Makefile                   |   1 +
 tools/testing/selftests/dhei/Makefile              |   4 +
 tools/testing/selftests/dhei/dhei_test.sh          | 160 ++++++++++++++
 16 files changed, 818 insertions(+), 55 deletions(-)
---
base-commit: 63804fed149a6750ffd28610c5c1c98cce6bd377
change-id: 20260324-dhei-v12-final-891d1ba62bd3

Best regards,
-- 
Qiliang Yuan <realwujing@gmail.com>


Thread overview: 23+ messages
2026-03-25  9:09 [PATCH 00/15] Implementation of Dynamic Housekeeping & Enhanced Isolation (DHEI) Qiliang Yuan
2026-03-25  9:09 ` [PATCH 01/15] sched/isolation: Support dynamic allocation for housekeeping masks Qiliang Yuan
2026-03-25 13:57   ` Peter Zijlstra
2026-03-25  9:09 ` [PATCH 02/15] sched/isolation: Introduce housekeeping notifier infrastructure Qiliang Yuan
2026-03-25 13:58   ` Peter Zijlstra
2026-03-25  9:09 ` [PATCH 03/15] sched/isolation: Separate housekeeping types in enum hk_type Qiliang Yuan
2026-03-25 13:59   ` Peter Zijlstra
2026-03-25  9:09 ` [PATCH 04/15] genirq: Support dynamic migration for managed interrupts Qiliang Yuan
2026-03-25  9:09 ` [PATCH 05/15] rcu: Support runtime NOCB initialization and dynamic offloading Qiliang Yuan
2026-03-25  9:09 ` [PATCH 06/15] sched/core: Dynamically update scheduler domain housekeeping mask Qiliang Yuan
2026-03-25 14:00   ` Peter Zijlstra
2026-03-25  9:09 ` [PATCH 07/15] watchdog: Allow runtime toggle of lockup detector affinity Qiliang Yuan
2026-03-25 14:03   ` Peter Zijlstra
2026-03-25  9:09 ` [PATCH 08/15] workqueue: Support dynamic housekeeping mask updates Qiliang Yuan
2026-03-25  9:09 ` [PATCH 09/15] mm/compaction: Support dynamic housekeeping mask updates for kcompactd Qiliang Yuan
2026-03-25  9:09 ` [PATCH 10/15] tick/nohz: Transition to dynamic full dynticks state management Qiliang Yuan
2026-03-25  9:09 ` [PATCH 11/15] sched/isolation: Implement SMT-aware isolation and safety guards Qiliang Yuan
2026-03-25  9:09 ` [PATCH 12/15] sched/isolation: Bridge boot-time parameters with dynamic isolation Qiliang Yuan
2026-03-25  9:09 ` [PATCH 13/15] sched/isolation: Implement sysfs interface for dynamic housekeeping Qiliang Yuan
2026-03-25 14:04   ` Peter Zijlstra
2026-03-25  9:09 ` [PATCH 14/15] Documentation: isolation: Document DHEI sysfs interfaces Qiliang Yuan
2026-03-25  9:09 ` [PATCH 15/15] selftests: dhei: Add functional tests for dynamic housekeeping Qiliang Yuan
2026-03-25 16:02 ` [PATCH 00/15] Implementation of Dynamic Housekeeping & Enhanced Isolation (DHEI) Tejun Heo
