linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Waiman Long <longman@redhat.com>
To: "Tejun Heo" <tj@kernel.org>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Michal Koutný" <mkoutny@suse.com>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Frederic Weisbecker" <frederic@kernel.org>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	"Neeraj Upadhyay" <neeraj.upadhyay@kernel.org>,
	"Joel Fernandes" <joelagnelf@nvidia.com>,
	"Josh Triplett" <josh@joshtriplett.org>,
	"Boqun Feng" <boqun.feng@gmail.com>,
	"Uladzislau Rezki" <urezki@gmail.com>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"Mathieu Desnoyers" <mathieu.desnoyers@efficios.com>,
	"Lai Jiangshan" <jiangshanlai@gmail.com>,
	Zqiang <qiang.zhang@linux.dev>,
	"Anna-Maria Behnsen" <anna-maria@linutronix.de>,
	"Ingo Molnar" <mingo@kernel.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Juri Lelli" <juri.lelli@redhat.com>,
	"Vincent Guittot" <vincent.guittot@linaro.org>,
	"Dietmar Eggemann" <dietmar.eggemann@arm.com>,
	"Ben Segall" <bsegall@google.com>, "Mel Gorman" <mgorman@suse.de>,
	"Valentin Schneider" <vschneid@redhat.com>,
	"Shuah Khan" <shuah@kernel.org>
Cc: cgroups@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, rcu@vger.kernel.org,
	linux-kselftest@vger.kernel.org, Phil Auld <pauld@redhat.com>,
	Costa Shulyupin <costa.shul@redhat.com>,
	Gabriele Monaco <gmonaco@redhat.com>,
	Cestmir Kalina <ckalina@redhat.com>,
	Waiman Long <longman@redhat.com>
Subject: [RFC PATCH 17/18] cgroup/cpuset: Documentation updates & don't use CPU 0 for isolated partition
Date: Fri,  8 Aug 2025 11:20:00 -0400	[thread overview]
Message-ID: <20250808152001.20245-8-longman@redhat.com> (raw)
In-Reply-To: <20250808151053.19777-1-longman@redhat.com>

As CPU hotplug is now used to improve CPU isolation of CPUs in isolated
partitions. The boot CPU (typically CPU 0) cannot be put offline
impacting the amount of CPU isolation available. Now we have to advise
users that the boot CPU should never be used for isolated partitions. A
warning will be printed when boot CPU is used and the cgroup-v2.rst is
updated accordingly. The test_cpuset_prs.sh selftest is also updated
to remove CPU 0 when forming isolated partitions.

Also update the cgroup-v2.rst file to show the need to specify the
"nohz_full" kernel boot parameter to enable better nohz_full behavior
for the CPUs in isolated partitions as well as the latency spike issue
with using CPU hotplug.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 Documentation/admin-guide/cgroup-v2.rst       | 33 +++++++++++++++----
 .../selftests/cgroup/test_cpuset_prs.sh       |  8 ++---
 2 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index d9d3cc7df348..26213383b34b 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -2556,11 +2556,12 @@ Cpuset Interface Files
 
 	It accepts only the following input values when written to.
 
-	  ==========	=====================================
+	  ==========	===============================================
 	  "member"	Non-root member of a partition
 	  "root"	Partition root
-	  "isolated"	Partition root without load balancing
-	  ==========	=====================================
+	  "isolated"	Partition root without load balancing and other
+		        OS noises
+	  ==========	===============================================
 
 	A cpuset partition is a collection of cpuset-enabled cgroups with
 	a partition root at the top of the hierarchy and its descendants
@@ -2593,9 +2594,29 @@ Cpuset Interface Files
 
 	When set to "isolated", the CPUs in that partition will be in
 	an isolated state without any load balancing from the scheduler
-	and excluded from the unbound workqueues.  Tasks placed in such
-	a partition with multiple CPUs should be carefully distributed
-	and bound to each of the individual CPUs for optimal performance.
+	and excluded from the unbound workqueues as well as without
+	other OS noises.  Tasks placed in such a partition with multiple
+	CPUs should be carefully distributed and bound to each of the
+	individual CPUs for optimal performance.
+
+	As CPU hotplug, if supported, is used to improve the degree of
+	CPU isolation close to the "nohz_full" kernel boot parameter.
+	The boot CPU (typically CPU 0) cannot be brought offline, so the
+	boot CPU should not be used for forming isolated partitions.
+	The "nohz_full" kernel boot parameter needs to be present to
+	enable full dynticks support and RCU no-callback CPU mode for
+	CPUs in isolated partitions even if the optional cpu list
+	isn't provided.  Without that, adding the "rcu_nocbs" boot
+	kernel parameter without the cpu list can be used to enable
+	RCU no-callback CPU mode without full dynticks.
+
+	Using CPU hotplug for creating or destroying an isolated
+	partition can cause latency spike in applications running
+	in other isolated partitions.  A reserved list of CPUs can
+	optionally be put in the "nohz_full" kernel boot parameter to
+	alleviate this problem.  When these reserved CPUs are used for
+	isolated partitions, CPU hotplug won't need to be invoked and
+	so there won't be latency spike in other isolated partitions.
 
 	A partition root ("root" or "isolated") can be in one of the
 	two possible states - valid or invalid.  An invalid partition
diff --git a/tools/testing/selftests/cgroup/test_cpuset_prs.sh b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
index a17256d9f88a..f61369be8bf6 100755
--- a/tools/testing/selftests/cgroup/test_cpuset_prs.sh
+++ b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
@@ -318,8 +318,8 @@ TEST_MATRIX=(
 	# Invalid to valid local partition direct transition tests
 	" C1-3:S+:P2 X4:P2  .      .      .      .      .      .     0 A1:1-3|XA1:1-3|A2:1-3:XA2: A1:P2|A2:P-2 1-3"
 	" C1-3:S+:P2 X4:P2  .      .      .    X3:P2    .      .     0 A1:1-2|XA1:1-3|A2:3:XA2:3 A1:P2|A2:P2 1-3"
-	"  C0-3:P2   .      .    C4-6   C0-4     .      .      .     0 A1:0-4|B1:4-6 A1:P-2|B1:P0"
-	"  C0-3:P2   .      .    C4-6 C0-4:C0-3  .      .      .     0 A1:0-3|B1:4-6 A1:P2|B1:P0 0-3"
+	"  C1-3:P2   .      .    C4-6   C1-4     .      .      .     0 A1:1-4|B1:4-6 A1:P-2|B1:P0"
+	"  C1-3:P2   .      .    C4-6 C1-4:C1-3  .      .      .     0 A1:1-3|B1:4-6 A1:P2|B1:P0 1-3"
 
 	# Local partition invalidation tests
 	" C0-3:X1-3:S+:P2 C1-3:X2-3:S+:P2 C2-3:X3:P2 \
@@ -329,8 +329,8 @@ TEST_MATRIX=(
 	" C0-3:X1-3:S+:P2 C1-3:X2-3:S+:P2 C2-3:X3:P2 \
 				   .      .    C4:X     .      .     0 A1:1-3|A2:1-3|A3:2-3|XA2:|XA3: A1:P2|A2:P-2|A3:P-2 1-3"
 	# Local partition CPU change tests
-	" C0-5:S+:P2 C4-5:S+:P1 .  .      .    C3-5     .      .     0 A1:0-2|A2:3-5 A1:P2|A2:P1 0-2"
-	" C0-5:S+:P2 C4-5:S+:P1 .  .    C1-5     .      .      .     0 A1:1-3|A2:4-5 A1:P2|A2:P1 1-3"
+	" C1-5:S+:P2 C4-5:S+:P1 .  .      .    C3-5     .      .     0 A1:1-2|A2:3-5 A1:P2|A2:P1 1-2"
+	" C1-5:S+:P2 C4-5:S+:P1 .  .    C2-5     .      .      .     0 A1:2-3|A2:4-5 A1:P2|A2:P1 2-3"
 
 	# cpus_allowed/exclusive_cpus update tests
 	" C0-3:X2-3:S+ C1-3:X2-3:S+ C2-3:X2-3 \
-- 
2.50.0


  parent reply	other threads:[~2025-08-08 15:21 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-08 15:10 [RFC PATCH 00/18] cgroup/cpuset: Enable runtime modification of Waiman Long
2025-08-08 15:10 ` [RFC PATCH 01/18] sched/isolation: Enable runtime update of housekeeping cpumasks Waiman Long
2025-08-08 15:10 ` [RFC PATCH 02/18] sched/isolation: Call sched_tick_offload_init() when HK_FLAG_KERNEL_NOISE is first set Waiman Long
2025-08-08 15:10 ` [RFC PATCH 03/18] sched/isolation: Use RCU to delay successive housekeeping cpumask updates Waiman Long
2025-08-08 15:10 ` [RFC PATCH 04/18] sched/isolation: Add a debugfs file to dump housekeeping cpumasks Waiman Long
2025-08-08 15:10 ` [RFC PATCH 05/18] cpu/hotplug: Add a new cpuhp_offline_cb() API Waiman Long
2025-08-08 15:10 ` [RFC PATCH 06/18] cgroup/cpuset: Introduce a new top level isolcpus_update_mutex Waiman Long
2025-08-08 15:10 ` [RFC PATCH 07/18] cgroup/cpuset: Allow overwriting HK_TYPE_DOMAIN housekeeping cpumask Waiman Long
2025-08-08 15:10 ` [RFC PATCH 08/18] cgroup/cpuset: Use CPU hotplug to enable runtime nohz_full modification Waiman Long
2025-08-08 15:10 ` [RFC PATCH 09/18] cgroup/cpuset: Revert "Include isolated cpuset CPUs in cpu_is_isolated() check" Waiman Long
2025-08-08 15:19 ` [RFC PATCH 10/18] sched/core: Ignore DL BW deactivation error if in cpuhp_offline_cb_mode Waiman Long
2025-08-08 15:19 ` [RFC PATCH 11/18] tick/nohz: Make nohz_full parameter optional Waiman Long
2025-08-08 15:19 ` [RFC PATCH 12/18] tick/nohz: Introduce tick_nohz_full_update_cpus() to update tick_nohz_full_mask Waiman Long
2025-08-08 15:19 ` [RFC PATCH 13/18] tick/nohz: Allow runtime changes in full dynticks CPUs Waiman Long
2025-08-08 15:19 ` [RFC PATCH 14/18] tick: Pass timer tick job to an online HK CPU in tick_cpu_dying() Waiman Long
2025-08-08 15:19 ` [RFC PATCH 15/18] cgroup/cpuset: Enable RCU NO-CB CPU offloading of newly isolated CPUs Waiman Long
2025-08-08 15:19 ` [RFC PATCH 16/18] cgroup/cpuset: Don't set have_boot_nohz_full without any boot time nohz_full CPU Waiman Long
2025-08-08 15:20 ` Waiman Long [this message]
2025-08-08 15:20 ` [RFC PATCH 18/18] cgroup/cpuset: Add pr_debug() statements for cpuhp_offline_cb() call Waiman Long
2025-08-08 15:50 ` [RFC PATCH 00/18] cgroup/cpuset: Enable runtime modification of Frederic Weisbecker
2025-08-08 16:27   ` Waiman Long

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250808152001.20245-8-longman@redhat.com \
    --to=longman@redhat.com \
    --cc=anna-maria@linutronix.de \
    --cc=boqun.feng@gmail.com \
    --cc=bsegall@google.com \
    --cc=cgroups@vger.kernel.org \
    --cc=ckalina@redhat.com \
    --cc=corbet@lwn.net \
    --cc=costa.shul@redhat.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=frederic@kernel.org \
    --cc=gmonaco@redhat.com \
    --cc=hannes@cmpxchg.org \
    --cc=jiangshanlai@gmail.com \
    --cc=joelagnelf@nvidia.com \
    --cc=josh@joshtriplett.org \
    --cc=juri.lelli@redhat.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mgorman@suse.de \
    --cc=mingo@kernel.org \
    --cc=mkoutny@suse.com \
    --cc=neeraj.upadhyay@kernel.org \
    --cc=pauld@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=qiang.zhang@linux.dev \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=shuah@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    --cc=urezki@gmail.com \
    --cc=vincent.guittot@linaro.org \
    --cc=vschneid@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).