cgroups.vger.kernel.org archive mirror
* [PATCH v3 0/3] cgroup, docs: cpu controller interaction with various scheduling policies
@ 2025-05-22  2:08 Shashank Balaji
  2025-05-22  2:08 ` [PATCH v3 1/3] cgroup, docs: convert space indentation to tab indentation Shashank Balaji
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Shashank Balaji @ 2025-05-22  2:08 UTC (permalink / raw)
  To: Tejun Heo, Johannes Weiner, Michal Koutný, Jonathan Corbet
  Cc: cgroups, linux-doc, linux-kernel, Shinya Takumi, Shashank Balaji

The cgroup v2 cpu controller interface files interact with processes
differently based on their scheduling policy and the underlying
scheduler used (fair-class vs. BPF scheduler). This patchset
documents these differences.

Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
---
Changes in v3:
- Refer to sched-ext.rst for fair-class vs. BPF scheduler instead of repeating
the details in cgroup-v2.rst
- Link to v2: https://lore.kernel.org/r/20250520-rt-and-cpu-controller-doc-v2-0-70a2b6a1b703@sony.com

Changes in v2:
- Expanded scope from only RT processes to all scheduling policies
- Link to v1: https://lore.kernel.org/all/20250305-rt-and-cpu-controller-doc-v1-0-7b6a6f5ff43d@sony.com/

---
Shashank Balaji (3):
      cgroup, docs: convert space indentation to tab indentation
      sched_ext, docs: convert mentions of "CFS" to "fair-class scheduler"
      cgroup, docs: cpu controller's interaction with various scheduling policies

 Documentation/admin-guide/cgroup-v2.rst | 77 ++++++++++++++++++++++++---------
 Documentation/scheduler/sched-ext.rst   |  8 ++--
 2 files changed, 60 insertions(+), 25 deletions(-)
---
base-commit: 036ee8a17bd046d7a350de0aae152307a061cc46
change-id: 20250226-rt-and-cpu-controller-doc-8a8aac572f3e

Best regards,
-- 
Shashank Balaji <shashank.mahadasyam@sony.com>



* [PATCH v3 1/3] cgroup, docs: convert space indentation to tab indentation
  2025-05-22  2:08 [PATCH v3 0/3] cgroup, docs: cpu controller interaction with various scheduling policies Shashank Balaji
@ 2025-05-22  2:08 ` Shashank Balaji
  2025-05-22 19:07   ` Tejun Heo
  2025-05-22  2:08 ` [PATCH v3 2/3] sched_ext, docs: convert mentions of "CFS" to "fair-class scheduler" Shashank Balaji
  2025-05-22  2:08 ` [PATCH v3 3/3] cgroup, docs: cpu controller's interaction with various scheduling policies Shashank Balaji
  2 siblings, 1 reply; 8+ messages in thread
From: Shashank Balaji @ 2025-05-22  2:08 UTC (permalink / raw)
  To: Tejun Heo, Johannes Weiner, Michal Koutný, Jonathan Corbet
  Cc: cgroups, linux-doc, linux-kernel, Shinya Takumi, Shashank Balaji

The paragraphs on cpu.uclamp.{min,max} are space indented. Convert them to
tab indentation to make them uniform with the other paragraphs.

Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
---
 Documentation/admin-guide/cgroup-v2.rst | 36 +++++++++++++++++----------------
 1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 1a16ce68a4d7f6f8c9070be89c4975dbfa79077e..226fc7f9212eafcbf83c81f5b08391f215c1d894 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1162,30 +1162,32 @@ All time durations are in microseconds.
 	:ref:`Documentation/accounting/psi.rst <psi>` for details.
 
   cpu.uclamp.min
-        A read-write single value file which exists on non-root cgroups.
-        The default is "0", i.e. no utilization boosting.
+	A read-write single value file which exists on non-root cgroups.
+	The default is "0", i.e. no utilization boosting.
 
-        The requested minimum utilization (protection) as a percentage
-        rational number, e.g. 12.34 for 12.34%.
+	The requested minimum utilization (protection) as a percentage
+	rational number, e.g. 12.34 for 12.34%.
 
-        This interface allows reading and setting minimum utilization clamp
-        values similar to the sched_setattr(2). This minimum utilization
-        value is used to clamp the task specific minimum utilization clamp.
+	This interface allows reading and setting minimum utilization clamp
+	values similar to the sched_setattr(2). This minimum utilization
+	value is used to clamp the task specific minimum utilization clamp,
+	including those of realtime processes.
 
-        The requested minimum utilization (protection) is always capped by
-        the current value for the maximum utilization (limit), i.e.
-        `cpu.uclamp.max`.
+	The requested minimum utilization (protection) is always capped by
+	the current value for the maximum utilization (limit), i.e.
+	`cpu.uclamp.max`.
 
   cpu.uclamp.max
-        A read-write single value file which exists on non-root cgroups.
-        The default is "max". i.e. no utilization capping
+	A read-write single value file which exists on non-root cgroups.
+	The default is "max". i.e. no utilization capping
 
-        The requested maximum utilization (limit) as a percentage rational
-        number, e.g. 98.76 for 98.76%.
+	The requested maximum utilization (limit) as a percentage rational
+	number, e.g. 98.76 for 98.76%.
 
-        This interface allows reading and setting maximum utilization clamp
-        values similar to the sched_setattr(2). This maximum utilization
-        value is used to clamp the task specific maximum utilization clamp.
+	This interface allows reading and setting maximum utilization clamp
+	values similar to the sched_setattr(2). This maximum utilization
+	value is used to clamp the task specific maximum utilization clamp,
+	including those of realtime processes.
 
   cpu.idle
 	A read-write single value file which exists on non-root cgroups.

-- 
2.43.0
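
For readers who want to see the capping rule from the cpu.uclamp.min text in
concrete form, here is a minimal Python sketch. It is not kernel code. The
function names are hypothetical, and it assumes that "max" corresponds to
100%, which the quoted documentation implies ("no utilization capping") but
does not spell out.

```python
def parse_uclamp(text: str) -> float:
    """Parse a cpu.uclamp.{min,max} value into a percentage in [0, 100].

    Assumption: the literal "max" stands for 100%.
    """
    value = text.strip()
    if value == "max":
        return 100.0
    percent = float(value)  # e.g. "12.34" means 12.34%
    if not 0.0 <= percent <= 100.0:
        raise ValueError(f"utilization clamp out of range: {value!r}")
    return percent


def effective_uclamp_min(requested_min: str, current_max: str) -> float:
    """Cap the requested minimum by the current maximum (cpu.uclamp.max)."""
    return min(parse_uclamp(requested_min), parse_uclamp(current_max))
```

With cpu.uclamp.max left at its default of "max", a requested minimum of
"12.34" is kept as is. If cpu.uclamp.max is lowered below the requested
minimum, the minimum is capped to that lower value.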



* [PATCH v3 2/3] sched_ext, docs: convert mentions of "CFS" to "fair-class scheduler"
  2025-05-22  2:08 [PATCH v3 0/3] cgroup, docs: cpu controller interaction with various scheduling policies Shashank Balaji
  2025-05-22  2:08 ` [PATCH v3 1/3] cgroup, docs: convert space indentation to tab indentation Shashank Balaji
@ 2025-05-22  2:08 ` Shashank Balaji
  2025-05-22 19:09   ` Tejun Heo
  2025-05-22  2:08 ` [PATCH v3 3/3] cgroup, docs: cpu controller's interaction with various scheduling policies Shashank Balaji
  2 siblings, 1 reply; 8+ messages in thread
From: Shashank Balaji @ 2025-05-22  2:08 UTC (permalink / raw)
  To: Tejun Heo, Johannes Weiner, Michal Koutný, Jonathan Corbet
  Cc: cgroups, linux-doc, linux-kernel, Shinya Takumi, Shashank Balaji

Mentions of CFS are stale since the fair-class scheduler is implemented using
EEVDF. So, convert such mentions to "fair-class scheduler" to stay
algorithm-name agnostic.

Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
---
 Documentation/scheduler/sched-ext.rst | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Documentation/scheduler/sched-ext.rst b/Documentation/scheduler/sched-ext.rst
index 0b2654e2164b8e6139db19fc8b68e6c5c289503d..ceca6f8966eeeb5f029a9ae41c039d67c1db7be8 100644
--- a/Documentation/scheduler/sched-ext.rst
+++ b/Documentation/scheduler/sched-ext.rst
@@ -47,8 +47,8 @@ options should be enabled to use sched_ext:
 sched_ext is used only when the BPF scheduler is loaded and running.
 
 If a task explicitly sets its scheduling policy to ``SCHED_EXT``, it will be
-treated as ``SCHED_NORMAL`` and scheduled by CFS until the BPF scheduler is
-loaded.
+treated as ``SCHED_NORMAL`` and scheduled by the fair-class scheduler until the
+BPF scheduler is loaded.
 
 When the BPF scheduler is loaded and ``SCX_OPS_SWITCH_PARTIAL`` is not set
 in ``ops->flags``, all ``SCHED_NORMAL``, ``SCHED_BATCH``, ``SCHED_IDLE``, and
@@ -57,11 +57,11 @@ in ``ops->flags``, all ``SCHED_NORMAL``, ``SCHED_BATCH``, ``SCHED_IDLE``, and
 However, when the BPF scheduler is loaded and ``SCX_OPS_SWITCH_PARTIAL`` is
 set in ``ops->flags``, only tasks with the ``SCHED_EXT`` policy are scheduled
 by sched_ext, while tasks with ``SCHED_NORMAL``, ``SCHED_BATCH`` and
-``SCHED_IDLE`` policies are scheduled by CFS.
+``SCHED_IDLE`` policies are scheduled by the fair-class scheduler.
 
 Terminating the sched_ext scheduler program, triggering `SysRq-S`, or
 detection of any internal error including stalled runnable tasks aborts the
-BPF scheduler and reverts all tasks back to CFS.
+BPF scheduler and reverts all tasks back to the fair-class scheduler.
 
 .. code-block:: none
 

-- 
2.43.0
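
The rules in the sched-ext.rst text touched above can be summarized as a small
decision function. This is only an illustrative Python sketch, not how the
kernel implements class selection. The function name and its string return
values are hypothetical, and it covers only the four policies the quoted text
names.

```python
# Policies that the quoted text says fall back to the fair-class scheduler.
FAIR_POLICIES = {"SCHED_NORMAL", "SCHED_BATCH", "SCHED_IDLE"}


def scheduler_for(policy: str, bpf_loaded: bool, switch_partial: bool = False) -> str:
    """Return which scheduler handles a task, per the documented rules.

    bpf_loaded:     a BPF scheduler is loaded and running
    switch_partial: SCX_OPS_SWITCH_PARTIAL is set in ops->flags
    """
    if policy == "SCHED_EXT":
        # Treated as SCHED_NORMAL until a BPF scheduler is loaded.
        return "sched_ext" if bpf_loaded else "fair-class"
    if policy in FAIR_POLICIES:
        # Without SCX_OPS_SWITCH_PARTIAL, a loaded BPF scheduler takes these too.
        if bpf_loaded and not switch_partial:
            return "sched_ext"
        return "fair-class"
    raise ValueError(f"policy not covered by this sketch: {policy!r}")
```

For example, scheduler_for("SCHED_BATCH", bpf_loaded=True, switch_partial=True)
gives "fair-class", while the same call with switch_partial=False gives
"sched_ext".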



* [PATCH v3 3/3] cgroup, docs: cpu controller's interaction with various scheduling policies
  2025-05-22  2:08 [PATCH v3 0/3] cgroup, docs: cpu controller interaction with various scheduling policies Shashank Balaji
  2025-05-22  2:08 ` [PATCH v3 1/3] cgroup, docs: convert space indentation to tab indentation Shashank Balaji
  2025-05-22  2:08 ` [PATCH v3 2/3] sched_ext, docs: convert mentions of "CFS" to "fair-class scheduler" Shashank Balaji
@ 2025-05-22  2:08 ` Shashank Balaji
  2025-05-22  2:16   ` Shashank Balaji
  2025-05-22 19:11   ` Tejun Heo
  2 siblings, 2 replies; 8+ messages in thread
From: Shashank Balaji @ 2025-05-22  2:08 UTC (permalink / raw)
  To: Tejun Heo, Johannes Weiner, Michal Koutný, Jonathan Corbet
  Cc: cgroups, linux-doc, linux-kernel, Shinya Takumi, Shashank Balaji

The cpu controller interface files account for or affect processes
differently based on their scheduling policy and the underlying
scheduler used (fair-class vs. BPF scheduler). Document these
differences.

Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
---
 Documentation/admin-guide/cgroup-v2.rst | 41 +++++++++++++++++++++++++++++----
 1 file changed, 37 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 226fc7f9212eafcbf83c81f5b08391f215c1d894..f6dc95608d239d586b482154c4367baaf5614fb6 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1095,19 +1095,34 @@ realtime processes irrespective of CONFIG_RT_GROUP_SCHED.
 CPU Interface Files
 ~~~~~~~~~~~~~~~~~~~
 
-All time durations are in microseconds.
+The interaction of a process with the cpu controller depends on its scheduling
+policy and the underlying scheduler. From the point of view of the cpu controller,
+processes can be categorized as follows:
+
+* Processes under the fair-class scheduler
+* Processes under a BPF scheduler with the ``cgroup_set_weight`` callback
+* Everything else: ``SCHED_{FIFO,RR,DEADLINE}`` and processes under a BPF scheduler
+  without the ``cgroup_set_weight`` callback
+
+For details on when a process is under the fair-class scheduler or a BPF scheduler,
+check out :ref:`Documentation/scheduler/sched-ext.rst <sched-ext>`.
+
+For each of the following interface files, the above categories
+will be referred to. All time durations are in microseconds.
 
   cpu.stat
 	A read-only flat-keyed file.
 	This file exists whether the controller is enabled or not.
 
-	It always reports the following three stats:
+	It always reports the following three stats, which account for all the
+	processes in the cgroup:
 
 	- usage_usec
 	- user_usec
 	- system_usec
 
-	and the following five when the controller is enabled:
+	and the following five when the controller is enabled, which account for
+	only the processes under the fair-class scheduler:
 
 	- nr_periods
 	- nr_throttled
@@ -1125,6 +1140,10 @@ All time durations are in microseconds.
 	If the cgroup has been configured to be SCHED_IDLE (cpu.idle = 1),
 	then the weight will show as a 0.
 
+	This file affects only processes under the fair-class scheduler and a BPF
+	scheduler with the ``cgroup_set_weight`` callback depending on what the
+	callback actually does.
+
   cpu.weight.nice
 	A read-write single value file which exists on non-root
 	cgroups.  The default is "0".
@@ -1137,6 +1156,10 @@ All time durations are in microseconds.
 	granularity is coarser for the nice values, the read value is
 	the closest approximation of the current weight.
 
+	This file affects only processes under the fair-class scheduler and a BPF
+	scheduler with the ``cgroup_set_weight`` callback depending on what the
+	callback actually does.
+
   cpu.max
 	A read-write two value file which exists on non-root cgroups.
 	The default is "max 100000".
@@ -1149,18 +1172,24 @@ All time durations are in microseconds.
 	$PERIOD duration.  "max" for $MAX indicates no limit.  If only
 	one number is written, $MAX is updated.
 
+	This file affects only processes under the fair-class scheduler.
+
   cpu.max.burst
 	A read-write single value file which exists on non-root
 	cgroups.  The default is "0".
 
 	The burst in the range [0, $MAX].
 
+	This file affects only processes under the fair-class scheduler.
+
   cpu.pressure
 	A read-write nested-keyed file.
 
 	Shows pressure stall information for CPU. See
 	:ref:`Documentation/accounting/psi.rst <psi>` for details.
 
+	This file accounts for all the processes in the cgroup.
+
   cpu.uclamp.min
 	A read-write single value file which exists on non-root cgroups.
 	The default is "0", i.e. no utilization boosting.
@@ -1177,6 +1206,8 @@ All time durations are in microseconds.
 	the current value for the maximum utilization (limit), i.e.
 	`cpu.uclamp.max`.
 
+	This file affects all the processes in the cgroup.
+
   cpu.uclamp.max
 	A read-write single value file which exists on non-root cgroups.
 	The default is "max". i.e. no utilization capping
@@ -1189,6 +1220,8 @@ All time durations are in microseconds.
 	value is used to clamp the task specific maximum utilization clamp,
 	including those of realtime processes.
 
+	This file affects all the processes in the cgroup.
+
   cpu.idle
 	A read-write single value file which exists on non-root cgroups.
 	The default is 0.
@@ -1199,7 +1232,7 @@ All time durations are in microseconds.
 	own relative priorities, but the cgroup itself will be treated as
 	very low priority relative to its peers.
 
-
+	This file affects only processes under the fair-class scheduler.
 
 Memory
 ------

-- 
2.43.0
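
As a reading aid, the per-file statements added by this patch can be collected
into one lookup table. The Python sketch below is only a restatement of the
documentation text. The names (Category, COVERAGE, applies, cpu_stat_coverage)
are hypothetical, and for the BPF category the real effect still depends on
what the ``cgroup_set_weight`` callback actually does.

```python
from enum import Enum


class Category(Enum):
    FAIR = "fair-class scheduler"
    BPF_WEIGHT = "BPF scheduler with cgroup_set_weight"
    OTHER = "SCHED_{FIFO,RR,DEADLINE} or BPF scheduler without cgroup_set_weight"


ALL = frozenset(Category)
FAIR_ONLY = frozenset({Category.FAIR})
FAIR_AND_BPF_WEIGHT = frozenset({Category.FAIR, Category.BPF_WEIGHT})

# Which process categories each interface file affects or accounts for.
COVERAGE = {
    "cpu.weight": FAIR_AND_BPF_WEIGHT,
    "cpu.weight.nice": FAIR_AND_BPF_WEIGHT,
    "cpu.max": FAIR_ONLY,
    "cpu.max.burst": FAIR_ONLY,
    "cpu.idle": FAIR_ONLY,
    "cpu.pressure": ALL,
    "cpu.uclamp.min": ALL,
    "cpu.uclamp.max": ALL,
}

# cpu.stat is split: these three keys account for every process in the cgroup.
CPU_STAT_BASE_KEYS = frozenset({"usage_usec", "user_usec", "system_usec"})


def cpu_stat_coverage(key: str) -> frozenset:
    """Categories accounted for by one cpu.stat key."""
    return ALL if key in CPU_STAT_BASE_KEYS else FAIR_ONLY


def applies(interface_file: str, category: Category) -> bool:
    """True if the file affects or accounts for processes in the category."""
    return category in COVERAGE[interface_file]
```

For instance, applies("cpu.max", Category.OTHER) is False, which matches the
statement that cpu.max affects only processes under the fair-class scheduler.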



* Re: [PATCH v3 3/3] cgroup, docs: cpu controller's interaction with various scheduling policies
  2025-05-22  2:08 ` [PATCH v3 3/3] cgroup, docs: cpu controller's interaction with various scheduling policies Shashank Balaji
@ 2025-05-22  2:16   ` Shashank Balaji
  2025-05-22 19:11   ` Tejun Heo
  1 sibling, 0 replies; 8+ messages in thread
From: Shashank Balaji @ 2025-05-22  2:16 UTC (permalink / raw)
  To: Tejun Heo, Johannes Weiner, Michal Koutný, Jonathan Corbet
  Cc: cgroups, linux-doc, linux-kernel, Shinya Takumi

Hi,

On Thu, May 22, 2025 at 11:08:14AM +0900, Shashank Balaji wrote:
> +* Processes under the fair-class scheduler
> +* Processes under a BPF scheduler with the ``cgroup_set_weight`` callback
> +* Everything else: ``SCHED_{FIFO,RR,DEADLINE}`` and processes under a BPF scheduler
> +  without the ``cgroup_set_weight`` callback

Though `cgroup_set_weight` is referred to here, CONFIG_EXT_GROUP_SCHED
is not yet documented in sched-ext.rst. But I don't understand it well
enough to add that documentation myself.

Thanks

Regards,
Shashank


* Re: [PATCH v3 1/3] cgroup, docs: convert space indentation to tab indentation
  2025-05-22  2:08 ` [PATCH v3 1/3] cgroup, docs: convert space indentation to tab indentation Shashank Balaji
@ 2025-05-22 19:07   ` Tejun Heo
  0 siblings, 0 replies; 8+ messages in thread
From: Tejun Heo @ 2025-05-22 19:07 UTC (permalink / raw)
  To: Shashank Balaji
  Cc: Johannes Weiner, Michal Koutný, Jonathan Corbet, cgroups,
	linux-doc, linux-kernel, Shinya Takumi

On Thu, May 22, 2025 at 11:08:12AM +0900, Shashank Balaji wrote:
> The paragraphs on cpu.uclamp.{min,max} are space indented. Convert them to
> tab indentation to make them uniform with the other paragraphs.
> 
> Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>

Applied to cgroup/for-6.16.

Thanks.

-- 
tejun


* Re: [PATCH v3 2/3] sched_ext, docs: convert mentions of "CFS" to "fair-class scheduler"
  2025-05-22  2:08 ` [PATCH v3 2/3] sched_ext, docs: convert mentions of "CFS" to "fair-class scheduler" Shashank Balaji
@ 2025-05-22 19:09   ` Tejun Heo
  0 siblings, 0 replies; 8+ messages in thread
From: Tejun Heo @ 2025-05-22 19:09 UTC (permalink / raw)
  To: Shashank Balaji
  Cc: Johannes Weiner, Michal Koutný, Jonathan Corbet, cgroups,
	linux-doc, linux-kernel, Shinya Takumi

On Thu, May 22, 2025 at 11:08:13AM +0900, Shashank Balaji wrote:
> Mentions of CFS are stale since the fair-class scheduler is implemented using
> EEVDF. So, convert such mentions to "fair-class scheduler" to stay
> algorithm-name agnostic.
> 
> Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>

Applied to sched_ext/for-6.16.

Thanks.

-- 
tejun


* Re: [PATCH v3 3/3] cgroup, docs: cpu controller's interaction with various scheduling policies
  2025-05-22  2:08 ` [PATCH v3 3/3] cgroup, docs: cpu controller's interaction with various scheduling policies Shashank Balaji
  2025-05-22  2:16   ` Shashank Balaji
@ 2025-05-22 19:11   ` Tejun Heo
  1 sibling, 0 replies; 8+ messages in thread
From: Tejun Heo @ 2025-05-22 19:11 UTC (permalink / raw)
  To: Shashank Balaji
  Cc: Johannes Weiner, Michal Koutný, Jonathan Corbet, cgroups,
	linux-doc, linux-kernel, Shinya Takumi

On Thu, May 22, 2025 at 11:08:14AM +0900, Shashank Balaji wrote:
> The cpu controller interface files account for or affect processes
> differently based on their scheduling policy, and the underlying
> scheduler used (fair-class vs. BPF scheduler). Document these
> differences
> 
> Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>

Applied to cgroup/for-6.16.

Thanks.

-- 
tejun
