* [PATCH v2 0/3] cgroup, docs: cpu controller interaction with various scheduling policies
@ 2025-05-20 14:07 Shashank Balaji via B4 Relay
2025-05-20 14:07 ` [PATCH v2 1/3] cgroup, docs: be specific about bandwidth control of rt processes Shashank Balaji via B4 Relay
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Shashank Balaji via B4 Relay @ 2025-05-20 14:07 UTC (permalink / raw)
To: Tejun Heo, Johannes Weiner, Michal Koutný, Jonathan Corbet
Cc: cgroups, linux-doc, linux-kernel, Shinya Takumi, Shashank Balaji
The cgroup v2 cpu controller interface files interact with processes
differently based on their scheduling policy and the underlying
scheduler used (fair-class vs. BPF scheduler). This patchset
documents these differences.
This is related to the previous patchset titled "cgroup, docs: Clarify
interaction of RT processes with cgroup v2 cpu controller"
(https://lore.kernel.org/all/20250305-rt-and-cpu-controller-doc-v1-0-7b6a6f5ff43d@sony.com/),
which focused solely on RT processes. The current patchset incorporates
the previous feedback and expands on the scope of scheduling policies.
Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
---
Shashank Balaji (3):
cgroup, docs: be specific about bandwidth control of rt processes
sched_ext, docs: add label
cgroup, docs: cpu controller interaction with various scheduling policies
Documentation/admin-guide/cgroup-v2.rst | 100 ++++++++++++++++++++++++--------
Documentation/scheduler/sched-ext.rst | 2 +
2 files changed, 78 insertions(+), 24 deletions(-)
---
base-commit: 036ee8a17bd046d7a350de0aae152307a061cc46
change-id: 20250226-rt-and-cpu-controller-doc-8a8aac572f3e
Best regards,
--
Shashank Balaji <shashank.mahadasyam@sony.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v2 1/3] cgroup, docs: be specific about bandwidth control of rt processes
2025-05-20 14:07 [PATCH v2 0/3] cgroup, docs: cpu controller interaction with various scheduling policies Shashank Balaji via B4 Relay
@ 2025-05-20 14:07 ` Shashank Balaji via B4 Relay
2025-05-20 20:11 ` Tejun Heo
2025-05-21 16:35 ` Tejun Heo
2025-05-20 14:07 ` [PATCH v2 2/3] sched_ext, docs: add label Shashank Balaji via B4 Relay
2025-05-20 14:07 ` [PATCH v2 3/3] cgroup, docs: cpu controller interaction with various scheduling policies Shashank Balaji via B4 Relay
2 siblings, 2 replies; 12+ messages in thread
From: Shashank Balaji via B4 Relay @ 2025-05-20 14:07 UTC (permalink / raw)
To: Tejun Heo, Johannes Weiner, Michal Koutný, Jonathan Corbet
Cc: cgroups, linux-doc, linux-kernel, Shinya Takumi, Shashank Balaji
From: Shashank Balaji <shashank.mahadasyam@sony.com>
Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
---
Documentation/admin-guide/cgroup-v2.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 1a16ce68a4d7f6f8c9070be89c4975dbfa79077e..3b3685736fe9b12e96a273248dfb4a8c62a4b698 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1076,7 +1076,7 @@ cpufreq governor about the minimum desired frequency which should always be
provided by a CPU, as well as the maximum desired frequency, which should not
be exceeded by a CPU.
-WARNING: cgroup2 cpu controller doesn't yet fully support the control of
+WARNING: cgroup2 cpu controller doesn't yet support the (bandwidth) control of
realtime processes. For a kernel built with the CONFIG_RT_GROUP_SCHED option
enabled for group scheduling of realtime processes, the cpu controller can only
be enabled when all RT processes are in the root cgroup. Be aware that system
--
2.43.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v2 2/3] sched_ext, docs: add label
2025-05-20 14:07 [PATCH v2 0/3] cgroup, docs: cpu controller interaction with various scheduling policies Shashank Balaji via B4 Relay
2025-05-20 14:07 ` [PATCH v2 1/3] cgroup, docs: be specific about bandwidth control of rt processes Shashank Balaji via B4 Relay
@ 2025-05-20 14:07 ` Shashank Balaji via B4 Relay
2025-05-20 20:26 ` Tejun Heo
2025-05-20 14:07 ` [PATCH v2 3/3] cgroup, docs: cpu controller interaction with various scheduling policies Shashank Balaji via B4 Relay
2 siblings, 1 reply; 12+ messages in thread
From: Shashank Balaji via B4 Relay @ 2025-05-20 14:07 UTC (permalink / raw)
To: Tejun Heo, Johannes Weiner, Michal Koutný, Jonathan Corbet
Cc: cgroups, linux-doc, linux-kernel, Shinya Takumi, Shashank Balaji
From: Shashank Balaji <shashank.mahadasyam@sony.com>
Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
---
Documentation/scheduler/sched-ext.rst | 2 ++
1 file changed, 2 insertions(+)
diff --git a/Documentation/scheduler/sched-ext.rst b/Documentation/scheduler/sched-ext.rst
index 0b2654e2164b8e6139db19fc8b68e6c5c289503d..03f9e63a7b2aa10567366ff3e1a0cfc05b43e05a 100644
--- a/Documentation/scheduler/sched-ext.rst
+++ b/Documentation/scheduler/sched-ext.rst
@@ -1,3 +1,5 @@
+.. _sched-ext:
+
==========================
Extensible Scheduler Class
==========================
--
2.43.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v2 3/3] cgroup, docs: cpu controller interaction with various scheduling policies
2025-05-20 14:07 [PATCH v2 0/3] cgroup, docs: cpu controller interaction with various scheduling policies Shashank Balaji via B4 Relay
2025-05-20 14:07 ` [PATCH v2 1/3] cgroup, docs: be specific about bandwidth control of rt processes Shashank Balaji via B4 Relay
2025-05-20 14:07 ` [PATCH v2 2/3] sched_ext, docs: add label Shashank Balaji via B4 Relay
@ 2025-05-20 14:07 ` Shashank Balaji via B4 Relay
2025-05-20 20:16 ` Tejun Heo
2 siblings, 1 reply; 12+ messages in thread
From: Shashank Balaji via B4 Relay @ 2025-05-20 14:07 UTC (permalink / raw)
To: Tejun Heo, Johannes Weiner, Michal Koutný, Jonathan Corbet
Cc: cgroups, linux-doc, linux-kernel, Shinya Takumi, Shashank Balaji
From: Shashank Balaji <shashank.mahadasyam@sony.com>
The cpu controller interface files account for or affect processes
differently based on their scheduling policy, and the underlying
scheduler used (fair-class vs. BPF scheduler). Document these
differences.
Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
---
Documentation/admin-guide/cgroup-v2.rst | 98 +++++++++++++++++++++++++--------
1 file changed, 75 insertions(+), 23 deletions(-)
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 3b3685736fe9b12e96a273248dfb4a8c62a4b698..0f79bf42a3e3b2fcbe6409f9e182ba9de1fbb79c 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1095,19 +1095,50 @@ realtime processes irrespective of CONFIG_RT_GROUP_SCHED.
CPU Interface Files
~~~~~~~~~~~~~~~~~~~
-All time durations are in microseconds.
+The interaction of a process with the cpu controller depends on its scheduling
+policy. We have the following scheduling policies: ``SCHED_IDLE``, ``SCHED_BATCH``,
+``SCHED_OTHER``, ``SCHED_EXT`` (if ``CONFIG_SCHED_CLASS_EXT`` is enabled), ``SCHED_FIFO``,
+``SCHED_RR``, and ``SCHED_DEADLINE``. ``SCHED_{IDLE,BATCH,OTHER,EXT}`` can be scheduled
+either by the fair-class scheduler or by a BPF scheduler::
+
+ CONFIG_SCHED_CLASS_EXT
+ ├─ Disabled
+ | └─ SCHED_{IDLE,BATCH,OTHER} -> fair-class scheduler
+ └─ Enabled
+ ├─ BPF scheduler disabled
+ | └─ SCHED_{IDLE,BATCH,OTHER,EXT} -> fair-class scheduler
+ ├─ BPF scheduler without SCX_OPS_SWITCH_PARTIAL enabled
+ | └─ SCHED_{IDLE,BATCH,OTHER,EXT} -> BPF scheduler
+ └─ BPF scheduler with SCX_OPS_SWITCH_PARTIAL enabled
+ ├─ SCHED_{IDLE,BATCH,OTHER} -> fair-class scheduler
+ └─ SCHED_EXT -> BPF scheduler
+
+For more details on ``SCHED_EXT``, check out :ref:`Documentation/scheduler/sched-ext.rst. <sched-ext>`
+From the point of view of the cpu controller, processes can be categorized as
+follows:
+
+* Processes under the fair-class scheduler
+* Processes under a BPF scheduler with the ``cgroup_set_weight`` callback
+* Everything else: ``SCHED_{FIFO,RR,DEADLINE}`` and processes under a BPF scheduler
+ without the ``cgroup_set_weight`` callback
+
+Note that the ``cgroup_*`` family of callbacks require ``CONFIG_EXT_GROUP_SCHED``
+to be enabled. For each of the following interface files, the above categories
+will be referred to. All time durations are in microseconds.
cpu.stat
A read-only flat-keyed file.
This file exists whether the controller is enabled or not.
- It always reports the following three stats:
+ It always reports the following three stats, which account for all the
+ processes in the cgroup:
- usage_usec
- user_usec
- system_usec
- and the following five when the controller is enabled:
+ and the following five when the controller is enabled, which account for
+ only the processes under the fair-class scheduler:
- nr_periods
- nr_throttled
@@ -1125,6 +1156,10 @@ All time durations are in microseconds.
If the cgroup has been configured to be SCHED_IDLE (cpu.idle = 1),
then the weight will show as a 0.
+ This file affects only processes under the fair-class scheduler and a BPF
+ scheduler with the ``cgroup_set_weight`` callback depending on what the
+ callback actually does.
+
cpu.weight.nice
A read-write single value file which exists on non-root
cgroups. The default is "0".
@@ -1137,6 +1172,10 @@ All time durations are in microseconds.
granularity is coarser for the nice values, the read value is
the closest approximation of the current weight.
+ This file affects only processes under the fair-class scheduler and a BPF
+ scheduler with the ``cgroup_set_weight`` callback depending on what the
+ callback actually does.
+
cpu.max
A read-write two value file which exists on non-root cgroups.
The default is "max 100000".
@@ -1149,43 +1188,56 @@ All time durations are in microseconds.
$PERIOD duration. "max" for $MAX indicates no limit. If only
one number is written, $MAX is updated.
+ This file affects only processes under the fair-class scheduler.
+
cpu.max.burst
A read-write single value file which exists on non-root
cgroups. The default is "0".
The burst in the range [0, $MAX].
+ This file affects only processes under the fair-class scheduler.
+
cpu.pressure
A read-write nested-keyed file.
- Shows pressure stall information for CPU. See
- :ref:`Documentation/accounting/psi.rst <psi>` for details.
+ Shows pressure stall information for CPU, including the contribution of
+ realtime processes. See :ref:`Documentation/accounting/psi.rst <psi>`
+ for details.
+
+ This file accounts for all the processes in the cgroup.
cpu.uclamp.min
- A read-write single value file which exists on non-root cgroups.
- The default is "0", i.e. no utilization boosting.
+ A read-write single value file which exists on non-root cgroups.
+ The default is "0", i.e. no utilization boosting.
- The requested minimum utilization (protection) as a percentage
- rational number, e.g. 12.34 for 12.34%.
+ The requested minimum utilization (protection) as a percentage
+ rational number, e.g. 12.34 for 12.34%.
- This interface allows reading and setting minimum utilization clamp
- values similar to the sched_setattr(2). This minimum utilization
- value is used to clamp the task specific minimum utilization clamp.
+ This interface allows reading and setting minimum utilization clamp
+ values similar to the sched_setattr(2). This minimum utilization
+ value is used to clamp the task specific minimum utilization clamp,
+ including those of realtime processes.
- The requested minimum utilization (protection) is always capped by
- the current value for the maximum utilization (limit), i.e.
- `cpu.uclamp.max`.
+ The requested minimum utilization (protection) is always capped by
+ the current value for the maximum utilization (limit), i.e.
+ `cpu.uclamp.max`.
+
+ This file affects all the processes in the cgroup.
cpu.uclamp.max
- A read-write single value file which exists on non-root cgroups.
- The default is "max". i.e. no utilization capping
+ A read-write single value file which exists on non-root cgroups.
+ The default is "max". i.e. no utilization capping
- The requested maximum utilization (limit) as a percentage rational
- number, e.g. 98.76 for 98.76%.
+ The requested maximum utilization (limit) as a percentage rational
+ number, e.g. 98.76 for 98.76%.
- This interface allows reading and setting maximum utilization clamp
- values similar to the sched_setattr(2). This maximum utilization
- value is used to clamp the task specific maximum utilization clamp.
+ This interface allows reading and setting maximum utilization clamp
+ values similar to the sched_setattr(2). This maximum utilization
+ value is used to clamp the task specific maximum utilization clamp,
+ including those of realtime processes.
+
+ This file affects all the processes in the cgroup.
cpu.idle
A read-write single value file which exists on non-root cgroups.
@@ -1197,7 +1249,7 @@ All time durations are in microseconds.
own relative priorities, but the cgroup itself will be treated as
very low priority relative to its peers.
-
+ This file affects only processes under the fair-class scheduler.
Memory
------
--
2.43.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
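The decision tree in patch 3/3 can be read as a small lookup from (policy, kernel config, BPF scheduler state) to the scheduler that ends up running a task. The following is only an illustrative sketch of that tree as written in the patch, not kernel code: the function name `scheduler_for`, its parameters, and the returned strings are all hypothetical, and the "other" result simply stands for the patch's "everything else" category of ``SCHED_{FIFO,RR,DEADLINE}``.

```python
# Policies that, per the patch text, are handled by either the fair-class
# scheduler or a BPF scheduler.
FAIR_OR_BPF = {"SCHED_IDLE", "SCHED_BATCH", "SCHED_OTHER", "SCHED_EXT"}
# Policies the patch groups under "everything else".
EVERYTHING_ELSE = {"SCHED_FIFO", "SCHED_RR", "SCHED_DEADLINE"}


def scheduler_for(policy, class_ext_enabled, bpf_loaded=False, switch_partial=False):
    """Return "fair", "bpf", or "other" following the tree in patch 3/3.

    class_ext_enabled: CONFIG_SCHED_CLASS_EXT is set (hypothetical flag name).
    bpf_loaded:        a BPF scheduler is currently enabled.
    switch_partial:    that BPF scheduler sets SCX_OPS_SWITCH_PARTIAL.
    """
    if policy in EVERYTHING_ELSE:
        return "other"
    if policy not in FAIR_OR_BPF:
        raise ValueError(f"unknown policy: {policy}")
    if policy == "SCHED_EXT" and not class_ext_enabled:
        # The patch lists SCHED_EXT only when CONFIG_SCHED_CLASS_EXT is enabled.
        raise ValueError("SCHED_EXT requires CONFIG_SCHED_CLASS_EXT")
    if not class_ext_enabled or not bpf_loaded:
        return "fair"
    if switch_partial:
        # Partial switch: only SCHED_EXT tasks go to the BPF scheduler.
        return "bpf" if policy == "SCHED_EXT" else "fair"
    # Full switch: all four policies go to the BPF scheduler.
    return "bpf"
```

For example, `scheduler_for("SCHED_BATCH", True, bpf_loaded=True, switch_partial=True)` yields "fair", matching the last branch of the tree, where only ``SCHED_EXT`` tasks are handed to the BPF scheduler.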
* Re: [PATCH v2 1/3] cgroup, docs: be specific about bandwidth control of rt processes
2025-05-20 14:07 ` [PATCH v2 1/3] cgroup, docs: be specific about bandwidth control of rt processes Shashank Balaji via B4 Relay
@ 2025-05-20 20:11 ` Tejun Heo
2025-05-21 1:14 ` Shashank.Mahadasyam
2025-05-21 16:35 ` Tejun Heo
1 sibling, 1 reply; 12+ messages in thread
From: Tejun Heo @ 2025-05-20 20:11 UTC (permalink / raw)
To: shashank.mahadasyam
Cc: Johannes Weiner, Michal Koutný, Jonathan Corbet, cgroups,
linux-doc, linux-kernel, Shinya Takumi
On Tue, May 20, 2025 at 11:07:45PM +0900, Shashank Balaji via B4 Relay wrote:
> From: Shashank Balaji <shashank.mahadasyam@sony.com>
>
> Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
> ---
> Documentation/admin-guide/cgroup-v2.rst | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> index 1a16ce68a4d7f6f8c9070be89c4975dbfa79077e..3b3685736fe9b12e96a273248dfb4a8c62a4b698 100644
> --- a/Documentation/admin-guide/cgroup-v2.rst
> +++ b/Documentation/admin-guide/cgroup-v2.rst
> @@ -1076,7 +1076,7 @@ cpufreq governor about the minimum desired frequency which should always be
> provided by a CPU, as well as the maximum desired frequency, which should not
> be exceeded by a CPU.
>
> -WARNING: cgroup2 cpu controller doesn't yet fully support the control of
> +WARNING: cgroup2 cpu controller doesn't yet support the (bandwidth) control of
This reads weird to me. Without the () part, it becomes "doesn't yet support
the control of". Maybe rephrase it a bit more?
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 3/3] cgroup, docs: cpu controller interaction with various scheduling policies
2025-05-20 14:07 ` [PATCH v2 3/3] cgroup, docs: cpu controller interaction with various scheduling policies Shashank Balaji via B4 Relay
@ 2025-05-20 20:16 ` Tejun Heo
2025-05-21 7:49 ` Shashank Balaji
0 siblings, 1 reply; 12+ messages in thread
From: Tejun Heo @ 2025-05-20 20:16 UTC (permalink / raw)
To: shashank.mahadasyam
Cc: Johannes Weiner, Michal Koutný, Jonathan Corbet, cgroups,
linux-doc, linux-kernel, Shinya Takumi
Hello,
On Tue, May 20, 2025 at 11:07:47PM +0900, Shashank Balaji via B4 Relay wrote:
...
> +The interaction of a process with the cpu controller depends on its scheduling
> +policy. We have the following scheduling policies: ``SCHED_IDLE``, ``SCHED_BATCH``,
> +``SCHED_OTHER``, ``SCHED_EXT`` (if ``CONFIG_SCHED_CLASS_EXT`` is enabled), ``SCHED_FIFO``,
> +``SCHED_RR``, and ``SCHED_DEADLINE``. ``SCHED_{IDLE,BATCH,OTHER,EXT}`` can be scheduled
> +either by the fair-class scheduler or by a BPF scheduler::
> +
> + CONFIG_SCHED_CLASS_EXT
> + ├─ Disabled
> + | └─ SCHED_{IDLE,BATCH,OTHER} -> fair-class scheduler
> + └─ Enabled
> + ├─ BPF scheduler disabled
> + | └─ SCHED_{IDLE,BATCH,OTHER,EXT} -> fair-class scheduler
> + ├─ BPF scheduler without SCX_OPS_SWITCH_PARTIAL enabled
> + | └─ SCHED_{IDLE,BATCH,OTHER,EXT} -> BPF scheduler
> + └─ BPF scheduler with SCX_OPS_SWITCH_PARTIAL enabled
> + ├─ SCHED_{IDLE,BATCH,OTHER} -> fair-class scheduler
> + └─ SCHED_EXT -> BPF scheduler
> +
> +For more details on ``SCHED_EXT``, check out :ref:`Documentation/scheduler/sched-ext.rst. <sched-ext>`
> +From the point of view of the cpu controller, processes can be categorized as
> +follows:
> +
> +* Processes under the fair-class scheduler
> +* Processes under a BPF scheduler with the ``cgroup_set_weight`` callback
> +* Everything else: ``SCHED_{FIFO,RR,DEADLINE}`` and processes under a BPF scheduler
> + without the ``cgroup_set_weight`` callback
> +
> +Note that the ``cgroup_*`` family of callbacks require ``CONFIG_EXT_GROUP_SCHED``
> +to be enabled. For each of the following interface files, the above categories
> +will be referred to. All time durations are in microseconds.
Can we document the above in sched_ext documentation and point to it from
here? Documenting sched_ext details here seems a bit out of place and prone
to becoming stale over time.
...
> cpu.uclamp.min
> - A read-write single value file which exists on non-root cgroups.
> - The default is "0", i.e. no utilization boosting.
> + A read-write single value file which exists on non-root cgroups.
> + The default is "0", i.e. no utilization boosting.
Can you please separate out indentation changes to a separate patch? These
usually make reviewing tricky.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 2/3] sched_ext, docs: add label
2025-05-20 14:07 ` [PATCH v2 2/3] sched_ext, docs: add label Shashank Balaji via B4 Relay
@ 2025-05-20 20:26 ` Tejun Heo
0 siblings, 0 replies; 12+ messages in thread
From: Tejun Heo @ 2025-05-20 20:26 UTC (permalink / raw)
To: shashank.mahadasyam
Cc: Johannes Weiner, Michal Koutný, Jonathan Corbet, cgroups,
linux-doc, linux-kernel, Shinya Takumi
On Tue, May 20, 2025 at 11:07:46PM +0900, Shashank Balaji via B4 Relay wrote:
> From: Shashank Balaji <shashank.mahadasyam@sony.com>
>
> Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
Applied to sched_ext/for-6.16.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/3] cgroup, docs: be specific about bandwidth control of rt processes
2025-05-20 20:11 ` Tejun Heo
@ 2025-05-21 1:14 ` Shashank.Mahadasyam
2025-05-21 16:30 ` Tejun Heo
0 siblings, 1 reply; 12+ messages in thread
From: Shashank.Mahadasyam @ 2025-05-21 1:14 UTC (permalink / raw)
To: Tejun Heo
Cc: Johannes Weiner, Michal Koutný, Jonathan Corbet,
cgroups@vger.kernel.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org, Shinya.Takumi@sony.com
Hi Tejun,
On 21 May 2025 5:11, Tejun Heo wrote:
> > -WARNING: cgroup2 cpu controller doesn't yet fully support the control of
> > +WARNING: cgroup2 cpu controller doesn't yet support the (bandwidth) control of
>
> This reads weird to me. Without the () part, it becomes "doesn't yet support
> the control of". Maybe rephrase it a bit more?
I'm not sure how to rephrase it. It sounds fine to me 😅 Moreover, "doesn't yet support the control of" was the wording when the warning paragraph on RT_GROUP_SCHED was added in commit c2f31b79 (cgroup: add warning about RT not being supported on cgroup2). Would removing the parentheses, making it "doesn't yet support the bandwidth control of", sound better?
Thank you
Regards,
Shashank
________________________________________
From: Tejun Heo <tj@kernel.org>
Sent: 21 May 2025 5:11
To: Mahadasyam, Shashank (SGC)
Cc: Johannes Weiner; Michal Koutný; Jonathan Corbet; cgroups@vger.kernel.org; linux-doc@vger.kernel.org; linux-kernel@vger.kernel.org; Takumi, Shinya (SGC)
Subject: Re: [PATCH v2 1/3] cgroup, docs: be specific about bandwidth control of rt processes
On Tue, May 20, 2025 at 11:07:45PM +0900, Shashank Balaji via B4 Relay wrote:
> From: Shashank Balaji <shashank.mahadasyam@sony.com>
>
> Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
> ---
> Documentation/admin-guide/cgroup-v2.rst | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> index 1a16ce68a4d7f6f8c9070be89c4975dbfa79077e..3b3685736fe9b12e96a273248dfb4a8c62a4b698 100644
> --- a/Documentation/admin-guide/cgroup-v2.rst
> +++ b/Documentation/admin-guide/cgroup-v2.rst
> @@ -1076,7 +1076,7 @@ cpufreq governor about the minimum desired frequency which should always be
> provided by a CPU, as well as the maximum desired frequency, which should not
> be exceeded by a CPU.
>
> -WARNING: cgroup2 cpu controller doesn't yet fully support the control of
> +WARNING: cgroup2 cpu controller doesn't yet support the (bandwidth) control of
This reads weird to me. Without the () part, it becomes "doesn't yet support
the control of". Maybe rephrase it a bit more?
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 3/3] cgroup, docs: cpu controller interaction with various scheduling policies
2025-05-20 20:16 ` Tejun Heo
@ 2025-05-21 7:49 ` Shashank Balaji
2025-05-21 17:31 ` Tejun Heo
0 siblings, 1 reply; 12+ messages in thread
From: Shashank Balaji @ 2025-05-21 7:49 UTC (permalink / raw)
To: Tejun Heo
Cc: Johannes Weiner, Michal Koutný, Jonathan Corbet, cgroups,
linux-doc, linux-kernel, Shinya Takumi, shashank.mahadasyam
Hi Tejun,
On Tue, May 20, 2025 at 10:16:38AM -1000, Tejun Heo wrote:
> Hello,
>
> On Tue, May 20, 2025 at 11:07:47PM +0900, Shashank Balaji via B4 Relay wrote:
> ...
> > +The interaction of a process with the cpu controller depends on its scheduling
> > +policy. We have the following scheduling policies: ``SCHED_IDLE``, ``SCHED_BATCH``,
> > +``SCHED_OTHER``, ``SCHED_EXT`` (if ``CONFIG_SCHED_CLASS_EXT`` is enabled), ``SCHED_FIFO``,
> > +``SCHED_RR``, and ``SCHED_DEADLINE``. ``SCHED_{IDLE,BATCH,OTHER,EXT}`` can be scheduled
> > +either by the fair-class scheduler or by a BPF scheduler::
> > +
> > + CONFIG_SCHED_CLASS_EXT
> > + ├─ Disabled
> > + | └─ SCHED_{IDLE,BATCH,OTHER} -> fair-class scheduler
> > + └─ Enabled
> > + ├─ BPF scheduler disabled
> > + | └─ SCHED_{IDLE,BATCH,OTHER,EXT} -> fair-class scheduler
> > + ├─ BPF scheduler without SCX_OPS_SWITCH_PARTIAL enabled
> > + | └─ SCHED_{IDLE,BATCH,OTHER,EXT} -> BPF scheduler
> > + └─ BPF scheduler with SCX_OPS_SWITCH_PARTIAL enabled
> > + ├─ SCHED_{IDLE,BATCH,OTHER} -> fair-class scheduler
> > + └─ SCHED_EXT -> BPF scheduler
> > +
> > +For more details on ``SCHED_EXT``, check out :ref:`Documentation/scheduler/sched-ext.rst. <sched-ext>`
> > +From the point of view of the cpu controller, processes can be categorized as
> > +follows:
> > +
> > +* Processes under the fair-class scheduler
> > +* Processes under a BPF scheduler with the ``cgroup_set_weight`` callback
> > +* Everything else: ``SCHED_{FIFO,RR,DEADLINE}`` and processes under a BPF scheduler
> > + without the ``cgroup_set_weight`` callback
> > +
> > +Note that the ``cgroup_*`` family of callbacks require ``CONFIG_EXT_GROUP_SCHED``
> > +to be enabled. For each of the following interface files, the above categories
> > +will be referred to. All time durations are in microseconds.
>
> Can we document the above in sched_ext documentation and point to it from
> here? Documenting sched_ext details here seems a bit out of place and prone
> to becoming stale over time.
Got it. Apart from that, is the content alright?
> Can you please separate out indentation changes to a separate patch? These
> usually make reviewing tricky.
Got it.
Thank you
Regards,
Shashank
PS: Apologies for any malformed emails. I finally managed to switch from Outlook
to mutt.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/3] cgroup, docs: be specific about bandwidth control of rt processes
2025-05-21 1:14 ` Shashank.Mahadasyam
@ 2025-05-21 16:30 ` Tejun Heo
0 siblings, 0 replies; 12+ messages in thread
From: Tejun Heo @ 2025-05-21 16:30 UTC (permalink / raw)
To: Shashank.Mahadasyam@sony.com
Cc: Johannes Weiner, Michal Koutný, Jonathan Corbet,
cgroups@vger.kernel.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org, Shinya.Takumi@sony.com
Hello,
On Wed, May 21, 2025 at 01:14:53AM +0000, Shashank.Mahadasyam@sony.com wrote:
> Hi Tejun,
>
> On 21 May 2025 5:11, Tejun Heo wrote:
> > > -WARNING: cgroup2 cpu controller doesn't yet fully support the control of
> > > +WARNING: cgroup2 cpu controller doesn't yet support the (bandwidth) control of
> >
> > This reads weird to me. Without the () part, it becomes "doesn't yet support
> > the control of". Maybe rephrase it a bit more?
>
> I'm not sure how to rephrase it. It sounds fine to me 😅 Moreover, "doesn't
> yet support the control of" was the wording when the warning paragraph on
> RT_GROUP_SCHED was added in commit c2f31b79 (cgroup: add warning about RT
> not being supported on cgroup2). Would removing the parentheses, making it
> "doesn't yet support the bandwidth control of", sound better?
You're right. I was thinking about sched_ext not RT. Lemme apply the patch
as-is.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 1/3] cgroup, docs: be specific about bandwidth control of rt processes
2025-05-20 14:07 ` [PATCH v2 1/3] cgroup, docs: be specific about bandwidth control of rt processes Shashank Balaji via B4 Relay
2025-05-20 20:11 ` Tejun Heo
@ 2025-05-21 16:35 ` Tejun Heo
1 sibling, 0 replies; 12+ messages in thread
From: Tejun Heo @ 2025-05-21 16:35 UTC (permalink / raw)
To: shashank.mahadasyam
Cc: Johannes Weiner, Michal Koutný, Jonathan Corbet, cgroups,
linux-doc, linux-kernel, Shinya Takumi
On Tue, May 20, 2025 at 11:07:45PM +0900, Shashank Balaji via B4 Relay wrote:
> From: Shashank Balaji <shashank.mahadasyam@sony.com>
>
> Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
Applied to cgroup/for-6.16.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 3/3] cgroup, docs: cpu controller interaction with various scheduling policies
2025-05-21 7:49 ` Shashank Balaji
@ 2025-05-21 17:31 ` Tejun Heo
0 siblings, 0 replies; 12+ messages in thread
From: Tejun Heo @ 2025-05-21 17:31 UTC (permalink / raw)
To: Shashank Balaji
Cc: Johannes Weiner, Michal Koutný, Jonathan Corbet, cgroups,
linux-doc, linux-kernel, Shinya Takumi
On Wed, May 21, 2025 at 04:49:43PM +0900, Shashank Balaji wrote:
> > > +Note that the ``cgroup_*`` family of callbacks require ``CONFIG_EXT_GROUP_SCHED``
CONFIG_EXT_GROUP_SCHED is auto-enabled if CONFIG_SCHED_CLASS_EXT and
CONFIG_CGROUP_SCHED are enabled, so no need to mention it explicitly.
> > > +to be enabled. For each of the following interface files, the above categories
> > > +will be referred to. All time durations are in microseconds.
> >
> > Can we document the above in sched_ext documentation and point to it from
> > here? Documenting sched_ext details here seems a bit out of place and prone
> > to becoming stale over time.
>
> Got it. Apart from that, is the content alright?
Other than the above, yeah, looks correct to me.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 12+ messages in thread
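As a reader's aid for the cpu.stat wording in patch 3/3: the file is flat-keyed, with three stats that are always reported (covering all processes in the cgroup) and further stats that appear only when the controller is enabled (covering, per the patch, only processes under the fair-class scheduler). A minimal sketch of splitting the two groups is shown below. The helper name `split_cpu_stat` is hypothetical and the sample values are invented purely for illustration; only the key names come from the patch.

```python
# Keys the patch describes as always reported, accounting for all processes.
BASE_KEYS = ("usage_usec", "user_usec", "system_usec")


def split_cpu_stat(text):
    """Split flat-keyed cpu.stat content into (base, controller) dicts.

    Any key not in BASE_KEYS is treated as a controller-enabled stat,
    which the patch says accounts only for fair-class processes.
    """
    base, controller = {}, {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, value = line.split()
        target = base if key in BASE_KEYS else controller
        target[key] = int(value)
    return base, controller


# Invented sample content; real values come from <cgroup>/cpu.stat.
SAMPLE = """usage_usec 1500
user_usec 1000
system_usec 500
nr_periods 10
nr_throttled 2
"""

base, controller = split_cpu_stat(SAMPLE)
```

On a live system the text would be read from the cgroup's cpu.stat file instead of the sample string; an empty `controller` dict would then suggest the cpu controller is not enabled for that cgroup.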