* [PATCH v2 sched_ext/for-7.1] sched_ext: Documentation: improve accuracy of task lifecycle pseudo-code
@ 2026-04-09 16:57 Kuba Piecuch
2026-04-09 17:46 ` Andrea Righi
0 siblings, 1 reply; 2+ messages in thread
From: Kuba Piecuch @ 2026-04-09 16:57 UTC (permalink / raw)
To: Tejun Heo, Andrea Righi, Changwoo Min, David Vernet,
Christian Loehle
Cc: linux-kernel, sched-ext, Kuba Piecuch

* Add ops.quiescent() and ops.runnable() to the sched_change path.
  When a queued task has one of its scheduling properties changed
  (e.g. nice, affinity), it goes through dequeue() -> quiescent() ->
  (property change callback, e.g. ops.set_weight()) -> runnable() ->
  enqueue().

* Change && to || in ops.enqueue() condition. We want to enqueue tasks
  that have a non-zero slice and are not in any DSQ.

* Call ops.dispatch() and ops.dequeue() only for tasks that have had
  ops.enqueue() called. This is to account for tasks direct-dispatched
  from ops.select_cpu().

* Add a note explaining that the pseudo-code provides a simplified view
  of the task lifecycle and list some examples of cases that the
  pseudo-code does not account for.

Fixes: a4f61f0a1afd ("sched_ext: Documentation: Add ops.dequeue() to task lifecycle")
Signed-off-by: Kuba Piecuch <jpiecuch@google.com>
---
Changes in v2:
- Changed && to || in ops.enqueue() condition (Andrea)
- Moved ops.dispatch() and ops.dequeue() inside the if statement
(Andrea)
- Added a note after the pseudo-code with examples of cases that the
pseudo-code does not account for
- Link to v1: https://lore.kernel.org/all/20260408091821.91063-1-jpiecuch@google.com
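
Illustration only, not part of the patch: a minimal userspace sketch of
two points from the changelog above, namely the && vs || difference in
the ops.enqueue() condition and the callback order on a property change.
This is not kernel code. struct model_task, needs_enqueue_old(),
needs_enqueue_new(), trace_cb() and the two *_property_change() helpers
are hypothetical names made up for this sketch; the callback orders are
taken from the commit message and from the note added to the docs, with
ops.set_weight() standing in for any property change callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a task; this is NOT the kernel's task_struct. */
struct model_task {
	bool in_dsq;		/* task currently sits in some DSQ */
	unsigned long slice;	/* remaining time slice, 0 means expired */
};

/* Condition as previously documented: both parts had to hold. */
static bool needs_enqueue_old(const struct model_task *t)
{
	return !t->in_dsq && t->slice == 0;
}

/* Condition as documented after this patch: either part is enough. */
static bool needs_enqueue_new(const struct model_task *t)
{
	return !t->in_dsq || t->slice == 0;
}

/* Append a callback name to a space-separated trace string. */
static void trace_cb(char *buf, size_t len, const char *cb)
{
	size_t used = strlen(buf);
	size_t need = strlen(cb) + (used ? 1 : 0);

	assert(used + need < len);	/* the model never truncates */
	if (used)
		strcat(buf, " ");
	strcat(buf, cb);
}

/* Property change on a task that is queued and waiting for dispatch. */
static void queued_property_change(char *buf, size_t len)
{
	trace_cb(buf, len, "dequeue");
	trace_cb(buf, len, "quiescent");
	trace_cb(buf, len, "set_weight");	/* example property callback */
	trace_cb(buf, len, "runnable");
	trace_cb(buf, len, "enqueue");
}

/* Property change on a task that is currently running on a CPU. */
static void running_property_change(char *buf, size_t len)
{
	trace_cb(buf, len, "stopping");
	trace_cb(buf, len, "quiescent");
	trace_cb(buf, len, "set_weight");	/* example property callback */
	trace_cb(buf, len, "runnable");
	trace_cb(buf, len, "running");
}
```

For a model task with in_dsq = false and slice = 20, needs_enqueue_old()
returns false while needs_enqueue_new() returns true, which is the case
the && to || change is meant to cover.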
Documentation/scheduler/sched-ext.rst | 43 ++++++++++++++++++++++-----
1 file changed, 36 insertions(+), 7 deletions(-)
diff --git a/Documentation/scheduler/sched-ext.rst b/Documentation/scheduler/sched-ext.rst
index ec594ae8086de..2d77ab4816a63 100644
--- a/Documentation/scheduler/sched-ext.rst
+++ b/Documentation/scheduler/sched-ext.rst
@@ -408,8 +408,8 @@ for more information.
Task Lifecycle
--------------
-The following pseudo-code summarizes the entire lifecycle of a task managed
-by a sched_ext scheduler:
+The following pseudo-code presents a rough overview of the entire lifecycle
+of a task managed by a sched_ext scheduler:
.. code-block:: c
@@ -423,20 +423,25 @@ by a sched_ext scheduler:
ops.runnable(); /* Task becomes ready to run */
while (task_is_runnable(task)) {
- if (task is not in a DSQ && task->scx.slice == 0) {
+ if (task is not in a DSQ || task->scx.slice == 0) {
ops.enqueue(); /* Task can be added to a DSQ */
/* Task property change (i.e., affinity, nice, etc.)? */
if (sched_change(task)) {
ops.dequeue(); /* Exiting BPF scheduler custody */
+ ops.quiescent();
+
+ /* Property change callback, e.g. ops.set_weight() */
+
+ ops.runnable();
continue;
}
- }
- /* Any usable CPU becomes available */
+ /* Any usable CPU becomes available */
- ops.dispatch(); /* Task is moved to a local DSQ */
- ops.dequeue(); /* Exiting BPF scheduler custody */
+ ops.dispatch(); /* Task is moved to a local DSQ */
+ ops.dequeue(); /* Exiting BPF scheduler custody */
+ }
ops.running(); /* Task starts running on its assigned CPU */
@@ -456,6 +461,30 @@ by a sched_ext scheduler:
ops.disable(); /* Disable BPF scheduling for the task */
ops.exit_task(); /* Task is destroyed */
+Note that the above pseudo-code does not cover all possible state transitions
+and edge cases. For example:
+
+* ``ops.dispatch()`` may fail to move the task to a local DSQ due to a racing
+ property change on that task, in which case ``ops.dispatch()`` will be
+ retried.
+
+* The task may be direct-dispatched to a local DSQ from ``ops.enqueue()``,
+ in which case ``ops.dispatch()`` and ``ops.dequeue()`` are skipped and we go
+ straight to ``ops.running()``.
+
+* Property changes may occur at virtually any point during the task's lifecycle,
+ not just when the task is queued and waiting to be dispatched. For example,
+ changing a property of a running task will lead to the callback sequence
+ ``ops.stopping()`` -> ``ops.quiescent()`` -> (property change callback) ->
+ ``ops.runnable()`` -> ``ops.running()``.
+
+* A sched_ext task can be preempted by a task from a higher-priority scheduling
+ class, in which case it will exit the tick-dispatch loop even though it is runnable
+ and has a non-zero slice.
+
+See the "Scheduling Cycle" section for a more detailed description of how
+a freshly woken up task gets on a CPU.
+
Where to Look
=============
--
2.53.0.1213.gd9a14994de-goog
* Re: [PATCH v2 sched_ext/for-7.1] sched_ext: Documentation: improve accuracy of task lifecycle pseudo-code
2026-04-09 16:57 [PATCH v2 sched_ext/for-7.1] sched_ext: Documentation: improve accuracy of task lifecycle pseudo-code Kuba Piecuch
@ 2026-04-09 17:46 ` Andrea Righi
0 siblings, 0 replies; 2+ messages in thread
From: Andrea Righi @ 2026-04-09 17:46 UTC (permalink / raw)
To: Kuba Piecuch
Cc: Tejun Heo, Changwoo Min, David Vernet, Christian Loehle,
linux-kernel, sched-ext
On Thu, Apr 09, 2026 at 04:57:44PM +0000, Kuba Piecuch wrote:
> * Add ops.quiescent() and ops.runnable() to the sched_change path.
> When a queued task has one of its scheduling properties changed
> (e.g. nice, affinity), it goes through dequeue() -> quiescent() ->
> (property change callback, e.g. ops.set_weight()) -> runnable() ->
> enqueue().
>
> * Change && to || in ops.enqueue() condition. We want to enqueue tasks
> that have a non-zero slice and are not in any DSQ.
>
> * Call ops.dispatch() and ops.dequeue() only for tasks that have had
> ops.enqueue() called. This is to account for tasks direct-dispatched
> from ops.select_cpu().
>
> * Add a note explaining that the pseudo-code provides a simplified view
> of the task lifecycle and list some examples of cases that the
> pseudo-code does not account for.
>
> Fixes: a4f61f0a1afd ("sched_ext: Documentation: Add ops.dequeue() to task lifecycle")
> Signed-off-by: Kuba Piecuch <jpiecuch@google.com>
Looks good to me.
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Thanks,
-Andrea
> ---
>
> Changes in v2:
> - Changed && to || in ops.enqueue() condition (Andrea)
> - Moved ops.dispatch() and ops.dequeue() inside the if statement
> (Andrea)
> - Added a note after the pseudo-code with examples of cases that the
> pseudo-code does not account for
> - Link to v1: https://lore.kernel.org/all/20260408091821.91063-1-jpiecuch@google.com
>
> Documentation/scheduler/sched-ext.rst | 43 ++++++++++++++++++++++-----
> 1 file changed, 36 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/scheduler/sched-ext.rst b/Documentation/scheduler/sched-ext.rst
> index ec594ae8086de..2d77ab4816a63 100644
> --- a/Documentation/scheduler/sched-ext.rst
> +++ b/Documentation/scheduler/sched-ext.rst
> @@ -408,8 +408,8 @@ for more information.
> Task Lifecycle
> --------------
>
> -The following pseudo-code summarizes the entire lifecycle of a task managed
> -by a sched_ext scheduler:
> +The following pseudo-code presents a rough overview of the entire lifecycle
> +of a task managed by a sched_ext scheduler:
>
> .. code-block:: c
>
> @@ -423,20 +423,25 @@ by a sched_ext scheduler:
> ops.runnable(); /* Task becomes ready to run */
>
> while (task_is_runnable(task)) {
> - if (task is not in a DSQ && task->scx.slice == 0) {
> + if (task is not in a DSQ || task->scx.slice == 0) {
> ops.enqueue(); /* Task can be added to a DSQ */
>
> /* Task property change (i.e., affinity, nice, etc.)? */
> if (sched_change(task)) {
> ops.dequeue(); /* Exiting BPF scheduler custody */
> + ops.quiescent();
> +
> + /* Property change callback, e.g. ops.set_weight() */
> +
> + ops.runnable();
> continue;
> }
> - }
>
> - /* Any usable CPU becomes available */
> + /* Any usable CPU becomes available */
>
> - ops.dispatch(); /* Task is moved to a local DSQ */
> - ops.dequeue(); /* Exiting BPF scheduler custody */
> + ops.dispatch(); /* Task is moved to a local DSQ */
> + ops.dequeue(); /* Exiting BPF scheduler custody */
> + }
>
> ops.running(); /* Task starts running on its assigned CPU */
>
> @@ -456,6 +461,30 @@ by a sched_ext scheduler:
> ops.disable(); /* Disable BPF scheduling for the task */
> ops.exit_task(); /* Task is destroyed */
>
> +Note that the above pseudo-code does not cover all possible state transitions
> +and edge cases. For example:
> +
> +* ``ops.dispatch()`` may fail to move the task to a local DSQ due to a racing
> + property change on that task, in which case ``ops.dispatch()`` will be
> + retried.
> +
> +* The task may be direct-dispatched to a local DSQ from ``ops.enqueue()``,
> + in which case ``ops.dispatch()`` and ``ops.dequeue()`` are skipped and we go
> + straight to ``ops.running()``.
> +
> +* Property changes may occur at virtually any point during the task's lifecycle,
> + not just when the task is queued and waiting to be dispatched. For example,
> + changing a property of a running task will lead to the callback sequence
> + ``ops.stopping()`` -> ``ops.quiescent()`` -> (property change callback) ->
> + ``ops.runnable()`` -> ``ops.running()``.
> +
> +* A sched_ext task can be preempted by a task from a higher-priority scheduling
> + class, in which case it will exit the tick-dispatch loop even though it is runnable
> + and has a non-zero slice.
> +
> +See the "Scheduling Cycle" section for a more detailed description of how
> +a freshly woken up task gets on a CPU.
> +
> Where to Look
> =============
>
> --
> 2.53.0.1213.gd9a14994de-goog
>