All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrea Righi <arighi@nvidia.com>
To: Kuba Piecuch <jpiecuch@google.com>
Cc: Christian Loehle <christian.loehle@arm.com>,
	Tejun Heo <tj@kernel.org>, David Vernet <void@manifault.com>,
	Changwoo Min <changwoo@igalia.com>,
	Emil Tsalapatis <emil@etsalapatis.com>,
	sched-ext@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [PATCH sched_ext/for-7.1] sched_ext: Documentation: Add missing calls to quiescent(), runnable()
Date: Thu, 9 Apr 2026 16:12:29 +0200	[thread overview]
Message-ID: <adezzWWsgrcGGD3j@gpd4> (raw)
In-Reply-To: <DHONT6QVA0PH.IP79IYRM8PKK@google.com>

On Thu, Apr 09, 2026 at 01:30:55PM +0000, Kuba Piecuch wrote:
> On Thu Apr 9, 2026 at 9:46 AM UTC, Christian Loehle wrote:
> ...
> >>>
> >>>     ops.init_task();            /* A new task is created */
> >>>     ops.enable();               /* Enable BPF scheduling for the task */
> >>>
> >>>     while (task in SCHED_EXT) {
> >>>         if (task can migrate)
> >>>             ops.select_cpu();   /* Called on wakeup (optimization) */
> >>>
> >>>         ops.runnable();         /* Task becomes ready to run */
> >>>
> >>>         while (task_is_runnable(task)) {
> >>>             if (task is not in a DSQ || task->scx.slice == 0) {
> >>>                 ops.enqueue();  /* Task can be added to a DSQ */
> >>>
> >>>                 /* Task property change (i.e., affinity, nice, etc.)? */
> >>>                 if (sched_change(task)) {
> >>>                     ops.dequeue(); /* Exiting BPF scheduler custody */
> >>>                     ops.quiescent();
> >>>
> >>>                     /* Property change callback, e.g. ops.set_weight() */
> >>>
> >>>                     ops.runnable();
> >>>                     continue;
> >>>                 }
> >>>
> >>>                 /* Any usable CPU becomes available */
> >>>
> >>>                 ops.dispatch();     /* Task is moved to a local DSQ */
> >>>                 ops.dequeue();      /* Exiting BPF scheduler custody */
> > Is this true here? Any dispatch followed by a dequeue?
> 
> The comment next to ops.dispatch() says the task is moved to a local DSQ,
> so if we assume that, then I think it will always be followed by ops.dequeue().
> Same if we move the task to the global DSQ.

So, ops.dispatch() is not a "task callback", it's a "CPU callback", invoked when
a CPU becomes available. So having ops.dispatch() here can be a bit confusing.

The intent was to describe the workflow where, once the task is enqueued to a
non-terminal DSQ, then it can be consumed by an ops.dispatch() event and, in
that case, ops.dequeue() is also invoked when the task reaches a terminal DSQ.
Not sure if there's a better way to express this concept in the pseudocode.

> 
> Of course, you could do something weird like dispatch the task to a user DSQ,
> in which case there won't be a dequeue and the task won't start running, but
> that's weird enough that I don't think we need to consider it.

Right. For the records, scx_rustland_core does something similar: from
ops.dispatch() it consumes a task from a BPF user ringbuffer, inserts it into a
user DSQ and then consumes the first task from the user DSQ via
scx_bpf_dsq_move_to_local().

But that's a bit of a special use case, due to the unusual user-space scheduling
part. But in this case the pseudocode is still accurate, since the
scx_bpf_dsq_move_to_local() triggers an ops.dequeue().

> 
> You could also have a property change racing with the dispatch which would make
> the dispatch fail and not be followed by a dequeue, but again, we need to draw
> the line somewhere.

Yeah, the thing is that sched_change() introduces a lot of different edge cases
that are difficult to represent in the pseudocode. I guess the best we can do is
document in a descriptive manner the concept of "BPF scheduler's custody" and
the fact that a task can temporarily leave the custody when a sched_change()
event happens.

> 
> So, in other words, any _successful_ dispatch to a _terminal_ DSQ is always
> followed by a dequeue.
> 
> Another case that isn't handled here is direct dispatch to a terminal DSQ from
> ops.enqueue(), where we don't get ops.dispatch() or ops.dequeue() and go
> straight to ops.running(). If any of the above cases should be handled in the
> pseudocode, I'd say it's this one.

Right, in fact it should be: any _successful_ dispatch to a _terminal DSQ_, if
the task was in the BPF scheduler's custody. A direct dispatch to a terminal DSQ
either from ops.select_cpu() or ops.enqueue() doesn't trigger ops.dequeue(),
because the task doesn't enter the BPF scheduler's custody.

Thanks,
-Andrea

  reply	other threads:[~2026-04-09 14:12 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-06 11:47 [PATCH sched_ext/for-7.1] sched_ext: Documentation: Add ops.dequeue() to task lifecycle Andrea Righi
2026-04-06 14:49 ` Emil Tsalapatis
2026-04-06 19:08   ` Andrea Righi
2026-04-06 18:09 ` Tejun Heo
2026-04-07  9:54 ` Kuba Piecuch
2026-04-07 16:31   ` Andrea Righi
2026-04-08  9:18     ` [PATCH sched_ext/for-7.1] sched_ext: Documentation: Add missing calls to quiescent(), runnable() Kuba Piecuch
2026-04-08 11:28       ` Andrea Righi
2026-04-08 12:40         ` Kuba Piecuch
2026-04-08 13:49           ` Andrea Righi
2026-04-08 14:17             ` Kuba Piecuch
2026-04-08 14:54               ` Andrea Righi
2026-04-09  8:46                 ` Kuba Piecuch
2026-04-09  9:38                   ` Andrea Righi
2026-04-09  9:46                   ` Christian Loehle
2026-04-09 13:30                     ` Kuba Piecuch
2026-04-09 14:12                       ` Andrea Righi [this message]
2026-04-09 13:51                     ` Andrea Righi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=adezzWWsgrcGGD3j@gpd4 \
    --to=arighi@nvidia.com \
    --cc=changwoo@igalia.com \
    --cc=christian.loehle@arm.com \
    --cc=emil@etsalapatis.com \
    --cc=jpiecuch@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sched-ext@lists.linux.dev \
    --cc=tj@kernel.org \
    --cc=void@manifault.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.