From: Andrea Righi <arighi@nvidia.com>
To: Emil Tsalapatis <emil@etsalapatis.com>
Cc: Tejun Heo <tj@kernel.org>, David Vernet <void@manifault.com>,
	Changwoo Min <changwoo@igalia.com>,
	Kuba Piecuch <jpiecuch@google.com>,
	Christian Loehle <christian.loehle@arm.com>,
	Daniel Hodges <hodgesd@meta.com>,
	sched-ext@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] selftests/sched_ext: Add test to validate ops.dequeue() semantics
Date: Sun, 8 Feb 2026 14:55:11 +0100
Message-ID: <aYiVv998mfMGVMCq@gpd4>
In-Reply-To: <aYhkv2K99aqiuwr5@gpd4>

On Sun, Feb 08, 2026 at 11:26:13AM +0100, Andrea Righi wrote:
> On Sun, Feb 08, 2026 at 10:02:41AM +0100, Andrea Righi wrote:
> ...
> > > >> >  - From ops.select_cpu():
> > > >> >      - scenario 0 (local DSQ): tasks dispatched to the local DSQ bypass
> > > >> >        the BPF scheduler entirely; they never enter BPF custody, so
> > > >> >        ops.dequeue() is not called,
> > > >> >      - scenario 1 (global DSQ): tasks dispatched to SCX_DSQ_GLOBAL also
> > > >> >        bypass the BPF scheduler, like the local DSQ; ops.dequeue() is
> > > >> >        not called,
> > > >> >      - scenario 2 (user DSQ): tasks enter BPF scheduler custody with full
> > > >> >        enqueue/dequeue lifecycle tracking and state machine validation
> > > >> >        (expects 1:1 enqueue/dequeue pairing).
> > > >>
> > > >> Could you add a note here about why there's no equivalent to scenario 6?
> > > >> The differentiating factor between that and scenario 2 (nonterminal queue) is
> > > >> that scx_dsq_insert_commit() is called regardless of whether the queue is terminal.
> > > >> And this makes sense since for non-DSQ queues the BPF scheduler can do its
> > > >> own tracking of enqueue/dequeue (plus it does not make too much sense to
> > > >> do BPF-internal enqueueing in select_cpu).
> > > >>
> > > >> What do you think? If the above makes sense, maybe we should spell it out
> > > >> in the documentation too. Maybe also add it makes no sense to enqueue
> > > >> in an internal BPF structure from select_cpu - the task is not yet
> > > >> enqueued, and would have to go through enqueue anyway.
> > > >
> > > > Oh, I just didn't think about it, we can definitely add to ops.select_cpu()
> > > > a scenario equivalent to scenario 6 (push task to the BPF queue).
> > > >
> > > > From a practical standpoint the benefits are questionable, but in the scope
> > > > of the kselftest I think it makes sense to better validate the entire state
> > > > machine in all cases. I'll add this scenario as well.
> > > >
> > >
> > > That makes sense! Let's add it for completeness. Even if it doesn't make
> > > sense right now that may change in the future. For example, if we end
> > > up finding a good reason to add the task into an internal structure from
> > > .select_cpu(), we may allow the task to be explicitly marked as being in
> > > the BPF scheduler's custody from a kfunc. Right now we can't do that
> > > from select_cpu() unless we direct dispatch IIUC.
> >
> > Ok, I'll send a new patch later with the new scenario included. It should
> > work already (if done properly in the test case), I think we don't need to
> > change anything in the kernel.
>
> Actually I take that back. The internal BPF queue from ops.select_cpu()
> scenario is a bit tricky, because when we return from ops.select_cpu()
> without p->scx.ddsp_dsq_id being set, we don't know if the scheduler added
> the task to an internal BPF queue or simply did nothing.
>
> We need to add some special logic here, preferably without introducing
> overhead just to handle this particular (really uncommon) case. I'll take a
> look.

The more I think about this, the more it feels wrong to consider a task as
being "in BPF scheduler custody" if it is stored in a BPF internal data
structure from ops.select_cpu().

At the point where ops.select_cpu() runs, the task has not yet entered the
BPF scheduler's queues. While it is technically possible to stash the task
in some BPF-managed structure from there, doing so should not imply full
scheduler custody.

In particular, we should not trigger ops.dequeue(), because the task has
not reached the "enqueue" stage of its lifecycle. ops.select_cpu() is
effectively a pre-enqueue hook, primarily intended as a fast path to bypass
the scheduler altogether. As such, triggering ops.dequeue() in this case
would not make sense IMHO.
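
To make the rule above concrete, here is a rough user-space sketch in plain
C. It is not kernel code: every name in it (select_cpu_action, model_task,
core_sees_dispatch(), model_select_cpu(), model_leave()) is made up for
illustration, and it only models what happens between ops.select_cpu() and
ops.enqueue(). It assumes the semantics discussed in this thread: local and
global DSQ dispatches bypass the BPF scheduler, a user DSQ dispatch puts the
task in custody, and a stash in a BPF-internal structure looks the same to
the core as doing nothing, since p->scx.ddsp_dsq_id is not set in either
case.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
enum select_cpu_action {
	ACT_NONE,	/* ops.select_cpu() only picked a CPU */
	ACT_LOCAL_DSQ,	/* direct dispatch to the local DSQ */
	ACT_GLOBAL_DSQ,	/* direct dispatch to SCX_DSQ_GLOBAL */
	ACT_USER_DSQ,	/* direct dispatch to a user DSQ */
	ACT_BPF_STASH,	/* task stored in a BPF-internal structure */
};

struct model_task {
	bool in_custody;	/* is an ops.dequeue() call owed for this task? */
	int nr_dequeue;		/* number of modeled ops.dequeue() calls */
};

/*
 * Models whether the core can observe a direct dispatch, i.e. whether
 * p->scx.ddsp_dsq_id would be set on return from ops.select_cpu().
 * ACT_NONE and ACT_BPF_STASH are indistinguishable here.
 */
static bool core_sees_dispatch(enum select_cpu_action act)
{
	return act == ACT_LOCAL_DSQ || act == ACT_GLOBAL_DSQ ||
	       act == ACT_USER_DSQ;
}

/* Models the custody decision taken on return from ops.select_cpu(). */
static void model_select_cpu(struct model_task *t, enum select_cpu_action act)
{
	assert(!t->in_custody);	/* the task has not been enqueued yet */

	/*
	 * Only a user DSQ dispatch enters custody. Local/global DSQs bypass
	 * the scheduler, and a pre-enqueue stash does not count as custody.
	 */
	t->in_custody = (act == ACT_USER_DSQ);
}

/* Models the task being taken away before it reaches ops.enqueue(). */
static void model_leave(struct model_task *t)
{
	if (!t->in_custody)
		return;		/* no ops.dequeue() for tasks never in custody */

	t->nr_dequeue++;
	t->in_custody = false;
}
```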

I think it would make more sense to document this behavior explicitly and
leave the kselftest as is.

Thoughts?

Thanks,
-Andrea