From: Andrea Righi <arighi@nvidia.com>
To: Christian Loehle <christian.loehle@arm.com>
Cc: Tejun Heo <tj@kernel.org>, David Vernet <void@manifault.com>,
Changwoo Min <changwoo@igalia.com>,
Kuba Piecuch <jpiecuch@google.com>,
Emil Tsalapatis <emil@etsalapatis.com>,
Daniel Hodges <hodgesd@meta.com>,
sched-ext@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] sched_ext: Fix ops.dequeue() semantics
Date: Mon, 2 Feb 2026 16:32:14 +0100
Message-ID: <aYDDfjZv0pFgO2LK@gpd4>
In-Reply-To: <dbced6c4-388d-4f2e-90c3-cb2299fca34d@arm.com>
On Mon, Feb 02, 2026 at 10:02:30AM +0000, Christian Loehle wrote:
> On 2/2/26 09:26, Andrea Righi wrote:
> > On Mon, Feb 02, 2026 at 08:45:18AM +0100, Andrea Righi wrote:
> > ...
> >>> So I have finally gotten around updating scx_storm to the new semantics,
> >>> see:
> >>> https://github.com/cloehle/scx/tree/cloehle/scx-storm-qmap-insert-local-dequeue-semantics
> >>>
> >>> I don't think the new ops.dequeue() are enough to make inserts to local-on
> >>> from anywhere safe, because it's still racing with dequeue from another CPU?
> >>
> >> Yeah, with this patch set BPF schedulers get proper ops.dequeue()
> >> callbacks, but we're not fixing the usage of SCX_DSQ_LOCAL_ON from
> >> ops.dispatch().
> >>
> >> When task properties change between scx_bpf_dsq_insert() and the actual
> >> dispatch, task_can_run_on_remote_rq() can still trigger a fatal
> >> scx_error().
> >>
> >> The ops.dequeue(SCX_DEQ_SCHED_CHANGE) notification happens after the
> >> property change, so it can't prevent already-queued dispatches from
> >> failing. The race window is between ops.dispatch() returning and
> >> dispatch_to_local_dsq() executing.
> >>
> >> We can address this in a separate patch set. One thing at a time. :)
> >
> > Thinking more on this, the problem is that we're passing enforce=true to
> > task_can_run_on_remote_rq(), triggering a critical failure - scx_error().
> > There's logic in task_can_run_on_remote_rq() to fall back to the global
> > DSQ, but that doesn't happen if we pass enforce=true, due to scx_error().
> >
> > However, instead of the global DSQ fallback, I was wondering if it'd be
> > better to simply re-enqueue the task - setting SCX_ENQ_REENQ - if the
> > target local DSQ isn't valid anymore when the dispatch is finalized.
> >
> > In this way using SCX_DSQ_LOCAL_ON | cpu from ops.dispatch() would simply
> > trigger a re-enqueue when "cpu" isn't valid anymore (due to concurrent
> > affinity / migration disabled changes) and the BPF scheduler can handle
> > that in another ops.enqueue().
> >
> > What do you think?
>
> I think that's a lot more versatile for the BPF scheduler than using the
> global DSQ as fallback in that case, so yeah I'm all for it!
>
Ack, I already have a working patch to do this; I'll post it as a separate
patch set.
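
To illustrate the idea discussed above, here is a toy userspace model of the
decision made when a dispatch to SCX_DSQ_LOCAL_ON | cpu is finalized. This is
only a sketch and not the actual patch: every name below (toy_task,
toy_can_run_on(), toy_finalize_dispatch(), the FINALIZE_* values) is made up
for illustration and does not exist in the kernel. It only contrasts the
current behavior (fatal error when the target CPU is no longer valid) with
the proposed one (hand the task back to the scheduler for re-enqueue).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names exist in the kernel. */
enum finalize_result {
	FINALIZE_LOCAL,	/* task lands on the target CPU's local DSQ */
	FINALIZE_ERROR,	/* models the current fatal scx_error() path */
	FINALIZE_REENQ,	/* models the proposed re-enqueue of the task */
};

struct toy_task {
	uint64_t allowed_cpus;		/* bit N set => task may run on CPU N */
	bool migration_disabled;	/* task is pinned to cur_cpu for now */
	int cur_cpu;
};

/* Can the task run on @cpu given its current (possibly changed) properties? */
static bool toy_can_run_on(const struct toy_task *p, int cpu)
{
	assert(cpu >= 0 && cpu < 64);

	if (p->migration_disabled && cpu != p->cur_cpu)
		return false;

	return (p->allowed_cpus >> cpu) & 1;
}

/*
 * Finalize a dispatch that was requested earlier for @target_cpu.
 * The task's affinity or migration state may have changed since then.
 */
static enum finalize_result
toy_finalize_dispatch(const struct toy_task *p, int target_cpu,
		      bool reenq_on_invalid)
{
	if (toy_can_run_on(p, target_cpu))
		return FINALIZE_LOCAL;

	/* Target is stale: either fail hard or give the task back. */
	return reenq_on_invalid ? FINALIZE_REENQ : FINALIZE_ERROR;
}
```

In this model, a task whose allowed mask shrinks from CPUs {0,1} to {0}
after being queued for CPU 1 yields FINALIZE_ERROR with
reenq_on_invalid=false and FINALIZE_REENQ with reenq_on_invalid=true; in
the latter case the BPF scheduler would pick a new target from
ops.enqueue().
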
Thanks,
-Andrea