* [PATCH v2] sched_ext: Honor SCX_OPS_ALWAYS_ENQ_IMMED on framework-internal goto-local paths
@ 2026-04-20 3:56 zhidao su
2026-04-20 5:23 ` Andrea Righi
0 siblings, 1 reply; 3+ messages in thread
From: zhidao su @ 2026-04-20 3:56 UTC (permalink / raw)
To: Tejun Heo, David Vernet
Cc: Andrea Righi, Changwoo Min, sched-ext, linux-kernel, zhidao su
SCX_OPS_ALWAYS_ENQ_IMMED promises that SCX_ENQ_IMMED is set on all local
DSQ enqueues. scx_vet_enq_flags() enforces this for BPF kfunc callers
(scx_bpf_dsq_insert, scx_bpf_dsq_move_*), but the framework-internal
goto-local paths in do_enqueue_task() — PF_EXITING, migration-disabled,
and !scx_rq_online fallbacks — bypass scx_vet_enq_flags() entirely.

When a scheduler sets SCX_OPS_ALWAYS_ENQ_IMMED, tasks hitting these
goto-local paths arrive at dispatch_enqueue() without SCX_ENQ_IMMED in
enq_flags, violating the flag's documented semantics.

This can be observed with trace_printk instrumentation at the enqueue:
label while running a multi-threaded fork-exit workload under a scheduler
with SCX_OPS_ALWAYS_ENQ_IMMED:

Before (scx_simple + ALWAYS_ENQ_IMMED, 2 CPUs, mmap-contention exit):
  95 PF_EXITING local enqueues, 95/95 IMMED=0 ALW=1        <-- bug

After:
  1030 PF_EXITING local enqueues, 1030/1030 IMMED=1 ALW=1  <-- fixed

Fix by checking SCX_OPS_ALWAYS_ENQ_IMMED at the enqueue: label and
setting SCX_ENQ_IMMED when dispatching to a local DSQ. This mirrors
what scx_vet_enq_flags() does for BPF callers.

Fixes: 3229ac4a5ef5 ("sched_ext: Add SCX_OPS_ALWAYS_ENQ_IMMED ops flag")
Signed-off-by: zhidao su <suzhidao@xiaomi.com>
---
v2: Resend to correct a submission error in v1 where unrelated files
were accidentally included in the patch. The code change is
identical; only kernel/sched/ext.c is modified. Apologies for
the noise.
---
kernel/sched/ext.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 9628c64e5592..0758f5e5a8f0 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -1859,7 +1859,9 @@ static void do_enqueue_task(struct rq *rq, struct task_struct *p, u64 enq_flags,
 	 * Clear persistent TASK_IMMED for fresh enqueues, see dsq_inc_nr().
 	 * Note that exiting and migration-disabled tasks that skip
 	 * ops.enqueue() below will lose IMMED protection unless
-	 * %SCX_OPS_ENQ_EXITING / %SCX_OPS_ENQ_MIGRATION_DISABLED are set.
+	 * %SCX_OPS_ENQ_EXITING / %SCX_OPS_ENQ_MIGRATION_DISABLED are set,
+	 * or %SCX_OPS_ALWAYS_ENQ_IMMED is enabled (which re-applies IMMED
+	 * at the enqueue: label below).
 	 */
 	p->scx.flags &= ~SCX_TASK_IMMED;
 
@@ -1949,6 +1951,17 @@ static void do_enqueue_task(struct rq *rq, struct task_struct *p, u64 enq_flags,
 	 */
 	touch_core_sched(rq, p);
 	refill_task_slice_dfl(sch, p);
+
+	/*
+	 * Honor %SCX_OPS_ALWAYS_ENQ_IMMED for framework-internal local DSQ
+	 * enqueues (PF_EXITING, migration-disabled, !online fallbacks).
+	 * scx_vet_enq_flags() already handles this for BPF kfunc callers,
+	 * but the goto-local paths above bypass it.
+	 */
+	if ((sch->ops.flags & SCX_OPS_ALWAYS_ENQ_IMMED) &&
+	    dsq == &rq->scx.local_dsq)
+		enq_flags |= SCX_ENQ_IMMED;
+
 	dispatch_enqueue(sch, rq, dsq, p, enq_flags);
 }
--
2.43.0
* Re: [PATCH v2] sched_ext: Honor SCX_OPS_ALWAYS_ENQ_IMMED on framework-internal goto-local paths
  2026-04-20  3:56 [PATCH v2] sched_ext: Honor SCX_OPS_ALWAYS_ENQ_IMMED on framework-internal goto-local paths zhidao su
@ 2026-04-20  5:23 ` Andrea Righi
  2026-04-20  5:29   ` zhidao su
  0 siblings, 1 reply; 3+ messages in thread
From: Andrea Righi @ 2026-04-20 5:23 UTC (permalink / raw)
  To: zhidao su
  Cc: Tejun Heo, David Vernet, Changwoo Min, sched-ext, linux-kernel,
	zhidao su

Hi zhidao,

On Mon, Apr 20, 2026 at 11:56:46AM +0800, zhidao su wrote:
> SCX_OPS_ALWAYS_ENQ_IMMED promises that SCX_ENQ_IMMED is set on all local
> DSQ enqueues. scx_vet_enq_flags() enforces this for BPF kfunc callers
> (scx_bpf_dsq_insert, scx_bpf_dsq_move_*), but the framework-internal
> goto-local paths in do_enqueue_task() — PF_EXITING, migration-disabled,
> and !scx_rq_online fallbacks — bypass scx_vet_enq_flags() entirely.
>
> When a scheduler sets SCX_OPS_ALWAYS_ENQ_IMMED, tasks hitting these
> goto-local paths arrive at dispatch_enqueue() without SCX_ENQ_IMMED in
> enq_flags, violating the flag's documented semantics.
>
> This can be observed with trace_printk instrumentation at the enqueue:
> label while running a multi-threaded fork-exit workload under a scheduler
> with SCX_OPS_ALWAYS_ENQ_IMMED:
>
> Before (scx_simple + ALWAYS_ENQ_IMMED, 2 CPUs, mmap-contention exit):
>   95 PF_EXITING local enqueues, 95/95 IMMED=0 ALW=1        <-- bug
>
> After:
>   1030 PF_EXITING local enqueues, 1030/1030 IMMED=1 ALW=1  <-- fixed
>
> Fix by checking SCX_OPS_ALWAYS_ENQ_IMMED at the enqueue: label and
> setting SCX_ENQ_IMMED when dispatching to a local DSQ. This mirrors
> what scx_vet_enq_flags() does for BPF callers.
>
> Fixes: 3229ac4a5ef5 ("sched_ext: Add SCX_OPS_ALWAYS_ENQ_IMMED ops flag")
> Signed-off-by: zhidao su <suzhidao@xiaomi.com>
> ---
> v2: Resend to correct a submission error in v1 where unrelated files
>     were accidentally included in the patch. The code change is
>     identical; only kernel/sched/ext.c is modified. Apologies for
>     the noise.
> ---
>  kernel/sched/ext.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 9628c64e5592..0758f5e5a8f0 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -1859,7 +1859,9 @@ static void do_enqueue_task(struct rq *rq, struct task_struct *p, u64 enq_flags,
>  	 * Clear persistent TASK_IMMED for fresh enqueues, see dsq_inc_nr().
>  	 * Note that exiting and migration-disabled tasks that skip
>  	 * ops.enqueue() below will lose IMMED protection unless
> -	 * %SCX_OPS_ENQ_EXITING / %SCX_OPS_ENQ_MIGRATION_DISABLED are set.
> +	 * %SCX_OPS_ENQ_EXITING / %SCX_OPS_ENQ_MIGRATION_DISABLED are set,
> +	 * or %SCX_OPS_ALWAYS_ENQ_IMMED is enabled (which re-applies IMMED
> +	 * at the enqueue: label below).
>  	 */
>  	p->scx.flags &= ~SCX_TASK_IMMED;
>
> @@ -1949,6 +1951,17 @@ static void do_enqueue_task(struct rq *rq, struct task_struct *p, u64 enq_flags,
>  	 */
>  	touch_core_sched(rq, p);
>  	refill_task_slice_dfl(sch, p);
> +
> +	/*
> +	 * Honor %SCX_OPS_ALWAYS_ENQ_IMMED for framework-internal local DSQ
> +	 * enqueues (PF_EXITING, migration-disabled, !online fallbacks).
> +	 * scx_vet_enq_flags() already handles this for BPF kfunc callers,
> +	 * but the goto-local paths above bypass it.
> +	 */
> +	if ((sch->ops.flags & SCX_OPS_ALWAYS_ENQ_IMMED) &&
> +	    dsq == &rq->scx.local_dsq)
> +		enq_flags |= SCX_ENQ_IMMED;
> +

I'm not sure this should be applied across all fallback cases, it's probably
safer to avoid triggering re-enqueues for the internal events, especially
considering that the BPF scheduler doesn't have visibility into them.

If we do this we should also update %SCX_ENQ_IMMED documentation, that says:

 * Exiting and migration-disabled tasks bypass ops.enqueue() and
 * are placed directly on a local DSQ without IMMED protection
 * unless %SCX_OPS_ENQ_EXITING and %SCX_OPS_ENQ_MIGRATION_DISABLED
 * are set respectively.

But again, do we actually want to do this? If SCX_OPS_ENQ_EXITING and
SCX_OPS_ENQ_MIGRATION_DISABLED aren't set, these cases are handled
internally by the sched_ext core, so triggering a re-enqueue seems
unnecessary, as the BPF scheduler wouldn't have visibility of such
events anyway.

> 	dispatch_enqueue(sch, rq, dsq, p, enq_flags);
>  }
>
> --
> 2.43.0
>

Thanks,
-Andrea
* Re: [PATCH v2] sched_ext: Honor SCX_OPS_ALWAYS_ENQ_IMMED on framework-internal goto-local paths
  2026-04-20  5:23 ` Andrea Righi
@ 2026-04-20  5:29   ` zhidao su
  0 siblings, 0 replies; 3+ messages in thread
From: zhidao su @ 2026-04-20 5:29 UTC (permalink / raw)
  To: Andrea Righi
  Cc: Tejun Heo, David Vernet, Changwoo Min, sched-ext, linux-kernel

Hi Andrea,

Thank you for the careful review — you're right. I did not notice the
re-enqueue side effect; I will drop this patch and rethink the approach.
Apologies for the noise, including the v1 submission error that sent
unrelated files.

Thanks,
zhidao

> On Apr 20, 2026, at 13:23, Andrea Righi <arighi@nvidia.com> wrote:
>
> Hi zhidao,
>
> On Mon, Apr 20, 2026 at 11:56:46AM +0800, zhidao su wrote:
>> SCX_OPS_ALWAYS_ENQ_IMMED promises that SCX_ENQ_IMMED is set on all local
>> DSQ enqueues. scx_vet_enq_flags() enforces this for BPF kfunc callers
>> (scx_bpf_dsq_insert, scx_bpf_dsq_move_*), but the framework-internal
>> goto-local paths in do_enqueue_task() — PF_EXITING, migration-disabled,
>> and !scx_rq_online fallbacks — bypass scx_vet_enq_flags() entirely.
>>
>> When a scheduler sets SCX_OPS_ALWAYS_ENQ_IMMED, tasks hitting these
>> goto-local paths arrive at dispatch_enqueue() without SCX_ENQ_IMMED in
>> enq_flags, violating the flag's documented semantics.
>>
>> This can be observed with trace_printk instrumentation at the enqueue:
>> label while running a multi-threaded fork-exit workload under a scheduler
>> with SCX_OPS_ALWAYS_ENQ_IMMED:
>>
>> Before (scx_simple + ALWAYS_ENQ_IMMED, 2 CPUs, mmap-contention exit):
>>   95 PF_EXITING local enqueues, 95/95 IMMED=0 ALW=1        <-- bug
>>
>> After:
>>   1030 PF_EXITING local enqueues, 1030/1030 IMMED=1 ALW=1  <-- fixed
>>
>> Fix by checking SCX_OPS_ALWAYS_ENQ_IMMED at the enqueue: label and
>> setting SCX_ENQ_IMMED when dispatching to a local DSQ. This mirrors
>> what scx_vet_enq_flags() does for BPF callers.
>>
>> Fixes: 3229ac4a5ef5 ("sched_ext: Add SCX_OPS_ALWAYS_ENQ_IMMED ops flag")
>> Signed-off-by: zhidao su <suzhidao@xiaomi.com>
>> ---
>> v2: Resend to correct a submission error in v1 where unrelated files
>>     were accidentally included in the patch. The code change is
>>     identical; only kernel/sched/ext.c is modified. Apologies for
>>     the noise.
>> ---
>>  kernel/sched/ext.c | 15 ++++++++++++++-
>>  1 file changed, 14 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
>> index 9628c64e5592..0758f5e5a8f0 100644
>> --- a/kernel/sched/ext.c
>> +++ b/kernel/sched/ext.c
>> @@ -1859,7 +1859,9 @@ static void do_enqueue_task(struct rq *rq, struct task_struct *p, u64 enq_flags,
>>  	 * Clear persistent TASK_IMMED for fresh enqueues, see dsq_inc_nr().
>>  	 * Note that exiting and migration-disabled tasks that skip
>>  	 * ops.enqueue() below will lose IMMED protection unless
>> -	 * %SCX_OPS_ENQ_EXITING / %SCX_OPS_ENQ_MIGRATION_DISABLED are set.
>> +	 * %SCX_OPS_ENQ_EXITING / %SCX_OPS_ENQ_MIGRATION_DISABLED are set,
>> +	 * or %SCX_OPS_ALWAYS_ENQ_IMMED is enabled (which re-applies IMMED
>> +	 * at the enqueue: label below).
>>  	 */
>>  	p->scx.flags &= ~SCX_TASK_IMMED;
>>
>> @@ -1949,6 +1951,17 @@ static void do_enqueue_task(struct rq *rq, struct task_struct *p, u64 enq_flags,
>>  	 */
>>  	touch_core_sched(rq, p);
>>  	refill_task_slice_dfl(sch, p);
>> +
>> +	/*
>> +	 * Honor %SCX_OPS_ALWAYS_ENQ_IMMED for framework-internal local DSQ
>> +	 * enqueues (PF_EXITING, migration-disabled, !online fallbacks).
>> +	 * scx_vet_enq_flags() already handles this for BPF kfunc callers,
>> +	 * but the goto-local paths above bypass it.
>> +	 */
>> +	if ((sch->ops.flags & SCX_OPS_ALWAYS_ENQ_IMMED) &&
>> +	    dsq == &rq->scx.local_dsq)
>> +		enq_flags |= SCX_ENQ_IMMED;
>> +
>
> I'm not sure this should be applied across all fallback cases, it's probably
> safer to avoid triggering re-enqueues for the internal events, especially
> considering that the BPF scheduler doesn't have visibility into them.
>
> If we do this we should also update %SCX_ENQ_IMMED documentation, that says:
>
>  * Exiting and migration-disabled tasks bypass ops.enqueue() and
>  * are placed directly on a local DSQ without IMMED protection
>  * unless %SCX_OPS_ENQ_EXITING and %SCX_OPS_ENQ_MIGRATION_DISABLED
>  * are set respectively.
>
> But again, do we actually want to do this? If SCX_OPS_ENQ_EXITING and
> SCX_OPS_ENQ_MIGRATION_DISABLED aren't set, these cases are handled
> internally by the sched_ext core, so triggering a re-enqueue seems
> unnecessary, as the BPF scheduler wouldn't have visibility of such
> events anyway.
>
>> 	dispatch_enqueue(sch, rq, dsq, p, enq_flags);
>>  }
>>
>> --
>> 2.43.0
>>
>
> Thanks,
> -Andrea
end of thread, other threads:[~2026-04-20  5:30 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-04-20  3:56 [PATCH v2] sched_ext: Honor SCX_OPS_ALWAYS_ENQ_IMMED on framework-internal goto-local paths zhidao su
2026-04-20  5:23 ` Andrea Righi
2026-04-20  5:29   ` zhidao su