public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] sched_ext: Honor SCX_OPS_ALWAYS_ENQ_IMMED on framework-internal goto-local paths
@ 2026-04-20  3:56 zhidao su
  2026-04-20  5:23 ` Andrea Righi
  0 siblings, 1 reply; 3+ messages in thread
From: zhidao su @ 2026-04-20  3:56 UTC (permalink / raw)
  To: Tejun Heo, David Vernet
  Cc: Andrea Righi, Changwoo Min, sched-ext, linux-kernel, zhidao su

SCX_OPS_ALWAYS_ENQ_IMMED promises that SCX_ENQ_IMMED is set on all local
DSQ enqueues. scx_vet_enq_flags() enforces this for BPF kfunc callers
(scx_bpf_dsq_insert, scx_bpf_dsq_move_*), but the framework-internal
goto-local paths in do_enqueue_task() — PF_EXITING, migration-disabled,
and !scx_rq_online fallbacks — bypass scx_vet_enq_flags() entirely.

When a scheduler sets SCX_OPS_ALWAYS_ENQ_IMMED, tasks hitting these
goto-local paths arrive at dispatch_enqueue() without SCX_ENQ_IMMED in
enq_flags, violating the flag's documented semantics.

This can be observed with trace_printk instrumentation at the enqueue:
label while running a multi-threaded fork-exit workload under a scheduler
with SCX_OPS_ALWAYS_ENQ_IMMED:

  Before (scx_simple + ALWAYS_ENQ_IMMED, 2 CPUs, mmap-contention exit):
    95 PF_EXITING local enqueues, 95/95 IMMED=0 ALW=1   <-- bug

  After:
    1030 PF_EXITING local enqueues, 1030/1030 IMMED=1 ALW=1  <-- fixed

Fix by checking SCX_OPS_ALWAYS_ENQ_IMMED at the enqueue: label and
setting SCX_ENQ_IMMED when dispatching to a local DSQ. This mirrors
what scx_vet_enq_flags() does for BPF callers.

Fixes: 3229ac4a5ef5 ("sched_ext: Add SCX_OPS_ALWAYS_ENQ_IMMED ops flag")
Signed-off-by: zhidao su <suzhidao@xiaomi.com>
---
v2: Resend to correct a submission error in v1 where unrelated files
    were accidentally included in the patch. The code change is
    identical; only kernel/sched/ext.c is modified. Apologies for
    the noise.
---
 kernel/sched/ext.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 9628c64e5592..0758f5e5a8f0 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -1859,7 +1859,9 @@ static void do_enqueue_task(struct rq *rq, struct task_struct *p, u64 enq_flags,
 	 * Clear persistent TASK_IMMED for fresh enqueues, see dsq_inc_nr().
 	 * Note that exiting and migration-disabled tasks that skip
 	 * ops.enqueue() below will lose IMMED protection unless
-	 * %SCX_OPS_ENQ_EXITING / %SCX_OPS_ENQ_MIGRATION_DISABLED are set.
+	 * %SCX_OPS_ENQ_EXITING / %SCX_OPS_ENQ_MIGRATION_DISABLED are set,
+	 * or %SCX_OPS_ALWAYS_ENQ_IMMED is enabled (which re-applies IMMED
+	 * at the enqueue: label below).
 	 */
 	p->scx.flags &= ~SCX_TASK_IMMED;
 
@@ -1949,6 +1951,17 @@ static void do_enqueue_task(struct rq *rq, struct task_struct *p, u64 enq_flags,
 	 */
 	touch_core_sched(rq, p);
 	refill_task_slice_dfl(sch, p);
+
+	/*
+	 * Honor %SCX_OPS_ALWAYS_ENQ_IMMED for framework-internal local DSQ
+	 * enqueues (PF_EXITING, migration-disabled, !online fallbacks).
+	 * scx_vet_enq_flags() already handles this for BPF kfunc callers,
+	 * but the goto-local paths above bypass it.
+	 */
+	if ((sch->ops.flags & SCX_OPS_ALWAYS_ENQ_IMMED) &&
+	    dsq == &rq->scx.local_dsq)
+		enq_flags |= SCX_ENQ_IMMED;
+
 	dispatch_enqueue(sch, rq, dsq, p, enq_flags);
 }
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH v2] sched_ext: Honor SCX_OPS_ALWAYS_ENQ_IMMED on framework-internal goto-local paths
  2026-04-20  3:56 [PATCH v2] sched_ext: Honor SCX_OPS_ALWAYS_ENQ_IMMED on framework-internal goto-local paths zhidao su
@ 2026-04-20  5:23 ` Andrea Righi
  2026-04-20  5:29   ` zhidao su
  0 siblings, 1 reply; 3+ messages in thread
From: Andrea Righi @ 2026-04-20  5:23 UTC (permalink / raw)
  To: zhidao su
  Cc: Tejun Heo, David Vernet, Changwoo Min, sched-ext, linux-kernel,
	zhidao su

Hi zhidao,

On Mon, Apr 20, 2026 at 11:56:46AM +0800, zhidao su wrote:
> SCX_OPS_ALWAYS_ENQ_IMMED promises that SCX_ENQ_IMMED is set on all local
> DSQ enqueues. scx_vet_enq_flags() enforces this for BPF kfunc callers
> (scx_bpf_dsq_insert, scx_bpf_dsq_move_*), but the framework-internal
> goto-local paths in do_enqueue_task() — PF_EXITING, migration-disabled,
> and !scx_rq_online fallbacks — bypass scx_vet_enq_flags() entirely.
> 
> When a scheduler sets SCX_OPS_ALWAYS_ENQ_IMMED, tasks hitting these
> goto-local paths arrive at dispatch_enqueue() without SCX_ENQ_IMMED in
> enq_flags, violating the flag's documented semantics.
> 
> This can be observed with trace_printk instrumentation at the enqueue:
> label while running a multi-threaded fork-exit workload under a scheduler
> with SCX_OPS_ALWAYS_ENQ_IMMED:
> 
>   Before (scx_simple + ALWAYS_ENQ_IMMED, 2 CPUs, mmap-contention exit):
>     95 PF_EXITING local enqueues, 95/95 IMMED=0 ALW=1   <-- bug
> 
>   After:
>     1030 PF_EXITING local enqueues, 1030/1030 IMMED=1 ALW=1  <-- fixed
> 
> Fix by checking SCX_OPS_ALWAYS_ENQ_IMMED at the enqueue: label and
> setting SCX_ENQ_IMMED when dispatching to a local DSQ. This mirrors
> what scx_vet_enq_flags() does for BPF callers.
> 
> Fixes: 3229ac4a5ef5 ("sched_ext: Add SCX_OPS_ALWAYS_ENQ_IMMED ops flag")
> Signed-off-by: zhidao su <suzhidao@xiaomi.com>
> ---
> v2: Resend to correct a submission error in v1 where unrelated files
>     were accidentally included in the patch. The code change is
>     identical; only kernel/sched/ext.c is modified. Apologies for
>     the noise.
> ---
>  kernel/sched/ext.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 9628c64e5592..0758f5e5a8f0 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -1859,7 +1859,9 @@ static void do_enqueue_task(struct rq *rq, struct task_struct *p, u64 enq_flags,
>  	 * Clear persistent TASK_IMMED for fresh enqueues, see dsq_inc_nr().
>  	 * Note that exiting and migration-disabled tasks that skip
>  	 * ops.enqueue() below will lose IMMED protection unless
> -	 * %SCX_OPS_ENQ_EXITING / %SCX_OPS_ENQ_MIGRATION_DISABLED are set.
> +	 * %SCX_OPS_ENQ_EXITING / %SCX_OPS_ENQ_MIGRATION_DISABLED are set,
> +	 * or %SCX_OPS_ALWAYS_ENQ_IMMED is enabled (which re-applies IMMED
> +	 * at the enqueue: label below).
>  	 */
>  	p->scx.flags &= ~SCX_TASK_IMMED;
>  
> @@ -1949,6 +1951,17 @@ static void do_enqueue_task(struct rq *rq, struct task_struct *p, u64 enq_flags,
>  	 */
>  	touch_core_sched(rq, p);
>  	refill_task_slice_dfl(sch, p);
> +
> +	/*
> +	 * Honor %SCX_OPS_ALWAYS_ENQ_IMMED for framework-internal local DSQ
> +	 * enqueues (PF_EXITING, migration-disabled, !online fallbacks).
> +	 * scx_vet_enq_flags() already handles this for BPF kfunc callers,
> +	 * but the goto-local paths above bypass it.
> +	 */
> +	if ((sch->ops.flags & SCX_OPS_ALWAYS_ENQ_IMMED) &&
> +	    dsq == &rq->scx.local_dsq)
> +		enq_flags |= SCX_ENQ_IMMED;
> +

I'm not sure this should be applied across all fallback cases; it's probably
safer to avoid triggering re-enqueues for these internal events, especially
considering that the BPF scheduler doesn't have visibility into them.

If we do this, we should also update the %SCX_ENQ_IMMED documentation, which says:

 * Exiting and migration-disabled tasks bypass ops.enqueue() and
 * are placed directly on a local DSQ without IMMED protection
 * unless %SCX_OPS_ENQ_EXITING and %SCX_OPS_ENQ_MIGRATION_DISABLED
 * are set respectively.

But again, do we actually want to do this? If SCX_OPS_ENQ_EXITING and
SCX_OPS_ENQ_MIGRATION_DISABLED aren't set, these cases are handled internally by
the sched_ext core, so triggering a re-enqueue seems unnecessary, as the BPF
scheduler wouldn't have visibility into such events anyway.

>  	dispatch_enqueue(sch, rq, dsq, p, enq_flags);
>  }
>  
> -- 
> 2.43.0
> 

Thanks,
-Andrea

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v2] sched_ext: Honor SCX_OPS_ALWAYS_ENQ_IMMED on framework-internal goto-local paths
  2026-04-20  5:23 ` Andrea Righi
@ 2026-04-20  5:29   ` zhidao su
  0 siblings, 0 replies; 3+ messages in thread
From: zhidao su @ 2026-04-20  5:29 UTC (permalink / raw)
  To: Andrea Righi
  Cc: Tejun Heo, David Vernet, Changwoo Min, sched-ext, linux-kernel

Hi Andrea,

Thank you for the careful review; you're right.

I didn't notice the re-enqueue side effect. I'll drop this patch and rethink the approach.

Apologies for the noise, including the v1 submission error that sent unrelated files.

Thanks,
zhidao

> On Apr 20, 2026, at 13:23, Andrea Righi <arighi@nvidia.com> wrote:
> 
> Hi zhidao,
> 
> On Mon, Apr 20, 2026 at 11:56:46AM +0800, zhidao su wrote:
>> SCX_OPS_ALWAYS_ENQ_IMMED promises that SCX_ENQ_IMMED is set on all local
>> DSQ enqueues. scx_vet_enq_flags() enforces this for BPF kfunc callers
>> (scx_bpf_dsq_insert, scx_bpf_dsq_move_*), but the framework-internal
>> goto-local paths in do_enqueue_task() — PF_EXITING, migration-disabled,
>> and !scx_rq_online fallbacks — bypass scx_vet_enq_flags() entirely.
>> 
>> When a scheduler sets SCX_OPS_ALWAYS_ENQ_IMMED, tasks hitting these
>> goto-local paths arrive at dispatch_enqueue() without SCX_ENQ_IMMED in
>> enq_flags, violating the flag's documented semantics.
>> 
>> This can be observed with trace_printk instrumentation at the enqueue:
>> label while running a multi-threaded fork-exit workload under a scheduler
>> with SCX_OPS_ALWAYS_ENQ_IMMED:
>> 
>>  Before (scx_simple + ALWAYS_ENQ_IMMED, 2 CPUs, mmap-contention exit):
>>    95 PF_EXITING local enqueues, 95/95 IMMED=0 ALW=1   <-- bug
>> 
>>  After:
>>    1030 PF_EXITING local enqueues, 1030/1030 IMMED=1 ALW=1  <-- fixed
>> 
>> Fix by checking SCX_OPS_ALWAYS_ENQ_IMMED at the enqueue: label and
>> setting SCX_ENQ_IMMED when dispatching to a local DSQ. This mirrors
>> what scx_vet_enq_flags() does for BPF callers.
>> 
>> Fixes: 3229ac4a5ef5 ("sched_ext: Add SCX_OPS_ALWAYS_ENQ_IMMED ops flag")
>> Signed-off-by: zhidao su <suzhidao@xiaomi.com>
>> ---
>> v2: Resend to correct a submission error in v1 where unrelated files
>>    were accidentally included in the patch. The code change is
>>    identical; only kernel/sched/ext.c is modified. Apologies for
>>    the noise.
>> ---
>> kernel/sched/ext.c | 15 ++++++++++++++-
>> 1 file changed, 14 insertions(+), 1 deletion(-)
>> 
>> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
>> index 9628c64e5592..0758f5e5a8f0 100644
>> --- a/kernel/sched/ext.c
>> +++ b/kernel/sched/ext.c
>> @@ -1859,7 +1859,9 @@ static void do_enqueue_task(struct rq *rq, struct task_struct *p, u64 enq_flags,
>>  	 * Clear persistent TASK_IMMED for fresh enqueues, see dsq_inc_nr().
>>  	 * Note that exiting and migration-disabled tasks that skip
>>  	 * ops.enqueue() below will lose IMMED protection unless
>> -	 * %SCX_OPS_ENQ_EXITING / %SCX_OPS_ENQ_MIGRATION_DISABLED are set.
>> +	 * %SCX_OPS_ENQ_EXITING / %SCX_OPS_ENQ_MIGRATION_DISABLED are set,
>> +	 * or %SCX_OPS_ALWAYS_ENQ_IMMED is enabled (which re-applies IMMED
>> +	 * at the enqueue: label below).
>>  	 */
>>  	p->scx.flags &= ~SCX_TASK_IMMED;
>> 
>> @@ -1949,6 +1951,17 @@ static void do_enqueue_task(struct rq *rq, struct task_struct *p, u64 enq_flags,
>>  	 */
>>  	touch_core_sched(rq, p);
>>  	refill_task_slice_dfl(sch, p);
>> +
>> +	/*
>> +	 * Honor %SCX_OPS_ALWAYS_ENQ_IMMED for framework-internal local DSQ
>> +	 * enqueues (PF_EXITING, migration-disabled, !online fallbacks).
>> +	 * scx_vet_enq_flags() already handles this for BPF kfunc callers,
>> +	 * but the goto-local paths above bypass it.
>> +	 */
>> +	if ((sch->ops.flags & SCX_OPS_ALWAYS_ENQ_IMMED) &&
>> +	    dsq == &rq->scx.local_dsq)
>> +		enq_flags |= SCX_ENQ_IMMED;
>> +
> 
> I'm not sure this should be applied across all fallback cases; it's probably
> safer to avoid triggering re-enqueues for these internal events, especially
> considering that the BPF scheduler doesn't have visibility into them.
> 
> If we do this, we should also update the %SCX_ENQ_IMMED documentation, which says:
> 
> * Exiting and migration-disabled tasks bypass ops.enqueue() and
> * are placed directly on a local DSQ without IMMED protection
> * unless %SCX_OPS_ENQ_EXITING and %SCX_OPS_ENQ_MIGRATION_DISABLED
> * are set respectively.
> 
> But again, do we actually want to do this? If SCX_OPS_ENQ_EXITING and
> SCX_OPS_ENQ_MIGRATION_DISABLED aren't set, these cases are handled internally by
> the sched_ext core, so triggering a re-enqueue seems unnecessary, as the BPF
> scheduler wouldn't have visibility of such events anyway.
> 
>>  	dispatch_enqueue(sch, rq, dsq, p, enq_flags);
>>  }
>> 
>> -- 
>> 2.43.0
>> 
> 
> Thanks,
> -Andrea


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2026-04-20  5:30 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-04-20  3:56 [PATCH v2] sched_ext: Honor SCX_OPS_ALWAYS_ENQ_IMMED on framework-internal goto-local paths zhidao su
2026-04-20  5:23 ` Andrea Righi
2026-04-20  5:29   ` zhidao su

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox