All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrea Righi <arighi@nvidia.com>
To: zhidao su <soolaugust@gmail.com>
Cc: Tejun Heo <tj@kernel.org>, David Vernet <void@manifault.com>,
	Changwoo Min <changwoo@igalia.com>,
	sched-ext@lists.linux.dev, linux-kernel@vger.kernel.org,
	zhidao su <suzhidao@xiaomi.com>
Subject: Re: [PATCH v2] sched_ext: Honor SCX_OPS_ALWAYS_ENQ_IMMED on framework-internal goto-local paths
Date: Mon, 20 Apr 2026 07:23:30 +0200	[thread overview]
Message-ID: <aeW4UlU5Zi-XSd42@gpd4> (raw)
In-Reply-To: <20260420035646.1715762-1-suzhidao@xiaomi.com>

Hi zhidao,

On Mon, Apr 20, 2026 at 11:56:46AM +0800, zhidao su wrote:
> SCX_OPS_ALWAYS_ENQ_IMMED promises that SCX_ENQ_IMMED is set on all local
> DSQ enqueues. scx_vet_enq_flags() enforces this for BPF kfunc callers
> (scx_bpf_dsq_insert, scx_bpf_dsq_move_*), but the framework-internal
> goto-local paths in do_enqueue_task() — PF_EXITING, migration-disabled,
> and !scx_rq_online fallbacks — bypass scx_vet_enq_flags() entirely.
> 
> When a scheduler sets SCX_OPS_ALWAYS_ENQ_IMMED, tasks hitting these
> goto-local paths arrive at dispatch_enqueue() without SCX_ENQ_IMMED in
> enq_flags, violating the flag's documented semantics.
> 
> This can be observed with trace_printk instrumentation at the enqueue:
> label while running a multi-threaded fork-exit workload under a scheduler
> with SCX_OPS_ALWAYS_ENQ_IMMED:
> 
>   Before (scx_simple + ALWAYS_ENQ_IMMED, 2 CPUs, mmap-contention exit):
>     95 PF_EXITING local enqueues, 95/95 IMMED=0 ALW=1   <-- bug
> 
>   After:
>     1030 PF_EXITING local enqueues, 1030/1030 IMMED=1 ALW=1  <-- fixed
> 
> Fix by checking SCX_OPS_ALWAYS_ENQ_IMMED at the enqueue: label and
> setting SCX_ENQ_IMMED when dispatching to a local DSQ. This mirrors
> what scx_vet_enq_flags() does for BPF callers.
> 
> Fixes: 3229ac4a5ef5 ("sched_ext: Add SCX_OPS_ALWAYS_ENQ_IMMED ops flag")
> Signed-off-by: zhidao su <suzhidao@xiaomi.com>
> ---
> v2: Resend to correct a submission error in v1 where unrelated files
>     were accidentally included in the patch. The code change is
>     identical; only kernel/sched/ext.c is modified. Apologies for
>     the noise.
> ---
>  kernel/sched/ext.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 9628c64e5592..0758f5e5a8f0 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -1859,7 +1859,9 @@ static void do_enqueue_task(struct rq *rq, struct task_struct *p, u64 enq_flags,
>  	 * Clear persistent TASK_IMMED for fresh enqueues, see dsq_inc_nr().
>  	 * Note that exiting and migration-disabled tasks that skip
>  	 * ops.enqueue() below will lose IMMED protection unless
> -	 * %SCX_OPS_ENQ_EXITING / %SCX_OPS_ENQ_MIGRATION_DISABLED are set.
> +	 * %SCX_OPS_ENQ_EXITING / %SCX_OPS_ENQ_MIGRATION_DISABLED are set,
> +	 * or %SCX_OPS_ALWAYS_ENQ_IMMED is enabled (which re-applies IMMED
> +	 * at the enqueue: label below).
>  	 */
>  	p->scx.flags &= ~SCX_TASK_IMMED;
>  
> @@ -1949,6 +1951,17 @@ static void do_enqueue_task(struct rq *rq, struct task_struct *p, u64 enq_flags,
>  	 */
>  	touch_core_sched(rq, p);
>  	refill_task_slice_dfl(sch, p);
> +
> +	/*
> +	 * Honor %SCX_OPS_ALWAYS_ENQ_IMMED for framework-internal local DSQ
> +	 * enqueues (PF_EXITING, migration-disabled, !online fallbacks).
> +	 * scx_vet_enq_flags() already handles this for BPF kfunc callers,
> +	 * but the goto-local paths above bypass it.
> +	 */
> +	if ((sch->ops.flags & SCX_OPS_ALWAYS_ENQ_IMMED) &&
> +	    dsq == &rq->scx.local_dsq)
> +		enq_flags |= SCX_ENQ_IMMED;
> +

I'm not sure this should be applied across all fallback cases, it's probably
safer to avoid triggering re-enqueues for the internal events, especially
considering that the BPF scheduler doesn't have visibility into them.

If we do this we should also update %SCX_ENQ_IMMED documentation, that says:

 * Exiting and migration-disabled tasks bypass ops.enqueue() and
 * are placed directly on a local DSQ without IMMED protection
 * unless %SCX_OPS_ENQ_EXITING and %SCX_OPS_ENQ_MIGRATION_DISABLED
 * are set respectively.

But again, do we actually want to do this? If SCX_OPS_ENQ_EXITING and
SCX_OPS_ENQ_MIGRATION_DISABLED aren't set, these cases are handled internally by
the sched_ext core, so triggering a re-enqueue seems unnecessary, as the BPF
scheduler wouldn't have visibility of such events anyway.

>  	dispatch_enqueue(sch, rq, dsq, p, enq_flags);
>  }
>  
> -- 
> 2.43.0
> 

Thanks,
-Andrea

  reply	other threads:[~2026-04-20  5:23 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-20  3:56 [PATCH v2] sched_ext: Honor SCX_OPS_ALWAYS_ENQ_IMMED on framework-internal goto-local paths zhidao su
2026-04-20  5:23 ` Andrea Righi [this message]
2026-04-20  5:29   ` zhidao su

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aeW4UlU5Zi-XSd42@gpd4 \
    --to=arighi@nvidia.com \
    --cc=changwoo@igalia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sched-ext@lists.linux.dev \
    --cc=soolaugust@gmail.com \
    --cc=suzhidao@xiaomi.com \
    --cc=tj@kernel.org \
    --cc=void@manifault.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.