Re: [PATCH v2 sched_ext/for-7.1] sched_ext: Invalidate dispatch decisions on CPU affinity changes

From: Cheng-Yang Chou <yphbchou0911@gmail.com>
To: Kuba Piecuch <jpiecuch@google.com>
Cc: Tejun Heo <tj@kernel.org>, Andrea Righi <arighi@nvidia.com>,
	 David Vernet <void@manifault.com>,
	Changwoo Min <changwoo@igalia.com>,
	 Emil Tsalapatis <emil@etsalapatis.com>,
	Christian Loehle <christian.loehle@arm.com>,
	 Daniel Hodges <hodgesd@meta.com>,
	sched-ext@lists.linux.dev, linux-kernel@vger.kernel.org,
	 Ching-Chun Huang <jserv@ccns.ncku.edu.tw>,
	Chia-Ping Tsai <chia7712@gmail.com>
Subject: Re: [PATCH v2 sched_ext/for-7.1] sched_ext: Invalidate dispatch decisions on CPU affinity changes
Date: Sun, 26 Apr 2026 09:47:53 +0800
Message-ID: <20260426093756.Gd781@cchengyang.duckdns.org>
In-Reply-To: <DI0KLDKWJBOI.2LVQ249QGVJI8@google.com>

Hi Kuba,

On Thu, Apr 23, 2026 at 01:32:20PM +0000, Kuba Piecuch wrote:
> > On Mon, Mar 23, 2026 at 01:13:20PM -1000, Tejun Heo wrote:
> >> > The simple way to do this is to do scx_bpf_dsq_insert() at the very beginning,
> >> > once we know which task we would like to dispatch, and cancel the pending
> >> > dispatch via scx_bpf_dispatch_cancel() if any of the pre-dispatch checks fail
> >> > on the BPF side. This way, the "critical section" includes BPF-side checks, and
> >> > SCX will ignore the dispatch if there was a dequeue/enqueue racing with the
> >> > critical section.
> >> > 
> >> > With this solution, we can throw an error if task_can_run_on_remote_rq() is
> >> > false, because we know that there was no racing cpumask change (if there was,
> >> > it would have been caught earlier, in finish_dispatch()).
> >> 
> >> Yeah, I think this makes more sense. qseq is already there to provide
> >> protection against these events. It's just that the capturing of qseq is too
> >> late. If insert/cancel is too ugly, we can introduce another kfunc to
> >> capture the qseq - scx_bpf_dsq_insert_begin() or something like that - and
> >> stash it in a per-cpu variable. That way, qseq would cover the "current"
> >> queued instance and the existing qseq mechanism would be able to reliably
> >> ignore the ones that lost race to dequeue.
> >
> > Since this has been stale for a while, I prepared a patch to implement
> > scx_bpf_dsq_insert_begin() as suggested.
> 
> Thanks for creating the patch. A couple of thoughts:
> 
> 1. Do we have a use case that requires dsq_insert_begin() that isn't
>    satisfied using the "insert and then cancel if needed" approach?

IIUC, yes. scx_bpf_dispatch_cancel() is only registered in 
scx_kfunc_ids_dispatch, so it is only callable from ops.dispatch().
dsq_insert_begin(), on the other hand, is available from both
ops.enqueue() and ops.dispatch() (SCX_KF_ENQUEUE | SCX_KF_DISPATCH).
Since there is nothing to cancel in ops.enqueue(), the insert-and-cancel
approach simply doesn't work there.
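
To make the context restriction concrete, here is a rough userspace model
in C. It is not kernel code: the MODEL_KF_* bits, struct model_kfunc and
model_kfunc_allowed() are hypothetical names invented for this sketch.
Only the availability rules themselves (cancel from ops.dispatch() only,
dsq_insert_begin() from both ops.enqueue() and ops.dispatch()) are taken
from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical context bits for this model; not the kernel's values. */
enum model_kf_ctx {
	MODEL_KF_ENQUEUE  = 1u << 0,
	MODEL_KF_DISPATCH = 1u << 1,
};

struct model_kfunc {
	const char *name;
	unsigned int allowed;	/* contexts the kfunc may be called from */
};

/* Modeled as dispatch-only, as described for scx_kfunc_ids_dispatch. */
static const struct model_kfunc model_dispatch_cancel = {
	.name = "scx_bpf_dispatch_cancel",
	.allowed = MODEL_KF_DISPATCH,
};

/* Modeled as callable from both ops.enqueue() and ops.dispatch(). */
static const struct model_kfunc model_insert_begin = {
	.name = "scx_bpf_dsq_insert_begin",
	.allowed = MODEL_KF_ENQUEUE | MODEL_KF_DISPATCH,
};

/* True if the modeled kfunc may be called from context @ctx. */
static bool model_kfunc_allowed(const struct model_kfunc *kf,
				enum model_kf_ctx ctx)
{
	assert(kf != NULL);
	return (kf->allowed & ctx) != 0;
}
```

In this model, model_kfunc_allowed(&model_dispatch_cancel, MODEL_KF_ENQUEUE)
is false while the same query for model_insert_begin is true, which is the
gap that insert-and-cancel cannot close.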

> 
> 2. Do we want to restrict ourselves through the one qseq slot provided by
>    dsq_insert_begin()? The most flexible approach IMO would be to simply
>    allow BPF to read the qseq directly via a kfunc and then supply it to
>    dsq_insert() later. With this, we can have multiple qseqs saved at the
>    same time, and we can even pass them between CPUs, e.g. if one CPU
>    dequeues a task for a sibling CPU, but we want the checks to be made inside
>    the sibling's ops.dispatch() (I just made this use case up, it may not
>    be practical.)
>    That said, exposing an internal thing like qseq to BPF may be a step too far.

In his reply [1], Tejun suggested dsq_insert_begin() precisely to avoid
promoting qseq into the BPF ABI, which matches your own concern.
The single per-CPU slot is sufficient for the one-task-per-iteration
dispatch loops used by existing schedulers (e.g., scx_central).
If a concrete cross-CPU use case materializes later, we can always extend
dsq_insert() to accept an explicit qseq without breaking the current,
simpler path.

[1]: https://lore.kernel.org/all/acHJED4iAeytdC2l@slm.duckdns.org/
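
For illustration, a simplified userspace model of the single per-CPU slot
could look like the sketch below. It is not the patch: every model_* name
is hypothetical, and it assumes qseq is bumped on each enqueue and that a
dispatch is honored only if the task is still queued with the captured
qseq. The point is only to show why capturing qseq before the BPF-side
checks lets a racing dequeue/enqueue be detected, while capturing it at
insert time does not.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Hypothetical stand-in for a task; qseq changes on every enqueue. */
struct model_task {
	unsigned long qseq;
	bool queued;
};

/* One stashed (task, qseq) pair per CPU: the "single slot". */
struct model_slot {
	const struct model_task *task;
	unsigned long qseq;
	bool valid;
};

static struct model_slot model_slots[MODEL_NR_CPUS];

static void model_enqueue(struct model_task *p)
{
	p->qseq++;
	p->queued = true;
}

static void model_dequeue(struct model_task *p)
{
	p->queued = false;
}

/* Models dsq_insert_begin(): capture qseq before the BPF-side checks. */
static void model_insert_begin(int cpu, const struct model_task *p)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	model_slots[cpu] = (struct model_slot){
		.task = p, .qseq = p->qseq, .valid = true,
	};
}

/*
 * Models insert plus the later validity check. Uses the stashed qseq if
 * this CPU called model_insert_begin() for @p; otherwise captures qseq
 * now ("late capture"). Returns true if the dispatch would be honored.
 */
static bool model_insert_finish(int cpu, const struct model_task *p)
{
	struct model_slot *slot;
	unsigned long captured;

	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	slot = &model_slots[cpu];
	captured = (slot->valid && slot->task == p) ? slot->qseq : p->qseq;
	slot->valid = false;	/* the slot is consumed by one insert */

	return p->queued && captured == p->qseq;
}
```

If a task is dequeued and re-enqueued between model_insert_begin() and
model_insert_finish(), the stashed qseq no longer matches and the dispatch
is ignored. Without the begin call, the same race goes unnoticed because
the qseq is read only after it has already changed.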

>    Let me know what you think.
> 

Please correct me if I'm missing something, thanks! ^0^

-- 
Cheers,
Cheng-Yang
