From: Kuba Piecuch <jpiecuch@google.com>
To: Cheng-Yang Chou <yphbchou0911@gmail.com>,
	Kuba Piecuch <jpiecuch@google.com>
Cc: Tejun Heo <tj@kernel.org>, Andrea Righi <arighi@nvidia.com>,
	 David Vernet <void@manifault.com>,
	Changwoo Min <changwoo@igalia.com>,
	 Emil Tsalapatis <emil@etsalapatis.com>,
	Christian Loehle <christian.loehle@arm.com>,
	 Daniel Hodges <hodgesd@meta.com>, <sched-ext@lists.linux.dev>,
	 <linux-kernel@vger.kernel.org>,
	Ching-Chun Huang <jserv@ccns.ncku.edu.tw>,
	 Chia-Ping Tsai <chia7712@gmail.com>
Subject: Re: [PATCH v2 sched_ext/for-7.1] sched_ext: Invalidate dispatch decisions on CPU affinity changes
Date: Mon, 04 May 2026 08:00:50 +0000	[thread overview]
Message-ID: <DI9QFXIX3Z0R.23PU1FB6DEMPS@google.com> (raw)
In-Reply-To: <20260502000039.Ga94c@cchengyang.duckdns.org>

Hi Cheng-Yang,

On Fri May 1, 2026 at 4:19 PM UTC, Cheng-Yang Chou wrote:
>> >> 2. Do we want to restrict ourselves through the one qseq slot provided by
>> >>    dsq_insert_begin()? The most flexible approach IMO would be to simply
>> >>    allow BPF to read the qseq directly via a kfunc and then supply it to
>> >>    dsq_insert() later. With this, we can have multiple qseqs saved at the
>> >>    same time, and we can even pass them between CPUs, e.g. if one CPU
>> >>    dequeues a task for a sibling CPU, but we want the checks to be made inside
>> >>    the sibling's ops.dispatch(). (I just made this use case up; it may
>> >>    not be practical.)
>> >>    That said, exposing an internal thing like qseq to BPF may be a step too far.
>> >
>> > In Tejun's reply back in [1], he suggested dsq_insert_begin() precisely
>> > to avoid promoting qseq into the BPF ABI — which matches your own concern.
>> > The single per-CPU slot is sufficient for the one-task-per-iteration
>> > dispatch loops used by existing schedulers (e.g., scx_central).
>> > If a concrete cross-CPU use case materializes later, we can always extend
>> > dsq_insert() to accept an explicit qseq without breaking the current,
>> > simpler path.
>> >
>> > [1]: https://lore.kernel.org/all/acHJED4iAeytdC2l@slm.duckdns.org/
>> >
>> 
>> Well, Tejun doesn't explicitly say there that he's against exposing qseq, but
>> I won't be surprised if he is.
>> 
>> FWIW, ghOSt (our Google-internal BPF scheduling solution) uses exactly this
>> approach to guard the dispatch path against racing dequeues/enqueues.
>> Every task has a seqnum that gets incremented on each "event" pertaining to
>> the task. In the dispatch path, the BPF scheduler reads the task seqnum,
>> does whatever checks it needs to do, and passes the seqnum to ghOSt at the end.
>> 
>> Admittedly, what works downstream doesn't have to work upstream, but I still
>> wanted to provide this data point :-)
>
> The ghOSt data point is appreciated. If a concrete use case emerges where
> the single-slot approach falls short, extending dsq_insert() to accept an
> explicit qseq seems like a natural next step.
>
> Tejun, Andrea, sched-ext folks, any preferences?

Random thought: If exposing qseq values to BPF directly is undesirable, then
perhaps a less objectionable approach would be to expose them as opaque
cookie/token values? Same semantics, but fewer SCX internals leaking to BPF.

Thanks,
Kuba


Thread overview: 23+ messages
2026-03-19  8:35 [PATCH v2 sched_ext/for-7.1] sched_ext: Invalidate dispatch decisions on CPU affinity changes Andrea Righi
2026-03-19 10:31 ` Kuba Piecuch
2026-03-19 13:54   ` Kuba Piecuch
2026-03-19 21:09   ` Andrea Righi
2026-03-20  9:18     ` Kuba Piecuch
2026-03-23 23:13       ` Tejun Heo
2026-04-22  6:33         ` Cheng-Yang Chou
2026-04-22 11:02           ` Andrea Righi
2026-04-23 13:32           ` Kuba Piecuch
2026-04-26  1:47             ` Cheng-Yang Chou
2026-04-27  9:06               ` Kuba Piecuch
2026-05-01 16:19                 ` Cheng-Yang Chou
2026-05-04  8:00                   ` Kuba Piecuch [this message]
2026-05-04 21:24                     ` Tejun Heo
2026-05-04 21:58                       ` Andrea Righi
2026-05-05  8:35                         ` Cheng-Yang Chou
2026-05-05  8:01                       ` Kuba Piecuch
2026-05-05  8:31                         ` Tejun Heo
2026-05-05  9:13                           ` Kuba Piecuch
2026-05-05 15:14                             ` Tejun Heo
2026-05-05 15:58                           ` Cheng-Yang Chou
2026-03-19 15:18 ` Kuba Piecuch
2026-03-19 19:01   ` Andrea Righi
