All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kuba Piecuch <jpiecuch@google.com>
To: Andrea Righi <arighi@nvidia.com>, Kuba Piecuch <jpiecuch@google.com>
Cc: Tejun Heo <tj@kernel.org>, David Vernet <void@manifault.com>,
	 Changwoo Min <changwoo@igalia.com>,
	Emil Tsalapatis <emil@etsalapatis.com>,
	 Christian Loehle <christian.loehle@arm.com>,
	Daniel Hodges <hodgesd@meta.com>, <sched-ext@lists.linux.dev>,
	 <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2 sched_ext/for-7.1] sched_ext: Invalidate dispatch decisions on CPU affinity changes
Date: Fri, 20 Mar 2026 09:18:20 +0000	[thread overview]
Message-ID: <DH7HWWN0HZQM.1ZSKEH89LMOKQ@google.com> (raw)
In-Reply-To: <abxl-xw7nt1jp5qT@gpd4>

On Thu Mar 19, 2026 at 9:09 PM UTC, Andrea Righi wrote:
> On Thu, Mar 19, 2026 at 10:31:30AM +0000, Kuba Piecuch wrote:
>> On Thu Mar 19, 2026 at 8:35 AM UTC, Andrea Righi wrote:
>> > @@ -2537,9 +2546,26 @@ static void dispatch_to_local_dsq(struct scx_sched *sch, struct rq *rq,
>> >  	}
>> >  
>> >  	if (src_rq != dst_rq &&
>> > -	    unlikely(!task_can_run_on_remote_rq(sch, p, dst_rq, true))) {
>> > -		dispatch_enqueue(sch, rq, find_global_dsq(sch, task_cpu(p)), p,
>> > -				 enq_flags | SCX_ENQ_CLEAR_OPSS | SCX_ENQ_GDSQ_FALLBACK);
>> > +	    unlikely(!task_can_run_on_remote_rq(sch, p, dst_rq, false))) {
>> > +		/*
>> > +		 * Affinity changed after dispatch decision and the task
>> > +		 * can't run anymore on the destination rq.
>> 
>> More of a nitpick, but this doesn't necessarily mean that the affinity changed.
>> The scheduler could have also issued an invalid dispatch to a CPU outside of
>> the task's cpumask (e.g. due to a bug), in which case the task won't be
>> re-enqueued if we simply drop the dispatch, correct?
>
> That's right, the scheduler could have issues an invalid dispatch and in
> that case we would just drop the task on the floor, which is not really
> nice, it'd be better to immediately error in this case. And we don't need
> the global DSQ fallback, since we're erroring anyway.
>
> I need to rethink this part...

The fundamental problem here is differentiating between buggy dispatches that
should have never been issued and dispatches that were valid at the moment
the BPF scheduler was preparing the task for dispatch, but became invalid due
to racing cpumask changes.

The crucial observation is that SCX will only detect racing dequeues/enqueues
if they race with the window between scx_bpf_dsq_insert() and finish_dispatch().
That's because scx_bpf_dsq_insert() stores a snapshot of the task's current
qseq value, which is compared in finish_dispatch().

The BPF-side cpumask checks traditionally happen outside of this window, making
finish_dispatch() incapable of detecting cpumask changes that raced with the
BPF-side check but happened strictly before scx_bpf_dsq_insert().

To resolve this, we need to extend the race detection window so that it
includes the BPF-side checks.

The simple way to do this is to do scx_bpf_dsq_insert() at the very beginning,
once we know which task we would like to dispatch, and cancel the pending
dispatch via scx_bpf_dispatch_cancel() if any of the pre-dispatch checks fail
on the BPF side. This way, the "critical section" includes BPF-side checks, and
SCX will ignore the dispatch if there was a dequeue/enqueue racing with the
critical section.

With this solution, we can throw an error if task_can_run_on_remote_rq() is
false, because we know that there was no racing cpumask change (if there was,
it would have been caught earlier, in finish_dispatch()).


  reply	other threads:[~2026-03-20  9:18 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-19  8:35 [PATCH v2 sched_ext/for-7.1] sched_ext: Invalidate dispatch decisions on CPU affinity changes Andrea Righi
2026-03-19 10:31 ` Kuba Piecuch
2026-03-19 13:54   ` Kuba Piecuch
2026-03-19 21:09   ` Andrea Righi
2026-03-20  9:18     ` Kuba Piecuch [this message]
2026-03-23 23:13       ` Tejun Heo
2026-04-22  6:33         ` Cheng-Yang Chou
2026-04-22 11:02           ` Andrea Righi
2026-04-23 13:32           ` Kuba Piecuch
2026-04-26  1:47             ` Cheng-Yang Chou
2026-04-27  9:06               ` Kuba Piecuch
2026-05-01 16:19                 ` Cheng-Yang Chou
2026-05-04  8:00                   ` Kuba Piecuch
2026-05-04 21:24                     ` Tejun Heo
2026-05-04 21:58                       ` Andrea Righi
2026-05-05  8:35                         ` Cheng-Yang Chou
2026-05-05  8:01                       ` Kuba Piecuch
2026-05-05  8:31                         ` Tejun Heo
2026-05-05  9:13                           ` Kuba Piecuch
2026-05-05 15:14                             ` Tejun Heo
2026-05-05 15:58                           ` Cheng-Yang Chou
2026-03-19 15:18 ` Kuba Piecuch
2026-03-19 19:01   ` Andrea Righi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DH7HWWN0HZQM.1ZSKEH89LMOKQ@google.com \
    --to=jpiecuch@google.com \
    --cc=arighi@nvidia.com \
    --cc=changwoo@igalia.com \
    --cc=christian.loehle@arm.com \
    --cc=emil@etsalapatis.com \
    --cc=hodgesd@meta.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sched-ext@lists.linux.dev \
    --cc=tj@kernel.org \
    --cc=void@manifault.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.