From: Kuba Piecuch <jpiecuch@google.com>
To: Andrea Righi <arighi@nvidia.com>, Tejun Heo <tj@kernel.org>,
David Vernet <void@manifault.com>,
Changwoo Min <changwoo@igalia.com>
Cc: Kuba Piecuch <jpiecuch@google.com>,
Emil Tsalapatis <emil@etsalapatis.com>,
Christian Loehle <christian.loehle@arm.com>,
Daniel Hodges <hodgesd@meta.com>, <sched-ext@lists.linux.dev>,
<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2 sched_ext/for-7.1] sched_ext: Invalidate dispatch decisions on CPU affinity changes
Date: Thu, 19 Mar 2026 15:18:38 +0000
Message-ID: <DH6UY7YCS6LN.KMPOKET3PSF9@google.com>
In-Reply-To: <20260319083518.94673-1-arighi@nvidia.com>

Hi Andrea,

On Thu Mar 19, 2026 at 8:35 AM UTC, Andrea Righi wrote:
> A BPF scheduler may rely on p->cpus_ptr from ops.dispatch() to select a
> target CPU. However, task affinity can change between the dispatch
> decision and its finalization in finish_dispatch(). When this happens,
> the scheduler may attempt to dispatch a task to a CPU that is no longer
> allowed, resulting in fatal errors such as:
>
> EXIT: runtime error (SCX_DSQ_LOCAL[_ON] target CPU 10 not allowed for stress-ng-race-[13565])
>
> This race exists because ops.dispatch() runs without holding the task's
> run queue lock, allowing a concurrent set_cpus_allowed() to update
> p->cpus_ptr while the BPF scheduler is still using it. The dispatch is
> then finalized using stale affinity information.
>
> Example timeline:
>
>   CPU0                                       CPU1
>   ----                                       ----
>                                              task_rq_lock(p)
>   if (cpumask_test_cpu(cpu, p->cpus_ptr))
>                                              set_cpus_allowed_scx(p, new_mask)
>                                              task_rq_unlock(p)
>       scx_bpf_dsq_insert(p,
>                 SCX_DSQ_LOCAL_ON | cpu, 0)
>
> With commit ebf1ccff79c4 ("sched_ext: Fix ops.dequeue() semantics"), BPF
> schedulers can avoid the affinity race by tracking task state and
> handling %SCX_DEQ_SCHED_CHANGE in ops.dequeue(): when a task is dequeued
> due to a property change, the scheduler can update the task state and
> skip the direct dispatch from ops.dispatch() for non-queued tasks.
>
> However, schedulers that do not implement task state tracking and
> dispatch directly to a local DSQ directly from ops.dispatch() may
> trigger the scx_error() condition when the kernel validates the
> destination in dispatch_to_local_dsq().

The two paragraphs above mention "direct dispatch from ops.dispatch()"
and "dispatch directly to a local DSQ directly from ops.dispatch()".
My understanding is that a "direct dispatch" can only happen from
ops.select_cpu() or ops.enqueue(), not from ops.dispatch(). Is this just
an unfortunate choice of words?

Would "dispatch to a local DSQ" be a more accurate phrase here?

Thanks,
Kuba
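
For readers of the archive, the check-then-act race in the quoted
timeline can be modeled in plain user-space C. The sketch below is not
kernel or sched_ext code: struct model_task, the affinity_seq counter,
and every model_* helper are hypothetical names made up for
illustration. The sequence-counter check is only an assumption about how
a stale decision could be detected, not a description of what the patch
actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a task; not a kernel structure. */
struct model_task {
	uint64_t cpus_allowed;	/* bit N set => CPU N allowed (models p->cpus_ptr) */
	uint64_t affinity_seq;	/* assumed counter, bumped on every affinity change */
};

/* A dispatch decision captured at the time of the affinity check. */
struct model_decision {
	unsigned int cpu;	/* chosen target CPU */
	uint64_t seq_snapshot;	/* affinity_seq observed when deciding */
	bool allowed;		/* result of the affinity check at that time */
};

static bool model_cpu_allowed(const struct model_task *t, unsigned int cpu)
{
	assert(cpu < 64);	/* the model only covers 64 CPUs */
	return (t->cpus_allowed >> cpu) & 1;
}

/* Models the affinity update on CPU1: change the mask, bump the counter. */
static void model_set_affinity(struct model_task *t, uint64_t new_mask)
{
	t->cpus_allowed = new_mask;
	t->affinity_seq++;
}

/* Models the check on CPU0: test the mask and remember what was seen. */
static struct model_decision model_decide(const struct model_task *t,
					  unsigned int cpu)
{
	struct model_decision d = {
		.cpu = cpu,
		.seq_snapshot = t->affinity_seq,
		.allowed = model_cpu_allowed(t, cpu),
	};
	return d;
}

/*
 * Models finalization: returns true if the dispatch may proceed, false
 * if the decision is dropped because the affinity changed after the
 * check (or the CPU was never allowed).
 */
static bool model_finalize(const struct model_task *t,
			   const struct model_decision *d)
{
	if (!d->allowed)
		return false;
	if (d->seq_snapshot != t->affinity_seq)
		return false;	/* stale: affinity changed since the check */
	return model_cpu_allowed(t, d->cpu);
}
```

In this model, a decision taken for CPU 10 is accepted by
model_finalize() as long as no affinity change intervenes, and is
dropped rather than treated as an error once model_set_affinity() has
run in between.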