From: Daniel Hodges <hodgesd@meta.com>
To: Andrea Righi <arighi@nvidia.com>
Cc: <tj@kernel.org>, <void@manifault.com>, <changwoo@igalia.com>,
	<sched-ext@lists.linux.dev>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] sched_ext: Clear direct dispatch state on dequeue when dsq is NULL
Date: Wed, 21 Jan 2026 16:31:02 -0800
Message-ID: <aXFukGjN0F7W3Hoa@fb.com>
In-Reply-To: <aXFA4-b4WLPjxCME@gpd4>

On Wed, Jan 21, 2026 at 10:10:59PM +0100, Andrea Righi wrote:
> Hi Daniel,
> 
> On Wed, Jan 21, 2026 at 07:56:02AM -0800, Daniel Hodges wrote:
> > When a task is direct-dispatched from ops.select_cpu() or ops.enqueue(),
> > ddsp_dsq_id is set to indicate the target DSQ. If the task is dequeued
> > before dispatch_enqueue() completes (e.g., task killed or receives a
> > signal), dispatch_dequeue() is called with dsq == NULL.
> > 
> > In this case, the task is unlinked from ddsp_deferred_locals and
> > holding_cpu is cleared, but ddsp_dsq_id and ddsp_enq_flags are left
> > stale. On the next wakeup, when ops.select_cpu() calls
> > scx_bpf_dsq_insert(), mark_direct_dispatch() finds ddsp_dsq_id already
> > set and triggers:
> > 
> >   WARNING: CPU: 56 PID: 2323042 at kernel/sched/ext.c:2157
> >            scx_bpf_dsq_insert+0x16b/0x1d0
> > 
> > Fix this by clearing ddsp_dsq_id and ddsp_enq_flags in dispatch_dequeue()
> > when dsq is NULL, ensuring clean state for subsequent wakeups.
> 
> I've tried to fix this a while ago (same as this, right?
> https://github.com/sched-ext/scx/issues/2758), I remember that I applied
> exactly the same patch, but I was still able to trigger the warning.
> 
> IIRC there's also a race in ttwu_queue_wakelist tasks and
> sched_setscheduler() that can hit the stale ddsp_dsq_id (maybe other
> cases).

I figured there were probably other paths where this could race.
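
To make the sequence from the patch description concrete, here is a rough
user-space model of it. This is only a sketch, not the kernel code: struct
task_model, MODEL_DSQ_INVALID and every model_*() helper are hypothetical
names made up for illustration, and only the ddsp_dsq_id and ddsp_enq_flags
field names come from the patch. It assumes the warning fires whenever
ddsp_dsq_id is already set at the time a direct dispatch is marked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the "no DSQ recorded" sentinel. */
#define MODEL_DSQ_INVALID UINT64_MAX

/* Hypothetical model of the per-task direct dispatch state. */
struct task_model {
	uint64_t ddsp_dsq_id;
	uint64_t ddsp_enq_flags;
};

/*
 * Models the check described for mark_direct_dispatch(): returns false
 * where the kernel would warn because a DSQ id is already recorded.
 */
static bool model_mark_direct_dispatch(struct task_model *p,
				       uint64_t dsq_id, uint64_t enq_flags)
{
	if (p->ddsp_dsq_id != MODEL_DSQ_INVALID)
		return false;

	p->ddsp_dsq_id = dsq_id;
	p->ddsp_enq_flags = enq_flags;
	return true;
}

/*
 * Models the dequeue path taken when dsq == NULL. With clear_ddsp set,
 * the direct dispatch state is reset, as the patch proposes.
 */
static void model_dequeue_no_dsq(struct task_model *p, bool clear_ddsp)
{
	if (clear_ddsp) {
		p->ddsp_dsq_id = MODEL_DSQ_INVALID;
		p->ddsp_enq_flags = 0;
	}
}

/*
 * Direct dispatch, dequeue before the dispatch completes, then a second
 * direct dispatch on the next wakeup. Returns true if the second mark
 * succeeds, i.e. no warning would be triggered in this model.
 */
bool model_wakeup_after_dequeue(bool clear_ddsp)
{
	struct task_model p = {
		.ddsp_dsq_id = MODEL_DSQ_INVALID,
		.ddsp_enq_flags = 0,
	};
	bool first = model_mark_direct_dispatch(&p, 1, 0);

	/* The first mark on a clean task must always succeed. */
	assert(first);

	model_dequeue_no_dsq(&p, clear_ddsp);
	return model_mark_direct_dispatch(&p, 1, 0);
}
```

In this model, model_wakeup_after_dequeue(false) returns false (the stale id
trips the check) and model_wakeup_after_dequeue(true) returns true. It only
covers the dequeue path from the patch, so it says nothing about the other
races mentioned above.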


> Long story short, the only thing that was working reliably for me was to
> clear ddsp_dsq_id and ddsp_enq_flags in select_task_rq_scx(), but I thought
> it was a bit too overkill and then I've never finished to investigate the
> real issue...
> 
> In conclusion, I think this is fixing some of these warnings that we see
> and it's probably good to apply it, but it's not fixing all of them.
> 
> Anyway, I'll do some tests with this patch and report back!
> 
> Thanks,
> -Andrea

Sounds good. I hit this while running cosmos on a moderately loaded machine.
I'll see if I can put together a reproducer and do some more testing.

Thread overview (4+ messages):
2026-01-21 15:56 [PATCH] sched_ext: Clear direct dispatch state on dequeue when dsq is NULL Daniel Hodges
2026-01-21 21:10 ` Andrea Righi
2026-01-22  0:31   ` Daniel Hodges [this message]
2026-01-28 10:53     ` Andrea Righi
