From: Andrea Righi <arighi@nvidia.com>
To: Samuele Mariotti <smariotti@disroot.org>
Cc: Tejun Heo <tj@kernel.org>,
void@manifault.com, changwoo@igalia.com,
sched-ext@lists.linux.dev, linux-kernel@vger.kernel.org,
Paolo Valente <paolo.valente@unimore.it>
Subject: Re: [PATCH] sched_ext: Fix spurious WARN on stale ops_state in ops_dequeue()
Date: Thu, 14 May 2026 22:08:53 +0200
Message-ID: <agYr1RbafGUevfOH@gpd4>
In-Reply-To: <agV8-RlKEkiAbd-h@cachyos>

Hi Samuele,

On Thu, May 14, 2026 at 11:13:49AM +0200, Samuele Mariotti wrote:
> Hi Tejun,
>
> > Let's not do the WARN and exit. We shouldn't get this wrong and if we get
> > this wrong, it's going to be obvious from lockup detectors. Can you please
> > add a comment explaining the retry condition tho?
> >
> > Thanks.
> >
> > --
> > tejun
>
> Thanks for the feedback. If I understood correctly, you prefer no retry
> limit, letting the lockup detectors catch any real bug. I also added
> unlikely() since the stale case is by definition rare.
>
> Here is the updated version:
>
> /*
>  * If SCX_TASK_IN_CUSTODY is not set, opss is stale: finish_dispatch()
>  * has already claimed the task and cleared SCX_TASK_IN_CUSTODY. Retry
>  * to get a fresh view of p->scx.ops_state.
>  */
> if (unlikely(!(READ_ONCE(p->scx.flags) & SCX_TASK_IN_CUSTODY))) {
> 	cpu_relax();
> 	goto retry;
> }

The code looks good to me. I'd elaborate a bit more in the comment to make it
clear that the retry loop is guaranteed to terminate (i.e. it is not a
deadlock).

How about this (or something along these lines)?

	/*
	 * A queued task must be in the BPF scheduler's custody. If
	 * SCX_TASK_IN_CUSTODY is clear, finish_dispatch() on another
	 * CPU has already passed call_task_dequeue() (which clears the
	 * flag), but has not yet written SCX_OPSS_NONE. That final
	 * store does not require this rq's lock, so retrying with
	 * cpu_relax() is bounded: we'll observe NONE (or DISPATCHING,
	 * handled by the fallthrough) on a subsequent iteration.
	 */
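To illustrate the termination argument, here is a rough single-threaded model
of the window described in the comment above. It is only a sketch, not kernel
code: struct task_model, relax_and_advance(), dequeue_model() and
retries_for_remote_delay() are hypothetical names, the DISPATCHING state is
left out, and the "other CPU" is modelled as a tick counter that advances
every time the dequeue side relaxes. The only assumption it encodes is the one
stated in the comment: the final store of NONE never waits on the spinning CPU.

```c
#include <assert.h>

/* Hypothetical stand-ins for SCX_OPSS_* and SCX_TASK_IN_CUSTODY. */
enum opss_model { OPSS_NONE, OPSS_QUEUED };
#define IN_CUSTODY 0x1u

struct task_model {
	enum opss_model ops_state;	/* models p->scx.ops_state */
	unsigned int flags;		/* models p->scx.flags */
	unsigned int remote_ticks_left;	/* ticks until the remote stores NONE */
};

/*
 * Stand-in for cpu_relax(): the modelled remote CPU makes progress on
 * its own and performs its final store once its ticks run out. It never
 * needs anything from the CPU that is spinning.
 */
static void relax_and_advance(struct task_model *t)
{
	if (t->remote_ticks_left > 0)
		t->remote_ticks_left--;
	if (t->remote_ticks_left == 0)
		t->ops_state = OPSS_NONE;
}

/* Returns how many retries were needed before a consistent view was seen. */
static int dequeue_model(struct task_model *t)
{
	int retries = 0;

	for (;;) {
		/* Re-read the state on every pass, like "goto retry". */
		enum opss_model opss = t->ops_state;

		if (opss == OPSS_NONE)
			return retries;

		/* QUEUED but not in custody: stale view, retry. */
		if (!(t->flags & IN_CUSTODY)) {
			relax_and_advance(t);
			retries++;
			continue;
		}

		/* QUEUED and in custody: normal dequeue path, not modelled. */
		return retries;
	}
}

/*
 * Hypothetical driver: the remote side has already cleared the custody
 * flag and needs `ticks` more ticks before it stores NONE.
 */
static int retries_for_remote_delay(unsigned int ticks)
{
	struct task_model t = {
		.ops_state = OPSS_QUEUED,
		.flags = 0,
		.remote_ticks_left = ticks,
	};
	int retries = dequeue_model(&t);

	assert(t.ops_state == OPSS_NONE);
	return retries;
}
```

In this model retries_for_remote_delay(3) returns 3: the number of retries is
bounded by the time the remote side needs to reach its final store, which is
the property the comment is trying to spell out.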

Thanks,
-Andrea