From: Andrea Righi <arighi@nvidia.com>
To: Tejun Heo <tj@kernel.org>
Cc: David Vernet <void@manifault.com>,
Changwoo Min <changwoo@igalia.com>,
Emil Tsalapatis <emil@etsalapatis.com>,
sched-ext@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] sched_ext: Fix NULL pointer deref and warnings during scx teardown
Date: Mon, 2 Feb 2026 19:54:50 +0100
Message-ID: <aYDy-tCqsH990lW9@gpd4>
In-Reply-To: <aYDaao9Xb_Bkv0NH@slm.duckdns.org>

On Mon, Feb 02, 2026 at 07:10:02AM -1000, Tejun Heo wrote:
> Hello,
>
> On Mon, Feb 02, 2026 at 04:13:41PM +0100, Andrea Righi wrote:
> > @@ -2619,6 +2619,9 @@ static void set_cpus_allowed_scx(struct task_struct *p,
> >
> > set_cpus_allowed_common(p, ac);
> >
> > + if (unlikely(!sch))
> > + return;
> > +
>
> I don't quite understand how this would happen. set_cpus_allowed_scx() is
> called from do_set_cpus_allowed() with task_rq locked. ie. the task *has* to
> be on sched_ext for it to be called. It's straightforward task rq lock
> synchronization, so there's no race window.
>
> Combined with the failures in switching_to_scx() and switched_from_scx(), I
> wonder whether what's actually broken is more something like the disable
> path missing some tasks?
>
> Thanks.
>
> --
> tejun

I'm able to reproduce the NULL pointer dereference in set_cpus_allowed_scx()
quite easily by running `stress-ng --race-sched 0` with an scx scheduler
that intentionally starves tasks, triggering a stall => disable.

I think this is what's happening:

       CPU0                                 CPU1
       ----                                 ----
 __sched_setscheduler()
   task_rq_lock(p)
   next_class = __setscheduler_class()
   // next_class is ext_sched_class
                                     scx_disable_workfn()
                                       scx_set_enable_state(SCX_DISABLING)
                                       scx_task_iter_start()
                                       while ((p = next())) {
                                         ...
                                         p->sched_class = fair_sched_class
                                         ...
                                       }
                                       scx_task_iter_stop()
                                       synchronize_rcu()
                                       RCU_INIT_POINTER(scx_root, NULL)
   scoped_guard(sched_change, ...) {
     p->sched_class = next_class;
     // next_class is still ext_sched_class,
     // overwriting fair_sched_class!
   }
   // Guard ends, calls sched_change_end()
   // switching_to_scx() called
   // scx_root == NULL => returns early
   task_rq_unlock(p)
 sched_setaffinity(p)
   set_cpus_allowed_scx()
     sch = scx_root; // scx_root == NULL => BUG!
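
To make the ordering easier to reason about, here is a minimal user-space
sketch that replays the same interleaving deterministically in a single
thread. It is not kernel code: every toy_* name is a hypothetical stand-in
(toy_root models scx_root, toy_pick_class() models the class decision made
by __setscheduler_class(), toy_disable() models the disable loop plus
clearing the root pointer), and locking and RCU are left out on purpose.
It only illustrates that a class chosen before the disable path runs can be
committed after it, leaving the task on the ext class with a NULL root.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only, not the kernel's types. */
enum toy_class { TOY_FAIR, TOY_EXT };

struct toy_task {
	enum toy_class sched_class;
};

struct toy_sched {
	int id;
};

/* Models scx_root: non-NULL while a scheduler is loaded. */
static struct toy_sched *toy_root;

/* CPU0, step 1: choose the next class based on the current root. */
static enum toy_class toy_pick_class(void)
{
	return toy_root ? TOY_EXT : TOY_FAIR;
}

/* CPU1: the disable path moves the task to fair, then clears the root. */
static void toy_disable(struct toy_task *p)
{
	p->sched_class = TOY_FAIR;
	toy_root = NULL;
}

/* CPU0, step 2: commit the class chosen earlier without re-checking it. */
static void toy_commit_class(struct toy_task *p, enum toy_class next_class)
{
	p->sched_class = next_class;
}

/* True if the task sits on the ext class while no scheduler is loaded. */
static bool toy_stranded(const struct toy_task *p)
{
	return p->sched_class == TOY_EXT && toy_root == NULL;
}

/* Racy order: pick the class, let the disable path run, then commit. */
bool toy_replay_race(void)
{
	struct toy_sched sched = { .id = 1 };
	struct toy_task task = { .sched_class = TOY_FAIR };
	enum toy_class next_class;

	toy_root = &sched;
	next_class = toy_pick_class();       /* decides TOY_EXT */
	toy_disable(&task);                  /* task -> fair, root -> NULL */
	toy_commit_class(&task, next_class); /* stale decision overwrites fair */
	return toy_stranded(&task);
}

/* Ordered variant: the disable path finishes before the class is picked. */
bool toy_replay_ordered(void)
{
	struct toy_sched sched = { .id = 1 };
	struct toy_task task = { .sched_class = TOY_FAIR };
	enum toy_class next_class;

	toy_root = &sched;
	toy_disable(&task);
	next_class = toy_pick_class();       /* root is NULL, decides TOY_FAIR */
	toy_commit_class(&task, next_class);
	return toy_stranded(&task);
}
```

Calling toy_replay_race() from a test harness returns true (the task ends
up on the ext class with a NULL root), while toy_replay_ordered() returns
false. In the sketch the only difference is whether the class decision is
made before or after the disable path runs.
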
-Andrea