* [PATCH sched_ext/for-7.0-fixes] sched_ext: Disable preemption between scx_claim_exit() and kicking helper work
From: Tejun Heo @ 2026-02-25 5:00 UTC
To: linux-kernel, sched-ext; +Cc: void, arighi, changwoo, emil, stable, Tejun Heo
scx_claim_exit() atomically sets exit_kind, which prevents scx_error() from
triggering further error handling. After claiming exit, the caller must kick
the helper kthread work which initiates bypass mode and teardown.
If the calling task gets preempted between claiming exit and kicking the
helper work, and the BPF scheduler fails to schedule it back (since error
handling is now disabled), the helper work is never queued, bypass mode
never activates, tasks stop being dispatched, and the system wedges.
Disable preemption across scx_claim_exit() and the subsequent work kicking
in all callers - scx_disable() and scx_vexit(). Add
lockdep_assert_preemption_disabled() to scx_claim_exit() to enforce the
requirement.
Fixes: a69040ed57f5 ("sched_ext: Simplify breather mechanism with scx_aborting flag")
Cc: stable@vger.kernel.org # v6.19+
Signed-off-by: Tejun Heo <tj@kernel.org>
---
kernel/sched/ext.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index c18e81e8ef51..9280381f8923 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -4423,10 +4423,19 @@ static void scx_disable_workfn(struct kthread_work *work)
scx_bypass(false);
}
+/*
+ * Claim the exit on @sch. The caller must ensure that the helper kthread work
+ * is kicked before the current task can be preempted. Once exit_kind is
+ * claimed, scx_error() can no longer trigger, so if the current task gets
+ * preempted and the BPF scheduler fails to schedule it back, the helper work
+ * will never be kicked and the whole system can wedge.
+ */
static bool scx_claim_exit(struct scx_sched *sch, enum scx_exit_kind kind)
{
int none = SCX_EXIT_NONE;
+ lockdep_assert_preemption_disabled();
+
if (!atomic_try_cmpxchg(&sch->exit_kind, &none, kind))
return false;
@@ -4449,6 +4458,7 @@ static void scx_disable(enum scx_exit_kind kind)
rcu_read_lock();
sch = rcu_dereference(scx_root);
if (sch) {
+ guard(preempt)();
scx_claim_exit(sch, kind);
kthread_queue_work(sch->helper, &sch->disable_work);
}
@@ -4771,6 +4781,8 @@ static bool scx_vexit(struct scx_sched *sch,
{
struct scx_exit_info *ei = sch->exit_info;
+ guard(preempt)();
+
if (!scx_claim_exit(sch, kind))
return false;
--
2.53.0
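The commit message relies on scx_claim_exit() being a claim-once operation: only the first caller moves exit_kind away from SCX_EXIT_NONE, and every later caller backs off. The following is a minimal user-space sketch of that pattern using C11 atomics, not the kernel code. `struct sched_model`, `claim_exit()`, `report_exit()`, the `EXIT_*` values and the `work_queued` counter are all hypothetical stand-ins, and the sketch does not model preemption at all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's exit kinds; values are illustrative. */
enum exit_kind { EXIT_NONE = 0, EXIT_UNREG = 1, EXIT_ERROR = 2 };

struct sched_model {
	atomic_int exit_kind;	/* stays EXIT_NONE until someone claims the exit */
	int work_queued;	/* models the helper work having been kicked */
};

/* Only the first caller succeeds in moving exit_kind away from EXIT_NONE. */
static bool claim_exit(struct sched_model *s, enum exit_kind kind)
{
	int none = EXIT_NONE;

	return atomic_compare_exchange_strong(&s->exit_kind, &none, kind);
}

/* Models an exit path: claim first, then kick the helper work. */
static bool report_exit(struct sched_model *s, enum exit_kind kind)
{
	if (!claim_exit(s, kind))
		return false;	/* already claimed, so this caller does nothing */

	/*
	 * In the kernel, this is the window the patch closes: the claim has
	 * succeeded but the work has not been queued yet.
	 */
	s->work_queued++;
	return true;
}
```

In this model, a second `report_exit()` call returns false and does not queue anything, which is why the winner of the claim is the only task that can start teardown and must not stall before kicking the work.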
* Re: [PATCH sched_ext/for-7.0-fixes] sched_ext: Disable preemption between scx_claim_exit() and kicking helper work
From: Andrea Righi @ 2026-02-25 6:43 UTC
To: Tejun Heo; +Cc: linux-kernel, sched-ext, void, changwoo, emil, stable
On Tue, Feb 24, 2026 at 07:00:55PM -1000, Tejun Heo wrote:
> scx_claim_exit() atomically sets exit_kind, which prevents scx_error() from
> triggering further error handling. After claiming exit, the caller must kick
> the helper kthread work which initiates bypass mode and teardown.
>
> If the calling task gets preempted between claiming exit and kicking the
> helper work, and the BPF scheduler fails to schedule it back (since error
> handling is now disabled), the helper work is never queued, bypass mode
> never activates, tasks stop being dispatched, and the system wedges.
>
> Disable preemption across scx_claim_exit() and the subsequent work kicking
> in all callers - scx_disable() and scx_vexit(). Add
> lockdep_assert_preemption_disabled() to scx_claim_exit() to enforce the
> requirement.
>
> Fixes: a69040ed57f5 ("sched_ext: Simplify breather mechanism with scx_aborting flag")
I think the same race window already existed even before this commit, we
were just doing atomic_try_cmpxchg() directly, instead of using the
scx_claim_exit() helper.
So, probably the right target should be f0e1a0643a59b ("sched_ext:
Implement BPF extensible scheduler class").
Apart from that, the fix looks good to me.
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Thanks,
-Andrea
> Cc: stable@vger.kernel.org # v6.19+
> Signed-off-by: Tejun Heo <tj@kernel.org>
> ---
> kernel/sched/ext.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index c18e81e8ef51..9280381f8923 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -4423,10 +4423,19 @@ static void scx_disable_workfn(struct kthread_work *work)
> scx_bypass(false);
> }
>
> +/*
> + * Claim the exit on @sch. The caller must ensure that the helper kthread work
> + * is kicked before the current task can be preempted. Once exit_kind is
> + * claimed, scx_error() can no longer trigger, so if the current task gets
> + * preempted and the BPF scheduler fails to schedule it back, the helper work
> + * will never be kicked and the whole system can wedge.
> + */
> static bool scx_claim_exit(struct scx_sched *sch, enum scx_exit_kind kind)
> {
> int none = SCX_EXIT_NONE;
>
> + lockdep_assert_preemption_disabled();
> +
> if (!atomic_try_cmpxchg(&sch->exit_kind, &none, kind))
> return false;
>
> @@ -4449,6 +4458,7 @@ static void scx_disable(enum scx_exit_kind kind)
> rcu_read_lock();
> sch = rcu_dereference(scx_root);
> if (sch) {
> + guard(preempt)();
> scx_claim_exit(sch, kind);
> kthread_queue_work(sch->helper, &sch->disable_work);
> }
> @@ -4771,6 +4781,8 @@ static bool scx_vexit(struct scx_sched *sch,
> {
> struct scx_exit_info *ei = sch->exit_info;
>
> + guard(preempt)();
> +
> if (!scx_claim_exit(sch, kind))
> return false;
>
> --
> 2.53.0
>
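The guard(preempt)() lines in the patch keep preemption disabled until the enclosing scope ends, including on the early `return false` in scx_vexit(). A rough user-space sketch of that scope-based idea, using the GCC/Clang `cleanup` variable attribute, is shown here. Everything in it is hypothetical and for illustration only: `preempt_depth` is a plain counter rather than real preemption state, and `MODEL_GUARD_PREEMPT()`, `model_claim()` and `model_vexit()` are invented names, not kernel APIs.

```c
#include <assert.h>

static int preempt_depth;	/* models a preemption-disable count; single-threaded */
static int depth_seen_in_claim;	/* records what model_claim() observed */

static void model_preempt_disable(void)
{
	preempt_depth++;
}

/* Cleanup handlers receive a pointer to the variable going out of scope. */
static void model_preempt_enable(int *unused)
{
	(void)unused;
	preempt_depth--;
}

/* Disable now, re-enable automatically when the enclosing scope is left. */
#define MODEL_GUARD_PREEMPT() \
	int guard_token __attribute__((cleanup(model_preempt_enable), unused)) = \
		(model_preempt_disable(), 0)

/* Stand-in for a claim function that asserts its calling context. */
static int model_claim(int already_claimed)
{
	assert(preempt_depth > 0);
	depth_seen_in_claim = preempt_depth;
	return !already_claimed;
}

static int model_vexit(int already_claimed)
{
	MODEL_GUARD_PREEMPT();

	if (!model_claim(already_claimed))
		return 0;	/* early return: the cleanup handler still runs */

	/* ... record exit info and kick the helper work here ... */
	return 1;
}
```

After either return path of `model_vexit()`, `preempt_depth` is back to 0, which is the property that lets the guard cover both the claim and the work kick without explicit enable calls on each exit path.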
* Re: [PATCH sched_ext/for-7.0-fixes] sched_ext: Disable preemption between scx_claim_exit() and kicking helper work
From: Tejun Heo @ 2026-02-25 7:46 UTC
To: Andrea Righi; +Cc: linux-kernel, sched-ext, void, changwoo, emil, stable
On Wed, Feb 25, 2026 at 07:43:04AM +0100, Andrea Righi wrote:
> I think the same race window already existed even before this commit, we
> were just doing atomic_try_cmpxchg() directly, instead of using the
> scx_claim_exit() helper.
>
> So, probably the right target should be f0e1a0643a59b ("sched_ext:
> Implement BPF extensible scheduler class").
You're right. Updated the Fixes tag and stable Cc to v6.12+.
Applied to sched_ext/for-7.0-fixes.
Thanks.
--
tejun