From: Tejun Heo <tj@kernel.org>
To: David Vernet <void@manifault.com>,
Andrea Righi <arighi@nvidia.com>,
Changwoo Min <changwoo@igalia.com>
Cc: sched-ext@lists.linux.dev, emil@etsalapatis.com,
linux-kernel@vger.kernel.org
Subject: [PATCH sched_ext/for-7.1-fixes] sched_ext: Defer scx_hardlockup() out of NMI
Date: Fri, 24 Apr 2026 10:14:32 -1000 [thread overview]
Message-ID: <e78c625bf7ef51fef0e564d9c25482aa@kernel.org> (raw)
scx_hardlockup() runs from NMI and eventually calls scx_claim_exit(),
which takes scx_sched_lock. scx_sched_lock isn't NMI-safe and grabbing
it from NMI context can lead to deadlocks.
The hardlockup handler is best-effort recovery and the disable path it
triggers runs off of irq_work anyway. Move the handle_lockup() call into
an irq_work so it runs in IRQ context.
Fixes: ebeca1f930ea ("sched_ext: Introduce cgroup sub-sched support")
Cc: stable@vger.kernel.org # v7.1+
Signed-off-by: Tejun Heo <tj@kernel.org>
---
kernel/sched/ext.c | 33 +++++++++++++++++++++++++++------
1 file changed, 27 insertions(+), 6 deletions(-)
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -5142,6 +5142,25 @@ void scx_softlockup(u32 dur_s)
smp_processor_id(), dur_s);
}
+/*
+ * scx_hardlockup() runs from NMI and eventually calls scx_claim_exit(),
+ * which takes scx_sched_lock. scx_sched_lock isn't NMI-safe and grabbing
+ * it from NMI context can lead to deadlocks. Defer via irq_work; the
+ * disable path runs off irq_work anyway.
+ */
+static atomic_t scx_hardlockup_cpu = ATOMIC_INIT(-1);
+
+static void scx_hardlockup_irq_workfn(struct irq_work *work)
+{
+ int cpu = atomic_xchg(&scx_hardlockup_cpu, -1);
+
+ if (cpu >= 0 && handle_lockup("hard lockup - CPU %d", cpu))
+ printk_deferred(KERN_ERR "sched_ext: Hard lockup - CPU %d, disabling BPF scheduler\n",
+ cpu);
+}
+
+static DEFINE_IRQ_WORK(scx_hardlockup_irq_work, scx_hardlockup_irq_workfn);
+
/**
* scx_hardlockup - sched_ext hardlockup handler
*
@@ -5150,17 +5169,19 @@ void scx_softlockup(u32 dur_s)
* Try kicking out the current scheduler in an attempt to recover the system to
* a good state before taking more drastic actions.
*
- * Returns %true if sched_ext is enabled and abort was initiated, which may
- * resolve the reported hardlockup. %false if sched_ext is not enabled or
- * someone else already initiated abort.
+ * Queues an irq_work; the handle_lockup() call happens in IRQ context (see
+ * scx_hardlockup_irq_workfn).
+ *
+ * Returns %true if sched_ext is enabled and the work was queued, %false
+ * otherwise.
*/
bool scx_hardlockup(int cpu)
{
- if (!handle_lockup("hard lockup - CPU %d", cpu))
+ if (!rcu_access_pointer(scx_root))
return false;
- printk_deferred(KERN_ERR "sched_ext: Hard lockup - CPU %d, disabling BPF scheduler\n",
- cpu);
+ atomic_cmpxchg(&scx_hardlockup_cpu, -1, cpu);
+ irq_work_queue(&scx_hardlockup_irq_work);
return true;
}
next reply other threads:[~2026-04-24 20:14 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-24 20:14 Tejun Heo [this message]
2026-04-24 20:15 ` [PATCH sched_ext/for-7.1-fixes] sched_ext: Defer scx_hardlockup() out of NMI Tejun Heo
2026-04-24 22:19 ` Andrea Righi
2026-04-25 0:23 ` Tejun Heo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e78c625bf7ef51fef0e564d9c25482aa@kernel.org \
--to=tj@kernel.org \
--cc=arighi@nvidia.com \
--cc=changwoo@igalia.com \
--cc=emil@etsalapatis.com \
--cc=linux-kernel@vger.kernel.org \
--cc=sched-ext@lists.linux.dev \
--cc=void@manifault.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.