From: Tejun Heo <tj@kernel.org>
To: David Vernet <void@manifault.com>,
Andrea Righi <andrea.righi@linux.dev>,
Changwoo Min <changwoo@igalia.com>
Cc: Dan Schatzberg <schatzberg.dan@gmail.com>,
Emil Tsalapatis <etsal@meta.com>,
sched-ext@lists.linux.dev, linux-kernel@vger.kernel.org,
Tejun Heo <tj@kernel.org>, Andrea Righi <arighi@nvidia.com>
Subject: [PATCH v2 08/14] sched_ext: Refactor lockup handlers into handle_lockup()
Date: Mon, 10 Nov 2025 10:56:30 -1000 [thread overview]
Message-ID: <20251110205636.405592-9-tj@kernel.org> (raw)
In-Reply-To: <20251110205636.405592-1-tj@kernel.org>
scx_rcu_cpu_stall() and scx_softlockup() share the same pattern: check if the
scheduler is enabled under RCU read lock and trigger an error if so. Extract
the common pattern into handle_lockup() helper. Add scx_verror() macro and use
guard(rcu)().
This simplifies both handlers, reduces code duplication, and prepares for
hardlockup handling.
Cc: Dan Schatzberg <schatzberg.dan@gmail.com>
Cc: Emil Tsalapatis <etsal@meta.com>
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
---
kernel/sched/ext.c | 65 ++++++++++++++++++----------------------------
1 file changed, 25 insertions(+), 40 deletions(-)
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 600918095245..d9572bf99b5b 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -192,6 +192,7 @@ static __printf(4, 5) bool scx_exit(struct scx_sched *sch,
}
#define scx_error(sch, fmt, args...) scx_exit((sch), SCX_EXIT_ERROR, 0, fmt, ##args)
+#define scx_verror(sch, fmt, args) scx_vexit((sch), SCX_EXIT_ERROR, 0, fmt, args)
#define SCX_HAS_OP(sch, op) test_bit(SCX_OP_IDX(op), (sch)->has_op)
@@ -3650,39 +3651,40 @@ bool scx_allow_ttwu_queue(const struct task_struct *p)
return false;
}
-/**
- * scx_rcu_cpu_stall - sched_ext RCU CPU stall handler
- *
- * While there are various reasons why RCU CPU stalls can occur on a system
- * that may not be caused by the current BPF scheduler, try kicking out the
- * current scheduler in an attempt to recover the system to a good state before
- * issuing panics.
- */
-bool scx_rcu_cpu_stall(void)
+static __printf(1, 2) bool handle_lockup(const char *fmt, ...)
{
struct scx_sched *sch;
+ va_list args;
- rcu_read_lock();
+ guard(rcu)();
sch = rcu_dereference(scx_root);
- if (unlikely(!sch)) {
- rcu_read_unlock();
+ if (unlikely(!sch))
return false;
- }
switch (scx_enable_state()) {
case SCX_ENABLING:
case SCX_ENABLED:
- break;
+ va_start(args, fmt);
+ scx_verror(sch, fmt, args);
+ va_end(args);
+ return true;
default:
- rcu_read_unlock();
return false;
}
+}
- scx_error(sch, "RCU CPU stall detected!");
- rcu_read_unlock();
-
- return true;
+/**
+ * scx_rcu_cpu_stall - sched_ext RCU CPU stall handler
+ *
+ * While there are various reasons why RCU CPU stalls can occur on a system
+ * that may not be caused by the current BPF scheduler, try kicking out the
+ * current scheduler in an attempt to recover the system to a good state before
+ * issuing panics.
+ */
+bool scx_rcu_cpu_stall(void)
+{
+ return handle_lockup("RCU CPU stall detected!");
}
/**
@@ -3697,28 +3699,11 @@ bool scx_rcu_cpu_stall(void)
*/
void scx_softlockup(u32 dur_s)
{
- struct scx_sched *sch;
-
- rcu_read_lock();
-
- sch = rcu_dereference(scx_root);
- if (unlikely(!sch))
- goto out_unlock;
-
- switch (scx_enable_state()) {
- case SCX_ENABLING:
- case SCX_ENABLED:
- break;
- default:
- goto out_unlock;
- }
-
- printk_deferred(KERN_ERR "sched_ext: Soft lockup - CPU%d stuck for %us, disabling \"%s\"\n",
- smp_processor_id(), dur_s, scx_root->ops.name);
+ if (!handle_lockup("soft lockup - CPU %d stuck for %us", smp_processor_id(), dur_s))
+ return;
- scx_error(sch, "soft lockup - CPU#%d stuck for %us", smp_processor_id(), dur_s);
-out_unlock:
- rcu_read_unlock();
+ printk_deferred(KERN_ERR "sched_ext: Soft lockup - CPU %d stuck for %us, disabling BPF scheduler\n",
+ smp_processor_id(), dur_s);
}
/**
--
2.51.2
next prev parent reply other threads:[~2025-11-10 20:56 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-10 20:56 [PATCHSET v2 sched_ext/for-6.19] sched_ext: Improve bypass mode scalability Tejun Heo
2025-11-10 20:56 ` [PATCH v2 01/14] sched_ext: Don't set ddsp_dsq_id during select_cpu in bypass mode Tejun Heo
2025-11-10 21:21 ` Emil Tsalapatis
2025-11-10 21:56 ` Tejun Heo
2025-11-10 20:56 ` [PATCH v2 02/14] sched_ext: Make slice values tunable and use shorter slice " Tejun Heo
2025-11-10 21:56 ` Emil Tsalapatis
2025-11-11 17:43 ` [PATCH v3 02/14] sched_ext: Use " Tejun Heo
2025-11-11 18:07 ` Andrea Righi
2025-11-10 20:56 ` [PATCH v2 03/14] sched_ext: Refactor do_enqueue_task() local and global DSQ paths Tejun Heo
2025-11-10 22:06 ` Emil Tsalapatis
2025-11-10 20:56 ` [PATCH v2 04/14] sched_ext: Use per-CPU DSQs instead of per-node global DSQs in bypass mode Tejun Heo
2025-11-10 21:43 ` Emil Tsalapatis
2025-11-10 21:59 ` Tejun Heo
2025-11-10 23:26 ` Emil Tsalapatis
2025-11-10 20:56 ` [PATCH v2 05/14] sched_ext: Simplify breather mechanism with scx_aborting flag Tejun Heo
2025-11-11 16:34 ` Emil Tsalapatis
2025-11-10 20:56 ` [PATCH v2 06/14] sched_ext: Exit dispatch and move operations immediately when aborting Tejun Heo
2025-11-10 20:56 ` [PATCH v2 07/14] sched_ext: Make scx_exit() and scx_vexit() return bool Tejun Heo
2025-11-10 20:56 ` Tejun Heo [this message]
2025-11-10 20:56 ` [PATCH v2 09/14] sched_ext: Make handle_lockup() propagate scx_verror() result Tejun Heo
2025-11-10 20:56 ` [PATCH v2 10/14] sched_ext: Hook up hardlockup detector Tejun Heo
2025-11-11 18:33 ` [PATCH UPDATED " Tejun Heo
2025-11-11 18:39 ` Tejun Heo
2025-11-10 20:56 ` [PATCH v2 11/14] sched_ext: Add scx_cpu0 example scheduler Tejun Heo
2025-11-10 20:56 ` [PATCH v2 12/14] sched_ext: Factor out scx_dsq_list_node cursor initialization into INIT_DSQ_LIST_CURSOR Tejun Heo
2025-11-10 23:56 ` Emil Tsalapatis
2025-11-10 20:56 ` [PATCH v2 13/14] sched_ext: Factor out abbreviated dispatch dequeue into dispatch_dequeue_locked() Tejun Heo
2025-11-10 20:56 ` [PATCH v2 14/14] sched_ext: Implement load balancer for bypass mode Tejun Heo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251110205636.405592-9-tj@kernel.org \
--to=tj@kernel.org \
--cc=andrea.righi@linux.dev \
--cc=arighi@nvidia.com \
--cc=changwoo@igalia.com \
--cc=etsal@meta.com \
--cc=linux-kernel@vger.kernel.org \
--cc=schatzberg.dan@gmail.com \
--cc=sched-ext@lists.linux.dev \
--cc=void@manifault.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox