All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tejun Heo <tj@kernel.org>
To: David Vernet <void@manifault.com>,
	Andrea Righi <arighi@nvidia.com>,
	Changwoo Min <changwoo@igalia.com>
Cc: sched-ext@lists.linux.dev, Emil Tsalapatis <emil@etsalapatis.com>,
	Cheng-Yang Chou <yphbchou0911@gmail.com>,
	linux-kernel@vger.kernel.org, Tejun Heo <tj@kernel.org>
Subject: [PATCH 4/5] sched_ext: Fix scx_sched_lock / rq lock ordering
Date: Mon,  9 Mar 2026 15:16:52 -1000	[thread overview]
Message-ID: <20260310011653.2993712-5-tj@kernel.org> (raw)
In-Reply-To: <20260310011653.2993712-1-tj@kernel.org>

There are two sites that nest rq lock inside scx_sched_lock:

- scx_bypass() takes scx_sched_lock then rq lock per CPU to propagate
  per-cpu bypass flags and re-enqueue tasks.

- sysrq_handle_sched_ext_dump() takes scx_sched_lock to iterate all
  scheds, scx_dump_state() then takes rq lock per CPU for dump.

And scx_claim_exit() takes scx_sched_lock to propagate exits to
descendants. It can be reached from scx_tick(), BPF kfuncs, and many
other paths with rq lock already held, creating the reverse ordering:

  rq lock -> scx_sched_lock vs. scx_sched_lock -> rq lock

Fix by flipping scx_bypass() to take rq lock first, and dropping
scx_sched_lock from sysrq_handle_sched_ext_dump() as scx_sched_all is
already RCU-traversable and scx_dump_lock now prevents dumping a dead
sched. This makes the consistent ordering rq lock -> scx_sched_lock.

Reported-by: Cheng-Yang Chou <yphbchou0911@gmail.com>
Link: http://lkml.kernel.org/r/20260309163025.2240221-1-yphbchou0911@gmail.com
Fixes: ebeca1f930ea ("sched_ext: Introduce cgroup sub-sched support")
Signed-off-by: Tejun Heo <tj@kernel.org>
---
 kernel/sched/ext.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index cf28a8f62ad0..677c1c6c64bf 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -5097,8 +5097,8 @@ static void scx_bypass(struct scx_sched *sch, bool bypass)
 		struct rq *rq = cpu_rq(cpu);
 		struct task_struct *p, *n;
 
-		raw_spin_lock(&scx_sched_lock);
 		raw_spin_rq_lock(rq);
+		raw_spin_lock(&scx_sched_lock);
 
 		scx_for_each_descendant_pre(pos, sch) {
 			struct scx_sched_pcpu *pcpu = per_cpu_ptr(pos->pcpu, cpu);
@@ -7240,8 +7240,6 @@ static void sysrq_handle_sched_ext_dump(u8 key)
 	struct scx_exit_info ei = { .kind = SCX_EXIT_NONE, .reason = "SysRq-D" };
 	struct scx_sched *sch;
 
-	guard(raw_spinlock_irqsave)(&scx_sched_lock);
-
 	list_for_each_entry_rcu(sch, &scx_sched_all, all)
 		scx_dump_state(sch, &ei, 0, false);
 }
-- 
2.53.0


  parent reply	other threads:[~2026-03-10  1:16 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-10  1:16 [PATCHSET sched_ext/for-7.1] Fix sub-sched locking issues Tejun Heo
2026-03-10  1:16 ` [PATCH 1/5] sched_ext: Fix sub_detach op check to test the parent's ops Tejun Heo
2026-03-10  1:16 ` [PATCH 2/5] sched_ext: Add scx_dump_lock and dump_disabled Tejun Heo
2026-03-10  1:16 ` [PATCH 3/5] sched_ext: Always bounce scx_disable() through irq_work Tejun Heo
2026-03-10  1:16 ` Tejun Heo [this message]
2026-03-10  5:18   ` [PATCH 4/5] sched_ext: Fix scx_sched_lock / rq lock ordering Cheng-Yang Chou
2026-03-10  6:39   ` Andrea Righi
2026-03-10  6:47     ` Andrea Righi
2026-03-10  1:16 ` [PATCH 5/5] sched_ext: Reject sub-sched attachment to a disabled parent Tejun Heo
2026-03-10  6:50 ` [PATCHSET sched_ext/for-7.1] Fix sub-sched locking issues Andrea Righi
2026-03-10 17:17 ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260310011653.2993712-5-tj@kernel.org \
    --to=tj@kernel.org \
    --cc=arighi@nvidia.com \
    --cc=changwoo@igalia.com \
    --cc=emil@etsalapatis.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sched-ext@lists.linux.dev \
    --cc=void@manifault.com \
    --cc=yphbchou0911@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.