public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] sched_ext: Prevent RB-tree corruption in scx_bpf_task_set_dsq_vtime()
@ 2026-04-15 19:32 Cheng-Yang Chou
  2026-04-16  1:49 ` Zhao Mengmeng
  2026-04-16 17:56 ` Tejun Heo
  0 siblings, 2 replies; 6+ messages in thread
From: Cheng-Yang Chou @ 2026-04-15 19:32 UTC (permalink / raw)
  To: sched-ext, Tejun Heo, David Vernet, Andrea Righi, Changwoo Min
  Cc: Ching-Chun Huang, Chia-Ping Tsai, yphbchou0911, stable

scx_bpf_task_set_dsq_vtime() allows modifying a task's dsq_vtime without
checking if it is already enqueued on SCX_DSQ_PRIQ. Since dsq_vtime is
the rb-tree sorting key, mutating it in-place violates the BST invariant
and corrupts the tree structure.

In ops.dispatch():
	p = scx_bpf_dsq_peek(PRIO_DSQ); // Get a task already in the DSQ
	if (p) {
		// This illegally returns %true
		scx_bpf_task_set_dsq_vtime(p, 0xFFFFFFFFFFFFFFFF);
	}

Fix this by adding a check for the SCX_TASK_DSQ_ON_PRIQ flag. Disallow
vtime modification and trigger scx_error() if the task is already queued
on a priority DSQ.

Fixes: 3035addfaf28 ("sched_ext: Add scx_bpf_task_set_slice() and scx_bpf_task_set_dsq_vtime()")
Cc: stable@vger.kernel.org # v6.19+
Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com>
---
 kernel/sched/ext.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 012ca8bd70fb..7a54f9bc5e7a 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -8540,7 +8540,8 @@ __bpf_kfunc bool scx_bpf_task_set_slice(struct task_struct *p, u64 slice,
  * @aux: implicit BPF argument to access bpf_prog_aux hidden from BPF progs
  *
  * Set @p's virtual time to @vtime. Returns %true on success, %false if the
- * calling scheduler doesn't have authority over @p.
+ * calling scheduler doesn't have authority over @p. If @p is already enqueued
+ * on a priority DSQ, scx_error() is triggered and %false is returned.
  */
 __bpf_kfunc bool scx_bpf_task_set_dsq_vtime(struct task_struct *p, u64 vtime,
 					    const struct bpf_prog_aux *aux)
@@ -8552,6 +8553,11 @@ __bpf_kfunc bool scx_bpf_task_set_dsq_vtime(struct task_struct *p, u64 vtime,
 	if (unlikely(!scx_task_on_sched(sch, p)))
 		return false;
 
+	if (unlikely(READ_ONCE(p->scx.dsq_flags) & SCX_TASK_DSQ_ON_PRIQ)) {
+		scx_error(sch, "vtime modification disallowed while on a priority DSQ");
+		return false;
+	}
+
 	p->scx.dsq_vtime = vtime;
 	return true;
 }
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] sched_ext: Prevent RB-tree corruption in scx_bpf_task_set_dsq_vtime()
  2026-04-15 19:32 [PATCH] sched_ext: Prevent RB-tree corruption in scx_bpf_task_set_dsq_vtime() Cheng-Yang Chou
@ 2026-04-16  1:49 ` Zhao Mengmeng
  2026-04-16  5:02   ` Cheng-Yang Chou
  2026-04-16 17:56 ` Tejun Heo
  1 sibling, 1 reply; 6+ messages in thread
From: Zhao Mengmeng @ 2026-04-16  1:49 UTC (permalink / raw)
  To: yphbchou0911
  Cc: arighi, changwoo, chia7712, jserv, sched-ext, stable, tj, void

Just discuss, if is better to use WARN_ON_ONCE instead of failing the
scheduler, just like the check in the beginning of dispatch_enqueue().

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] sched_ext: Prevent RB-tree corruption in scx_bpf_task_set_dsq_vtime()
  2026-04-16  1:49 ` Zhao Mengmeng
@ 2026-04-16  5:02   ` Cheng-Yang Chou
  0 siblings, 0 replies; 6+ messages in thread
From: Cheng-Yang Chou @ 2026-04-16  5:02 UTC (permalink / raw)
  To: Zhao Mengmeng
  Cc: arighi, changwoo, chia7712, jserv, sched-ext, stable, tj, void

Hi Zhao,

On Thu, Apr 16, 2026 at 09:49:56AM +0800, Zhao Mengmeng wrote:
> Just discuss, if is better to use WARN_ON_ONCE instead of failing the
> scheduler, just like the check in the beginning of dispatch_enqueue().

I think this is more like a misused scheduler API.
dispatch_enqueue() uses scx_error() for the same category of errors:
- scx_error(sch, "attempting to dispatch to a destroyed dsq")
- scx_error(sch, "cannot use vtime ordering for built-in DSQs")

That said, I don't have a strong opinion on this. :-) 

-- 
Thanks,
Cheng-Yang

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] sched_ext: Prevent RB-tree corruption in scx_bpf_task_set_dsq_vtime()
  2026-04-15 19:32 [PATCH] sched_ext: Prevent RB-tree corruption in scx_bpf_task_set_dsq_vtime() Cheng-Yang Chou
  2026-04-16  1:49 ` Zhao Mengmeng
@ 2026-04-16 17:56 ` Tejun Heo
  2026-04-16 18:00   ` Andrea Righi
  1 sibling, 1 reply; 6+ messages in thread
From: Tejun Heo @ 2026-04-16 17:56 UTC (permalink / raw)
  To: Cheng-Yang Chou
  Cc: sched-ext, David Vernet, Andrea Righi, Changwoo Min,
	Ching-Chun Huang, Chia-Ping Tsai, stable

On Thu, Apr 16, 2026 at 03:32:44AM +0800, Cheng-Yang Chou wrote:
> scx_bpf_task_set_dsq_vtime() allows modifying a task's dsq_vtime without
> checking if it is already enqueued on SCX_DSQ_PRIQ. Since dsq_vtime is
> the rb-tree sorting key, mutating it in-place violates the BST invariant
> and corrupts the tree structure.
> 
> In ops.dispatch():
> 	p = scx_bpf_dsq_peek(PRIO_DSQ); // Get a task already in the DSQ
> 	if (p) {
> 		// This illegally returns %true
> 		scx_bpf_task_set_dsq_vtime(p, 0xFFFFFFFFFFFFFFFF);
> 	}
> 
> Fix this by adding a check for the SCX_TASK_DSQ_ON_PRIQ flag. Disallow
> vtime modification and trigger scx_error() if the task is already queued
> on a priority DSQ.

If the user updates the vtime after inserting, the tree looks wrong but it
won't cause crashes or anything. Later insertions might get confused in
terms of ordering but it's a rather obvious user-shotting-their-own-foot, so
I'm more inclined to leave it as-is.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] sched_ext: Prevent RB-tree corruption in scx_bpf_task_set_dsq_vtime()
  2026-04-16 17:56 ` Tejun Heo
@ 2026-04-16 18:00   ` Andrea Righi
  2026-04-16 18:09     ` Cheng-Yang Chou
  0 siblings, 1 reply; 6+ messages in thread
From: Andrea Righi @ 2026-04-16 18:00 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Cheng-Yang Chou, sched-ext, David Vernet, Changwoo Min,
	Ching-Chun Huang, Chia-Ping Tsai, stable

On Thu, Apr 16, 2026 at 07:56:33AM -1000, Tejun Heo wrote:
> On Thu, Apr 16, 2026 at 03:32:44AM +0800, Cheng-Yang Chou wrote:
> > scx_bpf_task_set_dsq_vtime() allows modifying a task's dsq_vtime without
> > checking if it is already enqueued on SCX_DSQ_PRIQ. Since dsq_vtime is
> > the rb-tree sorting key, mutating it in-place violates the BST invariant
> > and corrupts the tree structure.
> > 
> > In ops.dispatch():
> > 	p = scx_bpf_dsq_peek(PRIO_DSQ); // Get a task already in the DSQ
> > 	if (p) {
> > 		// This illegally returns %true
> > 		scx_bpf_task_set_dsq_vtime(p, 0xFFFFFFFFFFFFFFFF);
> > 	}
> > 
> > Fix this by adding a check for the SCX_TASK_DSQ_ON_PRIQ flag. Disallow
> > vtime modification and trigger scx_error() if the task is already queued
> > on a priority DSQ.
> 
> If the user updates the vtime after inserting, the tree looks wrong but it
> won't cause crashes or anything. Later insertions might get confused in
> terms of ordering but it's a rather obvious user-shotting-their-own-foot, so
> I'm more inclined to leave it as-is.

I agree. This looks like intentionally breaking the tree. If users do so, they
can keep the pieces. :)

Thanks,
-Andrea

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] sched_ext: Prevent RB-tree corruption in scx_bpf_task_set_dsq_vtime()
  2026-04-16 18:00   ` Andrea Righi
@ 2026-04-16 18:09     ` Cheng-Yang Chou
  0 siblings, 0 replies; 6+ messages in thread
From: Cheng-Yang Chou @ 2026-04-16 18:09 UTC (permalink / raw)
  To: Andrea Righi
  Cc: Tejun Heo, sched-ext, David Vernet, Changwoo Min,
	Ching-Chun Huang, Chia-Ping Tsai, stable

Hi Tejun, Andrea,

On Thu, Apr 16, 2026 at 08:00:30PM +0200, Andrea Righi wrote:
> On Thu, Apr 16, 2026 at 07:56:33AM -1000, Tejun Heo wrote:
> > On Thu, Apr 16, 2026 at 03:32:44AM +0800, Cheng-Yang Chou wrote:
> > > scx_bpf_task_set_dsq_vtime() allows modifying a task's dsq_vtime without
> > > checking if it is already enqueued on SCX_DSQ_PRIQ. Since dsq_vtime is
> > > the rb-tree sorting key, mutating it in-place violates the BST invariant
> > > and corrupts the tree structure.
> > > 
> > > In ops.dispatch():
> > > 	p = scx_bpf_dsq_peek(PRIO_DSQ); // Get a task already in the DSQ
> > > 	if (p) {
> > > 		// This illegally returns %true
> > > 		scx_bpf_task_set_dsq_vtime(p, 0xFFFFFFFFFFFFFFFF);
> > > 	}
> > > 
> > > Fix this by adding a check for the SCX_TASK_DSQ_ON_PRIQ flag. Disallow
> > > vtime modification and trigger scx_error() if the task is already queued
> > > on a priority DSQ.
> > 
> > If the user updates the vtime after inserting, the tree looks wrong but it
> > won't cause crashes or anything. Later insertions might get confused in
> > terms of ordering but it's a rather obvious user-shotting-their-own-foot, so
> > I'm more inclined to leave it as-is.
> 
> I agree. This looks like intentionally breaking the tree. If users do so, they
> can keep the pieces. :)

Ah, I see. I was just experimenting with some scheduler API combinations.
Indeed, users shouldn't do this if they want their scheduler to work
correctly.

Thanks for the explanation!

-- 
Thanks,
Cheng-Yang

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2026-04-16 18:09 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-15 19:32 [PATCH] sched_ext: Prevent RB-tree corruption in scx_bpf_task_set_dsq_vtime() Cheng-Yang Chou
2026-04-16  1:49 ` Zhao Mengmeng
2026-04-16  5:02   ` Cheng-Yang Chou
2026-04-16 17:56 ` Tejun Heo
2026-04-16 18:00   ` Andrea Righi
2026-04-16 18:09     ` Cheng-Yang Chou

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox