[PATCH] sched_ext: Fix NULL pointer dereferences in put_prev_task

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] sched_ext: Fix NULL pointer dereferences in put_prev_task_scx
@ 2025-06-09 11:36 liuwenfang
  2025-06-09 12:09 ` Andrea Righi
  0 siblings, 1 reply; 5+ messages in thread
From: liuwenfang @ 2025-06-09 11:36 UTC (permalink / raw)
  To: 'Tejun Heo'
  Cc: 'David Vernet', 'Andrea Righi',
	'Changwoo Min', 'Ingo Molnar',
	'Peter Zijlstra', 'Juri Lelli',
	'Vincent Guittot', 'Dietmar Eggemann',
	'Steven Rostedt', 'Ben Segall',
	'Mel Gorman', 'Valentin Schneider',
	'linux-kernel@vger.kernel.org'

As put_prev_task can be used in other kernel modules which can lead
to a NULL pointer. Fix this by checking for a valid next.

Signed-off-by: l00013971 <l00013971@hihonor.com>
---
 kernel/sched/ext.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index f5133249f..6a579babd 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -3262,7 +3262,7 @@ static void put_prev_task_scx(struct rq *rq, struct task_struct *p,
 		 * ops.enqueue() that @p is the only one available for this cpu,
 		 * which should trigger an explicit follow-up scheduling event.
 		 */
-		if (sched_class_above(&ext_sched_class, next->sched_class)) {
+		if (next && sched_class_above(&ext_sched_class, next->sched_class)) {
 			WARN_ON_ONCE(!static_branch_unlikely(&scx_ops_enq_last));
 			do_enqueue_task(rq, p, SCX_ENQ_LAST, -1);
 		} else {
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] sched_ext: Fix NULL pointer dereferences in put_prev_task_scx
  2025-06-09 11:36 [PATCH] sched_ext: Fix NULL pointer dereferences in put_prev_task_scx liuwenfang
@ 2025-06-09 12:09 ` Andrea Righi
  2025-06-10  6:22   ` 回复: " liuwenfang
  0 siblings, 1 reply; 5+ messages in thread
From: Andrea Righi @ 2025-06-09 12:09 UTC (permalink / raw)
  To: liuwenfang
  Cc: 'Tejun Heo', 'David Vernet',
	'Changwoo Min', 'Ingo Molnar',
	'Peter Zijlstra', 'Juri Lelli',
	'Vincent Guittot', 'Dietmar Eggemann',
	'Steven Rostedt', 'Ben Segall',
	'Mel Gorman', 'Valentin Schneider',
	'linux-kernel@vger.kernel.org'

On Mon, Jun 09, 2025 at 11:36:15AM +0000, liuwenfang wrote:
> As put_prev_task can be used in other kernel modules which can lead
> to a NULL pointer. Fix this by checking for a valid next.

Actually, put_prev_task() should be used only within kernel/sched/ and, in
theory, you should have done a dequeue_task() before put_prev_task() in
this scenario, so SCX_TASK_QUEUED shouldn't be set in p->scx.flags.

The change might still make sense, but can you clarify how you triggered
the NULL pointer dereference?

Thanks,
-Andrea

> 
> Signed-off-by: l00013971 <l00013971@hihonor.com>
> ---
>  kernel/sched/ext.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index f5133249f..6a579babd 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -3262,7 +3262,7 @@ static void put_prev_task_scx(struct rq *rq, struct task_struct *p,
>  		 * ops.enqueue() that @p is the only one available for this cpu,
>  		 * which should trigger an explicit follow-up scheduling event.
>  		 */
> -		if (sched_class_above(&ext_sched_class, next->sched_class)) {
> +		if (next && sched_class_above(&ext_sched_class, next->sched_class)) {
>  			WARN_ON_ONCE(!static_branch_unlikely(&scx_ops_enq_last));
>  			do_enqueue_task(rq, p, SCX_ENQ_LAST, -1);
>  		} else {
> -- 
> 2.17.1

^ permalink raw reply	[flat|nested] 5+ messages in thread

* 回复: [PATCH] sched_ext: Fix NULL pointer dereferences in put_prev_task_scx
  2025-06-09 12:09 ` Andrea Righi
@ 2025-06-10  6:22   ` liuwenfang
  2025-06-11  6:37     ` Andrea Righi
  0 siblings, 1 reply; 5+ messages in thread
From: liuwenfang @ 2025-06-10  6:22 UTC (permalink / raw)
  To: 'Andrea Righi'
  Cc: 'Tejun Heo', 'David Vernet',
	'Changwoo Min', 'Ingo Molnar',
	'Peter Zijlstra', 'Juri Lelli',
	'Vincent Guittot', 'Dietmar Eggemann',
	'Steven Rostedt', 'Ben Segall',
	'Mel Gorman', 'Valentin Schneider',
	'linux-kernel@vger.kernel.org'

Thanks for your feedback.
This is triggered in kernel modules developed In the mobile scenario: 
tasks on rq are migrated while the current cpu should be halted for power saving policy.
Its migration logic:
drain_rq_cpu_stop -- migrate_all_tasks :
for (;;) {
	if (rq->nr_running == 1)
		break;
	for_each_class(class) {
		next = class->pick_next_task(rq);
		if (next) {
			next->sched_class->put_prev_task(rq, next, NULL);
			break;
		}
	}
	if (is_idle_task(next))
		break;
	dest_cpu = select_task_rq(next...);
	move_queued_task(rq, rf, next, dest_cpu);
	...
}

put_prev_task in this function is selected to update util and statistics info for each runnable tasks,
here they are not dequeued yet.

Best regards,
 
> On Mon, Jun 09, 2025 at 11:36:15AM +0000, liuwenfang wrote:
> > As put_prev_task can be used in other kernel modules which can lead to
> > a NULL pointer. Fix this by checking for a valid next.
> 
> Actually, put_prev_task() should be used only within kernel/sched/ and, in theory,
> you should have done a dequeue_task() before put_prev_task() in this scenario,
> so SCX_TASK_QUEUED shouldn't be set in p->scx.flags.
> 
> The change might still make sense, but can you clarify how you triggered the
> NULL pointer dereference?
> 
> Thanks,
> -Andrea
> 
> >
> > Signed-off-by: l00013971 <l00013971@hihonor.com>
> > ---
> >  kernel/sched/ext.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index
> > f5133249f..6a579babd 100644
> > --- a/kernel/sched/ext.c
> > +++ b/kernel/sched/ext.c
> > @@ -3262,7 +3262,7 @@ static void put_prev_task_scx(struct rq *rq, struct
> task_struct *p,
> >  		 * ops.enqueue() that @p is the only one available for this cpu,
> >  		 * which should trigger an explicit follow-up scheduling event.
> >  		 */
> > -		if (sched_class_above(&ext_sched_class, next->sched_class)) {
> > +		if (next && sched_class_above(&ext_sched_class, next->sched_class))
> > +{
> >  			WARN_ON_ONCE(!static_branch_unlikely(&scx_ops_enq_last));
> >  			do_enqueue_task(rq, p, SCX_ENQ_LAST, -1);
> >  		} else {
> > --
> > 2.17.1

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: 回复: [PATCH] sched_ext: Fix NULL pointer dereferences in put_prev_task_scx
  2025-06-10  6:22   ` 回复: " liuwenfang
@ 2025-06-11  6:37     ` Andrea Righi
  2025-06-11  7:17       ` liuwenfang
  0 siblings, 1 reply; 5+ messages in thread
From: Andrea Righi @ 2025-06-11  6:37 UTC (permalink / raw)
  To: liuwenfang
  Cc: 'Tejun Heo', 'David Vernet',
	'Changwoo Min', 'Ingo Molnar',
	'Peter Zijlstra', 'Juri Lelli',
	'Vincent Guittot', 'Dietmar Eggemann',
	'Steven Rostedt', 'Ben Segall',
	'Mel Gorman', 'Valentin Schneider',
	'linux-kernel@vger.kernel.org'

Hi liuwenfang,

On Tue, Jun 10, 2025 at 06:22:22AM +0000, liuwenfang wrote:
> Thanks for your feedback.
> This is triggered in kernel modules developed In the mobile scenario: 
> tasks on rq are migrated while the current cpu should be halted for power saving policy.
> Its migration logic:
> drain_rq_cpu_stop -- migrate_all_tasks :
> for (;;) {
> 	if (rq->nr_running == 1)
> 		break;
> 	for_each_class(class) {
> 		next = class->pick_next_task(rq);
> 		if (next) {
> 			next->sched_class->put_prev_task(rq, next, NULL);

Can you do something like this instead?

  next->sched_class->put_prev_task(rq, next, next);

this should give you the same behavior you're relying on, without requiring
the extra check in ext.c.

> 			break;
> 		}
> 	}
> 	if (is_idle_task(next))
> 		break;
> 	dest_cpu = select_task_rq(next...);
> 	move_queued_task(rq, rf, next, dest_cpu);
> 	...
> }
> 
> put_prev_task in this function is selected to update util and statistics info for each runnable tasks,
> here they are not dequeued yet.

I see, so it's an optimization/workaround to update utilization info
without fully dequeuing the tasks. Is this out-of-tree code, I guess?

When you say the CPU is halted, we're not talking about CPU hotplugging,
right? We're just migrating tasks off the CPU?

Also, if you're running sched_ext there are ways to exclude certain CPUs at
the scheduler's level via ops.select_cpu() / ops.dispatch(). Do you think
this could be a viable solution for your specific use case?

Thanks,
-Andrea

PS https://subspace.kernel.org/etiquette.html#do-not-top-post-when-replying

> 
> Best regards,
>  
> > On Mon, Jun 09, 2025 at 11:36:15AM +0000, liuwenfang wrote:
> > > As put_prev_task can be used in other kernel modules which can lead to
> > > a NULL pointer. Fix this by checking for a valid next.
> > 
> > Actually, put_prev_task() should be used only within kernel/sched/ and, in theory,
> > you should have done a dequeue_task() before put_prev_task() in this scenario,
> > so SCX_TASK_QUEUED shouldn't be set in p->scx.flags.
> > 
> > The change might still make sense, but can you clarify how you triggered the
> > NULL pointer dereference?
> > 
> > Thanks,
> > -Andrea
> > 
> > >
> > > Signed-off-by: l00013971 <l00013971@hihonor.com>
> > > ---
> > >  kernel/sched/ext.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index
> > > f5133249f..6a579babd 100644
> > > --- a/kernel/sched/ext.c
> > > +++ b/kernel/sched/ext.c
> > > @@ -3262,7 +3262,7 @@ static void put_prev_task_scx(struct rq *rq, struct
> > task_struct *p,
> > >  		 * ops.enqueue() that @p is the only one available for this cpu,
> > >  		 * which should trigger an explicit follow-up scheduling event.
> > >  		 */
> > > -		if (sched_class_above(&ext_sched_class, next->sched_class)) {
> > > +		if (next && sched_class_above(&ext_sched_class, next->sched_class))
> > > +{
> > >  			WARN_ON_ONCE(!static_branch_unlikely(&scx_ops_enq_last));
> > >  			do_enqueue_task(rq, p, SCX_ENQ_LAST, -1);
> > >  		} else {
> > > --
> > > 2.17.1

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] sched_ext: Fix NULL pointer dereferences in put_prev_task_scx
  2025-06-11  6:37     ` Andrea Righi
@ 2025-06-11  7:17       ` liuwenfang
  0 siblings, 0 replies; 5+ messages in thread
From: liuwenfang @ 2025-06-11  7:17 UTC (permalink / raw)
  To: 'Andrea Righi'
  Cc: 'Tejun Heo', 'David Vernet',
	'Changwoo Min', 'Ingo Molnar',
	'Peter Zijlstra', 'Juri Lelli',
	'Vincent Guittot', 'Dietmar Eggemann',
	'Steven Rostedt', 'Ben Segall',
	'Mel Gorman', 'Valentin Schneider',
	'linux-kernel@vger.kernel.org'

> 
> Hi liuwenfang,
> 
> On Tue, Jun 10, 2025 at 06:22:22AM +0000, liuwenfang wrote:
> > Thanks for your feedback.
> > This is triggered in kernel modules developed In the mobile scenario:
> > tasks on rq are migrated while the current cpu should be halted for power
> saving policy.
> > Its migration logic:
> > drain_rq_cpu_stop -- migrate_all_tasks :
> > for (;;) {
> > 	if (rq->nr_running == 1)
> > 		break;
> > 	for_each_class(class) {
> > 		next = class->pick_next_task(rq);
> > 		if (next) {
> > 			next->sched_class->put_prev_task(rq, next, NULL);
> 
> Can you do something like this instead?
> 
>   next->sched_class->put_prev_task(rq, next, next);
> 
> this should give you the same behavior you're relying on, without requiring the
> extra check in ext.c.

Yes, this can help me.

> 
> > 			break;
> > 		}
> > 	}
> > 	if (is_idle_task(next))
> > 		break;
> > 	dest_cpu = select_task_rq(next...);
> > 	move_queued_task(rq, rf, next, dest_cpu);
> > 	...
> > }
> >
> > put_prev_task in this function is selected to update util and
> > statistics info for each runnable tasks, here they are not dequeued yet.
> 
> I see, so it's an optimization/workaround to update utilization info without fully
> dequeuing the tasks. Is this out-of-tree code, I guess?
> 

Yes, this is out-of-tree code. 

> When you say the CPU is halted, we're not talking about CPU hotplugging, right?
> We're just migrating tasks off the CPU?
> 

Yes, this is one power-saving method to control tasks placement by bypassing
Certain CPUs in software, simply letting the halted CPU enter an idle state.
It replaces CPU hotplugging to reduce costs in mobile gaming scenario. 

> Also, if you're running sched_ext there are ways to exclude certain CPUs at the
> scheduler's level via ops.select_cpu() / ops.dispatch(). Do you think this could be
> a viable solution for your specific use case?
> 

Yes, Thanks very much, This is exactly the approach we have implemented.

Best regards

> Thanks,
> -Andrea
> 
> PS https://subspace.kernel.org/etiquette.html#do-not-top-post-when-replying
> 
> >
> > Best regards,
> >
> > > On Mon, Jun 09, 2025 at 11:36:15AM +0000, liuwenfang wrote:
> > > > As put_prev_task can be used in other kernel modules which can
> > > > lead to a NULL pointer. Fix this by checking for a valid next.
> > >
> > > Actually, put_prev_task() should be used only within kernel/sched/
> > > and, in theory, you should have done a dequeue_task() before
> > > put_prev_task() in this scenario, so SCX_TASK_QUEUED shouldn't be set in
> p->scx.flags.
> > >
> > > The change might still make sense, but can you clarify how you
> > > triggered the NULL pointer dereference?
> > >
> > > Thanks,
> > > -Andrea
> > >
> > > >
> > > > Signed-off-by: l00013971 <l00013971@hihonor.com>
> > > > ---
> > > >  kernel/sched/ext.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index
> > > > f5133249f..6a579babd 100644
> > > > --- a/kernel/sched/ext.c
> > > > +++ b/kernel/sched/ext.c
> > > > @@ -3262,7 +3262,7 @@ static void put_prev_task_scx(struct rq *rq,
> > > > struct
> > > task_struct *p,
> > > >  		 * ops.enqueue() that @p is the only one available for this cpu,
> > > >  		 * which should trigger an explicit follow-up scheduling event.
> > > >  		 */
> > > > -		if (sched_class_above(&ext_sched_class, next->sched_class)) {
> > > > +		if (next && sched_class_above(&ext_sched_class,
> > > > +next->sched_class)) {
> > > >
> 	WARN_ON_ONCE(!static_branch_unlikely(&scx_ops_enq_last));
> > > >  			do_enqueue_task(rq, p, SCX_ENQ_LAST, -1);
> > > >  		} else {
> > > > --
> > > > 2.17.1

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2025-06-11  7:18 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-09 11:36 [PATCH] sched_ext: Fix NULL pointer dereferences in put_prev_task_scx liuwenfang
2025-06-09 12:09 ` Andrea Righi
2025-06-10  6:22   ` 回复: " liuwenfang
2025-06-11  6:37     ` Andrea Righi
2025-06-11  7:17       ` liuwenfang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).