* [PATCH v3] sched: do not call __put_task_struct() on rt if pi_blocked_on is set
@ 2025-04-10 12:10 Luis Claudio R. Goncalves
  2025-04-10 12:20 ` Sebastian Andrzej Siewior
  2025-04-10 12:40 ` Peter Zijlstra
  0 siblings, 2 replies; 5+ messages in thread
From: Luis Claudio R. Goncalves @ 2025-04-10 12:10 UTC
  To: Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt,
	Tejun Heo, David Vernet, Barret Rhoden, Josh Don, Crystal Wood,
	linux-kernel, linux-rt-devel, Juri Lelli, Ben Segall,
	Dietmar Eggemann, Ingo Molnar, Mel Gorman, Peter Zijlstra,
	Valentin Schneider, Vincent Guittot, Thomas Gleixner
  Cc: lclaudio00

With PREEMPT_RT enabled, some of the calls to put_task_struct() coming
from rt_mutex_adjust_prio_chain() could happen in preemptible context and
with a mutex enqueued. That could lead to this sequence:

	rt_mutex_adjust_prio_chain()
	  put_task_struct()
	    __put_task_struct()
	      sched_ext_free()
	        spin_lock_irqsave()
	          rtlock_lock() --->  TRIGGERS
	                              lockdep_assert(!current->pi_blocked_on);

Fix that by unconditionally resorting to the deferred call to
__put_task_struct().

v2: (Rostedt) remove the #ifdef from put_task_struct() and create
    tsk_is_pi_blocked_on() in sched.h to make the change cleaner.

v3: (Sebastian and PeterZ) always call the RCU deferred __put_task_struct().

Suggested-by: Crystal Wood <crwood@redhat.com>
Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
---
 include/linux/sched/task.h |   20 +++++---------------
 1 file changed, 5 insertions(+), 15 deletions(-)

diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index 0f2aeb37bbb04..49847efe5559e 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -134,22 +134,12 @@ static inline void put_task_struct(struct task_struct *t)
 		return;
 
 	/*
-	 * In !RT, it is always safe to call __put_task_struct().
-	 * Under RT, we can only call it in preemptible context.
-	 */
-	if (!IS_ENABLED(CONFIG_PREEMPT_RT) || preemptible()) {
-		static DEFINE_WAIT_OVERRIDE_MAP(put_task_map, LD_WAIT_SLEEP);
-
-		lock_map_acquire_try(&put_task_map);
-		__put_task_struct(t);
-		lock_map_release(&put_task_map);
-		return;
-	}
-
-	/*
-	 * under PREEMPT_RT, we can't call put_task_struct
+	 * In !RT, it is always safe to call __put_task_struct(),
+	 * but under PREEMPT_RT, we can't call put_task_struct
 	 * in atomic context because it will indirectly
-	 * acquire sleeping locks.
+	 * acquire sleeping locks. The same is true if the
+	 * current process has a mutex enqueued (blocked on
+	 * a PI chain).
 	 *
 	 * call_rcu() will schedule delayed_put_task_struct_rcu()
 	 * to be called in process context.
-- 
2.49.0



* Re: [PATCH v3] sched: do not call __put_task_struct() on rt if pi_blocked_on is set
  2025-04-10 12:10 [PATCH v3] sched: do not call __put_task_struct() on rt if pi_blocked_on is set Luis Claudio R. Goncalves
@ 2025-04-10 12:20 ` Sebastian Andrzej Siewior
  2025-04-10 12:39   ` Luis Claudio R. Goncalves
  2025-04-10 12:40 ` Peter Zijlstra
  1 sibling, 1 reply; 5+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-04-10 12:20 UTC
  To: Luis Claudio R. Goncalves
  Cc: Clark Williams, Steven Rostedt, Tejun Heo, David Vernet,
	Barret Rhoden, Josh Don, Crystal Wood, linux-kernel,
	linux-rt-devel, Juri Lelli, Ben Segall, Dietmar Eggemann,
	Ingo Molnar, Mel Gorman, Peter Zijlstra, Valentin Schneider,
	Vincent Guittot, Thomas Gleixner, lclaudio00

On 2025-04-10 09:10:12 [-0300], Luis Claudio R. Goncalves wrote:
> diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
> --- a/include/linux/sched/task.h
> +++ b/include/linux/sched/task.h
> @@ -134,22 +134,12 @@ static inline void put_task_struct(struct task_struct *t)
>  		return;
>  
>  	/*
> -	 * In !RT, it is always safe to call __put_task_struct().
> -	 * Under RT, we can only call it in preemptible context.
> -	 */
> -	if (!IS_ENABLED(CONFIG_PREEMPT_RT) || preemptible()) {
> -		static DEFINE_WAIT_OVERRIDE_MAP(put_task_map, LD_WAIT_SLEEP);
> -
> -		lock_map_acquire_try(&put_task_map);
> -		__put_task_struct(t);
> -		lock_map_release(&put_task_map);
> -		return;
> -	}
> -
> -	/*
> -	 * under PREEMPT_RT, we can't call put_task_struct
> +	 * In !RT, it is always safe to call __put_task_struct(),
> +	 * but under PREEMPT_RT, we can't call put_task_struct
>  	 * in atomic context because it will indirectly
> -	 * acquire sleeping locks.
> +	 * acquire sleeping locks. The same is true if the
> +	 * current process has a mutex enqueued (blocked on
> +	 * a PI chain).
>  	 *
>  	 * call_rcu() will schedule delayed_put_task_struct_rcu()
>  	 * to be called in process context.

Did you test it with lockdep with and without PREEMPT_RT? It would be
nice to throw some testing on it.
This comment here "call_rcu will schedule bla in process context" is
wrong. It will schedule the callback in softirq context. Unless RCU is
configured to run the callbacks in rcuc/ thread which is the default for
PREEMPT_RT. Also delayed_put_task_struct_rcu() does not exist, imho
never did.

Sebastian


* Re: [PATCH v3] sched: do not call __put_task_struct() on rt if pi_blocked_on is set
  2025-04-10 12:20 ` Sebastian Andrzej Siewior
@ 2025-04-10 12:39   ` Luis Claudio R. Goncalves
  0 siblings, 0 replies; 5+ messages in thread
From: Luis Claudio R. Goncalves @ 2025-04-10 12:39 UTC
  To: Sebastian Andrzej Siewior
  Cc: Clark Williams, Steven Rostedt, Tejun Heo, David Vernet,
	Barret Rhoden, Josh Don, Crystal Wood, linux-kernel,
	linux-rt-devel, Juri Lelli, Ben Segall, Dietmar Eggemann,
	Ingo Molnar, Mel Gorman, Peter Zijlstra, Valentin Schneider,
	Vincent Guittot, Thomas Gleixner

On Thu, Apr 10, 2025 at 02:20:02PM +0200, Sebastian Andrzej Siewior wrote:
> On 2025-04-10 09:10:12 [-0300], Luis Claudio R. Goncalves wrote:
> > diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
> > --- a/include/linux/sched/task.h
> > +++ b/include/linux/sched/task.h
> > @@ -134,22 +134,12 @@ static inline void put_task_struct(struct task_struct *t)
> >  		return;
> >  
> >  	/*
> > -	 * In !RT, it is always safe to call __put_task_struct().
> > -	 * Under RT, we can only call it in preemptible context.
> > -	 */
> > -	if (!IS_ENABLED(CONFIG_PREEMPT_RT) || preemptible()) {
> > -		static DEFINE_WAIT_OVERRIDE_MAP(put_task_map, LD_WAIT_SLEEP);
> > -
> > -		lock_map_acquire_try(&put_task_map);
> > -		__put_task_struct(t);
> > -		lock_map_release(&put_task_map);
> > -		return;
> > -	}
> > -
> > -	/*
> > -	 * under PREEMPT_RT, we can't call put_task_struct
> > +	 * In !RT, it is always safe to call __put_task_struct(),
> > +	 * but under PREEMPT_RT, we can't call put_task_struct
> >  	 * in atomic context because it will indirectly
> > -	 * acquire sleeping locks.
> > +	 * acquire sleeping locks. The same is true if the
> > +	 * current process has a mutex enqueued (blocked on
> > +	 * a PI chain).
> >  	 *
> >  	 * call_rcu() will schedule delayed_put_task_struct_rcu()
> >  	 * to be called in process context.
> 
> Did you test it with lockdep with and without PREEMPT_RT? It would be
> nice to throw some testing on it.

I will re-run the full set of tests on both kernels.

> This comment here "call_rcu will schedule bla in process context" is
> wrong. It will schedule the callback in softirq context. Unless RCU is
> configured to run the callbacks in rcuc/ thread which is the default for
> PREEMPT_RT. Also delayed_put_task_struct_rcu() does not exist, imho
> never did.

I kept the original comment about the call_rcu in process context, but
didn't realize that wouldn't hold true for !RT. Would you prefer I adjust
the comments (for RT vs non-RT and other possibilities) or remove them
entirely?

And I completely missed delayed_put_task_struct_rcu() vs
__put_task_struct_rcu_cb() in the original comment.

Thank you again for the review!
Luis
> 
> Sebastian
> 
---end quoted text---



* Re: [PATCH v3] sched: do not call __put_task_struct() on rt if pi_blocked_on is set
  2025-04-10 12:10 [PATCH v3] sched: do not call __put_task_struct() on rt if pi_blocked_on is set Luis Claudio R. Goncalves
  2025-04-10 12:20 ` Sebastian Andrzej Siewior
@ 2025-04-10 12:40 ` Peter Zijlstra
  2025-04-10 14:32   ` Luis Claudio R. Goncalves
  1 sibling, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2025-04-10 12:40 UTC
  To: Luis Claudio R. Goncalves
  Cc: Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt,
	Tejun Heo, David Vernet, Barret Rhoden, Josh Don, Crystal Wood,
	linux-kernel, linux-rt-devel, Juri Lelli, Ben Segall,
	Dietmar Eggemann, Ingo Molnar, Mel Gorman, Valentin Schneider,
	Vincent Guittot, Thomas Gleixner, lclaudio00

On Thu, Apr 10, 2025 at 09:10:12AM -0300, Luis Claudio R. Goncalves wrote:
> With PREEMPT_RT enabled, some of the calls to put_task_struct() coming
> from rt_mutex_adjust_prio_chain() could happen in preemptible context and
> with a mutex enqueued. That could lead to this sequence:
> 
> 	rt_mutex_adjust_prio_chain()
> 	  put_task_struct()
> 	    __put_task_struct()
> 	      sched_ext_free()
> 	        spin_lock_irqsave()
> 	          rtlock_lock() --->  TRIGGERS
> 	                              lockdep_assert(!current->pi_blocked_on);
> 
> Fix that by unconditionally resorting to the deferred call to
> __put_task_struct().
> 
> v2: (Rostedt) remove the #ifdef from put_task_struct() and create
>     tsk_is_pi_blocked_on() in sched.h to make the change cleaner.
> 
> v3: (Sebastian and PeterZ) always call the RCU deferred __put_task_struct().

Changelog goes below the --- line.

> Suggested-by: Crystal Wood <crwood@redhat.com>
> Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
> ---
>  include/linux/sched/task.h |   20 +++++---------------
>  1 file changed, 5 insertions(+), 15 deletions(-)
> 
> diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
> index 0f2aeb37bbb04..49847efe5559e 100644
> --- a/include/linux/sched/task.h
> +++ b/include/linux/sched/task.h
> @@ -134,22 +134,12 @@ static inline void put_task_struct(struct task_struct *t)
>  		return;
>  
>  	/*
> -	 * In !RT, it is always safe to call __put_task_struct().
> -	 * Under RT, we can only call it in preemptible context.
> -	 */
> -	if (!IS_ENABLED(CONFIG_PREEMPT_RT) || preemptible()) {
> -		static DEFINE_WAIT_OVERRIDE_MAP(put_task_map, LD_WAIT_SLEEP);
> -
> -		lock_map_acquire_try(&put_task_map);
> -		__put_task_struct(t);
> -		lock_map_release(&put_task_map);
> -		return;
> -	}

I don't think you've substantiated why the !PREEMPT_RT case needs to go.

> -
> -	/*
> -	 * under PREEMPT_RT, we can't call put_task_struct
> +	 * In !RT, it is always safe to call __put_task_struct(),
> +	 * but under PREEMPT_RT, we can't call put_task_struct
>  	 * in atomic context because it will indirectly
> -	 * acquire sleeping locks.
> +	 * acquire sleeping locks. The same is true if the
> +	 * current process has a mutex enqueued (blocked on
> +	 * a PI chain).
>  	 *
>  	 * call_rcu() will schedule delayed_put_task_struct_rcu()
>  	 * to be called in process context.
> -- 
> 2.49.0
> 


* Re: [PATCH v3] sched: do not call __put_task_struct() on rt if pi_blocked_on is set
  2025-04-10 12:40 ` Peter Zijlstra
@ 2025-04-10 14:32   ` Luis Claudio R. Goncalves
  0 siblings, 0 replies; 5+ messages in thread
From: Luis Claudio R. Goncalves @ 2025-04-10 14:32 UTC
  To: Peter Zijlstra
  Cc: Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt,
	Tejun Heo, David Vernet, Barret Rhoden, Josh Don, Crystal Wood,
	linux-kernel, linux-rt-devel, Juri Lelli, Ben Segall,
	Dietmar Eggemann, Ingo Molnar, Mel Gorman, Valentin Schneider,
	Vincent Guittot, Thomas Gleixner

On Thu, Apr 10, 2025 at 02:40:59PM +0200, Peter Zijlstra wrote:
> On Thu, Apr 10, 2025 at 09:10:12AM -0300, Luis Claudio R. Goncalves wrote:
> > With PREEMPT_RT enabled, some of the calls to put_task_struct() coming
> > from rt_mutex_adjust_prio_chain() could happen in preemptible context and
> > with a mutex enqueued. That could lead to this sequence:
> > 
> > 	rt_mutex_adjust_prio_chain()
> > 	  put_task_struct()
> > 	    __put_task_struct()
> > 	      sched_ext_free()
> > 	        spin_lock_irqsave()
> > 	          rtlock_lock() --->  TRIGGERS
> > 	                              lockdep_assert(!current->pi_blocked_on);
> > 
> > Fix that by unconditionally resorting to the deferred call to
> > __put_task_struct().
> > 
> > v2: (Rostedt) remove the #ifdef from put_task_struct() and create
> >     tsk_is_pi_blocked_on() in sched.h to make the change cleaner.
> > 
> > v3: (Sebastian and PeterZ) always call the RCU deferred __put_task_struct().
> 
> Changelog goes below the --- line.
> 
> > Suggested-by: Crystal Wood <crwood@redhat.com>
> > Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
> > ---
> >  include/linux/sched/task.h |   20 +++++---------------
> >  1 file changed, 5 insertions(+), 15 deletions(-)
> > 
> > diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
> > index 0f2aeb37bbb04..49847efe5559e 100644
> > --- a/include/linux/sched/task.h
> > +++ b/include/linux/sched/task.h
> > @@ -134,22 +134,12 @@ static inline void put_task_struct(struct task_struct *t)
> >  		return;
> >  
> >  	/*
> > -	 * In !RT, it is always safe to call __put_task_struct().
> > -	 * Under RT, we can only call it in preemptible context.
> > -	 */
> > -	if (!IS_ENABLED(CONFIG_PREEMPT_RT) || preemptible()) {
> > -		static DEFINE_WAIT_OVERRIDE_MAP(put_task_map, LD_WAIT_SLEEP);
> > -
> > -		lock_map_acquire_try(&put_task_map);
> > -		__put_task_struct(t);
> > -		lock_map_release(&put_task_map);
> > -		return;
> > -	}
> 
> I don't think you've substantiated why the !PREEMPT_RT case needs to go.

That was my misunderstanding of "unconditionally call the deferred
function". I see I took it too far and made the patch wrong.

I am testing v4 (closer to the original code with fixed comments) that is
basically:

	if !RT
		__put_task_struct (original code)
	else
		call_rcu(__put_task_struct_rcu_cb)

With the corrected comments Sebastian pointed out.
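
To make the intended v4 control flow concrete, here is a rough userspace
model of it. This is only a sketch and not kernel code: model_task,
model_put_task(), model_defer_free() and model_run_deferred() are
hypothetical names made up for illustration, a plain int stands in for the
task refcount, a simple linked list stands in for call_rcu(), and the
preempt_rt flag stands in for IS_ENABLED(CONFIG_PREEMPT_RT).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct task_struct (illustration only). */
struct model_task {
	int usage;                        /* models the task reference count */
	bool freed;                       /* set once the final release ran */
	struct model_task *next_deferred; /* link in the deferred-release list */
};

/* Models the list of pending call_rcu() callbacks. */
static struct model_task *deferred_head;

/* Models __put_task_struct(): the final release, which may take locks. */
static void model_free_now(struct model_task *t)
{
	t->freed = true;
}

/* Models call_rcu(..., __put_task_struct_rcu_cb): queue, do not release. */
static void model_defer_free(struct model_task *t)
{
	t->next_deferred = deferred_head;
	deferred_head = t;
}

/* Models the later callback run; returns how many tasks were released. */
static int model_run_deferred(void)
{
	int released = 0;

	while (deferred_head) {
		struct model_task *t = deferred_head;

		deferred_head = t->next_deferred;
		t->next_deferred = NULL;
		model_free_now(t);
		released++;
	}
	return released;
}

/* Models the v4 shape of put_task_struct(): direct on !RT, deferred on RT. */
static void model_put_task(struct model_task *t, bool preempt_rt)
{
	assert(t->usage > 0);

	/* Only the last reference triggers a release. */
	if (--t->usage != 0)
		return;

	if (!preempt_rt) {
		/* !RT: releasing from the caller's context is always safe. */
		model_free_now(t);
		return;
	}

	/* RT: never release from the caller's context, always defer. */
	model_defer_free(t);
}
```

In this model, dropping the last reference with preempt_rt == false marks
the task freed immediately, while with preempt_rt == true the task stays
unfreed until model_run_deferred() is called. That is the property v4 is
after: on RT the release no longer runs in the context of a task that may
have pi_blocked_on set.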

As soon as the tests complete I will post v4.

Thanks,
Luis
 
> > -
> > -	/*
> > -	 * under PREEMPT_RT, we can't call put_task_struct
> > +	 * In !RT, it is always safe to call __put_task_struct(),
> > +	 * but under PREEMPT_RT, we can't call put_task_struct
> >  	 * in atomic context because it will indirectly
> > -	 * acquire sleeping locks.
> > +	 * acquire sleeping locks. The same is true if the
> > +	 * current process has a mutex enqueued (blocked on
> > +	 * a PI chain).
> >  	 *
> >  	 * call_rcu() will schedule delayed_put_task_struct_rcu()
> >  	 * to be called in process context.
> > -- 
> > 2.49.0
> > 
> 
---end quoted text---


