linux-rt-devel.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] sched: do not call __put_task_struct() on rt if pi_blocked_on is set
@ 2025-04-09 18:58 Luis Claudio R. Goncalves
  2025-04-10  6:48 ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 5+ messages in thread
From: Luis Claudio R. Goncalves @ 2025-04-09 18:58 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt,
	Tejun Heo, David Vernet, Barret Rhoden, Josh Don, Crystal Wood,
	linux-kernel, linux-rt-devel, Juri Lelli, lclaudio00

With PREEMPT_RT enabled, some of the calls to put_task_struct() coming
from rt_mutex_adjust_prio_chain() could happen in preemptible context and
with a mutex enqueued. That could lead to this sequence:

	rt_mutex_adjust_prio_chain()
	  put_task_struct()
	    __put_task_struct()
	      sched_ext_free()
	        spin_lock_irqsave()
	          rtlock_lock() --->  TRIGGERS
	                              lockdep_assert(!current->pi_blocked_on);

Adjust the check in put_task_struct() to also consider pi_blocked_on before
calling __put_task_struct(), resorting to the deferred call in case it is
set.

v2: Rostedt suggested removing the #ifdef from put_task_struct() and
    creating tsk_is_pi_blocked_on() in sched.h to make the change cleaner.

Suggested-by: Crystal Wood <crwood@redhat.com>
Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
---
 include/linux/sched.h      |   12 ++++++++++++
 include/linux/sched/task.h |   10 +++++++---
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5ec93e5ba53a9..9fbfa7f55a83d 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2148,6 +2148,18 @@ static inline bool task_is_runnable(struct task_struct *p)
 	return p->on_rq && !p->se.sched_delayed;
 }
 
+#ifdef CONFIG_RT_MUTEXES
+static inline bool tsk_is_pi_blocked_on(struct task_struct *tsk)
+{
+	return tsk->pi_blocked_on != NULL;
+}
+#else
+static inline bool tsk_is_pi_blocked_on(strut task_struct *tsk)
+{
+	return false;
+}
+#endif
+
 extern bool sched_task_on_rq(struct task_struct *p);
 extern unsigned long get_wchan(struct task_struct *p);
 extern struct task_struct *cpu_curr_snapshot(int cpu);
diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index 0f2aeb37bbb04..1f17a3dd51774 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -135,9 +135,11 @@ static inline void put_task_struct(struct task_struct *t)
 
 	/*
 	 * In !RT, it is always safe to call __put_task_struct().
-	 * Under RT, we can only call it in preemptible context.
+	 * Under RT, we can only call it in preemptible context,
+	 * when not blocked on a PI chain.
 	 */
-	if (!IS_ENABLED(CONFIG_PREEMPT_RT) || preemptible()) {
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT) ||
+	    (preemptible() || !tsk_is_pi_blocked_on(current))) {
 		static DEFINE_WAIT_OVERRIDE_MAP(put_task_map, LD_WAIT_SLEEP);
 
 		lock_map_acquire_try(&put_task_map);
@@ -149,7 +151,9 @@ static inline void put_task_struct(struct task_struct *t)
 	/*
 	 * under PREEMPT_RT, we can't call put_task_struct
 	 * in atomic context because it will indirectly
-	 * acquire sleeping locks.
+	 * acquire sleeping locks. The same is true if the
+	 * current process has a mutex enqueued (blocked on
+	 * a PI chain).
 	 *
 	 * call_rcu() will schedule delayed_put_task_struct_rcu()
 	 * to be called in process context.
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] sched: do not call __put_task_struct() on rt if pi_blocked_on is set
  2025-04-09 18:58 [PATCH v2] sched: do not call __put_task_struct() on rt if pi_blocked_on is set Luis Claudio R. Goncalves
@ 2025-04-10  6:48 ` Sebastian Andrzej Siewior
  2025-04-10  7:51   ` Peter Zijlstra
  0 siblings, 1 reply; 5+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-04-10  6:48 UTC (permalink / raw)
  To: Luis Claudio R. Goncalves
  Cc: Clark Williams, Steven Rostedt, Tejun Heo, David Vernet,
	Barret Rhoden, Josh Don, Crystal Wood, linux-kernel,
	linux-rt-devel, Juri Lelli, lclaudio00, Ben Segall,
	Dietmar Eggemann, Ingo Molnar, Mel Gorman, Peter Zijlstra,
	Valentin Schneider, Vincent Guittot, Thomas Gleixner

+ sched folks.

On 2025-04-09 15:58:32 [-0300], Luis Claudio R. Goncalves wrote:
> With PREEMPT_RT enabled, some of the calls to put_task_struct() coming
> from rt_mutex_adjust_prio_chain() could happen in preemptible context and
> with a mutex enqueued. That could lead to this sequence:
> 
> 	rt_mutex_adjust_prio_chain()
> 	  put_task_struct()
> 	    __put_task_struct()
> 	      sched_ext_free()
> 	        spin_lock_irqsave()
> 	          rtlock_lock() --->  TRIGGERS
> 	                              lockdep_assert(!current->pi_blocked_on);
> 
> Adjust the check in put_task_struct() to also consider pi_blocked_on before
> calling __put_task_struct(), resorting to the deferred call in case it is
> set.
> 
> v2: Rostedt suggested removing the #ifdef from put_task_struct() and
>     creating tsk_is_pi_blocked_on() in sched.h to make the change cleaner.

I complained about this special RT case in put_task_struct() when it was
first got introduced. Couldn't we just just unconditionally do the RCU
put?

> Suggested-by: Crystal Wood <crwood@redhat.com>
> Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
> ---
>  include/linux/sched.h      |   12 ++++++++++++
>  include/linux/sched/task.h |   10 +++++++---
>  2 files changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 5ec93e5ba53a9..9fbfa7f55a83d 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -2148,6 +2148,18 @@ static inline bool task_is_runnable(struct task_struct *p)
>  	return p->on_rq && !p->se.sched_delayed;
>  }
>  
> +#ifdef CONFIG_RT_MUTEXES
> +static inline bool tsk_is_pi_blocked_on(struct task_struct *tsk)
> +{
> +	return tsk->pi_blocked_on != NULL;
> +}
> +#else
> +static inline bool tsk_is_pi_blocked_on(strut task_struct *tsk)
> +{
> +	return false;
> +}
> +#endif
> +
>  extern bool sched_task_on_rq(struct task_struct *p);
>  extern unsigned long get_wchan(struct task_struct *p);
>  extern struct task_struct *cpu_curr_snapshot(int cpu);
> diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
> index 0f2aeb37bbb04..1f17a3dd51774 100644
> --- a/include/linux/sched/task.h
> +++ b/include/linux/sched/task.h
> @@ -135,9 +135,11 @@ static inline void put_task_struct(struct task_struct *t)
>  
>  	/*
>  	 * In !RT, it is always safe to call __put_task_struct().
> -	 * Under RT, we can only call it in preemptible context.
> +	 * Under RT, we can only call it in preemptible context,
> +	 * when not blocked on a PI chain.
>  	 */
> -	if (!IS_ENABLED(CONFIG_PREEMPT_RT) || preemptible()) {
> +	if (!IS_ENABLED(CONFIG_PREEMPT_RT) ||
> +	    (preemptible() || !tsk_is_pi_blocked_on(current))) {
>  		static DEFINE_WAIT_OVERRIDE_MAP(put_task_map, LD_WAIT_SLEEP);
>  
>  		lock_map_acquire_try(&put_task_map);
> @@ -149,7 +151,9 @@ static inline void put_task_struct(struct task_struct *t)
>  	/*
>  	 * under PREEMPT_RT, we can't call put_task_struct
>  	 * in atomic context because it will indirectly
> -	 * acquire sleeping locks.
> +	 * acquire sleeping locks. The same is true if the
> +	 * current process has a mutex enqueued (blocked on
> +	 * a PI chain).
>  	 *
>  	 * call_rcu() will schedule delayed_put_task_struct_rcu()
>  	 * to be called in process context.

Sebastian

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] sched: do not call __put_task_struct() on rt if pi_blocked_on is set
  2025-04-10  6:48 ` Sebastian Andrzej Siewior
@ 2025-04-10  7:51   ` Peter Zijlstra
  2025-04-10 15:32     ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2025-04-10  7:51 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Luis Claudio R. Goncalves, Clark Williams, Steven Rostedt,
	Tejun Heo, David Vernet, Barret Rhoden, Josh Don, Crystal Wood,
	linux-kernel, linux-rt-devel, Juri Lelli, lclaudio00, Ben Segall,
	Dietmar Eggemann, Ingo Molnar, Mel Gorman, Valentin Schneider,
	Vincent Guittot, Thomas Gleixner

On Thu, Apr 10, 2025 at 08:48:44AM +0200, Sebastian Andrzej Siewior wrote:
> + sched folks.
> 
> On 2025-04-09 15:58:32 [-0300], Luis Claudio R. Goncalves wrote:
> > With PREEMPT_RT enabled, some of the calls to put_task_struct() coming
> > from rt_mutex_adjust_prio_chain() could happen in preemptible context and
> > with a mutex enqueued. That could lead to this sequence:
> > 
> > 	rt_mutex_adjust_prio_chain()
> > 	  put_task_struct()
> > 	    __put_task_struct()
> > 	      sched_ext_free()
> > 	        spin_lock_irqsave()
> > 	          rtlock_lock() --->  TRIGGERS
> > 	                              lockdep_assert(!current->pi_blocked_on);
> > 
> > Adjust the check in put_task_struct() to also consider pi_blocked_on before
> > calling __put_task_struct(), resorting to the deferred call in case it is
> > set.
> > 
> > v2: Rostedt suggested removing the #ifdef from put_task_struct() and
> >     creating tsk_is_pi_blocked_on() in sched.h to make the change cleaner.

Oh gawd, this patch makes a sad situation worse.

> I complained about this special RT case in put_task_struct() when it was
> first got introduced. Couldn't we just just unconditionally do the RCU
> put?

Yeah, please make it simpler, not more complex.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] sched: do not call __put_task_struct() on rt if pi_blocked_on is set
  2025-04-10  7:51   ` Peter Zijlstra
@ 2025-04-10 15:32     ` Sebastian Andrzej Siewior
  2025-05-12 19:01       ` Luis Claudio R. Goncalves
  0 siblings, 1 reply; 5+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-04-10 15:32 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Luis Claudio R. Goncalves, Clark Williams, Steven Rostedt,
	Tejun Heo, David Vernet, Barret Rhoden, Josh Don, Crystal Wood,
	linux-kernel, linux-rt-devel, Juri Lelli, lclaudio00, Ben Segall,
	Dietmar Eggemann, Ingo Molnar, Mel Gorman, Valentin Schneider,
	Vincent Guittot, Thomas Gleixner

On 2025-04-10 09:51:03 [+0200], Peter Zijlstra wrote:
> > I complained about this special RT case in put_task_struct() when it was
> > first got introduced. Couldn't we just just unconditionally do the RCU
> > put?
> 
> Yeah, please make it simpler, not more complex.

Just so we clear: simpler as in everyone does call_rcu() or RT does
always call_rcu() and everyone else __put_task_struct()? I mean we would
end up with one call chain I am just not sure how expensive it gets for
!RT.

Sebastian

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] sched: do not call __put_task_struct() on rt if pi_blocked_on is set
  2025-04-10 15:32     ` Sebastian Andrzej Siewior
@ 2025-05-12 19:01       ` Luis Claudio R. Goncalves
  0 siblings, 0 replies; 5+ messages in thread
From: Luis Claudio R. Goncalves @ 2025-05-12 19:01 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Peter Zijlstra, Clark Williams, Steven Rostedt, Tejun Heo,
	David Vernet, Barret Rhoden, Josh Don, Crystal Wood, linux-kernel,
	linux-rt-devel, Juri Lelli, Ben Segall, Dietmar Eggemann,
	Ingo Molnar, Mel Gorman, Valentin Schneider, Vincent Guittot,
	Thomas Gleixner

On Thu, Apr 10, 2025 at 05:32:05PM +0200, Sebastian Andrzej Siewior wrote:
> On 2025-04-10 09:51:03 [+0200], Peter Zijlstra wrote:
> > > I complained about this special RT case in put_task_struct() when it was
> > > first got introduced. Couldn't we just just unconditionally do the RCU
> > > put?
> > 
> > Yeah, please make it simpler, not more complex.
> 
> Just so we clear: simpler as in everyone does call_rcu() or RT does
> always call_rcu() and everyone else __put_task_struct()? I mean we would
> end up with one call chain I am just not sure how expensive it gets for
> !RT.

Sebastian, I implemented the change where put_task_struct() unconditionally
resorted to:

	call_rcu(&t->rcu, __put_task_struct_rcu_cb);

I submitted the kernels I built with that change and a pristine upstream
kenrel to LTP and stress-ng and also ran 'perf bench all'. I built kernels
with and without lockdep and extra debug. All kernels survived the tests
without a scratch and I haven't observed differences in behaviors nor
timings (for the tests that had that information).

What would be a good benchmark to compare the kernels with and without the
put_task_struct() change? I would like to observe whether there is a
penalty or added overhead with the change in place.

Best,
Luis

> Sebastian
> 
---end quoted text---


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2025-05-12 19:02 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-09 18:58 [PATCH v2] sched: do not call __put_task_struct() on rt if pi_blocked_on is set Luis Claudio R. Goncalves
2025-04-10  6:48 ` Sebastian Andrzej Siewior
2025-04-10  7:51   ` Peter Zijlstra
2025-04-10 15:32     ` Sebastian Andrzej Siewior
2025-05-12 19:01       ` Luis Claudio R. Goncalves

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).