* [PATCH v5] sched: do not call __put_task_struct() on rt if pi_blocked_on is set
From: Luis Claudio R. Goncalves @ 2025-06-17 14:00 UTC (permalink / raw)
To: Peter Zijlstra, Sebastian Andrzej Siewior, Clark Williams,
Steven Rostedt, Tejun Heo, David Vernet, Barret Rhoden, Josh Don,
Crystal Wood, linux-kernel, linux-rt-devel, Juri Lelli,
Ben Segall, Dietmar Eggemann, Ingo Molnar, Mel Gorman,
Valentin Schneider, Vincent Guittot, Thomas Gleixner,
Wander Lairson Costa

With PREEMPT_RT enabled, some of the calls to put_task_struct() coming
from rt_mutex_adjust_prio_chain() could happen in preemptible context and
with a mutex enqueued. That could lead to this sequence:

        rt_mutex_adjust_prio_chain()
          put_task_struct()
            __put_task_struct()
              sched_ext_free()
                spin_lock_irqsave()
                  rtlock_lock() --->  TRIGGERS
                                      lockdep_assert(!current->pi_blocked_on);

Fix that by unconditionally resorting to the deferred call to
__put_task_struct() if PREEMPT_RT is enabled.

Suggested-by: Crystal Wood <crwood@redhat.com>
Fixes: 893cdaaa3977 ("sched: avoid false lockdep splat in put_task_struct()")
Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
---

v2: (Rostedt) remove the #ifdef from put_task_struct() and create
    tsk_is_pi_blocked_on() in sched.h to make the change cleaner.
v3: (Sebastian, PeterZ) always call the deferred __put_task_struct() on RT.
v4: Fix the implementation of what was requested on v3.
v5: Add the "Fixes:" tag.

 include/linux/sched/task.h | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index 0f2aeb37bbb04..51678a541477a 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -134,11 +134,8 @@ static inline void put_task_struct(struct task_struct *t)
 	if (!refcount_dec_and_test(&t->usage))
 		return;

-	/*
-	 * In !RT, it is always safe to call __put_task_struct().
-	 * Under RT, we can only call it in preemptible context.
-	 */
-	if (!IS_ENABLED(CONFIG_PREEMPT_RT) || preemptible()) {
+	/* In !RT, it is always safe to call __put_task_struct(). */
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT)) {
 		static DEFINE_WAIT_OVERRIDE_MAP(put_task_map, LD_WAIT_SLEEP);

 		lock_map_acquire_try(&put_task_map);
@@ -148,11 +145,13 @@ static inline void put_task_struct(struct task_struct *t)
 	}

 	/*
-	 * under PREEMPT_RT, we can't call put_task_struct
+	 * Under PREEMPT_RT, we can't call __put_task_struct
 	 * in atomic context because it will indirectly
-	 * acquire sleeping locks.
+	 * acquire sleeping locks. The same is true if the
+	 * current process has a mutex enqueued (blocked on
+	 * a PI chain).
 	 *
-	 * call_rcu() will schedule delayed_put_task_struct_rcu()
+	 * call_rcu() will schedule __put_task_struct_rcu_cb()
 	 * to be called in process context.
 	 *
 	 * __put_task_struct() is called when
@@ -165,7 +164,7 @@ static inline void put_task_struct(struct task_struct *t)
 	 *
 	 * delayed_free_task() also uses ->rcu, but it is only called
 	 * when it fails to fork a process. Therefore, there is no
-	 * way it can conflict with put_task_struct().
+	 * way it can conflict with __put_task_struct().
 	 */
 	call_rcu(&t->rcu, __put_task_struct_rcu_cb);
 }

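
The change to the condition in put_task_struct() can be modelled in plain
userspace C. This is only a sketch under stated assumptions, not kernel
code: struct put_ctx, choose_path_before(), choose_path_after(),
would_trip_assert() and check_all_contexts() are hypothetical names made up
for illustration, and the three booleans merely stand in for
IS_ENABLED(CONFIG_PREEMPT_RT), preemptible() and current->pi_blocked_on
being non-NULL.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum put_path {
	PUT_DIRECT,	/* models calling __put_task_struct() right away */
	PUT_DEFERRED,	/* models handing the free off to call_rcu() */
};

struct put_ctx {
	bool preempt_rt;	/* stands in for IS_ENABLED(CONFIG_PREEMPT_RT) */
	bool preemptible;	/* stands in for preemptible() */
	bool pi_blocked_on;	/* stands in for current->pi_blocked_on != NULL */
};

/* Condition before the patch: direct call on !RT, or on RT when preemptible. */
static enum put_path choose_path_before(const struct put_ctx *c)
{
	return (!c->preempt_rt || c->preemptible) ? PUT_DIRECT : PUT_DEFERRED;
}

/* Condition after the patch: direct call only on !RT. */
static enum put_path choose_path_after(const struct put_ctx *c)
{
	return !c->preempt_rt ? PUT_DIRECT : PUT_DEFERRED;
}

/*
 * Models the assert from the trace: on RT, the direct path takes a
 * sleeping lock, which is not allowed while pi_blocked_on is set.
 */
static bool would_trip_assert(const struct put_ctx *c, enum put_path p)
{
	return c->preempt_rt && c->pi_blocked_on && p == PUT_DIRECT;
}

/* Walk all eight input combinations; the new condition never trips. */
static void check_all_contexts(void)
{
	for (int i = 0; i < 8; i++) {
		struct put_ctx c = {
			.preempt_rt = i & 1,
			.preemptible = i & 2,
			.pi_blocked_on = i & 4,
		};

		assert(!would_trip_assert(&c, choose_path_after(&c)));
	}
}
```

In this model, the situation from the trace (RT enabled, preemptible, and
pi_blocked_on set) makes choose_path_before() pick the direct path and the
modelled assert fire, while choose_path_after() always defers on RT.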
* Re: [PATCH v5] sched: do not call __put_task_struct() on rt if pi_blocked_on is set
From: Wander Lairson Costa @ 2025-06-17 14:07 UTC (permalink / raw)
To: Luis Claudio R. Goncalves
Cc: Peter Zijlstra, Sebastian Andrzej Siewior, Clark Williams,
Steven Rostedt, Tejun Heo, David Vernet, Barret Rhoden, Josh Don,
Crystal Wood, linux-kernel, linux-rt-devel, Juri Lelli,
Ben Segall, Dietmar Eggemann, Ingo Molnar, Mel Gorman,
Valentin Schneider, Vincent Guittot, Thomas Gleixner
Reviewed-by: Wander Lairson Costa <wander@redhat.com>
* Re: [PATCH v5] sched: do not call __put_task_struct() on rt if pi_blocked_on is set
From: Sebastian Andrzej Siewior @ 2025-06-18 7:03 UTC (permalink / raw)
To: Luis Claudio R. Goncalves
Cc: Peter Zijlstra, Clark Williams, Steven Rostedt, Tejun Heo,
David Vernet, Barret Rhoden, Josh Don, Crystal Wood, linux-kernel,
linux-rt-devel, Juri Lelli, Ben Segall, Dietmar Eggemann,
Ingo Molnar, Mel Gorman, Valentin Schneider, Vincent Guittot,
Thomas Gleixner, Wander Lairson Costa

On 2025-06-17 11:00:36 [-0300], Luis Claudio R. Goncalves wrote:
> With PREEMPT_RT enabled, some of the calls to put_task_struct() coming
> from rt_mutex_adjust_prio_chain() could happen in preemptible context and
> with a mutex enqueued. That could lead to this sequence:
>
>         rt_mutex_adjust_prio_chain()
>           put_task_struct()
>             __put_task_struct()
>               sched_ext_free()
>                 spin_lock_irqsave()
>                   rtlock_lock() --->  TRIGGERS
>                                       lockdep_assert(!current->pi_blocked_on);

Maybe with the addition of

| The first case was observed with sched_ext_free().
| Crystal Wood was able to reproduce the problem to __put_task_struct()
| being called during rt_mutex_adjust_prio_chain().

The first sentence will imply a Fixes: with the introduction of
sched_ext. The second implies that the original fix was not complete and
nobody managed to trigger it until now.

> Fix that by unconditionally resorting to the deferred call to
> __put_task_struct() if PREEMPT_RT is enabled.
>
> Suggested-by: Crystal Wood <crwood@redhat.com>
> Fixes: 893cdaaa3977 ("sched: avoid false lockdep splat in put_task_struct()")
> Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>

Sebastian

* Re: [PATCH v5] sched: do not call __put_task_struct() on rt if pi_blocked_on is set
From: Crystal Wood @ 2025-06-25 21:48 UTC (permalink / raw)
To: Sebastian Andrzej Siewior, Luis Claudio R. Goncalves
Cc: Peter Zijlstra, Clark Williams, Steven Rostedt, Tejun Heo,
David Vernet, Barret Rhoden, Josh Don, linux-kernel,
linux-rt-devel, Juri Lelli, Ben Segall, Dietmar Eggemann,
Ingo Molnar, Mel Gorman, Valentin Schneider, Vincent Guittot,
Thomas Gleixner, Wander Lairson Costa

On Wed, 2025-06-18 at 09:03 +0200, Sebastian Andrzej Siewior wrote:
> On 2025-06-17 11:00:36 [-0300], Luis Claudio R. Goncalves wrote:
> > With PREEMPT_RT enabled, some of the calls to put_task_struct() coming
> > from rt_mutex_adjust_prio_chain() could happen in preemptible context and
> > with a mutex enqueued. That could lead to this sequence:
> >
> >         rt_mutex_adjust_prio_chain()
> >           put_task_struct()
> >             __put_task_struct()
> >               sched_ext_free()
> >                 spin_lock_irqsave()
> >                   rtlock_lock() --->  TRIGGERS
> >                                       lockdep_assert(!current->pi_blocked_on);
>
> Maybe with the addition of
>
> > The first case was observed with sched_ext_free().
> > Crystal Wood was able to reproduce the problem to __put_task_struct()
> > being called during rt_mutex_adjust_prio_chain().
>
> The first sentence will imply a Fixes: with the introduction of
> sched_ext. The second implies that the original fix was not complete and
> nobody managed to trigger it until now.

sched_ext_free() just happens to be the first cleanup function called,
so that's where the blowup happens. I think the "nobody managed to
trigger it" was because we didn't have the pi_blocked_on assert until
recently -- and my "other cases with a similar cause" was probably older
kernels with the assert backported, but not sched_ext, so the backtrace
was different.

-Crystal