* [PATCH v8] kernel/fork: beware of __put_task_struct calling context
@ 2023-05-15 16:22 Wander Lairson Costa
2023-05-15 16:43 ` Oleg Nesterov
0 siblings, 1 reply; 5+ messages in thread
From: Wander Lairson Costa @ 2023-05-15 16:22 UTC (permalink / raw)
To: Christian Brauner (Microsoft), Michael S. Tsirkin, Mike Christie,
Peter Zijlstra, Guo Ren, Kefeng Wang, Wander Lairson Costa,
Oleg Nesterov, Andrew Morton, Liam R. Howlett, Suren Baghdasaryan,
Mathieu Desnoyers, Nicholas Piggin, Andrei Vagin, open list
Cc: Hu Chunyu, Valentin Schneider, Sebastian Andrzej Siewior,
Paul McKenney, Steven Rostedt
Under PREEMPT_RT, __put_task_struct() indirectly acquires sleeping
locks. Therefore, it can't be called from an non-preemptible context.
One practical example is splat inside inactive_task_timer(), which is
called in a interrupt context:
CPU: 1 PID: 2848 Comm: life Kdump: loaded Tainted: G W ---------
Hardware name: HP ProLiant DL388p Gen8, BIOS P70 07/15/2012
Call Trace:
dump_stack_lvl+0x57/0x7d
mark_lock_irq.cold+0x33/0xba
? stack_trace_save+0x4b/0x70
? save_trace+0x55/0x150
mark_lock+0x1e7/0x400
mark_usage+0x11d/0x140
__lock_acquire+0x30d/0x930
lock_acquire.part.0+0x9c/0x210
? refill_obj_stock+0x3d/0x3a0
? rcu_read_lock_sched_held+0x3f/0x70
? trace_lock_acquire+0x38/0x140
? lock_acquire+0x30/0x80
? refill_obj_stock+0x3d/0x3a0
rt_spin_lock+0x27/0xe0
? refill_obj_stock+0x3d/0x3a0
refill_obj_stock+0x3d/0x3a0
? inactive_task_timer+0x1ad/0x340
kmem_cache_free+0x357/0x560
inactive_task_timer+0x1ad/0x340
? switched_from_dl+0x2d0/0x2d0
__run_hrtimer+0x8a/0x1a0
__hrtimer_run_queues+0x91/0x130
hrtimer_interrupt+0x10f/0x220
__sysvec_apic_timer_interrupt+0x7b/0xd0
sysvec_apic_timer_interrupt+0x4f/0xd0
? asm_sysvec_apic_timer_interrupt+0xa/0x20
asm_sysvec_apic_timer_interrupt+0x12/0x20
RIP: 0033:0x7fff196bf6f5
Instead of calling __put_task_struct() directly, we defer it using
call_rcu(). A more natural approach would use a workqueue, but since
in PREEMPT_RT, we can't allocate dynamic memory from atomic context,
the code would become more complex because we would need to put the
work_struct instance in the task_struct and initialize it when we
allocate a new task_struct.
Changelog
=========
v1:
* Initial implementation fixing the splat.
v2:
* Isolate the logic in its own function.
* Fix two more cases caught in review.
v3:
* Change __put_task_struct() to handle the issue internally.
v4:
* Explain why call_rcu() is safe to call from interrupt context.
v5:
* Explain why __put_task_struct() doesn't conflict with
put_task_sruct_rcu_user.
v6:
* As per Sebastian's review, revert back the implementation of v2
with a distinct function.
* Add a check in put_task_struct() to warning when called from a
non-sleepable context.
* Address more call sites.
v7:
* Fix typos.
* Add an explanation why the new function doesn't conflict with
delayed_free_task().
v8:
* Bring back v5.
* Fix coding style.
Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Reported-by: Hu Chunyu <chuhu@redhat.com>
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Suggested-by: Valentin Schneider <vschneid@redhat.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Paul McKenney <paulmck@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
---
include/linux/sched/task.h | 31 ++++++++++++++++++++++++++++++-
kernel/fork.c | 12 ++++++++++--
2 files changed, 40 insertions(+), 3 deletions(-)
diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index 537cbf9a2ade..d39223bca0e0 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -118,7 +118,36 @@ static inline struct task_struct *get_task_struct(struct task_struct *t)
return t;
}
-extern void __put_task_struct(struct task_struct *t);
+extern void ___put_task_struct(struct task_struct *t);
+extern void __put_task_struct_rcu_cb(struct rcu_head *rhp);
+
+static inline void __put_task_struct(struct task_struct *tsk)
+{
+ /*
+ * under PREEMPT_RT, we can't call put_task_struct
+ * in atomic context because it will indirectly
+ * acquire sleeping locks.
+ *
+ * call_rcu() will schedule delayed_put_task_struct_rcu()
+ * to be called in process context.
+ *
+ * __put_task_struct() is called when
+ * refcount_dec_and_test(&t->usage) succeeds.
+ *
+ * This means that it can't "conflict" with
+ * put_task_struct_rcu_user() which abuses ->rcu the same
+ * way; rcu_users has a reference so task->usage can't be
+ * zero after rcu_users 1 -> 0 transition.
+ *
+ * delayed_free_task() also uses ->rcu, but it is only called
+ * when it fails to fork a process. Therefore, there is no
+ * way it can conflict with put_task_struct().
+ */
+ if (IS_ENABLED(CONFIG_PREEMPT_RT) && !preemptible())
+ call_rcu(&tsk->rcu, __put_task_struct_rcu_cb);
+ else
+ ___put_task_struct(tsk);
+}
static inline void put_task_struct(struct task_struct *t)
{
diff --git a/kernel/fork.c b/kernel/fork.c
index ed4e01daccaa..1ac043135489 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -969,7 +969,7 @@ static inline void put_signal_struct(struct signal_struct *sig)
free_signal_struct(sig);
}
-void __put_task_struct(struct task_struct *tsk)
+void ___put_task_struct(struct task_struct *tsk)
{
WARN_ON(!tsk->exit_state);
WARN_ON(refcount_read(&tsk->usage));
@@ -986,7 +986,15 @@ void __put_task_struct(struct task_struct *tsk)
sched_core_free(tsk);
free_task(tsk);
}
-EXPORT_SYMBOL_GPL(__put_task_struct);
+EXPORT_SYMBOL_GPL(___put_task_struct);
+
+void __put_task_struct_rcu_cb(struct rcu_head *rhp)
+{
+ struct task_struct *task = container_of(rhp, struct task_struct, rcu);
+
+ ___put_task_struct(task);
+}
+EXPORT_SYMBOL_GPL(__put_task_struct_rcu_cb);
void __init __weak arch_task_cache_init(void) { }
--
2.40.1
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH v8] kernel/fork: beware of __put_task_struct calling context
2023-05-15 16:22 [PATCH v8] kernel/fork: beware of __put_task_struct calling context Wander Lairson Costa
@ 2023-05-15 16:43 ` Oleg Nesterov
2023-05-15 17:14 ` Wander Lairson Costa
0 siblings, 1 reply; 5+ messages in thread
From: Oleg Nesterov @ 2023-05-15 16:43 UTC (permalink / raw)
To: Wander Lairson Costa
Cc: Christian Brauner (Microsoft), Michael S. Tsirkin, Mike Christie,
Peter Zijlstra, Guo Ren, Kefeng Wang, Andrew Morton,
Liam R. Howlett, Suren Baghdasaryan, Mathieu Desnoyers,
Nicholas Piggin, Andrei Vagin, open list, Hu Chunyu,
Valentin Schneider, Sebastian Andrzej Siewior, Paul McKenney,
Steven Rostedt
Certainly I have missed something...
but,
On 05/15, Wander Lairson Costa wrote:
>
> -extern void __put_task_struct(struct task_struct *t);
> +extern void ___put_task_struct(struct task_struct *t);
> +extern void __put_task_struct_rcu_cb(struct rcu_head *rhp);
I don't understand these renames, why can't you simply put this fix
into put_task_struct() ?
but this is minor,
> +static inline void __put_task_struct(struct task_struct *tsk)
> +{
...
> + if (IS_ENABLED(CONFIG_PREEMPT_RT) && !preemptible())
> + call_rcu(&tsk->rcu, __put_task_struct_rcu_cb);
> + else
> + ___put_task_struct(tsk);
> +}
did you see the emails from Peter? In particular, this one:
https://lore.kernel.org/lkml/20230505133902.GC38236@hirez.programming.kicks-ass.net/
Oleg.
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH v8] kernel/fork: beware of __put_task_struct calling context
2023-05-15 16:43 ` Oleg Nesterov
@ 2023-05-15 17:14 ` Wander Lairson Costa
2023-05-15 18:08 ` Oleg Nesterov
0 siblings, 1 reply; 5+ messages in thread
From: Wander Lairson Costa @ 2023-05-15 17:14 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Christian Brauner (Microsoft), Michael S. Tsirkin, Mike Christie,
Peter Zijlstra, Guo Ren, Kefeng Wang, Andrew Morton,
Liam R. Howlett, Suren Baghdasaryan, Mathieu Desnoyers,
Nicholas Piggin, Andrei Vagin, open list, Hu Chunyu,
Valentin Schneider, Sebastian Andrzej Siewior, Paul McKenney,
Steven Rostedt
On Mon, May 15, 2023 at 1:43 PM Oleg Nesterov <oleg@redhat.com> wrote:
>
> Certainly I have missed something...
>
> but,
>
> On 05/15, Wander Lairson Costa wrote:
> >
> > -extern void __put_task_struct(struct task_struct *t);
> > +extern void ___put_task_struct(struct task_struct *t);
> > +extern void __put_task_struct_rcu_cb(struct rcu_head *rhp);
>
> I don't understand these renames, why can't you simply put this fix
> into put_task_struct() ?
>
No particular reason, it was just a matter of style and keep the parts simple.
> but this is minor,
>
> > +static inline void __put_task_struct(struct task_struct *tsk)
> > +{
> ...
> > + if (IS_ENABLED(CONFIG_PREEMPT_RT) && !preemptible())
> > + call_rcu(&tsk->rcu, __put_task_struct_rcu_cb);
> > + else
> > + ___put_task_struct(tsk);
> > +}
>
> did you see the emails from Peter? In particular, this one:
>
> https://lore.kernel.org/lkml/20230505133902.GC38236@hirez.programming.kicks-ass.net/
>
I didn't notice the lock_acquire/lock_release part. However, I tested
the patch with CONFIG_PROVE_RAW_LOCK_NESTING and there was no warning.
> Oleg.
>
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH v8] kernel/fork: beware of __put_task_struct calling context
2023-05-15 17:14 ` Wander Lairson Costa
@ 2023-05-15 18:08 ` Oleg Nesterov
2023-05-15 18:58 ` Wander Lairson Costa
0 siblings, 1 reply; 5+ messages in thread
From: Oleg Nesterov @ 2023-05-15 18:08 UTC (permalink / raw)
To: Wander Lairson Costa
Cc: Christian Brauner (Microsoft), Michael S. Tsirkin, Mike Christie,
Peter Zijlstra, Guo Ren, Kefeng Wang, Andrew Morton,
Liam R. Howlett, Suren Baghdasaryan, Mathieu Desnoyers,
Nicholas Piggin, Andrei Vagin, open list, Hu Chunyu,
Valentin Schneider, Sebastian Andrzej Siewior, Paul McKenney,
Steven Rostedt
On 05/15, Wander Lairson Costa wrote:
> On Mon, May 15, 2023 at 1:43 PM Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > Certainly I have missed something...
> >
> > but,
> >
> > On 05/15, Wander Lairson Costa wrote:
> > >
> > > -extern void __put_task_struct(struct task_struct *t);
> > > +extern void ___put_task_struct(struct task_struct *t);
> > > +extern void __put_task_struct_rcu_cb(struct rcu_head *rhp);
> >
> > I don't understand these renames, why can't you simply put this fix
> > into put_task_struct() ?
> >
>
> No particular reason, it was just a matter of style and keep the parts simple.
Well, to me a single/simple change in put_task_struct() makes more
sense, but I won't argue.
static inline void put_task_struct(struct task_struct *t)
{
if (!refcount_dec_and_test(...))
return;
if (IS_ENABLED(PREEMPT_RT) && ...)
return call_rcu(...);
...
__put_task_struct();
...
}
> > > +static inline void __put_task_struct(struct task_struct *tsk)
> > > +{
> > ...
> > > + if (IS_ENABLED(CONFIG_PREEMPT_RT) && !preemptible())
> > > + call_rcu(&tsk->rcu, __put_task_struct_rcu_cb);
> > > + else
> > > + ___put_task_struct(tsk);
> > > +}
> >
> > did you see the emails from Peter? In particular, this one:
> >
> > https://lore.kernel.org/lkml/20230505133902.GC38236@hirez.programming.kicks-ass.net/
> >
>
> I didn't notice the lock_acquire/lock_release part. However, I tested
> the patch with CONFIG_PROVE_RAW_LOCK_NESTING and there was no warning.
Hmm. I tend to trust the Sebastian's analysis in
https://lore.kernel.org/all/Y+zFNrCjBn53%2F+Q2@linutronix.de/
I'll try to look at it later, although I hope Sebastian or Peter
can explain this before I try ;)
Oleg.
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH v8] kernel/fork: beware of __put_task_struct calling context
2023-05-15 18:08 ` Oleg Nesterov
@ 2023-05-15 18:58 ` Wander Lairson Costa
0 siblings, 0 replies; 5+ messages in thread
From: Wander Lairson Costa @ 2023-05-15 18:58 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Christian Brauner (Microsoft), Michael S. Tsirkin, Mike Christie,
Peter Zijlstra, Guo Ren, Kefeng Wang, Andrew Morton,
Liam R. Howlett, Suren Baghdasaryan, Mathieu Desnoyers,
Nicholas Piggin, Andrei Vagin, open list, Hu Chunyu,
Valentin Schneider, Sebastian Andrzej Siewior, Paul McKenney,
Steven Rostedt
On Mon, May 15, 2023 at 3:09 PM Oleg Nesterov <oleg@redhat.com> wrote:
>
> On 05/15, Wander Lairson Costa wrote:
> > On Mon, May 15, 2023 at 1:43 PM Oleg Nesterov <oleg@redhat.com> wrote:
> > >
> > > Certainly I have missed something...
> > >
> > > but,
> > >
> > > On 05/15, Wander Lairson Costa wrote:
> > > >
> > > > -extern void __put_task_struct(struct task_struct *t);
> > > > +extern void ___put_task_struct(struct task_struct *t);
> > > > +extern void __put_task_struct_rcu_cb(struct rcu_head *rhp);
> > >
> > > I don't understand these renames, why can't you simply put this fix
> > > into put_task_struct() ?
> > >
> >
> > No particular reason, it was just a matter of style and keep the parts simple.
>
> Well, to me a single/simple change in put_task_struct() makes more
> sense, but I won't argue.
>
My initial thought was to break the code in smaller functions, but
maybe just changing put_task_struct() could be better.
> static inline void put_task_struct(struct task_struct *t)
> {
> if (!refcount_dec_and_test(...))
> return;
>
> if (IS_ENABLED(PREEMPT_RT) && ...)
> return call_rcu(...);
>
> ...
> __put_task_struct();
> ...
> }
>
> > > > +static inline void __put_task_struct(struct task_struct *tsk)
> > > > +{
> > > ...
> > > > + if (IS_ENABLED(CONFIG_PREEMPT_RT) && !preemptible())
> > > > + call_rcu(&tsk->rcu, __put_task_struct_rcu_cb);
> > > > + else
> > > > + ___put_task_struct(tsk);
> > > > +}
> > >
> > > did you see the emails from Peter? In particular, this one:
> > >
> > > https://lore.kernel.org/lkml/20230505133902.GC38236@hirez.programming.kicks-ass.net/
> > >
> >
> > I didn't notice the lock_acquire/lock_release part. However, I tested
> > the patch with CONFIG_PROVE_RAW_LOCK_NESTING and there was no warning.
>
> Hmm. I tend to trust the Sebastian's analysis in
>
> https://lore.kernel.org/all/Y+zFNrCjBn53%2F+Q2@linutronix.de/
>
> I'll try to look at it later, although I hope Sebastian or Peter
> can explain this before I try ;)
>
The inability to see and reproduce the potential issue is part of my
confusion about addressing this property.
> Oleg.
>
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2023-05-15 18:59 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-05-15 16:22 [PATCH v8] kernel/fork: beware of __put_task_struct calling context Wander Lairson Costa
2023-05-15 16:43 ` Oleg Nesterov
2023-05-15 17:14 ` Wander Lairson Costa
2023-05-15 18:08 ` Oleg Nesterov
2023-05-15 18:58 ` Wander Lairson Costa
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox