* [PATCH] tracing/function-return-tracer: don't trace kfree while it frees the return stack
@ 2008-11-23 16:33 Frederic Weisbecker
2008-11-23 16:40 ` Ingo Molnar
0 siblings, 1 reply; 7+ messages in thread
From: Frederic Weisbecker @ 2008-11-23 16:33 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Steven Rostedt, Linux Kernel
Impact: fix a crash
While I killed the cat process, I got sometimes the following (but rare)
crash:
[ 65.689027] Pid: 2969, comm: cat Not tainted (2.6.28-rc6-tip #83) AMILO Li 2727
[ 65.689027] EIP: 0060:[<00000000>] EFLAGS: 00010082 CPU: 1
[ 65.689027] EIP is at 0x0
[ 65.689027] EAX: 00000000 EBX: f66cd780 ECX: c019a64a EDX: f66cd780
[ 65.689027] ESI: 00000286 EDI: f66cd780 EBP: f630be2c ESP: f630be24
[ 65.689027] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 65.689027] Process cat (pid: 2969, ti=f630a000 task=f66cd780 task.ti=f630a000)
[ 65.689027] Stack:
[ 65.689027] 00000012 f630bd54 f630be7c c012c853 00000000 c0133cc9 f66cda54 f630be5c
[ 65.689027] f630be68 f66cda54 f66cd88c f66cd878 f7070000 00000001 f630be90 c0135dbc
[ 65.689027] f614a614 f630be68 f630be68 f65ba200 00000002 f630bf10 f630be90 c012cad6
[ 65.689027] Call Trace:
[ 65.689027] [<c012c853>] ? do_exit+0x603/0x850
[ 65.689027] [<c0133cc9>] ? next_signal+0x9/0x40
[ 65.689027] [<c0135dbc>] ? dequeue_signal+0x8c/0x180
[ 65.689027] [<c012cad6>] ? do_group_exit+0x36/0x90
[ 65.689027] [<c013709c>] ? get_signal_to_deliver+0x20c/0x390
[ 65.689027] [<c0102b69>] ? do_notify_resume+0x99/0x8b0
[ 65.689027] [<c02e6d1a>] ? tty_ldisc_deref+0x5a/0x80
[ 65.689027] [<c014db9b>] ? trace_hardirqs_on+0xb/0x10
[ 65.689027] [<c02e6d1a>] ? tty_ldisc_deref+0x5a/0x80
[ 65.689027] [<c02e39b0>] ? n_tty_write+0x0/0x340
[ 65.689027] [<c02e1812>] ? redirected_tty_write+0x82/0x90
[ 65.689027] [<c019ee99>] ? vfs_write+0x99/0xd0
[ 65.689027] [<c02e1790>] ? redirected_tty_write+0x0/0x90
[ 65.689027] [<c019f342>] ? sys_write+0x42/0x70
[ 65.689027] [<c01035ca>] ? work_notifysig+0x13/0x19
[ 65.689027] Code: Bad EIP value.
[ 65.689027] EIP: [<00000000>] 0x0 SS:ESP 0068:f630be24
This is because on do_exit(), kfree is called to free the return addresses stack
but kfree is traced and stored its return address in this stack.
This patch fixes it.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 90d99fb..ffff7ec 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -1628,8 +1628,9 @@ void ftrace_retfunc_init_task(struct task_struct *t)
void ftrace_retfunc_exit_task(struct task_struct *t)
{
- kfree(t->ret_stack);
+ struct ftrace_ret_stack *ret_stack = t->ret_stack;
t->ret_stack = NULL;
+ kfree(ret_stack);
}
#endif
--
1.5.2.5
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] tracing/function-return-tracer: don't trace kfree while it frees the return stack
2008-11-23 16:33 [PATCH] tracing/function-return-tracer: don't trace kfree while it frees the return stack Frederic Weisbecker
@ 2008-11-23 16:40 ` Ingo Molnar
2008-11-23 16:44 ` Ingo Molnar
2008-11-23 19:24 ` Steven Rostedt
0 siblings, 2 replies; 7+ messages in thread
From: Ingo Molnar @ 2008-11-23 16:40 UTC (permalink / raw)
To: Frederic Weisbecker; +Cc: Steven Rostedt, Linux Kernel
* Frederic Weisbecker <fweisbec@gmail.com> wrote:
> Impact: fix a crash
>
> While I killed the cat process, I got sometimes the following (but
> rare) crash:
>
> [ 65.689027] Pid: 2969, comm: cat Not tainted (2.6.28-rc6-tip #83) AMILO Li 2727
> [ 65.689027] EIP: 0060:[<00000000>] EFLAGS: 00010082 CPU: 1
> [ 65.689027] EIP is at 0x0
> [ 65.689027] EAX: 00000000 EBX: f66cd780 ECX: c019a64a EDX: f66cd780
> [ 65.689027] ESI: 00000286 EDI: f66cd780 EBP: f630be2c ESP: f630be24
> [ 65.689027] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> [ 65.689027] Process cat (pid: 2969, ti=f630a000 task=f66cd780 task.ti=f630a000)
> [ 65.689027] Stack:
> [ 65.689027] 00000012 f630bd54 f630be7c c012c853 00000000 c0133cc9 f66cda54 f630be5c
> [ 65.689027] f630be68 f66cda54 f66cd88c f66cd878 f7070000 00000001 f630be90 c0135dbc
> [ 65.689027] f614a614 f630be68 f630be68 f65ba200 00000002 f630bf10 f630be90 c012cad6
> [ 65.689027] Call Trace:
> [ 65.689027] [<c012c853>] ? do_exit+0x603/0x850
> [ 65.689027] [<c0133cc9>] ? next_signal+0x9/0x40
> [ 65.689027] [<c0135dbc>] ? dequeue_signal+0x8c/0x180
> [ 65.689027] [<c012cad6>] ? do_group_exit+0x36/0x90
> [ 65.689027] [<c013709c>] ? get_signal_to_deliver+0x20c/0x390
> [ 65.689027] [<c0102b69>] ? do_notify_resume+0x99/0x8b0
> [ 65.689027] [<c02e6d1a>] ? tty_ldisc_deref+0x5a/0x80
> [ 65.689027] [<c014db9b>] ? trace_hardirqs_on+0xb/0x10
> [ 65.689027] [<c02e6d1a>] ? tty_ldisc_deref+0x5a/0x80
> [ 65.689027] [<c02e39b0>] ? n_tty_write+0x0/0x340
> [ 65.689027] [<c02e1812>] ? redirected_tty_write+0x82/0x90
> [ 65.689027] [<c019ee99>] ? vfs_write+0x99/0xd0
> [ 65.689027] [<c02e1790>] ? redirected_tty_write+0x0/0x90
> [ 65.689027] [<c019f342>] ? sys_write+0x42/0x70
> [ 65.689027] [<c01035ca>] ? work_notifysig+0x13/0x19
> [ 65.689027] Code: Bad EIP value.
> [ 65.689027] EIP: [<00000000>] 0x0 SS:ESP 0068:f630be24
>
> This is because on do_exit(), kfree is called to free the return addresses stack
> but kfree is traced and stored its return address in this stack.
> This patch fixes it.
>
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
>
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 90d99fb..ffff7ec 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -1628,8 +1628,9 @@ void ftrace_retfunc_init_task(struct task_struct *t)
>
> void ftrace_retfunc_exit_task(struct task_struct *t)
> {
> - kfree(t->ret_stack);
> + struct ftrace_ret_stack *ret_stack = t->ret_stack;
> t->ret_stack = NULL;
> + kfree(ret_stack);
heh, nice one :)
note that we also need to keep gcc from reordering things here (no
matter how unlikely in this particular case).
(also, small detail: we prefer a newline after variable definitions.)
Full commit attached below.
Ingo
-------------->
>From eae849ca034c7f1015f0a6f17421ebc737f0a069 Mon Sep 17 00:00:00 2001
From: Frederic Weisbecker <fweisbec@gmail.com>
Date: Sun, 23 Nov 2008 17:33:12 +0100
Subject: [PATCH] tracing/function-return-tracer: don't trace kfree while it frees the return stack
Impact: fix a crash
While I killed the cat process, I got sometimes the following (but rare)
crash:
[ 65.689027] Pid: 2969, comm: cat Not tainted (2.6.28-rc6-tip #83) AMILO Li 2727
[ 65.689027] EIP: 0060:[<00000000>] EFLAGS: 00010082 CPU: 1
[ 65.689027] EIP is at 0x0
[ 65.689027] EAX: 00000000 EBX: f66cd780 ECX: c019a64a EDX: f66cd780
[ 65.689027] ESI: 00000286 EDI: f66cd780 EBP: f630be2c ESP: f630be24
[ 65.689027] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 65.689027] Process cat (pid: 2969, ti=f630a000 task=f66cd780 task.ti=f630a000)
[ 65.689027] Stack:
[ 65.689027] 00000012 f630bd54 f630be7c c012c853 00000000 c0133cc9 f66cda54 f630be5c
[ 65.689027] f630be68 f66cda54 f66cd88c f66cd878 f7070000 00000001 f630be90 c0135dbc
[ 65.689027] f614a614 f630be68 f630be68 f65ba200 00000002 f630bf10 f630be90 c012cad6
[ 65.689027] Call Trace:
[ 65.689027] [<c012c853>] ? do_exit+0x603/0x850
[ 65.689027] [<c0133cc9>] ? next_signal+0x9/0x40
[ 65.689027] [<c0135dbc>] ? dequeue_signal+0x8c/0x180
[ 65.689027] [<c012cad6>] ? do_group_exit+0x36/0x90
[ 65.689027] [<c013709c>] ? get_signal_to_deliver+0x20c/0x390
[ 65.689027] [<c0102b69>] ? do_notify_resume+0x99/0x8b0
[ 65.689027] [<c02e6d1a>] ? tty_ldisc_deref+0x5a/0x80
[ 65.689027] [<c014db9b>] ? trace_hardirqs_on+0xb/0x10
[ 65.689027] [<c02e6d1a>] ? tty_ldisc_deref+0x5a/0x80
[ 65.689027] [<c02e39b0>] ? n_tty_write+0x0/0x340
[ 65.689027] [<c02e1812>] ? redirected_tty_write+0x82/0x90
[ 65.689027] [<c019ee99>] ? vfs_write+0x99/0xd0
[ 65.689027] [<c02e1790>] ? redirected_tty_write+0x0/0x90
[ 65.689027] [<c019f342>] ? sys_write+0x42/0x70
[ 65.689027] [<c01035ca>] ? work_notifysig+0x13/0x19
[ 65.689027] Code: Bad EIP value.
[ 65.689027] EIP: [<00000000>] 0x0 SS:ESP 0068:f630be24
This is because on do_exit(), kfree is called to free the return addresses stack
but kfree is traced and stored its return address in this stack.
This patch fixes it.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
kernel/trace/ftrace.c | 7 ++++++-
1 files changed, 6 insertions(+), 1 deletions(-)
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 90d99fb..53042f1 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -1628,8 +1628,13 @@ void ftrace_retfunc_init_task(struct task_struct *t)
void ftrace_retfunc_exit_task(struct task_struct *t)
{
- kfree(t->ret_stack);
+ struct ftrace_ret_stack *ret_stack = t->ret_stack;
+
t->ret_stack = NULL;
+ /* NULL must become visible to IRQs before we free it: */
+ barrier();
+
+ kfree(ret_stack);
}
#endif
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] tracing/function-return-tracer: don't trace kfree while it frees the return stack
2008-11-23 16:40 ` Ingo Molnar
@ 2008-11-23 16:44 ` Ingo Molnar
2008-11-23 17:43 ` Frederic Weisbecker
2008-11-23 19:24 ` Steven Rostedt
1 sibling, 1 reply; 7+ messages in thread
From: Ingo Molnar @ 2008-11-23 16:44 UTC (permalink / raw)
To: Frederic Weisbecker; +Cc: Steven Rostedt, Linux Kernel
* Ingo Molnar <mingo@elte.hu> wrote:
> void ftrace_retfunc_exit_task(struct task_struct *t)
> {
> - kfree(t->ret_stack);
> + struct ftrace_ret_stack *ret_stack = t->ret_stack;
> +
> t->ret_stack = NULL;
> + /* NULL must become visible to IRQs before we free it: */
> + barrier();
> +
> + kfree(ret_stack);
> }
hm, this reminds me - this is the wrong place for the callback - the
call stack is freed too early.
instead of freeing it in do_exit() (like it's done right now), it
should be freed when the task struct and thread info is freed: in
kernel/fork.c:free_task() - okay?
The difference is minor but could allow more complete tracing: as
right now we'll skip the entries that get generated when we schedule
away from a dead task.
Ingo
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] tracing/function-return-tracer: don't trace kfree while it frees the return stack
2008-11-23 16:44 ` Ingo Molnar
@ 2008-11-23 17:43 ` Frederic Weisbecker
0 siblings, 0 replies; 7+ messages in thread
From: Frederic Weisbecker @ 2008-11-23 17:43 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Steven Rostedt, Linux Kernel
Ingo Molnar a écrit :
> note that we also need to keep gcc from reordering things here (no
> matter how unlikely in this particular case).
>
> (also, small detail: we prefer a newline after variable definitions.)
>
> Full commit attached below.
>
I guessed that since tsk->ret_stack is referenced for the last time before the call to kfree,
the registers to memory flushing would be done before the call.
But you're right, I should prevent from possible compiler strange behaviours, thanks!
Ingo Molnar a écrit :
> * Ingo Molnar <mingo@elte.hu> wrote:
>
>> void ftrace_retfunc_exit_task(struct task_struct *t)
>> {
>> - kfree(t->ret_stack);
>> + struct ftrace_ret_stack *ret_stack = t->ret_stack;
>> +
>> t->ret_stack = NULL;
>> + /* NULL must become visible to IRQs before we free it: */
>> + barrier();
>> +
>> + kfree(ret_stack);
>> }
>
> hm, this reminds me - this is the wrong place for the callback - the
> call stack is freed too early.
>
> instead of freeing it in do_exit() (like it's done right now), it
> should be freed when the task struct and thread info is freed: in
> kernel/fork.c:free_task() - okay?
>
> The difference is minor but could allow more complete tracing: as
> right now we'll skip the entries that get generated when we schedule
> away from a dead task.
>
> Ingo
That's right. Even if the noret code path will not be traced, we are loosing some traces.
That should be fixed with the patch below (didn't tested widely but seems good).
--
>From a7e143e42808f66a7128c7de322ebd6733b7cd17 Mon Sep 17 00:00:00 2001
From: Frederic Weisbecker <fweisbec@gmail.com>
Date: Sun, 23 Nov 2008 18:30:01 +0100
Subject: [PATCH] tracing/function-return-tracer: Free the return stack on free_task()
Impact: avoid loosing some traces when a task is freed
do_exit() is not the last function called when a task finishes.
There are still some functions which are to be called such as free_task().
So we delay the freeing of the return stack to the last moment.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
diff --git a/kernel/exit.c b/kernel/exit.c
index c8334ed..cb24037 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -47,7 +47,6 @@
#include <linux/task_io_accounting_ops.h>
#include <linux/tracehook.h>
#include <trace/sched.h>
-#include <linux/ftrace.h>
#include <asm/uaccess.h>
#include <asm/unistd.h>
@@ -1125,7 +1124,6 @@ NORET_TYPE void do_exit(long code)
preempt_disable();
/* causes final put_task_struct in finish_task_switch(). */
tsk->state = TASK_DEAD;
- ftrace_retfunc_exit_task(tsk);
schedule();
BUG();
/* Avoid "noreturn function does return". */
diff --git a/kernel/fork.c b/kernel/fork.c
index 354d3f0..d183739 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -141,6 +141,7 @@ void free_task(struct task_struct *tsk)
prop_local_destroy_single(&tsk->dirties);
free_thread_info(tsk->stack);
rt_mutex_debug_task_free(tsk);
+ ftrace_retfunc_exit_task(tsk);
free_task_struct(tsk);
}
EXPORT_SYMBOL(free_task);
--
1.5.2.5
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] tracing/function-return-tracer: don't trace kfree while it frees the return stack
2008-11-23 16:40 ` Ingo Molnar
2008-11-23 16:44 ` Ingo Molnar
@ 2008-11-23 19:24 ` Steven Rostedt
2008-11-23 19:35 ` Ingo Molnar
1 sibling, 1 reply; 7+ messages in thread
From: Steven Rostedt @ 2008-11-23 19:24 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Frederic Weisbecker, Linux Kernel
On Sun, 23 Nov 2008, Ingo Molnar wrote:
> >
> > diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> > index 90d99fb..ffff7ec 100644
> > --- a/kernel/trace/ftrace.c
> > +++ b/kernel/trace/ftrace.c
> > @@ -1628,8 +1628,9 @@ void ftrace_retfunc_init_task(struct task_struct *t)
> >
> > void ftrace_retfunc_exit_task(struct task_struct *t)
> > {
> > - kfree(t->ret_stack);
> > + struct ftrace_ret_stack *ret_stack = t->ret_stack;
> > t->ret_stack = NULL;
> > + kfree(ret_stack);
>
> heh, nice one :)
>
> note that we also need to keep gcc from reordering things here (no
> matter how unlikely in this particular case).
I first thought that too, but thinking about it, if gcc does do that, then
it will break the logic for a correct C program.
t is passed in as a pointer, then it modifies the contents of t (which
could be a global pointer), then it calls a external function, that might
also reference the global pointer.
This means that if it were to reorder the two, it would break C, because
the compiler can not assume that the called function will read the global
pointer either.
In other words, the compiler should not need to worry about SMP or
modifications done by interrupts or other threads. But the compiler should
always preserve the order that is assumed by a single context.
-- Steve
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] tracing/function-return-tracer: don't trace kfree while it frees the return stack
2008-11-23 19:24 ` Steven Rostedt
@ 2008-11-23 19:35 ` Ingo Molnar
2008-11-23 19:48 ` Steven Rostedt
0 siblings, 1 reply; 7+ messages in thread
From: Ingo Molnar @ 2008-11-23 19:35 UTC (permalink / raw)
To: Steven Rostedt; +Cc: Frederic Weisbecker, Linux Kernel
* Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Sun, 23 Nov 2008, Ingo Molnar wrote:
> > >
> > > diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> > > index 90d99fb..ffff7ec 100644
> > > --- a/kernel/trace/ftrace.c
> > > +++ b/kernel/trace/ftrace.c
> > > @@ -1628,8 +1628,9 @@ void ftrace_retfunc_init_task(struct task_struct *t)
> > >
> > > void ftrace_retfunc_exit_task(struct task_struct *t)
> > > {
> > > - kfree(t->ret_stack);
> > > + struct ftrace_ret_stack *ret_stack = t->ret_stack;
> > > t->ret_stack = NULL;
> > > + kfree(ret_stack);
> >
> > heh, nice one :)
> >
> > note that we also need to keep gcc from reordering things here (no
> > matter how unlikely in this particular case).
>
> I first thought that too, but thinking about it, if gcc does do that, then
> it will break the logic for a correct C program.
>
> t is passed in as a pointer, then it modifies the contents of t
> (which could be a global pointer), then it calls a external
> function, that might also reference the global pointer.
>
> This means that if it were to reorder the two, it would break C,
> because the compiler can not assume that the called function will
> read the global pointer either.
>
> In other words, the compiler should not need to worry about SMP or
> modifications done by interrupts or other threads. But the compiler
> should always preserve the order that is assumed by a single
> context.
Correct, but this assumes that kfree is a C function. Which it might
not necessarily be: it could be optimized via an inline in certain
cases, etc. It's best to document such cases explicitly.
In any case, the real solution is what i suggested in the previous
mail, to do the freeing from the task-struct freeing path in
kernel/fork.c:free_task() - that has other advantages as well.
Ingo
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] tracing/function-return-tracer: don't trace kfree while it frees the return stack
2008-11-23 19:35 ` Ingo Molnar
@ 2008-11-23 19:48 ` Steven Rostedt
0 siblings, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2008-11-23 19:48 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Frederic Weisbecker, Linux Kernel
On Sun, 23 Nov 2008, Ingo Molnar wrote:
> > >
> > > note that we also need to keep gcc from reordering things here (no
> > > matter how unlikely in this particular case).
> >
> > I first thought that too, but thinking about it, if gcc does do that, then
> > it will break the logic for a correct C program.
> >
> > t is passed in as a pointer, then it modifies the contents of t
> > (which could be a global pointer), then it calls a external
> > function, that might also reference the global pointer.
> >
> > This means that if it were to reorder the two, it would break C,
> > because the compiler can not assume that the called function will
> > read the global pointer either.
> >
> > In other words, the compiler should not need to worry about SMP or
> > modifications done by interrupts or other threads. But the compiler
> > should always preserve the order that is assumed by a single
> > context.
>
> Correct, but this assumes that kfree is a C function. Which it might
> not necessarily be: it could be optimized via an inline in certain
> cases, etc. It's best to document such cases explicitly.
Yeah, I thought about kfree being optimized out somehow, but thinking
about what kfree does, it seems difficult to imagine how that could
happen.
>
> In any case, the real solution is what i suggested in the previous
> mail, to do the freeing from the task-struct freeing path in
> kernel/fork.c:free_task() - that has other advantages as well.
Yeah, but sometimes it's good to talk about quirks of a compiler, even on
obsoleted situations ;-)
-- Steve
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2008-11-23 19:49 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-11-23 16:33 [PATCH] tracing/function-return-tracer: don't trace kfree while it frees the return stack Frederic Weisbecker
2008-11-23 16:40 ` Ingo Molnar
2008-11-23 16:44 ` Ingo Molnar
2008-11-23 17:43 ` Frederic Weisbecker
2008-11-23 19:24 ` Steven Rostedt
2008-11-23 19:35 ` Ingo Molnar
2008-11-23 19:48 ` Steven Rostedt
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.