* [PATCH 0/3] [GIT PULL][urgent] function-graph: memory leak and race fixes
@ 2009-06-02 18:30 Steven Rostedt
2009-06-02 18:30 ` [PATCH 1/3] function-graph: only allocate init tasks if it was not already done Steven Rostedt
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Steven Rostedt @ 2009-06-02 18:30 UTC (permalink / raw)
To: linux-kernel
Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker, stable,
Luis Claudio R. Goncalves, Oleg Nesterov
Ingo,
During testing of the -rt patch, Luis found a bug with the function
profiler (not in 30-rc and I sent the patch for you already), that
he could trigger a crash when enabling and disabling the function
profiler in a loop.
During my investigation of this, I tried enabling and disabling the
function graph tracer in a loop and was able to also crash the kernel.
This crash was not the same as the profiler crash. Looking into this
I found a few problems with the enabling of the function graph tracer.
One was a memory leak, the other two were races on SMP machines.
The races were more likely the cause of the crashes I saw. With these
patches applied, I no longer can produce the crash.
I've also Cc'd the stable team since these bugs also exist in 2.6.29.
Please pull the latest tip/tracing/urgent tree, which can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git
tip/tracing/urgent
Steven Rostedt (3):
function-graph: only allocate init tasks if it was not already done
function-graph: enable the stack after initialization of other variables
function-graph: add memory barriers for accessing task's ret_stack
---
kernel/trace/ftrace.c | 23 +++++++++++++++--------
kernel/trace/trace_functions_graph.c | 6 ++++++
2 files changed, 21 insertions(+), 8 deletions(-)
--
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/3] function-graph: only allocate init tasks if it was not already done
2009-06-02 18:30 [PATCH 0/3] [GIT PULL][urgent] function-graph: memory leak and race fixes Steven Rostedt
@ 2009-06-02 18:30 ` Steven Rostedt
2009-06-02 18:30 ` [PATCH 2/3] function-graph: enable the stack after initialization of other variables Steven Rostedt
2009-06-02 18:30 ` [PATCH 3/3] function-graph: add memory barriers for accessing tasks ret_stack Steven Rostedt
2 siblings, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2009-06-02 18:30 UTC (permalink / raw)
To: linux-kernel
Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker, stable,
Luis Claudio R. Goncalves, Oleg Nesterov
[-- Attachment #1: 0001-function-graph-only-allocate-init-tasks-if-it-was-no.patch --]
From: Steven Rostedt <srostedt@redhat.com>
When the function graph tracer is enabled, it performs for the init (idle)
tasks the same initialization that is normally done for every newly created
task. The problem is that this runs every time the function graph tracer is
enabled, so the ret_stack is allocated for the idle tasks again each time,
and the old ret_stack is lost, creating a memory leak.
This is also dangerous: if an interrupt happens on another CPU while an
init task's ret_stack is being replaced, all the return pointers for that
interrupt are lost, and a crash takes place.
[ Impact: fix memory leak and possible crash due to race ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
kernel/trace/ftrace.c | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index f1ed080..ebff62e 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2643,8 +2643,10 @@ static int start_graph_tracing(void)
return -ENOMEM;
/* The cpu_boot init_task->ret_stack will never be freed */
- for_each_online_cpu(cpu)
- ftrace_graph_init_task(idle_task(cpu));
+ for_each_online_cpu(cpu) {
+ if (!idle_task(cpu)->ret_stack)
+ ftrace_graph_init_task(idle_task(cpu));
+ }
do {
ret = alloc_retstack_tasklist(ret_stack_list);
--
1.6.3.1
--
* [PATCH 2/3] function-graph: enable the stack after initialization of other variables
2009-06-02 18:30 [PATCH 0/3] [GIT PULL][urgent] function-graph: memory leak and race fixes Steven Rostedt
2009-06-02 18:30 ` [PATCH 1/3] function-graph: only allocate init tasks if it was not already done Steven Rostedt
@ 2009-06-02 18:30 ` Steven Rostedt
2009-06-02 19:02 ` Frederic Weisbecker
2009-06-02 18:30 ` [PATCH 3/3] function-graph: add memory barriers for accessing tasks ret_stack Steven Rostedt
2 siblings, 1 reply; 7+ messages in thread
From: Steven Rostedt @ 2009-06-02 18:30 UTC (permalink / raw)
To: linux-kernel
Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker, stable,
Luis Claudio R. Goncalves, Oleg Nesterov
[-- Attachment #1: 0002-function-graph-enable-the-stack-after-initialization.patch --]
From: Steven Rostedt <srostedt@redhat.com>
The function graph tracer checks whether the task_struct has a ret_stack
defined to know whether it is safe to use. The initialization for existing
tasks is done by one process, but the idle tasks go through the same
initialization path that is used for newly created tasks.
If an interrupt happens on an idle task that has just had its ret_stack
created, but before the rest of the initialization has taken place, the
return addresses of the traced functions can be corrupted.
This patch moves the setting of the task_struct's ret_stack to after
the other variables have been initialized.
[ Impact: prevent kernel panic on idle task when starting function graph ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
kernel/trace/ftrace.c | 9 +++++++--
kernel/trace/trace_functions_graph.c | 6 ++++++
2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index ebff62e..20e0660 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2739,15 +2739,20 @@ void unregister_ftrace_graph(void)
void ftrace_graph_init_task(struct task_struct *t)
{
if (atomic_read(&ftrace_graph_active)) {
- t->ret_stack = kmalloc(FTRACE_RETFUNC_DEPTH
+ struct ftrace_ret_stack *ret_stack;
+
+ ret_stack = kmalloc(FTRACE_RETFUNC_DEPTH
* sizeof(struct ftrace_ret_stack),
GFP_KERNEL);
- if (!t->ret_stack)
+ if (!ret_stack)
return;
t->curr_ret_stack = -1;
atomic_set(&t->tracing_graph_pause, 0);
atomic_set(&t->trace_overrun, 0);
t->ftrace_timestamp = 0;
+ /* make curr_ret_stack visible before we add the ret_stack */
+ smp_wmb();
+ t->ret_stack = ret_stack;
} else
t->ret_stack = NULL;
}
diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index d28687e..baeb5fe 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -65,6 +65,12 @@ ftrace_push_return_trace(unsigned long ret, unsigned long func, int *depth)
if (!current->ret_stack)
return -EBUSY;
+ /*
+ * We must make sure the ret_stack is tested before we read
+ * anything else.
+ */
+ smp_rmb();
+
/* The return trace stack is full */
if (current->curr_ret_stack == FTRACE_RETFUNC_DEPTH - 1) {
atomic_inc(&current->trace_overrun);
--
1.6.3.1
--
* [PATCH 3/3] function-graph: add memory barriers for accessing tasks ret_stack
2009-06-02 18:30 [PATCH 0/3] [GIT PULL][urgent] function-graph: memory leak and race fixes Steven Rostedt
2009-06-02 18:30 ` [PATCH 1/3] function-graph: only allocate init tasks if it was not already done Steven Rostedt
2009-06-02 18:30 ` [PATCH 2/3] function-graph: enable the stack after initialization of other variables Steven Rostedt
@ 2009-06-02 18:30 ` Steven Rostedt
2 siblings, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2009-06-02 18:30 UTC (permalink / raw)
To: linux-kernel
Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker, stable,
Luis Claudio R. Goncalves, Oleg Nesterov
[-- Attachment #1: 0003-function-graph-add-memory-barriers-for-accessing-tas.patch --]
From: Steven Rostedt <srostedt@redhat.com>
The code that handles the ret_stack allocation for every task assumes
that only an interrupt can cause issues (even though interrupts are
disabled).
In reality, the code is allocating the ret_stack for tasks that may be
running on other CPUs, and the memory barriers used are not sufficient
to handle this case.
[ Impact: prevent crash due to using of uninitialized ret_stack variables ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
kernel/trace/ftrace.c | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 20e0660..1664d3f 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2580,12 +2580,12 @@ static int alloc_retstack_tasklist(struct ftrace_ret_stack **ret_stack_list)
}
if (t->ret_stack == NULL) {
- t->curr_ret_stack = -1;
- /* Make sure IRQs see the -1 first: */
- barrier();
- t->ret_stack = ret_stack_list[start++];
atomic_set(&t->tracing_graph_pause, 0);
atomic_set(&t->trace_overrun, 0);
+ t->curr_ret_stack = -1;
+ /* Make sure the tasks see the -1 first: */
+ smp_wmb();
+ t->ret_stack = ret_stack_list[start++];
}
} while_each_thread(g, t);
--
1.6.3.1
--
* Re: [PATCH 2/3] function-graph: enable the stack after initialization of other variables
2009-06-02 18:30 ` [PATCH 2/3] function-graph: enable the stack after initialization of other variables Steven Rostedt
@ 2009-06-02 19:02 ` Frederic Weisbecker
2009-06-02 19:30 ` Steven Rostedt
0 siblings, 1 reply; 7+ messages in thread
From: Frederic Weisbecker @ 2009-06-02 19:02 UTC (permalink / raw)
To: Steven Rostedt
Cc: linux-kernel, Ingo Molnar, Andrew Morton, stable,
Luis Claudio R. Goncalves, Oleg Nesterov
On Tue, Jun 02, 2009 at 02:30:38PM -0400, Steven Rostedt wrote:
> From: Steven Rostedt <srostedt@redhat.com>
>
> The function graph tracer checks whether the task_struct has a ret_stack
> defined to know whether it is safe to use. The initialization for existing
> tasks is done by one process, but the idle tasks go through the same
> initialization path that is used for newly created tasks.
>
> If an interrupt happens on an idle task that has just had its ret_stack
> created, but before the rest of the initialization has taken place, the
> return addresses of the traced functions can be corrupted.
>
> This patch moves the setting of the task_struct's ret_stack to after
> the other variables have been initialized.
>
> [ Impact: prevent kernel panic on idle task when starting function graph ]
>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
> kernel/trace/ftrace.c | 9 +++++++--
> kernel/trace/trace_functions_graph.c | 6 ++++++
> 2 files changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index ebff62e..20e0660 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -2739,15 +2739,20 @@ void unregister_ftrace_graph(void)
> void ftrace_graph_init_task(struct task_struct *t)
> {
> if (atomic_read(&ftrace_graph_active)) {
> - t->ret_stack = kmalloc(FTRACE_RETFUNC_DEPTH
> + struct ftrace_ret_stack *ret_stack;
> +
> + ret_stack = kmalloc(FTRACE_RETFUNC_DEPTH
> * sizeof(struct ftrace_ret_stack),
> GFP_KERNEL);
> - if (!t->ret_stack)
> + if (!ret_stack)
> return;
> t->curr_ret_stack = -1;
> atomic_set(&t->tracing_graph_pause, 0);
> atomic_set(&t->trace_overrun, 0);
> t->ftrace_timestamp = 0;
> + /* make curr_ret_stack visible before we add the ret_stack */
> + smp_wmb();
> + t->ret_stack = ret_stack;
> } else
> t->ret_stack = NULL;
> }
Thanks for fixing this.
> diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
> index d28687e..baeb5fe 100644
> --- a/kernel/trace/trace_functions_graph.c
> +++ b/kernel/trace/trace_functions_graph.c
> @@ -65,6 +65,12 @@ ftrace_push_return_trace(unsigned long ret, unsigned long func, int *depth)
> if (!current->ret_stack)
> return -EBUSY;
>
> + /*
> + * We must make sure the ret_stack is tested before we read
> + * anything else.
> + */
> + smp_rmb();
Isn't this part too costly for every traced function?
> +
> /* The return trace stack is full */
> if (current->curr_ret_stack == FTRACE_RETFUNC_DEPTH - 1) {
> atomic_inc(&current->trace_overrun);
> --
> 1.6.3.1
>
> --
* Re: [PATCH 2/3] function-graph: enable the stack after initialization of other variables
2009-06-02 19:02 ` Frederic Weisbecker
@ 2009-06-02 19:30 ` Steven Rostedt
2009-06-03 15:35 ` Paul E. McKenney
0 siblings, 1 reply; 7+ messages in thread
From: Steven Rostedt @ 2009-06-02 19:30 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: LKML, Ingo Molnar, Andrew Morton, stable,
Luis Claudio R. Goncalves, Oleg Nesterov, Paul E. McKenney
On Tue, 2 Jun 2009, Frederic Weisbecker wrote:
> > diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
> > index d28687e..baeb5fe 100644
> > --- a/kernel/trace/trace_functions_graph.c
> > +++ b/kernel/trace/trace_functions_graph.c
> > @@ -65,6 +65,12 @@ ftrace_push_return_trace(unsigned long ret, unsigned long func, int *depth)
> > if (!current->ret_stack)
> > return -EBUSY;
> >
> > + /*
> > + * We must make sure the ret_stack is tested before we read
> > + * anything else.
> > + */
> > + smp_rmb();
>
>
> Isn't this part too costly for every traced function?
I was thinking that, but otherwise we still have the problem. It is a read
barrier which I don't think is as costly as a write or full barrier. But
because we have the if statement, perhaps a "read_barrier_depends" can do?
[ added Paul McKenney because he's good with barriers ]
-- Steve
>
>
> > +
> > /* The return trace stack is full */
> > if (current->curr_ret_stack == FTRACE_RETFUNC_DEPTH - 1) {
> > atomic_inc(&current->trace_overrun);
> > --
> > 1.6.3.1
> >
> > --
>
>
* Re: [PATCH 2/3] function-graph: enable the stack after initialization of other variables
2009-06-02 19:30 ` Steven Rostedt
@ 2009-06-03 15:35 ` Paul E. McKenney
0 siblings, 0 replies; 7+ messages in thread
From: Paul E. McKenney @ 2009-06-03 15:35 UTC (permalink / raw)
To: Steven Rostedt
Cc: Frederic Weisbecker, LKML, Ingo Molnar, Andrew Morton, stable,
Luis Claudio R. Goncalves, Oleg Nesterov
On Tue, Jun 02, 2009 at 03:30:14PM -0400, Steven Rostedt wrote:
>
> On Tue, 2 Jun 2009, Frederic Weisbecker wrote:
> > > diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
> > > index d28687e..baeb5fe 100644
> > > --- a/kernel/trace/trace_functions_graph.c
> > > +++ b/kernel/trace/trace_functions_graph.c
> > > @@ -65,6 +65,12 @@ ftrace_push_return_trace(unsigned long ret, unsigned long func, int *depth)
> > > if (!current->ret_stack)
> > > return -EBUSY;
> > >
> > > + /*
> > > + * We must make sure the ret_stack is tested before we read
> > > + * anything else.
> > > + */
> > > + smp_rmb();
> >
> >
> > Isn't this part too costly for every traced function?
>
> I was thinking that, but otherwise we still have the problem. It is a read
> barrier which I don't think is as costly as a write or full barrier. But
> because we have the if statement, perhaps a "read_barrier_depends" can do?
Although this would work on some CPU architectures (x86, s390, Power,
and perhaps a few others), it would break on those weakly ordered systems
that do not respect control dependencies.
One approach would be to use another level of indirection. If this new
pointer is NULL, you return EBUSY. Make this new pointer reference
the updated fields (tracing_graph_pause, trace_overrun, and so on).
This could point into the same data structure -- it is not necessary to
actually allocate a new chunk of memory.
Then you can use the existing rcu_assign_pointer() and rcu_dereference()
primitives.
Does this approach help at all?
Thanx, Paul
> [ added Paul McKenney because he's good with barriers ]
>
> -- Steve
>
>
> >
> >
> > > +
> > > /* The return trace stack is full */
> > > if (current->curr_ret_stack == FTRACE_RETFUNC_DEPTH - 1) {
> > > atomic_inc(&current->trace_overrun);
> > > --
> > > 1.6.3.1
> > >
> > > --
> >
> >