* [PATCH 0/5] [GIT PULL][2.6.33] tracing: various updates
From: Steven Rostedt @ 2009-10-13 20:33 UTC (permalink / raw)
To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker
Ingo,
Please pull the latest tip/tracing/core tree, which can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git
tip/tracing/core
Jiri Olsa (2):
tracing: enable records during the module load
tracing: enabling "__cold" functions
Steven Rostedt (2):
function-graph/x86: replace unbalanced ret with jmp
tracing: export symbols for kernel lock tracepoints
jolsa@redhat.com (1):
tracing: support multiple pids in set_pid_ftrace file
---
arch/x86/kernel/entry_32.S | 7 +-
arch/x86/kernel/entry_64.S | 6 +-
kernel/trace/ftrace.c | 272 +++++++++++++++++++++++++++++++-------------
kernel/trace/trace.h | 4 +-
lib/kernel_lock.c | 2 +
scripts/recordmcount.pl | 1 +
6 files changed, 205 insertions(+), 87 deletions(-)
* [PATCH 1/5] function-graph/x86: replace unbalanced ret with jmp
From: Steven Rostedt @ 2009-10-13 20:33 UTC (permalink / raw)
To: linux-kernel
Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker, Mathieu Desnoyers

From: Steven Rostedt <srostedt@redhat.com>

The function graph tracer replaces the return address with a hook to
trace the exit of the function call. This hook will finish by returning
to the real location the function should return to.

But the current implementation uses a ret to jump to the real return
location. This causes an imbalance between calls and rets. That is,
the original function does a call, the ret goes to the handler, and
then the handler does a ret without a matching call.

Although the function graph tracer itself still breaks the branch
predictor by replacing the original ret, using a second ret and causing
an imbalance breaks the predictor even more.

This patch replaces the ret with a jmp to keep the calls and rets
balanced. I tested this on one box and it showed a 1.7% increase in
performance. Another box showed only a small 0.3% increase. But no box
that I tested showed a decrease in performance from this change.
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/kernel/entry_32.S |    7 ++-----
 arch/x86/kernel/entry_64.S |    6 +++---
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index c097e7d..7d52e9d 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -1185,17 +1185,14 @@ END(ftrace_graph_caller)

 .globl return_to_handler
 return_to_handler:
-	pushl $0
 	pushl %eax
-	pushl %ecx
 	pushl %edx
 	movl %ebp, %eax
 	call ftrace_return_to_handler
-	movl %eax, 0xc(%esp)
+	movl %eax, %ecx
 	popl %edx
-	popl %ecx
 	popl %eax
-	ret
+	jmp *%ecx
 #endif

 .section .rodata,"a"
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index b5c061f..bd5bbdd 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -155,11 +155,11 @@ GLOBAL(return_to_handler)

 	call ftrace_return_to_handler

-	movq %rax, 16(%rsp)
+	movq %rax, %rdi
 	movq 8(%rsp), %rdx
 	movq (%rsp), %rax
-	addq $16, %rsp
-	retq
+	addq $24, %rsp
+	jmp *%rdi
 #endif

-- 
1.6.3.3
* Re: [PATCH 1/5] function-graph/x86: replace unbalanced ret with jmp
From: Mathieu Desnoyers @ 2009-10-13 20:47 UTC (permalink / raw)
To: Steven Rostedt
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Frederic Weisbecker

* Steven Rostedt (rostedt@goodmis.org) wrote:
> From: Steven Rostedt <srostedt@redhat.com>
>
> The function graph tracer replaces the return address with a hook to
> trace the exit of the function call. This hook will finish by returning
> to the real location the function should return to.
>
> But the current implementation uses a ret to jump to the real return
> location. This causes an imbalance between calls and rets. That is,
> the original function does a call, the ret goes to the handler, and
> then the handler does a ret without a matching call.
>
> Although the function graph tracer itself still breaks the branch
> predictor by replacing the original ret, using a second ret and causing
> an imbalance breaks the predictor even more.
>
> This patch replaces the ret with a jmp to keep the calls and rets
> balanced. I tested this on one box and it showed a 1.7% increase in
> performance. Another box showed only a small 0.3% increase. But no box
> that I tested showed a decrease in performance from this change.

This sounds exactly like what I proposed at LPC. I'm glad it shows
actual improvements.

Just to make sure I understand, the old sequence was:

	call fct
	call ftrace_entry
	ret to fct
	ret to ftrace_exit
	ret to caller

and you now have:

	call fct
	call ftrace_entry
	ret to fct
	ret to ftrace_exit
	jmp to caller

Am I correct?
Mathieu

> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  arch/x86/kernel/entry_32.S |    7 ++-----
>  arch/x86/kernel/entry_64.S |    6 +++---
>  2 files changed, 5 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
> index c097e7d..7d52e9d 100644
> --- a/arch/x86/kernel/entry_32.S
> +++ b/arch/x86/kernel/entry_32.S
> @@ -1185,17 +1185,14 @@ END(ftrace_graph_caller)
>
>  .globl return_to_handler
>  return_to_handler:
> -	pushl $0
>  	pushl %eax
> -	pushl %ecx
>  	pushl %edx
>  	movl %ebp, %eax
>  	call ftrace_return_to_handler
> -	movl %eax, 0xc(%esp)
> +	movl %eax, %ecx
>  	popl %edx
> -	popl %ecx
>  	popl %eax
> -	ret
> +	jmp *%ecx
>  #endif
>
>  .section .rodata,"a"
> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
> index b5c061f..bd5bbdd 100644
> --- a/arch/x86/kernel/entry_64.S
> +++ b/arch/x86/kernel/entry_64.S
> @@ -155,11 +155,11 @@ GLOBAL(return_to_handler)
>
>  	call ftrace_return_to_handler
>
> -	movq %rax, 16(%rsp)
> +	movq %rax, %rdi
>  	movq 8(%rsp), %rdx
>  	movq (%rsp), %rax
> -	addq $16, %rsp
> -	retq
> +	addq $24, %rsp
> +	jmp *%rdi
>  #endif
>
> --
> 1.6.3.3

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
* Re: [PATCH 1/5] function-graph/x86: replace unbalanced ret with jmp
From: Steven Rostedt @ 2009-10-13 21:10 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Frederic Weisbecker

On Tue, 2009-10-13 at 16:47 -0400, Mathieu Desnoyers wrote:
> This sounds exactly like what I proposed at LPC. I'm glad it shows
> actual improvements.

This is what we discussed at LPC. We both were under the assumption that
a jump would work. The question was how to make that jump without hosing
registers.

We lucked out in that this is the back end of the return sequence, where
we can still clobber callee registers (just not the ones holding the
return code).

> Just to make sure I understand, the old sequence was:
>
>	call fct
>	call ftrace_entry
>	ret to fct
>	ret to ftrace_exit
>	ret to caller
>
> and you now have:
>
>	call fct
>	call ftrace_entry
>	ret to fct
>	ret to ftrace_exit
>	jmp to caller
>
> Am I correct?

Almost.

What it was:

	call function
	function:
		call mcount
		mcount:
			call ftrace_entry
			ftrace_entry:
				mess with return code of caller
				ret
			ret

		[function code]

		ret to ftrace_exit
		ftrace_exit:
			get real return
			ret to original

So for the function we have 3 calls and 4 rets.

What we have now:

	call function
	function:
		call mcount
		mcount:
			call ftrace_entry
			ftrace_entry:
				mess with return code of caller
				ret
			ret

		[function code]

		ret to ftrace_exit
		ftrace_exit:
			get real return
			jmp to original

Now we have 3 calls and 3 rets.

Note the first call still does not match a ret, but we don't do two
rets anymore.

-- Steve
* Re: [PATCH 1/5] function-graph/x86: replace unbalanced ret with jmp
From: Mathieu Desnoyers @ 2009-10-13 21:21 UTC (permalink / raw)
To: Steven Rostedt
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Frederic Weisbecker

* Steven Rostedt (rostedt@goodmis.org) wrote:
> What it was:
>
>	call function
>	function:
>		call mcount
>		mcount:
>			call ftrace_entry

Can we manage to change this call

>			ftrace_entry:
>				mess with return code of caller
>				ret

.. and this ret for 2 jmp instructions too?

Given that we have no choice but to kill the call/ret prediction logic,
I think it might be good to try to use this logic as little as possible
(by favoring jmp/jmp over call/ret when the return target is invariant).

That's just an idea; benchmarks could prove me right/wrong.

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
* Re: [PATCH 1/5] function-graph/x86: replace unbalanced ret with jmp
From: Steven Rostedt @ 2009-10-13 21:26 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Frederic Weisbecker

On Tue, 2009-10-13 at 17:21 -0400, Mathieu Desnoyers wrote:
> > What it was:
> >
> >	call function
> >	function:
> >		call mcount
> >		mcount:
> >			call ftrace_entry
>
> Can we manage to change this call

Note that this call jumps to C code.

> >			ftrace_entry:
> >				mess with return code of caller
> >				ret
>
> .. and this ret for 2 jmp instructions too?

The code is all in C, and it too calls functions. I'm not sure where
this helps out any. The ret here matches their calls, so the prediction
will work.

> Given that we have no choice but to kill the call/ret prediction logic,
> I think it might be good to try to use this logic as little as possible
> (by favoring jmp/jmp over call/ret when the return target is invariant).
>
> That's just an idea; benchmarks could prove me right/wrong.

I don't see how this would help, and I'm not about to waste time
experimenting. What's the rationale?

-- Steve
* Re: [PATCH 1/5] function-graph/x86: replace unbalanced ret with jmp
From: Mathieu Desnoyers @ 2009-10-13 22:32 UTC (permalink / raw)
To: Steven Rostedt
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Frederic Weisbecker

* Steven Rostedt (rostedt@goodmis.org) wrote:
> > Can we manage to change this call
>
> Note that this call jumps to C code.
>
> > .. and this ret for 2 jmp instructions too?
>
> The code is all in C, and it too calls functions. I'm not sure where
> this helps out any. The ret here matches their calls, so the prediction
> will work.

Oh, OK. I thought the callback was in assembly. That's a bit more work
than I thought.

> I don't see how this would help, and I'm not about to waste time
> experimenting. What's the rationale?

The idea is that call/ret are fast when predictions are right. In this
case, I wonder if the fact that we trash the call/ret prediction (even
if this happens after the paired call/ret) would have an impact on
balanced call/ret in the tracing code path. I doubt it, but who knows.

Probably not worth spending much time on this. It would just have been
nice to try if the implementation had been trivial.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
* Re: [PATCH 1/5] function-graph/x86: replace unbalanced ret with jmp
From: Frederic Weisbecker @ 2009-10-13 21:02 UTC (permalink / raw)
To: Steven Rostedt
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Mathieu Desnoyers

On Tue, Oct 13, 2009 at 04:33:50PM -0400, Steven Rostedt wrote:
> But the current implementation uses a ret to jump to the real return
> location. This causes an imbalance between calls and rets. That is,
> the original function does a call, the ret goes to the handler, and
> then the handler does a ret without a matching call.
>
> Although the function graph tracer itself still breaks the branch
> predictor by replacing the original ret, using a second ret and causing
> an imbalance breaks the predictor even more.

I have trouble understanding why it breaks the predictor, especially
since there is no conditional branch in return_to_handler. And I still
don't understand why a ret would break the branch prediction more than
a jmp.
* Re: [PATCH 1/5] function-graph/x86: replace unbalanced ret with jmp
From: Steven Rostedt @ 2009-10-13 21:12 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Mathieu Desnoyers

On Tue, 2009-10-13 at 23:02 +0200, Frederic Weisbecker wrote:
> I have trouble understanding why it breaks the predictor, especially
> since there is no conditional branch in return_to_handler. And I still
> don't understand why a ret would break the branch prediction more than
> a jmp.

Calls are branch-prediction jumps, which associates the ret with the
call. As the CPU approaches the ret, it starts to prefetch the code
after the matching call.

But this works in stack order: every call should hit one ret. With the
original code we break this stack. We have one call and two rets, which
means the branch prediction will also get messed up for the previously
stored call.

-- Steve
* Re: [PATCH 1/5] function-graph/x86: replace unbalanced ret with jmp
From: Frederic Weisbecker @ 2009-10-13 22:26 UTC (permalink / raw)
To: Steven Rostedt
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Mathieu Desnoyers

On Tue, Oct 13, 2009 at 05:12:46PM -0400, Steven Rostedt wrote:
> Calls are branch-prediction jumps, which associates the ret with the
> call. As the CPU approaches the ret, it starts to prefetch the code
> after the matching call.
>
> But this works in stack order: every call should hit one ret. With the
> original code we break this stack. We have one call and two rets, which
> means the branch prediction will also get messed up for the previously
> stored call.

Oh, OK, I got it. Thanks.
* [tip:tracing/core] function-graph/x86: Replace unbalanced ret with jmp
From: tip-bot for Steven Rostedt @ 2009-10-14 6:58 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, mathieu.desnoyers, hpa, mingo, fweisbec, rostedt,
    srostedt, tglx, mingo

Commit-ID:  194ec34184869f0de1cf255c924fc5299e1b3d27
Gitweb:     http://git.kernel.org/tip/194ec34184869f0de1cf255c924fc5299e1b3d27
Author:     Steven Rostedt <srostedt@redhat.com>
AuthorDate: Tue, 13 Oct 2009 16:33:50 -0400
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Wed, 14 Oct 2009 08:13:53 +0200

function-graph/x86: Replace unbalanced ret with jmp

The function graph tracer replaces the return address with a hook to
trace the exit of the function call. This hook will finish by returning
to the real location the function should return to.

But the current implementation uses a ret to jump to the real return
location. This causes an imbalance between calls and rets. That is,
the original function does a call, the ret goes to the handler, and
then the handler does a ret without a matching call.

Although the function graph tracer itself still breaks the branch
predictor by replacing the original ret, using a second ret and causing
an imbalance breaks the predictor even more.

This patch replaces the ret with a jmp to keep the calls and rets
balanced. I tested this on one box and it showed a 1.7% increase in
performance. Another box showed only a small 0.3% increase. But no box
that I tested showed a decrease in performance from this change.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091013203425.042034383@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 arch/x86/kernel/entry_32.S |    7 ++-----
 arch/x86/kernel/entry_64.S |    6 +++---
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index c097e7d..7d52e9d 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -1185,17 +1185,14 @@ END(ftrace_graph_caller)

 .globl return_to_handler
 return_to_handler:
-	pushl $0
 	pushl %eax
-	pushl %ecx
 	pushl %edx
 	movl %ebp, %eax
 	call ftrace_return_to_handler
-	movl %eax, 0xc(%esp)
+	movl %eax, %ecx
 	popl %edx
-	popl %ecx
 	popl %eax
-	ret
+	jmp *%ecx
 #endif

 .section .rodata,"a"
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index b5c061f..bd5bbdd 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -155,11 +155,11 @@ GLOBAL(return_to_handler)

 	call ftrace_return_to_handler

-	movq %rax, 16(%rsp)
+	movq %rax, %rdi
 	movq 8(%rsp), %rdx
 	movq (%rsp), %rax
-	addq $16, %rsp
-	retq
+	addq $24, %rsp
+	jmp *%rdi
 #endif
* [PATCH 2/5] tracing: export symbols for kernel lock tracepoints
From: Steven Rostedt @ 2009-10-13 20:33 UTC (permalink / raw)
To: linux-kernel
Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker

From: Steven Rostedt <srostedt@redhat.com>

The big kernel lock tracepoints are used inside modules, and they need
to be exported accordingly.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 lib/kernel_lock.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/lib/kernel_lock.c b/lib/kernel_lock.c
index 4ebfa5a..76f05bc 100644
--- a/lib/kernel_lock.c
+++ b/lib/kernel_lock.c
@@ -139,3 +139,5 @@ void __lockfunc _unlock_kernel(const char *func, const char *file, int line)
 EXPORT_SYMBOL(_lock_kernel);
 EXPORT_SYMBOL(_unlock_kernel);

+EXPORT_TRACEPOINT_SYMBOL(lock_kernel);
+EXPORT_TRACEPOINT_SYMBOL(unlock_kernel);
-- 
1.6.3.3
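For readers unfamiliar with EXPORT_TRACEPOINT_SYMBOL: a tracepoint defined in built-in code but fired from module code must have its tracepoint object exported, much as an ordinary function called from a module needs EXPORT_SYMBOL. A rough sketch of the shape involved (kernel-style pseudocode of that era's API, illustrative only — not the actual BKL tracepoint definitions):

```c
/* Core kernel (built-in) side. */
#include <linux/tracepoint.h>

DEFINE_TRACE(lock_kernel);              /* instantiate the tracepoint object */
EXPORT_TRACEPOINT_SYMBOL(lock_kernel);  /* let modules link against it       */

/* Module side: the inline trace_lock_kernel(...) call emitted into the
 * module references that exported object; without the export, the module
 * fails to load with an unresolved-symbol error. */
```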
* Re: [PATCH 2/5] tracing: export symbols for kernel lock tracepoints
From: Frederic Weisbecker @ 2009-10-13 20:43 UTC (permalink / raw)
To: Steven Rostedt
Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Tue, Oct 13, 2009 at 04:33:51PM -0400, Steven Rostedt wrote:
> The big kernel lock tracepoints are used inside modules, and they need
> to be exported accordingly.
>
> +EXPORT_TRACEPOINT_SYMBOL(lock_kernel);
> +EXPORT_TRACEPOINT_SYMBOL(unlock_kernel);

I remember when you showed me this patch; it fixed a real issue because
we had the tracepoint call in smp_lock.h:

#define lock_kernel() \
	trace_lock_kernel(__func__, __FILE__, __LINE__); \
	_lock_kernel(); \

So indeed the modules needed this missing export. But I had to move the
tracepoint calls into lib/kernel_lock.c instead, because we had very bad
header dependencies.

So it's not needed anymore.
* Re: [PATCH 2/5] tracing: export symbols for kernel lock tracepoints
From: Steven Rostedt @ 2009-10-13 21:14 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Tue, 2009-10-13 at 22:43 +0200, Frederic Weisbecker wrote:
> So indeed the modules needed this missing export. But I had to move the
> tracepoint calls into lib/kernel_lock.c instead, because we had very bad
> header dependencies.
>
> So it's not needed anymore.

Ah, I missed that change. I've had this queued up for a while.

Ingo, want me to rebase and remove this patch?

-- Steve
* [PATCH 3/5] tracing: support multiple pids in set_pid_ftrace file
From: Steven Rostedt @ 2009-10-13 20:33 UTC (permalink / raw)
To: linux-kernel
Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker, Jiri Olsa

From: jolsa@redhat.com <jolsa@redhat.com>

Add the possibility to set more than one pid in the set_ftrace_pid
file, thus allowing more than one independent process to be traced.

Usage:

sh-4.0# echo 284 > ./set_ftrace_pid
sh-4.0# cat ./set_ftrace_pid
284
sh-4.0# echo 1 >> ./set_ftrace_pid
sh-4.0# echo 0 >> ./set_ftrace_pid
sh-4.0# cat ./set_ftrace_pid
swapper tasks
1
284
sh-4.0# echo 4 > ./set_ftrace_pid
sh-4.0# cat ./set_ftrace_pid
4
sh-4.0# echo > ./set_ftrace_pid
sh-4.0# cat ./set_ftrace_pid
no pid
sh-4.0#

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
LKML-Reference: <1253088900-18506-1-git-send-email-jolsa@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c |  234 ++++++++++++++++++++++++++++++++++---------------
 kernel/trace/trace.h  |    4 +-
 2 files changed, 167 insertions(+), 71 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 45c9659..0c799d1 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -60,6 +60,13 @@ static int last_ftrace_enabled;
 /* Quick disabling of function tracer. */
 int function_trace_stop;

+/* List for set_ftrace_pid's pids. */
+LIST_HEAD(ftrace_pids);
+struct ftrace_pid {
+	struct list_head list;
+	struct pid *pid;
+};
+
 /*
  * ftrace_disabled is set when an anomaly is discovered.
  * ftrace_disabled is much stronger than ftrace_enabled.
@@ -159,7 +166,7 @@ static int __register_ftrace_function(struct ftrace_ops *ops)
 	else
 		func = ftrace_list_func;

-	if (ftrace_pid_trace) {
+	if (!list_empty(&ftrace_pids)) {
 		set_ftrace_pid_function(func);
 		func = ftrace_pid_func;
 	}
@@ -207,7 +214,7 @@ static int __unregister_ftrace_function(struct ftrace_ops *ops)
 	if (ftrace_list->next == &ftrace_list_end) {
 		ftrace_func_t func = ftrace_list->func;

-		if (ftrace_pid_trace) {
+		if (!list_empty(&ftrace_pids)) {
 			set_ftrace_pid_function(func);
 			func = ftrace_pid_func;
 		}
@@ -235,7 +242,7 @@ static void ftrace_update_pid_func(void)
 	func = __ftrace_trace_function;
 #endif

-	if (ftrace_pid_trace) {
+	if (!list_empty(&ftrace_pids)) {
 		set_ftrace_pid_function(func);
 		func = ftrace_pid_func;
 	} else {
@@ -825,8 +832,6 @@ static __init void ftrace_profile_debugfs(struct dentry *d_tracer)
 }
 #endif /* CONFIG_FUNCTION_PROFILER */

-/* set when tracing only a pid */
-struct pid *ftrace_pid_trace;
 static struct pid * const ftrace_swapper_pid = &init_struct_pid;

 #ifdef CONFIG_DYNAMIC_FTRACE
@@ -2758,23 +2763,6 @@ static inline void ftrace_startup_enable(int command) { }
 # define ftrace_shutdown_sysctl()	do { } while (0)
 #endif /* CONFIG_DYNAMIC_FTRACE */

-static ssize_t
-ftrace_pid_read(struct file *file, char __user *ubuf,
-		       size_t cnt, loff_t *ppos)
-{
-	char buf[64];
-	int r;
-
-	if (ftrace_pid_trace == ftrace_swapper_pid)
-		r = sprintf(buf, "swapper tasks\n");
-	else if (ftrace_pid_trace)
-		r = sprintf(buf, "%u\n", pid_vnr(ftrace_pid_trace));
-	else
-		r = sprintf(buf, "no pid\n");
-
-	return simple_read_from_buffer(ubuf, cnt, ppos, buf, r);
-}
-
 static void clear_ftrace_swapper(void)
 {
 	struct task_struct *p;
@@ -2825,14 +2813,12 @@ static void set_ftrace_pid(struct pid *pid)
 	rcu_read_unlock();
 }

-static void clear_ftrace_pid_task(struct pid **pid)
+static void clear_ftrace_pid_task(struct pid *pid)
 {
-	if (*pid == ftrace_swapper_pid)
+	if (pid == ftrace_swapper_pid)
 		clear_ftrace_swapper();
 	else
-		clear_ftrace_pid(*pid);
-
-	*pid = NULL;
+	
clear_ftrace_pid(pid); } static void set_ftrace_pid_task(struct pid *pid) @@ -2843,11 +2829,140 @@ static void set_ftrace_pid_task(struct pid *pid) set_ftrace_pid(pid); } +static int ftrace_pid_add(int p) +{ + struct pid *pid; + struct ftrace_pid *fpid; + int ret = -EINVAL; + + mutex_lock(&ftrace_lock); + + if (!p) + pid = ftrace_swapper_pid; + else + pid = find_get_pid(p); + + if (!pid) + goto out; + + ret = 0; + + list_for_each_entry(fpid, &ftrace_pids, list) + if (fpid->pid == pid) + goto out_put; + + ret = -ENOMEM; + + fpid = kmalloc(sizeof(*fpid), GFP_KERNEL); + if (!fpid) + goto out_put; + + list_add(&fpid->list, &ftrace_pids); + fpid->pid = pid; + + set_ftrace_pid_task(pid); + + ftrace_update_pid_func(); + ftrace_startup_enable(0); + + mutex_unlock(&ftrace_lock); + return 0; + +out_put: + if (pid != ftrace_swapper_pid) + put_pid(pid); + +out: + mutex_unlock(&ftrace_lock); + return ret; +} + +static void ftrace_pid_reset(void) +{ + struct ftrace_pid *fpid, *safe; + + mutex_lock(&ftrace_lock); + list_for_each_entry_safe(fpid, safe, &ftrace_pids, list) { + struct pid *pid = fpid->pid; + + clear_ftrace_pid_task(pid); + + list_del(&fpid->list); + kfree(fpid); + } + + ftrace_update_pid_func(); + ftrace_startup_enable(0); + + mutex_unlock(&ftrace_lock); +} + +static void *fpid_start(struct seq_file *m, loff_t *pos) +{ + mutex_lock(&ftrace_lock); + + if (list_empty(&ftrace_pids) && (!*pos)) + return (void *) 1; + + return seq_list_start(&ftrace_pids, *pos); +} + +static void *fpid_next(struct seq_file *m, void *v, loff_t *pos) +{ + if (v == (void *)1) + return NULL; + + return seq_list_next(v, &ftrace_pids, pos); +} + +static void fpid_stop(struct seq_file *m, void *p) +{ + mutex_unlock(&ftrace_lock); +} + +static int fpid_show(struct seq_file *m, void *v) +{ + const struct ftrace_pid *fpid = list_entry(v, struct ftrace_pid, list); + + if (v == (void *)1) { + seq_printf(m, "no pid\n"); + return 0; + } + + if (fpid->pid == ftrace_swapper_pid) + seq_printf(m, "swapper 
tasks\n"); + else + seq_printf(m, "%u\n", pid_vnr(fpid->pid)); + + return 0; +} + +static const struct seq_operations ftrace_pid_sops = { + .start = fpid_start, + .next = fpid_next, + .stop = fpid_stop, + .show = fpid_show, +}; + +static int +ftrace_pid_open(struct inode *inode, struct file *file) +{ + int ret = 0; + + if ((file->f_mode & FMODE_WRITE) && + (file->f_flags & O_TRUNC)) + ftrace_pid_reset(); + + if (file->f_mode & FMODE_READ) + ret = seq_open(file, &ftrace_pid_sops); + + return ret; +} + static ssize_t ftrace_pid_write(struct file *filp, const char __user *ubuf, size_t cnt, loff_t *ppos) { - struct pid *pid; char buf[64]; long val; int ret; @@ -2860,57 +2975,38 @@ ftrace_pid_write(struct file *filp, const char __user *ubuf, buf[cnt] = 0; + /* + * Allow "echo > set_ftrace_pid" or "echo -n '' > set_ftrace_pid" + * to clean the filter quietly. + */ + strstrip(buf); + if (strlen(buf) == 0) + return 1; + ret = strict_strtol(buf, 10, &val); if (ret < 0) return ret; - mutex_lock(&ftrace_lock); - if (val < 0) { - /* disable pid tracing */ - if (!ftrace_pid_trace) - goto out; - - clear_ftrace_pid_task(&ftrace_pid_trace); - - } else { - /* swapper task is special */ - if (!val) { - pid = ftrace_swapper_pid; - if (pid == ftrace_pid_trace) - goto out; - } else { - pid = find_get_pid(val); - - if (pid == ftrace_pid_trace) { - put_pid(pid); - goto out; - } - } - - if (ftrace_pid_trace) - clear_ftrace_pid_task(&ftrace_pid_trace); - - if (!pid) - goto out; - - ftrace_pid_trace = pid; - - set_ftrace_pid_task(ftrace_pid_trace); - } + ret = ftrace_pid_add(val); - /* update the function call */ - ftrace_update_pid_func(); - ftrace_startup_enable(0); + return ret ? 
ret : cnt; +} - out: - mutex_unlock(&ftrace_lock); +static int +ftrace_pid_release(struct inode *inode, struct file *file) +{ + if (file->f_mode & FMODE_READ) + seq_release(inode, file); - return cnt; + return 0; } static const struct file_operations ftrace_pid_fops = { - .read = ftrace_pid_read, - .write = ftrace_pid_write, + .open = ftrace_pid_open, + .write = ftrace_pid_write, + .read = seq_read, + .llseek = seq_lseek, + .release = ftrace_pid_release, }; static __init int ftrace_init_debugfs(void) diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h index f22a7ac..acef8b4 100644 --- a/kernel/trace/trace.h +++ b/kernel/trace/trace.h @@ -496,12 +496,12 @@ print_graph_function(struct trace_iterator *iter) } #endif /* CONFIG_FUNCTION_GRAPH_TRACER */ -extern struct pid *ftrace_pid_trace; +extern struct list_head ftrace_pids; #ifdef CONFIG_FUNCTION_TRACER static inline int ftrace_trace_task(struct task_struct *task) { - if (!ftrace_pid_trace) + if (list_empty(&ftrace_pids)) return 1; return test_tsk_trace_trace(task); -- 1.6.3.3 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [tip:tracing/core] tracing: Support multiple pids in set_pid_ftrace file 2009-10-13 20:33 ` [PATCH 3/5] [PATCH 3/5] tracing: support multiple pids in set_pid_ftrace file Steven Rostedt @ 2009-10-14 6:58 ` tip-bot for jolsa@redhat.com 0 siblings, 0 replies; 20+ messages in thread From: tip-bot for jolsa@redhat.com @ 2009-10-14 6:58 UTC (permalink / raw) To: linux-tip-commits Cc: linux-kernel, hpa, mingo, jolsa, fweisbec, rostedt, tglx, mingo Commit-ID: 756d17ee7ee4fbc8238bdf97100af63e6ac441ef Gitweb: http://git.kernel.org/tip/756d17ee7ee4fbc8238bdf97100af63e6ac441ef Author: jolsa@redhat.com <jolsa@redhat.com> AuthorDate: Tue, 13 Oct 2009 16:33:52 -0400 Committer: Ingo Molnar <mingo@elte.hu> CommitDate: Wed, 14 Oct 2009 08:13:53 +0200 tracing: Support multiple pids in set_pid_ftrace file Adding the possibility to set more than 1 pid in the set_pid_ftrace file, thus allowing to trace more than 1 independent processes. Usage: sh-4.0# echo 284 > ./set_ftrace_pid sh-4.0# cat ./set_ftrace_pid 284 sh-4.0# echo 1 >> ./set_ftrace_pid sh-4.0# echo 0 >> ./set_ftrace_pid sh-4.0# cat ./set_ftrace_pid swapper tasks 1 284 sh-4.0# echo 4 > ./set_ftrace_pid sh-4.0# cat ./set_ftrace_pid 4 sh-4.0# echo > ./set_ftrace_pid sh-4.0# cat ./set_ftrace_pid no pid sh-4.0# Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20091013203425.565454612@goodmis.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> --- kernel/trace/ftrace.c | 234 ++++++++++++++++++++++++++++++++++--------------- kernel/trace/trace.h | 4 +- 2 files changed, 167 insertions(+), 71 deletions(-) diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index 45c9659..0c799d1 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -60,6 +60,13 @@ static int last_ftrace_enabled; /* Quick disabling of function tracer. */ int function_trace_stop; +/* List for set_ftrace_pid's pids. 
*/ +LIST_HEAD(ftrace_pids); +struct ftrace_pid { + struct list_head list; + struct pid *pid; +}; + /* * ftrace_disabled is set when an anomaly is discovered. * ftrace_disabled is much stronger than ftrace_enabled. @@ -159,7 +166,7 @@ static int __register_ftrace_function(struct ftrace_ops *ops) else func = ftrace_list_func; - if (ftrace_pid_trace) { + if (!list_empty(&ftrace_pids)) { set_ftrace_pid_function(func); func = ftrace_pid_func; } @@ -207,7 +214,7 @@ static int __unregister_ftrace_function(struct ftrace_ops *ops) if (ftrace_list->next == &ftrace_list_end) { ftrace_func_t func = ftrace_list->func; - if (ftrace_pid_trace) { + if (!list_empty(&ftrace_pids)) { set_ftrace_pid_function(func); func = ftrace_pid_func; } @@ -235,7 +242,7 @@ static void ftrace_update_pid_func(void) func = __ftrace_trace_function; #endif - if (ftrace_pid_trace) { + if (!list_empty(&ftrace_pids)) { set_ftrace_pid_function(func); func = ftrace_pid_func; } else { @@ -825,8 +832,6 @@ static __init void ftrace_profile_debugfs(struct dentry *d_tracer) } #endif /* CONFIG_FUNCTION_PROFILER */ -/* set when tracing only a pid */ -struct pid *ftrace_pid_trace; static struct pid * const ftrace_swapper_pid = &init_struct_pid; #ifdef CONFIG_DYNAMIC_FTRACE @@ -2758,23 +2763,6 @@ static inline void ftrace_startup_enable(int command) { } # define ftrace_shutdown_sysctl() do { } while (0) #endif /* CONFIG_DYNAMIC_FTRACE */ -static ssize_t -ftrace_pid_read(struct file *file, char __user *ubuf, - size_t cnt, loff_t *ppos) -{ - char buf[64]; - int r; - - if (ftrace_pid_trace == ftrace_swapper_pid) - r = sprintf(buf, "swapper tasks\n"); - else if (ftrace_pid_trace) - r = sprintf(buf, "%u\n", pid_vnr(ftrace_pid_trace)); - else - r = sprintf(buf, "no pid\n"); - - return simple_read_from_buffer(ubuf, cnt, ppos, buf, r); -} - static void clear_ftrace_swapper(void) { struct task_struct *p; @@ -2825,14 +2813,12 @@ static void set_ftrace_pid(struct pid *pid) rcu_read_unlock(); } -static void 
clear_ftrace_pid_task(struct pid **pid) +static void clear_ftrace_pid_task(struct pid *pid) { - if (*pid == ftrace_swapper_pid) + if (pid == ftrace_swapper_pid) clear_ftrace_swapper(); else - clear_ftrace_pid(*pid); - - *pid = NULL; + clear_ftrace_pid(pid); } static void set_ftrace_pid_task(struct pid *pid) @@ -2843,11 +2829,140 @@ static void set_ftrace_pid_task(struct pid *pid) set_ftrace_pid(pid); } +static int ftrace_pid_add(int p) +{ + struct pid *pid; + struct ftrace_pid *fpid; + int ret = -EINVAL; + + mutex_lock(&ftrace_lock); + + if (!p) + pid = ftrace_swapper_pid; + else + pid = find_get_pid(p); + + if (!pid) + goto out; + + ret = 0; + + list_for_each_entry(fpid, &ftrace_pids, list) + if (fpid->pid == pid) + goto out_put; + + ret = -ENOMEM; + + fpid = kmalloc(sizeof(*fpid), GFP_KERNEL); + if (!fpid) + goto out_put; + + list_add(&fpid->list, &ftrace_pids); + fpid->pid = pid; + + set_ftrace_pid_task(pid); + + ftrace_update_pid_func(); + ftrace_startup_enable(0); + + mutex_unlock(&ftrace_lock); + return 0; + +out_put: + if (pid != ftrace_swapper_pid) + put_pid(pid); + +out: + mutex_unlock(&ftrace_lock); + return ret; +} + +static void ftrace_pid_reset(void) +{ + struct ftrace_pid *fpid, *safe; + + mutex_lock(&ftrace_lock); + list_for_each_entry_safe(fpid, safe, &ftrace_pids, list) { + struct pid *pid = fpid->pid; + + clear_ftrace_pid_task(pid); + + list_del(&fpid->list); + kfree(fpid); + } + + ftrace_update_pid_func(); + ftrace_startup_enable(0); + + mutex_unlock(&ftrace_lock); +} + +static void *fpid_start(struct seq_file *m, loff_t *pos) +{ + mutex_lock(&ftrace_lock); + + if (list_empty(&ftrace_pids) && (!*pos)) + return (void *) 1; + + return seq_list_start(&ftrace_pids, *pos); +} + +static void *fpid_next(struct seq_file *m, void *v, loff_t *pos) +{ + if (v == (void *)1) + return NULL; + + return seq_list_next(v, &ftrace_pids, pos); +} + +static void fpid_stop(struct seq_file *m, void *p) +{ + mutex_unlock(&ftrace_lock); +} + +static int fpid_show(struct 
seq_file *m, void *v) +{ + const struct ftrace_pid *fpid = list_entry(v, struct ftrace_pid, list); + + if (v == (void *)1) { + seq_printf(m, "no pid\n"); + return 0; + } + + if (fpid->pid == ftrace_swapper_pid) + seq_printf(m, "swapper tasks\n"); + else + seq_printf(m, "%u\n", pid_vnr(fpid->pid)); + + return 0; +} + +static const struct seq_operations ftrace_pid_sops = { + .start = fpid_start, + .next = fpid_next, + .stop = fpid_stop, + .show = fpid_show, +}; + +static int +ftrace_pid_open(struct inode *inode, struct file *file) +{ + int ret = 0; + + if ((file->f_mode & FMODE_WRITE) && + (file->f_flags & O_TRUNC)) + ftrace_pid_reset(); + + if (file->f_mode & FMODE_READ) + ret = seq_open(file, &ftrace_pid_sops); + + return ret; +} + static ssize_t ftrace_pid_write(struct file *filp, const char __user *ubuf, size_t cnt, loff_t *ppos) { - struct pid *pid; char buf[64]; long val; int ret; @@ -2860,57 +2975,38 @@ ftrace_pid_write(struct file *filp, const char __user *ubuf, buf[cnt] = 0; + /* + * Allow "echo > set_ftrace_pid" or "echo -n '' > set_ftrace_pid" + * to clean the filter quietly. + */ + strstrip(buf); + if (strlen(buf) == 0) + return 1; + ret = strict_strtol(buf, 10, &val); if (ret < 0) return ret; - mutex_lock(&ftrace_lock); - if (val < 0) { - /* disable pid tracing */ - if (!ftrace_pid_trace) - goto out; - - clear_ftrace_pid_task(&ftrace_pid_trace); - - } else { - /* swapper task is special */ - if (!val) { - pid = ftrace_swapper_pid; - if (pid == ftrace_pid_trace) - goto out; - } else { - pid = find_get_pid(val); - - if (pid == ftrace_pid_trace) { - put_pid(pid); - goto out; - } - } - - if (ftrace_pid_trace) - clear_ftrace_pid_task(&ftrace_pid_trace); - - if (!pid) - goto out; - - ftrace_pid_trace = pid; - - set_ftrace_pid_task(ftrace_pid_trace); - } + ret = ftrace_pid_add(val); - /* update the function call */ - ftrace_update_pid_func(); - ftrace_startup_enable(0); + return ret ? 
ret : cnt; +} - out: - mutex_unlock(&ftrace_lock); +static int +ftrace_pid_release(struct inode *inode, struct file *file) +{ + if (file->f_mode & FMODE_READ) + seq_release(inode, file); - return cnt; + return 0; } static const struct file_operations ftrace_pid_fops = { - .read = ftrace_pid_read, - .write = ftrace_pid_write, + .open = ftrace_pid_open, + .write = ftrace_pid_write, + .read = seq_read, + .llseek = seq_lseek, + .release = ftrace_pid_release, }; static __init int ftrace_init_debugfs(void) diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h index f22a7ac..acef8b4 100644 --- a/kernel/trace/trace.h +++ b/kernel/trace/trace.h @@ -496,12 +496,12 @@ print_graph_function(struct trace_iterator *iter) } #endif /* CONFIG_FUNCTION_GRAPH_TRACER */ -extern struct pid *ftrace_pid_trace; +extern struct list_head ftrace_pids; #ifdef CONFIG_FUNCTION_TRACER static inline int ftrace_trace_task(struct task_struct *task) { - if (!ftrace_pid_trace) + if (list_empty(&ftrace_pids)) return 1; return test_tsk_trace_trace(task); ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 4/5] [PATCH 4/5] tracing: enable records during the module load 2009-10-13 20:33 [PATCH 0/5] [GIT PULL][2.6.33] tracing: various updates Steven Rostedt ` (2 preceding siblings ...) 2009-10-13 20:33 ` [PATCH 3/5] [PATCH 3/5] tracing: support multiple pids in set_pid_ftrace file Steven Rostedt @ 2009-10-13 20:33 ` Steven Rostedt 2009-10-14 6:59 ` [tip:tracing/core] tracing: Enable " tip-bot for Jiri Olsa 2009-10-13 20:33 ` [PATCH 5/5] [PATCH 5/5] tracing: enabling "__cold" functions Steven Rostedt 4 siblings, 1 reply; 20+ messages in thread From: Steven Rostedt @ 2009-10-13 20:33 UTC (permalink / raw) To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker, Jiri Olsa [-- Attachment #1: 0004-tracing-enable-records-during-the-module-load.patch --] [-- Type: text/plain, Size: 2697 bytes --] From: Jiri Olsa <jolsa@redhat.com> I was debugging a module using the "function" and "function_graph" tracers and noticed that if you load a module after enabling tracing, the module's hooks are only converted to NOP instructions. The attached patch enables the module's hooks when function tracing is already on, thus allowing module functions to be traced. Signed-off-by: Jiri Olsa <jolsa@redhat.com> LKML-Reference: <1255183220-15640-1-git-send-email-jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> --- kernel/trace/ftrace.c | 38 ++++++++++++++++++++++++++++++-------- 1 files changed, 30 insertions(+), 8 deletions(-) diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index 0c799d1..aaea9cd 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -1270,12 +1270,34 @@ static int ftrace_update_code(struct module *mod) ftrace_new_addrs = p->newlist; p->flags = 0L; - /* convert record (i.e, patch mcount-call with NOP) */ - if (ftrace_code_disable(mod, p)) { - p->flags |= FTRACE_FL_CONVERTED; - ftrace_update_cnt++; - } else + /* + * Do the initial record convertion from mcount jump + * to the NOP instructions. 
+ */ + if (!ftrace_code_disable(mod, p)) { ftrace_free_rec(p); + continue; + } + + p->flags |= FTRACE_FL_CONVERTED; + ftrace_update_cnt++; + + /* + * If the tracing is enabled, go ahead and enable the record. + * + * The reason not to enable the record immediatelly is the + * inherent check of ftrace_make_nop/ftrace_make_call for + * correct previous instructions. Making first the NOP + * conversion puts the module to the correct state, thus + * passing the ftrace_make_call check. + */ + if (ftrace_start_up) { + int failed = __ftrace_replace_code(p, 1); + if (failed) { + ftrace_bug(failed, p->ip); + ftrace_free_rec(p); + } + } } stop = ftrace_now(raw_smp_processor_id()); @@ -2609,7 +2631,7 @@ static __init int ftrace_init_dyn_debugfs(struct dentry *d_tracer) return 0; } -static int ftrace_convert_nops(struct module *mod, +static int ftrace_process_locs(struct module *mod, unsigned long *start, unsigned long *end) { @@ -2669,7 +2691,7 @@ static void ftrace_init_module(struct module *mod, { if (ftrace_disabled || start == end) return; - ftrace_convert_nops(mod, start, end); + ftrace_process_locs(mod, start, end); } static int ftrace_module_notify(struct notifier_block *self, @@ -2730,7 +2752,7 @@ void __init ftrace_init(void) last_ftrace_enabled = ftrace_enabled = 1; - ret = ftrace_convert_nops(NULL, + ret = ftrace_process_locs(NULL, __start_mcount_loc, __stop_mcount_loc); -- 1.6.3.3 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [tip:tracing/core] tracing: Enable records during the module load 2009-10-13 20:33 ` [PATCH 4/5] [PATCH 4/5] tracing: enable records during the module load Steven Rostedt @ 2009-10-14 6:59 ` tip-bot for Jiri Olsa 0 siblings, 0 replies; 20+ messages in thread From: tip-bot for Jiri Olsa @ 2009-10-14 6:59 UTC (permalink / raw) To: linux-tip-commits Cc: linux-kernel, hpa, mingo, jolsa, fweisbec, rostedt, tglx, mingo Commit-ID: 5cb084bb1f3fd4dcdaf7e4cf564994346ec8f783 Gitweb: http://git.kernel.org/tip/5cb084bb1f3fd4dcdaf7e4cf564994346ec8f783 Author: Jiri Olsa <jolsa@redhat.com> AuthorDate: Tue, 13 Oct 2009 16:33:53 -0400 Committer: Ingo Molnar <mingo@elte.hu> CommitDate: Wed, 14 Oct 2009 08:13:54 +0200 tracing: Enable records during the module load I was debuging some module using "function" and "function_graph" tracers and noticed, that if you load module after you enabled tracing, the module's hooks will convert only to NOP instructions. The attached patch enables modules' hooks if there's function trace allready on, thus allowing to trace module functions. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20091013203425.896285120@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> --- kernel/trace/ftrace.c | 38 ++++++++++++++++++++++++++++++-------- 1 files changed, 30 insertions(+), 8 deletions(-) diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index 0c799d1..aaea9cd 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -1270,12 +1270,34 @@ static int ftrace_update_code(struct module *mod) ftrace_new_addrs = p->newlist; p->flags = 0L; - /* convert record (i.e, patch mcount-call with NOP) */ - if (ftrace_code_disable(mod, p)) { - p->flags |= FTRACE_FL_CONVERTED; - ftrace_update_cnt++; - } else + /* + * Do the initial record convertion from mcount jump + * to the NOP instructions. 
+ */ + if (!ftrace_code_disable(mod, p)) { ftrace_free_rec(p); + continue; + } + + p->flags |= FTRACE_FL_CONVERTED; + ftrace_update_cnt++; + + /* + * If the tracing is enabled, go ahead and enable the record. + * + * The reason not to enable the record immediatelly is the + * inherent check of ftrace_make_nop/ftrace_make_call for + * correct previous instructions. Making first the NOP + * conversion puts the module to the correct state, thus + * passing the ftrace_make_call check. + */ + if (ftrace_start_up) { + int failed = __ftrace_replace_code(p, 1); + if (failed) { + ftrace_bug(failed, p->ip); + ftrace_free_rec(p); + } + } } stop = ftrace_now(raw_smp_processor_id()); @@ -2609,7 +2631,7 @@ static __init int ftrace_init_dyn_debugfs(struct dentry *d_tracer) return 0; } -static int ftrace_convert_nops(struct module *mod, +static int ftrace_process_locs(struct module *mod, unsigned long *start, unsigned long *end) { @@ -2669,7 +2691,7 @@ static void ftrace_init_module(struct module *mod, { if (ftrace_disabled || start == end) return; - ftrace_convert_nops(mod, start, end); + ftrace_process_locs(mod, start, end); } static int ftrace_module_notify(struct notifier_block *self, @@ -2730,7 +2752,7 @@ void __init ftrace_init(void) last_ftrace_enabled = ftrace_enabled = 1; - ret = ftrace_convert_nops(NULL, + ret = ftrace_process_locs(NULL, __start_mcount_loc, __stop_mcount_loc); ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 5/5] [PATCH 5/5] tracing: enabling "__cold" functions 2009-10-13 20:33 [PATCH 0/5] [GIT PULL][2.6.33] tracing: various updates Steven Rostedt ` (3 preceding siblings ...) 2009-10-13 20:33 ` [PATCH 4/5] [PATCH 4/5] tracing: enable records during the module load Steven Rostedt @ 2009-10-13 20:33 ` Steven Rostedt 2009-10-14 6:59 ` [tip:tracing/core] tracing: Enable " tip-bot for Jiri Olsa 4 siblings, 1 reply; 20+ messages in thread From: Steven Rostedt @ 2009-10-13 20:33 UTC (permalink / raw) To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker, Jiri Olsa [-- Attachment #1: 0005-tracing-enabling-__cold-functions.patch --] [-- Type: text/plain, Size: 953 bytes --] From: Jiri Olsa <jolsa@redhat.com> Since commit a586df06 "x86: Support __attribute__((__cold__)) in gcc 4.3", some functions go to the ".text.unlikely" section. It looks like there are not many of them (I found printk, panic, __ssb_dma_not_implemented, fat_fs_error), but I think they are still worth including. Signed-off-by: Jiri Olsa <jolsa@redhat.com> LKML-Reference: <1255104899-27784-1-git-send-email-jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> --- scripts/recordmcount.pl | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/scripts/recordmcount.pl b/scripts/recordmcount.pl index 090d300..bfb8b2c 100755 --- a/scripts/recordmcount.pl +++ b/scripts/recordmcount.pl @@ -119,6 +119,7 @@ my %text_sections = ( ".sched.text" => 1, ".spinlock.text" => 1, ".irqentry.text" => 1, + ".text.unlikely" => 1, ); $objdump = "objdump" if ((length $objdump) == 0); -- 1.6.3.3 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [tip:tracing/core] tracing: Enable "__cold" functions 2009-10-13 20:33 ` [PATCH 5/5] [PATCH 5/5] tracing: enabling "__cold" functions Steven Rostedt @ 2009-10-14 6:59 ` tip-bot for Jiri Olsa 0 siblings, 0 replies; 20+ messages in thread From: tip-bot for Jiri Olsa @ 2009-10-14 6:59 UTC (permalink / raw) To: linux-tip-commits Cc: linux-kernel, hpa, mingo, jolsa, fweisbec, rostedt, tglx, mingo Commit-ID: 4d8289494a37e19cd7f3beacea9c957ad3debad6 Gitweb: http://git.kernel.org/tip/4d8289494a37e19cd7f3beacea9c957ad3debad6 Author: Jiri Olsa <jolsa@redhat.com> AuthorDate: Tue, 13 Oct 2009 16:33:54 -0400 Committer: Ingo Molnar <mingo@elte.hu> CommitDate: Wed, 14 Oct 2009 08:13:54 +0200 tracing: Enable "__cold" functions Based on the commit: a586df06 "x86: Support __attribute__((__cold__)) in gcc 4.3" some of the functions goes to the ".text.unlikely" section. Looks like there's not many of them (I found printk, panic, __ssb_dma_not_implemented, fat_fs_error), but still worth to include I think. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20091013203426.175845614@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> --- scripts/recordmcount.pl | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/scripts/recordmcount.pl b/scripts/recordmcount.pl index 090d300..bfb8b2c 100755 --- a/scripts/recordmcount.pl +++ b/scripts/recordmcount.pl @@ -119,6 +119,7 @@ my %text_sections = ( ".sched.text" => 1, ".spinlock.text" => 1, ".irqentry.text" => 1, + ".text.unlikely" => 1, ); $objdump = "objdump" if ((length $objdump) == 0); ^ permalink raw reply related [flat|nested] 20+ messages in thread
end of thread [~2009-10-14 7:00 UTC | newest] Thread overview: 20+ messages -- 2009-10-13 20:33 [PATCH 0/5] [GIT PULL][2.6.33] tracing: various updates Steven Rostedt 2009-10-13 20:33 ` [PATCH 1/5] [PATCH 1/5] function-graph/x86: replace unbalanced ret with jmp Steven Rostedt 2009-10-13 20:47 ` Mathieu Desnoyers 2009-10-13 21:10 ` Steven Rostedt 2009-10-13 21:21 ` Mathieu Desnoyers 2009-10-13 21:26 ` Steven Rostedt 2009-10-13 22:32 ` Mathieu Desnoyers 2009-10-13 21:02 ` Frederic Weisbecker 2009-10-13 21:12 ` Steven Rostedt 2009-10-13 22:26 ` Frederic Weisbecker 2009-10-14 6:58 ` [tip:tracing/core] function-graph/x86: Replace " tip-bot for Steven Rostedt 2009-10-13 20:33 ` [PATCH 2/5] [PATCH 2/5] tracing: export symbols for kernel lock tracepoints Steven Rostedt 2009-10-13 20:43 ` Frederic Weisbecker 2009-10-13 21:14 ` Steven Rostedt 2009-10-13 20:33 ` [PATCH 3/5] [PATCH 3/5] tracing: support multiple pids in set_pid_ftrace file Steven Rostedt 2009-10-14 6:58 ` [tip:tracing/core] tracing: Support " tip-bot for jolsa@redhat.com 2009-10-13 20:33 ` [PATCH 4/5] [PATCH 4/5] tracing: enable records during the module load Steven Rostedt 2009-10-14 6:59 ` [tip:tracing/core] tracing: Enable " tip-bot for Jiri Olsa 2009-10-13 20:33 ` [PATCH 5/5] [PATCH 5/5] tracing: enabling "__cold" functions Steven Rostedt 2009-10-14 6:59 ` [tip:tracing/core] tracing: Enable " tip-bot for Jiri Olsa