linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Frederic Weisbecker <fweisbec@gmail.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	Paul Mackerras <paulus@samba.org>, Ingo Molnar <mingo@elte.hu>
Subject: [PATCH 2/7] perf: Fetch hot regs from the template caller
Date: Fri, 26 Mar 2010 02:52:37 +0100	[thread overview]
Message-ID: <1269568362-13690-3-git-send-regression-fweisbec@gmail.com> (raw)
In-Reply-To: <1269568362-13690-1-git-send-regression-fweisbec@gmail.com>

Trace events can be defined from a template using
DECLARE_EVENT_CLASS/DEFINE_EVENT or directly with TRACE_EVENT.

In both cases we have a template tracepoint handler, used to
record the trace, to which we pass our ftrace event instance.

In the function level, if the class is named "foo" and the event
is named "blah", we have the following chain of calls:

perf_trace_blah() -> perf_trace_templ_foo()

In the case we have several events sharing the class "blah",
we'll have multiple users of perf_trace_templ_foo(), and it
won't be inlined by the compiler. This is usually what happens
with the DECLARE_EVENT_CLASS/DEFINE_EVENT based definition.

But if perf_trace_blah() is the only caller of perf_trace_templ_foo()
there are fair chances that it will be inlined.

The problem is that we fetch the regs from perf_trace_templ_foo()
after we rewinded the frame pointer to the second caller, we want
to reach the caller of perf_trace_blah() to get the right source
of the event. And we do this by always assuming that
perf_trace_templ_foo() is not inlined. But as shown above this
is not always true. And if it is inlined we miss the first caller,
losing the most important level of precision.

We get:
	    61.31%       ls  [kernel.kallsyms]  [k] do_softirq
                         |
                         --- do_softirq
                             irq_exit
                             do_IRQ
                             common_interrupt
                            |
                            |--25.00%-- tty_buffer_request_room

Instead of:
	    61.31%       ls  [kernel.kallsyms]  [k] __do_softirq
                         |
                         --- __do_softirq
                             do_softirq
                             irq_exit
                             do_IRQ
                             common_interrupt
                            |
                            |--25.00%-- tty_buffer_request_room

To fix this, we fetch the regs from perf_trace_blah() rather than
perf_trace_templ_foo() so that we don't have to deal with inlining
surprises.

That also bring us the advantage of having the true source of the
event even if we don't have frame pointers.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
---
 include/trace/ftrace.h |   23 ++++++++++++-----------
 1 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index ea6f9d4..882c648 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -758,13 +758,12 @@ __attribute__((section("_ftrace_events"))) event_##call = {		\
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
 static notrace void							\
 perf_trace_templ_##call(struct ftrace_event_call *event_call,		\
-			    proto)					\
+			struct pt_regs *__regs, proto)			\
 {									\
 	struct ftrace_data_offsets_##call __maybe_unused __data_offsets;\
 	struct ftrace_raw_##call *entry;				\
 	u64 __addr = 0, __count = 1;					\
 	unsigned long irq_flags;					\
-	struct pt_regs *__regs;						\
 	int __entry_size;						\
 	int __data_size;						\
 	int rctx;							\
@@ -785,20 +784,22 @@ perf_trace_templ_##call(struct ftrace_event_call *event_call,		\
 									\
 	{ assign; }							\
 									\
-	__regs = &__get_cpu_var(perf_trace_regs);			\
-	perf_fetch_caller_regs(__regs, 2);				\
-									\
 	perf_trace_buf_submit(entry, __entry_size, rctx, __addr,	\
 			       __count, irq_flags, __regs);		\
 }
 
 #undef DEFINE_EVENT
-#define DEFINE_EVENT(template, call, proto, args)		\
-static notrace void perf_trace_##call(proto)			\
-{								\
-	struct ftrace_event_call *event_call = &event_##call;	\
-								\
-	perf_trace_templ_##template(event_call, args);		\
+#define DEFINE_EVENT(template, call, proto, args)			\
+static notrace void perf_trace_##call(proto)				\
+{									\
+	struct ftrace_event_call *event_call = &event_##call;		\
+	struct pt_regs *__regs = &get_cpu_var(perf_trace_regs);		\
+									\
+	perf_fetch_caller_regs(__regs, 1);				\
+									\
+	perf_trace_templ_##template(event_call, __regs, args);		\
+									\
+	put_cpu_var(perf_trace_regs);					\
 }
 
 #undef DEFINE_EVENT_PRINT
-- 
1.6.2.3


  parent reply	other threads:[~2010-03-26  1:53 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-03-26  1:52 [PATCH 0/7] perf updates and fixes Frederic Weisbecker
2010-03-26  1:52 ` [PATCH 1/7] perf: Drop the frame reliablity check Frederic Weisbecker
2010-03-26  1:52 ` Frederic Weisbecker [this message]
2010-03-26  1:52 ` [PATCH 3/7] x86: Unify dumpstack.h and stacktrace.h Frederic Weisbecker
2010-03-26  1:52 ` [PATCH 4/7] perf: Move perf_arch_fetch_caller_regs into a macro Frederic Weisbecker
2010-03-26  1:52 ` [PATCH 5/7] perf: Make perf_fetch_caller_regs rewind to the first caller only Frederic Weisbecker
2010-04-08  9:57   ` [BUG perf] perf_fetch_caller_regs / rewind_frame_pointer can panic Eric Dumazet
2010-04-08 10:59     ` Frederic Weisbecker
2010-04-08 12:32     ` [PATCH] perf: Fix unsafe frame rewinding with hot regs fetching Frederic Weisbecker
2010-04-08 13:52       ` Eric Dumazet
2010-04-08 17:31         ` [GIT PULL] perf fix Frederic Weisbecker
2010-04-13 22:51           ` Ingo Molnar
2010-03-26  1:52 ` [PATCH 6/7] perf: Use hot regs with software sched/migrate events Frederic Weisbecker
2010-03-26  1:52 ` [PATCH 7/7] perf: Correctly align perf event tracing buffer Frederic Weisbecker
2010-03-26  6:02 ` [PATCH 0/7] perf updates and fixes Paul Mackerras
2010-03-26  7:58   ` Ingo Molnar
2010-03-26 17:38     ` Frederic Weisbecker
2010-03-26 17:45   ` Frederic Weisbecker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1269568362-13690-3-git-send-regression-fweisbec@gmail.com \
    --to=fweisbec@gmail.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=acme@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=paulus@samba.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).