From: Frederic Weisbecker <fweisbec@gmail.com>
To: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
linux-kernel@vger.kernel.org, linuxppc-dev@ozlabs.org,
anton@samba.org, Ingo Molnar <mingo@elte.hu>
Subject: Re: [PATCH] powerpc/perf_events: Implement perf_arch_fetch_caller_regs for powerpc
Date: Tue, 16 Mar 2010 21:56:28 +0100 [thread overview]
Message-ID: <20100316205609.GA6822@nowhere> (raw)
In-Reply-To: <20100316032213.GA3656@drongo>
On Tue, Mar 16, 2010 at 02:22:13PM +1100, Paul Mackerras wrote:
> On Mon, Mar 15, 2010 at 10:04:54PM +0100, Frederic Weisbecker wrote:
> > On Mon, Mar 15, 2010 at 04:46:15PM +1100, Paul Mackerras wrote:
>
> > > 14.99% perf [kernel.kallsyms] [k] ._raw_spin_lock
> > > |
> > > --- ._raw_spin_lock
> > > |
> > > |--25.00%-- .alloc_fd
> > > | (nil)
> > > | |
> > > | |--50.00%-- .anon_inode_getfd
> > > | | .sys_perf_event_open
> > > | | syscall_exit
> > > | | syscall
> > > | | create_counter
> > > | | __cmd_record
> > > | | run_builtin
> > > | | main
> > > | | 0xfd2e704
> > > | | 0xfd2e8c0
> > > | | (nil)
> > >
> > > ... etc.
> > >
> > > Signed-off-by: Paul Mackerras <paulus@samba.org>
> >
> >
> > Cool!
>
> By the way, I notice that gcc tends to inline the tracing functions,
> which means that by going up 2 stack frames we miss some of the
> functions. For example, for the lock:lock_acquire event, we have
> _raw_spin_lock() -> lock_acquire() -> trace_lock_acquire() ->
> perf_trace_lock_acquire() -> perf_trace_templ_lock_acquire() ->
> perf_fetch_caller_regs() -> perf_arch_fetch_caller_regs().
>
> But in the ppc64 kernel binary I just built, gcc inlined
> trace_lock_acquire in lock_acquire, and perf_trace_templ_lock_acquire
> in perf_trace_lock_acquire. Given that perf_fetch_caller_regs is
> explicitly inlined, going up two levels from perf_fetch_caller_regs
> gets us to _raw_spin_lock, whereas I think you intended it to get us
> to trace_lock_acquire. I'm not sure what to do about that - any
> thoughts?
Yeah I've indeed seen this, and the problem is especially
the fact perf_trace_templ_lock_acquire may or may not be
inlined.
It is used for trace events that use the TRACE_EVENT_CLASS
thing. We define a pattern of event structure that is shared
among several events.
For example event A and event B share perf_trace_templ_foo.
Both will have a different perf_trace_blah but those
perf_trace_blah will both call the same perf_trace_templ_foo(),
in this case, it won't be inlined.
Events that don't share a pattern will have their
perf_trace_templ inlined, because there will be an exclusive 1:1
relationship between both.
The rewind of 2 is well suited for events sharing a pattern, ip
will match the right event source, and not one of its callers.
Unfortunately, the others are more unlucky.
I didn't mind much about this yet because it had no bad effect
on lock events. Quite the opposite actually. It's not very interesting
to have lock_acquire as the event source unless you have a callchain.
If you have no callchain, you'll see a lot of such in perf report:
sym1 lock_aquire
sym2 lock_acquire
sym3 lock_acquire
What you want here is the function that called lock_acquire.
But if you have a callchain it's fine, because you have the nature
of the event (lock_aquire) and the origin as well.
That said, lock events are an exception where the mistake
has a lucky result. Other inlined events are harmed as we lose
their most important caller. So I'm going to fix that.
I can just fetch the regs from perf_trace_foo() and pass them
to perf_trace_templ_foo() and here we are.
Thanks.
prev parent reply other threads:[~2010-03-16 20:56 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-03-15 5:46 [PATCH] powerpc/perf_events: Implement perf_arch_fetch_caller_regs for powerpc Paul Mackerras
2010-03-15 17:36 ` Michael Neuling
2010-03-15 21:04 ` Frederic Weisbecker
2010-03-16 3:22 ` Paul Mackerras
2010-03-16 20:56 ` Frederic Weisbecker [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100316205609.GA6822@nowhere \
--to=fweisbec@gmail.com \
--cc=anton@samba.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc-dev@ozlabs.org \
--cc=mingo@elte.hu \
--cc=paulus@samba.org \
--cc=peterz@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).