All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paul Mackerras <paulus@samba.org>
To: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	linux-kernel@vger.kernel.org, linuxppc-dev@ozlabs.org,
	anton@samba.org, Ingo Molnar <mingo@elte.hu>
Subject: Re: [PATCH] powerpc/perf_events: Implement perf_arch_fetch_caller_regs for powerpc
Date: Tue, 16 Mar 2010 14:22:13 +1100	[thread overview]
Message-ID: <20100316032213.GA3656@drongo> (raw)
In-Reply-To: <20100315210450.GF5082@nowhere>

On Mon, Mar 15, 2010 at 10:04:54PM +0100, Frederic Weisbecker wrote:
> On Mon, Mar 15, 2010 at 04:46:15PM +1100, Paul Mackerras wrote:

> >     14.99%            perf  [kernel.kallsyms]  [k] ._raw_spin_lock
> >                       |
> >                       --- ._raw_spin_lock
> >                          |          
> >                          |--25.00%-- .alloc_fd
> >                          |          (nil)
> >                          |          |          
> >                          |          |--50.00%-- .anon_inode_getfd
> >                          |          |          .sys_perf_event_open
> >                          |          |          syscall_exit
> >                          |          |          syscall
> >                          |          |          create_counter
> >                          |          |          __cmd_record
> >                          |          |          run_builtin
> >                          |          |          main
> >                          |          |          0xfd2e704
> >                          |          |          0xfd2e8c0
> >                          |          |          (nil)
> > 
> > ... etc.
> > 
> > Signed-off-by: Paul Mackerras <paulus@samba.org>
> 
> 
> Cool!

By the way, I notice that gcc tends to inline the tracing functions,
which means that by going up 2 stack frames we miss some of the
functions.  For example, for the lock:lock_acquire event, we have
_raw_spin_lock() -> lock_acquire() -> trace_lock_acquire() ->
perf_trace_lock_acquire() -> perf_trace_templ_lock_acquire() ->
perf_fetch_caller_regs() -> perf_arch_fetch_caller_regs().

But in the ppc64 kernel binary I just built, gcc inlined
trace_lock_acquire in lock_acquire, and perf_trace_templ_lock_acquire
in perf_trace_lock_acquire.  Given that perf_fetch_caller_regs is
explicitly inlined, going up two levels from perf_fetch_caller_regs
gets us to _raw_spin_lock, whereas I think you intended it to get us
to trace_lock_acquire.  I'm not sure what to do about that - any
thoughts?

Paul.

WARNING: multiple messages have this Message-ID (diff)
From: Paul Mackerras <paulus@samba.org>
To: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>,
	Peter Zijlstra <peterz@infradead.org>,
	benh@kernel.crashing.org, linux-kernel@vger.kernel.org,
	anton@samba.org, linuxppc-dev@ozlabs.org
Subject: Re: [PATCH] powerpc/perf_events: Implement perf_arch_fetch_caller_regs for powerpc
Date: Tue, 16 Mar 2010 14:22:13 +1100	[thread overview]
Message-ID: <20100316032213.GA3656@drongo> (raw)
In-Reply-To: <20100315210450.GF5082@nowhere>

On Mon, Mar 15, 2010 at 10:04:54PM +0100, Frederic Weisbecker wrote:
> On Mon, Mar 15, 2010 at 04:46:15PM +1100, Paul Mackerras wrote:

> >     14.99%            perf  [kernel.kallsyms]  [k] ._raw_spin_lock
> >                       |
> >                       --- ._raw_spin_lock
> >                          |          
> >                          |--25.00%-- .alloc_fd
> >                          |          (nil)
> >                          |          |          
> >                          |          |--50.00%-- .anon_inode_getfd
> >                          |          |          .sys_perf_event_open
> >                          |          |          syscall_exit
> >                          |          |          syscall
> >                          |          |          create_counter
> >                          |          |          __cmd_record
> >                          |          |          run_builtin
> >                          |          |          main
> >                          |          |          0xfd2e704
> >                          |          |          0xfd2e8c0
> >                          |          |          (nil)
> > 
> > ... etc.
> > 
> > Signed-off-by: Paul Mackerras <paulus@samba.org>
> 
> 
> Cool!

By the way, I notice that gcc tends to inline the tracing functions,
which means that by going up 2 stack frames we miss some of the
functions.  For example, for the lock:lock_acquire event, we have
_raw_spin_lock() -> lock_acquire() -> trace_lock_acquire() ->
perf_trace_lock_acquire() -> perf_trace_templ_lock_acquire() ->
perf_fetch_caller_regs() -> perf_arch_fetch_caller_regs().

But in the ppc64 kernel binary I just built, gcc inlined
trace_lock_acquire in lock_acquire, and perf_trace_templ_lock_acquire
in perf_trace_lock_acquire.  Given that perf_fetch_caller_regs is
explicitly inlined, going up two levels from perf_fetch_caller_regs
gets us to _raw_spin_lock, whereas I think you intended it to get us
to trace_lock_acquire.  I'm not sure what to do about that - any
thoughts?

Paul.

  reply	other threads:[~2010-03-16  3:22 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-03-15  5:46 [PATCH] powerpc/perf_events: Implement perf_arch_fetch_caller_regs for powerpc Paul Mackerras
2010-03-15  5:46 ` Paul Mackerras
2010-03-15 17:36 ` Michael Neuling
2010-03-15 17:36   ` Michael Neuling
2010-03-15 21:04 ` Frederic Weisbecker
2010-03-15 21:04   ` Frederic Weisbecker
2010-03-16  3:22   ` Paul Mackerras [this message]
2010-03-16  3:22     ` Paul Mackerras
2010-03-16 20:56     ` Frederic Weisbecker
2010-03-16 20:56       ` Frederic Weisbecker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100316032213.GA3656@drongo \
    --to=paulus@samba.org \
    --cc=anton@samba.org \
    --cc=fweisbec@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@ozlabs.org \
    --cc=mingo@elte.hu \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.