* LBR callchains from tracepoints
[not found] ` <20160426010358.GD16708@kernel.org>
@ 2016-04-26 1:24 ` Arnaldo Carvalho de Melo
2016-04-26 16:38 ` Peter Zijlstra
0 siblings, 1 reply; 5+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-04-26 1:24 UTC
To: Stephane Eranian
Cc: Peter Zijlstra, David Ahern, Milian Wolff,
Frédéric Weisbecker, Ingo Molnar, Namhyung Kim,
Linux Kernel Mailing List
On Mon, Apr 25, 2016 at 10:03:58PM -0300, Arnaldo Carvalho de Melo wrote:
> I now need to continue investigation why this doesn't seem to work from
> tracepoints...
Bummer, the changeset (at the end of this message) hasn't any
explanation, is this really impossible? I.e. LBR callstacks from
tracepoints? Even if we set perf_event_attr.exclude_callchain_kernel?
I've read somewhere that LBR wouldn't work for the kernel, but when I
tried, for cycles:ppp I got:
[acme@jouet linux]$ perf record --call-graph lbr usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.018 MB perf.data (9 samples) ]
[acme@jouet linux]$ perf evlist -v
cycles:ppp: size: 112, { sample_period, sample_freq }: 4000,
sample_type: IP|TID|TIME|CALLCHAIN|PERIOD|BRANCH_STACK, disabled: 1,
inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1,
precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec:
1, branch_sample_type: USER|CALL_STACK|NO_FLAGS|NO_CYCLES
[acme@jouet linux]$
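The sample_type field in the evlist output above is a bitmask of PERF_SAMPLE_* flags. A minimal sketch of how such a mask maps to the names perf prints, assuming the bit positions of the PERF_SAMPLE_* enum in include/uapi/linux/perf_event.h; the helper names encode_sample_type and decode_sample_type are hypothetical, not part of perf:

```python
# Bit positions as in the PERF_SAMPLE_* enum of include/uapi/linux/perf_event.h.
# Only the flags up to IDENTIFIER are listed; this is an illustration, not a
# complete copy of the header.
PERF_SAMPLE_BITS = {
    "IP": 1 << 0,
    "TID": 1 << 1,
    "TIME": 1 << 2,
    "ADDR": 1 << 3,
    "READ": 1 << 4,
    "CALLCHAIN": 1 << 5,
    "ID": 1 << 6,
    "CPU": 1 << 7,
    "PERIOD": 1 << 8,
    "STREAM_ID": 1 << 9,
    "RAW": 1 << 10,
    "BRANCH_STACK": 1 << 11,
    "REGS_USER": 1 << 12,
    "STACK_USER": 1 << 13,
    "WEIGHT": 1 << 14,
    "DATA_SRC": 1 << 15,
    "IDENTIFIER": 1 << 16,
}


def encode_sample_type(text):
    """Hypothetical helper: turn 'IP|TID|...' into the integer bitmask."""
    mask = 0
    for name in text.split("|"):
        mask |= PERF_SAMPLE_BITS[name]
    return mask


def decode_sample_type(mask):
    """Hypothetical helper: list the flag names set in mask, lowest bit first."""
    return "|".join(name for name, bit in PERF_SAMPLE_BITS.items() if mask & bit)


# The value shown for cycles:ppp in the evlist output.
cycles_ppp = encode_sample_type("IP|TID|TIME|CALLCHAIN|PERIOD|BRANCH_STACK")
print(hex(cycles_ppp), decode_sample_type(cycles_ppp))
```

Both CALLCHAIN and BRANCH_STACK are set for this event, which is what one would expect when the callchain is to be synthesized from LBR branch records.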
- 59.69% usleep [kernel] [k] vma_interval_tree_insert
→vma_interval_tree_insert [kernel]
vma_adjust [kernel]
__split_vma.isra.31 [kernel]
split_vma [kernel]
mprotect_fixup [kernel]
sys_mprotect [kernel]
entry_SYSCALL_64_fastpath [kernel]
mprotect ld-2.22.so
_dl_relocate_object ld-2.22.so
memcpy@GLIBC_2.2.5 libc-2.22.so
_dl_relocate_object ld-2.22.so
__gettimeofday libc-2.22.so
_dl_vdso_vsym libc-2.22.so
_dl_lookup_symbol_x ld-2.22.so
This was done on a Broadwell system (ThinkPad t450s).
- Arnaldo
commit 2481c5fa6db0237e4f0168f88913178b2b495b7c
Author: Stephane Eranian <eranian@google.com>
Date: Thu Feb 9 23:20:59 2012 +0100
perf: Disable PERF_SAMPLE_BRANCH_* when not supported
PERF_SAMPLE_BRANCH_* is disabled for:
- SW events (sw counters, tracepoints)
- HW breakpoints
- ALL but Intel x86 architecture
- AMD64 processors
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328826068-11713-10-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
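The restriction listed in the commit message can be modelled as a simple predicate. This is only a sketch of the list above as it stood at the time of this thread, not the kernel's actual code: the PERF_TYPE_* values are those of the perf_type_id enum in include/uapi/linux/perf_event.h, while branch_sampling_supported and its cpu_vendor parameter are hypothetical names introduced for illustration.

```python
# Values of the perf_type_id enum in include/uapi/linux/perf_event.h.
PERF_TYPE_HARDWARE = 0
PERF_TYPE_SOFTWARE = 1
PERF_TYPE_TRACEPOINT = 2
PERF_TYPE_HW_CACHE = 3
PERF_TYPE_RAW = 4
PERF_TYPE_BREAKPOINT = 5

# Event types the commit message lists as having no branch sampling.
NO_BRANCH_SAMPLING_TYPES = {
    PERF_TYPE_SOFTWARE,    # sw counters
    PERF_TYPE_TRACEPOINT,  # tracepoints
    PERF_TYPE_BREAKPOINT,  # HW breakpoints
}


def branch_sampling_supported(event_type, cpu_vendor):
    """Hypothetical model of the commit message, not kernel code.

    cpu_vendor is an assumed free-form string such as "intel" or "amd".
    """
    if event_type in NO_BRANCH_SAMPLING_TYPES:
        return False
    # "ALL but Intel x86 architecture" and "AMD64 processors" are excluded.
    return cpu_vendor == "intel"


for etype, vendor in [(PERF_TYPE_HARDWARE, "intel"),
                      (PERF_TYPE_TRACEPOINT, "intel"),
                      (PERF_TYPE_HARDWARE, "amd")]:
    print(etype, vendor, branch_sampling_supported(etype, vendor))
```

Under this model a tracepoint event is rejected regardless of the CPU, which matches the behaviour being asked about in this thread.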
* Re: LBR callchains from tracepoints
2016-04-26 1:24 ` LBR callchains from tracepoints Arnaldo Carvalho de Melo
@ 2016-04-26 16:38 ` Peter Zijlstra
2016-04-26 17:26 ` Alexei Starovoitov
0 siblings, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2016-04-26 16:38 UTC
To: Arnaldo Carvalho de Melo
Cc: Stephane Eranian, David Ahern, Milian Wolff,
Frédéric Weisbecker, Ingo Molnar, Namhyung Kim,
Linux Kernel Mailing List
On Mon, Apr 25, 2016 at 10:24:31PM -0300, Arnaldo Carvalho de Melo wrote:
> On Mon, Apr 25, 2016 at 10:03:58PM -0300, Arnaldo Carvalho de Melo wrote:
> > I now need to continue investigation why this doesn't seem to work from
> > tracepoints...
>
> Bummer, the changeset (at the end of this message) hasn't any
> explanation, is this really impossible? I.e. LBR callstacks from
> tracepoints? Even if we set perf_event_attr.exclude_callchain_kernel?
Could maybe be done, but it's tricky to implement as the LBR is managed
by the hardware PMU and tracepoints are a software PMU, so we need to
then somehow frob with cross-PMU resources, in a very arch-specific way.
And programmability of the hardware PMU will then depend on events
outside of it.
All rather icky.
* Re: LBR callchains from tracepoints
2016-04-26 16:38 ` Peter Zijlstra
@ 2016-04-26 17:26 ` Alexei Starovoitov
2016-04-26 18:20 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 5+ messages in thread
From: Alexei Starovoitov @ 2016-04-26 17:26 UTC
To: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo, Stephane Eranian, David Ahern,
Milian Wolff, Frédéric Weisbecker, Ingo Molnar,
Namhyung Kim, Linux Kernel Mailing List
On Tue, Apr 26, 2016 at 06:38:28PM +0200, Peter Zijlstra wrote:
> On Mon, Apr 25, 2016 at 10:24:31PM -0300, Arnaldo Carvalho de Melo wrote:
> > On Mon, Apr 25, 2016 at 10:03:58PM -0300, Arnaldo Carvalho de Melo wrote:
> > > I now need to continue investigation why this doesn't seem to work from
> > > tracepoints...
> >
> > Bummer, the changeset (at the end of this message) hasn't any
> > explanation, is this really impossible? I.e. LBR callstacks from
> > tracepoints? Even if we set perf_event_attr.exclude_callchain_kernel?
>
> Could maybe be done, but its tricky to implement as the LBR is managed
> by the hardware PMU and tracepoints are a software PMU, so we need to
> then somehow frob with cross-pmu resources, in a very arch specific way.
> And programmability of the hardware PMU will then depend on events
> outside of it.
BTW, we're thinking of adding LBR support to BPF, so that from the program
we can get accurate and fast stacks. That's especially important for user
space stacks. No clear idea how to do it yet, but it would be really useful.
* Re: LBR callchains from tracepoints
2016-04-26 17:26 ` Alexei Starovoitov
@ 2016-04-26 18:20 ` Arnaldo Carvalho de Melo
2016-04-26 19:07 ` Peter Zijlstra
0 siblings, 1 reply; 5+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-04-26 18:20 UTC
To: Alexei Starovoitov
Cc: Peter Zijlstra, Stephane Eranian, David Ahern, Milian Wolff,
Frédéric Weisbecker, Ingo Molnar, Namhyung Kim,
Linux Kernel Mailing List
On Tue, Apr 26, 2016 at 10:26:32AM -0700, Alexei Starovoitov wrote:
> On Tue, Apr 26, 2016 at 06:38:28PM +0200, Peter Zijlstra wrote:
> > On Mon, Apr 25, 2016 at 10:24:31PM -0300, Arnaldo Carvalho de Melo wrote:
> > > On Mon, Apr 25, 2016 at 10:03:58PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > I now need to continue investigation why this doesn't seem to work from
> > > > tracepoints...
> > > Bummer, the changeset (at the end of this message) hasn't any
> > > explanation, is this really impossible? I.e. LBR callstacks from
> > > tracepoints? Even if we set perf_event_attr.exclude_callchain_kernel?
> > Could maybe be done, but its tricky to implement as the LBR is managed
> > by the hardware PMU and tracepoints are a software PMU, so we need to
> > then somehow frob with cross-pmu resources, in a very arch specific way.
> > And programmability of the hardware PMU will then depend on events
> > outside of it.
> btw we're thinking to add support for lbr to bpf, so that from the program
> we can get accurate and fast stacks. That's especially important for user
> space stacks. No clear idea how to do it yet, but it would be really useful.
Yeah, and that already works in perf, it's just that it doesn't work from some
points (PERF_TYPE_SOFTWARE, PERF_TYPE_TRACEPOINT, etc), as described in the
changeset I mentioned.
'perf trace --call-graph lbr' doesn't work right now, even when it is
interested only in the user space bits, i.e. setting
perf_event_attr.exclude_callchain_kernel.
# perf trace --call-graph dwarf
works, but that, as you mention, really isn't "fast" and sometimes not
accurate, or at least wasn't with broken toolchains.
Example of mixed strace-like output with userspace-only DWARF callchains (would be
lovely if this was with LBR, huh?) plus fp callchains for the
sched:sched_switch tracepoint plus LBR callchains for a hardware event, cycles.
Look further below for the reason for the broken timestamps on
PERF_TYPE_HARDWARE events:
# perf trace -e nanosleep --event sched:sched_switch/call-graph=fp/ --ev cycles/call-graph=lbr,period=100/ usleep 1
18446744073709.551 ( ): cycles/call-graph=lbr,period=100/:)
__intel_pmu_enable_all+0xfe200080 ([kernel.kallsyms])
intel_pmu_enable_all+0xfe200010 ([kernel.kallsyms])
x86_pmu_enable+0xfe200271 ([kernel.kallsyms])
perf_pmu_enable.part.81+0xfe200007 ([kernel.kallsyms])
ctx_resched+0xfe20007a ([kernel.kallsyms])
perf_event_exec+0xfe20011d ([kernel.kallsyms])
setup_new_exec+0xfe20006f ([kernel.kallsyms])
load_elf_binary+0xfe2003e3 ([kernel.kallsyms])
search_binary_handler+0xfe20009e ([kernel.kallsyms])
do_execveat_common.isra.38+0xfe20052c ([kernel.kallsyms])
sys_execve+0xfe20003a ([kernel.kallsyms])
do_syscall_64+0xfe200062 ([kernel.kallsyms])
return_from_SYSCALL_64+0xfe200000 ([kernel.kallsyms])
[0] ([unknown])
0.310 ( 0.006 ms): usleep/20951 nanosleep(rqtp: 0x7ffda8904500 ) ...
0.310 ( ): sched:sched_switch:usleep:20951 [120] S ==> swapper/3:0 [120])
__schedule+0xfe200402 ([kernel.kallsyms])
schedule+0xfe200035 ([kernel.kallsyms])
do_nanosleep+0xfe20006f ([kernel.kallsyms])
hrtimer_nanosleep+0xfe2000dc ([kernel.kallsyms])
sys_nanosleep+0xfe20007a ([kernel.kallsyms])
do_syscall_64+0xfe200062 ([kernel.kallsyms])
return_from_SYSCALL_64+0xfe200000 ([kernel.kallsyms])
__nanosleep+0xffff00bfad62c010 (/usr/lib64/libc-2.22.so)
18446679523046.461 ( ): cycles/call-graph=lbr,period=100/:)
perf_pmu_enable.part.81+0xfe200007 ([kernel.kallsyms])
__perf_event_task_sched_in+0xfe2001ad ([kernel.kallsyms])
finish_task_switch+0xfe200156 ([kernel.kallsyms])
__schedule+0xfe200397 ([kernel.kallsyms])
schedule+0xfe200035 ([kernel.kallsyms])
do_nanosleep+0xfe20006f ([kernel.kallsyms])
hrtimer_nanosleep+0xfe2000dc ([kernel.kallsyms])
sys_nanosleep+0xfe20007a ([kernel.kallsyms])
do_syscall_64+0xfe200062 ([kernel.kallsyms])
return_from_SYSCALL_64+0xfe200000 ([kernel.kallsyms])
[0] ([unknown])
0.377 ( 0.073 ms): usleep/20951 ... [continued]: nanosleep()) = 0
[root@jouet ~]#
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
config 0 (PERF_COUNT_HW_CPU_CYCLES)
size 112
{ sample_period, sample_freq } 100
sample_type IP|TID|CALLCHAIN|BRANCH_STACK|IDENTIFIER
read_format ID
disabled 1
inherit 1
enable_on_exec 1
sample_id_all 1
exclude_guest 1
{ wakeup_events, wakeup_watermark } 1
branch_sample_type USER|CALL_STACK|NO_FLAGS|NO_CYCLES
The sample_type above is missing PERF_SAMPLE_TIME, hence the broken timestamps; will fix.
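The huge timestamps on the cycles lines in the trace output are consistent with unsigned 64-bit arithmetic on a sample that carries no time. A minimal sketch of that arithmetic, under two loudly stated assumptions: that an absent timestamp is represented by an all-ones u64 sentinel, and that the tool prints (sample time - base time) in milliseconds. The function name and the base time value are hypothetical; the base time was picked only so that the second printed value is reproduced, it is not a measured number.

```python
U64_MASK = (1 << 64) - 1
NSEC_PER_MSEC = 1_000_000

# Assumption: a sample without PERF_SAMPLE_TIME carries an all-ones u64 time.
MISSING_TIME_NS = U64_MASK

# Hypothetical base time (ns), chosen only to reproduce the second
# cycles timestamp in the output; not a measured value.
ILLUSTRATIVE_BASE_NS = 64_550_663_090_000


def format_trace_timestamp(sample_time_ns, base_time_ns):
    """Hypothetical helper: u64 (sample - base) in ns, shown as ms with 3 decimals.

    The fraction is truncated here; the real tool may round differently.
    """
    delta_ns = (sample_time_ns - base_time_ns) & U64_MASK  # emulate u64 wraparound
    msec, rem_ns = divmod(delta_ns, NSEC_PER_MSEC)
    return f"{msec}.{rem_ns // 1000:03d}"


# Before any base time has been recorded (base of 0).
print(format_trace_timestamp(MISSING_TIME_NS, 0))
# After a base time has been recorded from a sample that does carry a time.
print(format_trace_timestamp(MISSING_TIME_NS, ILLUSTRATIVE_BASE_NS))
# A sample with a real timestamp 310 us after the base.
print(format_trace_timestamp(ILLUSTRATIVE_BASE_NS + 310_000, ILLUSTRATIVE_BASE_NS))
```

With these assumptions the three calls give 18446744073709.551, 18446679523046.461 and 0.310, the same shapes as the cycles and nanosleep lines in the output.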
- Arnaldo
* Re: LBR callchains from tracepoints
2016-04-26 18:20 ` Arnaldo Carvalho de Melo
@ 2016-04-26 19:07 ` Peter Zijlstra
0 siblings, 0 replies; 5+ messages in thread
From: Peter Zijlstra @ 2016-04-26 19:07 UTC
To: Arnaldo Carvalho de Melo
Cc: Alexei Starovoitov, Stephane Eranian, David Ahern, Milian Wolff,
Frédéric Weisbecker, Ingo Molnar, Namhyung Kim,
Linux Kernel Mailing List
On Tue, Apr 26, 2016 at 03:20:32PM -0300, Arnaldo Carvalho de Melo wrote:
> Yeah, and that already works in perf, its just that it doesn't work from some
> points (PERF_TYPE_SOFTWARE, PERF_TYPE_TRACEPOINT, etc), as described in the
> changeset I mentioned.
Look at it the other way around: it _only_ works for Intel CPU events,
_nothing_ else.
There is only a single PMU that supports LBR callgraph thingies, all the
others do not.