From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>,
"Stephane Eranian" <eranian@google.com>,
"David Ahern" <dsahern@gmail.com>,
"Milian Wolff" <milian.wolff@kdab.com>,
"Frédéric Weisbecker" <fweisbec@gmail.com>,
"Ingo Molnar" <mingo@kernel.org>,
"Namhyung Kim" <namhyung@kernel.org>,
"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>
Subject: Re: LBR callchains from tracepoints
Date: Tue, 26 Apr 2016 15:20:32 -0300
Message-ID: <20160426182032.GJ11033@kernel.org>
In-Reply-To: <20160426172629.GA42777@ast-mbp.thefacebook.com>
On Tue, Apr 26, 2016 at 10:26:32AM -0700, Alexei Starovoitov wrote:
> On Tue, Apr 26, 2016 at 06:38:28PM +0200, Peter Zijlstra wrote:
> > On Mon, Apr 25, 2016 at 10:24:31PM -0300, Arnaldo Carvalho de Melo wrote:
> > > On Mon, Apr 25, 2016 at 10:03:58PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > I now need to continue investigation why this doesn't seem to work from
> > > > tracepoints...
> > > Bummer, the changeset (at the end of this message) hasn't any
> > > explanation, is this really impossible? I.e. LBR callstacks from
> > > tracepoints? Even if we set perf_event_attr.exclude_callchain_kernel?
> > Could maybe be done, but it's tricky to implement as the LBR is managed
> > by the hardware PMU and tracepoints are a software PMU, so we need to
> > then somehow frob with cross-pmu resources, in a very arch specific way.
> > And programmability of the hardware PMU will then depend on events
> > outside of it.
> btw we're thinking to add support for lbr to bpf, so that from the program
> we can get accurate and fast stacks. That's especially important for user
> space stacks. No clear idea how to do it yet, but it would be really useful.
Yeah, and that already works in perf, it's just that it doesn't work from some
event types (PERF_TYPE_SOFTWARE, PERF_TYPE_TRACEPOINT, etc), as described in the
changeset I mentioned.
'perf trace --call-graph lbr' doesn't work right now, even when it is
interested only in the user space bits, i.e. when setting
perf_event_attr.exclude_callchain_kernel.
# perf trace --call-graph dwarf
works, but that, as you mention, really isn't "fast" and is sometimes not
accurate, or at least wasn't with broken toolchains.
Here is an example that mixes strace-like output with userspace-only DWARF
callchains (would be lovely if this was with LBR, huh?), fp callchains for the
sched:sched_switch tracepoint, and LBR callchains for a hardware event, cycles.
Look further below for the reason for the broken timestamps on
PERF_TYPE_HARDWARE events:
# perf trace -e nanosleep --event sched:sched_switch/call-graph=fp/ --ev cycles/call-graph=lbr,period=100/ usleep 1
18446744073709.551 ( ): cycles/call-graph=lbr,period=100/:)
__intel_pmu_enable_all+0xfe200080 ([kernel.kallsyms])
intel_pmu_enable_all+0xfe200010 ([kernel.kallsyms])
x86_pmu_enable+0xfe200271 ([kernel.kallsyms])
perf_pmu_enable.part.81+0xfe200007 ([kernel.kallsyms])
ctx_resched+0xfe20007a ([kernel.kallsyms])
perf_event_exec+0xfe20011d ([kernel.kallsyms])
setup_new_exec+0xfe20006f ([kernel.kallsyms])
load_elf_binary+0xfe2003e3 ([kernel.kallsyms])
search_binary_handler+0xfe20009e ([kernel.kallsyms])
do_execveat_common.isra.38+0xfe20052c ([kernel.kallsyms])
sys_execve+0xfe20003a ([kernel.kallsyms])
do_syscall_64+0xfe200062 ([kernel.kallsyms])
return_from_SYSCALL_64+0xfe200000 ([kernel.kallsyms])
[0] ([unknown])
0.310 ( 0.006 ms): usleep/20951 nanosleep(rqtp: 0x7ffda8904500 ) ...
0.310 ( ): sched:sched_switch:usleep:20951 [120] S ==> swapper/3:0 [120])
__schedule+0xfe200402 ([kernel.kallsyms])
schedule+0xfe200035 ([kernel.kallsyms])
do_nanosleep+0xfe20006f ([kernel.kallsyms])
hrtimer_nanosleep+0xfe2000dc ([kernel.kallsyms])
sys_nanosleep+0xfe20007a ([kernel.kallsyms])
do_syscall_64+0xfe200062 ([kernel.kallsyms])
return_from_SYSCALL_64+0xfe200000 ([kernel.kallsyms])
__nanosleep+0xffff00bfad62c010 (/usr/lib64/libc-2.22.so)
18446679523046.461 ( ): cycles/call-graph=lbr,period=100/:)
perf_pmu_enable.part.81+0xfe200007 ([kernel.kallsyms])
__perf_event_task_sched_in+0xfe2001ad ([kernel.kallsyms])
finish_task_switch+0xfe200156 ([kernel.kallsyms])
__schedule+0xfe200397 ([kernel.kallsyms])
schedule+0xfe200035 ([kernel.kallsyms])
do_nanosleep+0xfe20006f ([kernel.kallsyms])
hrtimer_nanosleep+0xfe2000dc ([kernel.kallsyms])
sys_nanosleep+0xfe20007a ([kernel.kallsyms])
do_syscall_64+0xfe200062 ([kernel.kallsyms])
return_from_SYSCALL_64+0xfe200000 ([kernel.kallsyms])
[0] ([unknown])
0.377 ( 0.073 ms): usleep/20951 ... [continued]: nanosleep()) = 0
[root@jouet ~]#
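The first cycles timestamp in the output above, 18446744073709.551, is what an all-ones 64-bit nanosecond value looks like when printed in milliseconds, which suggests the sample carried no real time at all. A minimal sketch of that arithmetic follows. The helper name ns_to_ms_str is hypothetical, and truncating to three decimals is an assumption made to match the digits shown, not a claim about how perf formats the value:

```python
# Sketch: show that an all-ones u64 nanosecond value, printed in
# milliseconds, reproduces the odd timestamp seen in the trace output.

UINT64_MAX = 2**64 - 1  # (u64)-1, i.e. every bit set


def ns_to_ms_str(ns: int) -> str:
    """Format a nanosecond count as milliseconds, truncated to 3 decimals.

    Hypothetical helper for illustration; integer math avoids the
    rounding that floating point would introduce at this magnitude.
    """
    whole_ms, rem_ns = divmod(ns, 1_000_000)
    return f"{whole_ms}.{rem_ns // 1000:03d}"


if __name__ == "__main__":
    print(ns_to_ms_str(UINT64_MAX))
    # A sane timestamp for comparison: 310000 ns after the base time.
    print(ns_to_ms_str(310_000))
```

The second call formats a small value the same way as the nanosleep lines, so the two kinds of timestamp can be compared side by side.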
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  config                           0 (PERF_COUNT_HW_CPU_CYCLES)
  size                             112
  { sample_period, sample_freq }   100
  sample_type                      IP|TID|CALLCHAIN|BRANCH_STACK|IDENTIFIER
  read_format                      ID
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  sample_id_all                    1
  exclude_guest                    1
  { wakeup_events, wakeup_watermark } 1
  branch_sample_type               USER|CALL_STACK|NO_FLAGS|NO_CYCLES
The sample_type above is missing PERF_SAMPLE_TIME, hence the broken timestamps
for the cycles event; will fix.
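To make the missing bit concrete, here is a small sketch that turns the sample_type string from the dump into its numeric mask and checks for TIME. The bit positions follow enum perf_event_sample_format in <linux/perf_event.h>; the helper names encode_sample_type and has_flag are made up for this illustration and are not part of perf:

```python
# Sketch: encode the dumped sample_type flags and test for PERF_SAMPLE_TIME.

# Bit positions of the PERF_SAMPLE_* flags (value is 1 << position).
PERF_SAMPLE_BITS = {
    "IP": 0, "TID": 1, "TIME": 2, "ADDR": 3, "READ": 4, "CALLCHAIN": 5,
    "ID": 6, "CPU": 7, "PERIOD": 8, "STREAM_ID": 9, "RAW": 10,
    "BRANCH_STACK": 11, "REGS_USER": 12, "STACK_USER": 13, "WEIGHT": 14,
    "DATA_SRC": 15, "IDENTIFIER": 16,
}


def encode_sample_type(names):
    """OR together the bits for each flag name (hypothetical helper)."""
    mask = 0
    for name in names:
        mask |= 1 << PERF_SAMPLE_BITS[name]
    return mask


def has_flag(mask, name):
    """Return True if the named PERF_SAMPLE_* bit is set in mask."""
    return bool(mask & (1 << PERF_SAMPLE_BITS[name]))


# The sample_type exactly as printed in the perf_event_attr dump.
dumped = "IP|TID|CALLCHAIN|BRANCH_STACK|IDENTIFIER"
mask = encode_sample_type(dumped.split("|"))

# What the mask would look like with the time bit added.
fixed = mask | (1 << PERF_SAMPLE_BITS["TIME"])

if __name__ == "__main__":
    print(hex(mask), "TIME set:", has_flag(mask, "TIME"))
    print(hex(fixed), "TIME set:", has_flag(fixed, "TIME"))
```

This only models the bitmask; it says nothing about where in the tool the flag ought to be set.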
- Arnaldo