From: Peter Zijlstra <peterz@infradead.org>
To: Stephane Eranian <eranian@google.com>
Cc: Kan Liang <kan.liang@intel.com>,
LKML <linux-kernel@vger.kernel.org>,
"mingo@redhat.com" <mingo@redhat.com>,
Paul Mackerras <paulus@samba.org>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Jiri Olsa <jolsa@redhat.com>,
"ak@linux.intel.com" <ak@linux.intel.com>
Subject: Re: [PATCH V7 13/17] perf, x86: enable LBR callstack when recording callchain
Date: Wed, 5 Nov 2014 16:45:35 +0100
Message-ID: <20141105154535.GU3337@twins.programming.kicks-ass.net>
In-Reply-To: <CABPqkBRF6VEigBcBr+6Hy5E3tmfk4vpDiR8Zui8=853GTjG_XQ@mail.gmail.com>
On Wed, Nov 05, 2014 at 02:22:07PM +0100, Stephane Eranian wrote:
> I tend to agree here. The problem with FP is that it is not easy to figure
> out how a binary has been compiled. Getting valid FP callchains for
> large binaries using lots of shared libraries is very challenging. All
> libraries must be compiled with FP. It is not easy to test if FP was
> compiled in. There is no ELF header flag for this. Need to inspect
> the x86 asm and look at function prologues.
build world ftw :-), I realize that on many distros this is hard, but in
some environments it's really rather easy.
But yes, it's tedious without the capability to build world.
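To make the "look at function prologues" step concrete, here is a rough sketch of such a check. It is a heuristic only, assuming x86-64 and that you have already extracted a function's first bytes from the binary by some other means; the helper name `has_fp_prologue` is made up for illustration. Compilers are free to schedule other instructions ahead of the frame setup, so a miss does not prove the frame pointer was omitted.

```python
# Heuristic sketch (hypothetical helper, not part of perf or any tool):
# does a function's machine code start with the classic x86-64
# frame-pointer prologue "push %rbp; mov %rsp,%rbp"?

ENDBR64 = bytes.fromhex("f30f1efa")   # optional CET landing pad before the prologue
PUSH_RBP = bytes.fromhex("55")        # push %rbp
MOV_RSP_RBP = (
    bytes.fromhex("4889e5"),          # mov %rsp,%rbp (opcode 0x89 encoding)
    bytes.fromhex("488bec"),          # mov %rsp,%rbp (opcode 0x8b encoding)
)


def has_fp_prologue(code: bytes) -> bool:
    """Return True if `code` begins with a recognisable frame-pointer setup."""
    if code.startswith(ENDBR64):
        code = code[len(ENDBR64):]
    if not code.startswith(PUSH_RBP):
        return False
    return code[1:4] in MOV_RSP_RBP
```

Running something like this over every function of every shared library a process maps is exactly the tedium being discussed.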
> This is where LBR has an advantage, it works regardless of how
> binaries and shared libs have been compiled. That is why this is
> a good (or some would say better) approach which uses hardware
> assist.
Right, but only because we made a mess of the thing in the first place
:-/
> > We're all more familiar with FP, and it doesn't have the obvious problem
> > of only 16 entries. I've worked on quite a bit of software that had much
> > deeper callchains -- yay for recursive algorithms and/or C++.
> >
> Yes, this is true too. But it is not so clear to me if people really care about
> top of callchains that much. I think usually 2-6 would probably yield enough
> useful info.
Right, with C++ if you have a particularly gruesome object hierarchy a
simple constructor can blow your entire 16 calls out the window, so when
you then get around to doing actual work there's nothing left.
But yes, that should not be too common I think.
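The constructor case can be illustrated with a toy model. This is only a sketch under stated assumptions: calls push an entry, returns pop one, and a full 16-entry stack silently drops its oldest entry. The class and function names are invented for the example and the real hardware differs in detail.

```python
from collections import deque


class LbrCallStack:
    """Toy model of a fixed-depth hardware call stack (assumed behaviour)."""

    def __init__(self, depth=16):
        # deque(maxlen=...) discards the oldest entry when a push overflows
        self.entries = deque(maxlen=depth)

    def call(self, callee):
        self.entries.append(callee)

    def ret(self):
        if self.entries:          # popping an empty stack is a no-op
            self.entries.pop()

    def snapshot(self):
        return list(self.entries)


def deep_then_work(ctor_depth):
    """Nest `ctor_depth` constructor calls, unwind, then call do_work()."""
    lbr = LbrCallStack()
    true_stack = []
    for name in ("main", "run", "make_object"):
        lbr.call(name)
        true_stack.append(name)
    for i in range(ctor_depth):
        lbr.call(f"ctor{i}")
        true_stack.append(f"ctor{i}")
    for _ in range(ctor_depth):
        lbr.ret()
        true_stack.pop()
    lbr.ret()                     # make_object() returns
    true_stack.pop()
    lbr.call("do_work")
    true_stack.append("do_work")
    return true_stack, lbr.snapshot()
```

In this model `deep_then_work(5)` leaves both stacks at main, run, do_work, while `deep_then_work(20)` leaves the modelled LBR with only do_work: the outer frames were pushed out during construction and never come back.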
> LBR callstack fails for leaf function optimization. Where the callee does
> not return to its caller but instead to the caller's caller. That is the one
> case I know about. There are others I believe.
Yeah, tail call and long jump might also confuse the thing, I can't
remember.
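A toy replay shows how a long jump could leave stale entries behind. The key assumption, which is not confirmed anywhere in this thread, is that the non-local jump unwinds software frames without the hardware seeing matching returns; `simulate` and the event tuples are made up for the sketch.

```python
from collections import deque


def simulate(events, depth=16):
    """Replay ('call', name), ('ret',) and ('longjmp', frames) events.

    Returns (true_stack, lbr_stack). Toy model only.
    """
    true_stack = ["main"]
    lbr = deque(maxlen=depth)
    for ev in events:
        if ev[0] == "call":
            true_stack.append(ev[1])
            lbr.append(ev[1])
        elif ev[0] == "ret":
            true_stack.pop()
            if lbr:
                lbr.pop()
        elif ev[0] == "longjmp":
            # Assumption: the jump discards ev[1] software frames,
            # but the modelled hardware stack pops nothing.
            del true_stack[len(true_stack) - ev[1]:]
    return true_stack, list(lbr)
```

Replaying call a, call b, call c, a long jump over three frames, then call d gives a true stack of main, d, while the modelled LBR still reports a, b, c, d.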
> > With a bit of care FP can be 'perfect', although Andi likes to point out
> > that glibc isn't and often wrecks FP :-(
> >
> Especially any hand-crafted assembly...
Well, it doesn't need to. But yes, it's easy to do wrong in that case.
> I don't think it would be very hard to modify the patch set to make that 3rd
> mode visible. Just need to make that new PERF_RECORD_* type visible
> to user and modify the compatibility checks.
There's no new RECORD type afaict; would not the relatively simple patch
I proposed be enough? It exposes PERF_SAMPLE_BRANCH_CALL_STACK and you'd
get the data through the normal PERF_SAMPLE_BRANCH_STACK output.