linux-perf-users.vger.kernel.org archive mirror
From: Milian Wolff <milian.wolff@kdab.com>
To: andi@firstfloor.org
Cc: linux-perf-users@vger.kernel.org
Subject: Re: usability issues with inlining and backtraces
Date: Mon, 13 Jun 2016 18:07:13 +0200
Message-ID: <3237195.qkayrO3A0T@milian-kdab2>
In-Reply-To: <7872019.1GQhL8td4o@milian-kdab2>

Ping?

Andi, I'd be very interested in your answers to my questions below. Most
notably, `srcline` with `--call-graph ...,address` seems to be highly buggy 
for me.

Thanks

On Monday, May 23, 2016 5:18:53 PM CEST Milian Wolff wrote:
> On Monday, May 23, 2016 7:59:54 AM CEST Andi Kleen wrote:
> > Milian Wolff <milian.wolff@kdab.com> writes:
> > > Here, the cost of inlined functions (random.tcc:3332,random.h:143) is
> > > attributed to the main function. This is of course correct, but very
> > > unhelpful to me as a programmer. I'm much more interested in the line
> > > inside test.cpp which triggered the call to random.tcc:3332 etc. Is there
> > > a way to get that data? Note how you can get a way better backtrace when
> > > using GDB on the same binary as above:
> > 
> > Yes this works using --call-graph ...,address
> > 
> > If you use srcfile it can resolve inlines. It is currently not supported
> > for other backtraces, but yes expanding them in the normal history
> > would be useful.
> 
> Can you expand on this part a bit please? It's the first time I realize that
> I can add a sort key to `--call-graph`, so at the very least this shows
> that the defaults are bad.
> 
> For the example I'm now trying this:
> 
>     $ perf report -g graph,address -s sym,srcline --no-children --stdio
> 
> # Overhead  Symbol                       Source:Line
> # ........  ...........................  ..................
> #
>     21.82%  [.] main                     random.tcc:3332
> 
>             |--3.84%--main +8388740
>             |
>             |          __libc_start_main +140389813309681
>             |          _start +8388649
>             | 
>             | ...
> 
>      6.29%  [.] __hypot_finite           __hypot_finite+152
> 
>             ---hypot +140389819301908
>                main +8388987
>                __libc_start_main +140389813309681
>                _start +8388649
> 
> -> much better, but still unusable backtraces. And my `perf report` only
> allows this:
> 
>     sort_key:       call graph sort key (function|address)
> 
> So we need to add support for "|srcline" here as well?
> 
> > > On one hand, we have the same issue as above, namely inlined functions
> > > being attributed directly to the parent function. See how the backtrace
> > > shows main calling hypot? Look at the source, I'm not calling hypot
> > > anywhere - it's std::norm calling it internally eventually. And GDB does
> > > know about that and can give me a proper backtrace.
> > 
> > Use srcfile then.
> 
> See above, can you please extend this answer a bit? How would you use
> srcfile, and where, to get the full backtrace? Is that maybe a patch that
> is not yet included in acme's perf/core branch?
> 
> > > But, differently to above, the major gripe I have with this output is
> > > exemplified by this part:
> > > 
> > > ~~~~~~~~~
> > > 
> > >      3.63%  [.] __hypot_finite         __hypot_finite+257
> > >      
> > >             ---hypot
> > >             
> > >                main
> > >                __libc_start_main
> > >                _start
> > > 
> > > ~~~~~~~~~
> > > 
> > > I added `srcline` to the report, but the backtrace still only contains the
> > > symbol name. I hope that we can simply honor srcline there as well, to at
> > > least print something like this instead:
> > 
> > --call-graph ....,address
> > 
> > (or --branch-history)
> 
> Both produce output like I've shown above, i.e. they always put down
> 
>   <symbol> +<offset>
> 
> instead of using the much more usable
> 
>   <symbol> <file>:<line>
> 
> format. What am I missing?
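
A minimal sketch of the conversion asked for above, in Python. It parses the `<symbol> +<number>` entries from the report and builds (without running) an `addr2line` command that would also list inlined frames. Two assumptions are made here and are not confirmed by the thread: the number after `+` is treated as an absolute address printed in decimal (values such as 140389813309681 are far too large to be offsets into a function), and `./a.out` is a hypothetical binary name. For addresses inside shared libraries the load base would have to be subtracted first, which this sketch does not attempt.

```python
import re
import shlex

# Matches one call-chain entry such as "main +8388740" from the report above.
ENTRY_RE = re.compile(r"^(?P<symbol>\S+)\s+\+(?P<number>\d+)$")


def parse_entry(text):
    """Split a '<symbol> +<number>' call-chain entry into (symbol, number)."""
    # Drop the tree-drawing prefix ("|--3.84%--", "---", indentation).
    cleaned = re.sub(r"^[\s|-]*(?:\d+\.\d+%-+)?", "", text).strip()
    match = ENTRY_RE.match(cleaned)
    if match is None:
        raise ValueError(f"not a call-chain entry: {text!r}")
    return match.group("symbol"), int(match.group("number"))


def addr2line_command(binary, address):
    """Build an addr2line call: -f prints functions, -i expands inlines, -C demangles."""
    # ASSUMPTION: 'address' is an absolute address that addr2line can look up
    # directly in 'binary'; this only holds for a non-relocated executable.
    return ["addr2line", "-e", binary, "-f", "-i", "-C", hex(address)]


# Entry copied from the report above; "./a.out" is a hypothetical binary name.
symbol, address = parse_entry("            |--3.84%--main +8388740")
print(symbol, shlex.join(addr2line_command("./a.out", address)))
```

Running the printed command by hand against the profiled binary should show whether the inline chain can be recovered outside of perf, which helps separate missing debug information from a reporting problem.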
> 
> > > ########## symbol cost aggregation
> > > 
> > > Once we have that implemented, can we maybe account for the following: I
> > > told `perf report` to aggregate by symbol first, then srcline, i.e. `-s
> > > sym,srcline`. But the report seems to aggregate by `srcline`, because I
> > > see some symbols (__hypot_finite) multiple times, for different code
> > > points. Can we merge those and start the backtrace then at the different
> > > code points?
> > > Something like this would be my desirable output:
> > 
> > perf sorts by address, but the compiler can generate the same symbol
> > in multiple versions or in many inlined instances. That's probably
> > more work to fix.
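
The merge requested above can be sketched outside of perf as a post-processing step. The Python below groups report rows by symbol, sums their overhead, and keeps the individual code points underneath. The rows are copied from the reports quoted in this message; the function name and the data layout are assumptions made for illustration and do not describe how `perf report` works internally.

```python
from collections import defaultdict


def merge_by_symbol(rows):
    """Group (overhead_percent, symbol, srcline) rows by symbol."""
    merged = defaultdict(lambda: {"total": 0.0, "code_points": []})
    for overhead, symbol, srcline in rows:
        merged[symbol]["total"] += overhead
        merged[symbol]["code_points"].append((srcline, overhead))
    # List the most expensive code point of each symbol first.
    for info in merged.values():
        info["code_points"].sort(key=lambda item: item[1], reverse=True)
    return dict(merged)


# Rows taken from the perf report output quoted in this message.
rows = [
    (21.82, "main", "random.tcc:3332"),
    (6.29, "__hypot_finite", "__hypot_finite+152"),
    (3.63, "__hypot_finite", "__hypot_finite+257"),
]

for symbol, info in merge_by_symbol(rows).items():
    print(f"{info['total']:6.2f}%  {symbol}")
    for srcline, overhead in info["code_points"]:
        print(f"         {overhead:6.2f}%  {srcline}")
```

With this grouping, `__hypot_finite` appears once, and each code point becomes the natural place to start a separate backtrace.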
> > 
> > > As such, I propose a "simplified" `perf annotate` output, which gets
> > > closer to what you'll see in other profilers, e.g. VTune or also
> > > Microsoft Visual studio's sampling profilers: Use the source files and
> > > annotate those with inclusive cost. Only show the cost of individual
> > > binary instructions (status quo) when explicitly asked. I.e. I want the
> > > output to look something like that (I made up the actual percentages):
> > 
> > Sounds like the fallacy of abstracted performance analysis.
> 
> Can you please elaborate on that some more as well? For me, as a user-space
> application developer, getting an aggregated inclusive cost for the source
> code I actually write is extremely useful. Of course, when it's not directly
> clear from that view why something is slow I still need to dig deeper and
> get to the current in-depth assembly view. But why is having a simplified
> overview a fallacy? Why is this abstracted performance analysis?
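
To make "aggregated inclusive cost per source line" concrete, here is a small Python sketch. It assumes each sample is already available as a list of `(file, line)` frames with inlined frames expanded, which is exactly the information the backtraces above do not yet provide. The sample data is made up: `random.tcc:3332` and `random.h:143` appear in the report, but the `test.cpp` line numbers are hypothetical.

```python
from collections import Counter


def inclusive_cost_by_line(samples):
    """Return {(file, line): percent of samples whose stack contains that line}."""
    if not samples:
        return {}
    hits = Counter()
    for stack in samples:
        # Count a source line once per sample, even if recursion repeats it.
        for frame in set(stack):
            hits[frame] += 1
    total = len(samples)
    return {frame: 100.0 * count / total for frame, count in hits.items()}


# Hypothetical samples, innermost frame first; test.cpp line numbers are invented.
samples = [
    [("random.tcc", 3332), ("test.cpp", 14)],
    [("random.h", 143), ("test.cpp", 14)],
    [("test.cpp", 16)],
    [("random.tcc", 3332), ("test.cpp", 14)],
]

costs = inclusive_cost_by_line(samples)
for (filename, line), percent in sorted(costs.items(), key=lambda kv: -kv[1]):
    print(f"{percent:6.2f}%  {filename}:{line}")
```

In this view the caller line in `test.cpp` carries the cost of everything inlined into it, while the per-instruction view remains available for digging deeper.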
> 
> Thanks a lot for your input!


-- 
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts
