linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Milian Wolff <milian.wolff@kdab.com>
To: "Jin, Yao" <yao.jin@linux.intel.com>
Cc: linux-perf-users <linux-perf-users@vger.kernel.org>
Subject: Re: Feedback on: [RESEND PATCH v2 0/5] perf report: Show inline stack
Date: Tue, 10 Jan 2017 16:03:50 +0100	[thread overview]
Message-ID: <3552078.l4qC7a8nAH@milian-kdab2> (raw)
In-Reply-To: <780b3634-fa42-01f5-4f95-3c571fa4eb84@linux.intel.com>

[-- Attachment #1: Type: text/plain, Size: 8129 bytes --]

On Tuesday, January 10, 2017 9:53:04 AM CET Jin, Yao wrote:
> Hi Wolff,
> 
> I update the patch series with adding inline searching for each
> callchain entry.

Hey Yao,

great! Is your git branch publicly accessible somewhere, so I can check it 
out?

> Now the result tested on my skl machine is:
> 
> # Overhead  Command  Shared Object        Symbol
> # ........  .......  ................... ..................................
> #
>      56.76%  inline   inline               [.] main
> 
>              ---main
>                 __libc_start_main
>                 _start

This is still wrong, no? Tons of stuff gets inlined into main and it seems as 
if your code is still not capable of displaying what that is. Please refer to 
the output I pasted from my patch in the previous mail. I'm explicitly looking 
for the cost of using <random> e.g.

>      40.54%  inline   libm-2.23.so         [.] __hypot_finite
> 
>              ---__hypot_finite
> 
>                  --40.09%--__hypot
> 
>                            ---/usr/include/c++/5/bits/random.h:151 (inline)
>                               /usr/include/c++/5/bits/random.h:332 (inline)
> /usr/include/c++/5/bits/random.tcc:3328 (inline)
>                               /usr/include/c++/5/bits/random.h:185 (inline)
>                               /usr/include/c++/5/bits/random.h:1818 (inline)
> /usr/include/c++/5/bits/random.h:1809 (inline)
> /home/jinyao/skl-ws/perf-dev/lck-2867/test/inline.cpp:14 (inline)
>                            main
>                            __libc_start_main
>                            _start

Much better, great work! But again I have to ask the same question:

Why not integrate directly with the other call stack nodes? This would solve 
the following problems:

- correct display of children cost for inlined frames, i.e. in a `perf report 
--children --inline`, I suspect your branch won't actually show any top-level 
entries for inline.cpp:14, correct?

- display/aggregation: currently you are always displaying srcline in your 
approach, what if I want to display/aggregate by function symbol instead?

So, could we please discuss this please?

Thanks a lot

> On 1/9/2017 4:41 PM, Milian Wolff wrote:
> > On Monday, January 9, 2017 4:17:48 PM CET Jin, Yao wrote:
> >> Hi Wolff,
> >> 
> >> Thanks a lot for your test code!
> >> 
> >> I have updated my patch. The new output for your test code (binary name
> >> is inline) is:
> >> 
> >> # Overhead  Command  Shared Object        Symbol
> >> # ........  .......  ...................
> >> ..................................
> >> #
> >> 
> >>       56.76%  inline   inline               [.] main
> >>       
> >>               ---main
> >>               
> >>                  __libc_start_main
> >>                  _start
> >>       
> >>       40.54%  inline   libm-2.23.so         [.] __hypot_finite
> >>       
> >>               ---__hypot_finite
> >>               
> >>                   --40.09%--__hypot
> >>                   
> >>                             main
> >>                             __libc_start_main
> >>                             _start
> >>        
> >>        1.30%  inline   libm-2.23.so         [.] __hypot
> >>        
> >>               ---__hypot
> >>               
> >>                  main
> >>                  
> >>                  ---/usr/include/c++/5/bits/random.h:151 (inline)
> >>                  
> >>                     /usr/include/c++/5/bits/random.h:332 (inline)
> >>                     /usr/include/c++/5/bits/random.tcc:3328 (inline)
> >>                     /usr/include/c++/5/bits/random.h:185 (inline)
> >>                     /usr/include/c++/5/bits/random.h:1818 (inline)
> >>                     /usr/include/c++/5/bits/random.h:1809 (inline)
> >> 
> >> /home/jinyao/skl-ws/perf-dev/lck-2867/test/inline.cpp:14 (inline)
> >> 
> >>                  __libc_start_main
> >>                  _start
> >> 
> >> The inline entry is tagged with "(inline)". Is this result within
> >> expectation?
> > 
> > No, this is not the expected output. I expect to get information about the
> > inlined frames in the middle of the call stack, in this example that
> > happens for the first two major hotspots.
> > 
> > Below is the output from my approach to this problem using my -g srcline:
> > 
> > ~~~~~~~~~~~~~~ pasted here for readability:
> > https://paste.kde.org/pmq3bc1le
> > Samples: 149  of event 'cycles:ppp', Event count (approx.): 141334499
> > 
> >    Overhead  Command  Shared Object     Symbol
> > 
> > -   62.24%  inline   inline            [.] main
> > 
> >     - 27.98% std::generate_canonical<double, 53ul,
> > 
> > std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>
> > > random.tcc:3332
> > 
> >          std::__detail::_Adaptor<std::linear_congruential_engine<unsigned
> >          long,
> > 
> > 16807ul, 0ul, 2147483647ul>, double>::operator() random.h:185
> > 
> >          std::uniform_real_distribution<double>::operator()<std::linear_co
> >          ngruential_engine<unsigned> 
> > long, 16807ul, 0ul, 2147483647ul> > random.h:1818
> > 
> >          std::uniform_real_distribution<double>::operator()<std::linear_co
> >          ngruential_engine<unsigned> 
> > long, 16807ul, 0ul, 2147483647ul> > random.h:1809
> > 
> >          main ine.cpp:14
> >          __libc_start_main +241
> >          _start +4194346
> >     
> >     - 25.88% std::__detail::_Mod<unsigned long, 2147483647ul, 16807ul,
> >     0ul,
> > 
> > true, true>::__calc random.h:143
> > 
> >          std::__detail::__mod<unsigned long, 2147483647ul, 16807ul, 0ul>
> > 
> > random.h:151
> > 
> >          std::linear_congruential_engine<unsigned long, 16807ul, 0ul,
> > 
> > 2147483647ul>::operator() random.h:332
> > 
> >          std::generate_canonical<double, 53ul,
> > 
> > std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>
> > > random.tcc:3332
> > 
> >          std::__detail::_Adaptor<std::linear_congruential_engine<unsigned
> >          long,
> > 
> > 16807ul, 0ul, 2147483647ul>, double>::operator() random.h:185
> > 
> >          std::uniform_real_distribution<double>::operator()<std::linear_co
> >          ngruential_engine<unsigned> 
> > long, 16807ul, 0ul, 2147483647ul> > random.h:1818
> > 
> >          std::uniform_real_distribution<double>::operator()<std::linear_co
> >          ngruential_engine<unsigned> 
> > long, 16807ul, 0ul, 2147483647ul> > random.h:1809
> > 
> >          main ine.cpp:14
> >          __libc_start_main +241
> >          _start +4194346
> >     
> >     + 4.14% main inline.cpp:13
> >     + 2.89%
> > 
> > std::uniform_real_distribution<double>::operator()<std::linear_congruentia
> > l_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > random.h:1818
> > 
> >     + 0.68% std::_Norm_helper<true>::_S_do_it<double> complex:655
> >     + 0.67% std::generate_canonical<double, 53ul,
> > 
> > std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>
> > > random.tcc:3326
> > -   26.24%  inline   libm-2.24.so      [.] __hypot_finite
> > 
> >     - __hypot_finite +179
> >     
> >        - 25.56% hypot +20
> >        
> >             std::__complex_abs complex:589
> >             std::abs<double> complex:597
> >             std::_Norm_helper<true>::_S_do_it<double> complex:654
> >             std::norm<double> complex:664
> >             main ine.cpp:14
> >             __libc_start_main +241
> >             _start +4194346
> >          
> >          0.68% 0x40ead238ed1a2cf4
> > 
> > -    6.85%  inline   libm-2.24.so      [.] hypot
> > 
> >       hypot +40
> >       std::__complex_abs complex:589
> >       std::abs<double> complex:597
> >       std::_Norm_helper<true>::_S_do_it<double> complex:654
> >       std::norm<double> complex:664
> >       main ine.cpp:14
> >       __libc_start_main +241
> >       _start +4194346
> > 
> > ~~~~~~~~~~~~~~
> > 
> > I hope you see the value of such a report versus what you are proposing.
> > 
> > Thanks


-- 
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5903 bytes --]

  reply	other threads:[~2017-01-10 15:03 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-07 20:13 Feedback on: [RESEND PATCH v2 0/5] perf report: Show inline stack Milian Wolff
2017-01-09  3:07 ` Jin, Yao
2017-01-09  8:02   ` Milian Wolff
2017-01-09  8:17     ` Jin, Yao
2017-01-09  8:41       ` Milian Wolff
2017-01-10  1:53         ` Jin, Yao
2017-01-10 15:03           ` Milian Wolff [this message]
2017-01-11  1:01             ` Jin, Yao
2017-01-13  8:56               ` Milian Wolff

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3552078.l4qC7a8nAH@milian-kdab2 \
    --to=milian.wolff@kdab.com \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=yao.jin@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).