All of lore.kernel.org
 help / color / mirror / Atom feed
From: Namhyung Kim <namhyung@kernel.org>
To: Milian Wolff <milian.wolff@kdab.com>
Cc: Taeung Song <treeze.taeung@gmail.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	linux-kernel@vger.kernel.org,
	Adrian Hunter <adrian.hunter@intel.com>,
	Andi Kleen <ak@linux.intel.com>, David Ahern <dsahern@gmail.com>,
	Jin Yao <yao.jin@linux.intel.com>, Jiri Olsa <jolsa@redhat.com>,
	Kim Phillips <kim.phillips@arm.com>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Wang Nan <wangnan0@huawei.com>,
	kernel-team@lge.com
Subject: Re: [PATCH/RFC 0/4] perf annotate: Add --source-only option and the new source code TUI view
Date: Thu, 29 Jun 2017 16:11:01 +0900	[thread overview]
Message-ID: <20170629071101.GA10884@sejong> (raw)
In-Reply-To: <1571575.U9nG2MaxHg@milian-kdab2>

Hello,

On Wed, Jun 28, 2017 at 11:53:22AM +0200, Milian Wolff wrote:
> On Wednesday, June 28, 2017 5:18:08 AM CEST Taeung Song wrote:
> > Hi,
> > 
> > The --source-only option and new source code TUI view can show
> > the result of performance analysis based on full source code per
> > symbol(function). (Namhyung Kim told me this idea and it was also requested
> > by others some time ago..)
> > 
> > If someone wants to see the cause, he/she will need to dig into the asm.
> > But before that, looking at the source level can give a hint or clue
> > for the problem.
> > 
> > For example, if target symbol is 'hex2u64' of util/util.c,
> > the output is like below.
> > 
> >     $ perf annotate --source-only --stdio -s hex2u64
> >  Percent |      Source code of util.c for cycles:ppp (42 samples)
> > -----------------------------------------------------------------
> >     0.00 : 354   * While we find nice hex chars, build a long_val.
> >     0.00 : 355   * Return number of chars processed.
> >     0.00 : 356   */
> >     0.00 : 357  int hex2u64(const char *ptr, u64 *long_val)
> >     2.38 : 358  {
> >     2.38 : 359          const char *p = ptr;
> >     0.00 : 360          *long_val = 0;
> >     0.00 : 361
> >    30.95 : 362          while (*p) {
> >    23.81 : 363                  const int hex_val = hex(*p);
> >     0.00 : 364
> >    14.29 : 365                  if (hex_val < 0)
> >     0.00 : 366                          break;
> >     0.00 : 367
> >    26.19 : 368                  *long_val = (*long_val << 4) | hex_val;
> >     0.00 : 369                  p++;
> >     0.00 : 370          }
> >     0.00 : 371
> >     0.00 : 372          return p - ptr;
> >     0.00 : 373  }
> > 
> > And I added many perf developers into Cc: because I want to listen to your
> > opinions about this new feature, if you don't mind.
> > 
> > If you give some feedback, I'd appreciate it! :)
> 
> Thanks Taeung,
> 
> I requested this feature some time ago and it's really cool to see someone 
> step up and implement it - much appreciated!
> 
> I just tested it out on my pet-example that leverages C++ instead of C:
> 
> ~~~~~
> #include <complex>
> #include <cmath>
> #include <random>
> #include <iostream>
> 
> using namespace std;
> 
> int main()
> {
>     uniform_real_distribution<double> uniform(-1E5, 1E5);
>     default_random_engine engine;
>     double s = 0;
>     for (int i = 0; i < 10000000; ++i) {
>         s += norm(complex<double>(uniform(engine), uniform(engine)));
>     }
>     cout << s << '\n';
>     return 0;
> }
> ~~~~~
> 
> Compile it with:
> 
> g++ -O2 -g -std=c++11 test.cpp -o test
> 
> Then record it with perf:
> 
> perf record --call-graph dwarf ./test
> 
> Then analyse it with `perf report`. You'll see one entry for main with 
> something like:
> 
> +  100.00%    39.69%  cpp-inlining  cpp-inlining      [.] main
> 
> Select it and annotate it, then switch to your new source-only view:
> 
> main  test.cpp
>        │  30                                                                                                                                                                                             >        │  31    using namespace std;                                                                                                                                                                     >        │  32                                                                                                                                                                                             >        │  33    int main()                                                                                                                                                                               >        │+ 34    {                                                                                                                                                                                        >        │  35        uniform_real_distribution<double> uniform(-1E5, 1E5);                                                                                                                                >        │  36        default_random_engine engine;                                                                                                                                                        >        │+ 37        double s = 0;                                                                                                                                                                        >        │+ 38        for (int i = 0; i < 10000000; ++i) {                                                                                                                                                 >   4.88 │+ 39            s += norm(complex<double>(uniform(engine), uniform(engine)));                                                                                                                    >        │  40        }                                                                                                                                                                                    >        │  41        cout << s << '\n';                                                                                                                                                                   >        │  42        return 0;                                                                                                                                                                            >        │+ 43    } 
> 
> Note: the line numbers are off b/c my file contains a file-header on-top. 
> Ignore that.
> 
> Note2: There is no column header shown, so it's unclear what the first column 
> represents.
> 
> Note 3: report showed 39.69% self cost in main, 100.00% inclusive. annotate 
> shows 4.88... What is that?
> 
> What this shows, is that it's extremely important to visualize inclusive cost 
> _and_ self cost in this view. Additionally, we need to account for inlining. 
> Right now, we only see the self cost that is directly within main, I suspect. 

Currently perf annotate doesn't use the sample period, it uses sample
count instead and print the percentage within the function.  So it's a
different number to the perf report.  I think we need to fix this
first.

Thanks,
Namhyung


> For C++ this is usually very misleading, and basically makes the annotate view 
> completely useless for application-level profiling. If a second column would 
> be added with the inclusive cost with the ability to drill down, then I could 
> easily see myself using this view.
> 
> I would appreciate if you could take this into account.
> 
> Thanks a lot
> 
> 
> -- 
> Milian Wolff | milian.wolff@kdab.com | Senior Software Engineer
> KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
> Tel: +49-30-521325470
> KDAB - The Qt Experts

  parent reply	other threads:[~2017-06-29  7:11 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-28  3:18 [PATCH/RFC 0/4] perf annotate: Add --source-only option and the new source code TUI view Taeung Song
2017-06-28  3:26 ` Taeung Song
2017-06-28  9:53 ` Milian Wolff
2017-06-28 16:27   ` Taeung Song
2017-06-28 16:32     ` Milian Wolff
2017-06-29  7:11   ` Namhyung Kim [this message]
2017-06-30  7:14     ` Taeung Song
2017-06-30 16:21   ` Taeung Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170629071101.GA10884@sejong \
    --to=namhyung@kernel.org \
    --cc=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=ak@linux.intel.com \
    --cc=dsahern@gmail.com \
    --cc=jolsa@redhat.com \
    --cc=kernel-team@lge.com \
    --cc=kim.phillips@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=milian.wolff@kdab.com \
    --cc=treeze.taeung@gmail.com \
    --cc=wangnan0@huawei.com \
    --cc=yao.jin@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.