linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Namhyung Kim <namhyung@kernel.org>
To: Milian Wolff <milian.wolff@kdab.com>
Cc: acme@kernel.org, jolsa@kernel.org,
	Jin Yao <yao.jin@linux.intel.com>,
	Linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	David Ahern <dsahern@gmail.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	kernel-team@lge.com
Subject: Re: [PATCH v4 14/15] perf report: use srcline from callchain for hist entries
Date: Thu, 5 Oct 2017 13:08:29 +0900	[thread overview]
Message-ID: <20171005040829.GA1584@danjae.aot.lge.com> (raw)
In-Reply-To: <20171001143100.19988-15-milian.wolff@kdab.com>

On Sun, Oct 01, 2017 at 04:30:59PM +0200, Milian Wolff wrote:
> This also removes the symbol name from the srcline column,
> more on this below.
> 
> This ensures we use the correct srcline, which could originate
> from a potentially inlined function. The hist entries used to
> query for the srcline based purely on the IP, which leads to
> wrong results for inlined entries.

Yep, AFAICS current srcline returns the first entry in a inline-chain
and the srcfile (sort key) returns the last.  I think we need to make
it consistent.  It seems this patch fix it when --inline option is
used, but I guess the --no-inline case still has the problem.

Thanks,
Namhyung


> 
> Before:
> 
> ~~~~~
> perf report --inline -s srcline -g none --stdio
> ...
> # Children      Self  Source:Line
> # ........  ........  ..................................................................................................................................
> #
>     94.23%     0.00%  __libc_start_main+18446603487898210537
>     94.23%     0.00%  _start+41
>     44.58%     0.00%  main+100
>     44.58%     0.00%  std::_Norm_helper<true>::_S_do_it<double>+100
>     44.58%     0.00%  std::__complex_abs+100
>     44.58%     0.00%  std::abs<double>+100
>     44.58%     0.00%  std::norm<double>+100
>     36.01%     0.00%  hypot+18446603487892193300
>     25.81%     0.00%  main+41
>     25.81%     0.00%  std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator()+41
>     25.81%     0.00%  std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >+41
>     25.75%    25.75%  random.h:143
>     18.39%     0.00%  main+57
>     18.39%     0.00%  std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator()+57
>     18.39%     0.00%  std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >+57
>     13.80%    13.80%  random.tcc:3330
>      5.64%     0.00%  ??:0
>      4.13%     4.13%  __hypot_finite+163
>      4.13%     0.00%  __hypot_finite+18446603487892193443
> ...
> ~~~~~
> 
> After:
> 
> ~~~~~
> perf report --inline -s srcline -g none --stdio
> ...
> # Children      Self  Source:Line
> # ........  ........  ...........................................
> #
>     94.30%     1.19%  main.cpp:39
>     94.23%     0.00%  __libc_start_main+18446603487898210537
>     94.23%     0.00%  _start+41
>     48.44%     1.70%  random.h:1823
>     48.44%     0.00%  random.h:1814
>     46.74%     2.53%  random.h:185
>     44.68%     0.10%  complex:589
>     44.68%     0.00%  complex:597
>     44.68%     0.00%  complex:654
>     44.68%     0.00%  complex:664
>     40.61%    13.80%  random.tcc:3330
>     36.01%     0.00%  hypot+18446603487892193300
>     26.81%     0.00%  random.h:151
>     26.81%     0.00%  random.h:332
>     25.75%    25.75%  random.h:143
>      5.64%     0.00%  ??:0
>      4.13%     4.13%  __hypot_finite+163
>      4.13%     0.00%  __hypot_finite+18446603487892193443
> ...
> ~~~~~
> 
> Note that this change removes the symbol from the source:line
> hist column. If this information is desired, users should
> explicitly query for it if needed. I.e. run this command
> instead:
> 
> ~~~~~
> perf report --inline -s sym,srcline -g none --stdio
> ...
> # To display the perf.data header info, please use --header/--header-only options.
> #
> #
> # Total Lost Samples: 0
> #
> # Samples: 1K of event 'cycles:uppp'
> # Event count (approx.): 1381229476
> #
> # Children      Self  Symbol                                                                                                                               Source:Line
> # ........  ........  ...................................................................................................................................  ...........................................
> #
>     94.30%     1.19%  [.] main                                                                                                                             main.cpp:39
>     94.23%     0.00%  [.] __libc_start_main                                                                                                                __libc_start_main+18446603487898210537
>     94.23%     0.00%  [.] _start                                                                                                                           _start+41
>     48.44%     0.00%  [.] std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > (inlined)  random.h:1814
>     48.44%     0.00%  [.] std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > (inlined)  random.h:1823
>     46.74%     0.00%  [.] std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator() (inlined)  random.h:185
>     44.68%     0.00%  [.] std::_Norm_helper<true>::_S_do_it<double> (inlined)                                                                              complex:654
>     44.68%     0.00%  [.] std::__complex_abs (inlined)                                                                                                     complex:589
>     44.68%     0.00%  [.] std::abs<double> (inlined)                                                                                                       complex:597
>     44.68%     0.00%  [.] std::norm<double> (inlined)                                                                                                      complex:664
>     39.80%    13.59%  [.] std::generate_canonical<double, 53ul, std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >               random.tcc:3330
>     36.01%     0.00%  [.] hypot                                                                                                                            hypot+18446603487892193300
>     26.81%     0.00%  [.] std::__detail::__mod<unsigned long, 2147483647ul, 16807ul, 0ul> (inlined)                                                        random.h:151
>     26.81%     0.00%  [.] std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>::operator() (inlined)                                 random.h:332
>     25.75%     0.00%  [.] std::__detail::_Mod<unsigned long, 2147483647ul, 16807ul, 0ul, true, true>::__calc (inlined)                                     random.h:143
>     25.19%    25.19%  [.] std::generate_canonical<double, 53ul, std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >               random.h:143
>      4.13%     4.13%  [.] __hypot_finite                                                                                                                   __hypot_finite+163
>      4.13%     0.00%  [.] __hypot_finite                                                                                                                   __hypot_finite+18446603487892193443
> ...
> ~~~~~
> 
> Compared to the old behavior, this reduces duplication in the output.
> Before we used to print the symbol name in the srcline column even
> when the sym column was explicitly requested. I.e. the output was:
> 
> ~~~~~
> perf report --inline -s sym,srcline -g none --stdio
> ...
> # To display the perf.data header info, please use --header/--header-only options.
> #
> #
> # Total Lost Samples: 0
> #
> # Samples: 1K of event 'cycles:uppp'
> # Event count (approx.): 1381229476
> #
> # Children      Self  Symbol                                                                                                                               Source:Line
> # ........  ........  ...................................................................................................................................  ..................................................................................................................................
> #
>     94.23%     0.00%  [.] __libc_start_main                                                                                                                __libc_start_main+18446603487898210537
>     94.23%     0.00%  [.] _start                                                                                                                           _start+41
>     44.58%     0.00%  [.] main                                                                                                                             main+100
>     44.58%     0.00%  [.] std::_Norm_helper<true>::_S_do_it<double> (inlined)                                                                              std::_Norm_helper<true>::_S_do_it<double>+100
>     44.58%     0.00%  [.] std::__complex_abs (inlined)                                                                                                     std::__complex_abs+100
>     44.58%     0.00%  [.] std::abs<double> (inlined)                                                                                                       std::abs<double>+100
>     44.58%     0.00%  [.] std::norm<double> (inlined)                                                                                                      std::norm<double>+100
>     36.01%     0.00%  [.] hypot                                                                                                                            hypot+18446603487892193300
>     25.81%     0.00%  [.] main                                                                                                                             main+41
>     25.81%     0.00%  [.] std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator() (inlined)  std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator()+41
>     25.81%     0.00%  [.] std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > (inlined)  std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >+41
>     25.69%    25.69%  [.] std::generate_canonical<double, 53ul, std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >               random.h:143
>     18.39%     0.00%  [.] main                                                                                                                             main+57
>     18.39%     0.00%  [.] std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator() (inlined)  std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator()+57
>     18.39%     0.00%  [.] std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > (inlined)  std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >+57
>     13.80%    13.80%  [.] std::generate_canonical<double, 53ul, std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >               random.tcc:3330
>      4.13%     4.13%  [.] __hypot_finite                                                                                                                   __hypot_finite+163
>      4.13%     0.00%  [.] __hypot_finite                                                                                                                   __hypot_finite+18446603487892193443
> ...
> ~~~~~
> 
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Yao Jin <yao.jin@linux.intel.com>
> Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
> ---
>  tools/perf/util/callchain.c | 1 +
>  tools/perf/util/event.c     | 1 +
>  tools/perf/util/hist.c      | 2 ++
>  tools/perf/util/symbol.h    | 1 +
>  4 files changed, 5 insertions(+)
> 
> diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
> index 48d2869025b3..1b82225b96c3 100644
> --- a/tools/perf/util/callchain.c
> +++ b/tools/perf/util/callchain.c
> @@ -1065,6 +1065,7 @@ int fill_callchain_info(struct addr_location *al, struct callchain_cursor_node *
>  {
>  	al->map = node->map;
>  	al->sym = node->sym;
> +	al->srcline = node->srcline;
>  	if (node->map)
>  		al->addr = node->map->map_ip(node->map, node->ip);
>  	else
> diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
> index 10366b87d0b5..2c651dc4fcda 100644
> --- a/tools/perf/util/event.c
> +++ b/tools/perf/util/event.c
> @@ -1503,6 +1503,7 @@ int machine__resolve(struct machine *machine, struct addr_location *al,
>  	al->sym = NULL;
>  	al->cpu = sample->cpu;
>  	al->socket = -1;
> +	al->srcline = NULL;
>  
>  	if (al->cpu >= 0) {
>  		struct perf_env *env = machine->env;
> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> index b0fa9c217e1c..25d143053ab5 100644
> --- a/tools/perf/util/hist.c
> +++ b/tools/perf/util/hist.c
> @@ -596,6 +596,7 @@ __hists__add_entry(struct hists *hists,
>  			.map	= al->map,
>  			.sym	= al->sym,
>  		},
> +		.srcline = al->srcline ? strdup(al->srcline) : NULL,
>  		.socket	 = al->socket,
>  		.cpu	 = al->cpu,
>  		.cpumode = al->cpumode,
> @@ -950,6 +951,7 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
>  			.map = al->map,
>  			.sym = al->sym,
>  		},
> +		.srcline = al->srcline ? strdup(al->srcline) : NULL,
>  		.parent = iter->parent,
>  		.raw_data = sample->raw_data,
>  		.raw_size = sample->raw_size,
> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
> index 8f072c28b6d3..a2aec8048bb1 100644
> --- a/tools/perf/util/symbol.h
> +++ b/tools/perf/util/symbol.h
> @@ -209,6 +209,7 @@ struct addr_location {
>  	struct thread *thread;
>  	struct map    *map;
>  	struct symbol *sym;
> +	const char    *srcline;
>  	u64	      addr;
>  	char	      level;
>  	u8	      filtered;
> -- 
> 2.14.2
> 

  reply	other threads:[~2017-10-05  4:10 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-01 14:30 [PATCH v4 00/15] generate full callchain cursor entries for inlined frames Milian Wolff
2017-10-01 14:30 ` [PATCH v4 01/15] perf report: remove code to handle inline frames from browsers Milian Wolff
2017-10-01 14:30 ` [PATCH v4 02/15] perf util: store srcline in callchain_cursor_node Milian Wolff
2017-10-01 14:30 ` [PATCH v4 03/15] perf util: refactor inline_list to operate on symbols Milian Wolff
2017-10-05  1:56   ` Namhyung Kim
2017-10-08 19:53     ` Milian Wolff
2017-10-01 14:30 ` [PATCH v4 04/15] perf util: refactor inline_list to store srcline string directly Milian Wolff
2017-10-01 14:30 ` [PATCH v4 05/15] perf report: create real callchain entries for inlined frames Milian Wolff
2017-10-05  3:35   ` Namhyung Kim
2017-10-08 20:26     ` Milian Wolff
2017-10-01 14:30 ` [PATCH v4 06/15] perf report: fall-back to function name comparison for -g srcline Milian Wolff
2017-10-01 14:30 ` [PATCH v4 07/15] perf report: mark inlined frames in output by " (inlined)" suffix Milian Wolff
2017-10-01 14:30 ` [PATCH v4 08/15] perf script: mark inlined frames and do not print DSO for them Milian Wolff
2017-10-01 14:30 ` [PATCH v4 09/15] perf report: compare symbol name for inlined frames when matching Milian Wolff
2017-10-01 14:30 ` [PATCH v4 10/15] perf report: compare symbol name for inlined frames when sorting Milian Wolff
2017-10-01 14:30 ` [PATCH v4 11/15] perf report: properly handle branch count in match_chain Milian Wolff
2017-10-01 14:30 ` [PATCH v4 12/15] perf report: cache failed lookups of inlined frames Milian Wolff
2017-10-05  3:43   ` Namhyung Kim
2017-10-08 20:28     ` Milian Wolff
2017-10-01 14:30 ` [PATCH v4 13/15] perf report: cache srclines for callchain nodes Milian Wolff
2017-10-01 14:30 ` [PATCH v4 14/15] perf report: use srcline from callchain for hist entries Milian Wolff
2017-10-05  4:08   ` Namhyung Kim [this message]
2017-10-09 20:21     ` Milian Wolff
2017-10-01 14:31 ` [PATCH v4 15/15] perf util: enable handling of inlined frames by default Milian Wolff

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171005040829.GA1584@danjae.aot.lge.com \
    --to=namhyung@kernel.org \
    --cc=Linux-kernel@vger.kernel.org \
    --cc=a.p.zijlstra@chello.nl \
    --cc=acme@kernel.org \
    --cc=acme@redhat.com \
    --cc=dsahern@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kernel-team@lge.com \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=milian.wolff@kdab.com \
    --cc=yao.jin@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).