From: Namhyung Kim <namhyung@kernel.org>
To: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Paul Mackerras <paulus@samba.org>, Ingo Molnar <mingo@kernel.org>,
	Namhyung Kim <namhyung.kim@lge.com>,
	LKML <linux-kernel@vger.kernel.org>, Arun Sharma <asharma@fb.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Jiri Olsa <jolsa@redhat.com>,
	Rodrigo Campos <rodrigo@sdfg.com.ar>
Subject: [PATCHSET 00/28] perf tools: Add support to accumulate hist periods (v5)
Date: Wed,  8 Jan 2014 17:46:05 +0900
Message-ID: <1389170793-21926-1-git-send-email-namhyung@kernel.org>

Hello,

This is my third attempt to implement cumulative hist period report.
This work begins from Arun's SORT_INCLUSIVE patch [1] but I completely
rewrote it from scratch.

Patches 01 to 03 are independent cleanups and can be applied separately.

Please see patch 04/28.  I refactored the functions that add hist
entries to use struct hist_entry_iter.  While I converted all functions
carefully, it would be good if someone could test and confirm that I
didn't mess something up - especially for branch stack and mem stuff.

This patchset basically adds the period of a sample to every node in
its callchain.  A hist_entry now has additional fields to keep the
cumulative period if the --children option is given to perf report.
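
To make the accumulation concrete, here is a minimal standalone sketch.
It is not code from the series: struct toy_entry, struct toy_hists and
the toy_*() helpers are hypothetical names, the fixed-size table stands
in for the real hist entry lookup, and skipping a symbol that repeats in
one chain is my own assumption about how recursion should be counted.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_ENTRIES 16

/* Hypothetical stand-in for a hist entry; not perf's struct hist_entry. */
struct toy_entry {
	const char *sym;
	uint64_t self;     /* period of samples taken in this symbol */
	uint64_t children; /* self plus period of samples taken in callees */
};

struct toy_hists {
	struct toy_entry entries[TOY_MAX_ENTRIES];
	size_t nr;
	uint64_t total_period;
};

/* Look up an entry by symbol name, creating it on first use. */
static struct toy_entry *toy_find_or_add(struct toy_hists *h, const char *sym)
{
	size_t i;

	for (i = 0; i < h->nr; i++)
		if (!strcmp(h->entries[i].sym, sym))
			return &h->entries[i];

	if (h->nr == TOY_MAX_ENTRIES)
		return NULL;

	h->entries[h->nr] = (struct toy_entry){ .sym = sym };
	return &h->entries[h->nr++];
}

/* Return 1 if chain[idx] already appeared earlier in the same chain. */
static int toy_seen_before(const char *const *chain, size_t idx)
{
	size_t i;

	for (i = 0; i < idx; i++)
		if (!strcmp(chain[i], chain[idx]))
			return 1;
	return 0;
}

/*
 * Add one sample.  chain[0] is the sampled symbol and the remaining
 * elements are its callers.  Only chain[0] gets the self period; every
 * distinct symbol in the chain gets the children (cumulative) period.
 */
static int toy_add_sample(struct toy_hists *h, const char *const *chain,
			  size_t depth, uint64_t period)
{
	size_t i;

	for (i = 0; i < depth; i++) {
		struct toy_entry *e;

		/* Assumption: count a recursive symbol once per sample. */
		if (toy_seen_before(chain, i))
			continue;

		e = toy_find_or_add(h, chain[i]);
		if (!e)
			return -1;

		if (i == 0)
			e->self += period;
		e->children += period;
	}

	h->total_period += period;
	return 0;
}
```

Feeding it a chain such as a, b, c, main gives every symbol on the chain
the same children period, while only a receives a self period.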

I made this a separate --children option and added a new "Children"
column (renaming the default "Overhead" column to "Self").  For now the
output is sorted by children (cumulative) overhead.  I switched to
--children because I still think it is quite different from the other
--call-graph options.  The --call-graph option still controls the
callchain output even when --children is given.

I know the UI should also be changed to be more flexible, as Ingo
requested, but I'd like to do this first and then move on to that.
I also added a new config option to enable it by default.
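
As an illustration only, the fragment below assumes the key names from
the "report.children" and "top.children" patch titles and the usual
~/.perfconfig syntax.  Since the last patch turns --children on by
default, setting the keys to false would presumably switch the
cumulative output off again:

```ini
[report]
	children = false

[top]
	children = false
```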

 * changes in v5:
  - support both of --children and --call-graph (Arun)
  - refactor hist_entry_iter to share with perf top (Jiri)
  - various cleanups and fixes (Jiri)
  - add ack's from Jiri

 * changes in v4:
  - change to --children option (Ingo)
  - rebased on new annotation change (Arnaldo)
  - support perf top also
  - enable --children option by default (Ingo)

 * changes in v3:
  - change to --cumulate option
  - fix a couple of bugs (Jiri, Rodrigo)
  - rename some help functions (Arnaldo)
  - cache previous hist entries rather than just symbol and dso
  - add some preparatory cleanups
  - add report.cumulate config option


Let me show you an example:

  $ cat abc.c
  #define barrier() asm volatile("" ::: "memory")

  void a(void)
  {
  	int i;
  	for (i = 0; i < 1000000; i++)
  		barrier();
  }
  void b(void)
  {
  	a();
  }
  void c(void)
  {
  	b();
  }
  int main(void)
  {
  	c();
  	return 0;
  }

With this simple program I ran perf record and report:

  $ perf record -g -e cycles:u ./abc


Case 1.

  $ perf report --stdio --no-call-graph --no-children

  # Overhead  Command      Shared Object          Symbol
  # ........  .......  .................  ..............
  #
      91.50%      abc  abc                [.] a         
       8.18%      abc  ld-2.17.so         [.] strlen    
       0.31%      abc  [kernel.kallsyms]  [k] page_fault
       0.01%      abc  ld-2.17.so         [.] _start    


Case 2. (current default behavior)

  $ perf report --stdio --call-graph --no-children

  # Overhead  Command      Shared Object          Symbol
  # ........  .......  .................  ..............
  #
      91.50%      abc  abc                [.] a         
                  |
                  --- a
                      b
                      c
                      main
                      __libc_start_main

       8.18%      abc  ld-2.17.so         [.] strlen    
                  |
                  --- strlen
                      _dl_sysdep_start

       0.31%      abc  [kernel.kallsyms]  [k] page_fault
                  |
                  --- page_fault
                      _start

       0.01%      abc  ld-2.17.so         [.] _start    
                  |
                  --- _start


Case 3.

  $ perf report --no-call-graph --children --stdio

  #     Self  Children  Command      Shared Object                 Symbol
  # ........  ........  .......  .................  .....................
  #
       0.00%    91.50%      abc  libc-2.17.so       [.] __libc_start_main
       0.00%    91.50%      abc  abc                [.] main             
       0.00%    91.50%      abc  abc                [.] c                
       0.00%    91.50%      abc  abc                [.] b                
      91.50%    91.50%      abc  abc                [.] a                
       0.00%     8.18%      abc  ld-2.17.so         [.] _dl_sysdep_start 
       8.18%     8.18%      abc  ld-2.17.so         [.] strlen           
       0.01%     0.33%      abc  ld-2.17.so         [.] _start           
       0.31%     0.31%      abc  [kernel.kallsyms]  [k] page_fault       

As you can see, the __libc_start_main -> main -> c -> b -> a callchain
shows up in the output.

Finally, it looks like below with both options enabled:

Case 4. (default behavior?)

  $ perf report --call-graph --children --stdio

  #     Self  Children  Command      Shared Object                 Symbol
  # ........  ........  .......  .................  .....................
  #
       0.00%    91.50%      abc  libc-2.17.so       [.] __libc_start_main
                  |
                  --- __libc_start_main

       0.00%    91.50%      abc  abc                [.] main             
                  |
                  --- main
                      __libc_start_main

       0.00%    91.50%      abc  abc                [.] c                
                  |
                  --- c
                      main
                      __libc_start_main

       0.00%    91.50%      abc  abc                [.] b                
                  |
                  --- b
                      c
                      main
                      __libc_start_main

      91.50%    91.50%      abc  abc                [.] a                
                  |
                  --- a
                      b
                      c
                      main
                      __libc_start_main
  ...


Currently perf enables both --call-graph and --children when it
finds callchains in the samples.  While this is useful for the TUI or
GTK, I'm not sure about stdio as it would consume so many lines.

I know it has some rough edges or even bugs, but I really want to
release it and get reviews.  It does not handle event groups and
annotations yet.

You can also get this series on 'perf/cumulate-v5' branch in my tree at:

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git


Any comments are welcome, thanks.
Namhyung


Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>

[1] https://lkml.org/lkml/2012/3/31/6


Namhyung Kim (28):
  perf tools: Insert filtered entries to hists also
  perf tools: Do not update total period of a hists when filtering
  perf tools: Remove symbol_conf.use_callchain check
  perf tools: Introduce struct hist_entry_iter
  perf hists: Convert hist entry functions to use struct he_stat
  perf hists: Add support for accumulated stat of hist entry
  perf hists: Check if accumulated when adding a hist entry
  perf hists: Accumulate hist entry stat based on the callchain
  perf tools: Update cpumode for each cumulative entry
  perf report: Cache cumulative callchains
  perf callchain: Add callchain_cursor_snapshot()
  perf tools: Save callchain info for each cumulative entry
  perf hists: Sort hist entries by accumulated period
  perf ui/hist: Add support to accumulated hist stat
  perf ui/browser: Add support to accumulated hist stat
  perf ui/gtk: Add support to accumulated hist stat
  perf tools: Apply percent-limit to cumulative percentage
  perf tools: Add more hpp helper functions
  perf report: Add --children option
  perf report: Add report.children config option
  perf tools: Factor out sample__resolve_callchain()
  perf tools: Factor out fill_callchain_info()
  perf tools: Factor out hist_entry_iter code
  perf tools: Add callback function to hist_entry_iter
  perf top: Convert to hist_entry_iter
  perf top: Add --children option
  perf top: Add top.children config option
  perf tools: Enable --children option by default

 tools/perf/Documentation/perf-report.txt |   5 +
 tools/perf/Documentation/perf-top.txt    |   6 +
 tools/perf/builtin-annotate.c            |   3 +-
 tools/perf/builtin-diff.c                |   2 +-
 tools/perf/builtin-report.c              | 202 +++---------
 tools/perf/builtin-top.c                 | 104 +++---
 tools/perf/tests/hists_link.c            |   4 +-
 tools/perf/ui/browsers/hists.c           |  50 +--
 tools/perf/ui/gtk/hists.c                |  20 +-
 tools/perf/ui/hist.c                     |  62 ++++
 tools/perf/ui/stdio/hist.c               |   4 +-
 tools/perf/util/callchain.c              |  66 ++++
 tools/perf/util/callchain.h              |  17 +
 tools/perf/util/event.c                  |  18 +-
 tools/perf/util/hist.c                   | 542 +++++++++++++++++++++++++++++--
 tools/perf/util/hist.h                   |  49 ++-
 tools/perf/util/machine.c                |   2 -
 tools/perf/util/sort.h                   |  18 +-
 tools/perf/util/symbol.c                 |  11 +-
 tools/perf/util/symbol.h                 |   3 +-
 20 files changed, 899 insertions(+), 289 deletions(-)

-- 
1.7.11.7

