linux-kernel.vger.kernel.org archive mirror
* [GIT PULL] perf tools updates
@ 2010-04-24  2:05 Frederic Weisbecker
  2010-04-24  2:27 ` Frederic Weisbecker
  2010-04-27  9:15 ` Ingo Molnar
  0 siblings, 2 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2010-04-24  2:05 UTC
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, William Cohen, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Paul Mackerras, Hitoshi Mitake,
	Masami Hiramatsu, Tom Zanussi, Arjan van de Ven, Pekka Enberg,
	Li Zefan, Stephane Eranian, Jens Axboe, Jason Baron,
	Xiao Guangrong

Ingo,

Please pull the perf/core branch that can be found at:

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
	perf/core

Thanks,
	Frederic
---

Frederic Weisbecker (6):
      perf: Generalize perf lock's sample event reordering to the session layer
      perf: Use generic sample reordering in perf sched
      perf: Use generic sample reordering in perf kmem
      perf: Use generic sample reordering in perf trace
      perf: Use generic sample reordering in perf timechart
      perf: Add a perf trace option to check samples ordering reliability

Hitoshi Mitake (1):
      perf lock: Fix state machine to recognize lock sequence

Stephane Eranian (1):
      perf: Fix initialization bug in parse_single_tracepoint_event()

William Cohen (1):
      perf: Some perf-kvm documentation edits


 tools/perf/Documentation/perf-kvm.txt |    9 +-
 tools/perf/builtin-kmem.c             |    6 +-
 tools/perf/builtin-lock.c             |  595 ++++++++++++++++++++-------------
 tools/perf/builtin-sched.c            |    8 +-
 tools/perf/builtin-timechart.c        |  112 +------
 tools/perf/builtin-trace.c            |   13 +
 tools/perf/util/parse-events.c        |   13 +-
 tools/perf/util/session.c             |  179 ++++++++++-
 tools/perf/util/session.h             |   10 +
 9 files changed, 583 insertions(+), 362 deletions(-)


* Re: [GIT PULL] perf tools updates
  2010-04-24  2:05 Frederic Weisbecker
@ 2010-04-24  2:27 ` Frederic Weisbecker
  2010-04-27  9:15 ` Ingo Molnar
  1 sibling, 0 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2010-04-24  2:27 UTC
  To: Ingo Molnar
  Cc: LKML, William Cohen, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Paul Mackerras, Hitoshi Mitake, Masami Hiramatsu, Tom Zanussi,
	Arjan van de Ven, Pekka Enberg, Li Zefan, Stephane Eranian,
	Jens Axboe, Jason Baron, Xiao Guangrong

On Sat, Apr 24, 2010 at 04:05:33AM +0200, Frederic Weisbecker wrote:
> Ingo,
> 
> Please pull the perf/core branch that can be found at:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
> 	perf/core
> 
> Thanks,
> 	Frederic



I forgot to highlight some things here.

- The -M option is not used anymore. Well, actually I just checked and it's
  still used by the record perl/python scripts. But it's not needed there
  anymore, so I'll drop it in another pass. Globally, though, the need for
  buffer multiplexing is over.

- I haven't plugged the reordering into the live mode, because I'm not sure
  it would be welcome there. With live mode we want the events as they
  arrive; using the reordering there would deliver them in batches of
  2-second slices. I guess we'll figure out a solution for that.

- Perf lock is getting into better shape. There is still some work needed to
  make it truly usable though. I need to unearth the event injection thing
  to lower the size of the events, profile by lock classes, etc...

- Various important fixes
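
For readers unfamiliar with the reordering discussed above, here is a minimal
sketch of the general idea: samples coming from several per-CPU buffers are
queued in timestamp order, and only those older than the newest timestamp minus
a flush window are delivered. This is not the code in tools/perf/util/session.c;
the struct names, function names, and the fixed-size array are all hypothetical,
and the 2-second window is simply taken from the slices mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_MAX 64
#define FLUSH_WINDOW_NS 2000000000ULL /* 2 seconds (assumed window) */

/* Hypothetical sample record: only the timestamp matters for ordering. */
struct sample {
	uint64_t time_ns;
	int cpu;
};

/* Hypothetical queue kept sorted by ascending timestamp. */
struct ordered_queue {
	struct sample buf[QUEUE_MAX];
	size_t nr;
	uint64_t newest_ns;
};

/* Insert a sample at its sorted position; returns -1 if the queue is full. */
static int queue_sample(struct ordered_queue *q, struct sample s)
{
	size_t i;

	if (q->nr == QUEUE_MAX)
		return -1;

	/* Shift later samples up by one slot to make room. */
	i = q->nr;
	while (i > 0 && q->buf[i - 1].time_ns > s.time_ns) {
		q->buf[i] = q->buf[i - 1];
		i--;
	}
	q->buf[i] = s;
	q->nr++;
	if (s.time_ns > q->newest_ns)
		q->newest_ns = s.time_ns;
	return 0;
}

/*
 * Copy to out, in timestamp order, every queued sample that is at least
 * FLUSH_WINDOW_NS older than the newest one seen so far, and drop those
 * samples from the queue. Returns the number of samples delivered.
 */
static size_t flush_old_samples(struct ordered_queue *q, struct sample *out,
				size_t out_max)
{
	uint64_t limit;
	size_t n = 0, i;

	assert(out != NULL);
	if (q->newest_ns < FLUSH_WINDOW_NS)
		return 0;
	limit = q->newest_ns - FLUSH_WINDOW_NS;

	while (n < q->nr && n < out_max && q->buf[n].time_ns <= limit) {
		out[n] = q->buf[n];
		n++;
	}
	/* Move the remaining (too recent) samples to the front. */
	for (i = n; i < q->nr; i++)
		q->buf[i - n] = q->buf[i];
	q->nr -= n;
	return n;
}
```

The sketch also shows why live mode is awkward: a sample cannot be delivered
until something at least one window newer has been seen, so output necessarily
lags behind arrival.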

Thanks.



* Re: [GIT PULL] perf tools updates
  2010-04-24  2:05 Frederic Weisbecker
  2010-04-24  2:27 ` Frederic Weisbecker
@ 2010-04-27  9:15 ` Ingo Molnar
  1 sibling, 0 replies; 14+ messages in thread
From: Ingo Molnar @ 2010-04-27  9:15 UTC
  To: Frederic Weisbecker
  Cc: LKML, William Cohen, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Paul Mackerras, Hitoshi Mitake, Masami Hiramatsu, Tom Zanussi,
	Arjan van de Ven, Pekka Enberg, Li Zefan, Stephane Eranian,
	Jens Axboe, Jason Baron, Xiao Guangrong


* Frederic Weisbecker <fweisbec@gmail.com> wrote:

> Ingo,
> 
> Please pull the perf/core branch that can be found at:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
> 	perf/core
> 
> Thanks,
> 	Frederic
> ---
> 
> Frederic Weisbecker (6):
>       perf: Generalize perf lock's sample event reordering to the session layer
>       perf: Use generic sample reordering in perf sched
>       perf: Use generic sample reordering in perf kmem
>       perf: Use generic sample reordering in perf trace
>       perf: Use generic sample reordering in perf timechart
>       perf: Add a perf trace option to check samples ordering reliability
> 
> Hitoshi Mitake (1):
>       perf lock: Fix state machine to recognize lock sequence
> 
> Stephane Eranian (1):
>       perf: Fix initialization bug in parse_single_tracepoint_event()
> 
> William Cohen (1):
>       perf: Some perf-kvm documentation edits
> 
> 
>  tools/perf/Documentation/perf-kvm.txt |    9 +-
>  tools/perf/builtin-kmem.c             |    6 +-
>  tools/perf/builtin-lock.c             |  595 ++++++++++++++++++++-------------
>  tools/perf/builtin-sched.c            |    8 +-
>  tools/perf/builtin-timechart.c        |  112 +------
>  tools/perf/builtin-trace.c            |   13 +
>  tools/perf/util/parse-events.c        |   13 +-
>  tools/perf/util/session.c             |  179 ++++++++++-
>  tools/perf/util/session.h             |   10 +
>  9 files changed, 583 insertions(+), 362 deletions(-)

Pulled, thanks a lot Frederic!

	Ingo


* [GIT PULL] perf tools updates
@ 2011-05-22  1:45 Frederic Weisbecker
  2011-05-22  8:49 ` Ingo Molnar
  0 siblings, 1 reply; 14+ messages in thread
From: Frederic Weisbecker @ 2011-05-22  1:45 UTC
  To: Ingo Molnar, Arnaldo Carvalho de Melo
  Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Stephane Eranian, Linus Torvalds

Ingo, Arnaldo,

Please pull the perf/core branch that can be found at:

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
	perf/core

Thanks,
	Frederic
---

Frederic Weisbecker (6):
      perf tools: Check we are able to read the event size on mmap
      perf tools: Remove junk code in mmap size handling
      perf tools: Move evlist sample helpers to evlist area
      perf tools: Pre-check sample size before parsing
      perf tools: Robustify dynamic sample content fetch
      perf tools: Propagate event parse error handling


 tools/perf/builtin-test.c |    9 ++++++++-
 tools/perf/builtin-top.c  |    7 ++++++-
 tools/perf/util/event.c   |   16 ++++++++++++++++
 tools/perf/util/event.h   |   12 +++++++++++-
 tools/perf/util/evlist.c  |   31 +++++++++++++++++++++++++++++++
 tools/perf/util/evlist.h  |    3 +++
 tools/perf/util/evsel.c   |   32 +++++++++++++++++++++++++++++++-
 tools/perf/util/header.c  |   31 -------------------------------
 tools/perf/util/header.h  |    2 --
 tools/perf/util/python.c  |   13 ++++++++++---
 tools/perf/util/session.c |   25 ++++++++++++++++++-------
 tools/perf/util/session.h |    2 ++
 12 files changed, 136 insertions(+), 47 deletions(-)


* Re: [GIT PULL] perf tools updates
  2011-05-22  1:45 Frederic Weisbecker
@ 2011-05-22  8:49 ` Ingo Molnar
  2011-05-22 12:07   ` Frederic Weisbecker
  0 siblings, 1 reply; 14+ messages in thread
From: Ingo Molnar @ 2011-05-22  8:49 UTC
  To: Frederic Weisbecker
  Cc: Arnaldo Carvalho de Melo, LKML, Peter Zijlstra, Stephane Eranian,
	Linus Torvalds


* Frederic Weisbecker <fweisbec@gmail.com> wrote:

> Ingo, Arnaldo,
> 
> Please pull the perf/core branch that can be found at:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
> 	perf/core
> 
> Thanks,
> 	Frederic
> ---
> 
> Frederic Weisbecker (6):
>       perf tools: Check we are able to read the event size on mmap
>       perf tools: Remove junk code in mmap size handling
>       perf tools: Move evlist sample helpers to evlist area
>       perf tools: Pre-check sample size before parsing
>       perf tools: Robustify dynamic sample content fetch
>       perf tools: Propagate event parse error handling
> 
> 
>  tools/perf/builtin-test.c |    9 ++++++++-
>  tools/perf/builtin-top.c  |    7 ++++++-
>  tools/perf/util/event.c   |   16 ++++++++++++++++
>  tools/perf/util/event.h   |   12 +++++++++++-
>  tools/perf/util/evlist.c  |   31 +++++++++++++++++++++++++++++++
>  tools/perf/util/evlist.h  |    3 +++
>  tools/perf/util/evsel.c   |   32 +++++++++++++++++++++++++++++++-
>  tools/perf/util/header.c  |   31 -------------------------------
>  tools/perf/util/header.h  |    2 --
>  tools/perf/util/python.c  |   13 ++++++++++---
>  tools/perf/util/session.c |   25 ++++++++++++++++++-------
>  tools/perf/util/session.h |    2 ++
>  12 files changed, 136 insertions(+), 47 deletions(-)

To get this upstream ASAP i pulled them into perf/urgent and resolved two 
conflicts there (a contextual and a semantic one) - please double check my 
resolutions.

Thanks,

	Ingo


* Re: [GIT PULL] perf tools updates
  2011-05-22  8:49 ` Ingo Molnar
@ 2011-05-22 12:07   ` Frederic Weisbecker
  0 siblings, 0 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2011-05-22 12:07 UTC
  To: Ingo Molnar
  Cc: Arnaldo Carvalho de Melo, LKML, Peter Zijlstra, Stephane Eranian,
	Linus Torvalds

On Sun, May 22, 2011 at 10:49:34AM +0200, Ingo Molnar wrote:
> 
> * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 
> > Ingo, Arnaldo,
> > 
> > Please pull the perf/core branch that can be found at:
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
> > 	perf/core
> > 
> > Thanks,
> > 	Frederic
> > ---
> > 
> > Frederic Weisbecker (6):
> >       perf tools: Check we are able to read the event size on mmap
> >       perf tools: Remove junk code in mmap size handling
> >       perf tools: Move evlist sample helpers to evlist area
> >       perf tools: Pre-check sample size before parsing
> >       perf tools: Robustify dynamic sample content fetch
> >       perf tools: Propagate event parse error handling
> > 
> > 
> >  tools/perf/builtin-test.c |    9 ++++++++-
> >  tools/perf/builtin-top.c  |    7 ++++++-
> >  tools/perf/util/event.c   |   16 ++++++++++++++++
> >  tools/perf/util/event.h   |   12 +++++++++++-
> >  tools/perf/util/evlist.c  |   31 +++++++++++++++++++++++++++++++
> >  tools/perf/util/evlist.h  |    3 +++
> >  tools/perf/util/evsel.c   |   32 +++++++++++++++++++++++++++++++-
> >  tools/perf/util/header.c  |   31 -------------------------------
> >  tools/perf/util/header.h  |    2 --
> >  tools/perf/util/python.c  |   13 ++++++++++---
> >  tools/perf/util/session.c |   25 ++++++++++++++++++-------
> >  tools/perf/util/session.h |    2 ++
> >  12 files changed, 136 insertions(+), 47 deletions(-)
> 
> To get this upstream ASAP i pulled them into perf/urgent and resolved two 
> conflicts there (a contextual and a semantic one) - please double check my 
> resolutions.

Yeah looks good.

Thanks!


* [GIT PULL] perf tools updates
@ 2011-06-29 23:34 Frederic Weisbecker
  2011-06-29 23:34 ` [PATCH 1/6] perf tools: Add inverted call graph report support Frederic Weisbecker
                   ` (6 more replies)
  0 siblings, 7 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2011-06-29 23:34 UTC
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Stephane Eranian, David Ahern, Sam Liao

Ingo,

Please pull the perf/core branch that can be found at:

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
	perf/core

It adds inverted callchain support and lets one use
parent filtering with parent sorting at the same time, because
it appears to me that inverted callchains sorted by filtered
parents are pretty useful, and extendable to more cool things.

Anyway, inverted callchains combined with different sort keys
can in general provide some interesting analysis flavours.

Having played with it a bit, it seems to me the callee point
of view (traditional -g callchains) is better suited to
finding the precise zoomed-in places where most cpu time is
spent: spotting contention places, etc...

OTOH, the caller point of view (-G, inverted callchain) is
for zoomed-out observation, of course. It's more suited to
global profiling, for example to get a big overview of where
the hot bulk of a program is executing.
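
A minimal sketch of the difference between the two points of view, assuming
the recorded chain stores the innermost frame first. The enum and helper names
are illustrative only and are not the identifiers used in perf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative names for the two views discussed above. */
enum walk_order {
	WALK_CALLEE,	/* innermost frame first (traditional -g) */
	WALK_CALLER	/* outermost frame first (-G, inverted) */
};

/*
 * Return the i-th address to process from a recorded chain of nr entries.
 * Assumption: ips[0] is the innermost frame, ips[nr - 1] the outermost.
 */
static uint64_t chain_ip_at(const uint64_t *ips, size_t nr, size_t i,
			    enum walk_order order)
{
	assert(i < nr);
	if (order == WALK_CALLEE)
		return ips[i];
	/* Inverted view: walk the same array from the other end. */
	return ips[nr - i - 1];
}
```

With the chain { leaf, middle, root }, a callee-order walk yields leaf first
and therefore groups samples under the hot leaf functions, while a caller-order
walk yields root first and groups them under the entry points.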

Examples:

- look at the hottest call tree of a program.

	./perf report -G -s pid --stdio
	
     5.73%               perf:11933
            |
            --- __libc_start_main
               |          
               |--99.18%-- main
               |          run_builtin
               |          cmd_bench
               |          |          
               |          |--89.68%-- bench_sched_messaging
               |          |          |          
               |          |          |--96.11%-- create_worker
               |          |          |          |          
               |          |          |          |--95.10%-- __libc_fork
               |          |          |          |          |          
               |          |          |          |          |--93.99%-- stub_clone
               |          |          |          |          |          sys_clone
               |          |          |          |          |          do_fork
               |          |          |          |          |          |          
               |          |          |          |          |          |--99.09%-- copy_process
               |          |          |          |          |          |          |          
               |          |          |          |          |          |          |--91.62%-- dup_mm

- look at where kernel threads spend their time

	perf report -G -p kernel_thread -s parent --stdio
	
# Overhead  Parent symbol
# ........  .............
#
     0.07%  kernel_thread_helper
            |
            --- kernel_thread_helper
                kthread
               |          
               |--50.00%-- kjournald2
               |          jbd2_journal_commit_transaction
               |          journal_submit_commit_record
               |          submit_bh
               |          submit_bio
               |          generic_make_request
               |          __make_request
               |          __blk_run_queue
               |          scsi_request_fn
               |          scsi_dispatch_cmd
               |          ata_scsi_queuecmd
               |          ata_scsi_translate
               |          ata_qc_issue
               |          ata_bmdma_qc_issue
               |          ata_sff_qc_issue
               |          ata_sff_tf_load
               |          ata_sff_check_status
               |          ioread8
               |          
                --50.00%-- rcu_kthread
                          rcu_process_callbacks
                          delayed_put_task_struct
                          __put_task_struct
                          free_task
                          free_thread_info
                          free_thread_xstate
                          kmem_cache_free
                          __slab_free
                          add_partial
                          _raw_spin_lock
                          lock_acquire
                          
etc...

We could extend that by cutting the callchains at some point.
For example, stop a callchain on a given dso and you can profile
which exported function is most called in it.

Anyway, this has some nice potential.


Thanks,
	Frederic
---

Frederic Weisbecker (5):
      perf tools: Make sort operations static
      perf tools: Remove sort print helpers declarations
      perf tools: Don't display ignored entries on stdio ui
      perf tools: Allow sort dimensions to be registered more than once
      perf tools: Only display parent field if explictly sorted

Sam Liao (1):
      perf tools: Add inverted call graph report support.


 tools/perf/Documentation/perf-report.txt |   15 ++-
 tools/perf/builtin-report.c              |   42 +++++-
 tools/perf/util/callchain.h              |    6 +
 tools/perf/util/hist.c                   |    6 +-
 tools/perf/util/session.c                |    7 +-
 tools/perf/util/sort.c                   |  223 ++++++++++++++----------------
 tools/perf/util/sort.h                   |   14 --
 7 files changed, 169 insertions(+), 144 deletions(-)


* [PATCH 1/6] perf tools: Add inverted call graph report support.
  2011-06-29 23:34 [GIT PULL] perf tools updates Frederic Weisbecker
@ 2011-06-29 23:34 ` Frederic Weisbecker
  2011-06-29 23:34 ` [PATCH 2/6] perf tools: Make sort operations static Frederic Weisbecker
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2011-06-29 23:34 UTC
  To: Ingo Molnar
  Cc: LKML, Sam Liao, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Stephane Eranian, David Ahern, Frederic Weisbecker

From: Sam Liao <phyomh@gmail.com>

Add a "caller/callee" option to support an inverted butterfly report.
In the inverted report (with the caller option), the call graph starts
from the callee's ancestor. Users can use such a view to catch a system's
performance bottlenecks from a sysprof-like view. Using this option
with a specified sort order like pid gives us a high-level view of call
graph statistics.

Also add "-G" alias for inverted call graph.
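
As a rough illustration of the extended option syntax, the following sketch
parses a "type,min,order" string with the defaults fractal,0.5,callee. It is
a simplified stand-in, not the parse_callchain_opt() function from the patch:
the struct, enum, and function names are hypothetical, the optional print
limit handled by the real code is omitted, and the type field is not
validated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names; perf itself uses different identifiers. */
enum graph_order { GRAPH_CALLEE, GRAPH_CALLER };

struct graph_opts {
	char type[16];
	double min_percent;
	enum graph_order order;
};

/*
 * Copy the next comma-separated field of *s into buf (truncating if needed)
 * and advance *s past it. Returns 0 when no field is left.
 */
static int next_field(const char **s, char *buf, size_t size)
{
	const char *end;
	size_t len;

	if (*s == NULL || **s == '\0')
		return 0;
	end = strchr(*s, ',');
	len = end ? (size_t)(end - *s) : strlen(*s);
	if (len >= size)
		len = size - 1;
	memcpy(buf, *s, len);
	buf[len] = '\0';
	*s = end ? end + 1 : NULL;
	return 1;
}

/* Parse "type[,min[,order]]"; omitted fields keep their defaults. */
static int parse_graph_opts(const char *arg, struct graph_opts *o)
{
	char field[16];
	char *endptr;

	assert(arg != NULL && o != NULL);
	strcpy(o->type, "fractal");
	o->min_percent = 0.5;
	o->order = GRAPH_CALLEE;

	if (!next_field(&arg, field, sizeof(field)))
		return 0;
	strcpy(o->type, field);

	if (!next_field(&arg, field, sizeof(field)))
		return 0;
	o->min_percent = strtod(field, &endptr);
	if (endptr == field)
		return -1;	/* not a number */

	if (!next_field(&arg, field, sizeof(field)))
		return 0;
	if (!strcmp(field, "caller"))
		o->order = GRAPH_CALLER;
	else if (!strcmp(field, "callee"))
		o->order = GRAPH_CALLEE;
	else
		return -1;	/* unknown order keyword */
	return 0;
}
```

Under these assumptions, "graph,1.5,caller" selects the inverted view with a
1.5% threshold, and a bare "flat" keeps the 0.5 threshold and callee order.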

Signed-off-by: Sam Liao <phyomh@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 tools/perf/Documentation/perf-report.txt |   15 ++++++++++--
 tools/perf/builtin-report.c              |   33 ++++++++++++++++++++++++-----
 tools/perf/util/callchain.h              |    6 +++++
 tools/perf/util/hist.c                   |    3 +-
 tools/perf/util/session.c                |    7 +++++-
 5 files changed, 53 insertions(+), 11 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 8ba03d6..cfa8e51 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -80,15 +80,24 @@ OPTIONS
 --dump-raw-trace::
         Dump raw trace in ASCII.
 
--g [type,min]::
+-g [type,min,order]::
 --call-graph::
-        Display call chains using type and min percent threshold.
+        Display call chains using type, min percent threshold and order.
 	type can be either:
 	- flat: single column, linear exposure of call chains.
 	- graph: use a graph tree, displaying absolute overhead rates.
 	- fractal: like graph, but displays relative rates. Each branch of
 		 the tree is considered as a new profiled object. +
-	Default: fractal,0.5.
+
+	order can be either:
+	- callee: callee based call graph.
+	- caller: inverted caller based call graph.
+
+	Default: fractal,0.5,callee.
+
+-G::
+--inverted::
+        alias for inverted caller based call graph.
 
 --pretty=<key>::
         Pretty printing style.  key: normal, raw
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 287a173..271e252 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -45,7 +45,8 @@ static struct perf_read_values	show_threads_values;
 static const char	default_pretty_printing_style[] = "normal";
 static const char	*pretty_printing_style = default_pretty_printing_style;
 
-static char		callchain_default_opt[] = "fractal,0.5";
+static char		callchain_default_opt[] = "fractal,0.5,callee";
+static bool		inverted_callchain;
 static symbol_filter_t	annotate_init;
 
 static int perf_session__add_hist_entry(struct perf_session *session,
@@ -386,13 +387,29 @@ parse_callchain_opt(const struct option *opt __used, const char *arg,
 	if (!tok)
 		goto setup;
 
-	tok2 = strtok(NULL, ",");
 	callchain_param.min_percent = strtod(tok, &endptr);
 	if (tok == endptr)
 		return -1;
 
-	if (tok2)
+	/* get the print limit */
+	tok2 = strtok(NULL, ",");
+	if (!tok2)
+		goto setup;
+
+	if (tok2[0] != 'c') {
 		callchain_param.print_limit = strtod(tok2, &endptr);
+		tok2 = strtok(NULL, ",");
+		if (!tok2)
+			goto setup;
+	}
+
+	/* get the call chain order */
+	if (!strcmp(tok2, "caller"))
+		callchain_param.order = ORDER_CALLER;
+	else if (!strcmp(tok2, "callee"))
+		callchain_param.order = ORDER_CALLEE;
+	else
+		return -1;
 setup:
 	if (callchain_register_param(&callchain_param) < 0) {
 		fprintf(stderr, "Can't register callchain params\n");
@@ -436,9 +453,10 @@ static const struct option options[] = {
 		   "regex filter to identify parent, see: '--sort parent'"),
 	OPT_BOOLEAN('x', "exclude-other", &symbol_conf.exclude_other,
 		    "Only display entries with parent-match"),
-	OPT_CALLBACK_DEFAULT('g', "call-graph", NULL, "output_type,min_percent",
-		     "Display callchains using output_type (graph, flat, fractal, or none) and min percent threshold. "
-		     "Default: fractal,0.5", &parse_callchain_opt, callchain_default_opt),
+	OPT_CALLBACK_DEFAULT('g', "call-graph", NULL, "output_type,min_percent, call_order",
+		     "Display callchains using output_type (graph, flat, fractal, or none) , min percent threshold and callchain order. "
+		     "Default: fractal,0.5,callee", &parse_callchain_opt, callchain_default_opt),
+	OPT_BOOLEAN('G', "inverted", &inverted_callchain, "alias for inverted call graph"),
 	OPT_STRING('d', "dsos", &symbol_conf.dso_list_str, "dso[,dso...]",
 		   "only consider symbols in these dsos"),
 	OPT_STRING('C', "comms", &symbol_conf.comm_list_str, "comm[,comm...]",
@@ -467,6 +485,9 @@ int cmd_report(int argc, const char **argv, const char *prefix __used)
 	else if (use_tui)
 		use_browser = 1;
 
+	if (inverted_callchain)
+		callchain_param.order = ORDER_CALLER;
+
 	if (strcmp(input_name, "-") != 0)
 		setup_browser(true);
 	else
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 1a79df9..9b4ff16 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -14,6 +14,11 @@ enum chain_mode {
 	CHAIN_GRAPH_REL
 };
 
+enum chain_order {
+	ORDER_CALLER,
+	ORDER_CALLEE
+};
+
 struct callchain_node {
 	struct callchain_node	*parent;
 	struct list_head	siblings;
@@ -41,6 +46,7 @@ struct callchain_param {
 	u32			print_limit;
 	double			min_percent;
 	sort_chain_func_t	sort;
+	enum chain_order	order;
 };
 
 struct callchain_list {
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 627a02e..dae4202 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -14,7 +14,8 @@ enum hist_filter {
 
 struct callchain_param	callchain_param = {
 	.mode	= CHAIN_GRAPH_REL,
-	.min_percent = 0.5
+	.min_percent = 0.5,
+	.order  = ORDER_CALLEE
 };
 
 u16 hists__col_len(struct hists *self, enum hist_column col)
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index b723f21..558bcf9 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -247,9 +247,14 @@ int perf_session__resolve_callchain(struct perf_session *self,
 	callchain_cursor_reset(&self->callchain_cursor);
 
 	for (i = 0; i < chain->nr; i++) {
-		u64 ip = chain->ips[i];
+		u64 ip;
 		struct addr_location al;
 
+		if (callchain_param.order == ORDER_CALLEE)
+			ip = chain->ips[i];
+		else
+			ip = chain->ips[chain->nr - i - 1];
+
 		if (ip >= PERF_CONTEXT_MAX) {
 			switch (ip) {
 			case PERF_CONTEXT_HV:
-- 
1.7.5.4



* [PATCH 2/6] perf tools: Make sort operations static
  2011-06-29 23:34 [GIT PULL] perf tools updates Frederic Weisbecker
  2011-06-29 23:34 ` [PATCH 1/6] perf tools: Add inverted call graph report support Frederic Weisbecker
@ 2011-06-29 23:34 ` Frederic Weisbecker
  2011-06-29 23:34 ` [PATCH 3/6] perf tools: Remove sort print helpers declarations Frederic Weisbecker
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2011-06-29 23:34 UTC
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Stephane Eranian, David Ahern, Sam Liao

These don't need to be globally visible.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
---
 tools/perf/util/sort.c |  211 ++++++++++++++++++++++-------------------------
 tools/perf/util/sort.h |    8 --
 2 files changed, 99 insertions(+), 120 deletions(-)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index f44fa54..f5dba56 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -15,95 +15,6 @@ char * field_sep;
 
 LIST_HEAD(hist_entry__sort_list);
 
-static int hist_entry__thread_snprintf(struct hist_entry *self, char *bf,
-				       size_t size, unsigned int width);
-static int hist_entry__comm_snprintf(struct hist_entry *self, char *bf,
-				     size_t size, unsigned int width);
-static int hist_entry__dso_snprintf(struct hist_entry *self, char *bf,
-				    size_t size, unsigned int width);
-static int hist_entry__sym_snprintf(struct hist_entry *self, char *bf,
-				    size_t size, unsigned int width);
-static int hist_entry__parent_snprintf(struct hist_entry *self, char *bf,
-				       size_t size, unsigned int width);
-static int hist_entry__cpu_snprintf(struct hist_entry *self, char *bf,
-				    size_t size, unsigned int width);
-
-struct sort_entry sort_thread = {
-	.se_header	= "Command:  Pid",
-	.se_cmp		= sort__thread_cmp,
-	.se_snprintf	= hist_entry__thread_snprintf,
-	.se_width_idx	= HISTC_THREAD,
-};
-
-struct sort_entry sort_comm = {
-	.se_header	= "Command",
-	.se_cmp		= sort__comm_cmp,
-	.se_collapse	= sort__comm_collapse,
-	.se_snprintf	= hist_entry__comm_snprintf,
-	.se_width_idx	= HISTC_COMM,
-};
-
-struct sort_entry sort_dso = {
-	.se_header	= "Shared Object",
-	.se_cmp		= sort__dso_cmp,
-	.se_snprintf	= hist_entry__dso_snprintf,
-	.se_width_idx	= HISTC_DSO,
-};
-
-struct sort_entry sort_sym = {
-	.se_header	= "Symbol",
-	.se_cmp		= sort__sym_cmp,
-	.se_snprintf	= hist_entry__sym_snprintf,
-	.se_width_idx	= HISTC_SYMBOL,
-};
-
-struct sort_entry sort_parent = {
-	.se_header	= "Parent symbol",
-	.se_cmp		= sort__parent_cmp,
-	.se_snprintf	= hist_entry__parent_snprintf,
-	.se_width_idx	= HISTC_PARENT,
-};
- 
-struct sort_entry sort_cpu = {
-	.se_header      = "CPU",
-	.se_cmp	        = sort__cpu_cmp,
-	.se_snprintf    = hist_entry__cpu_snprintf,
-	.se_width_idx	= HISTC_CPU,
-};
-
-struct sort_dimension {
-	const char		*name;
-	struct sort_entry	*entry;
-	int			taken;
-};
-
-static struct sort_dimension sort_dimensions[] = {
-	{ .name = "pid",	.entry = &sort_thread,	},
-	{ .name = "comm",	.entry = &sort_comm,	},
-	{ .name = "dso",	.entry = &sort_dso,	},
-	{ .name = "symbol",	.entry = &sort_sym,	},
-	{ .name = "parent",	.entry = &sort_parent,	},
-	{ .name = "cpu",	.entry = &sort_cpu,	},
-};
-
-int64_t cmp_null(void *l, void *r)
-{
-	if (!l && !r)
-		return 0;
-	else if (!l)
-		return -1;
-	else
-		return 1;
-}
-
-/* --sort pid */
-
-int64_t
-sort__thread_cmp(struct hist_entry *left, struct hist_entry *right)
-{
-	return right->thread->pid - left->thread->pid;
-}
-
 static int repsep_snprintf(char *bf, size_t size, const char *fmt, ...)
 {
 	int n;
@@ -125,6 +36,24 @@ static int repsep_snprintf(char *bf, size_t size, const char *fmt, ...)
 	return n;
 }
 
+static int64_t cmp_null(void *l, void *r)
+{
+	if (!l && !r)
+		return 0;
+	else if (!l)
+		return -1;
+	else
+		return 1;
+}
+
+/* --sort pid */
+
+static int64_t
+sort__thread_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+	return right->thread->pid - left->thread->pid;
+}
+
 static int hist_entry__thread_snprintf(struct hist_entry *self, char *bf,
 				       size_t size, unsigned int width)
 {
@@ -132,15 +61,50 @@ static int hist_entry__thread_snprintf(struct hist_entry *self, char *bf,
 			      self->thread->comm ?: "", self->thread->pid);
 }
 
+struct sort_entry sort_thread = {
+	.se_header	= "Command:  Pid",
+	.se_cmp		= sort__thread_cmp,
+	.se_snprintf	= hist_entry__thread_snprintf,
+	.se_width_idx	= HISTC_THREAD,
+};
+
+/* --sort comm */
+
+static int64_t
+sort__comm_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+	return right->thread->pid - left->thread->pid;
+}
+
+static int64_t
+sort__comm_collapse(struct hist_entry *left, struct hist_entry *right)
+{
+	char *comm_l = left->thread->comm;
+	char *comm_r = right->thread->comm;
+
+	if (!comm_l || !comm_r)
+		return cmp_null(comm_l, comm_r);
+
+	return strcmp(comm_l, comm_r);
+}
+
 static int hist_entry__comm_snprintf(struct hist_entry *self, char *bf,
 				     size_t size, unsigned int width)
 {
 	return repsep_snprintf(bf, size, "%*s", width, self->thread->comm);
 }
 
+struct sort_entry sort_comm = {
+	.se_header	= "Command",
+	.se_cmp		= sort__comm_cmp,
+	.se_collapse	= sort__comm_collapse,
+	.se_snprintf	= hist_entry__comm_snprintf,
+	.se_width_idx	= HISTC_COMM,
+};
+
 /* --sort dso */
 
-int64_t
+static int64_t
 sort__dso_cmp(struct hist_entry *left, struct hist_entry *right)
 {
 	struct dso *dso_l = left->ms.map ? left->ms.map->dso : NULL;
@@ -173,9 +137,16 @@ static int hist_entry__dso_snprintf(struct hist_entry *self, char *bf,
 	return repsep_snprintf(bf, size, "%-*s", width, "[unknown]");
 }
 
+struct sort_entry sort_dso = {
+	.se_header	= "Shared Object",
+	.se_cmp		= sort__dso_cmp,
+	.se_snprintf	= hist_entry__dso_snprintf,
+	.se_width_idx	= HISTC_DSO,
+};
+
 /* --sort symbol */
 
-int64_t
+static int64_t
 sort__sym_cmp(struct hist_entry *left, struct hist_entry *right)
 {
 	u64 ip_l, ip_r;
@@ -211,29 +182,16 @@ static int hist_entry__sym_snprintf(struct hist_entry *self, char *bf,
 	return ret;
 }
 
-/* --sort comm */
-
-int64_t
-sort__comm_cmp(struct hist_entry *left, struct hist_entry *right)
-{
-	return right->thread->pid - left->thread->pid;
-}
-
-int64_t
-sort__comm_collapse(struct hist_entry *left, struct hist_entry *right)
-{
-	char *comm_l = left->thread->comm;
-	char *comm_r = right->thread->comm;
-
-	if (!comm_l || !comm_r)
-		return cmp_null(comm_l, comm_r);
-
-	return strcmp(comm_l, comm_r);
-}
+struct sort_entry sort_sym = {
+	.se_header	= "Symbol",
+	.se_cmp		= sort__sym_cmp,
+	.se_snprintf	= hist_entry__sym_snprintf,
+	.se_width_idx	= HISTC_SYMBOL,
+};
 
 /* --sort parent */
 
-int64_t
+static int64_t
 sort__parent_cmp(struct hist_entry *left, struct hist_entry *right)
 {
 	struct symbol *sym_l = left->parent;
@@ -252,9 +210,16 @@ static int hist_entry__parent_snprintf(struct hist_entry *self, char *bf,
 			      self->parent ? self->parent->name : "[other]");
 }
 
+struct sort_entry sort_parent = {
+	.se_header	= "Parent symbol",
+	.se_cmp		= sort__parent_cmp,
+	.se_snprintf	= hist_entry__parent_snprintf,
+	.se_width_idx	= HISTC_PARENT,
+};
+
 /* --sort cpu */
 
-int64_t
+static int64_t
 sort__cpu_cmp(struct hist_entry *left, struct hist_entry *right)
 {
 	return right->cpu - left->cpu;
@@ -266,6 +231,28 @@ static int hist_entry__cpu_snprintf(struct hist_entry *self, char *bf,
 	return repsep_snprintf(bf, size, "%-*d", width, self->cpu);
 }
 
+struct sort_entry sort_cpu = {
+	.se_header      = "CPU",
+	.se_cmp	        = sort__cpu_cmp,
+	.se_snprintf    = hist_entry__cpu_snprintf,
+	.se_width_idx	= HISTC_CPU,
+};
+
+struct sort_dimension {
+	const char		*name;
+	struct sort_entry	*entry;
+	int			taken;
+};
+
+static struct sort_dimension sort_dimensions[] = {
+	{ .name = "pid",	.entry = &sort_thread,	},
+	{ .name = "comm",	.entry = &sort_comm,	},
+	{ .name = "dso",	.entry = &sort_dso,	},
+	{ .name = "symbol",	.entry = &sort_sym,	},
+	{ .name = "parent",	.entry = &sort_parent,	},
+	{ .name = "cpu",	.entry = &sort_cpu,	},
+};
+
 int sort_dimension__add(const char *tok)
 {
 	unsigned int i;
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 0b91053..4a6d309 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -108,14 +108,6 @@ extern size_t sort__thread_print(FILE *, struct hist_entry *, unsigned int);
 extern size_t sort__comm_print(FILE *, struct hist_entry *, unsigned int);
 extern size_t sort__dso_print(FILE *, struct hist_entry *, unsigned int);
 extern size_t sort__sym_print(FILE *, struct hist_entry *, unsigned int __used);
-extern int64_t cmp_null(void *, void *);
-extern int64_t sort__thread_cmp(struct hist_entry *, struct hist_entry *);
-extern int64_t sort__comm_cmp(struct hist_entry *, struct hist_entry *);
-extern int64_t sort__comm_collapse(struct hist_entry *, struct hist_entry *);
-extern int64_t sort__dso_cmp(struct hist_entry *, struct hist_entry *);
-extern int64_t sort__sym_cmp(struct hist_entry *, struct hist_entry *);
-extern int64_t sort__parent_cmp(struct hist_entry *, struct hist_entry *);
-int64_t sort__cpu_cmp(struct hist_entry *left, struct hist_entry *right);
 extern size_t sort__parent_print(FILE *, struct hist_entry *, unsigned int);
 extern int sort_dimension__add(const char *);
 void sort_entry__setup_elide(struct sort_entry *self, struct strlist *list,
-- 
1.7.5.4
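
For readers following the sort.c change above: the new sort_dimensions[]
table maps a user-supplied token to a sort_entry, and sort_dimension__add()
(see patch 5/6) matches tokens with strncasecmp() over strlen(tok), i.e. as
a case-insensitive prefix. Below is a minimal standalone sketch of that
lookup. All demo_* names are hypothetical stand-ins, not perf's real types,
and the table only carries the four entries whose headers appear in the diff.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>	/* strncasecmp() */

/* Hypothetical stand-in for perf's struct sort_entry. */
struct demo_sort_entry {
	const char *header;
};

/* Same shape as the struct sort_dimension added by the patch. */
struct demo_sort_dimension {
	const char		*name;
	struct demo_sort_entry	*entry;
	int			taken;
};

static struct demo_sort_entry demo_dso    = { .header = "Shared Object" };
static struct demo_sort_entry demo_sym    = { .header = "Symbol" };
static struct demo_sort_entry demo_parent = { .header = "Parent symbol" };
static struct demo_sort_entry demo_cpu    = { .header = "CPU" };

static struct demo_sort_dimension demo_dimensions[] = {
	{ .name = "dso",	.entry = &demo_dso,	},
	{ .name = "symbol",	.entry = &demo_sym,	},
	{ .name = "parent",	.entry = &demo_parent,	},
	{ .name = "cpu",	.entry = &demo_cpu,	},
};

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Return the first dimension whose name starts with tok
 * (case-insensitive), or NULL if none matches.
 */
static struct demo_sort_dimension *demo_dimension__find(const char *tok)
{
	size_t i;

	for (i = 0; i < DEMO_ARRAY_SIZE(demo_dimensions); i++) {
		struct demo_sort_dimension *sd = &demo_dimensions[i];

		if (strncasecmp(tok, sd->name, strlen(tok)))
			continue;

		return sd;
	}

	return NULL;
}
```

Because the entries are only reachable through the table, the comparison
callbacks themselves no longer need external linkage, which is what lets
the patch mark them static.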



* [PATCH 3/6] perf tools: Remove sort print helpers declarations
  2011-06-29 23:34 [GIT PULL] perf tools updates Frederic Weisbecker
  2011-06-29 23:34 ` [PATCH 1/6] perf tools: Add inverted call graph report support Frederic Weisbecker
  2011-06-29 23:34 ` [PATCH 2/6] perf tools: Make sort operations static Frederic Weisbecker
@ 2011-06-29 23:34 ` Frederic Weisbecker
  2011-06-29 23:34 ` [PATCH 4/6] perf tools: Don't display ignored entries on stdio ui Frederic Weisbecker
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2011-06-29 23:34 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Stephane Eranian, David Ahern, Sam Liao

These are probably some old leftovers.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
---
 tools/perf/util/sort.h |    6 ------
 1 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 4a6d309..77d0388 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -103,12 +103,6 @@ extern struct sort_entry sort_thread;
 extern struct list_head hist_entry__sort_list;
 
 void setup_sorting(const char * const usagestr[], const struct option *opts);
-
-extern size_t sort__thread_print(FILE *, struct hist_entry *, unsigned int);
-extern size_t sort__comm_print(FILE *, struct hist_entry *, unsigned int);
-extern size_t sort__dso_print(FILE *, struct hist_entry *, unsigned int);
-extern size_t sort__sym_print(FILE *, struct hist_entry *, unsigned int __used);
-extern size_t sort__parent_print(FILE *, struct hist_entry *, unsigned int);
 extern int sort_dimension__add(const char *);
 void sort_entry__setup_elide(struct sort_entry *self, struct strlist *list,
 			     const char *list_name, FILE *fp);
-- 
1.7.5.4



* [PATCH 4/6] perf tools: Don't display ignored entries on stdio ui
  2011-06-29 23:34 [GIT PULL] perf tools updates Frederic Weisbecker
                   ` (2 preceding siblings ...)
  2011-06-29 23:34 ` [PATCH 3/6] perf tools: Remove sort print helpers declarations Frederic Weisbecker
@ 2011-06-29 23:34 ` Frederic Weisbecker
  2011-06-29 23:34 ` [PATCH 5/6] perf tools: Allow sort dimensions to be registered more than once Frederic Weisbecker
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2011-06-29 23:34 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Stephane Eranian, David Ahern, Sam Liao

As with the newt UI, don't display entries that have been marked
as ignored.

The current practical effect of this is to make parent
filtering actually work. Before, entries that were ignored
were given a null parent but were still displayed. This
resulted in some weird effects:

 # Overhead      Command      Shared Object        Symbol
 # ........  ...........  .................  ............
 #
^A
                   |
                   --- __lock_acquire
                      |
                      |--95.97%-- lock_acquire
                      |          |
                      |          |--30.75%-- _raw_spin_lock

Discard these from the stdio display.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
---
 tools/perf/util/hist.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index dae4202..677e1da 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -847,6 +847,9 @@ print_entries:
 	for (nd = rb_first(&self->entries); nd; nd = rb_next(nd)) {
 		struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
 
+		if (h->filtered)
+			continue;
+
 		if (show_displacement) {
 			if (h->pair != NULL)
 				displacement = ((long)h->pair->position -
-- 
1.7.5.4
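
The hist.c hunk above boils down to a "skip filtered entries" guard at the
top of the display loop. Here is a small standalone sketch of the same
pattern over a plain array instead of perf's rb-tree. The demo_* names are
hypothetical; only the "filtered" flag mirrors a field that appears in the
patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for struct hist_entry. */
struct demo_entry {
	const char	*name;
	bool		filtered;	/* set when a filter rejected the entry */
};

/* Print every entry that was not filtered out; return how many were printed. */
static size_t demo_print_entries(FILE *fp, const struct demo_entry *entries,
				 size_t nr)
{
	size_t i, printed = 0;

	for (i = 0; i < nr; i++) {
		/* Same guard as the patch: ignored entries are never shown. */
		if (entries[i].filtered)
			continue;

		fprintf(fp, "%s\n", entries[i].name);
		printed++;
	}

	return printed;
}
```

Putting the check first means no later formatting step in the loop ever
sees an entry that the parent filter rejected.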



* [PATCH 5/6] perf tools: Allow sort dimensions to be registered more than once
  2011-06-29 23:34 [GIT PULL] perf tools updates Frederic Weisbecker
                   ` (3 preceding siblings ...)
  2011-06-29 23:34 ` [PATCH 4/6] perf tools: Don't display ignored entries on stdio ui Frederic Weisbecker
@ 2011-06-29 23:34 ` Frederic Weisbecker
  2011-06-29 23:34 ` [PATCH 6/6] perf tools: Only display parent field if explictly sorted Frederic Weisbecker
  2011-07-01 10:01 ` [GIT PULL] perf tools updates Ingo Molnar
  6 siblings, 0 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2011-06-29 23:34 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Stephane Eranian, David Ahern, Sam Liao

So that the parent sort dimension can be registered twice: once
if we add it as an explicit sort dimension (-s parent) and a second
time if we request a parent filter (-p foo).

We'll have only one parent sort dimension in the end, but this
allows overriding the default parent filter with the one given in
the "-p" option. The goal of this is to prepare to allow the use of
"-s parent" and "-p foo" at the same time, i.e. sort by filtered
parent.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
---
 tools/perf/util/sort.c |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index f5dba56..401e220 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -260,15 +260,9 @@ int sort_dimension__add(const char *tok)
 	for (i = 0; i < ARRAY_SIZE(sort_dimensions); i++) {
 		struct sort_dimension *sd = &sort_dimensions[i];
 
-		if (sd->taken)
-			continue;
-
 		if (strncasecmp(tok, sd->name, strlen(tok)))
 			continue;
 
-		if (sd->entry->se_collapse)
-			sort__need_collapse = 1;
-
 		if (sd->entry == &sort_parent) {
 			int ret = regcomp(&parent_regex, parent_pattern, REG_EXTENDED);
 			if (ret) {
@@ -281,6 +275,12 @@ int sort_dimension__add(const char *tok)
 			sort__has_parent = 1;
 		}
 
+		if (sd->taken)
+			return 0;
+
+		if (sd->entry->se_collapse)
+			sort__need_collapse = 1;
+
 		if (list_empty(&hist_entry__sort_list)) {
 			if (!strcmp(sd->name, "pid"))
 				sort__first_dimension = SORT_PID;
-- 
1.7.5.4



* [PATCH 6/6] perf tools: Only display parent field if explictly sorted
  2011-06-29 23:34 [GIT PULL] perf tools updates Frederic Weisbecker
                   ` (4 preceding siblings ...)
  2011-06-29 23:34 ` [PATCH 5/6] perf tools: Allow sort dimensions to be registered more than once Frederic Weisbecker
@ 2011-06-29 23:34 ` Frederic Weisbecker
  2011-07-01 10:01 ` [GIT PULL] perf tools updates Ingo Molnar
  6 siblings, 0 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2011-06-29 23:34 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Stephane Eranian, David Ahern, Sam Liao

We don't need to display the parent field if the parent
sorting machinery is only used for parent filtering
(as in "-p foo").

However, if parent filtering is used in combination with
explicit parent sorting (-s parent), we want to
display it.

Result with:

  perf report -p kernel_thread -s parent

Before:

 # Overhead  Parent symbol
 # ........  .............
 #
     0.07%
            |
            --- ioread8
                ata_sff_check_status
                ata_sff_tf_load
                ata_sff_qc_issue
                ata_bmdma_qc_issue
                ata_qc_issue
                ata_scsi_translate
                ata_scsi_queuecmd
                scsi_dispatch_cmd
                scsi_request_fn
                __blk_run_queue
                __make_request
                generic_make_request
                submit_bio
                submit_bh
                journal_submit_commit_record
                jbd2_journal_commit_transaction
                kjournald2
                kthread
                kernel_thread_helpe

After:

 # Overhead  Parent symbol
 # ........  .............
 #
     0.07%  kernel_thread_helper
            |
            --- ioread8
                ata_sff_check_status
                ata_sff_tf_load
                ata_sff_qc_issue
                ata_bmdma_qc_issue
                ata_qc_issue
                ata_scsi_translate
                ata_scsi_queuecmd
                scsi_dispatch_cmd
                scsi_request_fn
                __blk_run_queue
                __make_request
                generic_make_request
                submit_bio
                submit_bh
                journal_submit_commit_record
                jbd2_journal_commit_transaction
                kjournald2
                kthread
                kernel_thread_helper

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
---
 tools/perf/builtin-report.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 271e252..5d43d01 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -525,7 +525,14 @@ int cmd_report(int argc, const char **argv, const char *prefix __used)
 	if (parent_pattern != default_parent_pattern) {
 		if (sort_dimension__add("parent") < 0)
 			return -1;
-		sort_parent.elide = 1;
+
+		/*
+		 * Only show the parent fields if we explicitly
+		 * sort that way. If we only use parent machinery
+		 * for filtering, we don't want it.
+		 */
+		if (!strstr(sort_order, "parent"))
+			sort_parent.elide = 1;
 	} else
 		symbol_conf.exclude_other = false;
 
-- 
1.7.5.4
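
The decision made in the builtin-report.c hunk above can be isolated into
a small predicate. The sketch below is only an illustration:
demo_elide_parent() and its parameters are hypothetical names, and
parent_filter_requested stands in for the
"parent_pattern != default_parent_pattern" test.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Should the "Parent symbol" column be hidden?
 *
 * sort_order:              the comma-separated -s argument
 * parent_filter_requested: true when a non-default -p pattern was given
 */
static bool demo_elide_parent(const char *sort_order,
			      bool parent_filter_requested)
{
	/* Without -p, this code path never hides the column. */
	if (!parent_filter_requested)
		return false;

	/* Hide it only when the user did not sort by parent explicitly. */
	return strstr(sort_order, "parent") == NULL;
}
```

One thing worth noting about the strstr() approach: it looks for the
literal substring "parent", whereas sort_dimension__add() matches tokens by
prefix. If abbreviated names such as "-s par" are accepted on the command
line, they would not be recognized by this check.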



* Re: [GIT PULL] perf tools updates
  2011-06-29 23:34 [GIT PULL] perf tools updates Frederic Weisbecker
                   ` (5 preceding siblings ...)
  2011-06-29 23:34 ` [PATCH 6/6] perf tools: Only display parent field if explictly sorted Frederic Weisbecker
@ 2011-07-01 10:01 ` Ingo Molnar
  6 siblings, 0 replies; 14+ messages in thread
From: Ingo Molnar @ 2011-07-01 10:01 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Peter Zijlstra, Arnaldo Carvalho de Melo, Stephane Eranian,
	David Ahern, Sam Liao


* Frederic Weisbecker <fweisbec@gmail.com> wrote:

> Ingo,
> 
> Please pull the perf/core branch that can be found at:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
> 	perf/core
> 
> It adds inverted callchain support and lets one use
> parent filtering with parent sorting at the same time, because
> it appears to me that inverted callchains sorted by filtered
> parents are pretty useful, and extendable to more cool things.
> 
> Anyway, inverted callchains used with different sorting combinations
> can in general provide some interesting analysis flavours.
> 
> Having played with it a bit, it seems to me the callee point
> of view (traditional -g callchains) is better suited to
> finding the precise, zoomed-in places where most CPU time is
> spent: spotting contention points, etc.
> 
> OTOH, the caller point of view (-G, inverted callchain) is
> for zoomed-out observation. It's more suited to
> global profiling, for example to get a big overview of where
> the hot bulk of a program is executing.
> 
> Examples:
> 
> - look at the hottest call tree of a program.
> 
> 	./perf report -G -s pid --stdio
> 	
>      5.73%               perf:11933
>             |
>             --- __libc_start_main
>                |          
>                |--99.18%-- main
>                |          run_builtin
>                |          cmd_bench
>                |          |          
>                |          |--89.68%-- bench_sched_messaging
>                |          |          |          
>                |          |          |--96.11%-- create_worker
>                |          |          |          |          
>                |          |          |          |--95.10%-- __libc_fork
>                |          |          |          |          |          
>                |          |          |          |          |--93.99%-- stub_clone
>                |          |          |          |          |          sys_clone
>                |          |          |          |          |          do_fork
>                |          |          |          |          |          |          
>                |          |          |          |          |          |--99.09%-- copy_process
>                |          |          |          |          |          |          |          
>                |          |          |          |          |          |          |--91.62%-- dup_mm
> 
> - look at where kernel threads spend their time
> 
> 	perf report -G -p kernel_thread -s parent --stdio
> 	
> # Overhead  Parent symbol
> # ........  .............
> #
>      0.07%  kernel_thread_helper
>             |
>             --- kernel_thread_helper
>                 kthread
>                |          
>                |--50.00%-- kjournald2
>                |          jbd2_journal_commit_transaction
>                |          journal_submit_commit_record
>                |          submit_bh
>                |          submit_bio
>                |          generic_make_request
>                |          __make_request
>                |          __blk_run_queue
>                |          scsi_request_fn
>                |          scsi_dispatch_cmd
>                |          ata_scsi_queuecmd
>                |          ata_scsi_translate
>                |          ata_qc_issue
>                |          ata_bmdma_qc_issue
>                |          ata_sff_qc_issue
>                |          ata_sff_tf_load
>                |          ata_sff_check_status
>                |          ioread8
>                |          
>                 --50.00%-- rcu_kthread
>                           rcu_process_callbacks
>                           delayed_put_task_struct
>                           __put_task_struct
>                           free_task
>                           free_thread_info
>                           free_thread_xstate
>                           kmem_cache_free
>                           __slab_free
>                           add_partial
>                           _raw_spin_lock
>                           lock_acquire
>                           
> etc...
> 
> We could extend that by cutting the callchains at some point.
> For example, stop a callchain at a given dso and you can profile
> which of its exported functions is called most.
> 
> Anyway, this has some nice potential.
> 
> 
> Thanks,
> 	Frederic
> ---
> 
> Frederic Weisbecker (5):
>       perf tools: Make sort operations static
>       perf tools: Remove sort print helpers declarations
>       perf tools: Don't display ignored entries on stdio ui
>       perf tools: Allow sort dimensions to be registered more than once
>       perf tools: Only display parent field if explictly sorted
> 
> Sam Liao (1):
>       perf tools: Add inverted call graph report support.
> 
> 
>  tools/perf/Documentation/perf-report.txt |   15 ++-
>  tools/perf/builtin-report.c              |   42 +++++-
>  tools/perf/util/callchain.h              |    6 +
>  tools/perf/util/hist.c                   |    6 +-
>  tools/perf/util/session.c                |    7 +-
>  tools/perf/util/sort.c                   |  223 ++++++++++++++----------------
>  tools/perf/util/sort.h                   |   14 --
>  7 files changed, 169 insertions(+), 144 deletions(-)

Pulled, thanks a lot Frederic and Sam Liao!

This feature looks really useful.

One thing that occurred to me: could we perhaps make -G the default
for -g -A profiles and keep -g the default for task-hierarchy (and 
per PID) profiling? [a hint could be added to the comment section of 
the output to show that there's a -g/-G distinction.]

The reason is that -G is arguably more suited for global, system-wide 
profiling - and this is also the mode of display that sysprof uses 
and which people got used to in general.

There is some potential for confusion in switching the view like
this, but I think if we point it out in the output it should be fine:

#
# Bottom-up (-g) call-graph, use -G to view the top-down call-graph
#

#
# Top-down (-G) call-graph, use -g to view the bottom-up call-graph
#

Another thing: could we perhaps make inverted call-graphs the default 
view for perf top --tui as well? That is a common 'global view' 
profiling tool as well.

Finally, we should perhaps refer to them as bottom-up versus top-down
call-graphs: 'inverted' and 'normal' do not really reflect the true
nature of the call-graph, and to many people top-down is the natural
call-graph view mode ...

Thanks,

	Ingo
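
To make the bottom-up versus top-down distinction concrete: a sampled chain
is naturally recorded callee first (ioread8 ... kernel_thread_helper, as in
the -g output quoted above), and the -G view presents the same frames caller
first. The sketch below simply reverses an array of frame names. It is an
illustration of the idea only, not a description of how the perf patch
implements inversion, and demo_callchain__invert() is a hypothetical name.

```c
#include <stddef.h>

/*
 * Reverse a chain of frames in place, so that a callee-first
 * (bottom-up) chain becomes caller-first (top-down), or vice versa.
 */
static void demo_callchain__invert(const char **frames, size_t nr)
{
	size_t i;

	for (i = 0; i < nr / 2; i++) {
		const char *tmp = frames[i];

		frames[i] = frames[nr - 1 - i];
		frames[nr - 1 - i] = tmp;
	}
}
```

Applied to { "ioread8", "kthread", "kernel_thread_helper" }, the array ends
up starting at kernel_thread_helper, which matches the root shown in the
-G example output.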


end of thread, other threads:[~2011-07-01 10:01 UTC | newest]

Thread overview: 14+ messages
2011-06-29 23:34 [GIT PULL] perf tools updates Frederic Weisbecker
2011-06-29 23:34 ` [PATCH 1/6] perf tools: Add inverted call graph report support Frederic Weisbecker
2011-06-29 23:34 ` [PATCH 2/6] perf tools: Make sort operations static Frederic Weisbecker
2011-06-29 23:34 ` [PATCH 3/6] perf tools: Remove sort print helpers declarations Frederic Weisbecker
2011-06-29 23:34 ` [PATCH 4/6] perf tools: Don't display ignored entries on stdio ui Frederic Weisbecker
2011-06-29 23:34 ` [PATCH 5/6] perf tools: Allow sort dimensions to be registered more than once Frederic Weisbecker
2011-06-29 23:34 ` [PATCH 6/6] perf tools: Only display parent field if explictly sorted Frederic Weisbecker
2011-07-01 10:01 ` [GIT PULL] perf tools updates Ingo Molnar
  -- strict thread matches above, loose matches on Subject: below --
2011-05-22  1:45 Frederic Weisbecker
2011-05-22  8:49 ` Ingo Molnar
2011-05-22 12:07   ` Frederic Weisbecker
2010-04-24  2:05 Frederic Weisbecker
2010-04-24  2:27 ` Frederic Weisbecker
2010-04-27  9:15 ` Ingo Molnar
