* [PATCH 1/6] perf tools: Add inverted call graph report support.
2011-06-29 23:34 [GIT PULL] perf tools updates Frederic Weisbecker
@ 2011-06-29 23:34 ` Frederic Weisbecker
2011-06-29 23:34 ` [PATCH 2/6] perf tools: Make sort operations static Frederic Weisbecker
` (5 subsequent siblings)
6 siblings, 0 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2011-06-29 23:34 UTC
To: Ingo Molnar
Cc: LKML, Sam Liao, Peter Zijlstra, Arnaldo Carvalho de Melo,
Stephane Eranian, David Ahern, Frederic Weisbecker
From: Sam Liao <phyomh@gmail.com>
Add a "caller/callee" order option to support an inverted butterfly
report. In the inverted report (with the caller option), the call graph
starts from the callee's ancestor. Users can use this view to find a
system's performance bottlenecks from a sysprof-like view. Using this
option with a specified sort order such as pid gives a high-level view
of call graph statistics.
Also add a "-G" alias for the inverted call graph.
Signed-off-by: Sam Liao <phyomh@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
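A minimal sketch of the index flip at the heart of this patch, for readers following along. It is not perf code: chain_ip_at() is a hypothetical helper, and it assumes the recorded chain stores the innermost frame (the sampled callee) first, so that walking it backwards yields the outermost caller first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the enum this patch adds to callchain.h. */
enum chain_order {
	ORDER_CALLER,
	ORDER_CALLEE
};

/*
 * Hypothetical helper (not part of perf): return the i-th address to
 * visit for a chain of nr entries. Assumes ips[0] is the innermost
 * frame and ips[nr - 1] the outermost caller.
 */
static uint64_t chain_ip_at(const uint64_t *ips, size_t nr, size_t i,
			    enum chain_order order)
{
	assert(i < nr);

	if (order == ORDER_CALLEE)
		return ips[i];		/* as recorded: callee first */

	return ips[nr - i - 1];		/* reversed: outermost caller first */
}
```

With ips = { 0x30, 0x20, 0x10 }, position 0 would be 0x30 in callee order and 0x10 in caller order, which is the same walk the loop in perf_session__resolve_callchain() performs after this change.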
tools/perf/Documentation/perf-report.txt | 15 ++++++++++--
tools/perf/builtin-report.c | 33 ++++++++++++++++++++++++-----
tools/perf/util/callchain.h | 6 +++++
tools/perf/util/hist.c | 3 +-
tools/perf/util/session.c | 7 +++++-
5 files changed, 53 insertions(+), 11 deletions(-)
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 8ba03d6..cfa8e51 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -80,15 +80,24 @@ OPTIONS
--dump-raw-trace::
Dump raw trace in ASCII.
--g [type,min]::
+-g [type,min,order]::
--call-graph::
- Display call chains using type and min percent threshold.
+ Display call chains using type, min percent threshold and order.
type can be either:
- flat: single column, linear exposure of call chains.
- graph: use a graph tree, displaying absolute overhead rates.
- fractal: like graph, but displays relative rates. Each branch of
the tree is considered as a new profiled object. +
- Default: fractal,0.5.
+
+ order can be either:
+ - callee: callee based call graph.
+ - caller: inverted caller based call graph.
+
+ Default: fractal,0.5,callee.
+
+-G::
+--inverted::
+ alias for inverted caller based call graph.
--pretty=<key>::
Pretty printing style. key: normal, raw
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 287a173..271e252 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -45,7 +45,8 @@ static struct perf_read_values show_threads_values;
static const char default_pretty_printing_style[] = "normal";
static const char *pretty_printing_style = default_pretty_printing_style;
-static char callchain_default_opt[] = "fractal,0.5";
+static char callchain_default_opt[] = "fractal,0.5,callee";
+static bool inverted_callchain;
static symbol_filter_t annotate_init;
static int perf_session__add_hist_entry(struct perf_session *session,
@@ -386,13 +387,29 @@ parse_callchain_opt(const struct option *opt __used, const char *arg,
if (!tok)
goto setup;
- tok2 = strtok(NULL, ",");
callchain_param.min_percent = strtod(tok, &endptr);
if (tok == endptr)
return -1;
- if (tok2)
+ /* get the print limit */
+ tok2 = strtok(NULL, ",");
+ if (!tok2)
+ goto setup;
+
+ if (tok2[0] != 'c') {
callchain_param.print_limit = strtod(tok2, &endptr);
+ tok2 = strtok(NULL, ",");
+ if (!tok2)
+ goto setup;
+ }
+
+ /* get the call chain order */
+ if (!strcmp(tok2, "caller"))
+ callchain_param.order = ORDER_CALLER;
+ else if (!strcmp(tok2, "callee"))
+ callchain_param.order = ORDER_CALLEE;
+ else
+ return -1;
setup:
if (callchain_register_param(&callchain_param) < 0) {
fprintf(stderr, "Can't register callchain params\n");
@@ -436,9 +453,10 @@ static const struct option options[] = {
"regex filter to identify parent, see: '--sort parent'"),
OPT_BOOLEAN('x', "exclude-other", &symbol_conf.exclude_other,
"Only display entries with parent-match"),
- OPT_CALLBACK_DEFAULT('g', "call-graph", NULL, "output_type,min_percent",
- "Display callchains using output_type (graph, flat, fractal, or none) and min percent threshold. "
- "Default: fractal,0.5", &parse_callchain_opt, callchain_default_opt),
+ OPT_CALLBACK_DEFAULT('g', "call-graph", NULL, "output_type,min_percent, call_order",
+ "Display callchains using output_type (graph, flat, fractal, or none) , min percent threshold and callchain order. "
+ "Default: fractal,0.5,callee", &parse_callchain_opt, callchain_default_opt),
+ OPT_BOOLEAN('G', "inverted", &inverted_callchain, "alias for inverted call graph"),
OPT_STRING('d', "dsos", &symbol_conf.dso_list_str, "dso[,dso...]",
"only consider symbols in these dsos"),
OPT_STRING('C', "comms", &symbol_conf.comm_list_str, "comm[,comm...]",
@@ -467,6 +485,9 @@ int cmd_report(int argc, const char **argv, const char *prefix __used)
else if (use_tui)
use_browser = 1;
+ if (inverted_callchain)
+ callchain_param.order = ORDER_CALLER;
+
if (strcmp(input_name, "-") != 0)
setup_browser(true);
else
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 1a79df9..9b4ff16 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -14,6 +14,11 @@ enum chain_mode {
CHAIN_GRAPH_REL
};
+enum chain_order {
+ ORDER_CALLER,
+ ORDER_CALLEE
+};
+
struct callchain_node {
struct callchain_node *parent;
struct list_head siblings;
@@ -41,6 +46,7 @@ struct callchain_param {
u32 print_limit;
double min_percent;
sort_chain_func_t sort;
+ enum chain_order order;
};
struct callchain_list {
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 627a02e..dae4202 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -14,7 +14,8 @@ enum hist_filter {
struct callchain_param callchain_param = {
.mode = CHAIN_GRAPH_REL,
- .min_percent = 0.5
+ .min_percent = 0.5,
+ .order = ORDER_CALLEE
};
u16 hists__col_len(struct hists *self, enum hist_column col)
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index b723f21..558bcf9 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -247,9 +247,14 @@ int perf_session__resolve_callchain(struct perf_session *self,
callchain_cursor_reset(&self->callchain_cursor);
for (i = 0; i < chain->nr; i++) {
- u64 ip = chain->ips[i];
+ u64 ip;
struct addr_location al;
+ if (callchain_param.order == ORDER_CALLEE)
+ ip = chain->ips[i];
+ else
+ ip = chain->ips[chain->nr - i - 1];
+
if (ip >= PERF_CONTEXT_MAX) {
switch (ip) {
case PERF_CONTEXT_HV:
--
1.7.5.4
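The option grammar handled by parse_callchain_opt() above can be sketched in isolation as follows. This is a simplified stand-in, not the perf function: struct cg_opts and parse_cg_opts() are hypothetical names, the output type is copied without being validated, and unlike the patch the sketch rejects a non-numeric print limit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum chain_order { ORDER_CALLER, ORDER_CALLEE };

/* Hypothetical container for the parsed fields; not perf's callchain_param. */
struct cg_opts {
	char type[16];
	double min_percent;
	unsigned int print_limit;
	enum chain_order order;
};

/*
 * Parse "type[,min_percent[,print_limit][,order]]" into *o.
 * Fields missing from arg keep the value already stored in *o.
 * Returns 0 on success, -1 on malformed input.
 */
static int parse_cg_opts(const char *arg, struct cg_opts *o)
{
	char buf[64];
	char *tok, *endptr;

	if (strlen(arg) >= sizeof(buf))
		return -1;
	strcpy(buf, arg);		/* strtok() modifies its input */

	tok = strtok(buf, ",");
	if (!tok || strlen(tok) >= sizeof(o->type))
		return -1;
	strcpy(o->type, tok);

	tok = strtok(NULL, ",");
	if (!tok)
		return 0;
	o->min_percent = strtod(tok, &endptr);
	if (tok == endptr)
		return -1;

	tok = strtok(NULL, ",");
	if (!tok)
		return 0;

	/* As in the patch: a field not starting with 'c' is the print limit. */
	if (tok[0] != 'c') {
		o->print_limit = (unsigned int)strtoul(tok, &endptr, 10);
		if (tok == endptr)
			return -1;
		tok = strtok(NULL, ",");
		if (!tok)
			return 0;
	}

	if (!strcmp(tok, "caller"))
		o->order = ORDER_CALLER;
	else if (!strcmp(tok, "callee"))
		o->order = ORDER_CALLEE;
	else
		return -1;

	return 0;
}
```

Under these assumptions, "graph,1.5,caller" selects the inverted order without a print limit, while "flat,0.5,10,callee" sets the limit to 10 and keeps the traditional order.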
* [PATCH 2/6] perf tools: Make sort operations static
2011-06-29 23:34 [GIT PULL] perf tools updates Frederic Weisbecker
2011-06-29 23:34 ` [PATCH 1/6] perf tools: Add inverted call graph report support Frederic Weisbecker
@ 2011-06-29 23:34 ` Frederic Weisbecker
2011-06-29 23:34 ` [PATCH 3/6] perf tools: Remove sort print helpers declarations Frederic Weisbecker
` (4 subsequent siblings)
6 siblings, 0 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2011-06-29 23:34 UTC
To: Ingo Molnar
Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
Arnaldo Carvalho de Melo, Stephane Eranian, David Ahern, Sam Liao
These don't need to be globally visible.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
---
tools/perf/util/sort.c | 211 ++++++++++++++++++++++-------------------------
tools/perf/util/sort.h | 8 --
2 files changed, 99 insertions(+), 120 deletions(-)
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index f44fa54..f5dba56 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -15,95 +15,6 @@ char * field_sep;
LIST_HEAD(hist_entry__sort_list);
-static int hist_entry__thread_snprintf(struct hist_entry *self, char *bf,
- size_t size, unsigned int width);
-static int hist_entry__comm_snprintf(struct hist_entry *self, char *bf,
- size_t size, unsigned int width);
-static int hist_entry__dso_snprintf(struct hist_entry *self, char *bf,
- size_t size, unsigned int width);
-static int hist_entry__sym_snprintf(struct hist_entry *self, char *bf,
- size_t size, unsigned int width);
-static int hist_entry__parent_snprintf(struct hist_entry *self, char *bf,
- size_t size, unsigned int width);
-static int hist_entry__cpu_snprintf(struct hist_entry *self, char *bf,
- size_t size, unsigned int width);
-
-struct sort_entry sort_thread = {
- .se_header = "Command: Pid",
- .se_cmp = sort__thread_cmp,
- .se_snprintf = hist_entry__thread_snprintf,
- .se_width_idx = HISTC_THREAD,
-};
-
-struct sort_entry sort_comm = {
- .se_header = "Command",
- .se_cmp = sort__comm_cmp,
- .se_collapse = sort__comm_collapse,
- .se_snprintf = hist_entry__comm_snprintf,
- .se_width_idx = HISTC_COMM,
-};
-
-struct sort_entry sort_dso = {
- .se_header = "Shared Object",
- .se_cmp = sort__dso_cmp,
- .se_snprintf = hist_entry__dso_snprintf,
- .se_width_idx = HISTC_DSO,
-};
-
-struct sort_entry sort_sym = {
- .se_header = "Symbol",
- .se_cmp = sort__sym_cmp,
- .se_snprintf = hist_entry__sym_snprintf,
- .se_width_idx = HISTC_SYMBOL,
-};
-
-struct sort_entry sort_parent = {
- .se_header = "Parent symbol",
- .se_cmp = sort__parent_cmp,
- .se_snprintf = hist_entry__parent_snprintf,
- .se_width_idx = HISTC_PARENT,
-};
-
-struct sort_entry sort_cpu = {
- .se_header = "CPU",
- .se_cmp = sort__cpu_cmp,
- .se_snprintf = hist_entry__cpu_snprintf,
- .se_width_idx = HISTC_CPU,
-};
-
-struct sort_dimension {
- const char *name;
- struct sort_entry *entry;
- int taken;
-};
-
-static struct sort_dimension sort_dimensions[] = {
- { .name = "pid", .entry = &sort_thread, },
- { .name = "comm", .entry = &sort_comm, },
- { .name = "dso", .entry = &sort_dso, },
- { .name = "symbol", .entry = &sort_sym, },
- { .name = "parent", .entry = &sort_parent, },
- { .name = "cpu", .entry = &sort_cpu, },
-};
-
-int64_t cmp_null(void *l, void *r)
-{
- if (!l && !r)
- return 0;
- else if (!l)
- return -1;
- else
- return 1;
-}
-
-/* --sort pid */
-
-int64_t
-sort__thread_cmp(struct hist_entry *left, struct hist_entry *right)
-{
- return right->thread->pid - left->thread->pid;
-}
-
static int repsep_snprintf(char *bf, size_t size, const char *fmt, ...)
{
int n;
@@ -125,6 +36,24 @@ static int repsep_snprintf(char *bf, size_t size, const char *fmt, ...)
return n;
}
+static int64_t cmp_null(void *l, void *r)
+{
+ if (!l && !r)
+ return 0;
+ else if (!l)
+ return -1;
+ else
+ return 1;
+}
+
+/* --sort pid */
+
+static int64_t
+sort__thread_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+ return right->thread->pid - left->thread->pid;
+}
+
static int hist_entry__thread_snprintf(struct hist_entry *self, char *bf,
size_t size, unsigned int width)
{
@@ -132,15 +61,50 @@ static int hist_entry__thread_snprintf(struct hist_entry *self, char *bf,
self->thread->comm ?: "", self->thread->pid);
}
+struct sort_entry sort_thread = {
+ .se_header = "Command: Pid",
+ .se_cmp = sort__thread_cmp,
+ .se_snprintf = hist_entry__thread_snprintf,
+ .se_width_idx = HISTC_THREAD,
+};
+
+/* --sort comm */
+
+static int64_t
+sort__comm_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+ return right->thread->pid - left->thread->pid;
+}
+
+static int64_t
+sort__comm_collapse(struct hist_entry *left, struct hist_entry *right)
+{
+ char *comm_l = left->thread->comm;
+ char *comm_r = right->thread->comm;
+
+ if (!comm_l || !comm_r)
+ return cmp_null(comm_l, comm_r);
+
+ return strcmp(comm_l, comm_r);
+}
+
static int hist_entry__comm_snprintf(struct hist_entry *self, char *bf,
size_t size, unsigned int width)
{
return repsep_snprintf(bf, size, "%*s", width, self->thread->comm);
}
+struct sort_entry sort_comm = {
+ .se_header = "Command",
+ .se_cmp = sort__comm_cmp,
+ .se_collapse = sort__comm_collapse,
+ .se_snprintf = hist_entry__comm_snprintf,
+ .se_width_idx = HISTC_COMM,
+};
+
/* --sort dso */
-int64_t
+static int64_t
sort__dso_cmp(struct hist_entry *left, struct hist_entry *right)
{
struct dso *dso_l = left->ms.map ? left->ms.map->dso : NULL;
@@ -173,9 +137,16 @@ static int hist_entry__dso_snprintf(struct hist_entry *self, char *bf,
return repsep_snprintf(bf, size, "%-*s", width, "[unknown]");
}
+struct sort_entry sort_dso = {
+ .se_header = "Shared Object",
+ .se_cmp = sort__dso_cmp,
+ .se_snprintf = hist_entry__dso_snprintf,
+ .se_width_idx = HISTC_DSO,
+};
+
/* --sort symbol */
-int64_t
+static int64_t
sort__sym_cmp(struct hist_entry *left, struct hist_entry *right)
{
u64 ip_l, ip_r;
@@ -211,29 +182,16 @@ static int hist_entry__sym_snprintf(struct hist_entry *self, char *bf,
return ret;
}
-/* --sort comm */
-
-int64_t
-sort__comm_cmp(struct hist_entry *left, struct hist_entry *right)
-{
- return right->thread->pid - left->thread->pid;
-}
-
-int64_t
-sort__comm_collapse(struct hist_entry *left, struct hist_entry *right)
-{
- char *comm_l = left->thread->comm;
- char *comm_r = right->thread->comm;
-
- if (!comm_l || !comm_r)
- return cmp_null(comm_l, comm_r);
-
- return strcmp(comm_l, comm_r);
-}
+struct sort_entry sort_sym = {
+ .se_header = "Symbol",
+ .se_cmp = sort__sym_cmp,
+ .se_snprintf = hist_entry__sym_snprintf,
+ .se_width_idx = HISTC_SYMBOL,
+};
/* --sort parent */
-int64_t
+static int64_t
sort__parent_cmp(struct hist_entry *left, struct hist_entry *right)
{
struct symbol *sym_l = left->parent;
@@ -252,9 +210,16 @@ static int hist_entry__parent_snprintf(struct hist_entry *self, char *bf,
self->parent ? self->parent->name : "[other]");
}
+struct sort_entry sort_parent = {
+ .se_header = "Parent symbol",
+ .se_cmp = sort__parent_cmp,
+ .se_snprintf = hist_entry__parent_snprintf,
+ .se_width_idx = HISTC_PARENT,
+};
+
/* --sort cpu */
-int64_t
+static int64_t
sort__cpu_cmp(struct hist_entry *left, struct hist_entry *right)
{
return right->cpu - left->cpu;
@@ -266,6 +231,28 @@ static int hist_entry__cpu_snprintf(struct hist_entry *self, char *bf,
return repsep_snprintf(bf, size, "%-*d", width, self->cpu);
}
+struct sort_entry sort_cpu = {
+ .se_header = "CPU",
+ .se_cmp = sort__cpu_cmp,
+ .se_snprintf = hist_entry__cpu_snprintf,
+ .se_width_idx = HISTC_CPU,
+};
+
+struct sort_dimension {
+ const char *name;
+ struct sort_entry *entry;
+ int taken;
+};
+
+static struct sort_dimension sort_dimensions[] = {
+ { .name = "pid", .entry = &sort_thread, },
+ { .name = "comm", .entry = &sort_comm, },
+ { .name = "dso", .entry = &sort_dso, },
+ { .name = "symbol", .entry = &sort_sym, },
+ { .name = "parent", .entry = &sort_parent, },
+ { .name = "cpu", .entry = &sort_cpu, },
+};
+
int sort_dimension__add(const char *tok)
{
unsigned int i;
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 0b91053..4a6d309 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -108,14 +108,6 @@ extern size_t sort__thread_print(FILE *, struct hist_entry *, unsigned int);
extern size_t sort__comm_print(FILE *, struct hist_entry *, unsigned int);
extern size_t sort__dso_print(FILE *, struct hist_entry *, unsigned int);
extern size_t sort__sym_print(FILE *, struct hist_entry *, unsigned int __used);
-extern int64_t cmp_null(void *, void *);
-extern int64_t sort__thread_cmp(struct hist_entry *, struct hist_entry *);
-extern int64_t sort__comm_cmp(struct hist_entry *, struct hist_entry *);
-extern int64_t sort__comm_collapse(struct hist_entry *, struct hist_entry *);
-extern int64_t sort__dso_cmp(struct hist_entry *, struct hist_entry *);
-extern int64_t sort__sym_cmp(struct hist_entry *, struct hist_entry *);
-extern int64_t sort__parent_cmp(struct hist_entry *, struct hist_entry *);
-int64_t sort__cpu_cmp(struct hist_entry *left, struct hist_entry *right);
extern size_t sort__parent_print(FILE *, struct hist_entry *, unsigned int);
extern int sort_dimension__add(const char *);
void sort_entry__setup_elide(struct sort_entry *self, struct strlist *list,
--
1.7.5.4
* [PATCH 3/6] perf tools: Remove sort print helpers declarations
2011-06-29 23:34 [GIT PULL] perf tools updates Frederic Weisbecker
2011-06-29 23:34 ` [PATCH 1/6] perf tools: Add inverted call graph report support Frederic Weisbecker
2011-06-29 23:34 ` [PATCH 2/6] perf tools: Make sort operations static Frederic Weisbecker
@ 2011-06-29 23:34 ` Frederic Weisbecker
2011-06-29 23:34 ` [PATCH 4/6] perf tools: Don't display ignored entries on stdio ui Frederic Weisbecker
` (3 subsequent siblings)
6 siblings, 0 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2011-06-29 23:34 UTC
To: Ingo Molnar
Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
Arnaldo Carvalho de Melo, Stephane Eranian, David Ahern, Sam Liao
These are probably some old leftovers.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
---
tools/perf/util/sort.h | 6 ------
1 files changed, 0 insertions(+), 6 deletions(-)
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 4a6d309..77d0388 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -103,12 +103,6 @@ extern struct sort_entry sort_thread;
extern struct list_head hist_entry__sort_list;
void setup_sorting(const char * const usagestr[], const struct option *opts);
-
-extern size_t sort__thread_print(FILE *, struct hist_entry *, unsigned int);
-extern size_t sort__comm_print(FILE *, struct hist_entry *, unsigned int);
-extern size_t sort__dso_print(FILE *, struct hist_entry *, unsigned int);
-extern size_t sort__sym_print(FILE *, struct hist_entry *, unsigned int __used);
-extern size_t sort__parent_print(FILE *, struct hist_entry *, unsigned int);
extern int sort_dimension__add(const char *);
void sort_entry__setup_elide(struct sort_entry *self, struct strlist *list,
const char *list_name, FILE *fp);
--
1.7.5.4
* [PATCH 4/6] perf tools: Don't display ignored entries on stdio ui
2011-06-29 23:34 [GIT PULL] perf tools updates Frederic Weisbecker
` (2 preceding siblings ...)
2011-06-29 23:34 ` [PATCH 3/6] perf tools: Remove sort print helpers declarations Frederic Weisbecker
@ 2011-06-29 23:34 ` Frederic Weisbecker
2011-06-29 23:34 ` [PATCH 5/6] perf tools: Allow sort dimensions to be registered more than once Frederic Weisbecker
` (2 subsequent siblings)
6 siblings, 0 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2011-06-29 23:34 UTC
To: Ingo Molnar
Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
Arnaldo Carvalho de Melo, Stephane Eranian, David Ahern, Sam Liao
As with the newt UI, don't display entries that have been marked
as ignored.
The current practical effect of this is to make parent
filtering actually work. Before, entries that were ignored
were given a null parent but were still displayed. This
resulted in some weird effects:
# Overhead Command Shared Object Symbol
# ........ ........... ................. ............
#
^A
|
--- __lock_acquire
|
|--95.97%-- lock_acquire
| |
| |--30.75%-- _raw_spin_lock
Discard these from the stdio display.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
---
tools/perf/util/hist.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index dae4202..677e1da 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -847,6 +847,9 @@ print_entries:
for (nd = rb_first(&self->entries); nd; nd = rb_next(nd)) {
struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
+ if (h->filtered)
+ continue;
+
if (show_displacement) {
if (h->pair != NULL)
displacement = ((long)h->pair->position -
--
1.7.5.4
* [PATCH 5/6] perf tools: Allow sort dimensions to be registered more than once
2011-06-29 23:34 [GIT PULL] perf tools updates Frederic Weisbecker
` (3 preceding siblings ...)
2011-06-29 23:34 ` [PATCH 4/6] perf tools: Don't display ignored entries on stdio ui Frederic Weisbecker
@ 2011-06-29 23:34 ` Frederic Weisbecker
2011-06-29 23:34 ` [PATCH 6/6] perf tools: Only display parent field if explictly sorted Frederic Weisbecker
2011-07-01 10:01 ` [GIT PULL] perf tools updates Ingo Molnar
6 siblings, 0 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2011-06-29 23:34 UTC
To: Ingo Molnar
Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
Arnaldo Carvalho de Melo, Stephane Eranian, David Ahern, Sam Liao
So that the parent sort dimension can be registered twice: once
if we add it as an explicit sort dimension (-s parent) and a second
time if we request a parent filter (-p foo).
We'll have only one parent sort dimension in the end, but this
allows overriding the default parent filter with the one we gave in
the "-p" option. The goal is to prepare for using "-s parent" and
"-p foo" at the same time, i.e. sorting by filtered parent.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
---
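The reordering in sort_dimension__add() can be illustrated with a stripped-down model. Everything here is hypothetical (struct dim, dim_add(), nr_sort_keys); it uses exact name matching and a plain -1 for unknown names rather than whatever perf itself does, and a counter stands in for the per-dimension setup such as compiling the parent regex.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified record; not perf's struct sort_dimension. */
struct dim {
	const char *name;
	int taken;		/* already on the sort list? */
	int setup_calls;	/* how many times per-dimension setup ran */
};

static struct dim dims[] = {
	{ "pid", 0, 0 },
	{ "parent", 0, 0 },
};

static int nr_sort_keys;	/* length of the imaginary sort list */

/*
 * Register a dimension by name. Setup runs on every call, so a later
 * request can refresh it, but the dimension is listed only once.
 * Returns 0 on success, -1 if the name is unknown.
 */
static int dim_add(const char *tok)
{
	size_t i;

	for (i = 0; i < sizeof(dims) / sizeof(dims[0]); i++) {
		struct dim *d = &dims[i];

		if (strcmp(tok, d->name))
			continue;

		d->setup_calls++;	/* stands in for the regcomp() step */

		if (d->taken)
			return 0;	/* already listed: no longer skipped */

		d->taken = 1;
		nr_sort_keys++;
		return 0;
	}

	return -1;
}
```

Calling dim_add("parent") twice in this model succeeds both times, runs the setup step twice, and leaves a single entry on the list, which is the behaviour the commit message describes.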
tools/perf/util/sort.c | 12 ++++++------
1 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index f5dba56..401e220 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -260,15 +260,9 @@ int sort_dimension__add(const char *tok)
for (i = 0; i < ARRAY_SIZE(sort_dimensions); i++) {
struct sort_dimension *sd = &sort_dimensions[i];
- if (sd->taken)
- continue;
-
if (strncasecmp(tok, sd->name, strlen(tok)))
continue;
- if (sd->entry->se_collapse)
- sort__need_collapse = 1;
-
if (sd->entry == &sort_parent) {
int ret = regcomp(&parent_regex, parent_pattern, REG_EXTENDED);
if (ret) {
@@ -281,6 +275,12 @@ int sort_dimension__add(const char *tok)
sort__has_parent = 1;
}
+ if (sd->taken)
+ return 0;
+
+ if (sd->entry->se_collapse)
+ sort__need_collapse = 1;
+
if (list_empty(&hist_entry__sort_list)) {
if (!strcmp(sd->name, "pid"))
sort__first_dimension = SORT_PID;
--
1.7.5.4
* [PATCH 6/6] perf tools: Only display parent field if explictly sorted
2011-06-29 23:34 [GIT PULL] perf tools updates Frederic Weisbecker
` (4 preceding siblings ...)
2011-06-29 23:34 ` [PATCH 5/6] perf tools: Allow sort dimensions to be registered more than once Frederic Weisbecker
@ 2011-06-29 23:34 ` Frederic Weisbecker
2011-07-01 10:01 ` [GIT PULL] perf tools updates Ingo Molnar
6 siblings, 0 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2011-06-29 23:34 UTC
To: Ingo Molnar
Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
Arnaldo Carvalho de Melo, Stephane Eranian, David Ahern, Sam Liao
We don't need to display the parent field if the parent
sorting machinery is only used for parent filtering
(as in "-p foo").
However, if parent filtering is used in combination with
explicit parent sorting (-s parent), we want to
display it.
Result with:
perf report -p kernel_thread -s parent
Before:
# Overhead Parent symbol
# ........ .............
#
0.07%
|
--- ioread8
ata_sff_check_status
ata_sff_tf_load
ata_sff_qc_issue
ata_bmdma_qc_issue
ata_qc_issue
ata_scsi_translate
ata_scsi_queuecmd
scsi_dispatch_cmd
scsi_request_fn
__blk_run_queue
__make_request
generic_make_request
submit_bio
submit_bh
journal_submit_commit_record
jbd2_journal_commit_transaction
kjournald2
kthread
kernel_thread_helpe
After:
# Overhead Parent symbol
# ........ .............
#
0.07% kernel_thread_helper
|
--- ioread8
ata_sff_check_status
ata_sff_tf_load
ata_sff_qc_issue
ata_bmdma_qc_issue
ata_qc_issue
ata_scsi_translate
ata_scsi_queuecmd
scsi_dispatch_cmd
scsi_request_fn
__blk_run_queue
__make_request
generic_make_request
submit_bio
submit_bh
journal_submit_commit_record
jbd2_journal_commit_transaction
kjournald2
kthread
kernel_thread_helper
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
---
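The decision this patch makes can be written as a small predicate. should_elide_parent() is a hypothetical helper, not something perf defines; parent_filter_set stands for "a -p pattern other than the default was given" and sort_order for the string passed to -s.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: would the patched cmd_report() set
 * sort_parent.elide? Only when the parent dimension was pulled in
 * for filtering and the user did not also ask to sort by it.
 */
static bool should_elide_parent(bool parent_filter_set, const char *sort_order)
{
	assert(sort_order != NULL);

	if (!parent_filter_set)
		return false;	/* the elide flag is not touched at all */

	/* Substring test, as in the patch: any key containing "parent" matches. */
	return strstr(sort_order, "parent") == NULL;
}
```

So "-p kernel_thread" with a sort order of "comm,dso,symbol" hides the column, while "-p kernel_thread -s parent" keeps it, matching the before/after output above.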
tools/perf/builtin-report.c | 9 ++++++++-
1 files changed, 8 insertions(+), 1 deletions(-)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 271e252..5d43d01 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -525,7 +525,14 @@ int cmd_report(int argc, const char **argv, const char *prefix __used)
if (parent_pattern != default_parent_pattern) {
if (sort_dimension__add("parent") < 0)
return -1;
- sort_parent.elide = 1;
+
+ /*
+ * Only show the parent fields if we explicitly
+ * sort that way. If we only use parent machinery
+ * for filtering, we don't want it.
+ */
+ if (!strstr(sort_order, "parent"))
+ sort_parent.elide = 1;
} else
symbol_conf.exclude_other = false;
--
1.7.5.4
* Re: [GIT PULL] perf tools updates
2011-06-29 23:34 [GIT PULL] perf tools updates Frederic Weisbecker
` (5 preceding siblings ...)
2011-06-29 23:34 ` [PATCH 6/6] perf tools: Only display parent field if explictly sorted Frederic Weisbecker
@ 2011-07-01 10:01 ` Ingo Molnar
6 siblings, 0 replies; 14+ messages in thread
From: Ingo Molnar @ 2011-07-01 10:01 UTC
To: Frederic Weisbecker
Cc: LKML, Peter Zijlstra, Arnaldo Carvalho de Melo, Stephane Eranian,
David Ahern, Sam Liao
* Frederic Weisbecker <fweisbec@gmail.com> wrote:
> Ingo,
>
> Please pull the perf/core branch that can be found at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
> perf/core
>
> It adds inverted callchain support and lets one use
> parent filtering with parent sorting at the same time, because
> it appears to me that inverted callchains sorted by filtered
> parents are pretty useful, and extendable to more cool things.
>
> Anyway, inverted callchains combined with different sort orders
> can in general provide some interesting kinds of analysis.
>
> Having played with it a bit, it seems to me the callee point
> of view (traditional -g callchains) is better suited to
> finding the precise, zoomed-in places where most CPU time is
> spent: spotting contention points, etc.
>
> OTOH, the caller point of view (-G, inverted callchain) is
> for zoomed-out observation, of course. It's better suited to
> global profiling, for example to get a broad overview of where
> the hot bulk of a program is executing.
>
> Examples:
>
> - look at the hottest call tree of a program.
>
> ./perf report -G -s pid --stdio
>
> 5.73% perf:11933
> |
> --- __libc_start_main
> |
> |--99.18%-- main
> | run_builtin
> | cmd_bench
> | |
> | |--89.68%-- bench_sched_messaging
> | | |
> | | |--96.11%-- create_worker
> | | | |
> | | | |--95.10%-- __libc_fork
> | | | | |
> | | | | |--93.99%-- stub_clone
> | | | | | sys_clone
> | | | | | do_fork
> | | | | | |
> | | | | | |--99.09%-- copy_process
> | | | | | | |
> | | | | | | |--91.62%-- dup_mm
>
> - look at where kernel threads spend their time
>
> perf report -G -p kernel_thread -s parent --stdio
>
> # Overhead Parent symbol
> # ........ .............
> #
> 0.07% kernel_thread_helper
> |
> --- kernel_thread_helper
> kthread
> |
> |--50.00%-- kjournald2
> | jbd2_journal_commit_transaction
> | journal_submit_commit_record
> | submit_bh
> | submit_bio
> | generic_make_request
> | __make_request
> | __blk_run_queue
> | scsi_request_fn
> | scsi_dispatch_cmd
> | ata_scsi_queuecmd
> | ata_scsi_translate
> | ata_qc_issue
> | ata_bmdma_qc_issue
> | ata_sff_qc_issue
> | ata_sff_tf_load
> | ata_sff_check_status
> | ioread8
> |
> --50.00%-- rcu_kthread
> rcu_process_callbacks
> delayed_put_task_struct
> __put_task_struct
> free_task
> free_thread_info
> free_thread_xstate
> kmem_cache_free
> __slab_free
> add_partial
> _raw_spin_lock
> lock_acquire
>
> etc...
>
> We could extend that by cutting the callchains at a chosen point.
> For example, stop each callchain at a given dso and you can profile
> which of its exported functions is called most.
>
> Anyway, this has some nice potential.
>
>
> Thanks,
> Frederic
> ---
>
> Frederic Weisbecker (5):
> perf tools: Make sort operations static
> perf tools: Remove sort print helpers declarations
> perf tools: Don't display ignored entries on stdio ui
> perf tools: Allow sort dimensions to be registered more than once
> perf tools: Only display parent field if explictly sorted
>
> Sam Liao (1):
> perf tools: Add inverted call graph report support.
>
>
> tools/perf/Documentation/perf-report.txt | 15 ++-
> tools/perf/builtin-report.c | 42 +++++-
> tools/perf/util/callchain.h | 6 +
> tools/perf/util/hist.c | 6 +-
> tools/perf/util/session.c | 7 +-
> tools/perf/util/sort.c | 223 ++++++++++++++----------------
> tools/perf/util/sort.h | 14 --
> 7 files changed, 169 insertions(+), 144 deletions(-)
Pulled, thanks a lot Frederic and Sam Liao!
This feature looks really useful.
One thing that occurred to me: could we perhaps make -G the default
for -g -A profiles and keep -g the default for task-hierarchy (and
per PID) profiling? [a hint could be added to the comment section of
the output to show that there's a -g/-G distinction.]
The reason is that -G is arguably more suited for global, system-wide
profiling - and this is also the mode of display that sysprof uses
and which people got used to in general.
There is some small confusion potential from switching the view like
this but i think if we point it out in the output it should be fine:
#
# Bottom-up (-g) call-graph, use -G to view the top-down call-graph
#
#
# Top-down (-G) call-graph, use -g to view the bottom-up call-graph
#
Another thing: could we perhaps make inverted call-graphs the default
view for perf top --tui as well? That is a common 'global view'
profiling tool as well.
Finally, we should perhaps refer to them as bottom-up versus top-down
call-graphs: 'inverted' and 'normal' do not really reflect the true
nature of the call-graph, and to many people top-down is the natural
call-graph view mode ...
Thanks,
Ingo