* [PATCH 1/5] perf report: Warn on callchain output request from non-callchain file
@ 2009-07-05 5:39 Frederic Weisbecker
2009-07-05 5:39 ` [PATCH 2/5] perf report: Use a modifiable string for default callchain options Frederic Weisbecker
` (4 more replies)
0 siblings, 5 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2009-07-05 5:39 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Peter Zijlstra, Mike Galbraith, Paul Mackerras,
Anton Blanchard, Jens Axboe, Arnaldo Carvalho de Melo,
Frederic Weisbecker
perf report segfaults while trying to handle callchains from a
non-callchain data file.
Instead of a segfault, print a useful message to the user.
Reported-by: Jens Axboe <jens.axboe@oracle.com>
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
tools/perf/builtin-report.c | 16 +++++++++++++---
1 files changed, 13 insertions(+), 3 deletions(-)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index fa937f5..9f9575a 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1684,9 +1684,19 @@ static int __cmd_report(void)
sample_type = perf_header__sample_type();
- if (sort__has_parent && !(sample_type & PERF_SAMPLE_CALLCHAIN)) {
- fprintf(stderr, "selected --sort parent, but no callchain data\n");
- exit(-1);
+ if (!(sample_type & PERF_SAMPLE_CALLCHAIN)) {
+ if (sort__has_parent) {
+ fprintf(stderr, "selected --sort parent, but no"
+ " callchain data. Did you call"
+ " perf record without -g?\n");
+ exit(-1);
+ }
+ if (callchain) {
+ fprintf(stderr, "selected -c but no callchain data."
+ " Did you call perf record without"
+ " -g?\n");
+ exit(-1);
+ }
}
if (load_kernel() < 0) {
--
1.6.2.3
* [PATCH 2/5] perf report: Use a modifiable string for default callchain options
2009-07-05 5:39 [PATCH 1/5] perf report: Warn on callchain output request from non-callchain file Frederic Weisbecker
@ 2009-07-05 5:39 ` Frederic Weisbecker
2009-07-05 9:52 ` [tip:perfcounters/urgent] " tip-bot for Frederic Weisbecker
2009-07-05 5:39 ` [PATCH 3/5] perf report: Change default callchain parameters Frederic Weisbecker
` (3 subsequent siblings)
4 siblings, 1 reply; 14+ messages in thread
From: Frederic Weisbecker @ 2009-07-05 5:39 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Peter Zijlstra, Mike Galbraith, Paul Mackerras,
Anton Blanchard, Jens Axboe, Arnaldo Carvalho de Melo,
Frederic Weisbecker
If the user doesn't provide options to tune the callchain output
(i.e. -c is used without arguments), then the default value passed
to the OPT_CALLBACK_DEFAULT() macro is used.
But that value is parsed later by strtok(), which replaces the comma
separators with NUL bytes. This may segfault because the default is a
read-only string literal.
Use a modifiable string instead, and also fix the "100%" default
minimum threshold by turning it into 0 (output every callchain),
as originally intended.
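As an illustration only (this is not part of the patch), the sketch below shows why the default has to live in writable storage: strtok() overwrites each separator it finds with a NUL byte. The struct chain_opt type and the parse_chain_opt() helper are hypothetical names made up for this example; they are not perf code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for the two parsed fields; not a perf type. */
struct chain_opt {
	char mode[16];
	double min_percent;
};

/*
 * Parse "mode,min_percent" in place. strtok() overwrites each ','
 * it finds with '\0', so buf must point to writable storage such as
 * a char array, never to a string literal.
 * Returns 0 on success, -1 on malformed input.
 */
static int parse_chain_opt(char *buf, struct chain_opt *out)
{
	char *tok, *endptr;

	assert(buf && out);

	tok = strtok(buf, ",");
	if (!tok || strlen(tok) >= sizeof(out->mode))
		return -1;
	strcpy(out->mode, tok);

	out->min_percent = 0.0;
	tok = strtok(NULL, ",");
	if (!tok)
		return 0;	/* the threshold is optional */

	out->min_percent = strtod(tok, &endptr);
	return tok == endptr ? -1 : 0;
}
```

Calling it on an array declared as char buf[] = "flat,0" is safe because the array is a private, writable copy of the literal; passing the literal "flat,0" directly would be undefined behaviour and typically crashes.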
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
tools/perf/builtin-report.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 9f9575a..3db99fd 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -58,6 +58,8 @@ static char *parent_pattern = default_parent_pattern;
static regex_t parent_regex;
static int exclude_other = 1;
+
+static char callchain_default_opt[] = "flat,0";
static int callchain;
static enum chain_mode callchain_mode;
static double callchain_min_percent = 0.0;
@@ -1871,7 +1873,7 @@ static const struct option options[] = {
"Only display entries with parent-match"),
OPT_CALLBACK_DEFAULT('c', "callchain", NULL, "output_type,min_percent",
"Display callchains using output_type and min percent threshold. "
- "Default: flat,0", &parse_callchain_opt, "flat,100"),
+ "Default: flat,0", &parse_callchain_opt, callchain_default_opt),
OPT_STRING('d', "dsos", &dso_list_str, "dso[,dso...]",
"only consider symbols in these dsos"),
OPT_STRING('C', "comms", &comm_list_str, "comm[,comm...]",
--
1.6.2.3
* [PATCH 3/5] perf report: Change default callchain parameters
2009-07-05 5:39 [PATCH 1/5] perf report: Warn on callchain output request from non-callchain file Frederic Weisbecker
2009-07-05 5:39 ` [PATCH 2/5] perf report: Use a modifiable string for default callchain options Frederic Weisbecker
@ 2009-07-05 5:39 ` Frederic Weisbecker
2009-07-05 9:52 ` [tip:perfcounters/urgent] " tip-bot for Frederic Weisbecker
2009-07-05 5:39 ` [PATCH 4/5] perf tools: callchains: Manage the cumul hits on the fly Frederic Weisbecker
` (2 subsequent siblings)
4 siblings, 1 reply; 14+ messages in thread
From: Frederic Weisbecker @ 2009-07-05 5:39 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Peter Zijlstra, Mike Galbraith, Paul Mackerras,
Anton Blanchard, Jens Axboe, Arnaldo Carvalho de Melo,
Frederic Weisbecker
The default callchain parameters are set to use the flat mode and never
filter backtraces by an overhead threshold.
But flat mode is boring compared to graph mode.
Also, the number of callchains may be very high if none are
filtered.
Let's change this to set the graph view and a minimum overhead of 0.5%
as default parameters.
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
tools/perf/builtin-report.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 3db99fd..8bd5865 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -59,10 +59,10 @@ static regex_t parent_regex;
static int exclude_other = 1;
-static char callchain_default_opt[] = "flat,0";
+static char callchain_default_opt[] = "graph,0.5";
static int callchain;
static enum chain_mode callchain_mode;
-static double callchain_min_percent = 0.0;
+static double callchain_min_percent = 0.5;
static u64 sample_type;
--
1.6.2.3
* [PATCH 4/5] perf tools: callchains: Manage the cumul hits on the fly
2009-07-05 5:39 [PATCH 1/5] perf report: Warn on callchain output request from non-callchain file Frederic Weisbecker
2009-07-05 5:39 ` [PATCH 2/5] perf report: Use a modifiable string for default callchain options Frederic Weisbecker
2009-07-05 5:39 ` [PATCH 3/5] perf report: Change default callchain parameters Frederic Weisbecker
@ 2009-07-05 5:39 ` Frederic Weisbecker
2009-07-05 9:52 ` [tip:perfcounters/urgent] perf_counter " tip-bot for Frederic Weisbecker
2009-07-05 5:39 ` [PATCH 5/5] perf report: Support callchains with relative overhead rate Frederic Weisbecker
2009-07-05 9:51 ` [tip:perfcounters/urgent] perf report: Warn on callchain output request from non-callchain file tip-bot for Frederic Weisbecker
4 siblings, 1 reply; 14+ messages in thread
From: Frederic Weisbecker @ 2009-07-05 5:39 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Peter Zijlstra, Mike Galbraith, Paul Mackerras,
Anton Blanchard, Jens Axboe, Arnaldo Carvalho de Melo,
Frederic Weisbecker
The cumul hits of a node are its own hits plus the hits of all its
children; they are required to compute the percentage of a branch.
These numbers are calculated during the sorting of the branches of
the callchain tree using a depth-first postorder traversal, so that
cumulative hits are propagated in the right order.
But if we plan to implement percentages relative to the parent rather
than absolute percentages (relative to the whole overhead), we need to
know the cumulative hits of the parent before processing the children,
because the relative minimum number of hits (i.e. the minimum
rate against the cumulative hits of the parent) is the basis for
filtering the children against a given rate.
So we need to maintain the cumul hits on the fly to prepare the
implementation of relative overhead rates.
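A minimal sketch of the idea, under simplifying assumptions: each node holds a single integer id instead of a list of ips, there is no node splitting, and children live in a small fixed array. struct node, find_or_add_child() and append_chain_sketch() are hypothetical names for this illustration and do not match the perf implementation.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_CHILDREN 8	/* arbitrary limit for this sketch */

/* Simplified stand-in for struct callchain_node: one id per node. */
struct node {
	int id;
	unsigned long hit;		/* chains ending exactly here */
	unsigned long cumul_hit;	/* hit + hits of all descendants */
	struct node *children[MAX_CHILDREN];
	int nr_children;
};

static struct node *find_or_add_child(struct node *parent, int id)
{
	struct node *child;
	int i;

	for (i = 0; i < parent->nr_children; i++)
		if (parent->children[i]->id == id)
			return parent->children[i];

	if (parent->nr_children == MAX_CHILDREN)
		return NULL;

	child = calloc(1, sizeof(*child));
	if (!child)
		return NULL;
	child->id = id;
	parent->children[parent->nr_children++] = child;
	return child;
}

/*
 * Record one chain, outermost caller first. Every node on the path
 * gets its cumul_hit bumped at insertion time; only the last node
 * gets a direct hit. On failure the counters along the partial path
 * are left incremented, which a real implementation would avoid.
 */
static int append_chain_sketch(struct node *root, const int *ids, int nr)
{
	struct node *cur = root;
	int i;

	assert(root && ids && nr > 0);

	root->cumul_hit++;
	for (i = 0; i < nr; i++) {
		cur = find_or_add_child(cur, ids[i]);
		if (!cur)
			return -1;
		cur->cumul_hit++;
	}
	cur->hit++;
	return 0;
}
```

With the counters kept current at insertion time, a later sort pass can read a parent's cumul_hit before it descends into the children, which is what a parent-relative threshold needs. The sketch never frees its nodes; that is left out for brevity.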
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
tools/perf/util/callchain.c | 12 ++++++++----
1 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index c9900fe..5d244af 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -74,13 +74,11 @@ static void __sort_chain_graph(struct callchain_node *node, u64 min_hit)
struct callchain_node *child;
node->rb_root = RB_ROOT;
- node->cumul_hit = node->hit;
chain_for_each_child(child, node) {
__sort_chain_graph(child, min_hit);
if (child->cumul_hit >= min_hit)
rb_insert_callchain(&node->rb_root, child, GRAPH);
- node->cumul_hit += child->cumul_hit;
}
}
@@ -159,7 +157,7 @@ add_child(struct callchain_node *parent, struct ip_callchain *chain,
new = create_child(parent, false);
fill_node(new, chain, start, syms);
- new->hit = 1;
+ new->cumul_hit = new->hit = 1;
}
/*
@@ -189,6 +187,7 @@ split_add_child(struct callchain_node *parent, struct ip_callchain *chain,
/* split the hits */
new->hit = parent->hit;
+ new->cumul_hit = parent->cumul_hit;
new->val_nr = parent->val_nr - idx_local;
parent->val_nr = idx_local;
@@ -216,10 +215,13 @@ __append_chain_children(struct callchain_node *root, struct ip_callchain *chain,
unsigned int ret = __append_chain(rnode, chain, start, syms);
if (!ret)
- return;
+ goto cumul;
}
/* nothing in children, add to the current node */
add_child(root, chain, start, syms);
+
+cumul:
+ root->cumul_hit++;
}
static int
@@ -261,6 +263,8 @@ __append_chain(struct callchain_node *root, struct ip_callchain *chain,
/* we match 100% of the path, increment the hit */
if (i - start == root->val_nr && i == chain->nr) {
root->hit++;
+ root->cumul_hit++;
+
return 0;
}
--
1.6.2.3
* [PATCH 5/5] perf report: Support callchains with relative overhead rate
2009-07-05 5:39 [PATCH 1/5] perf report: Warn on callchain output request from non-callchain file Frederic Weisbecker
` (2 preceding siblings ...)
2009-07-05 5:39 ` [PATCH 4/5] perf tools: callchains: Manage the cumul hits on the fly Frederic Weisbecker
@ 2009-07-05 5:39 ` Frederic Weisbecker
2009-07-05 8:34 ` Ingo Molnar
2009-07-05 9:52 ` [tip:perfcounters/urgent] perf report: Add "Fractal" mode output - support " tip-bot for Frederic Weisbecker
2009-07-05 9:51 ` [tip:perfcounters/urgent] perf report: Warn on callchain output request from non-callchain file tip-bot for Frederic Weisbecker
4 siblings, 2 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2009-07-05 5:39 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Peter Zijlstra, Mike Galbraith, Paul Mackerras,
Anton Blanchard, Jens Axboe, Arnaldo Carvalho de Melo,
Frederic Weisbecker
The current callchain displays the overhead rates as absolute:
relative to the total overhead.
This patch provides relative overhead percentages, in which each
branch of the callchain tree is an independent instrumented object.
You can produce such output by using the "relative" mode,
which can be abbreviated to r, re, rel, etc.
./perf report -s sym -c relative
Example:
8.46% [k] copy_user_generic_string
|
|--52.01%-- generic_file_aio_read
| do_sync_read
| vfs_read
| |
| |--97.20%-- sys_pread64
| | system_call_fastpath
| | pread64
| |
| --2.81%-- sys_read
| system_call_fastpath
| __read
|
|--39.85%-- generic_file_buffered_write
| __generic_file_aio_write_nolock
| generic_file_aio_write
| do_sync_write
| reiserfs_file_write
| vfs_write
| |
| |--97.05%-- sys_pwrite64
| | system_call_fastpath
| | __pwrite64
| |
| --2.95%-- sys_write
| system_call_fastpath
| __write_nocancel
[...]
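To make the difference between the two modes concrete, here is a small arithmetic sketch. It assumes that a branch's percentage is its cumulative hit count divided by a reference total: the total number of samples in absolute mode, and the parent's cumulative hits in relative mode. percent_of() and rel_min_hit() are hypothetical helper names for this example, not functions from the patch.

```c
#include <assert.h>

/* Share of `part` within `whole`, expressed as a percentage. */
static double percent_of(unsigned long part, unsigned long whole)
{
	assert(whole > 0);
	return part * 100.0 / whole;
}

/*
 * Smallest cumulative hit count a child needs in order to be shown
 * under a parent with parent_cumul hits, for a given relative
 * threshold in percent. The result is truncated toward zero.
 */
static unsigned long rel_min_hit(unsigned long parent_cumul,
				 double min_percent)
{
	return (unsigned long)(parent_cumul * min_percent / 100.0);
}
```

For instance, a child with 50 cumulative hits under a parent with 200, out of 1000 samples overall, is 5% of the total in absolute terms but 25% of its parent in relative terms. With a 0.5% relative threshold, that parent requires at least 1 hit from each child it displays.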
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
tools/perf/builtin-report.c | 58 ++++++++++++++++++++----------
tools/perf/util/callchain.c | 84 +++++++++++++++++++++++++++++++++++--------
tools/perf/util/callchain.h | 21 ++++++++---
3 files changed, 123 insertions(+), 40 deletions(-)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 8bd5865..ac9fb56 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -60,9 +60,14 @@ static regex_t parent_regex;
static int exclude_other = 1;
static char callchain_default_opt[] = "graph,0.5";
+
static int callchain;
-static enum chain_mode callchain_mode;
-static double callchain_min_percent = 0.5;
+
+static
+struct callchain_param callchain_param = {
+ .mode = CHAIN_GRAPH_ABS,
+ .min_percent = 0.5
+};
static u64 sample_type;
@@ -846,9 +851,15 @@ callchain__fprintf_graph(FILE *fp, struct callchain_node *self,
struct callchain_node *child;
struct callchain_list *chain;
int new_depth_mask = depth_mask;
+ u64 new_total;
size_t ret = 0;
int i;
+ if (callchain_param.mode == CHAIN_GRAPH_REL)
+ new_total = self->cumul_hit;
+ else
+ new_total = total_samples;
+
node = rb_first(&self->rb_root);
while (node) {
child = rb_entry(node, struct callchain_node, rb_node);
@@ -873,10 +884,10 @@ callchain__fprintf_graph(FILE *fp, struct callchain_node *self,
continue;
ret += ipchain__fprintf_graph(fp, chain, depth,
new_depth_mask, i++,
- total_samples,
+ new_total,
child->cumul_hit);
}
- ret += callchain__fprintf_graph(fp, child, total_samples,
+ ret += callchain__fprintf_graph(fp, child, new_total,
depth + 1,
new_depth_mask | (1 << depth));
node = next;
@@ -925,13 +936,18 @@ hist_entry_callchain__fprintf(FILE *fp, struct hist_entry *self,
chain = rb_entry(rb_node, struct callchain_node, rb_node);
percent = chain->hit * 100.0 / total_samples;
- if (callchain_mode == FLAT) {
+ switch (callchain_param.mode) {
+ case CHAIN_FLAT:
ret += percent_color_fprintf(fp, " %6.2f%%\n",
percent);
ret += callchain__fprintf_flat(fp, chain, total_samples);
- } else if (callchain_mode == GRAPH) {
+ break;
+ case CHAIN_GRAPH_ABS: /* Falldown */
+ case CHAIN_GRAPH_REL:
ret += callchain__fprintf_graph(fp, chain,
total_samples, 1, 1);
+ default:
+ break;
}
ret += fprintf(fp, "\n");
rb_node = rb_next(rb_node);
@@ -1219,14 +1235,9 @@ static void output__insert_entry(struct hist_entry *he, u64 min_callchain_hits)
struct rb_node *parent = NULL;
struct hist_entry *iter;
- if (callchain) {
- if (callchain_mode == FLAT)
- sort_chain_flat(&he->sorted_chain, &he->callchain,
- min_callchain_hits);
- else if (callchain_mode == GRAPH)
- sort_chain_graph(&he->sorted_chain, &he->callchain,
- min_callchain_hits);
- }
+ if (callchain)
+ callchain_param.sort(&he->sorted_chain, &he->callchain,
+ min_callchain_hits, &callchain_param);
while (*p != NULL) {
parent = *p;
@@ -1249,7 +1260,7 @@ static void output__resort(u64 total_samples)
struct rb_root *tree = &hist;
u64 min_callchain_hits;
- min_callchain_hits = total_samples * (callchain_min_percent / 100);
+ min_callchain_hits = total_samples * (callchain_param.min_percent / 100);
if (sort__need_collapse)
tree = &collapse_hists;
@@ -1829,22 +1840,31 @@ parse_callchain_opt(const struct option *opt __used, const char *arg,
/* get the output mode */
if (!strncmp(tok, "graph", strlen(arg)))
- callchain_mode = GRAPH;
+ callchain_param.mode = CHAIN_GRAPH_ABS;
else if (!strncmp(tok, "flat", strlen(arg)))
- callchain_mode = FLAT;
+ callchain_param.mode = CHAIN_FLAT;
+
+ else if (!strncmp(tok, "relative", strlen(arg)))
+ callchain_param.mode = CHAIN_GRAPH_REL;
+
else
return -1;
/* get the min percentage */
tok = strtok(NULL, ",");
if (!tok)
- return 0;
+ goto setup;
- callchain_min_percent = strtod(tok, &endptr);
+ callchain_param.min_percent = strtod(tok, &endptr);
if (tok == endptr)
return -1;
+setup:
+ if (register_callchain_param(&callchain_param) < 0) {
+ fprintf(stderr, "Can't register callchain params\n");
+ return -1;
+ }
return 0;
}
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 5d244af..9d3c814 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -32,13 +32,14 @@ rb_insert_callchain(struct rb_root *root, struct callchain_node *chain,
rnode = rb_entry(parent, struct callchain_node, rb_node);
switch (mode) {
- case FLAT:
+ case CHAIN_FLAT:
if (rnode->hit < chain->hit)
p = &(*p)->rb_left;
else
p = &(*p)->rb_right;
break;
- case GRAPH:
+ case CHAIN_GRAPH_ABS: /* Falldown */
+ case CHAIN_GRAPH_REL:
if (rnode->cumul_hit < chain->cumul_hit)
p = &(*p)->rb_left;
else
@@ -53,43 +54,96 @@ rb_insert_callchain(struct rb_root *root, struct callchain_node *chain,
rb_insert_color(&chain->rb_node, root);
}
+static void
+__sort_chain_flat(struct rb_root *rb_root, struct callchain_node *node,
+ u64 min_hit)
+{
+ struct callchain_node *child;
+
+ chain_for_each_child(child, node)
+ __sort_chain_flat(rb_root, child, min_hit);
+
+ if (node->hit && node->hit >= min_hit)
+ rb_insert_callchain(rb_root, node, CHAIN_FLAT);
+}
+
/*
* Once we get every callchains from the stream, we can now
* sort them by hit
*/
-void sort_chain_flat(struct rb_root *rb_root, struct callchain_node *node,
- u64 min_hit)
+static void
+sort_chain_flat(struct rb_root *rb_root, struct callchain_node *node,
+ u64 min_hit, struct callchain_param *param __used)
+{
+ __sort_chain_flat(rb_root, node, min_hit);
+}
+
+static void __sort_chain_graph_abs(struct callchain_node *node,
+ u64 min_hit)
{
struct callchain_node *child;
- chain_for_each_child(child, node)
- sort_chain_flat(rb_root, child, min_hit);
+ node->rb_root = RB_ROOT;
- if (node->hit && node->hit >= min_hit)
- rb_insert_callchain(rb_root, node, FLAT);
+ chain_for_each_child(child, node) {
+ __sort_chain_graph_abs(child, min_hit);
+ if (child->cumul_hit >= min_hit)
+ rb_insert_callchain(&node->rb_root, child,
+ CHAIN_GRAPH_ABS);
+ }
+}
+
+static void
+sort_chain_graph_abs(struct rb_root *rb_root, struct callchain_node *chain_root,
+ u64 min_hit, struct callchain_param *param __used)
+{
+ __sort_chain_graph_abs(chain_root, min_hit);
+ rb_root->rb_node = chain_root->rb_root.rb_node;
}
-static void __sort_chain_graph(struct callchain_node *node, u64 min_hit)
+static void __sort_chain_graph_rel(struct callchain_node *node,
+ double min_percent)
{
struct callchain_node *child;
+ u64 min_hit;
node->rb_root = RB_ROOT;
+ min_hit = node->cumul_hit * min_percent / 100.0;
chain_for_each_child(child, node) {
- __sort_chain_graph(child, min_hit);
+ __sort_chain_graph_rel(child, min_percent);
if (child->cumul_hit >= min_hit)
- rb_insert_callchain(&node->rb_root, child, GRAPH);
+ rb_insert_callchain(&node->rb_root, child,
+ CHAIN_GRAPH_REL);
}
}
-void
-sort_chain_graph(struct rb_root *rb_root, struct callchain_node *chain_root,
- u64 min_hit)
+static void
+sort_chain_graph_rel(struct rb_root *rb_root, struct callchain_node *chain_root,
+ u64 min_hit __used, struct callchain_param *param)
{
- __sort_chain_graph(chain_root, min_hit);
+ __sort_chain_graph_rel(chain_root, param->min_percent);
rb_root->rb_node = chain_root->rb_root.rb_node;
}
+int register_callchain_param(struct callchain_param *param)
+{
+ switch (param->mode) {
+ case CHAIN_GRAPH_ABS:
+ param->sort = sort_chain_graph_abs;
+ break;
+ case CHAIN_GRAPH_REL:
+ param->sort = sort_chain_graph_rel;
+ break;
+ case CHAIN_FLAT:
+ param->sort = sort_chain_flat;
+ break;
+ default:
+ return -1;
+ }
+ return 0;
+}
+
/*
* Create a child for a parent. If inherit_children, then the new child
* will become the new parent of it's parent children
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index f3e4776..7812122 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -7,8 +7,9 @@
#include "symbol.h"
enum chain_mode {
- FLAT,
- GRAPH
+ CHAIN_FLAT,
+ CHAIN_GRAPH_ABS,
+ CHAIN_GRAPH_REL
};
struct callchain_node {
@@ -23,6 +24,17 @@ struct callchain_node {
u64 cumul_hit; /* hit + hits of children */
};
+struct callchain_param;
+
+typedef void (*sort_chain_func_t)(struct rb_root *, struct callchain_node *,
+ u64, struct callchain_param *);
+
+struct callchain_param {
+ enum chain_mode mode;
+ double min_percent;
+ sort_chain_func_t sort;
+};
+
struct callchain_list {
u64 ip;
struct symbol *sym;
@@ -36,10 +48,7 @@ static inline void callchain_init(struct callchain_node *node)
INIT_LIST_HEAD(&node->val);
}
+int register_callchain_param(struct callchain_param *param);
void append_chain(struct callchain_node *root, struct ip_callchain *chain,
struct symbol **syms);
-void sort_chain_flat(struct rb_root *rb_root, struct callchain_node *node,
- u64 min_hit);
-void sort_chain_graph(struct rb_root *rb_root, struct callchain_node *node,
- u64 min_hit);
#endif
--
1.6.2.3
* Re: [PATCH 5/5] perf report: Support callchains with relative overhead rate
2009-07-05 5:39 ` [PATCH 5/5] perf report: Support callchains with relative overhead rate Frederic Weisbecker
@ 2009-07-05 8:34 ` Ingo Molnar
2009-07-05 8:59 ` Ingo Molnar
2009-07-05 13:19 ` Frederic Weisbecker
2009-07-05 9:52 ` [tip:perfcounters/urgent] perf report: Add "Fractal" mode output - support " tip-bot for Frederic Weisbecker
1 sibling, 2 replies; 14+ messages in thread
From: Ingo Molnar @ 2009-07-05 8:34 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: LKML, Peter Zijlstra, Mike Galbraith, Paul Mackerras,
Anton Blanchard, Jens Axboe, Arnaldo Carvalho de Melo
* Frederic Weisbecker <fweisbec@gmail.com> wrote:
> The current callchain displays the overhead rates as absolute:
> relative to the total overhead.
>
> This patch provides relative overhead percentages, in which each
> branch of the callchain tree is an independent instrumented object.
>
> You can produce such output by using the "relative" mode,
> which can be abbreviated to r, re, rel, etc.
>
> ./perf report -s sym -c relative
>
> Example:
>
> 8.46% [k] copy_user_generic_string
> |
> |--52.01%-- generic_file_aio_read
> | do_sync_read
> | vfs_read
> | |
> | |--97.20%-- sys_pread64
> | | system_call_fastpath
> | | pread64
> | |
> | --2.81%-- sys_read
> | system_call_fastpath
> | __read
> |
> |--39.85%-- generic_file_buffered_write
> | __generic_file_aio_write_nolock
> | generic_file_aio_write
> | do_sync_write
> | reiserfs_file_write
> | vfs_write
> | |
> | |--97.05%-- sys_pwrite64
> | | system_call_fastpath
> | | __pwrite64
> | |
> | --2.95%-- sys_write
> | system_call_fastpath
> | __write_nocancel
> [...]
Wow, this is extremely intuitive and powerful looking!
It's basically a fractal structure: each sub-graph looks like a
full-blown profile in itself. Thus the overhead of individual
components of the graph profile can be analyzed without having to
think in small numbers.
The above example shows it particularly well - it shows that in
regard to generic_file_buffered_write() overhead, the system is
doing 97% sys_pwrite64() calls and 3% sys_write() calls.
Thus i took the liberty to change your last patch in two ways: i
renamed 'relative' to 'fractal' (it was not a proper counterpart to
'graph' anyway - we have no 'absolute' output mode name either), and
i changed it to be the default output mode. This stuff rocks!
absolute-graph and flat mode can be displayed too, via the option,
as usual.
Ingo
* Re: [PATCH 5/5] perf report: Support callchains with relative overhead rate
2009-07-05 8:34 ` Ingo Molnar
@ 2009-07-05 8:59 ` Ingo Molnar
2009-07-05 13:23 ` Frederic Weisbecker
2009-07-05 13:19 ` Frederic Weisbecker
1 sibling, 1 reply; 14+ messages in thread
From: Ingo Molnar @ 2009-07-05 8:59 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: LKML, Peter Zijlstra, Mike Galbraith, Paul Mackerras,
Anton Blanchard, Jens Axboe, Arnaldo Carvalho de Melo
btw., i get some buggy looking output with:
$ perf record -f -g ~/hackbench 10
$ perf report -c
|--5.11%-- unix_stream_sendmsg
| |
| |--100.00%-- __sock_sendmsg
| | sock_aio_write
| | do_sync_write
| | vfs_write
| | sys_write
| | sysenter_dispatch
| | 0xf7f72430
| | 0xffebbca000000014
| |
| --11.11%-- sock_aio_write
| do_sync_write
| vfs_write
| sys_write
| sysenter_dispatch
| 0xf7f72430
| 0xffebbca000000014
Those percentages don't sum up to 100% :-)
Another detail: i think we should signal when we crop the output due
to the filter, via a line of:
| [...]
or so.
Plus, when doing 'perf report' on a call-chain recording, shouldn't
we auto-detect this fact and default to fractal output
automatically, instead of flat mode?
User can still force flat mode via 'perf report -c flat'.
Ingo
* [tip:perfcounters/urgent] perf report: Warn on callchain output request from non-callchain file
2009-07-05 5:39 [PATCH 1/5] perf report: Warn on callchain output request from non-callchain file Frederic Weisbecker
` (3 preceding siblings ...)
2009-07-05 5:39 ` [PATCH 5/5] perf report: Support callchains with relative overhead rate Frederic Weisbecker
@ 2009-07-05 9:51 ` tip-bot for Frederic Weisbecker
4 siblings, 0 replies; 14+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2009-07-05 9:51 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, anton, paulus, acme, hpa, mingo, a.p.zijlstra,
efault, jens.axboe, fweisbec, tglx, mingo
Commit-ID: 91b4eaea93f5be95f4477554399680a53aff2343
Gitweb: http://git.kernel.org/tip/91b4eaea93f5be95f4477554399680a53aff2343
Author: Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Sun, 5 Jul 2009 07:39:17 +0200
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Sun, 5 Jul 2009 10:30:21 +0200
perf report: Warn on callchain output request from non-callchain file
perf report segfaults while trying to handle callchains from a
non-callchain data file.
Instead of a segfault, print a useful message to the user.
Reported-by: Jens Axboe <jens.axboe@oracle.com>
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246772361-9960-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
tools/perf/builtin-report.c | 16 +++++++++++++---
1 files changed, 13 insertions(+), 3 deletions(-)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index fa937f5..9f9575a 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1684,9 +1684,19 @@ static int __cmd_report(void)
sample_type = perf_header__sample_type();
- if (sort__has_parent && !(sample_type & PERF_SAMPLE_CALLCHAIN)) {
- fprintf(stderr, "selected --sort parent, but no callchain data\n");
- exit(-1);
+ if (!(sample_type & PERF_SAMPLE_CALLCHAIN)) {
+ if (sort__has_parent) {
+ fprintf(stderr, "selected --sort parent, but no"
+ " callchain data. Did you call"
+ " perf record without -g?\n");
+ exit(-1);
+ }
+ if (callchain) {
+ fprintf(stderr, "selected -c but no callchain data."
+ " Did you call perf record without"
+ " -g?\n");
+ exit(-1);
+ }
}
if (load_kernel() < 0) {
* [tip:perfcounters/urgent] perf report: Use a modifiable string for default callchain options
2009-07-05 5:39 ` [PATCH 2/5] perf report: Use a modifiable string for default callchain options Frederic Weisbecker
@ 2009-07-05 9:52 ` tip-bot for Frederic Weisbecker
0 siblings, 0 replies; 14+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2009-07-05 9:52 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, acme, anton, paulus, hpa, mingo, a.p.zijlstra,
efault, jens.axboe, fweisbec, tglx, mingo
Commit-ID: be9038859e56f729cc9d3b070a35fb8829a73696
Gitweb: http://git.kernel.org/tip/be9038859e56f729cc9d3b070a35fb8829a73696
Author: Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Sun, 5 Jul 2009 07:39:18 +0200
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Sun, 5 Jul 2009 10:30:21 +0200
perf report: Use a modifiable string for default callchain options
If the user doesn't provide options to tune the callchain output
(i.e. -c is used without arguments), then the default value passed
to the OPT_CALLBACK_DEFAULT() macro is used.
But that value is parsed later by strtok(), which replaces the comma
separators with NUL bytes. This may segfault because the default is a
read-only string literal.
Use a modifiable string instead, and also fix the "100%" default
minimum threshold by turning it into 0 (output every callchain),
as originally intended.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246772361-9960-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
tools/perf/builtin-report.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 9f9575a..3db99fd 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -58,6 +58,8 @@ static char *parent_pattern = default_parent_pattern;
static regex_t parent_regex;
static int exclude_other = 1;
+
+static char callchain_default_opt[] = "flat,0";
static int callchain;
static enum chain_mode callchain_mode;
static double callchain_min_percent = 0.0;
@@ -1871,7 +1873,7 @@ static const struct option options[] = {
"Only display entries with parent-match"),
OPT_CALLBACK_DEFAULT('c', "callchain", NULL, "output_type,min_percent",
"Display callchains using output_type and min percent threshold. "
- "Default: flat,0", &parse_callchain_opt, "flat,100"),
+ "Default: flat,0", &parse_callchain_opt, callchain_default_opt),
OPT_STRING('d', "dsos", &dso_list_str, "dso[,dso...]",
"only consider symbols in these dsos"),
OPT_STRING('C', "comms", &comm_list_str, "comm[,comm...]",
* [tip:perfcounters/urgent] perf report: Change default callchain parameters
2009-07-05 5:39 ` [PATCH 3/5] perf report: Change default callchain parameters Frederic Weisbecker
@ 2009-07-05 9:52 ` tip-bot for Frederic Weisbecker
0 siblings, 0 replies; 14+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2009-07-05 9:52 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, acme, anton, paulus, hpa, mingo, a.p.zijlstra,
efault, jens.axboe, fweisbec, tglx, mingo
Commit-ID: 94a8eb028a57854157a936c7e66b09e2559f115a
Gitweb: http://git.kernel.org/tip/94a8eb028a57854157a936c7e66b09e2559f115a
Author: Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Sun, 5 Jul 2009 07:39:19 +0200
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Sun, 5 Jul 2009 10:30:22 +0200
perf report: Change default callchain parameters
The default callchain parameters are set to use the flat mode and never
filter backtraces by an overhead threshold.
But flat mode is boring compared to graph mode.
Also, the number of callchains may be very high if none are
filtered.
Let's change this to set the graph view and a minimum overhead of 0.5%
as default parameters.
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246772361-9960-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
tools/perf/builtin-report.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 3db99fd..8bd5865 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -59,10 +59,10 @@ static regex_t parent_regex;
static int exclude_other = 1;
-static char callchain_default_opt[] = "flat,0";
+static char callchain_default_opt[] = "graph,0.5";
static int callchain;
static enum chain_mode callchain_mode;
-static double callchain_min_percent = 0.0;
+static double callchain_min_percent = 0.5;
static u64 sample_type;
* [tip:perfcounters/urgent] perf_counter tools: callchains: Manage the cumul hits on the fly
2009-07-05 5:39 ` [PATCH 4/5] perf tools: callchains: Manage the cumul hits on the fly Frederic Weisbecker
@ 2009-07-05 9:52 ` tip-bot for Frederic Weisbecker
0 siblings, 0 replies; 14+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2009-07-05 9:52 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, acme, anton, paulus, hpa, mingo, a.p.zijlstra,
efault, jens.axboe, fweisbec, tglx, mingo
Commit-ID: e05b876c222178bc6abcfa9f23d8311731691046
Gitweb: http://git.kernel.org/tip/e05b876c222178bc6abcfa9f23d8311731691046
Author: Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Sun, 5 Jul 2009 07:39:20 +0200
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Sun, 5 Jul 2009 10:30:22 +0200
perf_counter tools: callchains: Manage the cumul hits on the fly
The cumul hits of a node are its own hits plus the hits of all of its
children; they are required to compute the percentage of a branch.
These numbers are calculated during the sorting of the branches of
the callchain tree using a depth-first postfix traversal, so that
cumulative hits are propagated in the right order.
But if we plan to implement percentages relative to the parent and not
absolute percentages (relative to the whole overhead), we need to know
the cumulative hits of the parent before computing the children,
because the relative minimum acceptable number of entries (i.e. the minimum
rate against the cumulative hits of the parent) is the basis for
filtering the children against a given rate.
So we need to handle the cumul hits on the fly to prepare the
implementation of relative overhead rates.
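To illustrate the idea of keeping cumulative hits up to date at insertion time, here is a minimal sketch. It is not the perf implementation: `struct node`, `find_or_add_child()` and `record_path()` are hypothetical names, each node holds a single symbol, and children live in a small fixed array instead of perf's lists and rbtrees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CHILDREN 8

/* Hypothetical, simplified stand-in for a callchain tree node. */
struct node {
	const char *sym;              /* symbol name for this frame */
	unsigned long hit;            /* samples that end exactly here */
	unsigned long cumul_hit;      /* hit + hits of all descendants */
	struct node *child[MAX_CHILDREN];
	size_t nr_children;
};

/* Return the child of parent named sym, creating it if needed. */
static struct node *find_or_add_child(struct node *parent, const char *sym)
{
	struct node *n;
	size_t i;

	for (i = 0; i < parent->nr_children; i++)
		if (!strcmp(parent->child[i]->sym, sym))
			return parent->child[i];

	assert(parent->nr_children < MAX_CHILDREN);
	n = calloc(1, sizeof(*n));
	if (!n)
		return NULL;
	n->sym = sym;
	parent->child[parent->nr_children++] = n;
	return n;
}

/*
 * Record one sampled call path below root. Every node on the path has
 * its cumul_hit bumped immediately; only the last node gets a direct hit.
 * Returns 0 on success, -1 on allocation failure (in which case the
 * counters already bumped are left as they are; a real tool would undo
 * or abort).
 */
static int record_path(struct node *root, const char **path, size_t len)
{
	struct node *cur = root;
	size_t i;

	root->cumul_hit++;
	for (i = 0; i < len; i++) {
		cur = find_or_add_child(cur, path[i]);
		if (!cur)
			return -1;
		cur->cumul_hit++;
	}
	cur->hit++;
	return 0;
}
```

With this bookkeeping, a parent's `cumul_hit` is already known when its children are sorted, so no separate postfix pass is needed to compute it.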
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246772361-9960-4-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
tools/perf/util/callchain.c | 12 ++++++++----
1 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index c9900fe..5d244af 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -74,13 +74,11 @@ static void __sort_chain_graph(struct callchain_node *node, u64 min_hit)
struct callchain_node *child;
node->rb_root = RB_ROOT;
- node->cumul_hit = node->hit;
chain_for_each_child(child, node) {
__sort_chain_graph(child, min_hit);
if (child->cumul_hit >= min_hit)
rb_insert_callchain(&node->rb_root, child, GRAPH);
- node->cumul_hit += child->cumul_hit;
}
}
@@ -159,7 +157,7 @@ add_child(struct callchain_node *parent, struct ip_callchain *chain,
new = create_child(parent, false);
fill_node(new, chain, start, syms);
- new->hit = 1;
+ new->cumul_hit = new->hit = 1;
}
/*
@@ -189,6 +187,7 @@ split_add_child(struct callchain_node *parent, struct ip_callchain *chain,
/* split the hits */
new->hit = parent->hit;
+ new->cumul_hit = parent->cumul_hit;
new->val_nr = parent->val_nr - idx_local;
parent->val_nr = idx_local;
@@ -216,10 +215,13 @@ __append_chain_children(struct callchain_node *root, struct ip_callchain *chain,
unsigned int ret = __append_chain(rnode, chain, start, syms);
if (!ret)
- return;
+ goto cumul;
}
/* nothing in children, add to the current node */
add_child(root, chain, start, syms);
+
+cumul:
+ root->cumul_hit++;
}
static int
@@ -261,6 +263,8 @@ __append_chain(struct callchain_node *root, struct ip_callchain *chain,
/* we match 100% of the path, increment the hit */
if (i - start == root->val_nr && i == chain->nr) {
root->hit++;
+ root->cumul_hit++;
+
return 0;
}
* [tip:perfcounters/urgent] perf report: Add "Fractal" mode output - support callchains with relative overhead rate
2009-07-05 5:39 ` [PATCH 5/5] perf report: Support callchains with relative overhead rate Frederic Weisbecker
2009-07-05 8:34 ` Ingo Molnar
@ 2009-07-05 9:52 ` tip-bot for Frederic Weisbecker
1 sibling, 0 replies; 14+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2009-07-05 9:52 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, acme, anton, paulus, hpa, mingo, a.p.zijlstra,
efault, jens.axboe, fweisbec, tglx, mingo
Commit-ID: 805d127d62472f17c7d79baa001a7651afe2fa47
Gitweb: http://git.kernel.org/tip/805d127d62472f17c7d79baa001a7651afe2fa47
Author: Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Sun, 5 Jul 2009 07:39:21 +0200
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Sun, 5 Jul 2009 10:30:23 +0200
perf report: Add "Fractal" mode output - support callchains with relative overhead rate
The current callchain displays the overhead rates as absolute:
relative to the total overhead.
This patch provides relative overhead percentages, in which each
branch of the callchain tree is an independent instrumented object.
This provides a 'fractal' view of the call-chain profile: each
sub-graph looks like a profile in itself - relative to its parent.
You can produce such output by using the "fractal" mode
that you can abbreviate via fr, fra, frac, etc. (a bare "f" is matched
as "flat" first).
./perf report -s sym -c fractal
Example:
8.46% [k] copy_user_generic_string
|
|--52.01%-- generic_file_aio_read
| do_sync_read
| vfs_read
| |
| |--97.20%-- sys_pread64
| | system_call_fastpath
| | pread64
| |
| --2.81%-- sys_read
| system_call_fastpath
| __read
|
|--39.85%-- generic_file_buffered_write
| __generic_file_aio_write_nolock
| generic_file_aio_write
| do_sync_write
| reiserfs_file_write
| vfs_write
| |
| |--97.05%-- sys_pwrite64
| | system_call_fastpath
| | __pwrite64
| |
| --2.95%-- sys_write
| system_call_fastpath
| __write_nocancel
[...]
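The arithmetic behind the two views can be sketched as follows. This is only an illustration under stated assumptions: the helper names `branch_percent()`, `relative_min_hit()` and `chain_to_absolute()` are hypothetical and do not exist in perf. The first divides a child's cumulative hits by whichever total applies (all samples in absolute mode, the parent's cumulative hits in fractal mode); the second mirrors the per-parent threshold computed in the relative sort.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Percentage of a branch against a total. Pass the total sample count
 * for the absolute view, or the parent's cumul_hit for the fractal view.
 */
static double branch_percent(unsigned long child_cumul_hit,
			     unsigned long total)
{
	assert(total > 0);
	return child_cumul_hit * 100.0 / total;
}

/*
 * Smallest cumul_hit a child needs to be kept under a given parent when
 * filtering by a relative rate (truncated toward zero).
 */
static unsigned long relative_min_hit(unsigned long parent_cumul_hit,
				      double min_percent)
{
	return (unsigned long)(parent_cumul_hit * min_percent / 100.0);
}

/*
 * Convert a chain of nested relative percentages (outermost first) back
 * to one absolute percentage by multiplying the fractions together.
 */
static double chain_to_absolute(const double *percents, size_t n)
{
	double result = 100.0;
	size_t i;

	for (i = 0; i < n; i++)
		result = result * percents[i] / 100.0;
	return result;
}
```

Applied to the example above, the sys_pread64 branch at 97.20% of 52.01% of 8.46% corresponds to roughly 4.28% of the whole profile, which is the kind of small number the fractal view avoids showing.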
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246772361-9960-5-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
tools/perf/builtin-report.c | 60 ++++++++++++++++++++----------
tools/perf/util/callchain.c | 84 +++++++++++++++++++++++++++++++++++--------
tools/perf/util/callchain.h | 21 ++++++++---
3 files changed, 124 insertions(+), 41 deletions(-)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 8bd5865..4e5cc26 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -59,10 +59,15 @@ static regex_t parent_regex;
static int exclude_other = 1;
-static char callchain_default_opt[] = "graph,0.5";
+static char callchain_default_opt[] = "fractal,0.5";
+
static int callchain;
-static enum chain_mode callchain_mode;
-static double callchain_min_percent = 0.5;
+
+static
+struct callchain_param callchain_param = {
+ .mode = CHAIN_GRAPH_ABS,
+ .min_percent = 0.5
+};
static u64 sample_type;
@@ -846,9 +851,15 @@ callchain__fprintf_graph(FILE *fp, struct callchain_node *self,
struct callchain_node *child;
struct callchain_list *chain;
int new_depth_mask = depth_mask;
+ u64 new_total;
size_t ret = 0;
int i;
+ if (callchain_param.mode == CHAIN_GRAPH_REL)
+ new_total = self->cumul_hit;
+ else
+ new_total = total_samples;
+
node = rb_first(&self->rb_root);
while (node) {
child = rb_entry(node, struct callchain_node, rb_node);
@@ -873,10 +884,10 @@ callchain__fprintf_graph(FILE *fp, struct callchain_node *self,
continue;
ret += ipchain__fprintf_graph(fp, chain, depth,
new_depth_mask, i++,
- total_samples,
+ new_total,
child->cumul_hit);
}
- ret += callchain__fprintf_graph(fp, child, total_samples,
+ ret += callchain__fprintf_graph(fp, child, new_total,
depth + 1,
new_depth_mask | (1 << depth));
node = next;
@@ -925,13 +936,18 @@ hist_entry_callchain__fprintf(FILE *fp, struct hist_entry *self,
chain = rb_entry(rb_node, struct callchain_node, rb_node);
percent = chain->hit * 100.0 / total_samples;
- if (callchain_mode == FLAT) {
+ switch (callchain_param.mode) {
+ case CHAIN_FLAT:
ret += percent_color_fprintf(fp, " %6.2f%%\n",
percent);
ret += callchain__fprintf_flat(fp, chain, total_samples);
- } else if (callchain_mode == GRAPH) {
+ break;
+ case CHAIN_GRAPH_ABS: /* Falldown */
+ case CHAIN_GRAPH_REL:
ret += callchain__fprintf_graph(fp, chain,
total_samples, 1, 1);
+ default:
+ break;
}
ret += fprintf(fp, "\n");
rb_node = rb_next(rb_node);
@@ -1219,14 +1235,9 @@ static void output__insert_entry(struct hist_entry *he, u64 min_callchain_hits)
struct rb_node *parent = NULL;
struct hist_entry *iter;
- if (callchain) {
- if (callchain_mode == FLAT)
- sort_chain_flat(&he->sorted_chain, &he->callchain,
- min_callchain_hits);
- else if (callchain_mode == GRAPH)
- sort_chain_graph(&he->sorted_chain, &he->callchain,
- min_callchain_hits);
- }
+ if (callchain)
+ callchain_param.sort(&he->sorted_chain, &he->callchain,
+ min_callchain_hits, &callchain_param);
while (*p != NULL) {
parent = *p;
@@ -1249,7 +1260,7 @@ static void output__resort(u64 total_samples)
struct rb_root *tree = &hist;
u64 min_callchain_hits;
- min_callchain_hits = total_samples * (callchain_min_percent / 100);
+ min_callchain_hits = total_samples * (callchain_param.min_percent / 100);
if (sort__need_collapse)
tree = &collapse_hists;
@@ -1829,22 +1840,31 @@ parse_callchain_opt(const struct option *opt __used, const char *arg,
/* get the output mode */
if (!strncmp(tok, "graph", strlen(arg)))
- callchain_mode = GRAPH;
+ callchain_param.mode = CHAIN_GRAPH_ABS;
else if (!strncmp(tok, "flat", strlen(arg)))
- callchain_mode = FLAT;
+ callchain_param.mode = CHAIN_FLAT;
+
+ else if (!strncmp(tok, "fractal", strlen(arg)))
+ callchain_param.mode = CHAIN_GRAPH_REL;
+
else
return -1;
/* get the min percentage */
tok = strtok(NULL, ",");
if (!tok)
- return 0;
+ goto setup;
- callchain_min_percent = strtod(tok, &endptr);
+ callchain_param.min_percent = strtod(tok, &endptr);
if (tok == endptr)
return -1;
+setup:
+ if (register_callchain_param(&callchain_param) < 0) {
+ fprintf(stderr, "Can't register callchain params\n");
+ return -1;
+ }
return 0;
}
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 5d244af..9d3c814 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -32,13 +32,14 @@ rb_insert_callchain(struct rb_root *root, struct callchain_node *chain,
rnode = rb_entry(parent, struct callchain_node, rb_node);
switch (mode) {
- case FLAT:
+ case CHAIN_FLAT:
if (rnode->hit < chain->hit)
p = &(*p)->rb_left;
else
p = &(*p)->rb_right;
break;
- case GRAPH:
+ case CHAIN_GRAPH_ABS: /* Falldown */
+ case CHAIN_GRAPH_REL:
if (rnode->cumul_hit < chain->cumul_hit)
p = &(*p)->rb_left;
else
@@ -53,43 +54,96 @@ rb_insert_callchain(struct rb_root *root, struct callchain_node *chain,
rb_insert_color(&chain->rb_node, root);
}
+static void
+__sort_chain_flat(struct rb_root *rb_root, struct callchain_node *node,
+ u64 min_hit)
+{
+ struct callchain_node *child;
+
+ chain_for_each_child(child, node)
+ __sort_chain_flat(rb_root, child, min_hit);
+
+ if (node->hit && node->hit >= min_hit)
+ rb_insert_callchain(rb_root, node, CHAIN_FLAT);
+}
+
/*
* Once we get every callchains from the stream, we can now
* sort them by hit
*/
-void sort_chain_flat(struct rb_root *rb_root, struct callchain_node *node,
- u64 min_hit)
+static void
+sort_chain_flat(struct rb_root *rb_root, struct callchain_node *node,
+ u64 min_hit, struct callchain_param *param __used)
+{
+ __sort_chain_flat(rb_root, node, min_hit);
+}
+
+static void __sort_chain_graph_abs(struct callchain_node *node,
+ u64 min_hit)
{
struct callchain_node *child;
- chain_for_each_child(child, node)
- sort_chain_flat(rb_root, child, min_hit);
+ node->rb_root = RB_ROOT;
- if (node->hit && node->hit >= min_hit)
- rb_insert_callchain(rb_root, node, FLAT);
+ chain_for_each_child(child, node) {
+ __sort_chain_graph_abs(child, min_hit);
+ if (child->cumul_hit >= min_hit)
+ rb_insert_callchain(&node->rb_root, child,
+ CHAIN_GRAPH_ABS);
+ }
+}
+
+static void
+sort_chain_graph_abs(struct rb_root *rb_root, struct callchain_node *chain_root,
+ u64 min_hit, struct callchain_param *param __used)
+{
+ __sort_chain_graph_abs(chain_root, min_hit);
+ rb_root->rb_node = chain_root->rb_root.rb_node;
}
-static void __sort_chain_graph(struct callchain_node *node, u64 min_hit)
+static void __sort_chain_graph_rel(struct callchain_node *node,
+ double min_percent)
{
struct callchain_node *child;
+ u64 min_hit;
node->rb_root = RB_ROOT;
+ min_hit = node->cumul_hit * min_percent / 100.0;
chain_for_each_child(child, node) {
- __sort_chain_graph(child, min_hit);
+ __sort_chain_graph_rel(child, min_percent);
if (child->cumul_hit >= min_hit)
- rb_insert_callchain(&node->rb_root, child, GRAPH);
+ rb_insert_callchain(&node->rb_root, child,
+ CHAIN_GRAPH_REL);
}
}
-void
-sort_chain_graph(struct rb_root *rb_root, struct callchain_node *chain_root,
- u64 min_hit)
+static void
+sort_chain_graph_rel(struct rb_root *rb_root, struct callchain_node *chain_root,
+ u64 min_hit __used, struct callchain_param *param)
{
- __sort_chain_graph(chain_root, min_hit);
+ __sort_chain_graph_rel(chain_root, param->min_percent);
rb_root->rb_node = chain_root->rb_root.rb_node;
}
+int register_callchain_param(struct callchain_param *param)
+{
+ switch (param->mode) {
+ case CHAIN_GRAPH_ABS:
+ param->sort = sort_chain_graph_abs;
+ break;
+ case CHAIN_GRAPH_REL:
+ param->sort = sort_chain_graph_rel;
+ break;
+ case CHAIN_FLAT:
+ param->sort = sort_chain_flat;
+ break;
+ default:
+ return -1;
+ }
+ return 0;
+}
+
/*
* Create a child for a parent. If inherit_children, then the new child
* will become the new parent of it's parent children
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index f3e4776..7812122 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -7,8 +7,9 @@
#include "symbol.h"
enum chain_mode {
- FLAT,
- GRAPH
+ CHAIN_FLAT,
+ CHAIN_GRAPH_ABS,
+ CHAIN_GRAPH_REL
};
struct callchain_node {
@@ -23,6 +24,17 @@ struct callchain_node {
u64 cumul_hit; /* hit + hits of children */
};
+struct callchain_param;
+
+typedef void (*sort_chain_func_t)(struct rb_root *, struct callchain_node *,
+ u64, struct callchain_param *);
+
+struct callchain_param {
+ enum chain_mode mode;
+ double min_percent;
+ sort_chain_func_t sort;
+};
+
struct callchain_list {
u64 ip;
struct symbol *sym;
@@ -36,10 +48,7 @@ static inline void callchain_init(struct callchain_node *node)
INIT_LIST_HEAD(&node->val);
}
+int register_callchain_param(struct callchain_param *param);
void append_chain(struct callchain_node *root, struct ip_callchain *chain,
struct symbol **syms);
-void sort_chain_flat(struct rb_root *rb_root, struct callchain_node *node,
- u64 min_hit);
-void sort_chain_graph(struct rb_root *rb_root, struct callchain_node *node,
- u64 min_hit);
#endif
* Re: [PATCH 5/5] perf report: Support callchains with relative overhead rate
2009-07-05 8:34 ` Ingo Molnar
2009-07-05 8:59 ` Ingo Molnar
@ 2009-07-05 13:19 ` Frederic Weisbecker
1 sibling, 0 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2009-07-05 13:19 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Peter Zijlstra, Mike Galbraith, Paul Mackerras,
Anton Blanchard, Jens Axboe, Arnaldo Carvalho de Melo
On Sun, Jul 05, 2009 at 10:34:00AM +0200, Ingo Molnar wrote:
>
> * Frederic Weisbecker <fweisbec@gmail.com> wrote:
>
> > The current callchain displays the overhead rates as absolute:
> > relative to the total overhead.
> >
> > This patch provides relative overhead percentage, in which each
> > branch of the callchain tree is an independent instrumented object.
> >
> > You can produce such output by using the "relative" mode
> > that you can lower in r, re, rel, etc...
> >
> > ./perf report -s sym -c relative
> >
> > Example:
> >
> > 8.46% [k] copy_user_generic_string
> > |
> > |--52.01%-- generic_file_aio_read
> > | do_sync_read
> > | vfs_read
> > | |
> > | |--97.20%-- sys_pread64
> > | | system_call_fastpath
> > | | pread64
> > | |
> > | --2.81%-- sys_read
> > | system_call_fastpath
> > | __read
> > |
> > |--39.85%-- generic_file_buffered_write
> > | __generic_file_aio_write_nolock
> > | generic_file_aio_write
> > | do_sync_write
> > | reiserfs_file_write
> > | vfs_write
> > | |
> > | |--97.05%-- sys_pwrite64
> > | | system_call_fastpath
> > | | __pwrite64
> > | |
> > | --2.95%-- sys_write
> > | system_call_fastpath
> > | __write_nocancel
> > [...]
>
> Wow, this is extremely intuitive and powerful looking!
>
> It's basically a fractal structure: each sub-graph looks like a
> full-blown profile in itself. Thus the overhead of individual
> components of the graph profile can be analyzed without having to
> think in small numbers.
>
> The above example shows it particularly well - it shows that in
> regard to generic_file_buffered_write() overhead, the system is
> doing 97% sys_pwrite64() calls and 3% sys_write() calls.
>
> Thus i took the liberty to change your last patch in two ways: i
> renamed 'relative' to 'fractal' (it was not a proper counterpart to
> 'graph' anyway - we have no 'absolute' output mode name either), and
> i changed it to be the default output mode. This stuff rocks!
Ok.
I first planned to add a submode, or more likely a parameter structured
like the following:
perf report -c layout,min,mode
where layout=flat|graph
and mode=abs|rel
But relative flat doesn't make sense.
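For comparison, the form that the merged patch parses is a two-field "mode,min" string with prefix abbreviations of the mode name. A minimal standalone sketch of that kind of parsing follows; `struct chain_opts`, `is_prefix()` and `parse_chain_opts()` are hypothetical names, and unlike the patch this version copies the argument and rejects trailing garbage after the number.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum chain_mode { CHAIN_FLAT, CHAIN_GRAPH_ABS, CHAIN_GRAPH_REL };

/* Hypothetical container for the parsed option. */
struct chain_opts {
	enum chain_mode mode;
	double min_percent;
};

/* Non-empty prefix match: "fra" matches "fractal", "" matches nothing. */
static int is_prefix(const char *tok, const char *name)
{
	size_t len = strlen(tok);

	return len > 0 && len <= strlen(name) && !strncmp(tok, name, len);
}

/*
 * Parse "mode[,min_percent]" into opts. The string is copied first
 * because strtok() writes into its argument. Returns 0 on success and
 * -1 on error; on error opts may have been partially updated.
 */
static int parse_chain_opts(const char *arg, struct chain_opts *opts)
{
	char buf[64];
	char *tok, *endptr;

	assert(arg && opts);
	if (strlen(arg) >= sizeof(buf))
		return -1;
	strcpy(buf, arg);

	tok = strtok(buf, ",");
	if (!tok)
		return -1;

	/* Same test order as the patch, so a bare "f" resolves to flat. */
	if (is_prefix(tok, "graph"))
		opts->mode = CHAIN_GRAPH_ABS;
	else if (is_prefix(tok, "flat"))
		opts->mode = CHAIN_FLAT;
	else if (is_prefix(tok, "fractal"))
		opts->mode = CHAIN_GRAPH_REL;
	else
		return -1;

	tok = strtok(NULL, ",");
	if (!tok)
		return 0;	/* keep the caller's default threshold */

	opts->min_percent = strtod(tok, &endptr);
	if (endptr == tok || *endptr != '\0')
		return -1;
	return 0;
}
```

A third "abs|rel" field would only add one more strtok() step, but it would also have to reject the flat plus rel combination, which is part of why a single mode name is simpler.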
fractal is nice, it's just a pity that "graph" mode doesn't tell much
about its absolute measurement context.
> absolute-graph and flat mode can be displayed too, via the option,
> as usual.
>
> Ingo
* Re: [PATCH 5/5] perf report: Support callchains with relative overhead rate
2009-07-05 8:59 ` Ingo Molnar
@ 2009-07-05 13:23 ` Frederic Weisbecker
0 siblings, 0 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2009-07-05 13:23 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Peter Zijlstra, Mike Galbraith, Paul Mackerras,
Anton Blanchard, Jens Axboe, Arnaldo Carvalho de Melo
On Sun, Jul 05, 2009 at 10:59:49AM +0200, Ingo Molnar wrote:
>
> btw., i get some buggy looking output with:
>
> $ perf record -f -g ~/hackbench 10
>
> $ perf report -c
>
>
> |--5.11%-- unix_stream_sendmsg
> | |
> | |--100.00%-- __sock_sendmsg
> | | sock_aio_write
> | | do_sync_write
> | | vfs_write
> | | sys_write
> | | sysenter_dispatch
> | | 0xf7f72430
> | | 0xffebbca000000014
> | |
> | --11.11%-- sock_aio_write
> | do_sync_write
> | vfs_write
> | sys_write
> | sysenter_dispatch
> | 0xf7f72430
> | 0xffebbca000000014
>
> Those percentages dont sum up to 100% :-)
Argh. I can reproduce it, will have a look.
> Another detail: i think we should signal when we crop the output due
> to the filter, via a line of:
>
> | [...]
>
> or so.
Ok.
> Plus, when doing 'perf report' on a call-chain recording, shouldnt
> we auto-detect this fact and default to fractal output
> automatically, instead of flat mode?
>
> User can still force flat mode via 'perf report -c flat'.
Yeah but the user won't be able to ignore the callchain.
Maybe I could add a -c none for this case?