* [PATCH 1/5] perf report: Warn on callchain output request from non-callchain file
@ 2009-07-05 5:39 Frederic Weisbecker
2009-07-05 5:39 ` [PATCH 2/5] perf report: Use a modifiable string for default callchain options Frederic Weisbecker
` (4 more replies)
0 siblings, 5 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2009-07-05 5:39 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Peter Zijlstra, Mike Galbraith, Paul Mackerras,
Anton Blanchard, Jens Axboe, Arnaldo Carvalho de Melo,
Frederic Weisbecker
perf report segfaults while trying to handle callchains from a
non-callchain data file.
Instead of a segfault, print a useful message to the user.
Reported-by: Jens Axboe <jens.axboe@oracle.com>
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
tools/perf/builtin-report.c | 16 +++++++++++++---
1 files changed, 13 insertions(+), 3 deletions(-)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index fa937f5..9f9575a 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1684,9 +1684,19 @@ static int __cmd_report(void)
sample_type = perf_header__sample_type();
- if (sort__has_parent && !(sample_type & PERF_SAMPLE_CALLCHAIN)) {
- fprintf(stderr, "selected --sort parent, but no callchain data\n");
- exit(-1);
+ if (!(sample_type & PERF_SAMPLE_CALLCHAIN)) {
+ if (sort__has_parent) {
+ fprintf(stderr, "selected --sort parent, but no"
+ " callchain data. Did you call"
+ " perf record without -g?\n");
+ exit(-1);
+ }
+ if (callchain) {
+ fprintf(stderr, "selected -c but no callchain data."
+ " Did you call perf record without"
+ " -g?\n");
+ exit(-1);
+ }
}
if (load_kernel() < 0) {
--
1.6.2.3
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 2/5] perf report: Use a modifiable string for default callchain options
2009-07-05 5:39 [PATCH 1/5] perf report: Warn on callchain output request from non-callchain file Frederic Weisbecker
@ 2009-07-05 5:39 ` Frederic Weisbecker
2009-07-05 9:52 ` [tip:perfcounters/urgent] " tip-bot for Frederic Weisbecker
2009-07-05 5:39 ` [PATCH 3/5] perf report: Change default callchain parameters Frederic Weisbecker
` (3 subsequent siblings)
4 siblings, 1 reply; 14+ messages in thread
From: Frederic Weisbecker @ 2009-07-05 5:39 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Peter Zijlstra, Mike Galbraith, Paul Mackerras,
Anton Blanchard, Jens Axboe, Arnaldo Carvalho de Melo,
Frederic Weisbecker

If the user doesn't provide options to tune the callchain output
(i.e. -c is used without arguments), then the default value passed to
the OPT_CALLBACK_DEFAULT() macro is used.

But that value is parsed later by strtok(), which replaces the comma
separators with a zero byte. This may segfault because we are using a
read-only string.

Use a modifiable one instead, and also fix the "100%" default minimum
threshold value by turning it into 0 (output every callchain), as was
originally intended.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
tools/perf/builtin-report.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 9f9575a..3db99fd 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -58,6 +58,8 @@ static char *parent_pattern = default_parent_pattern;
static regex_t parent_regex;
static int exclude_other = 1;
+
+static char callchain_default_opt[] = "flat,0";
static int callchain;
static enum chain_mode callchain_mode;
static double callchain_min_percent = 0.0;
@@ -1871,7 +1873,7 @@ static const struct option options[] = {
"Only display entries with parent-match"),
OPT_CALLBACK_DEFAULT('c', "callchain", NULL, "output_type,min_percent",
"Display callchains using output_type and min percent threshold. "
- "Default: flat,0", &parse_callchain_opt, "flat,100"),
+ "Default: flat,0", &parse_callchain_opt, callchain_default_opt),
OPT_STRING('d', "dsos", &dso_list_str, "dso[,dso...]",
"only consider symbols in these dsos"),
OPT_STRING('C', "comms", &comm_list_str, "comm[,comm...]",
--
1.6.2.3
^ permalink raw reply related [flat|nested] 14+ messages in thread
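The failure mode fixed by this patch can be shown outside of perf. What
follows is a minimal standalone sketch, not the perf code: the names
parse_chain_opt(), struct chain_opt and default_opt are invented for the
illustration, and only strtok(), strtod(), strlen() and strcpy() are real
libc calls. It shows why the default has to live in a writable array.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the parsed result; not perf's own struct. */
struct chain_opt {
	char   mode[16];
	double min_percent;
};

/*
 * Parse "mode[,min_percent]" in place. strtok() writes '\0' over each
 * ',' it finds, so buf must be writable storage, never a string literal.
 * Returns 0 on success, -1 on a malformed string.
 */
static int parse_chain_opt(char *buf, struct chain_opt *out)
{
	char *tok, *endptr;

	assert(buf && out);

	tok = strtok(buf, ",");
	if (!tok || strlen(tok) >= sizeof(out->mode))
		return -1;
	strcpy(out->mode, tok);

	out->min_percent = 0.0;
	tok = strtok(NULL, ",");
	if (!tok)
		return 0;	/* the threshold is optional */

	out->min_percent = strtod(tok, &endptr);
	return tok == endptr ? -1 : 0;
}

/* Writable array: the initializer is copied into modifiable storage. */
static char default_opt[] = "graph,0.5";

/*
 * By contrast, `char *p = "graph,0.5";` points at a string literal.
 * Modifying a literal is undefined behavior in C, and on typical
 * platforms it lives in read-only memory, hence the segfault.
 */
```

Parsing default_opt once overwrites its comma with '\0', so the same
buffer cannot be handed to the parser a second time and still carry the
threshold. That is acceptable for a one-shot option default, which is the
situation this patch deals with.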
* [tip:perfcounters/urgent] perf report: Use a modifiable string for default callchain options
2009-07-05 5:39 ` [PATCH 2/5] perf report: Use a modifiable string for default callchain options Frederic Weisbecker
@ 2009-07-05 9:52 ` tip-bot for Frederic Weisbecker
0 siblings, 0 replies; 14+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2009-07-05 9:52 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, acme, anton, paulus, hpa, mingo, a.p.zijlstra,
efault, jens.axboe, fweisbec, tglx, mingo

Commit-ID: be9038859e56f729cc9d3b070a35fb8829a73696
Gitweb: http://git.kernel.org/tip/be9038859e56f729cc9d3b070a35fb8829a73696
Author: Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Sun, 5 Jul 2009 07:39:18 +0200
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Sun, 5 Jul 2009 10:30:21 +0200

perf report: Use a modifiable string for default callchain options

If the user doesn't provide options to tune the callchain output
(i.e. -c is used without arguments), then the default value passed to
the OPT_CALLBACK_DEFAULT() macro is used.

But that value is parsed later by strtok(), which replaces the comma
separators with a zero byte. This may segfault because we are using a
read-only string.

Use a modifiable one instead, and also fix the "100%" default minimum
threshold value by turning it into 0 (output every callchain), as was
originally intended.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246772361-9960-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
tools/perf/builtin-report.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 9f9575a..3db99fd 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -58,6 +58,8 @@ static char *parent_pattern = default_parent_pattern;
static regex_t parent_regex;
static int exclude_other = 1;
+
+static char callchain_default_opt[] = "flat,0";
static int callchain;
static enum chain_mode callchain_mode;
static double callchain_min_percent = 0.0;
@@ -1871,7 +1873,7 @@ static const struct option options[] = {
"Only display entries with parent-match"),
OPT_CALLBACK_DEFAULT('c', "callchain", NULL, "output_type,min_percent",
"Display callchains using output_type and min percent threshold. "
- "Default: flat,0", &parse_callchain_opt, "flat,100"),
+ "Default: flat,0", &parse_callchain_opt, callchain_default_opt),
OPT_STRING('d', "dsos", &dso_list_str, "dso[,dso...]",
"only consider symbols in these dsos"),
OPT_STRING('C', "comms", &comm_list_str, "comm[,comm...]",
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 3/5] perf report: Change default callchain parameters
2009-07-05 5:39 [PATCH 1/5] perf report: Warn on callchain output request from non-callchain file Frederic Weisbecker
2009-07-05 5:39 ` [PATCH 2/5] perf report: Use a modifiable string for default callchain options Frederic Weisbecker
@ 2009-07-05 5:39 ` Frederic Weisbecker
2009-07-05 9:52 ` [tip:perfcounters/urgent] " tip-bot for Frederic Weisbecker
2009-07-05 5:39 ` [PATCH 4/5] perf tools: callchains: Manage the cumul hits on the fly Frederic Weisbecker
` (2 subsequent siblings)
4 siblings, 1 reply; 14+ messages in thread
From: Frederic Weisbecker @ 2009-07-05 5:39 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Peter Zijlstra, Mike Galbraith, Paul Mackerras,
Anton Blanchard, Jens Axboe, Arnaldo Carvalho de Melo,
Frederic Weisbecker

The default callchain parameters are set to use the flat mode and never
filter backtraces by an overhead threshold.

But flat mode is boring compared to graph mode. Also, the number of
callchains may be very high if none is filtered.

Let's change this to set the graph view and a minimum overhead of 0.5%
as default parameters.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
tools/perf/builtin-report.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 3db99fd..8bd5865 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -59,10 +59,10 @@ static regex_t parent_regex;
static int exclude_other = 1;
-static char callchain_default_opt[] = "flat,0";
+static char callchain_default_opt[] = "graph,0.5";
static int callchain;
static enum chain_mode callchain_mode;
-static double callchain_min_percent = 0.0;
+static double callchain_min_percent = 0.5;
static u64 sample_type;
--
1.6.2.3
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [tip:perfcounters/urgent] perf report: Change default callchain parameters
2009-07-05 5:39 ` [PATCH 3/5] perf report: Change default callchain parameters Frederic Weisbecker
@ 2009-07-05 9:52 ` tip-bot for Frederic Weisbecker
0 siblings, 0 replies; 14+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2009-07-05 9:52 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, acme, anton, paulus, hpa, mingo, a.p.zijlstra,
efault, jens.axboe, fweisbec, tglx, mingo

Commit-ID: 94a8eb028a57854157a936c7e66b09e2559f115a
Gitweb: http://git.kernel.org/tip/94a8eb028a57854157a936c7e66b09e2559f115a
Author: Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Sun, 5 Jul 2009 07:39:19 +0200
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Sun, 5 Jul 2009 10:30:22 +0200

perf report: Change default callchain parameters

The default callchain parameters are set to use the flat mode and never
filter backtraces by an overhead threshold.

But flat mode is boring compared to graph mode. Also, the number of
callchains may be very high if none is filtered.

Let's change this to set the graph view and a minimum overhead of 0.5%
as default parameters.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246772361-9960-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
tools/perf/builtin-report.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 3db99fd..8bd5865 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -59,10 +59,10 @@ static regex_t parent_regex;
static int exclude_other = 1;
-static char callchain_default_opt[] = "flat,0";
+static char callchain_default_opt[] = "graph,0.5";
static int callchain;
static enum chain_mode callchain_mode;
-static double callchain_min_percent = 0.0;
+static double callchain_min_percent = 0.5;
static u64 sample_type;
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 4/5] perf tools: callchains: Manage the cumul hits on the fly
2009-07-05 5:39 [PATCH 1/5] perf report: Warn on callchain output request from non-callchain file Frederic Weisbecker
2009-07-05 5:39 ` [PATCH 2/5] perf report: Use a modifiable string for default callchain options Frederic Weisbecker
2009-07-05 5:39 ` [PATCH 3/5] perf report: Change default callchain parameters Frederic Weisbecker
@ 2009-07-05 5:39 ` Frederic Weisbecker
2009-07-05 9:52 ` [tip:perfcounters/urgent] perf_counter " tip-bot for Frederic Weisbecker
2009-07-05 5:39 ` [PATCH 5/5] perf report: Support callchains with relative overhead rate Frederic Weisbecker
2009-07-05 9:51 ` [tip:perfcounters/urgent] perf report: Warn on callchain output request from non-callchain file tip-bot for Frederic Weisbecker
4 siblings, 1 reply; 14+ messages in thread
From: Frederic Weisbecker @ 2009-07-05 5:39 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Peter Zijlstra, Mike Galbraith, Paul Mackerras,
Anton Blanchard, Jens Axboe, Arnaldo Carvalho de Melo,
Frederic Weisbecker

The cumul hits of a node are the hits of every child of that node plus
the hits of the node itself. They are required to compute the
percentage of a branch.

These numbers are calculated during the sorting of the branches of the
callchain tree, using a depth-first postfix traversal so that
cumulative hits are propagated in the right order.

But if we plan to implement percentages relative to the parent rather
than absolute percentages (relative to the whole overhead), we need to
know the cumulative hits of the parent before processing the children,
because the relative minimum acceptable number of entries (i.e. the
minimum rate against the cumulative hits of the parent) is the basis
for filtering the children against a given rate.

So we need to handle the cumul hits on the fly to prepare the
implementation of relative overhead rates.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
tools/perf/util/callchain.c | 12 ++++++++----
1 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index c9900fe..5d244af 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -74,13 +74,11 @@ static void __sort_chain_graph(struct callchain_node *node, u64 min_hit)
struct callchain_node *child;
node->rb_root = RB_ROOT;
- node->cumul_hit = node->hit;
chain_for_each_child(child, node) {
__sort_chain_graph(child, min_hit);
if (child->cumul_hit >= min_hit)
rb_insert_callchain(&node->rb_root, child, GRAPH);
- node->cumul_hit += child->cumul_hit;
}
}
@@ -159,7 +157,7 @@ add_child(struct callchain_node *parent, struct ip_callchain *chain,
new = create_child(parent, false);
fill_node(new, chain, start, syms);
- new->hit = 1;
+ new->cumul_hit = new->hit = 1;
}
/*
@@ -189,6 +187,7 @@ split_add_child(struct callchain_node *parent, struct ip_callchain *chain,
/* split the hits */
new->hit = parent->hit;
+ new->cumul_hit = parent->cumul_hit;
new->val_nr = parent->val_nr - idx_local;
parent->val_nr = idx_local;
@@ -216,10 +215,13 @@ __append_chain_children(struct callchain_node *root, struct ip_callchain *chain,
unsigned int ret = __append_chain(rnode, chain, start, syms);
if (!ret)
- return;
+ goto cumul;
}
/* nothing in children, add to the current node */
add_child(root, chain, start, syms);
+
+cumul:
+ root->cumul_hit++;
}
static int
@@ -261,6 +263,8 @@ __append_chain(struct callchain_node *root, struct ip_callchain *chain,
/* we match 100% of the path, increment the hit */
if (i - start == root->val_nr && i == chain->nr) {
root->hit++;
+ root->cumul_hit++;
+
return 0;
}
--
1.6.2.3
^ permalink raw reply related [flat|nested] 14+ messages in thread
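The idea of maintaining cumulative hits at insertion time, rather than
in a later sorting pass, can be sketched on a much simpler tree than the
one perf uses. The code below is an assumption-laden illustration: it
stores one frame per node (the real callchain tree compresses runs of
frames and splits nodes), and struct node, find_or_add_child() and
append_chain() are names made up for this example.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_CHILDREN	8
#define MAX_DEPTH	64

/* Simplified node, one frame per node. Not perf's struct callchain_node. */
struct node {
	unsigned long ip;
	unsigned long hit;		/* chains ending exactly here */
	unsigned long cumul_hit;	/* hit + hits of all descendants */
	struct node *child[MAX_CHILDREN];
	int nr_children;
};

static struct node *find_or_add_child(struct node *parent, unsigned long ip)
{
	struct node *n;
	int i;

	assert(parent);
	for (i = 0; i < parent->nr_children; i++)
		if (parent->child[i]->ip == ip)
			return parent->child[i];

	if (parent->nr_children == MAX_CHILDREN)
		return NULL;
	n = calloc(1, sizeof(*n));
	if (!n)
		return NULL;
	n->ip = ip;
	parent->child[parent->nr_children++] = n;
	return n;
}

/*
 * Record one sampled chain. Every node on the path gets cumul_hit++,
 * the last node also gets hit++, so parents always know their totals
 * before any child is sorted or filtered.
 * Returns 0 on success, -1 if the chain could not be stored.
 */
static int append_chain(struct node *root, const unsigned long *ips, int nr)
{
	struct node *path[MAX_DEPTH + 1];
	int i;

	if (nr < 0 || nr > MAX_DEPTH)
		return -1;

	/* Resolve the whole path first so counters stay untouched on failure. */
	path[0] = root;
	for (i = 0; i < nr; i++) {
		path[i + 1] = find_or_add_child(path[i], ips[i]);
		if (!path[i + 1])
			return -1;
	}

	for (i = 0; i <= nr; i++)
		path[i]->cumul_hit++;
	path[nr]->hit++;
	return 0;
}
```

After each append_chain() call the invariant "cumul_hit equals hit plus
the cumul_hit of all children" already holds at every node, so a later
top-down pass can derive a per-parent threshold without a bottom-up
accumulation step first.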
* [tip:perfcounters/urgent] perf_counter tools: callchains: Manage the cumul hits on the fly
2009-07-05 5:39 ` [PATCH 4/5] perf tools: callchains: Manage the cumul hits on the fly Frederic Weisbecker
@ 2009-07-05 9:52 ` tip-bot for Frederic Weisbecker
0 siblings, 0 replies; 14+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2009-07-05 9:52 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, acme, anton, paulus, hpa, mingo, a.p.zijlstra,
efault, jens.axboe, fweisbec, tglx, mingo

Commit-ID: e05b876c222178bc6abcfa9f23d8311731691046
Gitweb: http://git.kernel.org/tip/e05b876c222178bc6abcfa9f23d8311731691046
Author: Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Sun, 5 Jul 2009 07:39:20 +0200
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Sun, 5 Jul 2009 10:30:22 +0200

perf_counter tools: callchains: Manage the cumul hits on the fly

The cumul hits of a node are the hits of every child of that node plus
the hits of the node itself. They are required to compute the
percentage of a branch.

These numbers are calculated during the sorting of the branches of the
callchain tree, using a depth-first postfix traversal so that
cumulative hits are propagated in the right order.

But if we plan to implement percentages relative to the parent rather
than absolute percentages (relative to the whole overhead), we need to
know the cumulative hits of the parent before processing the children,
because the relative minimum acceptable number of entries (i.e. the
minimum rate against the cumulative hits of the parent) is the basis
for filtering the children against a given rate.

So we need to handle the cumul hits on the fly to prepare the
implementation of relative overhead rates.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246772361-9960-4-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
tools/perf/util/callchain.c | 12 ++++++++----
1 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index c9900fe..5d244af 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -74,13 +74,11 @@ static void __sort_chain_graph(struct callchain_node *node, u64 min_hit)
struct callchain_node *child;
node->rb_root = RB_ROOT;
- node->cumul_hit = node->hit;
chain_for_each_child(child, node) {
__sort_chain_graph(child, min_hit);
if (child->cumul_hit >= min_hit)
rb_insert_callchain(&node->rb_root, child, GRAPH);
- node->cumul_hit += child->cumul_hit;
}
}
@@ -159,7 +157,7 @@ add_child(struct callchain_node *parent, struct ip_callchain *chain,
new = create_child(parent, false);
fill_node(new, chain, start, syms);
- new->hit = 1;
+ new->cumul_hit = new->hit = 1;
}
/*
@@ -189,6 +187,7 @@ split_add_child(struct callchain_node *parent, struct ip_callchain *chain,
/* split the hits */
new->hit = parent->hit;
+ new->cumul_hit = parent->cumul_hit;
new->val_nr = parent->val_nr - idx_local;
parent->val_nr = idx_local;
@@ -216,10 +215,13 @@ __append_chain_children(struct callchain_node *root, struct ip_callchain *chain,
unsigned int ret = __append_chain(rnode, chain, start, syms);
if (!ret)
- return;
+ goto cumul;
}
/* nothing in children, add to the current node */
add_child(root, chain, start, syms);
+
+cumul:
+ root->cumul_hit++;
}
static int
@@ -261,6 +263,8 @@ __append_chain(struct callchain_node *root, struct ip_callchain *chain,
/* we match 100% of the path, increment the hit */
if (i - start == root->val_nr && i == chain->nr) {
root->hit++;
+ root->cumul_hit++;
+
return 0;
}
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 5/5] perf report: Support callchains with relative overhead rate
2009-07-05 5:39 [PATCH 1/5] perf report: Warn on callchain output request from non-callchain file Frederic Weisbecker
` (2 preceding siblings ...)
2009-07-05 5:39 ` [PATCH 4/5] perf tools: callchains: Manage the cumul hits on the fly Frederic Weisbecker
@ 2009-07-05 5:39 ` Frederic Weisbecker
2009-07-05 8:34 ` Ingo Molnar
2009-07-05 9:52 ` [tip:perfcounters/urgent] perf report: Add "Fractal" mode output - support " tip-bot for Frederic Weisbecker
2009-07-05 9:51 ` [tip:perfcounters/urgent] perf report: Warn on callchain output request from non-callchain file tip-bot for Frederic Weisbecker
4 siblings, 2 replies; 14+ messages in thread
From: Frederic Weisbecker @ 2009-07-05 5:39 UTC (permalink / raw)
To: Ingo Molnar
Cc: LKML, Peter Zijlstra, Mike Galbraith, Paul Mackerras,
Anton Blanchard, Jens Axboe, Arnaldo Carvalho de Melo,
Frederic Weisbecker

The current callchain output displays the overhead rates as absolute:
relative to the total overhead.

This patch provides relative overhead percentages, in which each
branch of the callchain tree is an independent instrumented object.

You can produce such output by using the "relative" mode, which
can be abbreviated to r, re, rel, etc...

    ./perf report -s sym -c relative

Example:

     8.46%  [k] copy_user_generic_string
            |
            |--52.01%-- generic_file_aio_read
            |          do_sync_read
            |          vfs_read
            |          |
            |          |--97.20%-- sys_pread64
            |          |          system_call_fastpath
            |          |          pread64
            |          |
            |           --2.81%-- sys_read
            |                     system_call_fastpath
            |                     __read
            |
            |--39.85%-- generic_file_buffered_write
            |          __generic_file_aio_write_nolock
            |          generic_file_aio_write
            |          do_sync_write
            |          reiserfs_file_write
            |          vfs_write
            |          |
            |          |--97.05%-- sys_pwrite64
            |          |          system_call_fastpath
            |          |          __pwrite64
            |          |
            |           --2.95%-- sys_write
            |                     system_call_fastpath
            |                     __write_nocancel
    [...]

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
tools/perf/builtin-report.c | 58 ++++++++++++++++++++----------
tools/perf/util/callchain.c | 84 +++++++++++++++++++++++++++++++++++--------
tools/perf/util/callchain.h | 21 ++++++++---
3 files changed, 123 insertions(+), 40 deletions(-)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 8bd5865..ac9fb56 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -60,9 +60,14 @@ static regex_t parent_regex;
static int exclude_other = 1;
static char callchain_default_opt[] = "graph,0.5";
+
static int callchain;
-static enum chain_mode callchain_mode;
-static double callchain_min_percent = 0.5;
+
+static
+struct callchain_param callchain_param = {
+ .mode = CHAIN_GRAPH_ABS,
+ .min_percent = 0.5
+};
static u64 sample_type;
@@ -846,9 +851,15 @@ callchain__fprintf_graph(FILE *fp, struct callchain_node *self,
struct callchain_node *child;
struct callchain_list *chain;
int new_depth_mask = depth_mask;
+ u64 new_total;
size_t ret = 0;
int i;
+ if (callchain_param.mode == CHAIN_GRAPH_REL)
+ new_total = self->cumul_hit;
+ else
+ new_total = total_samples;
+
node = rb_first(&self->rb_root);
while (node) {
child = rb_entry(node, struct callchain_node, rb_node);
@@ -873,10 +884,10 @@ callchain__fprintf_graph(FILE *fp, struct callchain_node *self,
continue;
ret += ipchain__fprintf_graph(fp, chain, depth,
new_depth_mask, i++,
- total_samples,
+ new_total,
child->cumul_hit);
}
- ret += callchain__fprintf_graph(fp, child, total_samples,
+ ret += callchain__fprintf_graph(fp, child, new_total,
depth + 1,
new_depth_mask | (1 << depth));
node = next;
@@ -925,13 +936,18 @@ hist_entry_callchain__fprintf(FILE *fp, struct hist_entry *self,
chain = rb_entry(rb_node, struct callchain_node, rb_node);
percent = chain->hit * 100.0 / total_samples;
- if (callchain_mode == FLAT) {
+ switch (callchain_param.mode) {
+ case CHAIN_FLAT:
ret += percent_color_fprintf(fp, " %6.2f%%\n",
percent);
ret += callchain__fprintf_flat(fp, chain, total_samples);
- } else if (callchain_mode == GRAPH) {
+ break;
+ case CHAIN_GRAPH_ABS: /* Falldown */
+ case CHAIN_GRAPH_REL:
ret += callchain__fprintf_graph(fp, chain,
total_samples, 1, 1);
+ default:
+ break;
}
ret += fprintf(fp, "\n");
rb_node = rb_next(rb_node);
@@ -1219,14 +1235,9 @@ static void output__insert_entry(struct hist_entry *he, u64 min_callchain_hits)
struct rb_node *parent = NULL;
struct hist_entry *iter;
- if (callchain) {
- if (callchain_mode == FLAT)
- sort_chain_flat(&he->sorted_chain, &he->callchain,
- min_callchain_hits);
- else if (callchain_mode == GRAPH)
- sort_chain_graph(&he->sorted_chain, &he->callchain,
- min_callchain_hits);
- }
+ if (callchain)
+ callchain_param.sort(&he->sorted_chain, &he->callchain,
+ min_callchain_hits, &callchain_param);
while (*p != NULL) {
parent = *p;
@@ -1249,7 +1260,7 @@ static void output__resort(u64 total_samples)
struct rb_root *tree = &hist;
u64 min_callchain_hits;
- min_callchain_hits = total_samples * (callchain_min_percent / 100);
+ min_callchain_hits = total_samples * (callchain_param.min_percent / 100);
if (sort__need_collapse)
tree = &collapse_hists;
@@ -1829,22 +1840,31 @@ parse_callchain_opt(const struct option *opt __used, const char *arg,
/* get the output mode */
if (!strncmp(tok, "graph", strlen(arg)))
- callchain_mode = GRAPH;
+ callchain_param.mode = CHAIN_GRAPH_ABS;
else if (!strncmp(tok, "flat", strlen(arg)))
- callchain_mode = FLAT;
+ callchain_param.mode = CHAIN_FLAT;
+
+ else if (!strncmp(tok, "relative", strlen(arg)))
+ callchain_param.mode = CHAIN_GRAPH_REL;
+
else
return -1;
/* get the min percentage */
tok = strtok(NULL, ",");
if (!tok)
- return 0;
+ goto setup;
- callchain_min_percent = strtod(tok, &endptr);
+ callchain_param.min_percent = strtod(tok, &endptr);
if (tok == endptr)
return -1;
+setup:
+ if (register_callchain_param(&callchain_param) < 0) {
+ fprintf(stderr, "Can't register callchain params\n");
+ return -1;
+ }
return 0;
}
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 5d244af..9d3c814 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -32,13 +32,14 @@ rb_insert_callchain(struct rb_root *root, struct callchain_node *chain,
rnode = rb_entry(parent, struct callchain_node, rb_node);
switch (mode) {
- case FLAT:
+ case CHAIN_FLAT:
if (rnode->hit < chain->hit)
p = &(*p)->rb_left;
else
p = &(*p)->rb_right;
break;
- case GRAPH:
+ case CHAIN_GRAPH_ABS: /* Falldown */
+ case CHAIN_GRAPH_REL:
if (rnode->cumul_hit < chain->cumul_hit)
p = &(*p)->rb_left;
else
@@ -53,43 +54,96 @@ rb_insert_callchain(struct rb_root *root, struct callchain_node *chain,
rb_insert_color(&chain->rb_node, root);
}
+static void
+__sort_chain_flat(struct rb_root *rb_root, struct callchain_node *node,
+ u64 min_hit)
+{
+ struct callchain_node *child;
+
+ chain_for_each_child(child, node)
+ __sort_chain_flat(rb_root, child, min_hit);
+
+ if (node->hit && node->hit >= min_hit)
+ rb_insert_callchain(rb_root, node, CHAIN_FLAT);
+}
+
/*
* Once we get every callchains from the stream, we can now
* sort them by hit
*/
-void sort_chain_flat(struct rb_root *rb_root, struct callchain_node *node,
- u64 min_hit)
+static void
+sort_chain_flat(struct rb_root *rb_root, struct callchain_node *node,
+ u64 min_hit, struct callchain_param *param __used)
+{
+ __sort_chain_flat(rb_root, node, min_hit);
+}
+
+static void __sort_chain_graph_abs(struct callchain_node *node,
+ u64 min_hit)
{
struct callchain_node *child;
- chain_for_each_child(child, node)
- sort_chain_flat(rb_root, child, min_hit);
+ node->rb_root = RB_ROOT;
- if (node->hit && node->hit >= min_hit)
- rb_insert_callchain(rb_root, node, FLAT);
+ chain_for_each_child(child, node) {
+ __sort_chain_graph_abs(child, min_hit);
+ if (child->cumul_hit >= min_hit)
+ rb_insert_callchain(&node->rb_root, child,
+ CHAIN_GRAPH_ABS);
+ }
+}
+
+static void
+sort_chain_graph_abs(struct rb_root *rb_root, struct callchain_node *chain_root,
+ u64 min_hit, struct callchain_param *param __used)
+{
+ __sort_chain_graph_abs(chain_root, min_hit);
+ rb_root->rb_node = chain_root->rb_root.rb_node;
}
-static void __sort_chain_graph(struct callchain_node *node, u64 min_hit)
+static void __sort_chain_graph_rel(struct callchain_node *node,
+ double min_percent)
{
struct callchain_node *child;
+ u64 min_hit;
node->rb_root = RB_ROOT;
+ min_hit = node->cumul_hit * min_percent / 100.0;
chain_for_each_child(child, node) {
- __sort_chain_graph(child, min_hit);
+ __sort_chain_graph_rel(child, min_percent);
if (child->cumul_hit >= min_hit)
- rb_insert_callchain(&node->rb_root, child, GRAPH);
+ rb_insert_callchain(&node->rb_root, child,
+ CHAIN_GRAPH_REL);
}
}
-void
-sort_chain_graph(struct rb_root *rb_root, struct callchain_node *chain_root,
- u64 min_hit)
+static void
+sort_chain_graph_rel(struct rb_root *rb_root, struct callchain_node *chain_root,
+ u64 min_hit __used, struct callchain_param *param)
{
- __sort_chain_graph(chain_root, min_hit);
+ __sort_chain_graph_rel(chain_root, param->min_percent);
rb_root->rb_node = chain_root->rb_root.rb_node;
}
+int register_callchain_param(struct callchain_param *param)
+{
+ switch (param->mode) {
+ case CHAIN_GRAPH_ABS:
+ param->sort = sort_chain_graph_abs;
+ break;
+ case CHAIN_GRAPH_REL:
+ param->sort = sort_chain_graph_rel;
+ break;
+ case CHAIN_FLAT:
+ param->sort = sort_chain_flat;
+ break;
+ default:
+ return -1;
+ }
+ return 0;
+}
+
/*
* Create a child for a parent. If inherit_children, then the new child
* will become the new parent of it's parent children
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index f3e4776..7812122 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -7,8 +7,9 @@
#include "symbol.h"
enum chain_mode {
- FLAT,
- GRAPH
+ CHAIN_FLAT,
+ CHAIN_GRAPH_ABS,
+ CHAIN_GRAPH_REL
};
struct callchain_node {
@@ -23,6 +24,17 @@ struct callchain_node {
u64 cumul_hit; /* hit + hits of children */
};
+struct callchain_param;
+
+typedef void (*sort_chain_func_t)(struct rb_root *, struct callchain_node *,
+ u64, struct callchain_param *);
+
+struct callchain_param {
+ enum chain_mode mode;
+ double min_percent;
+ sort_chain_func_t sort;
+};
+
struct callchain_list {
u64 ip;
struct symbol *sym;
@@ -36,10 +48,7 @@ static inline void callchain_init(struct callchain_node *node)
INIT_LIST_HEAD(&node->val);
}
+int register_callchain_param(struct callchain_param *param);
void append_chain(struct callchain_node *root, struct ip_callchain *chain,
struct symbol **syms);
-void sort_chain_flat(struct rb_root *rb_root, struct callchain_node *node,
- u64 min_hit);
-void sort_chain_graph(struct rb_root *rb_root, struct callchain_node *node,
- u64 min_hit);
#endif
--
1.6.2.3
^ permalink raw reply related [flat|nested] 14+ messages in thread
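The difference between the absolute and the relative filter in this
patch comes down to which base the minimum hit count is derived from.
The sketch below isolates only that arithmetic. The helper names
min_hit_abs(), min_hit_rel() and child_is_shown() are invented for the
illustration and do not exist in perf; the formulas follow the ones
visible in output__resort() and __sort_chain_graph_rel() above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Absolute mode: a single threshold for the whole tree, derived from
 * the total number of samples.
 */
static uint64_t min_hit_abs(uint64_t total_samples, double min_percent)
{
	assert(min_percent >= 0.0 && min_percent <= 100.0);
	return (uint64_t)(total_samples * (min_percent / 100.0));
}

/*
 * Relative mode: the threshold is recomputed at every node from that
 * node's own cumulative hits, so it shrinks as we go down the tree.
 */
static uint64_t min_hit_rel(uint64_t parent_cumul_hit, double min_percent)
{
	assert(min_percent >= 0.0 && min_percent <= 100.0);
	return (uint64_t)(parent_cumul_hit * min_percent / 100.0);
}

/* A child branch is kept when its cumulative hits reach the threshold. */
static int child_is_shown(uint64_t child_cumul_hit, uint64_t min_hit)
{
	return child_cumul_hit >= min_hit;
}
```

With 1000 total samples and a 25% threshold, a child holding 12
cumulative hits under a parent holding 40 is dropped in absolute mode
(12 < 250) but kept in relative mode (12 >= 10). This is also why the
parent's cumul_hit has to be known before its children are filtered.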
* Re: [PATCH 5/5] perf report: Support callchains with relative overhead rate
2009-07-05 5:39 ` [PATCH 5/5] perf report: Support callchains with relative overhead rate Frederic Weisbecker
@ 2009-07-05 8:34 ` Ingo Molnar
2009-07-05 8:59 ` Ingo Molnar
2009-07-05 13:19 ` Frederic Weisbecker
2009-07-05 9:52 ` [tip:perfcounters/urgent] perf report: Add "Fractal" mode output - support " tip-bot for Frederic Weisbecker
1 sibling, 2 replies; 14+ messages in thread
From: Ingo Molnar @ 2009-07-05 8:34 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: LKML, Peter Zijlstra, Mike Galbraith, Paul Mackerras,
Anton Blanchard, Jens Axboe, Arnaldo Carvalho de Melo

* Frederic Weisbecker <fweisbec@gmail.com> wrote:

> The current callchain displays the overhead rates as absolute:
> relative to the total overhead.
>
> This patch provides relative overhead percentage, in which each
> branch of the callchain tree is a independant instrumentated object.
>
> You can produce such output by using the "relative" mode
> that you can lower in r, re, rel, etc...
>
> ./perf report -s sym -c relative
>
> Example:
>
>      8.46%  [k] copy_user_generic_string
>             |
>             |--52.01%-- generic_file_aio_read
>             |          do_sync_read
>             |          vfs_read
>             |          |
>             |          |--97.20%-- sys_pread64
>             |          |          system_call_fastpath
>             |          |          pread64
>             |          |
>             |           --2.81%-- sys_read
>             |                     system_call_fastpath
>             |                     __read
>             |
>             |--39.85%-- generic_file_buffered_write
>             |          __generic_file_aio_write_nolock
>             |          generic_file_aio_write
>             |          do_sync_write
>             |          reiserfs_file_write
>             |          vfs_write
>             |          |
>             |          |--97.05%-- sys_pwrite64
>             |          |          system_call_fastpath
>             |          |          __pwrite64
>             |          |
>             |           --2.95%-- sys_write
>             |                     system_call_fastpath
>             |                     __write_nocancel
> [...]

Wow, this is extremely intuitive and powerful looking!

It's basically a fractal structure: each sub-graph looks like a
full-blown profile in itself. Thus the overhead of individual
components of the graph profile can be analyzed without having to
think in small numbers.

The above example shows it particularly well - it shows that in
regard to generic_file_buffered_write() overhead, the system is
doing 97% sys_pwrite64() calls and 3% sys_write() calls.

Thus i took the liberty to change your last patch in two ways: i
renamed 'relative' to 'fractal' (it was not a proper counterpart to
'graph' anyway - we have no 'absolute' output mode name either),
and i changed it to be the default output mode. This stuff rocks!

absolute-graph and flat mode can be displayed too, via the option,
as usual.

Ingo
^ permalink raw reply [flat|nested] 14+ messages in thread
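As a worked illustration of how the relative ("fractal") numbers relate
to the absolute ones, the share of all samples attributable to one node
can be recovered by multiplying the percentages along its path. This is
a sketch under that reading of the output; fractal_to_absolute() is a
made-up helper, not part of perf.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Multiply the percentages found along one path, from the top-level
 * entry down to a node, to get that node's share of all samples
 * (as a percentage). An empty path yields 100.
 */
static double fractal_to_absolute(const double *rel_percent, size_t depth)
{
	double share = 100.0;
	size_t i;

	assert(rel_percent || depth == 0);
	for (i = 0; i < depth; i++)
		share *= rel_percent[i] / 100.0;
	return share;
}
```

For the sys_pwrite64 branch in the example above, the path is 8.46%
(copy_user_generic_string), then 39.85% (generic_file_buffered_write),
then 97.05% (sys_pwrite64), which works out to roughly 3.27% of all
samples. That is the kind of small number the relative view spares the
reader from.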
* Re: [PATCH 5/5] perf report: Support callchains with relative overhead rate
From: Ingo Molnar @ 2009-07-05 8:59 UTC
To: Frederic Weisbecker
Cc: LKML, Peter Zijlstra, Mike Galbraith, Paul Mackerras,
    Anton Blanchard, Jens Axboe, Arnaldo Carvalho de Melo

Btw, I get some buggy looking output with:

  $ perf record -f -g ~/hackbench 10
  $ perf report -c

            |--5.11%-- unix_stream_sendmsg
            |          |
            |          |--100.00%-- __sock_sendmsg
            |          |          sock_aio_write
            |          |          do_sync_write
            |          |          vfs_write
            |          |          sys_write
            |          |          sysenter_dispatch
            |          |          0xf7f72430
            |          |          0xffebbca000000014
            |          |
            |           --11.11%-- sock_aio_write
            |                     do_sync_write
            |                     vfs_write
            |                     sys_write
            |                     sysenter_dispatch
            |                     0xf7f72430
            |                     0xffebbca000000014

Those percentages don't sum up to 100% :-)

Another detail: I think we should signal when we crop the output due
to the filter, via a line of:

            | [...]

or so.

Plus, when doing 'perf report' on a call-chain recording, shouldn't
we auto-detect this fact and default to fractal output
automatically, instead of flat mode?

User can still force flat mode via 'perf report -c flat'.

	Ingo
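A consistency check for output like the above could look like the following C sketch. It is only an illustration, not the fix for this bug: the function name children_fit_parent is hypothetical, and it relies on the comment in callchain.h that cumul_hit is "hit + hits of children", which implies that the children's cumulative hits cannot exceed the parent's.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical check, not part of perf: return 1 when the children's
 * cumulative hits fit within the parent's, i.e. when their fractal
 * percentages cannot add up to more than 100%.
 */
int children_fit_parent(uint64_t parent_cumul_hit,
			const uint64_t *child_cumul_hit,
			size_t nr_children)
{
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i < nr_children; i++)
		sum += child_cumul_hit[i];

	return sum <= parent_cumul_hit;
}
```

With a parent of 9 hits and children of 9 and 1 hits (counts chosen only because they reproduce the 100.00% and 11.11% shown above; the real counts are not given in the report), the check fails, which is the symptom being described.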
* Re: [PATCH 5/5] perf report: Support callchains with relative overhead rate
From: Frederic Weisbecker @ 2009-07-05 13:23 UTC
To: Ingo Molnar
Cc: LKML, Peter Zijlstra, Mike Galbraith, Paul Mackerras,
    Anton Blanchard, Jens Axboe, Arnaldo Carvalho de Melo

On Sun, Jul 05, 2009 at 10:59:49AM +0200, Ingo Molnar wrote:
>
> btw., i get some buggy looking output with:
>
>   $ perf record -f -g ~/hackbench 10
>   $ perf report -c
>
>
>             |--5.11%-- unix_stream_sendmsg
>             |          |
>             |          |--100.00%-- __sock_sendmsg
>             |          |          sock_aio_write
>             |          |          do_sync_write
>             |          |          vfs_write
>             |          |          sys_write
>             |          |          sysenter_dispatch
>             |          |          0xf7f72430
>             |          |          0xffebbca000000014
>             |          |
>             |           --11.11%-- sock_aio_write
>             |                     do_sync_write
>             |                     vfs_write
>             |                     sys_write
>             |                     sysenter_dispatch
>             |                     0xf7f72430
>             |                     0xffebbca000000014
>
> Those percentages dont sum up to 100% :-)

Argh. I can reproduce it, will have a look.

> Another detail: i think we should signal when we crop the output due
> to the filter, via a line of:
>
>             | [...]
>
> or so.

Ok.

> Plus, when doing 'perf report' on a call-chain recording, shouldnt
> we auto-detect this fact and default to fractal output
> automatically, instead of flat mode?
>
> User can still force flat mode via 'perf report -c flat'.

Yeah, but then the user won't be able to ignore the callchain.
Maybe I could add a -c none for that case?
* Re: [PATCH 5/5] perf report: Support callchains with relative overhead rate
From: Frederic Weisbecker @ 2009-07-05 13:19 UTC
To: Ingo Molnar
Cc: LKML, Peter Zijlstra, Mike Galbraith, Paul Mackerras,
    Anton Blanchard, Jens Axboe, Arnaldo Carvalho de Melo

On Sun, Jul 05, 2009 at 10:34:00AM +0200, Ingo Molnar wrote:
>
> * Frederic Weisbecker <fweisbec@gmail.com> wrote:
>
> > The current callchain displays the overhead rates as absolute:
> > relative to the total overhead.
> >
> > This patch provides relative overhead percentage, in which each
> > branch of the callchain tree is a independant instrumentated object.
> >
> > You can produce such output by using the "relative" mode
> > that you can lower in r, re, rel, etc...
> >
> > ./perf report -s sym -c relative
> >
> > Example:
> >
> >      8.46%  [k] copy_user_generic_string
> >                 |
> >                 |--52.01%-- generic_file_aio_read
> >                 |          do_sync_read
> >                 |          vfs_read
> >                 |          |
> >                 |          |--97.20%-- sys_pread64
> >                 |          |          system_call_fastpath
> >                 |          |          pread64
> >                 |          |
> >                 |           --2.81%-- sys_read
> >                 |                     system_call_fastpath
> >                 |                     __read
> >                 |
> >                 |--39.85%-- generic_file_buffered_write
> >                 |          __generic_file_aio_write_nolock
> >                 |          generic_file_aio_write
> >                 |          do_sync_write
> >                 |          reiserfs_file_write
> >                 |          vfs_write
> >                 |          |
> >                 |          |--97.05%-- sys_pwrite64
> >                 |          |          system_call_fastpath
> >                 |          |          __pwrite64
> >                 |          |
> >                 |           --2.95%-- sys_write
> >                 |                     system_call_fastpath
> >                 |                     __write_nocancel
> > [...]
>
> Wow, this is extremely intuitive and powerful looking!
>
> It's basically a fractal structure: each sub-graph looks like a
> full-blown profile in itself. Thus the overhead of individual
> components of the graph profile can be analyzed without having to
> think in small numbers.
>
> The above example shows it particularly well - it shows that in
> regard to generic_file_buffered_write() overhead, the system is
> doing 97% sys_pwrite64() calls and 3% sys_write() calls.
>
> Thus i took the liberty to change your last patch in two ways: i
> renamed 'relative' to 'fractal' (it was not a proper counterpart to
> 'graph' anyway - we have no 'absolute' output mode name either), and
> i changed it to be the default output mode. This stuff rocks!

Ok. I first planned to add a submode, or more likely a parameter
structured like the following:

	perf report -c layout,min,mode

where layout=flat|graph and mode=abs|rel

But relative flat doesn't make sense.

fractal is nice, it's just a pity that "graph" mode doesn't tell much
about its absolute measure context.

> absolute-graph and flat mode can be displayed too, via the option,
> as usual.
>
> Ingo
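For comparison with the three-field form considered above, the two-field mode[,min_percent] form that the patch in this thread implements can be parsed roughly as in the C sketch below. It is a simplified illustration, not the perf code: parse_chain_opt, struct chain_opt and is_prefix are hypothetical names, the enum is copied from the diff to keep the sketch self-contained, and unlike the patch the sketch works on a private copy of the argument and rejects an empty mode name.

```c
#include <stdlib.h>
#include <string.h>

/* Same three modes as in the diff, repeated to keep this self-contained. */
enum chain_mode { CHAIN_FLAT, CHAIN_GRAPH_ABS, CHAIN_GRAPH_REL };

/* Hypothetical container for the parsed option. */
struct chain_opt {
	enum chain_mode mode;
	double min_percent;
};

/* Return 1 if tok is a non-empty prefix of name ("fr" matches "fractal"). */
static int is_prefix(const char *tok, const char *name)
{
	size_t len = strlen(tok);

	return len > 0 && strncmp(tok, name, len) == 0;
}

/* Parse "mode[,min_percent]"; return 0 on success, -1 on error. */
int parse_chain_opt(const char *arg, struct chain_opt *opt)
{
	char buf[64];
	char *tok, *endptr;

	if (strlen(arg) >= sizeof(buf))
		return -1;
	strcpy(buf, arg);	/* strtok() writes to its input */

	tok = strtok(buf, ",");
	if (!tok)
		return -1;

	/* Same test order as in the diff: graph, flat, fractal. */
	if (is_prefix(tok, "graph"))
		opt->mode = CHAIN_GRAPH_ABS;
	else if (is_prefix(tok, "flat"))
		opt->mode = CHAIN_FLAT;
	else if (is_prefix(tok, "fractal"))
		opt->mode = CHAIN_GRAPH_REL;
	else
		return -1;

	tok = strtok(NULL, ",");
	if (!tok)
		return 0;	/* keep the caller's min_percent */

	opt->min_percent = strtod(tok, &endptr);
	return tok == endptr ? -1 : 0;
}
```

With prefix matching in this order, a lone "f" is ambiguous and resolves to "flat" because that name is tested before "fractal"; "fr" is the shortest unambiguous spelling of the fractal mode in this sketch.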
* [tip:perfcounters/urgent] perf report: Add "Fractal" mode output - support callchains with relative overhead rate
From: tip-bot for Frederic Weisbecker @ 2009-07-05 9:52 UTC
To: linux-tip-commits
Cc: linux-kernel, acme, anton, paulus, hpa, mingo, a.p.zijlstra,
    efault, jens.axboe, fweisbec, tglx, mingo

Commit-ID:  805d127d62472f17c7d79baa001a7651afe2fa47
Gitweb:     http://git.kernel.org/tip/805d127d62472f17c7d79baa001a7651afe2fa47
Author:     Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Sun, 5 Jul 2009 07:39:21 +0200
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Sun, 5 Jul 2009 10:30:23 +0200

perf report: Add "Fractal" mode output - support callchains with relative overhead rate

The current callchain displays the overhead rates as absolute:
relative to the total overhead.

This patch provides relative overhead percentage, in which each
branch of the callchain tree is an independently instrumented object.

This provides a 'fractal' view of the call-chain profile: each
sub-graph looks like a profile in itself - relative to its parent.

You can produce such output by using the "fractal" mode
that you can abbreviate via f, fr, fra, frac, etc...

./perf report -s sym -c fractal

Example:

     8.46%  [k] copy_user_generic_string
                |
                |--52.01%-- generic_file_aio_read
                |          do_sync_read
                |          vfs_read
                |          |
                |          |--97.20%-- sys_pread64
                |          |          system_call_fastpath
                |          |          pread64
                |          |
                |           --2.81%-- sys_read
                |                     system_call_fastpath
                |                     __read
                |
                |--39.85%-- generic_file_buffered_write
                |          __generic_file_aio_write_nolock
                |          generic_file_aio_write
                |          do_sync_write
                |          reiserfs_file_write
                |          vfs_write
                |          |
                |          |--97.05%-- sys_pwrite64
                |          |          system_call_fastpath
                |          |          __pwrite64
                |          |
                |           --2.95%-- sys_write
                |                     system_call_fastpath
                |                     __write_nocancel
[...]

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246772361-9960-5-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

---
 tools/perf/builtin-report.c |   60 ++++++++++++++++++++----------
 tools/perf/util/callchain.c |   84 +++++++++++++++++++++++++++++++++++--------
 tools/perf/util/callchain.h |   21 ++++++++---
 3 files changed, 124 insertions(+), 41 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 8bd5865..4e5cc26 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -59,10 +59,15 @@ static regex_t parent_regex;

 static int exclude_other = 1;

-static char callchain_default_opt[] = "graph,0.5";
+static char callchain_default_opt[] = "fractal,0.5";
+
 static int callchain;
-static enum chain_mode callchain_mode;
-static double callchain_min_percent = 0.5;
+
+static
+struct callchain_param callchain_param = {
+	.mode = CHAIN_GRAPH_ABS,
+	.min_percent = 0.5
+};

 static u64 sample_type;

@@ -846,9 +851,15 @@ callchain__fprintf_graph(FILE *fp, struct callchain_node *self,
 	struct callchain_node *child;
 	struct callchain_list *chain;
 	int new_depth_mask = depth_mask;
+	u64 new_total;
 	size_t ret = 0;
 	int i;

+	if (callchain_param.mode == CHAIN_GRAPH_REL)
+		new_total = self->cumul_hit;
+	else
+		new_total = total_samples;
+
 	node = rb_first(&self->rb_root);
 	while (node) {
 		child = rb_entry(node, struct callchain_node, rb_node);
@@ -873,10 +884,10 @@ callchain__fprintf_graph(FILE *fp, struct callchain_node *self,
 				continue;
 			ret += ipchain__fprintf_graph(fp, chain, depth,
 						      new_depth_mask, i++,
-						      total_samples,
+						      new_total,
 						      child->cumul_hit);
 		}
-		ret += callchain__fprintf_graph(fp, child, total_samples,
+		ret += callchain__fprintf_graph(fp, child, new_total,
 						depth + 1,
 						new_depth_mask | (1 << depth));
 		node = next;
@@ -925,13 +936,18 @@ hist_entry_callchain__fprintf(FILE *fp, struct hist_entry *self,

 		chain = rb_entry(rb_node, struct callchain_node, rb_node);
 		percent = chain->hit * 100.0 / total_samples;
-		if (callchain_mode == FLAT) {
+		switch (callchain_param.mode) {
+		case CHAIN_FLAT:
 			ret += percent_color_fprintf(fp, " %6.2f%%\n",
 						     percent);
 			ret += callchain__fprintf_flat(fp, chain, total_samples);
-		} else if (callchain_mode == GRAPH) {
+			break;
+		case CHAIN_GRAPH_ABS: /* Falldown */
+		case CHAIN_GRAPH_REL:
 			ret += callchain__fprintf_graph(fp, chain,
 							total_samples, 1, 1);
+		default:
+			break;
 		}
 		ret += fprintf(fp, "\n");
 		rb_node = rb_next(rb_node);
@@ -1219,14 +1235,9 @@ static void output__insert_entry(struct hist_entry *he, u64 min_callchain_hits)
 	struct rb_node *parent = NULL;
 	struct hist_entry *iter;

-	if (callchain) {
-		if (callchain_mode == FLAT)
-			sort_chain_flat(&he->sorted_chain, &he->callchain,
-					min_callchain_hits);
-		else if (callchain_mode == GRAPH)
-			sort_chain_graph(&he->sorted_chain, &he->callchain,
-					 min_callchain_hits);
-	}
+	if (callchain)
+		callchain_param.sort(&he->sorted_chain, &he->callchain,
+				     min_callchain_hits, &callchain_param);

 	while (*p != NULL) {
 		parent = *p;
@@ -1249,7 +1260,7 @@ static void output__resort(u64 total_samples)
 	struct rb_root *tree = &hist;
 	u64 min_callchain_hits;

-	min_callchain_hits = total_samples * (callchain_min_percent / 100);
+	min_callchain_hits = total_samples * (callchain_param.min_percent / 100);

 	if (sort__need_collapse)
 		tree = &collapse_hists;
@@ -1829,22 +1840,31 @@ parse_callchain_opt(const struct option *opt __used, const char *arg,

 	/* get the output mode */
 	if (!strncmp(tok, "graph", strlen(arg)))
-		callchain_mode = GRAPH;
+		callchain_param.mode = CHAIN_GRAPH_ABS;

 	else if (!strncmp(tok, "flat", strlen(arg)))
-		callchain_mode = FLAT;
+		callchain_param.mode = CHAIN_FLAT;
+
+	else if (!strncmp(tok, "fractal", strlen(arg)))
+		callchain_param.mode = CHAIN_GRAPH_REL;
+
 	else
 		return -1;

 	/* get the min percentage */
 	tok = strtok(NULL, ",");
 	if (!tok)
-		return 0;
+		goto setup;

-	callchain_min_percent = strtod(tok, &endptr);
+	callchain_param.min_percent = strtod(tok, &endptr);
 	if (tok == endptr)
 		return -1;

+setup:
+	if (register_callchain_param(&callchain_param) < 0) {
+		fprintf(stderr, "Can't register callchain params\n");
+		return -1;
+	}
 	return 0;
 }

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 5d244af..9d3c814 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -32,13 +32,14 @@ rb_insert_callchain(struct rb_root *root, struct callchain_node *chain,
 		rnode = rb_entry(parent, struct callchain_node, rb_node);

 		switch (mode) {
-		case FLAT:
+		case CHAIN_FLAT:
 			if (rnode->hit < chain->hit)
 				p = &(*p)->rb_left;
 			else
 				p = &(*p)->rb_right;
 			break;
-		case GRAPH:
+		case CHAIN_GRAPH_ABS: /* Falldown */
+		case CHAIN_GRAPH_REL:
 			if (rnode->cumul_hit < chain->cumul_hit)
 				p = &(*p)->rb_left;
 			else
@@ -53,43 +54,96 @@ rb_insert_callchain(struct rb_root *root, struct callchain_node *chain,
 	rb_insert_color(&chain->rb_node, root);
 }

+static void
+__sort_chain_flat(struct rb_root *rb_root, struct callchain_node *node,
+		  u64 min_hit)
+{
+	struct callchain_node *child;
+
+	chain_for_each_child(child, node)
+		__sort_chain_flat(rb_root, child, min_hit);
+
+	if (node->hit && node->hit >= min_hit)
+		rb_insert_callchain(rb_root, node, CHAIN_FLAT);
+}
+
 /*
  * Once we get every callchains from the stream, we can now
  * sort them by hit
  */
-void sort_chain_flat(struct rb_root *rb_root, struct callchain_node *node,
-		     u64 min_hit)
+static void
+sort_chain_flat(struct rb_root *rb_root, struct callchain_node *node,
+		u64 min_hit, struct callchain_param *param __used)
+{
+	__sort_chain_flat(rb_root, node, min_hit);
+}
+
+static void __sort_chain_graph_abs(struct callchain_node *node,
+				   u64 min_hit)
 {
 	struct callchain_node *child;

-	chain_for_each_child(child, node)
-		sort_chain_flat(rb_root, child, min_hit);
+	node->rb_root = RB_ROOT;

-	if (node->hit && node->hit >= min_hit)
-		rb_insert_callchain(rb_root, node, FLAT);
+	chain_for_each_child(child, node) {
+		__sort_chain_graph_abs(child, min_hit);
+		if (child->cumul_hit >= min_hit)
+			rb_insert_callchain(&node->rb_root, child,
+					    CHAIN_GRAPH_ABS);
+	}
+}
+
+static void
+sort_chain_graph_abs(struct rb_root *rb_root, struct callchain_node *chain_root,
+		     u64 min_hit, struct callchain_param *param __used)
+{
+	__sort_chain_graph_abs(chain_root, min_hit);
+	rb_root->rb_node = chain_root->rb_root.rb_node;
 }

-static void __sort_chain_graph(struct callchain_node *node, u64 min_hit)
+static void __sort_chain_graph_rel(struct callchain_node *node,
+				   double min_percent)
 {
 	struct callchain_node *child;
+	u64 min_hit;

 	node->rb_root = RB_ROOT;
+	min_hit = node->cumul_hit * min_percent / 100.0;

 	chain_for_each_child(child, node) {
-		__sort_chain_graph(child, min_hit);
+		__sort_chain_graph_rel(child, min_percent);
 		if (child->cumul_hit >= min_hit)
-			rb_insert_callchain(&node->rb_root, child, GRAPH);
+			rb_insert_callchain(&node->rb_root, child,
+					    CHAIN_GRAPH_REL);
 	}
 }

-void
-sort_chain_graph(struct rb_root *rb_root, struct callchain_node *chain_root,
-		 u64 min_hit)
+static void
+sort_chain_graph_rel(struct rb_root *rb_root, struct callchain_node *chain_root,
+		     u64 min_hit __used, struct callchain_param *param)
 {
-	__sort_chain_graph(chain_root, min_hit);
+	__sort_chain_graph_rel(chain_root, param->min_percent);
 	rb_root->rb_node = chain_root->rb_root.rb_node;
 }

+int register_callchain_param(struct callchain_param *param)
+{
+	switch (param->mode) {
+	case CHAIN_GRAPH_ABS:
+		param->sort = sort_chain_graph_abs;
+		break;
+	case CHAIN_GRAPH_REL:
+		param->sort = sort_chain_graph_rel;
+		break;
+	case CHAIN_FLAT:
+		param->sort = sort_chain_flat;
+		break;
+	default:
+		return -1;
+	}
+	return 0;
+}
+
 /*
  * Create a child for a parent. If inherit_children, then the new child
  * will become the new parent of it's parent children
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index f3e4776..7812122 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -7,8 +7,9 @@
 #include "symbol.h"

 enum chain_mode {
-	FLAT,
-	GRAPH
+	CHAIN_FLAT,
+	CHAIN_GRAPH_ABS,
+	CHAIN_GRAPH_REL
 };

 struct callchain_node {
@@ -23,6 +24,17 @@ struct callchain_node {
 	u64 cumul_hit; /* hit + hits of children */
 };

+struct callchain_param;
+
+typedef void (*sort_chain_func_t)(struct rb_root *, struct callchain_node *,
+				  u64, struct callchain_param *);
+
+struct callchain_param {
+	enum chain_mode mode;
+	double min_percent;
+	sort_chain_func_t sort;
+};
+
 struct callchain_list {
 	u64 ip;
 	struct symbol *sym;
@@ -36,10 +48,7 @@ static inline void callchain_init(struct callchain_node *node)
 	INIT_LIST_HEAD(&node->val);
 }

+int register_callchain_param(struct callchain_param *param);
 void append_chain(struct callchain_node *root, struct ip_callchain *chain,
 		  struct symbol **syms);
-void sort_chain_flat(struct rb_root *rb_root, struct callchain_node *node,
-		     u64 min_hit);
-void sort_chain_graph(struct rb_root *rb_root, struct callchain_node *node,
-		      u64 min_hit);
 #endif
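The practical difference between the absolute and the fractal filter in the diff above is where the minimum-hit threshold comes from. The following C sketch restates the two computations from output__resort() and __sort_chain_graph_rel() as standalone functions. The function names are hypothetical, uint64_t stands in for the kernel's u64, and the sample counts in the usage note are made up for illustration.

```c
#include <stdint.h>

/*
 * Absolute modes: one threshold for the whole tree, derived from the
 * total number of samples (as in output__resort() in the diff).
 */
uint64_t min_hits_abs(uint64_t total_samples, double min_percent)
{
	return (uint64_t)(total_samples * (min_percent / 100));
}

/*
 * Fractal mode: every node computes its own threshold from its own
 * cumulative hits (as in __sort_chain_graph_rel() in the diff).
 */
uint64_t min_hits_rel(uint64_t node_cumul_hit, double min_percent)
{
	return (uint64_t)(node_cumul_hit * min_percent / 100.0);
}

/* A child branch is kept when its cumulative hits reach the threshold. */
int child_is_kept(uint64_t child_cumul_hit, uint64_t min_hit)
{
	return child_cumul_hit >= min_hit;
}
```

With 10000 samples in total and the default 0.5%, the absolute threshold is 50 hits, so a child branch with 10 hits is dropped. Under a parent with 200 cumulative hits the fractal threshold is only 1 hit, so the same child is kept: it is 5% of its parent even though it is 0.1% of the profile.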
* [tip:perfcounters/urgent] perf report: Warn on callchain output request from non-callchain file
From: tip-bot for Frederic Weisbecker @ 2009-07-05 9:51 UTC
To: linux-tip-commits
Cc: linux-kernel, anton, paulus, acme, hpa, mingo, a.p.zijlstra,
    efault, jens.axboe, fweisbec, tglx, mingo

Commit-ID:  91b4eaea93f5be95f4477554399680a53aff2343
Gitweb:     http://git.kernel.org/tip/91b4eaea93f5be95f4477554399680a53aff2343
Author:     Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Sun, 5 Jul 2009 07:39:17 +0200
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Sun, 5 Jul 2009 10:30:21 +0200

perf report: Warn on callchain output request from non-callchain file

perf report segfaults while trying to handle callchains from a
non-callchain data file.

Instead of a segfault, print a useful message to the user.

Reported-by: Jens Axboe <jens.axboe@oracle.com>
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246772361-9960-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

---
 tools/perf/builtin-report.c |   16 +++++++++++++---
 1 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index fa937f5..9f9575a 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1684,9 +1684,19 @@ static int __cmd_report(void)

 	sample_type = perf_header__sample_type();

-	if (sort__has_parent && !(sample_type & PERF_SAMPLE_CALLCHAIN)) {
-		fprintf(stderr, "selected --sort parent, but no callchain data\n");
-		exit(-1);
+	if (!(sample_type & PERF_SAMPLE_CALLCHAIN)) {
+		if (sort__has_parent) {
+			fprintf(stderr, "selected --sort parent, but no"
+					" callchain data. Did you call"
+					" perf record without -g?\n");
+			exit(-1);
+		}
+		if (callchain) {
+			fprintf(stderr, "selected -c but no callchain data."
+					" Did you call perf record without"
+					" -g?\n");
+			exit(-1);
+		}
 	}

 	if (load_kernel() < 0) {
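The check added by this patch can be restated as a small function that is easy to test in isolation. The C sketch below is an illustration only: callchain_request_error is a hypothetical name, and SAMPLE_CALLCHAIN_BIT is a placeholder for PERF_SAMPLE_CALLCHAIN, whose real value comes from the perf headers and is not assumed here.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit; the real PERF_SAMPLE_CALLCHAIN is defined by perf. */
#define SAMPLE_CALLCHAIN_BIT (1ULL << 0)

/*
 * Hypothetical helper: return the message to print when the user asks
 * for callchain-based output but the data file has no callchains, or
 * NULL when the request can be satisfied.
 */
const char *callchain_request_error(uint64_t sample_type,
				    int sort_has_parent,
				    int callchain_requested)
{
	if (sample_type & SAMPLE_CALLCHAIN_BIT)
		return NULL;

	if (sort_has_parent)
		return "selected --sort parent, but no callchain data."
		       " Did you call perf record without -g?";

	if (callchain_requested)
		return "selected -c but no callchain data."
		       " Did you call perf record without -g?";

	return NULL;
}
```

Returning the message instead of calling exit() is a choice made for the sketch so that it can be exercised from a test; the patch itself prints to stderr and exits.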