* [PATCH] perf report: Fix wrong LBR block sorting
@ 2021-04-07 2:44 Jin Yao
2021-04-07 13:49 ` Andi Kleen
0 siblings, 1 reply; 3+ messages in thread
From: Jin Yao @ 2021-04-07 2:44 UTC (permalink / raw)
To: acme, jolsa, peterz, mingo, alexander.shishkin
Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao
When '--total-cycles' is specified, it supports sorting for all blocks
by 'Sampled Cycles%'. This is useful to concentrate on the globally
hottest blocks.
'Sampled Cycles%' - block sampled cycles aggregation / total sampled cycles
But in current code, it doesn't use the cycles aggregation. Part of 'cycles'
counting is possibly dropped for some overlap jumps. But for identifying
the hot block, we always need the full cycles.
# perf record -b ./triad_loop
# perf report --total-cycles --stdio
Before:
#
# Sampled Cycles% Sampled Cycles Avg Cycles% Avg Cycles [Program Block Range] Shared Object
# ............... .............. ........... .......... ...................................................................... ....................
#
0.81% 793 4.32% 793 [setup-vdso.h:34 -> setup-vdso.h:40] ld-2.27.so
0.49% 480 0.87% 160 [native_write_msr+0 -> native_write_msr+16] [kernel.kallsyms]
0.48% 476 0.52% 95 [native_read_msr+0 -> native_read_msr+29] [kernel.kallsyms]
0.31% 303 1.65% 303 [nmi_restore+0 -> nmi_restore+37] [kernel.kallsyms]
0.26% 255 1.39% 255 [nohz_balance_exit_idle+75 -> nohz_balance_exit_idle+162] [kernel.kallsyms]
0.24% 234 1.28% 234 [end_repeat_nmi+67 -> end_repeat_nmi+83] [kernel.kallsyms]
0.23% 227 1.24% 227 [__irqentry_text_end+96 -> __irqentry_text_end+126] [kernel.kallsyms]
0.20% 194 1.06% 194 [native_set_debugreg+52 -> native_set_debugreg+56] [kernel.kallsyms]
0.11% 106 0.14% 26 [native_sched_clock+0 -> native_sched_clock+98] [kernel.kallsyms]
0.10% 97 0.53% 97 [trigger_load_balance+0 -> trigger_load_balance+67] [kernel.kallsyms]
0.09% 85 0.46% 85 [get-dynamic-info.h:102 -> get-dynamic-info.h:111] ld-2.27.so
...
0.00% 92.7K 0.02% 4 [triad_loop.c:64 -> triad_loop.c:65] triad_loop
The hottest block '[triad_loop.c:64 -> triad_loop.c:65]' is not at
the top of output.
After:
# Sampled Cycles% Sampled Cycles Avg Cycles% Avg Cycles [Program Block Range] Shared Object
# ............... .............. ........... .......... ...................................................................... ....................
#
94.35% 92.7K 0.02% 4 [triad_loop.c:64 -> triad_loop.c:65] triad_loop
0.81% 793 4.32% 793 [setup-vdso.h:34 -> setup-vdso.h:40] ld-2.27.so
0.49% 480 0.87% 160 [native_write_msr+0 -> native_write_msr+16] [kernel.kallsyms]
0.48% 476 0.52% 95 [native_read_msr+0 -> native_read_msr+29] [kernel.kallsyms]
0.31% 303 1.65% 303 [nmi_restore+0 -> nmi_restore+37] [kernel.kallsyms]
0.26% 255 1.39% 255 [nohz_balance_exit_idle+75 -> nohz_balance_exit_idle+162] [kernel.kallsyms]
0.24% 234 1.28% 234 [end_repeat_nmi+67 -> end_repeat_nmi+83] [kernel.kallsyms]
0.23% 227 1.24% 227 [__irqentry_text_end+96 -> __irqentry_text_end+126] [kernel.kallsyms]
0.20% 194 1.06% 194 [native_set_debugreg+52 -> native_set_debugreg+56] [kernel.kallsyms]
0.11% 106 0.14% 26 [native_sched_clock+0 -> native_sched_clock+98] [kernel.kallsyms]
0.10% 97 0.53% 97 [trigger_load_balance+0 -> trigger_load_balance+67] [kernel.kallsyms]
0.09% 85 0.46% 85 [get-dynamic-info.h:102 -> get-dynamic-info.h:111] ld-2.27.so
0.08% 82 0.06% 11 [intel_pmu_drain_pebs_nhm+580 -> intel_pmu_drain_pebs_nhm+627] [kernel.kallsyms]
0.08% 77 0.42% 77 [lru_add_drain_cpu+0 -> lru_add_drain_cpu+133] [kernel.kallsyms]
0.08% 74 0.10% 18 [handle_pmi_common+271 -> handle_pmi_common+310] [kernel.kallsyms]
0.08% 74 0.40% 74 [get-dynamic-info.h:131 -> get-dynamic-info.h:157] ld-2.27.so
0.07% 69 0.09% 17 [intel_pmu_drain_pebs_nhm+432 -> intel_pmu_drain_pebs_nhm+468] [kernel.kallsyms]
Now the hottest block is reported at the top of output.
Fixes: b65a7d372b1a ("perf hist: Support block formats with compare/sort/display")
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
tools/perf/util/block-info.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
index 423ec69bda6c..5ecd4f401f32 100644
--- a/tools/perf/util/block-info.c
+++ b/tools/perf/util/block-info.c
@@ -201,7 +201,7 @@ static int block_total_cycles_pct_entry(struct perf_hpp_fmt *fmt,
double ratio = 0.0;
if (block_fmt->total_cycles)
- ratio = (double)bi->cycles / (double)block_fmt->total_cycles;
+ ratio = (double)bi->cycles_aggr / (double)block_fmt->total_cycles;
return color_pct(hpp, block_fmt->width, 100.0 * ratio);
}
@@ -216,9 +216,9 @@ static int64_t block_total_cycles_pct_sort(struct perf_hpp_fmt *fmt,
double l, r;
if (block_fmt->total_cycles) {
- l = ((double)bi_l->cycles /
+ l = ((double)bi_l->cycles_aggr /
(double)block_fmt->total_cycles) * 100000.0;
- r = ((double)bi_r->cycles /
+ r = ((double)bi_r->cycles_aggr /
(double)block_fmt->total_cycles) * 100000.0;
return (int64_t)l - (int64_t)r;
}
--
2.17.1
^ permalink raw reply related [flat|nested] 3+ messages in thread* Re: [PATCH] perf report: Fix wrong LBR block sorting
2021-04-07 2:44 [PATCH] perf report: Fix wrong LBR block sorting Jin Yao
@ 2021-04-07 13:49 ` Andi Kleen
2021-04-07 19:22 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 3+ messages in thread
From: Andi Kleen @ 2021-04-07 13:49 UTC (permalink / raw)
To: Jin Yao
Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel,
kan.liang, yao.jin
> Now the hottest block is reported at the top of output.
>
> Fixes: b65a7d372b1a ("perf hist: Support block formats with compare/sort/display")
> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
-Andi
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: [PATCH] perf report: Fix wrong LBR block sorting
2021-04-07 13:49 ` Andi Kleen
@ 2021-04-07 19:22 ` Arnaldo Carvalho de Melo
0 siblings, 0 replies; 3+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-04-07 19:22 UTC (permalink / raw)
To: Andi Kleen
Cc: Jin Yao, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel,
kan.liang, yao.jin
Em Wed, Apr 07, 2021 at 06:49:57AM -0700, Andi Kleen escreveu:
> > Now the hottest block is reported at the top of output.
> >
> > Fixes: b65a7d372b1a ("perf hist: Support block formats with compare/sort/display")
> > Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
>
>
> Reviewed-by: Andi Kleen <ak@linux.intel.com>
Thanks, applied.
- Arnaldo
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2021-04-07 19:22 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-04-07 2:44 [PATCH] perf report: Fix wrong LBR block sorting Jin Yao
2021-04-07 13:49 ` Andi Kleen
2021-04-07 19:22 ` Arnaldo Carvalho de Melo
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox