public inbox for linux-kernel@vger.kernel.org
* [PATCH] perf report: Fix wrong LBR block sorting
From: Jin Yao @ 2021-04-07  2:44 UTC
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

When '--total-cycles' is specified, perf report sorts all blocks by
'Sampled Cycles%'. This makes it easy to concentrate on the globally
hottest blocks.

'Sampled Cycles%' = aggregated sampled cycles of the block / total sampled cycles

But the current code doesn't use the aggregated cycles. Part of the
'cycles' count may be dropped for some overlapping jumps, but to identify
the hot block we always need the full cycle count.

  # perf record -b ./triad_loop
  # perf report --total-cycles --stdio

Before:

  #
  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                                   [Program Block Range]         Shared Object
  # ...............  ..............  ...........  ..........  ......................................................................  ....................
  #
              0.81%             793        4.32%         793                                    [setup-vdso.h:34 -> setup-vdso.h:40]            ld-2.27.so
              0.49%             480        0.87%         160                             [native_write_msr+0 -> native_write_msr+16]     [kernel.kallsyms]
              0.48%             476        0.52%          95                               [native_read_msr+0 -> native_read_msr+29]     [kernel.kallsyms]
              0.31%             303        1.65%         303                                       [nmi_restore+0 -> nmi_restore+37]     [kernel.kallsyms]
              0.26%             255        1.39%         255               [nohz_balance_exit_idle+75 -> nohz_balance_exit_idle+162]     [kernel.kallsyms]
              0.24%             234        1.28%         234                                [end_repeat_nmi+67 -> end_repeat_nmi+83]     [kernel.kallsyms]
              0.23%             227        1.24%         227                     [__irqentry_text_end+96 -> __irqentry_text_end+126]     [kernel.kallsyms]
              0.20%             194        1.06%         194                      [native_set_debugreg+52 -> native_set_debugreg+56]     [kernel.kallsyms]
              0.11%             106        0.14%          26                         [native_sched_clock+0 -> native_sched_clock+98]     [kernel.kallsyms]
              0.10%              97        0.53%          97                     [trigger_load_balance+0 -> trigger_load_balance+67]     [kernel.kallsyms]
              0.09%              85        0.46%          85                      [get-dynamic-info.h:102 -> get-dynamic-info.h:111]            ld-2.27.so
  ...
              0.00%           92.7K        0.02%           4                                    [triad_loop.c:64 -> triad_loop.c:65]            triad_loop

The hottest block '[triad_loop.c:64 -> triad_loop.c:65]' shows 0.00%
despite 92.7K sampled cycles, so it is not at the top of the output.

After:

  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                                   [Program Block Range]         Shared Object
  # ...............  ..............  ...........  ..........  ......................................................................  ....................
  #
             94.35%           92.7K        0.02%           4                                    [triad_loop.c:64 -> triad_loop.c:65]            triad_loop
              0.81%             793        4.32%         793                                    [setup-vdso.h:34 -> setup-vdso.h:40]            ld-2.27.so
              0.49%             480        0.87%         160                             [native_write_msr+0 -> native_write_msr+16]     [kernel.kallsyms]
              0.48%             476        0.52%          95                               [native_read_msr+0 -> native_read_msr+29]     [kernel.kallsyms]
              0.31%             303        1.65%         303                                       [nmi_restore+0 -> nmi_restore+37]     [kernel.kallsyms]
              0.26%             255        1.39%         255               [nohz_balance_exit_idle+75 -> nohz_balance_exit_idle+162]     [kernel.kallsyms]
              0.24%             234        1.28%         234                                [end_repeat_nmi+67 -> end_repeat_nmi+83]     [kernel.kallsyms]
              0.23%             227        1.24%         227                     [__irqentry_text_end+96 -> __irqentry_text_end+126]     [kernel.kallsyms]
              0.20%             194        1.06%         194                      [native_set_debugreg+52 -> native_set_debugreg+56]     [kernel.kallsyms]
              0.11%             106        0.14%          26                         [native_sched_clock+0 -> native_sched_clock+98]     [kernel.kallsyms]
              0.10%              97        0.53%          97                     [trigger_load_balance+0 -> trigger_load_balance+67]     [kernel.kallsyms]
              0.09%              85        0.46%          85                      [get-dynamic-info.h:102 -> get-dynamic-info.h:111]            ld-2.27.so
              0.08%              82        0.06%          11          [intel_pmu_drain_pebs_nhm+580 -> intel_pmu_drain_pebs_nhm+627]     [kernel.kallsyms]
              0.08%              77        0.42%          77                          [lru_add_drain_cpu+0 -> lru_add_drain_cpu+133]     [kernel.kallsyms]
              0.08%              74        0.10%          18                        [handle_pmi_common+271 -> handle_pmi_common+310]     [kernel.kallsyms]
              0.08%              74        0.40%          74                      [get-dynamic-info.h:131 -> get-dynamic-info.h:157]            ld-2.27.so
              0.07%              69        0.09%          17          [intel_pmu_drain_pebs_nhm+432 -> intel_pmu_drain_pebs_nhm+468]     [kernel.kallsyms]

Now the hottest block is reported at the top of output.

Fixes: b65a7d372b1a ("perf hist: Support block formats with compare/sort/display")
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/util/block-info.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
index 423ec69bda6c..5ecd4f401f32 100644
--- a/tools/perf/util/block-info.c
+++ b/tools/perf/util/block-info.c
@@ -201,7 +201,7 @@ static int block_total_cycles_pct_entry(struct perf_hpp_fmt *fmt,
 	double ratio = 0.0;
 
 	if (block_fmt->total_cycles)
-		ratio = (double)bi->cycles / (double)block_fmt->total_cycles;
+		ratio = (double)bi->cycles_aggr / (double)block_fmt->total_cycles;
 
 	return color_pct(hpp, block_fmt->width, 100.0 * ratio);
 }
@@ -216,9 +216,9 @@ static int64_t block_total_cycles_pct_sort(struct perf_hpp_fmt *fmt,
 	double l, r;
 
 	if (block_fmt->total_cycles) {
-		l = ((double)bi_l->cycles /
+		l = ((double)bi_l->cycles_aggr /
 			(double)block_fmt->total_cycles) * 100000.0;
-		r = ((double)bi_r->cycles /
+		r = ((double)bi_r->cycles_aggr /
 			(double)block_fmt->total_cycles) * 100000.0;
 		return (int64_t)l - (int64_t)r;
 	}
-- 
2.17.1



* Re: [PATCH] perf report: Fix wrong LBR block sorting
From: Andi Kleen @ 2021-04-07 13:49 UTC
  To: Jin Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel,
	kan.liang, yao.jin

> Now the hottest block is reported at the top of output.
> 
> Fixes: b65a7d372b1a ("perf hist: Support block formats with compare/sort/display")
> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>


Reviewed-by: Andi Kleen <ak@linux.intel.com>
-Andi


* Re: [PATCH] perf report: Fix wrong LBR block sorting
From: Arnaldo Carvalho de Melo @ 2021-04-07 19:22 UTC
  To: Andi Kleen
  Cc: Jin Yao, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel,
	kan.liang, yao.jin

On Wed, Apr 07, 2021 at 06:49:57AM -0700, Andi Kleen wrote:
> > Now the hottest block is reported at the top of output.
> > 
> > Fixes: b65a7d372b1a ("perf hist: Support block formats with compare/sort/display")
> > Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
> 
> 
> Reviewed-by: Andi Kleen <ak@linux.intel.com>

Thanks, applied.

- Arnaldo

