* perf stat VERSUS perf stat report :: TopDown Metrics
@ 2024-05-21 16:54 Michael Petlan
2024-05-23 17:31 ` Liang, Kan
0 siblings, 1 reply; 2+ messages in thread
From: Michael Petlan @ 2024-05-21 16:54 UTC (permalink / raw)
To: linux-perf-users
Cc: vmolnaro, kan.liang, irogers, acme, ak, alexander.shishkin
Hello!
I have a test for perf-stat record/report functionality, which compares the outputs,
basically whether `perf stat report` is able to reconstruct the same results as
printed by `perf stat`. In the Intel environments with TopDown events/metrics, the
test started failing because perf-stat-report handles the metrics differently:
================================ perf stat ================================
Performance counter stats for 'ls':
0.69 msec task-clock # 0.477 CPUs utilized
0 context-switches # 0.000 /sec
0 cpu-migrations # 0.000 /sec
97 page-faults # 139.648 K/sec
1,776,212 cycles # 2.557 GHz
1,970,435 instructions # 1.11 insn per cycle
403,140 branches # 580.389 M/sec
11,837 branch-misses # 2.94% of all branches
TopdownL1 # 30.0 % tma_backend_bound
# 13.0 % tma_bad_speculation
# 37.5 % tma_frontend_bound
# 19.5 % tma_retiring
TopdownL2 # 12.1 % tma_branch_mispredicts
# 11.7 % tma_core_bound
# 13.2 % tma_fetch_bandwidth
# 24.3 % tma_fetch_latency
# 3.1 % tma_heavy_operations
# 16.3 % tma_light_operations
# 1.0 % tma_machine_clears
# 18.3 % tma_memory_bound
0.001456908 seconds time elapsed
0.000000000 seconds user
0.001647000 seconds sys
================================ perf stat report ================================
Performance counter stats for '/usr/bin/perf stat record ls':
0.69 msec task-clock # 0.477 CPUs utilized
0 context-switches # 0.000 /sec
0 cpu-migrations # 0.000 /sec
97 page-faults # 139.648 K/sec
1,776,212 cycles # 2.557 GHz
1,970,435 instructions # 1.11 insn per cycle
403,140 branches # 580.389 M/sec
11,837 branch-misses # 2.94% of all branches
10,657,272 TOPDOWN.SLOTS
2,089,661 topdown-retiring
4,053,942 topdown-fe-bound
1,964,281 topdown-mem-bound
3,218,078 topdown-be-bound
334,345 topdown-heavy-ops
1,295,589 topdown-br-mispredict
2,632,973 topdown-fetch-lat
1,379,176 topdown-bad-spec
21,248 INT_MISC.UOP_DROPPING # 30.590 M/sec
0.001456908 seconds time elapsed
While perf-stat (and perf-stat-record) calculates the percentages, perf-stat-report
just prints the raw numbers. Thinking about it, it might be useful to know the raw
numbers too, but rather via an option, while by default, both should behave the same,
shouldn't they? Is perf-stat-report missing some metric postprocessing?
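To illustrate the postprocessing in question, here is a minimal Python sketch that derives the TopdownL1 percentages from the raw counts printed by `perf stat report`. The formulas are an assumption (my understanding of the level-1 TMA definitions for this class of Intel CPUs); they are not stated anywhere in this thread, and the function name `topdown_l1` is purely illustrative.

```python
# Raw counts copied from the `perf stat report` output above.
raw = {
    "TOPDOWN.SLOTS": 10_657_272,
    "topdown-retiring": 2_089_661,
    "topdown-fe-bound": 4_053_942,
    "topdown-be-bound": 3_218_078,
    "topdown-bad-spec": 1_379_176,
    "INT_MISC.UOP_DROPPING": 21_248,
}


def topdown_l1(c):
    """Level-1 breakdown as fractions of slots (assumed formulas)."""
    # The four level-1 counts are normalized by their own sum.
    total = (c["topdown-fe-bound"] + c["topdown-bad-spec"]
             + c["topdown-retiring"] + c["topdown-be-bound"])
    retiring = c["topdown-retiring"] / total
    backend = c["topdown-be-bound"] / total
    # Assumption: frontend bound is corrected by the dropped-uop share.
    frontend = (c["topdown-fe-bound"] / total
                - c["INT_MISC.UOP_DROPPING"] / c["TOPDOWN.SLOTS"])
    # Assumption: bad speculation is whatever remains, clamped at zero.
    bad_spec = max(1.0 - (frontend + backend + retiring), 0.0)
    return {
        "tma_backend_bound": backend,
        "tma_bad_speculation": bad_spec,
        "tma_frontend_bound": frontend,
        "tma_retiring": retiring,
    }


l1 = topdown_l1(raw)
for name, share in l1.items():
    print(f"{100 * share:5.1f} %  {name}")
```

Under these assumptions the four values agree with the TopdownL1 block printed by plain `perf stat` (30.0, 13.0, 37.5 and 19.5 %), which suggests the recorded data is sufficient and only the metric evaluation step is skipped.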
Thanks!
Michael
* Re: perf stat VERSUS perf stat report :: TopDown Metrics
2024-05-21 16:54 perf stat VERSUS perf stat report :: TopDown Metrics Michael Petlan
@ 2024-05-23 17:31 ` Liang, Kan
0 siblings, 0 replies; 2+ messages in thread
From: Liang, Kan @ 2024-05-23 17:31 UTC (permalink / raw)
To: Michael Petlan, linux-perf-users
Cc: vmolnaro, irogers, acme, ak, alexander.shishkin
On 2024-05-21 12:54 p.m., Michael Petlan wrote:
> Hello!
>
> I have a test for perf-stat record/report functionality, which compares the outputs,
> basically whether `perf stat report` is able to reconstruct the same results as
> printed by `perf stat`. In the Intel environments with TopDown events/metrics, the
> test started failing because perf-stat-report handles the metrics differently:
>
> ================================ perf stat ================================
> Performance counter stats for 'ls':
>
> 0.69 msec task-clock # 0.477 CPUs utilized
> 0 context-switches # 0.000 /sec
> 0 cpu-migrations # 0.000 /sec
> 97 page-faults # 139.648 K/sec
> 1,776,212 cycles # 2.557 GHz
> 1,970,435 instructions # 1.11 insn per cycle
> 403,140 branches # 580.389 M/sec
> 11,837 branch-misses # 2.94% of all branches
> TopdownL1 # 30.0 % tma_backend_bound
> # 13.0 % tma_bad_speculation
> # 37.5 % tma_frontend_bound
> # 19.5 % tma_retiring
> TopdownL2 # 12.1 % tma_branch_mispredicts
> # 11.7 % tma_core_bound
> # 13.2 % tma_fetch_bandwidth
> # 24.3 % tma_fetch_latency
> # 3.1 % tma_heavy_operations
> # 16.3 % tma_light_operations
> # 1.0 % tma_machine_clears
> # 18.3 % tma_memory_bound
>
> 0.001456908 seconds time elapsed
>
> 0.000000000 seconds user
> 0.001647000 seconds sys
>
> ================================ perf stat report ================================
> Performance counter stats for '/usr/bin/perf stat record ls':
>
> 0.69 msec task-clock # 0.477 CPUs utilized
> 0 context-switches # 0.000 /sec
> 0 cpu-migrations # 0.000 /sec
> 97 page-faults # 139.648 K/sec
> 1,776,212 cycles # 2.557 GHz
> 1,970,435 instructions # 1.11 insn per cycle
> 403,140 branches # 580.389 M/sec
> 11,837 branch-misses # 2.94% of all branches
> 10,657,272 TOPDOWN.SLOTS
> 2,089,661 topdown-retiring
> 4,053,942 topdown-fe-bound
> 1,964,281 topdown-mem-bound
> 3,218,078 topdown-be-bound
> 334,345 topdown-heavy-ops
> 1,295,589 topdown-br-mispredict
> 2,632,973 topdown-fetch-lat
> 1,379,176 topdown-bad-spec
> 21,248 INT_MISC.UOP_DROPPING # 30.590 M/sec
>
> 0.001456908 seconds time elapsed
>
> While perf-stat (and perf-stat-record) calculates the percentages, perf-stat-report
> just prints the raw numbers. Thinking about it, it might be useful to know the raw
> numbers too, but rather via an option, while by default, both should behave the same,
> shouldn't they? Is perf-stat-report missing some metric postprocessing?
perf-stat-record/report doesn't support metrics well. It only records the
event values, so all the metric information is lost.
$ ./perf stat record -M cpi sleep 1
Performance counter stats for 'sleep 1':
183,590 INST_RETIRED.ANY:u # 2.3 per_instr cpi
429,375 CPU_CLK_UNHALTED.THREAD:u
$ ./perf stat report
Performance counter stats for 'perf stat record -M cpi sleep 1':
183,590 INST_RETIRED.ANY:u
429,375 CPU_CLK_UNHALTED.THREAD:u
To fix it, we need to record the metrics information as well. I will
take a look.
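As a small worked example of what gets lost, the `cpi` value can be recomputed by hand from the two recorded counts. This Python sketch assumes that the `cpi` metric is simply cycles divided by retired instructions; the variable names are illustrative only.

```python
# Counts copied from the `perf stat report` output above.
cycles = 429_375        # CPU_CLK_UNHALTED.THREAD:u
instructions = 183_590  # INST_RETIRED.ANY:u

# Assumption: cpi = cycles per retired instruction.
cpi = cycles / instructions
print(f"{cpi:.1f} per_instr cpi")
```

The result rounds to the 2.3 shown by `perf stat record`, so the report side has the inputs it needs; what is missing is the information about which metric to evaluate.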
Thanks,
Kan