linux-perf-users.vger.kernel.org archive mirror
* perf stat VERSUS perf stat report :: TopDown Metrics
From: Michael Petlan @ 2024-05-21 16:54 UTC
  To: linux-perf-users
  Cc: vmolnaro, kan.liang, irogers, acme, ak, alexander.shishkin

Hello!

I have a test for the perf-stat record/report functionality that compares the two
outputs, i.e. it checks whether `perf stat report` can reconstruct the same results
that `perf stat` prints. On Intel machines with TopDown events/metrics, the test
started failing because perf-stat-report handles the metrics differently:

================================ perf stat ================================
 Performance counter stats for 'ls':

              0.69 msec task-clock                   #    0.477 CPUs utilized
                 0      context-switches             #    0.000 /sec
                 0      cpu-migrations               #    0.000 /sec
                97      page-faults                  #  139.648 K/sec
         1,776,212      cycles                       #    2.557 GHz
         1,970,435      instructions                 #    1.11  insn per cycle
           403,140      branches                     #  580.389 M/sec
            11,837      branch-misses                #    2.94% of all branches
                        TopdownL1             #     30.0 %  tma_backend_bound
                                              #     13.0 %  tma_bad_speculation
                                              #     37.5 %  tma_frontend_bound
                                              #     19.5 %  tma_retiring
                        TopdownL2             #     12.1 %  tma_branch_mispredicts
                                              #     11.7 %  tma_core_bound
                                              #     13.2 %  tma_fetch_bandwidth
                                              #     24.3 %  tma_fetch_latency
                                              #      3.1 %  tma_heavy_operations
                                              #     16.3 %  tma_light_operations
                                              #      1.0 %  tma_machine_clears
                                              #     18.3 %  tma_memory_bound

       0.001456908 seconds time elapsed

       0.000000000 seconds user
       0.001647000 seconds sys

================================ perf stat report ================================
 Performance counter stats for '/usr/bin/perf stat record ls':

              0.69 msec task-clock                   #    0.477 CPUs utilized
                 0      context-switches             #    0.000 /sec
                 0      cpu-migrations               #    0.000 /sec
                97      page-faults                  #  139.648 K/sec
         1,776,212      cycles                       #    2.557 GHz
         1,970,435      instructions                 #    1.11  insn per cycle
           403,140      branches                     #  580.389 M/sec
            11,837      branch-misses                #    2.94% of all branches
        10,657,272      TOPDOWN.SLOTS
         2,089,661      topdown-retiring
         4,053,942      topdown-fe-bound
         1,964,281      topdown-mem-bound
         3,218,078      topdown-be-bound
           334,345      topdown-heavy-ops
         1,295,589      topdown-br-mispredict
         2,632,973      topdown-fetch-lat
         1,379,176      topdown-bad-spec
            21,248      INT_MISC.UOP_DROPPING        #   30.590 M/sec

       0.001456908 seconds time elapsed

While perf-stat (and perf-stat-record) calculates the percentages, perf-stat-report
just prints the raw numbers. The raw numbers can be useful too, but they would fit
better behind an option; by default, both should behave the same, shouldn't they?
Is perf-stat-report missing some metric postprocessing?
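
For reference, the TopdownL1 percentages can be re-derived from the raw counts in
the report output. The following is a minimal Python sketch, not perf's own code.
The formulas are an assumption based on the usual Intel TMA level-1 definitions
(each topdown-* count divided by the sum of the four level-1 counts, with an
INT_MISC.UOP_DROPPING / TOPDOWN.SLOTS correction on the frontend share), and
`topdown_l1` is a hypothetical helper name.

```python
# Raw counts copied from the `perf stat report` output above.
counts = {
    "TOPDOWN.SLOTS": 10_657_272,
    "topdown-retiring": 2_089_661,
    "topdown-fe-bound": 4_053_942,
    "topdown-be-bound": 3_218_078,
    "topdown-bad-spec": 1_379_176,
    "INT_MISC.UOP_DROPPING": 21_248,
}


def topdown_l1(c):
    """Return the four level-1 fractions (assumed TMA formulas, see lead-in)."""
    # Normalise by the sum of the four level-1 counts, not by TOPDOWN.SLOTS.
    total = (c["topdown-retiring"] + c["topdown-fe-bound"]
             + c["topdown-be-bound"] + c["topdown-bad-spec"])
    frontend = (c["topdown-fe-bound"] / total
                - c["INT_MISC.UOP_DROPPING"] / c["TOPDOWN.SLOTS"])
    backend = c["topdown-be-bound"] / total
    retiring = c["topdown-retiring"] / total
    # Bad speculation is taken as the remainder, clamped at zero.
    bad_spec = max(1.0 - (frontend + backend + retiring), 0.0)
    return {
        "tma_backend_bound": backend,
        "tma_bad_speculation": bad_spec,
        "tma_frontend_bound": frontend,
        "tma_retiring": retiring,
    }


l1 = topdown_l1(counts)
for name, value in l1.items():
    print(f"{100 * value:5.1f} %  {name}")
```

With the counts above, this matches the TopdownL1 values printed by `perf stat`
(30.0, 13.0, 37.5 and 19.5 percent), which suggests the recorded data is sufficient
and only the postprocessing step is missing from the report path.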

Thanks!

Michael



* Re: perf stat VERSUS perf stat report :: TopDown Metrics
From: Liang, Kan @ 2024-05-23 17:31 UTC
  To: Michael Petlan, linux-perf-users
  Cc: vmolnaro, irogers, acme, ak, alexander.shishkin



On 2024-05-21 12:54 p.m., Michael Petlan wrote:
> Hello!
> 
> I have a test for perf-stat record/report functionality, which compares the outputs,
> basically whether `perf stat report` is able to reconstruct the same results as
> printed by `perf stat`. In the Intel environments with TopDown events/metrics, the
> test started failing on the fact that perf-stat-report has a different approach to
> handle the metrics:
> 
> ================================ perf stat ================================
>  Performance counter stats for 'ls':
> 
>               0.69 msec task-clock                   #    0.477 CPUs utilized
>                  0      context-switches             #    0.000 /sec
>                  0      cpu-migrations               #    0.000 /sec
>                 97      page-faults                  #  139.648 K/sec
>          1,776,212      cycles                       #    2.557 GHz
>          1,970,435      instructions                 #    1.11  insn per cycle
>            403,140      branches                     #  580.389 M/sec
>             11,837      branch-misses                #    2.94% of all branches
>                         TopdownL1             #     30.0 %  tma_backend_bound
>                                               #     13.0 %  tma_bad_speculation
>                                               #     37.5 %  tma_frontend_bound
>                                               #     19.5 %  tma_retiring
>                         TopdownL2             #     12.1 %  tma_branch_mispredicts
>                                               #     11.7 %  tma_core_bound
>                                               #     13.2 %  tma_fetch_bandwidth
>                                               #     24.3 %  tma_fetch_latency
>                                               #      3.1 %  tma_heavy_operations
>                                               #     16.3 %  tma_light_operations
>                                               #      1.0 %  tma_machine_clears
>                                               #     18.3 %  tma_memory_bound
> 
>        0.001456908 seconds time elapsed
> 
>        0.000000000 seconds user
>        0.001647000 seconds sys
> 
> ================================ perf stat report ================================
>  Performance counter stats for '/usr/bin/perf stat record ls':
> 
>               0.69 msec task-clock                   #    0.477 CPUs utilized
>                  0      context-switches             #    0.000 /sec
>                  0      cpu-migrations               #    0.000 /sec
>                 97      page-faults                  #  139.648 K/sec
>          1,776,212      cycles                       #    2.557 GHz
>          1,970,435      instructions                 #    1.11  insn per cycle
>            403,140      branches                     #  580.389 M/sec
>             11,837      branch-misses                #    2.94% of all branches
>         10,657,272      TOPDOWN.SLOTS
>          2,089,661      topdown-retiring
>          4,053,942      topdown-fe-bound
>          1,964,281      topdown-mem-bound
>          3,218,078      topdown-be-bound
>            334,345      topdown-heavy-ops
>          1,295,589      topdown-br-mispredict
>          2,632,973      topdown-fetch-lat
>          1,379,176      topdown-bad-spec
>             21,248      INT_MISC.UOP_DROPPING        #   30.590 M/sec
> 
>        0.001456908 seconds time elapsed
> 
> While perf-stat (and perf-stat-record) calculates the percentages, perf-stat-report
> just prints the raw numbers. Thinking about it, it might be useful to know the raw
> numbers too, but rather via an option, while by default, both should behave the same,
> shouldn't they? Is perf-stat-report missing some metric postprocessing?

perf stat record/report doesn't support metrics well. It only records the
event values, so all the metric information is lost.


$ ./perf stat record -M cpi sleep 1

 Performance counter stats for 'sleep 1':

           183,590      INST_RETIRED.ANY:u               #      2.3 per_instr  cpi
           429,375      CPU_CLK_UNHALTED.THREAD:u


$ ./perf stat report

 Performance counter stats for 'perf stat record -M cpi sleep 1':

           183,590      INST_RETIRED.ANY:u
           429,375      CPU_CLK_UNHALTED.THREAD:u


To fix it, we need to record the metrics information as well. I will
take a look.
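
Until the metric information is recorded, the value can be recomputed by hand from
the report output. Below is a minimal Python sketch under two assumptions: that the
cpi metric is CPU_CLK_UNHALTED.THREAD divided by INST_RETIRED.ANY (consistent with
the 2.3 shown above), and that count lines look like the ones in this example.
`parse_counts` is a hypothetical helper, not part of perf.

```python
import re

# Count lines copied from the `perf stat report` output above.
REPORT = """\
           183,590      INST_RETIRED.ANY:u
           429,375      CPU_CLK_UNHALTED.THREAD:u
"""


def parse_counts(text):
    """Map event name -> integer count for lines of the form '<count>  <event>'."""
    counts = {}
    for line in text.splitlines():
        match = re.match(r"\s*([\d,]+)\s+(\S+)", line)
        if match:
            # Strip the thousands separators before converting.
            counts[match.group(2)] = int(match.group(1).replace(",", ""))
    return counts


events = parse_counts(REPORT)
# Assumed metric definition: cycles per retired instruction.
cpi = events["CPU_CLK_UNHALTED.THREAD:u"] / events["INST_RETIRED.ANY:u"]
print(f"{cpi:.1f} per_instr  cpi")
```

This only works for metrics whose formula is known to the reader, which is why
storing the metric information in the recorded data is the proper fix.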

Thanks,
Kan
