linux-perf-users.vger.kernel.org archive mirror
* perf test failures in linux-next on s390
@ 2023-06-13 12:54 Thomas Richter
  2023-06-13 14:32 ` Ian Rogers
  0 siblings, 1 reply; 15+ messages in thread
From: Thomas Richter @ 2023-06-13 12:54 UTC
  To: linux-perf-use., Arnaldo Carvalho de Melo, Ian Rogers

Hi all,

I have run the perf test suite on the current 6.4rc6 kernel and see just one error:
# ./perf test 2>&1 | fgrep FAILED
fgrep: warning: fgrep is obsolescent; using grep -F
 42.3: BPF prologue generation                                       : FAILED!
# 
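
The fgrep warning above is harmless: GNU grep now recommends grep -F for
fixed-string matching. As a minimal sketch, the same filter can be wrapped
in a small helper; the function name failed_tests is hypothetical and not
part of perf.

```shell
# Hypothetical helper: keep only the failed entries from `perf test` output.
# grep -F performs the fixed-string match that fgrep used to provide.
failed_tests() {
    grep -F 'FAILED'
}
```

It would be used as: ./perf test 2>&1 | failed_tests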

However, when I download the linux-next tree and build the kernel and perf
tool with the same kernel config file, I get a bunch of failing test cases,
many with the perf tool dumping core:

# perf test 2>&1 | fgrep FAILED
fgrep: warning: fgrep is obsolescent; using grep -F
  6.1: Test event parsing                                            : FAILED!
 10.3: Parsing of PMU event table metrics                            : FAILED!
 10.4: Parsing of PMU event table metrics with fake PMUs             : FAILED!
 17: Setup struct perf_event_attr                                    : FAILED!
 24: Number of exit events of a simple workload                      : FAILED! core-dump
 28: Use a dummy software event to keep tracking                     : FAILED!
 35: Track with sched_switch                                         : FAILED!
 42.3: BPF prologue generation                                       : FAILED!
 66: Parse and process metrics                                       : FAILED!
 68: Event expansion for cgroups                                     : FAILED!
 69.2: Perf time to TSC                                              : FAILED! core-dump
 74: build id cache operations                                       : FAILED! core-dump
 81: kernel lock contention analysis test                            : FAILED!
 86: Zstd perf.data compression/decompression                        : FAILED! core-dump
 87: perf record tests                                               : FAILED! core-dump
 94: perf all metricgroups test                                      : FAILED!
 95: perf all metrics test                                           : FAILED!
106: Test java symbol                                                : FAILED! core-dump
#

I am afraid this will show up pretty soon in the linux tree.
I am going to look into each failure in the next few days.

What I already found out is that many test cases now fail due to the
event/PMU rework. Here is one example:

# perf test -Fvvvv 95
95: perf all metrics test
--- start ---
Testing cpi
....
Metric 'transaction' not printed in:
Error:
The TX_NC_TABORT event is not supported.
---- end ----
perf all metrics test: FAILED!
# ls -l /sys/devices/cpum_cf/events/TX_NC_TABORT
-r--r--r--. 1 root root 4096 Jun 13 13:49 /sys/devices/cpum_cf/events/TX_NC_TABORT
# 

As can be seen, the event is definitely there and supported.
This same test case succeeds in the linux tree!
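
A minimal sketch of the same sysfs check as a reusable helper: it reads the
event file instead of only listing it. The function name show_pmu_event and
its optional third argument (the sysfs root, defaulting to /sys/devices as
in the ls command above) are assumptions for illustration, not part of perf.

```shell
# Hypothetical helper: print what the kernel exports for a PMU event.
# $1 = PMU name, $2 = event name, $3 = sysfs root (default: /sys/devices)
show_pmu_event() {
    spe_file="${3:-/sys/devices}/$1/events/$2"
    if [ -r "$spe_file" ]; then
        # The file content is the event encoding as exported by the driver.
        printf '%s/%s: %s\n' "$1" "$2" "$(cat "$spe_file")"
    else
        printf '%s/%s: not present in sysfs\n' "$1" "$2" >&2
        return 1
    fi
}
```

For the case above it would be called as: show_pmu_event cpum_cf TX_NC_TABORT
A non-zero exit status means the event file is missing or unreadable.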

Hopefully I can sort out some of the failures before this code shows up
in the linux tree.
-- 
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
--
Chairman of the Supervisory Board: Gregor Pillen
Management: David Faller
Registered office: Böblingen / Court of registration: Amtsgericht Stuttgart, HRB 243294

* Re: perf test failures in linux-next on s390
  2023-06-13 12:54 perf test failures in linux-next on s390 Thomas Richter
@ 2023-06-13 14:32 ` Ian Rogers
  2023-06-14  8:31   ` Thomas Richter
  0 siblings, 1 reply; 15+ messages in thread
From: Ian Rogers @ 2023-06-13 14:32 UTC
  To: Thomas Richter; +Cc: linux-perf-use., Arnaldo Carvalho de Melo

Thanks Thomas, to be clear this is what is in
perf-tools-next/linux-next and not 6.4?

Rather than starting with more complicated cases like the metrics tests,
it makes sense to dig into why event parsing is failing. Starting with
test 6, could you share its output?

Thanks,
Ian


* Re: perf test failures in linux-next on s390
  2023-06-13 14:32 ` Ian Rogers
@ 2023-06-14  8:31   ` Thomas Richter
  2023-06-14 14:57     ` Ian Rogers
  0 siblings, 1 reply; 15+ messages in thread
From: Thomas Richter @ 2023-06-14  8:31 UTC
  To: Ian Rogers; +Cc: linux-perf-use., Arnaldo Carvalho de Melo, Sumanth Korikkar

On 6/13/23 16:32, Ian Rogers wrote:
> 
> Thanks Thomas, to be clear this is what is in
> perf-tools-next/linux-next and not 6.4?

Ian,

thanks for your help.
Correct, I am talking about the linux-next repo. The linux repo is fine.

> 
> Rather than try to do more complicated cases like the metrics tests,
> it makes sense to dig into why event parsing is failing. Test 6 first
> of all, could you give output?
> 
> Thanks,
> Ian
> 
We discussed some aspects of this about two weeks ago, but I was on
vacation last week and have only now resumed my work on linux-next.
We run the linux-next perf test suite every night, and I would like to
get this sorted out before it hits Linux 6.5.

Here is the output on my linux-next tree built yesterday:
# uname -a
Linux a35lp67.lnxne.boe 6.4.0-rc6-next-20230613d-perf #2 \
              SMP Tue Jun 13 15:18:43 CEST 2023 s390x GNU/Linux
# ./perf test -F 6
  6: Parse event definition strings  :
  6.1: Test event parsing            :Segmentation fault (core dumped)
#
# gdb perf
  ....
  (gdb) r test -F 6
   6: Parse event definition strings                                  :
  6.1: Test event parsing                                            :
Program received signal SIGSEGV, Segmentation fault.
__GI_strcmp () at ../sysdeps/s390/strcmp-vx.S:47
(gdb) where
#0  __GI_strcmp () at ../sysdeps/s390/strcmp-vx.S:47
#1  0x000000000110a18c in test__term_equal_term (evlist=0x152ea80) at tests/parse-events.c:1580
#2  0x000000000110a96a in test_event (e=0x14dc758 <test.events+1416>) at tests/parse-events.c:2209
#3  0x000000000110ac58 in test_events (events=0x14dc1d0 <test.events>, cnt=61) at tests/parse-events.c:2260
#4  0x000000000110ad52 in test__events2 (test=0x1500758 <suite.parse_events>, subtest=0)
    at tests/parse-events.c:2272
#5  0x00000000010f6fac in run_test (test=0x1500758 <suite.parse_events>, subtest=0) at tests/builtin-test.c:236
#6  0x00000000010f7142 in test_and_print (t=0x1500758 <suite.parse_events>, subtest=0) at tests/builtin-test.c:265
#7  0x00000000010f7b1e in __cmd_test (argc=1, argv=0x3ffffffa320, skiplist=0x0) at tests/builtin-test.c:436
#8  0x00000000010f8404 in cmd_test (argc=1, argv=0x3ffffffa320) at tests/builtin-test.c:559
#9  0x00000000011473fc in run_builtin (p=0x14f60e8 <commands+600>, argc=3, argv=0x3ffffffa320) at perf.c:323
#10 0x000000000114776e in handle_internal_command (argc=3, argv=0x3ffffffa320) at perf.c:377
#11 0x0000000001147980 in run_argv (argcp=0x3ffffff9f94, argv=0x3ffffff9f88) at perf.c:421
#12 0x0000000001147d48 in main (argc=3, argv=0x3ffffffa320) at perf.c:537
(gdb)

To be honest, I am no expert on the yacc/bison/flex tool chain.
I understand a little bit about them, but that is it.

When I look at the output of perf test -Fvvvv 6 on linux-next, some things seem odd.
I marked them with three question marks (???):

# ./perf test -Fvvv 6
  6: Parse event definition strings     :
  6.1: Test event parsing               :
--- start ---
running test 0 'syscalls:sys_enter_openat'
Using CPUID IBM,3931,704,A01,3.7,002f
running test 1 'syscalls:*'
running test 2 'r1a'
running test 3 '1:1'
running test 4 'instructions'
No PMU found for 'instructions'FAILED tests/parse-events.c:143 wrong number of entries
Event test failure: test 4 'instructions'running test 5 'cycles/period=100000,config2/'
??? What is wrong here?
??? Output on linux 6.4.0rc3:
??? # ./perf stat -e instructions -- true
???
??? Performance counter stats for 'true':
???
???         2,965,720      instructions
???
???        0.002026832 seconds time elapsed
???
???        0.000056000 seconds user
???        0.002048000 seconds sys
??? #
??? This is fine and works as expected. The s390 PMU for counters
??? has a direct mapping for this. So we end up in the s390 PMU
??? to retrieve the value.
???
??? Output on linux-next
???# ./perf stat -e instructions -- true
???
??? Performance counter stats for 'true':
???
???              0.65 msec task-clock                       #    0.250 CPUs utilized
???                 0      context-switches                 #    0.000 /sec
???                 0      cpu-migrations                   #    0.000 /sec
???                49      page-faults                      #   75.375 K/sec
???         3,367,228      cycles                           #    5.180 GHz
???         2,880,270      instructions                     #    0.86  insn per cycle
???   <not supported>      branches
???   <not supported>      branch-misses
???
???       0.002599176 seconds time elapsed
???
???       0.000053000 seconds user
???       0.002650000 seconds sys
???
???#
??? Somehow we end up in a different PMU. The output is the same as if
??? I do not specify an event at all. To reach the s390 specific PMU
??? I have to add it explicitly as in:
???# ./perf stat -e cpum_cf/instructions/ -- true
???
??? Performance counter stats for 'true':
???
???         2,814,522      cpum_cf/instructions/
???
???       0.001899881 seconds time elapsed
???
???       0.000050000 seconds user
???       0.001928000 seconds sys
???
???]#
No PMU found for 'cycles/period=100000,config2/'FAILED tests/parse-events.c:157 wrong number of entries
Event test failure: test 5 'cycles/period=100000,config2/'running test 6 'faults'
...
??? Similar output for basically all events.

No PMU found for 'cycles'running test 59 'cycles/name=name/'
No PMU found for 'name'Segmentation fault (core dumped)

Hope this helps.

PS: Should we keep the linux-perf-use mailing list as addressee? I am not
sure whether everybody else is interested in this.
-- 
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany

* Re: perf test failures in linux-next on s390
  2023-06-14  8:31   ` Thomas Richter
@ 2023-06-14 14:57     ` Ian Rogers
  2023-06-15  8:57       ` Thomas Richter
  2023-06-15  9:39       ` Thomas Richter
  0 siblings, 2 replies; 15+ messages in thread
From: Ian Rogers @ 2023-06-14 14:57 UTC
  To: Thomas Richter
  Cc: linux-perf-use., Arnaldo Carvalho de Melo, Sumanth Korikkar

On Wed, Jun 14, 2023 at 1:32 AM Thomas Richter <tmricht@linux.ibm.com> wrote:
> PS: Should we keep the linux-perf-use mailing list as addressee? Not sure
> if everybody else is interested in this?

A smaller list is okay. Could you send me a zip of the sysfs tree
(/sys/devices)? At least one issue is that the code didn't find a
core PMU. On non-hybrid x86 this would be /sys/devices/cpu. I think we
spoke about this before for s390, where there is more than one. The
issue here is that the test found 0, and we're now trying to use PMUs
in the code as a way to sort events. There is a relevant function and
comment in util/pmu.c:

'''
/**
 * is_sysfs_pmu_core() - PMU CORE devices have different name other than cpu in
 *         sysfs on some platforms like ARM or Intel hybrid. Looking for
 *         possible the cpus file in sysfs files to identify whether this is a
 *         core device.
 * @name: The PMU name such as "cpu_atom".
 */
static int is_sysfs_pmu_core(const char *name)
{
	char path[PATH_MAX];

	if (!perf_pmu__pathname_scnprintf(path, sizeof(path), name, "cpus"))
		return 0;
	return file_available(path);
}
...
bool is_pmu_core(const char *name)
{
	return !strcmp(name, "cpu") || is_sysfs_pmu_core(name);
}
'''
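
To see from a shell which PMUs that rule would classify as core, the check
can be approximated as below. This is only a sketch under assumptions: the
function name list_core_pmus is hypothetical, and the default directory
/sys/bus/event_source/devices is, as far as I know, where perf resolves PMU
paths. It uses a plain existence test rather than file_available().

```shell
# Hypothetical helper: print PMUs that are named "cpu" or have a "cpus" file,
# roughly following is_pmu_core() quoted above.
# $1 = PMU directory root (default: /sys/bus/event_source/devices)
list_core_pmus() {
    lcp_root="${1:-/sys/bus/event_source/devices}"
    for lcp_dir in "$lcp_root"/*/; do
        # Skip the unexpanded pattern when the root has no subdirectories.
        [ -d "$lcp_dir" ] || continue
        lcp_name=$(basename "$lcp_dir")
        if [ "$lcp_name" = "cpu" ] || [ -e "${lcp_dir}cpus" ]; then
            printf '%s\n' "$lcp_name"
        fi
    done
}
```

If list_core_pmus prints nothing on the s390 machine, that would be
consistent with the test finding 0 core PMUs.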

Thanks,
Ian


* Re: perf test failures in linux-next on s390
  2023-06-14 14:57     ` Ian Rogers
@ 2023-06-15  8:57       ` Thomas Richter
  2023-06-15  9:39       ` Thomas Richter
  1 sibling, 0 replies; 15+ messages in thread
From: Thomas Richter @ 2023-06-15  8:57 UTC
  To: Ian Rogers; +Cc: linux-perf-use., Arnaldo Carvalho de Melo, Sumanth Korikkar

On 6/14/23 16:57, Ian Rogers wrote:
> On Wed, Jun 14, 2023 at 1:32 AM Thomas Richter <tmricht@linux.ibm.com> wrote:
>>
>> On 6/13/23 16:32, Ian Rogers wrote:
>>> On Tue, Jun 13, 2023 at 5:54 AM Thomas Richter <tmricht@linux.ibm.com> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I have run the perf test suite on the current 6.4rc6 kernel and see just one error:
>>>> # ./perf test 2>&1 | fgrep FAILED
>>>> fgrep: warning: fgrep is obsolescent; using grep -F
>>>>  42.3: BPF prologue generation                                       : FAILED!
>>>> #
>>>>
>>>> However when I download the linux-next tree and build kernel and perf
>>>> tool with the same kernel config file, I get a bunch of failing test cases,
>>>> many with perf tool dumping core:
>>>>
>>>> # perf test 2>&1 | fgrep FAILED
>>>> fgrep: warning: fgrep is obsolescent; using grep -F
>>>>   6.1: Test event parsing                                            : FAILED!
>>>>  10.3: Parsing of PMU event table metrics                            : FAILED!
>>>>  10.4: Parsing of PMU event table metrics with fake PMUs             : FAILED!
>>>>  17: Setup struct perf_event_attr                                    : FAILED!
>>>>  24: Number of exit events of a simple workload                      : FAILED! core-dump
>>>>  28: Use a dummy software event to keep tracking                     : FAILED!
>>>>  35: Track with sched_switch                                         : FAILED!
>>>>  42.3: BPF prologue generation                                       : FAILED!
>>>>  66: Parse and process metrics                                       : FAILED!
>>>>  68: Event expansion for cgroups                                     : FAILED!
>>>>  69.2: Perf time to TSC                                              : FAILED! core-dump
>>>>  74: build id cache operations                                       : FAILED! core-dump
>>>>  81: kernel lock contention analysis test                            : FAILED!
>>>>  86: Zstd perf.data compression/decompression                        : FAILED! core-dump
>>>>  87: perf record tests                                               : FAILED! core-dump
>>>>  94: perf all metricgroups test                                      : FAILED!
>>>>  95: perf all metrics test                                           : FAILED!
>>>> 106: Test java symbol                                                : FAILED! core-dump
>>>> #
>>>>
>>>> I am afraid this will show up pretty soon in the linux tree.
>>>> I am going to look into each failure in the next few days.
>>>>
>>>> What I already found out is that many test cases now fail due to the
>>>> event/PMU rework, here is one example:
>>>>
>>>> # perf test -Fvvvv 95
>>>> 95: perf all metrics test
>>>> --- start ---
>>>> Testing cpi
>>>> ....
>>>> Metric 'transaction' not printed in:
>>>> Error:
>>>> The TX_NC_TABORT event is not supported.
>>>> ---- end ----
>>>> perf all metrics test: FAILED!
>>>> # ls -l /sys/devices/cpum_cf/events/TX_NC_TABORT
>>>> -r--r--r--. 1 root root 4096 Jun 13 13:49 /sys/devices/cpum_cf/events/TX_NC_TABORT
>>>> #
>>>>
>>>> As can be seen, the event is definitely there and supported.
>>>> This same test case succeeds in the linux tree!
>>>>
>>>> Hopefully I can sort out some of the failures before this code shows up
>>>> in the linux tree.
>>>
>>> Thanks Thomas, to be clear this is what is in
>>> perf-tools-next/linux-next and not 6.4?
>>
>> Ian,
>>
>> thanks for your help.
>> Correct, I am talking about the linux-next repo. The linux repo is fine.
>>
>>>
>>> Rather than try to do more complicated cases like the metrics tests,
>>> it makes sense to dig into why event parsing is failing. Test 6 first
>>> of all, could you give output?
>>>
>>> Thanks,
>>> Ian
>>>
>> We discussed some aspects of this about two weeks ago, but last week
>> I was on vacation and now I resumed my work on linux-next.
>> We run the linux-next perf test suite every night and I am concerned
>> and would like to get this sorted out before it hits Linux 6.5.
>>
>> Here is the output on my linux-next tree built yesterday:
>> # uname -a
>> Linux a35lp67.lnxne.boe 6.4.0-rc6-next-20230613d-perf #2 \
>>               SMP Tue Jun 13 15:18:43 CEST 2023 s390x GNU/Linux
>> # ./perf test -F 6
>>   6: Parse event definition strings  :
>>   6.1: Test event parsing            :Segmentation fault (core dumped)
>> #
>> # gdb perf
>>   ....
>>   (gdb) r test -F 6
>>    6: Parse event definition strings                                  :
>>   6.1: Test event parsing                                            :
>> Program received signal SIGSEGV, Segmentation fault.
>> __GI_strcmp () at ../sysdeps/s390/strcmp-vx.S:47
>> (gdb) where
>> #0  __GI_strcmp () at ../sysdeps/s390/strcmp-vx.S:47
>> #1  0x000000000110a18c in test__term_equal_term (evlist=0x152ea80) at tests/parse-events.c:1580
>> #2  0x000000000110a96a in test_event (e=0x14dc758 <test.events+1416>) at tests/parse-events.c:2209
>> #3  0x000000000110ac58 in test_events (events=0x14dc1d0 <test.events>, cnt=61) at tests/parse-events.c:2260
>> #4  0x000000000110ad52 in test__events2 (test=0x1500758 <suite.parse_events>, subtest=0)
>>     at tests/parse-events.c:2272
>> #5  0x00000000010f6fac in run_test (test=0x1500758 <suite.parse_events>, subtest=0) at tests/builtin-test.c:236
>> #6  0x00000000010f7142 in test_and_print (t=0x1500758 <suite.parse_events>, subtest=0) at tests/builtin-test.c:265
>> #7  0x00000000010f7b1e in __cmd_test (argc=1, argv=0x3ffffffa320, skiplist=0x0) at tests/builtin-test.c:436
>> #8  0x00000000010f8404 in cmd_test (argc=1, argv=0x3ffffffa320) at tests/builtin-test.c:559
>> #9  0x00000000011473fc in run_builtin (p=0x14f60e8 <commands+600>, argc=3, argv=0x3ffffffa320) at perf.c:323
>> #10 0x000000000114776e in handle_internal_command (argc=3, argv=0x3ffffffa320) at perf.c:377
>> #11 0x0000000001147980 in run_argv (argcp=0x3ffffff9f94, argv=0x3ffffff9f88) at perf.c:421
>> #12 0x0000000001147d48 in main (argc=3, argv=0x3ffffffa320) at perf.c:537
>> (gdb)
>>
>> To be honest, I am no expert on the yacc/bison/flex tool chain.
>> I understand a little bit about them, but that is it.
>>
>> When I look at the output of perf test -Fvvv 6 on linux-next, some things seem odd.
>> I marked them with three question marks (???):
>>
>> # ./perf test -Fvvv 6
>>   6: Parse event definition strings     :
>>   6.1: Test event parsing               :
>> --- start ---
>> running test 0 'syscalls:sys_enter_openat'
>> Using CPUID IBM,3931,704,A01,3.7,002f
>> running test 1 'syscalls:*'
>> running test 2 'r1a'
>> running test 3 '1:1'
>> running test 4 'instructions'
>> No PMU found for 'instructions'FAILED tests/parse-events.c:143 wrong number of entries
>> Event test failure: test 4 'instructions'running test 5 'cycles/period=100000,config2/'
>> ??? What is wrong here?
>> ??? Output on linux 6.4.0rc3:
>> ??? # ./perf stat -e instructions -- true
>> ???
>> ??? Performance counter stats for 'true':
>> ???
>> ???         2,965,720      instructions
>> ???
>> ???        0.002026832 seconds time elapsed
>> ???
>> ???        0.000056000 seconds user
>> ???        0.002048000 seconds sys
>> ??? #
>> ??? This is fine and works as expected. The s390 PMU for counters
>> ??? has a direct mapping for this. So we end up in the s390 PMU
>> ??? to retrieve the value.
>> ???
>> ??? Output on linux-next
>> ???# ./perf stat -e instructions -- true
>> ???
>> ??? Performance counter stats for 'true':
>> ???
>> ???              0.65 msec task-clock                       #    0.250 CPUs utilized
>> ???                 0      context-switches                 #    0.000 /sec
>> ???                 0      cpu-migrations                   #    0.000 /sec
>> ???                49      page-faults                      #   75.375 K/sec
>> ???         3,367,228      cycles                           #    5.180 GHz
>> ???         2,880,270      instructions                     #    0.86  insn per cycle
>> ???   <not supported>      branches
>> ???   <not supported>      branch-misses
>> ???
>> ???       0.002599176 seconds time elapsed
>> ???
>> ???       0.000053000 seconds user
>> ???       0.002650000 seconds sys
>> ???
>> ???#
>> ??? Somehow we end up in a different PMU. The output is the same as if
>> ??? I do not specify an event at all. To reach the s390 specific PMU
>> ??? I have to add it explicitly as in:
>> ???# ./perf stat -e cpum_cf/instructions/ -- true
>> ???
>> ??? Performance counter stats for 'true':
>> ???
>> ???         2,814,522      cpum_cf/instructions/
>> ???
>> ???       0.001899881 seconds time elapsed
>> ???
>> ???       0.000050000 seconds user
>> ???       0.001928000 seconds sys
>> ???
>> ???]#
>> No PMU found for 'cycles/period=100000,config2/'FAILED tests/parse-events.c:157 wrong number of entries
>> Event test failure: test 5 'cycles/period=100000,config2/'running test 6 'faults'
>> ...
>> ??? Similar output for basically all events.
>>
>> No PMU found for 'cycles'running test 59 'cycles/name=name/'
>> No PMU found for 'name'Segmentation fault (core dumped)
>>
>> Hope this helps.
>>
>> PS: Should we keep the linux-perf-use mailing list as addressee? Not sure
>> if everybody else is interested in this?
> 
> Smaller list is okay. Could you send me a zip of the sysfs
> (/sys/devices) ? At least one issue is that the code didn't find a
> core PMU. On non-hybrid x86 this would be /sys/devices/cpu, I think we
> spoke about this before for s390 and there are >1. The issue here is
> that the test found 0, and we're trying to use PMUs in the code now as
> a way to sort events. There's code/comment in util/pmu.c:
> 
> '''
> /**
>  * is_sysfs_pmu_core() - PMU CORE devices have different name other than cpu in
>  *         sysfs on some platforms like ARM or Intel hybrid. Looking for
>  *         possible the cpus file in sysfs files to identify whether this is a
>  *         core device.
>  * @name: The PMU name such as "cpu_atom".
>  */
> static int is_sysfs_pmu_core(const char *name)
> {
> 	char path[PATH_MAX];
> 
> 	if (!perf_pmu__pathname_scnprintf(path, sizeof(path), name, "cpus"))
> 		return 0;
> 	return file_available(path);
> }
> ...
> bool is_pmu_core(const char *name)
> {
> 	return !strcmp(name, "cpu") || is_sysfs_pmu_core(name);
> }
> '''
> 
> Thanks,
> Ian
> 

Thanks for refreshing my memory. Since the s390 core PMU is named
cpum_cf (/sys/devices/cpum_cf), this fix was missing:

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index fe64ad292d36..6142e4710a2f 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -1419,7 +1419,7 @@ void perf_pmu__del_formats(struct list_head *formats)
 
 bool is_pmu_core(const char *name)
 {
-       return !strcmp(name, "cpu") || is_sysfs_pmu_core(name);
+       return !strcmp(name, "cpu") || !strcmp(name, "cpum_cf") || is_sysfs_pmu_core(name);
 }
 
 bool perf_pmu__supports_legacy_cache(const struct perf_pmu *pmu)

With that fix applied, the test succeeds:

# ./perf test -F 6
  6: Parse event definition strings                                  :
  6.1: Test event parsing                                            : Ok
  6.2: Parsing of all PMU events from sysfs                          : Ok
  6.3: Parsing of given PMU events from sysfs                        : Ok
  6.4: Parsing of aliased events from sysfs                          : Skip (no aliases in sysfs)
  6.5: Parsing of aliased events                                     : Ok
  6.6: Parsing of terms (event modifiers)                            : Ok
#
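
As a rough illustration of what the check does after this change, here is a
standalone sketch. It is not the perf source: the sketch_ names and the
sysfs_root parameter are made up so it can be tried outside the perf tree,
and access(R_OK) is only assumed to approximate file_available(). The
attached listing shows no cpus file under /sys/devices/cpum_cf, so the name
comparison is the part that matches on s390.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for is_sysfs_pmu_core(): treat a PMU as a core PMU
 * when <sysfs_root>/<name>/cpus exists and is readable.
 */
static bool sketch_has_cpus_file(const char *sysfs_root, const char *name)
{
	char path[4096];
	int n = snprintf(path, sizeof(path), "%s/%s/cpus", sysfs_root, name);

	/* Reject formatting errors and truncated paths. */
	if (n < 0 || (size_t)n >= sizeof(path))
		return false;
	return access(path, R_OK) == 0;
}

/*
 * Mirrors the patched is_pmu_core(): two hard-coded names first, then the
 * sysfs probe for platforms that expose a cpus file.
 */
static bool sketch_is_pmu_core(const char *sysfs_root, const char *name)
{
	return !strcmp(name, "cpu") || !strcmp(name, "cpum_cf") ||
	       sketch_has_cpus_file(sysfs_root, name);
}
```

Calling sketch_is_pmu_core("/sys/devices", "cpum_cf") returns true through the
name comparison alone, while a name such as "cpum_sf" still depends on the
cpus probe.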

I tried to send my /sys/devices tree before as zip or tgz, but that
mail got deleted by your mailer because it contained compressed data.

So I am sending you an ls -lR of our 5 PMUs on s390 as a text attachment,
and here are the type values:
# for i in  /sys/devices/cpum_[sc]f* /sys/devices/pai_*; do echo PMU $i PMU_TYPE $(cat $i/type); done
PMU /sys/devices/cpum_cf PMU_TYPE 8
PMU /sys/devices/cpum_cf_diag PMU_TYPE 9
PMU /sys/devices/cpum_sf PMU_TYPE 4
PMU /sys/devices/pai_crypto PMU_TYPE 10
PMU /sys/devices/pai_ext PMU_TYPE 11
# 
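
The same lookup the shell loop does can be sketched in C, in case it helps
when comparing against what the tool reads. This is only an illustration:
sketch_read_pmu_type() is a made-up name, not a perf function, and it
assumes the type file holds a single decimal integer, as in the output above.

```c
#include <stdio.h>

/*
 * Hypothetical helper: return the integer stored in <pmu_dir>/type,
 * or -1 if the file cannot be opened or parsed.
 */
static int sketch_read_pmu_type(const char *pmu_dir)
{
	char path[4096];
	FILE *f;
	int type = -1;
	int n = snprintf(path, sizeof(path), "%s/type", pmu_dir);

	/* Reject formatting errors and truncated paths. */
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%d", &type) != 1)
		type = -1;
	fclose(f);
	return type;
}
```

On the machine above, sketch_read_pmu_type("/sys/devices/cpum_cf") would be
expected to yield the value that cat printed for that PMU.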

For the /sys/devices/PMU tree see attachment sysfs-s390.txt

Thanks a lot for your help.
-- 
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
--
Vorsitzender des Aufsichtsrats: Gregor Pillen
Geschäftsführung: David Faller
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294

[-- Attachment #2: sysfs-s390.txt --]
[-- Type: text/plain, Size: 19744 bytes --]

/sys/devices/cpum_cf:
total 0
drwxr-xr-x 2 root root    0 Jun 14 13:03 events
drwxr-xr-x 2 root root    0 Jun 14 13:03 format
-rw-r--r-- 1 root root 4096 Jun 15 10:48 perf_event_mux_interval_ms
lrwxrwxrwx 1 root root    0 Jun 15 10:48 subsystem -> ../../bus/event_source
-r--r--r-- 1 root root 4096 Jun 14 13:03 type
-rw-r--r-- 1 root root 4096 Jun 15 10:48 uevent

/sys/devices/cpum_cf/events:
total 0
-r--r--r-- 1 root root 4096 Jun 14 13:03 AES_BLOCKED_CYCLES
-r--r--r-- 1 root root 4096 Jun 14 13:03 AES_BLOCKED_FUNCTIONS
-r--r--r-- 1 root root 4096 Jun 14 13:03 AES_CYCLES
-r--r--r-- 1 root root 4096 Jun 14 13:03 AES_FUNCTIONS
-r--r--r-- 1 root root 4096 Jun 14 13:03 BCD_DFP_EXECUTION_SLOTS
-r--r--r-- 1 root root 4096 Jun 14 13:03 CPU_CYCLES
-r--r--r-- 1 root root 4096 Jun 14 13:03 CRSTE_1MB_WRITES
-r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_OFF_DRAWER
-r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_OFF_DRAWER_MEMORY
-r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_ON_CHIP
-r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_ON_CHIP_CHIP_HIT
-r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_ON_CHIP_DRAWER_HIT
-r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_ON_CHIP_IV
-r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_ON_CHIP_MEMORY
-r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_ON_DRAWER
-r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_ON_DRAWER_MEMORY
-r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_ON_MODULE
-r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_ON_MODULE_MEMORY
-r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_REQ
-r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_REQ_CHIP_HIT
-r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_REQ_DRAWER_HIT
-r--r--r-- 1 root root 4096 Jun 14 13:03 DCW_REQ_IV
-r--r--r-- 1 root root 4096 Jun 14 13:03 DEA_BLOCKED_CYCLES
-r--r--r-- 1 root root 4096 Jun 14 13:03 DEA_BLOCKED_FUNCTIONS
-r--r--r-- 1 root root 4096 Jun 14 13:03 DEA_CYCLES
-r--r--r-- 1 root root 4096 Jun 14 13:03 DEA_FUNCTIONS
-r--r--r-- 1 root root 4096 Jun 14 13:03 DECIMAL_INSTRUCTIONS
-r--r--r-- 1 root root 4096 Jun 14 13:03 DFLT_ACCESS
-r--r--r-- 1 root root 4096 Jun 14 13:03 DFLT_CC
-r--r--r-- 1 root root 4096 Jun 14 13:03 DFLT_CCFINISH
-r--r--r-- 1 root root 4096 Jun 14 13:03 DFLT_CYCLES
-r--r--r-- 1 root root 4096 Jun 14 13:03 DTLB2_GPAGE_WRITES
-r--r--r-- 1 root root 4096 Jun 14 13:03 DTLB2_MISSES
-r--r--r-- 1 root root 4096 Jun 14 13:03 DTLB2_WRITES
-r--r--r-- 1 root root 4096 Jun 14 13:03 ECC_BLOCKED_CYCLES_COUNT
-r--r--r-- 1 root root 4096 Jun 14 13:03 ECC_BLOCKED_FUNCTION_COUNT
-r--r--r-- 1 root root 4096 Jun 14 13:03 ECC_CYCLES_COUNT
-r--r--r-- 1 root root 4096 Jun 14 13:03 ECC_FUNCTION_COUNT
-r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_OFF_DRAWER
-r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_OFF_DRAWER_MEMORY
-r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_ON_CHIP
-r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_ON_CHIP_CHIP_HIT
-r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_ON_CHIP_DRAWER_HIT
-r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_ON_CHIP_IV
-r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_ON_CHIP_MEMORY
-r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_ON_DRAWER
-r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_ON_DRAWER_MEMORY
-r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_ON_MODULE
-r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_ON_MODULE_MEMORY
-r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_REQ
-r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_REQ_CHIP_HIT
-r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_REQ_DRAWER_HIT
-r--r--r-- 1 root root 4096 Jun 14 13:03 ICW_REQ_IV
-r--r--r-- 1 root root 4096 Jun 14 13:03 IDCW_OFF_DRAWER_CHIP_HIT
-r--r--r-- 1 root root 4096 Jun 14 13:03 IDCW_OFF_DRAWER_DRAWER_HIT
-r--r--r-- 1 root root 4096 Jun 14 13:03 IDCW_OFF_DRAWER_IV
-r--r--r-- 1 root root 4096 Jun 14 13:03 IDCW_ON_DRAWER_CHIP_HIT
-r--r--r-- 1 root root 4096 Jun 14 13:03 IDCW_ON_DRAWER_DRAWER_HIT
-r--r--r-- 1 root root 4096 Jun 14 13:03 IDCW_ON_DRAWER_IV
-r--r--r-- 1 root root 4096 Jun 14 13:03 IDCW_ON_MODULE_CHIP_HIT
-r--r--r-- 1 root root 4096 Jun 14 13:03 IDCW_ON_MODULE_DRAWER_HIT
-r--r--r-- 1 root root 4096 Jun 14 13:03 IDCW_ON_MODULE_IV
-r--r--r-- 1 root root 4096 Jun 14 13:03 INSTRUCTIONS
-r--r--r-- 1 root root 4096 Jun 14 13:03 ITLB2_MISSES
-r--r--r-- 1 root root 4096 Jun 14 13:03 ITLB2_WRITES
-r--r--r-- 1 root root 4096 Jun 14 13:03 L1C_TLB2_MISSES
-r--r--r-- 1 root root 4096 Jun 14 13:03 L1D_DIR_WRITES
-r--r--r-- 1 root root 4096 Jun 14 13:03 L1D_PENALTY_CYCLES
-r--r--r-- 1 root root 4096 Jun 14 13:03 L1D_RO_EXCL_WRITES
-r--r--r-- 1 root root 4096 Jun 14 13:03 L1I_DIR_WRITES
-r--r--r-- 1 root root 4096 Jun 14 13:03 L1I_PENALTY_CYCLES
-r--r--r-- 1 root root 4096 Jun 14 13:03 LAST_HOST_TRANSLATIONS
-r--r--r-- 1 root root 4096 Jun 14 13:03 MT_DIAG_CYCLES_ONE_THR_ACTIVE
-r--r--r-- 1 root root 4096 Jun 14 13:03 MT_DIAG_CYCLES_TWO_THR_ACTIVE
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_COMPLETIONS
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_HOLD_LOCK
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_INVOCATIONS
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_WAIT_LOCK
-r--r--r-- 1 root root 4096 Jun 14 13:03 PRNG_BLOCKED_CYCLES
-r--r--r-- 1 root root 4096 Jun 14 13:03 PRNG_BLOCKED_FUNCTIONS
-r--r--r-- 1 root root 4096 Jun 14 13:03 PRNG_CYCLES
-r--r--r-- 1 root root 4096 Jun 14 13:03 PRNG_FUNCTIONS
-r--r--r-- 1 root root 4096 Jun 14 13:03 PROBLEM_STATE_CPU_CYCLES
-r--r--r-- 1 root root 4096 Jun 14 13:03 PROBLEM_STATE_INSTRUCTIONS
-r--r--r-- 1 root root 4096 Jun 14 13:03 SHA_BLOCKED_CYCLES
-r--r--r-- 1 root root 4096 Jun 14 13:03 SHA_BLOCKED_FUNCTIONS
-r--r--r-- 1 root root 4096 Jun 14 13:03 SHA_CYCLES
-r--r--r-- 1 root root 4096 Jun 14 13:03 SHA_FUNCTIONS
-r--r--r-- 1 root root 4096 Jun 14 13:03 SORTL
-r--r--r-- 1 root root 4096 Jun 14 13:03 TLB2_CRSTE_WRITES
-r--r--r-- 1 root root 4096 Jun 14 13:03 TLB2_ENGINES_BUSY
-r--r--r-- 1 root root 4096 Jun 14 13:03 TLB2_PTE_WRITES
-r--r--r-- 1 root root 4096 Jun 14 13:03 TX_C_TABORT_NO_SPECIAL
-r--r--r-- 1 root root 4096 Jun 14 13:03 TX_C_TABORT_SPECIAL
-r--r--r-- 1 root root 4096 Jun 14 13:03 TX_C_TEND
-r--r--r-- 1 root root 4096 Jun 14 13:03 TX_NC_TABORT
-r--r--r-- 1 root root 4096 Jun 14 13:03 TX_NC_TEND
-r--r--r-- 1 root root 4096 Jun 14 13:03 VX_BCD_EXECUTION_SLOTS

/sys/devices/cpum_cf/format:
total 0
-r--r--r-- 1 root root 4096 Jun 14 13:03 event

/sys/devices/cpum_cf_diag:
total 0
drwxr-xr-x 2 root root    0 Jun 14 15:43 events
drwxr-xr-x 2 root root    0 Jun 14 15:43 format
-rw-r--r-- 1 root root 4096 Jun 15 10:53 perf_event_mux_interval_ms
lrwxrwxrwx 1 root root    0 Jun 15 10:53 subsystem -> ../../bus/event_source
-r--r--r-- 1 root root 4096 Jun 14 15:43 type
-rw-r--r-- 1 root root 4096 Jun 15 10:53 uevent

/sys/devices/cpum_cf_diag/events:
total 0
-r--r--r-- 1 root root 4096 Jun 14 15:43 CF_DIAG

/sys/devices/cpum_cf_diag/format:
total 0
-r--r--r-- 1 root root 4096 Jun 14 15:43 event

/sys/devices/cpum_sf:
total 0
drwxr-xr-x 2 root root    0 Jun 14 13:03 events
drwxr-xr-x 2 root root    0 Jun 14 13:03 format
-rw-r--r-- 1 root root 4096 Jun 15 10:48 perf_event_mux_interval_ms
lrwxrwxrwx 1 root root    0 Jun 15 10:48 subsystem -> ../../bus/event_source
-r--r--r-- 1 root root 4096 Jun 14 13:03 type
-rw-r--r-- 1 root root 4096 Jun 15 10:48 uevent

/sys/devices/cpum_sf/events:
total 0
-r--r--r-- 1 root root 4096 Jun 14 13:03 SF_CYCLES_BASIC
-r--r--r-- 1 root root 4096 Jun 14 13:03 SF_CYCLES_BASIC_DIAG

/sys/devices/cpum_sf/format:
total 0
-r--r--r-- 1 root root 4096 Jun 14 13:03 event

/sys/devices/pai_crypto:
total 0
drwxr-xr-x 2 root root    0 Jun 14 12:12 events
drwxr-xr-x 2 root root    0 Jun 14 13:03 format
-rw-r--r-- 1 root root 4096 Jun 15 10:49 perf_event_mux_interval_ms
lrwxrwxrwx 1 root root    0 Jun 15 10:49 subsystem -> ../../bus/event_source
-r--r--r-- 1 root root 4096 Jun 14 13:03 type
-rw-r--r-- 1 root root 4096 Jun 15 10:49 uevent

/sys/devices/pai_crypto/events:
total 0
-r--r--r-- 1 root root 4096 Jun 14 15:43 CRYPTO_ALL
-r--r--r-- 1 root root 4096 Jun 15 10:33 IBM_RESERVED_155
-r--r--r-- 1 root root 4096 Jun 15 10:33 IBM_RESERVED_156
-r--r--r-- 1 root root 4096 Jun 15 10:33 KDSA_ECDSA_SIGN_P256
-r--r--r-- 1 root root 4096 Jun 15 10:33 KDSA_ECDSA_SIGN_P384
-r--r--r-- 1 root root 4096 Jun 15 10:33 KDSA_ECDSA_SIGN_P521
-r--r--r-- 1 root root 4096 Jun 15 10:33 KDSA_ECDSA_VERIFY_P256
-r--r--r-- 1 root root 4096 Jun 15 10:33 KDSA_ECDSA_VERIFY_P384
-r--r--r-- 1 root root 4096 Jun 15 10:33 KDSA_ECDSA_VERIFY_P521
-r--r--r-- 1 root root 4096 Jun 14 15:43 KDSA_EDDSA_SIGN_ED25519
-r--r--r-- 1 root root 4096 Jun 15 10:33 KDSA_EDDSA_SIGN_ED448
-r--r--r-- 1 root root 4096 Jun 15 10:33 KDSA_EDDSA_VERIFY_ED25519
-r--r--r-- 1 root root 4096 Jun 15 10:33 KDSA_EDDSA_VERIFY_ED448
-r--r--r-- 1 root root 4096 Jun 14 15:43 KDSA_ENCRYPTED_ECDSA_SIGN_P256
-r--r--r-- 1 root root 4096 Jun 15 10:33 KDSA_ENCRYPTED_ECDSA_SIGN_P384
-r--r--r-- 1 root root 4096 Jun 14 15:43 KDSA_ENCRYPTED_ECDSA_SIGN_P521
-r--r--r-- 1 root root 4096 Jun 15 10:33 KDSA_ENCRYPTED_EDDSA_SIGN_ED25519
-r--r--r-- 1 root root 4096 Jun 15 10:33 KDSA_ENCRYPTED_EDDSA_SIGN_ED448
-r--r--r-- 1 root root 4096 Jun 15 10:33 KIMD_GHASH
-r--r--r-- 1 root root 4096 Jun 15 10:33 KIMD_SHA_1
-r--r--r-- 1 root root 4096 Jun 15 10:33 KIMD_SHA_256
-r--r--r-- 1 root root 4096 Jun 15 10:33 KIMD_SHA3_224
-r--r--r-- 1 root root 4096 Jun 15 10:33 KIMD_SHA3_256
-r--r--r-- 1 root root 4096 Jun 14 15:43 KIMD_SHA3_384
-r--r--r-- 1 root root 4096 Jun 15 10:33 KIMD_SHA3_512
-r--r--r-- 1 root root 4096 Jun 15 10:33 KIMD_SHA_512
-r--r--r-- 1 root root 4096 Jun 15 10:33 KIMD_SHAKE_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KIMD_SHAKE_256
-r--r--r-- 1 root root 4096 Jun 14 13:03 KLMD_SHA_1
-r--r--r-- 1 root root 4096 Jun 15 10:33 KLMD_SHA_256
-r--r--r-- 1 root root 4096 Jun 15 10:33 KLMD_SHA3_224
-r--r--r-- 1 root root 4096 Jun 15 10:33 KLMD_SHA3_256
-r--r--r-- 1 root root 4096 Jun 15 10:33 KLMD_SHA3_384
-r--r--r-- 1 root root 4096 Jun 14 15:43 KLMD_SHA3_512
-r--r--r-- 1 root root 4096 Jun 15 10:33 KLMD_SHA_512
-r--r--r-- 1 root root 4096 Jun 14 15:43 KLMD_SHAKE_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KLMD_SHAKE_256
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMAC_AES_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMAC_AES_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMAC_AES_256
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMAC_DEA
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMAC_ENCRYPTED_AES_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMAC_ENCRYPTED_AES_192
-r--r--r-- 1 root root 4096 Jun 14 15:43 KMAC_ENCRYPTED_AES_256
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMAC_ENCRYPTED_DEA
-r--r--r-- 1 root root 4096 Jun 14 15:43 KMAC_ENCRYPTED_TDEA_128
-r--r--r-- 1 root root 4096 Jun 14 15:43 KMAC_ENCRYPTED_TDEA_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMAC_TDEA_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMAC_TDEA_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 KM_AES_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KM_AES_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 KM_AES_256
-r--r--r-- 1 root root 4096 Jun 14 15:43 KMA_GCM_AES_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMA_GCM_AES_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMA_GCM_AES_256
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMA_GCM_ENCRYPTED_AES_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMA_GCM_ENCRYPTED_AES_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMA_GCM_ENCRYPTED_AES_256
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMC_AES_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMC_AES_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMC_AES_256
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMC_DEA
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMC_ENCRYPTED_AES_128
-r--r--r-- 1 root root 4096 Jun 14 15:43 KMC_ENCRYPTED_AES_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMC_ENCRYPTED_AES_256
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMC_ENCRYPTED_DEA
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMC_ENCRYPTED_TDEA_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMC_ENCRYPTED_TDEA_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMC_PRNG
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMC_TDEA_128
-r--r--r-- 1 root root 4096 Jun 14 15:43 KMC_TDEA_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMCTR_AES_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMCTR_AES_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMCTR_AES_256
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMCTR_DEA
-r--r--r-- 1 root root 4096 Jun 14 15:43 KMCTR_ENCRYPTED_AES_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMCTR_ENCRYPTED_AES_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMCTR_ENCRYPTED_AES_256
-r--r--r-- 1 root root 4096 Jun 14 15:43 KMCTR_ENCRYPTED_DEA
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMCTR_ENCRYPTED_TDEA_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMCTR_ENCRYPTED_TDEA_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMCTR_TDEA_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMCTR_TDEA_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 KM_DEA
-r--r--r-- 1 root root 4096 Jun 15 10:33 KM_ENCRYPTED_AES_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KM_ENCRYPTED_AES_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 KM_ENCRYPTED_AES_256
-r--r--r-- 1 root root 4096 Jun 15 10:33 KM_ENCRYPTED_DEA
-r--r--r-- 1 root root 4096 Jun 15 10:33 KM_ENCRYPTED_TDEA_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KM_ENCRYPTED_TDEA_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMF_AES_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMF_AES_192
-r--r--r-- 1 root root 4096 Jun 14 15:43 KMF_AES_256
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMF_DEA
-r--r--r-- 1 root root 4096 Jun 14 15:43 KMF_ENCRYPTED_AES_128
-r--r--r-- 1 root root 4096 Jun 14 15:43 KMF_ENCRYPTED_AES_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMF_ENCRYPTED_AES_256
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMF_ENCRYPTED_DEA
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMF_ENCRYPTED_TDEA_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMF_ENCRYPTED_TDEA_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMF_TDEA_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMF_TDEA_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMO_AES_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMO_AES_192
-r--r--r-- 1 root root 4096 Jun 14 15:43 KMO_AES_256
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMO_DEA
-r--r--r-- 1 root root 4096 Jun 14 15:43 KMO_ENCRYPTED_AES_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMO_ENCRYPTED_AES_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMO_ENCRYPTED_AES_256
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMO_ENCRYPTED_DEA
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMO_ENCRYPTED_TDEA_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMO_ENCRYPTED_TDEA_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMO_TDEA_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KMO_TDEA_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 KM_TDEA_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KM_TDEA_192
-r--r--r-- 1 root root 4096 Jun 14 15:43 KM_XTS_AES_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KM_XTS_AES_256
-r--r--r-- 1 root root 4096 Jun 15 10:33 KM_XTS_ENCRYPTED_AES_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 KM_XTS_ENCRYPTED_AES_256
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_LAST_BLOCK_CMAC_USING_AES_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_LAST_BLOCK_CMAC_USING_AES_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_LAST_BLOCK_CMAC_USING_AES_256
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_LAST_BLOCK_CMAC_USING_DEA
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_LAST_BLOCK_CMAC_USING_ENCRYPTED_AES_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_LAST_BLOCK_CMAC_USING_ENCRYPTED_AES_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_LAST_BLOCK_CMAC_USING_ENCRYPTED_AES_256A
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_LAST_BLOCK_CMAC_USING_ENCRYPTED_DEA
-r--r--r-- 1 root root 4096 Jun 14 15:43 PCC_COMPUTE_LAST_BLOCK_CMAC_USING_ENCRYPTED_TDEA_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_LAST_BLOCK_CMAC_USING_ENCRYPTED_TDEA_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_LAST_BLOCK_CMAC_USING_TDEA_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_LAST_BLOCK_CMAC_USING_TDEA_192
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_XTS_PARAMETER_USING_AES_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_XTS_PARAMETER_USING_AES_256
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_XTS_PARAMETER_USING_ENCRYPTED_AES_128
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_COMPUTE_XTS_PARAMETER_USING_ENCRYPTED_AES_256
-r--r--r-- 1 root root 4096 Jun 14 15:43 PCC_SCALAR_MULTIPLY_ED25519
-r--r--r-- 1 root root 4096 Jun 14 15:43 PCC_SCALAR_MULTIPLY_ED448
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_SCALAR_MULTIPLY_P256
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_SCALAR_MULTIPLY_P384
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_SCALAR_MULTIPLY_P521
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_SCALAR_MULTIPLY_X25519
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCC_SCALAR_MULTIPLY_X448
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCKMO_ENCRYPT_AES_128_KEY
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCKMO_ENCRYPT_AES_192_KEY
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCKMO_ENCRYPT_AES_256_KEY
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCKMO_ENCRYPT_DEA_KEY
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCKMO_ENCRYPT_ECC_ED25519_KEY
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCKMO_ENCRYPT_ECC_ED448_KEY
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCKMO_ENCRYPT_ECC_P256_KEY
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCKMO_ENCRYPT_ECC_P384_KEY
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCKMO_ENCRYPT_ECC_P521_KEY
-r--r--r-- 1 root root 4096 Jun 15 10:33 PCKMO_ENCRYPT_TDEA_128_KEY
-r--r--r-- 1 root root 4096 Jun 14 15:43 PCKMO_ENCRYPT_TDEA_192_KEY
-r--r--r-- 1 root root 4096 Jun 15 10:33 PRNO_SHA_512_DRNG
-r--r--r-- 1 root root 4096 Jun 15 10:33 PRNO_TRNG
-r--r--r-- 1 root root 4096 Jun 15 10:33 PRNO_TRNG_QUERY_RAW_TO_CONDITIONED_RATIO

/sys/devices/pai_crypto/format:
total 0
-r--r--r-- 1 root root 4096 Jun 14 13:03 event

/sys/devices/pai_ext:
total 0
drwxr-xr-x 2 root root    0 Jun 14 13:03 events
drwxr-xr-x 2 root root    0 Jun 14 13:03 format
-rw-r--r-- 1 root root 4096 Jun 15 10:49 perf_event_mux_interval_ms
lrwxrwxrwx 1 root root    0 Jun 15 10:49 subsystem -> ../../bus/event_source
-r--r--r-- 1 root root 4096 Jun 14 13:03 type
-rw-r--r-- 1 root root 4096 Jun 15 10:49 uevent

/sys/devices/pai_ext/events:
total 0
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_1MFRAME
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_2GFRAME
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_ACCESSEXCEPT
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_ADD
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_ALL
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_AVGPOOL2D
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_BATCHNORM
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_CONVOLUTION
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_DIV
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_EXP
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_GRUACT
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_IBM_RESERVED_9
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_LARGEDIM
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_LOG
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_LSTMACT
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_MATMUL_OP
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_MATMUL_OP_BCAST23
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_MAX
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_MAXPOOL2D
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_MIN
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_MUL
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_RELU
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_SIGMOID
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_SMALLBATCH
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_SMALLTENSOR
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_SOFTMAX
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_SUB
-r--r--r-- 1 root root 4096 Jun 14 13:03 NNPA_TANH

/sys/devices/pai_ext/format:
total 0
-r--r--r-- 1 root root 4096 Jun 14 13:03 event


* Re: perf test failures in linux-next on s390
  2023-06-14 14:57     ` Ian Rogers
  2023-06-15  8:57       ` Thomas Richter
@ 2023-06-15  9:39       ` Thomas Richter
  2023-06-15 14:34         ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 15+ messages in thread
From: Thomas Richter @ 2023-06-15  9:39 UTC (permalink / raw)
  To: Ian Rogers; +Cc: linux-perf-use., Arnaldo Carvalho de Melo, Sumanth Korikkar

On 6/14/23 16:57, Ian Rogers wrote:
> On Wed, Jun 14, 2023 at 1:32 AM Thomas Richter <tmricht@linux.ibm.com> wrote:

....

> 
> Smaller list is okay. Could you send me a zip of the sysfs
> (/sys/devices) ? At least one issue is that the code didn't find a
> core PMU. On non-hybrid x86 this would be /sys/devices/cpu, I think we
> spoke about this before for s390 and there are >1. The issue here is
> that the test found 0, and we're trying to use PMUs in the code now as
> a way to sort events. There's code/comment in util/pmu.c:
> 
> '''
> /**
>  * is_sysfs_pmu_core() - PMU CORE devices have different name other than cpu in
>  *         sysfs on some platforms like ARM or Intel hybrid. Looking for
>  *         possible the cpus file in sysfs files to identify whether this is a
>  *         core device.
>  * @name: The PMU name such as "cpu_atom".
>  */
> static int is_sysfs_pmu_core(const char *name)
> {
> 	char path[PATH_MAX];
> 
> 	if (!perf_pmu__pathname_scnprintf(path, sizeof(path), name, "cpus"))
> 		return 0;
> 	return file_available(path);
> }
> ...
> bool is_pmu_core(const char *name)
> {
> 	return !strcmp(name, "cpu") || is_sysfs_pmu_core(name);
> }
> '''
> 
> Thanks,
> Ian
> 

Maybe we should scan the directory

[linux-next]# ll /sys/bus/event_source/devices
total 0
lrwxrwxrwx 1 root root 0 Jun  2 15:11 cpum_cf -> ../../../devices/cpum_cf
lrwxrwxrwx 1 root root 0 Jun  2 15:11 cpum_cf_diag -> ../../../devices/cpum_cf_diag
lrwxrwxrwx 1 root root 0 Jun  2 15:11 cpum_sf -> ../../../devices/cpum_sf
lrwxrwxrwx 1 root root 0 Jun  2 15:11 kprobe -> ../../../devices/kprobe
lrwxrwxrwx 1 root root 0 Jun  2 15:11 software -> ../../../devices/software
lrwxrwxrwx 1 root root 0 Jun  2 15:11 tracepoint -> ../../../devices/tracepoint
lrwxrwxrwx 1 root root 0 Jun  2 15:11 uprobe -> ../../../devices/uprobe
[linux-next]#

This directory lists the PMUs available on s390; maybe this is true for
other platforms as well...
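
A minimal sketch of such a scan, to make the idea concrete. The function name
sketch_list_pmus() and its parameters are made up for illustration and are
not part of perf. It only enumerates the entries; how to decide which of them
is a core PMU is a separate question.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: walk devices_dir (for example
 * /sys/bus/event_source/devices), print one PMU name per line to out
 * (skipped when out is NULL) and return the number of entries found.
 * Returns -1 if the directory cannot be opened.
 */
static int sketch_list_pmus(const char *devices_dir, FILE *out)
{
	DIR *dir = opendir(devices_dir);
	struct dirent *ent;
	int count = 0;

	if (!dir)
		return -1;

	while ((ent = readdir(dir)) != NULL) {
		/* Skip the directory's own "." and ".." entries. */
		if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, ".."))
			continue;
		if (out)
			fprintf(out, "%s\n", ent->d_name);
		count++;
	}
	closedir(dir);
	return count;
}
```

Called as sketch_list_pmus("/sys/bus/event_source/devices", stdout), this
should print the same names as the ll output above, one per line.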

Just my 2 cents

-- 
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
--
Vorsitzender des Aufsichtsrats: Gregor Pillen
Geschäftsführung: David Faller
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294


* Re: perf test failures in linux-next on s390
  2023-06-15  9:39       ` Thomas Richter
@ 2023-06-15 14:34         ` Arnaldo Carvalho de Melo
  2023-06-16 14:23           ` Ian Rogers
  0 siblings, 1 reply; 15+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-06-15 14:34 UTC (permalink / raw)
  To: Thomas Richter
  Cc: Ian Rogers, linux-perf-use., Sumanth Korikkar, James Clark,
	Leo Yan, Suzuki K Poulose, Mike Leach, Mark Rutland, John Garry,
	Will Deacon

Ccing the ARM people too:

Em Thu, Jun 15, 2023 at 11:39:16AM +0200, Thomas Richter escreveu:
> On 6/14/23 16:57, Ian Rogers wrote:
> > On Wed, Jun 14, 2023 at 1:32 AM Thomas Richter <tmricht@linux.ibm.com> wrote:
> 
> ....
> 
> > 
> > Smaller list is okay. Could you send me a zip of the sysfs
> > (/sys/devices) ? At least one issue is that the code didn't find a
> > core PMU. On non-hybrid x86 this would be /sys/devices/cpu, I think we
> > spoke about this before for s390 and there are >1. The issue here is
> > that the test found 0, and we're trying to use PMUs in the code now as
> > a way to sort events. There's code/comment in util/pmu.c:
> > 
> > '''
> > /**
> >  * is_sysfs_pmu_core() - PMU CORE devices have different name other than cpu in
> >  *         sysfs on some platforms like ARM or Intel hybrid. Looking for
> >  *         possible the cpus file in sysfs files to identify whether this is a
> >  *         core device.
> >  * @name: The PMU name such as "cpu_atom".
> >  */
> > static int is_sysfs_pmu_core(const char *name)
> > {
> > 	char path[PATH_MAX];
> > 
> > 	if (!perf_pmu__pathname_scnprintf(path, sizeof(path), name, "cpus"))
> > 		return 0;
> > 	return file_available(path);
> > }
> > ...
> > bool is_pmu_core(const char *name)
> > {
> > 	return !strcmp(name, "cpu") || is_sysfs_pmu_core(name);
> > }
> > '''
> 
> Maybe we should scan the directory
> 
> [linux-next]# ll /sys/bus/event_source/devices
> total 0
> lrwxrwxrwx 1 root root 0 Jun  2 15:11 cpum_cf -> ../../../devices/cpum_cf
> lrwxrwxrwx 1 root root 0 Jun  2 15:11 cpum_cf_diag -> ../../../devices/cpum_cf_diag
> lrwxrwxrwx 1 root root 0 Jun  2 15:11 cpum_sf -> ../../../devices/cpum_sf
> lrwxrwxrwx 1 root root 0 Jun  2 15:11 kprobe -> ../../../devices/kprobe
> lrwxrwxrwx 1 root root 0 Jun  2 15:11 software -> ../../../devices/software
> lrwxrwxrwx 1 root root 0 Jun  2 15:11 tracepoint -> ../../../devices/tracepoint
> lrwxrwxrwx 1 root root 0 Jun  2 15:11 uprobe -> ../../../devices/uprobe
> [linux-next]#
> 
> This directory lists the PMUs available on s390, maybe this is true for
> other platform...

I noticed this on an arm64 board:

acme@roc-rk3399-pc:~/git/perf-tools-next$ perf stat -e cycles:u,instructions:u ls
COPYING  CREDITS  Documentation  Kbuild  Kconfig  LICENSES  MAINTAINERS  Makefile  README  arch  block	certs  crypto  drivers	fs  include  init  io_uring  ipc  kernel  lib  mm  net	perf.data  rust  samples  scripts  security  sound  tools  usr	virt

 Performance counter stats for 'ls':

   <not supported>      armv8_cortex_a72/cycles:u/
   <not supported>      armv8_cortex_a53/cycles:u/
   <not supported>      armv8_cortex_a72/instructions:u/
   <not supported>      armv8_cortex_a53/instructions:u/

       0.009192788 seconds time elapsed

       0.000000000 seconds user
       0.009411000 seconds sys


acme@roc-rk3399-pc:~/git/perf-tools-next$

root@roc-rk3399-pc:~# ls -la /sys/bus/event_source/devices
total 0
drwxr-xr-x 2 root root 0 Jan  1  1970 .
drwxr-xr-x 4 root root 0 Jan  1  1970 ..
lrwxrwxrwx 1 root root 0 Jan  1  1970 armv8_cortex_a53 -> ../../../devices/armv8_cortex_a53
lrwxrwxrwx 1 root root 0 Jan  1  1970 armv8_cortex_a72 -> ../../../devices/armv8_cortex_a72
lrwxrwxrwx 1 root root 0 Jan  1  1970 breakpoint -> ../../../devices/breakpoint
lrwxrwxrwx 1 root root 0 Jun 14 21:40 cs_etm -> ../../../devices/cs_etm
lrwxrwxrwx 1 root root 0 Jan  1  1970 software -> ../../../devices/software
lrwxrwxrwx 1 root root 0 Jan  1  1970 tracepoint -> ../../../devices/tracepoint
lrwxrwxrwx 1 root root 0 Jan  1  1970 uprobe -> ../../../devices/uprobe
root@roc-rk3399-pc:~#

running perf test now:

Linux roc-rk3399-pc 6.1.0-rc5-00123-g4dd7ff4a0311 #2 SMP PREEMPT Wed Nov 16 19:55:11 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux
root@roc-rk3399-pc:~# perf test
  1: vmlinux symtab matches kallsyms                                 : Ok
  2: Detect openat syscall event                                     : Ok
  3: Detect openat syscall event on all cpus                         : Ok
  4: mmap interface tests                                            :
  4.1: Read samples using the mmap interface                         : Ok
  4.2: User space counter reading of instructions                    : Skip (permissions)
  4.3: User space counter reading of cycles                          : Skip (permissions)
  5: Test data source output                                         : Ok
  6: Parse event definition strings                                  :
  6.1: Test event parsing                                            : FAILED!
  6.2: Parsing of all PMU events from sysfs                          : Ok
  6.3: Parsing of given PMU events from sysfs                        : Ok
  6.4: Parsing of aliased events from sysfs                          : Skip (no aliases in sysfs)
  6.5: Parsing of aliased events                                     : Ok
  6.6: Parsing of terms (event modifiers)                            : Ok
  7: Simple expression parser                                        : Ok
  8: PERF_RECORD_* events & perf_sample fields                       : Ok
  9: Parse perf pmu format                                           : Ok
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
 10.5: Parsing of metric thresholds with fake PMUs                   : Ok
 11: DSO data read                                                   : Ok
 12: DSO data cache                                                  : Ok
 13: DSO data reopen                                                 : Ok
 14: Roundtrip evsel->name                                           : Ok
 15: Parse sched tracepoints fields                                  : Ok
  16: syscalls:sys_enter_openat event fields                          : Ok
 17: Setup struct perf_event_attr                                    : Skip
 18: Match and link multiple hists                                   : Ok
 19: 'import perf' in python                                         : FAILED!
 20: Breakpoint overflow signal handler                              : Skip
 21: Breakpoint overflow sampling                                    : Skip
 22: Breakpoint accounting                                           : Ok
 23: Watchpoint                                                      :
 23.1: Read Only Watchpoint                                          : Ok
 23.2: Write Only Watchpoint                                         : Ok
 23.3: Read / Write Watchpoint                                       : Ok
 23.4: Modify Watchpoint                                             :
...





acme@roc-rk3399-pc:~/git/perf-tools-next$ cat /proc/cpuinfo
processor	: 0
BogoMIPS	: 48.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0xd03
CPU revision	: 4

processor	: 1
BogoMIPS	: 48.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0xd03
CPU revision	: 4

processor	: 2
BogoMIPS	: 48.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0xd03
CPU revision	: 4

processor	: 3
BogoMIPS	: 48.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0xd03
CPU revision	: 4

processor	: 4
BogoMIPS	: 48.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0xd08
CPU revision	: 2

processor	: 5
BogoMIPS	: 48.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0xd08
CPU revision	: 2

acme@roc-rk3399-pc:~/git/perf-tools-next$

root@roc-rk3399-pc:~# dmidecode
# dmidecode 3.3
Getting SMBIOS data from sysfs.
SMBIOS 3.0 present.
7 structures occupying 287 bytes.
Table at 0xF0E3C020.

Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
	Vendor: U-Boot
	Version: 2022.10-rc5+
	Release Date: 10/01/2022
	ROM Size: 64 kB
	Characteristics:
		PCI is supported
		BIOS is upgradeable
		Selectable boot is supported
		Targeted content distribution is supported
		UEFI is supported
	BIOS Revision: 22.10

Handle 0x0001, DMI type 1, 27 bytes
System Information
	Manufacturer: libre-computer
	Product Name: roc-rk3399-pc
	Version: Not Specified
	Serial Number: b03c01a7179278b7
	UUID: 63333062-3130-3761-3137-393237386237
	Wake-up Type: Reserved
	SKU Number: Not Specified
	Family: Not Specified

Handle 0x0002, DMI type 2, 14 bytes
Base Board Information
	Manufacturer: libre-computer
	Product Name: roc-rk3399-pc
	Version: Not Specified
	Serial Number: Not Specified
	Asset Tag: Not Specified
	Features:
		Board is a hosting board
	Location In Chassis: Not Specified
	Chassis Handle: 0x0000
	Type: Motherboard

Handle 0x0003, DMI type 3, 21 bytes
Chassis Information
	Manufacturer: libre-computer
	Type: Desktop
	Lock: Not Present
	Version: Not Specified
	Serial Number: Not Specified
	Asset Tag: Not Specified
	Boot-up State: Safe
	Power Supply State: Safe
	Thermal State: Safe
	Security Status: None
	OEM Information: 0x00000000
	Height: Unspecified
	Number Of Power Cords: Unspecified
	Contained Elements: 0

Handle 0x0004, DMI type 4, 48 bytes
Processor Information
	Socket Designation: Not Specified
	Type: Central Processor
	Family: Unknown
	Manufacturer: Unknown
	ID: 00 00 00 00 00 00 00 00
	Version: Unknown
	Voltage: Unknown
	External Clock: Unknown
	Max Speed: Unknown
	Current Speed: Unknown
	Status: Unpopulated
	Upgrade: None
	L1 Cache Handle: Not Provided
	L2 Cache Handle: Not Provided
	L3 Cache Handle: Not Provided
	Serial Number: Not Specified
	Asset Tag: Not Specified
	Part Number: Not Specified
	Characteristics: None

Handle 0x0005, DMI type 32, 11 bytes
System Boot Information
	Status: No errors detected

Handle 0x0006, DMI type 127, 4 bytes
End Of Table

root@roc-rk3399-pc:~#




* Re: perf test failures in linux-next on s390
  2023-06-15 14:34         ` Arnaldo Carvalho de Melo
@ 2023-06-16 14:23           ` Ian Rogers
  2023-06-16 14:36             ` Hybrid PMU issues on aarch64. was: " Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 15+ messages in thread
From: Ian Rogers @ 2023-06-16 14:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Thomas Richter, linux-perf-use., Sumanth Korikkar, James Clark,
	Leo Yan, Suzuki K Poulose, Mike Leach, Mark Rutland, John Garry,
	Will Deacon

On Thu, Jun 15, 2023 at 7:35 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> Ccing the ARM people too:
>
> On Thu, Jun 15, 2023 at 11:39:16AM +0200, Thomas Richter wrote:
> > On 6/14/23 16:57, Ian Rogers wrote:
> > > On Wed, Jun 14, 2023 at 1:32 AM Thomas Richter <tmricht@linux.ibm.com> wrote:
> >
> > ....
> >
> > >
> > > Smaller list is okay. Could you send me a zip of the sysfs
> > > (/sys/devices) ? At least one issue is that the code didn't find a
> > > core PMU. On non-hybrid x86 this would be /sys/devices/cpu, I think we
> > > spoke about this before for s390 and there are >1. The issue here is
> > > that the test found 0, and we're trying to use PMUs in the code now as
> > > a way to sort events. There's code/comment in util/pmu.c:
> > >
> > > '''
> > > /**
> > >  * is_sysfs_pmu_core() - PMU CORE devices have different name other than cpu in
> > >  *         sysfs on some platforms like ARM or Intel hybrid. Looking for
> > >  *         possible the cpus file in sysfs files to identify whether this is a
> > >  *         core device.
> > >  * @name: The PMU name such as "cpu_atom".
> > >  */
> > > static int is_sysfs_pmu_core(const char *name)
> > > {
> > > 	char path[PATH_MAX];
> > >
> > > 	if (!perf_pmu__pathname_scnprintf(path, sizeof(path), name, "cpus"))
> > > 		return 0;
> > > 	return file_available(path);
> > > }
> > > ...
> > > bool is_pmu_core(const char *name)
> > > {
> > > 	return !strcmp(name, "cpu") || is_sysfs_pmu_core(name);
> > > }
> > > '''
> >
> > Maybe we should scan the directory
> >
> > [linux-next]# ll /sys/bus/event_source/devices
> > total 0
> > lrwxrwxrwx 1 root root 0 Jun  2 15:11 cpum_cf -> ../../../devices/cpum_cf
> > lrwxrwxrwx 1 root root 0 Jun  2 15:11 cpum_cf_diag -> ../../../devices/cpum_cf_diag
> > lrwxrwxrwx 1 root root 0 Jun  2 15:11 cpum_sf -> ../../../devices/cpum_sf
> > lrwxrwxrwx 1 root root 0 Jun  2 15:11 kprobe -> ../../../devices/kprobe
> > lrwxrwxrwx 1 root root 0 Jun  2 15:11 software -> ../../../devices/software
> > lrwxrwxrwx 1 root root 0 Jun  2 15:11 tracepoint -> ../../../devices/tracepoint
> > lrwxrwxrwx 1 root root 0 Jun  2 15:11 uprobe -> ../../../devices/uprobe
> > [linux-next]#
> >
> > This directory lists the PMUs available on s390, maybe this is true for
> > other platform...
>
> I noticed this on an arm64 board:
>
> acme@roc-rk3399-pc:~/git/perf-tools-next$ perf stat -e cycles:u,instructions:u ls
> COPYING  CREDITS  Documentation  Kbuild  Kconfig  LICENSES  MAINTAINERS  Makefile  README  arch  block  certs  crypto  drivers  fs  include  init  io_uring  ipc  kernel  lib  mm  net  perf.data  rust  samples  scripts  security  sound  tools  usr  virt
>
>  Performance counter stats for 'ls':
>
>    <not supported>      armv8_cortex_a72/cycles:u/
>    <not supported>      armv8_cortex_a53/cycles:u/
>    <not supported>      armv8_cortex_a72/instructions:u/
>    <not supported>      armv8_cortex_a53/instructions:u/

I tested on a Raspberry Pi and perf-tools-next is working there. I
suspect the issue here is the heterogeneous PMU. The cycles event is
converted into a perf_event_attr with type 0 (PERF_TYPE_HARDWARE) and
config 0. When there are heterogeneous PMUs we try to use the extended
type to say we want the armv8_cortex_a72 and armv8_cortex_a53 cycles
events. Let's say the type numbers of the armv8_cortex_a72 and
armv8_cortex_a53 PMUs are 9 and 10 respectively. With the heterogeneous
encoding the type in the perf_event_attr remains 0 and the configs
become 9 << 32 and 10 << 32. I suspect your kernel is seeing the
extended type information and not handling it, hence the error.
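
To make the encoding concrete, here is a small sketch of how such a config
value is put together. The function and macro names are made up for
illustration (this is not the perf source); the only assumption is the
layout described above, with the PMU type in the upper 32 bits and the
generic hardware event id in the lower 32 bits:

```c
#include <assert.h>
#include <stdint.h>

#define EXT_PMU_TYPE_SHIFT 32	/* assumed: PMU type lives in config bits 63..32 */

/*
 * Illustrative only: build the config of a PERF_TYPE_HARDWARE event that
 * targets one specific PMU. 'hw_id' is the generic event id (0 for cycles),
 * 'pmu_type' is the number read from the PMU's sysfs "type" file.
 */
static uint64_t extended_hw_config(uint32_t pmu_type, uint64_t hw_id)
{
	assert(hw_id <= UINT32_MAX);	/* the event id must fit in the low half */
	return ((uint64_t)pmu_type << EXT_PMU_TYPE_SHIFT) | hw_id;
}
```

With the example numbers, extended_hw_config(9, 0) gives 0x900000000 and
extended_hw_config(10, 0) gives 0xa00000000, while attr.type stays 0.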

We add in the extended type for hardware and legacy cache events in
the parse events code:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n435
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1239
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1478
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1511

The addition of the extended type happens if
perf_pmus__supports_extended_type() returns true, its implementation
is:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/pmus.c?h=perf-tools-next#n480
bool perf_pmus__supports_extended_type(void)
{
  return perf_pmus__num_core_pmus() > 1;
}

Previously on heterogeneous ARM the extended type wouldn't be encoded
and I believe the event was opened on the PMU of the current CPU only.
This is a bug because you will not count events on all PMUs. We can
make perf_pmus__supports_extended_type return false on ARM which
should bring back the previous behavior - or do some kind of dynamic
detection using perf_event_open. We could do some kind of ARM quirk
workaround behavior, for example, I suspect
/sys/bus/event_source/devices/armv8_cortex_a53/events and
/sys/bus/event_source/devices/armv8_cortex_a72/events both contain a
cycles event. If we used a raw rather than hardware type encoding then
the wildcarding should work. Unfortunately there are many encodings
with extended type and sysfs won't have them all.
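
As a rough sketch of what the dynamic detection could look like (this is
not existing perf code, and the function name is made up): try to open a
cycles event with the PMU type encoded in the upper half of config and see
whether the kernel takes it. Note that a failure can have other causes,
such as permissions, so a negative result is only a hint:

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical probe, not perf code: try to open a cycles event with the
 * PMU type in the upper 32 bits of config. Returns 1 if the kernel accepts
 * it, 0 otherwise. A failure may also come from permissions
 * (perf_event_paranoid) or a missing PMU, so 0 is only a hint.
 */
static int kernel_accepts_extended_type(uint32_t pmu_type)
{
	struct perf_event_attr attr;
	long fd;

	assert(pmu_type != 0);	/* 0 would encode no PMU at all */

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = ((uint64_t)pmu_type << 32) | PERF_COUNT_HW_CPU_CYCLES;
	attr.disabled = 1;
	attr.exclude_kernel = 1;
	attr.exclude_hv = 1;

	/* pid 0 = calling thread, cpu -1 = any CPU, no group, no flags */
	fd = syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
	if (fd < 0)
		return 0;
	close((int)fd);
	return 1;
}
```

The pmu_type argument would come from the "type" file of one of the core
PMUs in sysfs.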

Thanks,
Ian



* Hybrid PMU issues on aarch64. was: Re: perf test failures in linux-next on s390
  2023-06-16 14:23           ` Ian Rogers
@ 2023-06-16 14:36             ` Arnaldo Carvalho de Melo
  2023-06-16 14:44               ` Arnaldo Carvalho de Melo
  2023-06-19 10:04               ` Thomas Richter
  0 siblings, 2 replies; 15+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-06-16 14:36 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Thomas Richter, linux-perf-use., Sumanth Korikkar, James Clark,
	Leo Yan, Suzuki K Poulose, Mike Leach, Mark Rutland, John Garry,
	Will Deacon

On Fri, Jun 16, 2023 at 07:23:30AM -0700, Ian Rogers wrote:
> On Thu, Jun 15, 2023 at 7:35 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > Ccing the ARM people too:
> > On Thu, Jun 15, 2023 at 11:39:16AM +0200, Thomas Richter wrote:
> > > On 6/14/23 16:57, Ian Rogers wrote:
> > > > On Wed, Jun 14, 2023 at 1:32 AM Thomas Richter <tmricht@linux.ibm.com> wrote:
> > > > bool is_pmu_core(const char *name)
> > > > {
> > > > 	return !strcmp(name, "cpu") || is_sysfs_pmu_core(name);
> > > > }

> > > Maybe we should scan the directory

> > > [linux-next]# ll /sys/bus/event_source/devices
> > > total 0
> > > lrwxrwxrwx 1 root root 0 Jun  2 15:11 cpum_cf -> ../../../devices/cpum_cf
> > > lrwxrwxrwx 1 root root 0 Jun  2 15:11 cpum_cf_diag -> ../../../devices/cpum_cf_diag
> > > lrwxrwxrwx 1 root root 0 Jun  2 15:11 cpum_sf -> ../../../devices/cpum_sf
> > > lrwxrwxrwx 1 root root 0 Jun  2 15:11 kprobe -> ../../../devices/kprobe
> > > lrwxrwxrwx 1 root root 0 Jun  2 15:11 software -> ../../../devices/software
> > > lrwxrwxrwx 1 root root 0 Jun  2 15:11 tracepoint -> ../../../devices/tracepoint
> > > lrwxrwxrwx 1 root root 0 Jun  2 15:11 uprobe -> ../../../devices/uprobe
> > > [linux-next]#

> > > This directory lists the PMUs available on s390, maybe this is true for
> > > other platform...

> > I noticed this on an arm64 board:

> > acme@roc-rk3399-pc:~/git/perf-tools-next$ perf stat -e cycles:u,instructions:u ls
> > COPYING  CREDITS  Documentation  Kbuild  Kconfig  LICENSES  MAINTAINERS  Makefile  README  arch  block  certs  crypto  drivers  fs  include  init  io_uring  ipc  kernel  lib  mm  net  perf.data  rust  samples  scripts  security  sound  tools  usr  virt

> >  Performance counter stats for 'ls':

> >    <not supported>      armv8_cortex_a72/cycles:u/
> >    <not supported>      armv8_cortex_a53/cycles:u/
> >    <not supported>      armv8_cortex_a72/instructions:u/
> >    <not supported>      armv8_cortex_a53/instructions:u/

> I tested on a raspberry pi and perf-tools-next is working there. I
> suspect the issue here is the heterogeneous PMU. The cycles event is
> converted into a perf_event_attr with type 0 and config 0. When there
> are heterogeneous PMUs then we try to use the extended type to say we
> want armv8_cortex_a72 and armv8_cortex_a53 cycles events. Let's say
> the type number of armv8_cortex_a72 and armv8_cortex_a53 PMUs are 9
> and 10 respectively. With heterogeneous encodings the type in the

The numbers are 8 and 7, and the type is PERF_TYPE_HARDWARE (zero, and thus not printed):

root@roc-rk3399-pc:~# perf stat -vv -e cycles sleep 1
Using CPUID 0x00000000410fd080
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
  size                             136
  config                           0x800000000
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 13885  cpu -1  group_fd -1  flags 0x8
sys_perf_event_open failed, error -2
Warning:
cycles event is not supported by the kernel.
------------------------------------------------------------
perf_event_attr:
  size                             136
  config                           0x700000000
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 13885  cpu -1  group_fd -1  flags 0x8
sys_perf_event_open failed, error -2
Warning:
cycles event is not supported by the kernel.
failed to read counter cycles
failed to read counter cycles

 Performance counter stats for 'sleep 1':

   <not supported>      armv8_cortex_a72/cycles/
   <not supported>      armv8_cortex_a53/cycles/

       1.011406938 seconds time elapsed

       0.000000000 seconds user
       0.010886000 seconds sys


root@roc-rk3399-pc:~#

> perf_event_attr remains as 0 and the config becomes 9 << 32 and 10 <<
> 32. I suspect your kernel is seeing the extended type information and
> not handling it, hence the error.

Looks like this is the case indeed.
 
> We add in the extended type for hardware and legacy cache events in
> the parse events code:
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n435
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1239
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1478
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1511
> 
> The addition of the extended type happens if
> perf_pmus__supports_extended_type() returns true, its implementation
> is:
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/pmus.c?h=perf-tools-next#n480
> bool perf_pmus__supports_extended_type(void)
> {
>   return perf_pmus__num_core_pmus() > 1;
> }
> 
> Previously on heterogeneous ARM the extended type wouldn't be encoded
> and I believe the event was opened on the PMU of the current CPU only.

I think that is the case, though I haven't checked so far.

> This is a bug because you will not count events on all PMUs. We can
> make perf_pmus__supports_extended_type return false on ARM which
> should bring back the previous behavior - or do some kind of dynamic

That is the simplest first step; trying it.

> detection using perf_event_open. We could do some kind of ARM quirk
> workaround behavior, for example, I suspect
> /sys/bus/event_source/devices/armv8_cortex_a53/events and
> /sys/bus/event_source/devices/armv8_cortex_a72/events both contain a
> cycles event. If we used a raw rather than hardware type encoding then
> the wildcarding should work. Unfortunately there are many encodings
> with extended type and sysfs won't have them all.
> 
> Thanks,
> Ian
> 
> >  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> >  10.5: Parsing of metric thresholds with fake PMUs                   : Ok
> >  11: DSO data read                                                   : Ok
> >  12: DSO data cache                                                  : Ok
> >  13: DSO data reopen                                                 : Ok
> >  14: Roundtrip evsel->name                                           : Ok
> >  15: Parse sched tracepoints fields                                  : Ok
> >  16: syscalls:sys_enter_openat event fields                          : Ok
> >  17: Setup struct perf_event_attr                                    : Skip
> >  18: Match and link multiple hists                                   : Ok
> >  19: 'import perf' in python                                         : FAILED!
> >  20: Breakpoint overflow signal handler                              : Skip
> >  21: Breakpoint overflow sampling                                    : Skip
> >  22: Breakpoint accounting                                           : Ok
> >  23: Watchpoint                                                      :
> >  23.1: Read Only Watchpoint                                          : Ok
> >  23.2: Write Only Watchpoint                                         : Ok
> >  23.3: Read / Write Watchpoint                                       : Ok
> >  23.4: Modify Watchpoint                                             :
> > ...
> >
> >
> >
> >
> >
> > acme@roc-rk3399-pc:~/git/perf-tools-next$ cat /proc/cpuinfo
> > processor       : 0
> > BogoMIPS        : 48.00
> > Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
> > CPU implementer : 0x41
> > CPU architecture: 8
> > CPU variant     : 0x0
> > CPU part        : 0xd03
> > CPU revision    : 4
> >
> > processor       : 1
> > BogoMIPS        : 48.00
> > Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
> > CPU implementer : 0x41
> > CPU architecture: 8
> > CPU variant     : 0x0
> > CPU part        : 0xd03
> > CPU revision    : 4
> >
> > processor       : 2
> > BogoMIPS        : 48.00
> > Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
> > CPU implementer : 0x41
> > CPU architecture: 8
> > CPU variant     : 0x0
> > CPU part        : 0xd03
> > CPU revision    : 4
> >
> > processor       : 3
> > BogoMIPS        : 48.00
> > Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
> > CPU implementer : 0x41
> > CPU architecture: 8
> > CPU variant     : 0x0
> > CPU part        : 0xd03
> > CPU revision    : 4
> >
> > processor       : 4
> > BogoMIPS        : 48.00
> > Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
> > CPU implementer : 0x41
> > CPU architecture: 8
> > CPU variant     : 0x0
> > CPU part        : 0xd08
> > CPU revision    : 2
> >
> > processor       : 5
> > BogoMIPS        : 48.00
> > Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
> > CPU implementer : 0x41
> > CPU architecture: 8
> > CPU variant     : 0x0
> > CPU part        : 0xd08
> > CPU revision    : 2
> >
> > acme@roc-rk3399-pc:~/git/perf-tools-next$
> >
> > root@roc-rk3399-pc:~# dmidecode
> > # dmidecode 3.3
> > Getting SMBIOS data from sysfs.
> > SMBIOS 3.0 present.
> > 7 structures occupying 287 bytes.
> > Table at 0xF0E3C020.
> >
> > Handle 0x0000, DMI type 0, 24 bytes
> > BIOS Information
> >         Vendor: U-Boot
> >         Version: 2022.10-rc5+
> >         Release Date: 10/01/2022
> >         ROM Size: 64 kB
> >         Characteristics:
> >                 PCI is supported
> >                 BIOS is upgradeable
> >                 Selectable boot is supported
> >                 Targeted content distribution is supported
> >                 UEFI is supported
> >         BIOS Revision: 22.10
> >
> > Handle 0x0001, DMI type 1, 27 bytes
> > System Information
> >         Manufacturer: libre-computer
> >         Product Name: roc-rk3399-pc
> >         Version: Not Specified
> >         Serial Number: b03c01a7179278b7
> >         UUID: 63333062-3130-3761-3137-393237386237
> >         Wake-up Type: Reserved
> >         SKU Number: Not Specified
> >         Family: Not Specified
> >
> > Handle 0x0002, DMI type 2, 14 bytes
> > Base Board Information
> >         Manufacturer: libre-computer
> >         Product Name: roc-rk3399-pc
> >         Version: Not Specified
> >         Serial Number: Not Specified
> >         Asset Tag: Not Specified
> >         Features:
> >                 Board is a hosting board
> >         Location In Chassis: Not Specified
> >         Chassis Handle: 0x0000
> >         Type: Motherboard
> >
> > Handle 0x0003, DMI type 3, 21 bytes
> > Chassis Information
> >         Manufacturer: libre-computer
> >         Type: Desktop
> >         Lock: Not Present
> >         Version: Not Specified
> >         Serial Number: Not Specified
> >         Asset Tag: Not Specified
> >         Boot-up State: Safe
> >         Power Supply State: Safe
> >         Thermal State: Safe
> >         Security Status: None
> >         OEM Information: 0x00000000
> >         Height: Unspecified
> >         Number Of Power Cords: Unspecified
> >         Contained Elements: 0
> >
> > Handle 0x0004, DMI type 4, 48 bytes
> > Processor Information
> >         Socket Designation: Not Specified
> >         Type: Central Processor
> >         Family: Unknown
> >         Manufacturer: Unknown
> >         ID: 00 00 00 00 00 00 00 00
> >         Version: Unknown
> >         Voltage: Unknown
> >         External Clock: Unknown
> >         Max Speed: Unknown
> >         Current Speed: Unknown
> >         Status: Unpopulated
> >         Upgrade: None
> >         L1 Cache Handle: Not Provided
> >         L2 Cache Handle: Not Provided
> >         L3 Cache Handle: Not Provided
> >         Serial Number: Not Specified
> >         Asset Tag: Not Specified
> >         Part Number: Not Specified
> >         Characteristics: None
> >
> > Handle 0x0005, DMI type 32, 11 bytes
> > System Boot Information
> >         Status: No errors detected
> >
> > Handle 0x0006, DMI type 127, 4 bytes
> > End Of Table
> >
> > root@roc-rk3399-pc:~#
> >
> >

-- 

- Arnaldo

* Re: Hybrid PMU issues on aarch64. was: Re: perf test failures in linux-next on s390
  2023-06-16 14:36             ` Hybrid PMU issues on aarch64. was: " Arnaldo Carvalho de Melo
@ 2023-06-16 14:44               ` Arnaldo Carvalho de Melo
  2023-06-16 16:28                 ` Ian Rogers
  2023-06-19 10:04               ` Thomas Richter
  1 sibling, 1 reply; 15+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-06-16 14:44 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Thomas Richter, linux-perf-use., Sumanth Korikkar, James Clark,
	Leo Yan, Suzuki K Poulose, Mike Leach, Mark Rutland, John Garry,
	Will Deacon

Em Fri, Jun 16, 2023 at 11:36:27AM -0300, Arnaldo Carvalho de Melo escreveu:
> > > I noticed this on an arm64 board:
> 
> > > acme@roc-rk3399-pc:~/git/perf-tools-next$ perf stat -e cycles:u,instructions:u ls
> > > COPYING  CREDITS  Documentation  Kbuild  Kconfig  LICENSES  MAINTAINERS  Makefile  README  arch  block  certs  crypto  drivers  fs  include  init  io_uring  ipc  kernel  lib  mm  net  perf.data  rust  samples  scripts  security  sound  tools  usr  virt
> 
> > >  Performance counter stats for 'ls':
> 
> > >    <not supported>      armv8_cortex_a72/cycles:u/
> > >    <not supported>      armv8_cortex_a53/cycles:u/
> > >    <not supported>      armv8_cortex_a72/instructions:u/
> > >    <not supported>      armv8_cortex_a53/instructions:u/
> 
> > I tested on a raspberry pi and perf-tools-next is working there. I
> > suspect the issue here is the heterogeneous PMU. The cycles event is
> > converted into a perf_event_attr with type 0 and config 0. When there
> > are heterogeneous PMUs then we try to use the extended type to say we
> > want armv8_cortex_a72 and armv8_cortex_a53 cycles events. Let's say
> > the type number of armv8_cortex_a72 and armv8_cortex_a53 PMUs are 9
> > and 10 respectively. With heterogeneous encodings the type in the
> 
> The numbers are 8 and 7, PERF_TYPE_HW (thus zero, thus not printed):
> 
> root@roc-rk3399-pc:~# perf stat -vv -e cycles sleep 1
> Using CPUID 0x00000000410fd080
> Control descriptor is not initialized
> ------------------------------------------------------------
> perf_event_attr:
>   size                             136
>   config                           0x800000000
> ------------------------------------------------------------
> sys_perf_event_open: pid 13885  cpu -1  group_fd -1  flags 0x8
> sys_perf_event_open failed, error -2
> Warning:
> cycles event is not supported by the kernel.
> ------------------------------------------------------------
> perf_event_attr:
>   size                             136
>   config                           0x700000000
> ------------------------------------------------------------
> sys_perf_event_open: pid 13885  cpu -1  group_fd -1  flags 0x8
> sys_perf_event_open failed, error -2
> Warning:
> cycles event is not supported by the kernel.
> failed to read counter cycles
> failed to read counter cycles
> 
>  Performance counter stats for 'sleep 1':
> 
>    <not supported>      armv8_cortex_a72/cycles/
>    <not supported>      armv8_cortex_a53/cycles/
> 
>        1.011406938 seconds time elapsed
> 
>        0.000000000 seconds user
>        0.010886000 seconds sys
> 
> 
> root@roc-rk3399-pc:~#
> 
> > perf_event_attr remains as 0 and the config becomes 9 << 32 and 10 <<
> > 32. I suspect your kernel is seeing the extended type information and
> > not handling it, hence the error.
> 
> looks this is the case indeed
>  
> > We add in the extended type for hardware and legacy cache events in
> > the parse events code:
> > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n435
> > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1239
> > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1478
> > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1511
> > 
> > The addition of the extended type happens if
> > perf_pmus__supports_extended_type() returns true, its implementation
> > is:
> > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/pmus.c?h=perf-tools-next#n480
> > bool perf_pmus__supports_extended_type(void)
> > {
> >   return perf_pmus__num_core_pmus() > 1;
> > }
> > 
> > Previously on heterogeneous ARM the extended type wouldn't be encoded
> > and I believe the event was opened on the PMU of the current CPU only.
> 
> I think that is the case, haven't checked so far tho.
> 
> > This is a bug because you will not count events on all PMUs. We can
> > make perf_pmus__supports_extended_type return false on ARM which
> > should bring back the previous behavior - or do some kind of dynamic
> 
> simplest first step, trying it.

Spot on:

acme@roc-rk3399-pc:~/git/perf-tools-next$ perf stat ls
COPYING  CREDITS  Documentation  Kbuild  Kconfig  LICENSES  MAINTAINERS  Makefile  README  arch  block	certs  crypto  drivers	fs  include  init  io_uring  ipc  kernel  lib  mm  net	rust  samples  scripts	security  sound  tools	usr  virt

 Performance counter stats for 'ls':

              9.01 msec task-clock:u                     #    0.401 CPUs utilized
                 0      context-switches:u               #    0.000 /sec
                 0      cpu-migrations:u                 #    0.000 /sec
                84      page-faults:u                    #    9.320 K/sec
           1188641      cycles:u                         #    0.132 GHz
            601132      instructions:u                   #    0.51  insn per cycle
             64768      branches:u                       #    7.186 M/sec
             11680      branch-misses:u                  #   18.03% of all branches

       0.022502514 seconds time elapsed

       0.000000000 seconds user
       0.022946000 seconds sys


acme@roc-rk3399-pc:~/git/perf-tools-next$
acme@roc-rk3399-pc:~/git/perf-tools-next$ perf record ls
COPYING  CREDITS  Documentation  Kbuild  Kconfig  LICENSES  MAINTAINERS  Makefile  README  arch  block	certs  crypto  drivers	fs  include  init  io_uring  ipc  kernel  lib  mm  net	perf.data  rust  samples  scripts  security  sound  tools  usr	virt
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.003 MB perf.data (18 samples) ]
acme@roc-rk3399-pc:~/git/perf-tools-next$ perf evlist
cycles:Pu
dummy:HGu
acme@roc-rk3399-pc:~/git/perf-tools-next$ ldd ~/bin/perf | grep asan
	libasan.so.6 => /lib/aarch64-linux-gnu/libasan.so.6 (0x0000ffffa5a00000)
acme@roc-rk3399-pc:~/git/perf-tools-next$

That is with the following patch applied. Do you want to submit it, or may
I add it as is, using an edited version of the discussion in this thread as
the commit log message?

Thanks!

- Arnaldo

diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
index a2032c1b7644..9af961105a64 100644
--- a/tools/perf/util/pmus.c
+++ b/tools/perf/util/pmus.c
@@ -494,7 +494,13 @@ int perf_pmus__num_core_pmus(void)
 
 bool perf_pmus__supports_extended_type(void)
 {
+#if defined(__aarch64__)
+	// We can't use the extended type information where the PMU number
+	// is encoded in the upper perf_event_attr::type bits. (<< 32).
+	return false;
+#else
 	return perf_pmus__num_core_pmus() > 1;
+#endif
 }
 
 struct perf_pmu *evsel__find_pmu(const struct evsel *evsel)

* Re: Hybrid PMU issues on aarch64. was: Re: perf test failures in linux-next on s390
  2023-06-16 14:44               ` Arnaldo Carvalho de Melo
@ 2023-06-16 16:28                 ` Ian Rogers
  2023-06-16 16:53                   ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 15+ messages in thread
From: Ian Rogers @ 2023-06-16 16:28 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Thomas Richter, linux-perf-use., Sumanth Korikkar, James Clark,
	Leo Yan, Suzuki K Poulose, Mike Leach, Mark Rutland, John Garry,
	Will Deacon

On Fri, Jun 16, 2023 at 7:44 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> Em Fri, Jun 16, 2023 at 11:36:27AM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > I noticed this on an arm64 board:
> >
> > > > acme@roc-rk3399-pc:~/git/perf-tools-next$ perf stat -e cycles:u,instructions:u ls
> > > > COPYING  CREDITS  Documentation  Kbuild  Kconfig  LICENSES  MAINTAINERS  Makefile  README  arch  block  certs  crypto  drivers  fs  include  init  io_uring  ipc  kernel  lib  mm  net  perf.data  rust  samples  scripts  security  sound  tools  usr  virt
> >
> > > >  Performance counter stats for 'ls':
> >
> > > >    <not supported>      armv8_cortex_a72/cycles:u/
> > > >    <not supported>      armv8_cortex_a53/cycles:u/
> > > >    <not supported>      armv8_cortex_a72/instructions:u/
> > > >    <not supported>      armv8_cortex_a53/instructions:u/
> >
> > > I tested on a raspberry pi and perf-tools-next is working there. I
> > > suspect the issue here is the heterogeneous PMU. The cycles event is
> > > converted into a perf_event_attr with type 0 and config 0. When there
> > > are heterogeneous PMUs then we try to use the extended type to say we
> > > want armv8_cortex_a72 and armv8_cortex_a53 cycles events. Let's say
> > > the type number of armv8_cortex_a72 and armv8_cortex_a53 PMUs are 9
> > > and 10 respectively. With heterogeneous encodings the type in the
> >
> > The numbers are 8 and 7, PERF_TYPE_HW (thus zero, thus not printed):
> >
> > root@roc-rk3399-pc:~# perf stat -vv -e cycles sleep 1
> > Using CPUID 0x00000000410fd080
> > Control descriptor is not initialized
> > ------------------------------------------------------------
> > perf_event_attr:
> >   size                             136
> >   config                           0x800000000
> > ------------------------------------------------------------
> > sys_perf_event_open: pid 13885  cpu -1  group_fd -1  flags 0x8
> > sys_perf_event_open failed, error -2
> > Warning:
> > cycles event is not supported by the kernel.
> > ------------------------------------------------------------
> > perf_event_attr:
> >   size                             136
> >   config                           0x700000000
> > ------------------------------------------------------------
> > sys_perf_event_open: pid 13885  cpu -1  group_fd -1  flags 0x8
> > sys_perf_event_open failed, error -2
> > Warning:
> > cycles event is not supported by the kernel.
> > failed to read counter cycles
> > failed to read counter cycles
> >
> >  Performance counter stats for 'sleep 1':
> >
> >    <not supported>      armv8_cortex_a72/cycles/
> >    <not supported>      armv8_cortex_a53/cycles/
> >
> >        1.011406938 seconds time elapsed
> >
> >        0.000000000 seconds user
> >        0.010886000 seconds sys
> >
> >
> > root@roc-rk3399-pc:~#
> >
> > > perf_event_attr remains as 0 and the config becomes 9 << 32 and 10 <<
> > > 32. I suspect your kernel is seeing the extended type information and
> > > not handling it, hence the error.
> >
> > looks this is the case indeed
> >
> > > We add in the extended type for hardware and legacy cache events in
> > > the parse events code:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n435
> > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1239
> > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1478
> > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1511
> > >
> > > The addition of the extended type happens if
> > > perf_pmus__supports_extended_type() returns true, its implementation
> > > is:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/pmus.c?h=perf-tools-next#n480
> > > bool perf_pmus__supports_extended_type(void)
> > > {
> > >   return perf_pmus__num_core_pmus() > 1;
> > > }
> > >
> > > Previously on heterogeneous ARM the extended type wouldn't be encoded
> > > and I believe the event was opened on the PMU of the current CPU only.
> >
> > I think that is the case, haven't checked so far tho.
> >
> > > This is a bug because you will not count events on all PMUs. We can
> > > make perf_pmus__supports_extended_type return false on ARM which
> > > should bring back the previous behavior - or do some kind of dynamic
> >
> > simplest first step, trying it.
>
> Spot on:
>
> acme@roc-rk3399-pc:~/git/perf-tools-next$ perf stat ls
> COPYING  CREDITS  Documentation  Kbuild  Kconfig  LICENSES  MAINTAINERS  Makefile  README  arch  block  certs  crypto  drivers  fs  include  init  io_uring  ipc  kernel  lib  mm  net  rust  samples  scripts  security  sound  tools  usr  virt
>
>  Performance counter stats for 'ls':
>
>               9.01 msec task-clock:u                     #    0.401 CPUs utilized
>                  0      context-switches:u               #    0.000 /sec
>                  0      cpu-migrations:u                 #    0.000 /sec
>                 84      page-faults:u                    #    9.320 K/sec
>            1188641      cycles:u                         #    0.132 GHz
>             601132      instructions:u                   #    0.51  insn per cycle
>              64768      branches:u                       #    7.186 M/sec
>              11680      branch-misses:u                  #   18.03% of all branches
>
>        0.022502514 seconds time elapsed
>
>        0.000000000 seconds user
>        0.022946000 seconds sys
>
>
> acme@roc-rk3399-pc:~/git/perf-tools-next$
> acme@roc-rk3399-pc:~/git/perf-tools-next$ perf record ls
> COPYING  CREDITS  Documentation  Kbuild  Kconfig  LICENSES  MAINTAINERS  Makefile  README  arch  block  certs  crypto  drivers  fs  include  init  io_uring  ipc  kernel  lib  mm  net  perf.data  rust  samples  scripts  security  sound  tools  usr  virt
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.003 MB perf.data (18 samples) ]
> acme@roc-rk3399-pc:~/git/perf-tools-next$ perf evlist
> cycles:Pu
> dummy:HGu
> acme@roc-rk3399-pc:~/git/perf-tools-next$ ldd ~/bin/perf | grep asan
>         libasan.so.6 => /lib/aarch64-linux-gnu/libasan.so.6 (0x0000ffffa5a00000)
> acme@roc-rk3399-pc:~/git/perf-tools-next$
>
> With the following patch. Do you want to submit it or may I add it as is
> using an edited discussion in this thread as the commit log message?
>
> Thanks!
>
> - Arnaldo

Hi Arnaldo,

presumably with the #ifdef you just get 1 PMU - shame. I think rather
than do an #ifdef we can do something like call is_event_supported:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/print-events.c?h=perf-tools-next#n232
so:
bool perf_pmus__supports_extended_type(void)
{
  struct perf_pmu *pmu = NULL;

  if (perf_pmus__num_core_pmus() <= 1)
    return false;

  /* Only the first core PMU is probed: the loop returns on its first iteration. */
  while ((pmu = perf_pmus__scan_core(pmu)) != NULL) {
    return is_event_supported(PERF_TYPE_HARDWARE,
        PERF_COUNT_HW_CPU_CYCLES | ((__u64)pmu->type << PERF_PMU_TYPE_SHIFT));
  }
  return false;
}
We probably don't want to do this for each call of
perf_pmus__supports_extended_type so you could use a static and
pthread_once, etc.
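
A rough sketch of the caching part, assuming nothing about the final
patch: the sketch_* names are made up for illustration, and the probe is
a stub standing in for the is_event_supported() loop, so this builds
outside the perf tree. It only shows the pthread_once pattern, where the
probe runs once and later callers read the cached answer.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Counts how often the probe really ran, so the caching can be observed. */
static int sketch_probe_calls;

/* Hypothetical stand-in for the is_event_supported() probe. */
static bool sketch_probe_extended_type(void)
{
	sketch_probe_calls++;
	return false;	/* pretend the kernel rejected the extended type */
}

static pthread_once_t sketch_once = PTHREAD_ONCE_INIT;
static bool sketch_cached_result;

/* Run by pthread_once() at most once per process. */
static void sketch_init_once(void)
{
	sketch_cached_result = sketch_probe_extended_type();
}

/* Cached wrapper: safe to call from several threads. */
static bool sketch_supports_extended_type(void)
{
	int err = pthread_once(&sketch_once, sketch_init_once);

	assert(err == 0);
	(void)err;	/* keep NDEBUG builds free of unused warnings */
	return sketch_cached_result;
}
```

Calling sketch_supports_extended_type() repeatedly leaves
sketch_probe_calls at 1, which is the property wanted here: one
perf_event_open probe per perf invocation rather than one per parsed
event.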

This would mean that if this regression shows up somewhere other than ARM,
it will self heal. It will also mean that when ARM supports extended types
in the kernel, it will get the normal heterogeneous behavior.
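
For reference, a small sketch of the encoding being probed. The helper
names are hypothetical, and the shift and mask are redefined locally
(same values as PERF_PMU_TYPE_SHIFT and PERF_HW_EVENT_MASK in the uapi
header) so it builds without kernel headers. With PMU types 8 and 7 and
the cycles event (id 0), it gives the config values seen in the -vv
output quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the uapi values, to keep the sketch self-contained. */
#define SKETCH_PMU_TYPE_SHIFT	32
#define SKETCH_HW_EVENT_MASK	0xffffffffULL

/* Hypothetical helper: put the PMU type id in the upper 32 bits of config. */
static uint64_t sketch_extended_config(uint32_t pmu_type, uint64_t hw_event)
{
	/* The event id has to fit in the lower 32 bits. */
	assert((hw_event & ~SKETCH_HW_EVENT_MASK) == 0);
	return hw_event | ((uint64_t)pmu_type << SKETCH_PMU_TYPE_SHIFT);
}

/* Hypothetical helper: recover the PMU type id from an extended config. */
static uint32_t sketch_config_pmu_type(uint64_t config)
{
	return (uint32_t)(config >> SKETCH_PMU_TYPE_SHIFT);
}
```

So sketch_extended_config(8, 0) is 0x800000000 and
sketch_extended_config(7, 0) is 0x700000000, while attr.type itself
stays PERF_TYPE_HARDWARE (0). A kernel that does not decode the upper
bits treats the whole value as an unknown hardware event id, which
would be consistent with the -2 (ENOENT) errors in the log.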

Thanks,
Ian

> diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
> index a2032c1b7644..9af961105a64 100644
> --- a/tools/perf/util/pmus.c
> +++ b/tools/perf/util/pmus.c
> @@ -494,7 +494,13 @@ int perf_pmus__num_core_pmus(void)
>
>  bool perf_pmus__supports_extended_type(void)
>  {
> +#if defined(__aarch64__)
> +       // We can't use the extended type information where the PMU number
> +       // is encoded in the upper perf_event_attr::type bits. (<< 32).
> +       return false;
> +#else
>         return perf_pmus__num_core_pmus() > 1;
> +#endif
>  }
>
>  struct perf_pmu *evsel__find_pmu(const struct evsel *evsel)

* Re: Hybrid PMU issues on aarch64. was: Re: perf test failures in linux-next on s390
  2023-06-16 16:28                 ` Ian Rogers
@ 2023-06-16 16:53                   ` Arnaldo Carvalho de Melo
  2023-06-16 21:47                     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 15+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-06-16 16:53 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Thomas Richter, linux-perf-use., Sumanth Korikkar, James Clark,
	Leo Yan, Suzuki K Poulose, Mike Leach, Mark Rutland, John Garry,
	Will Deacon

Em Fri, Jun 16, 2023 at 09:28:12AM -0700, Ian Rogers escreveu:
> On Fri, Jun 16, 2023 at 7:44 AM Arnaldo Carvalho de Melo
> <acme@kernel.org> wrote:
> >
> > Em Fri, Jun 16, 2023 at 11:36:27AM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > > I noticed this on an arm64 board:
> > >
> > > > > acme@roc-rk3399-pc:~/git/perf-tools-next$ perf stat -e cycles:u,instructions:u ls
> > > > > COPYING  CREDITS  Documentation  Kbuild  Kconfig  LICENSES  MAINTAINERS  Makefile  README  arch  block  certs  crypto  drivers  fs  include  init  io_uring  ipc  kernel  lib  mm  net  perf.data  rust  samples  scripts  security  sound  tools  usr  virt
> > >
> > > > >  Performance counter stats for 'ls':
> > >
> > > > >    <not supported>      armv8_cortex_a72/cycles:u/
> > > > >    <not supported>      armv8_cortex_a53/cycles:u/
> > > > >    <not supported>      armv8_cortex_a72/instructions:u/
> > > > >    <not supported>      armv8_cortex_a53/instructions:u/
> > >
> > > > I tested on a raspberry pi and perf-tools-next is working there. I
> > > > suspect the issue here is the heterogeneous PMU. The cycles event is
> > > > converted into a perf_event_attr with type 0 and config 0. When there
> > > > are heterogeneous PMUs then we try to use the extended type to say we
> > > > want armv8_cortex_a72 and armv8_cortex_a53 cycles events. Let's say
> > > > the type number of armv8_cortex_a72 and armv8_cortex_a53 PMUs are 9
> > > > and 10 respectively. With heterogeneous encodings the type in the
> > >
> > > The numbers are 8 and 7, PERF_TYPE_HW (thus zero, thus not printed):
> > >
> > > root@roc-rk3399-pc:~# perf stat -vv -e cycles sleep 1
> > > Using CPUID 0x00000000410fd080
> > > Control descriptor is not initialized
> > > ------------------------------------------------------------
> > > perf_event_attr:
> > >   size                             136
> > >   config                           0x800000000
> > > ------------------------------------------------------------
> > > sys_perf_event_open: pid 13885  cpu -1  group_fd -1  flags 0x8
> > > sys_perf_event_open failed, error -2
> > > Warning:
> > > cycles event is not supported by the kernel.
> > > ------------------------------------------------------------
> > > perf_event_attr:
> > >   size                             136
> > >   config                           0x700000000
> > > ------------------------------------------------------------
> > > sys_perf_event_open: pid 13885  cpu -1  group_fd -1  flags 0x8
> > > sys_perf_event_open failed, error -2
> > > Warning:
> > > cycles event is not supported by the kernel.
> > > failed to read counter cycles
> > > failed to read counter cycles
> > >
> > >  Performance counter stats for 'sleep 1':
> > >
> > >    <not supported>      armv8_cortex_a72/cycles/
> > >    <not supported>      armv8_cortex_a53/cycles/
> > >
> > >        1.011406938 seconds time elapsed
> > >
> > >        0.000000000 seconds user
> > >        0.010886000 seconds sys
> > >
> > >
> > > root@roc-rk3399-pc:~#
> > >
> > > > perf_event_attr remains as 0 and the config becomes 9 << 32 and 10 <<
> > > > 32. I suspect your kernel is seeing the extended type information and
> > > > not handling it, hence the error.
> > >
> > > looks this is the case indeed
> > >
> > > > We add in the extended type for hardware and legacy cache events in
> > > > the parse events code:
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n435
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1239
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1478
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1511
> > > >
> > > > The addition of the extended type happens if
> > > > perf_pmus__supports_extended_type() returns true, its implementation
> > > > is:
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/pmus.c?h=perf-tools-next#n480
> > > > bool perf_pmus__supports_extended_type(void)
> > > > {
> > > >   return perf_pmus__num_core_pmus() > 1;
> > > > }
> > > >
> > > > Previously on heterogeneous ARM the extended type wouldn't be encoded
> > > > and I believe the event was opened on the PMU of the current CPU only.
> > >
> > > I think that is the case, haven't checked so far tho.
> > >
> > > > This is a bug because you will not count events on all PMUs. We can
> > > > make perf_pmus__supports_extended_type return false on ARM which
> > > > should bring back the previous behavior - or do some kind of dynamic
> > >
> > > simplest first step, trying it.
> >
> > Spot on:
> >
> > acme@roc-rk3399-pc:~/git/perf-tools-next$ perf stat ls
> > COPYING  CREDITS  Documentation  Kbuild  Kconfig  LICENSES  MAINTAINERS  Makefile  README  arch  block  certs  crypto  drivers  fs  include  init  io_uring  ipc  kernel  lib  mm  net  rust  samples  scripts  security  sound  tools  usr  virt
> >
> >  Performance counter stats for 'ls':
> >
> >               9.01 msec task-clock:u                     #    0.401 CPUs utilized
> >                  0      context-switches:u               #    0.000 /sec
> >                  0      cpu-migrations:u                 #    0.000 /sec
> >                 84      page-faults:u                    #    9.320 K/sec
> >            1188641      cycles:u                         #    0.132 GHz
> >             601132      instructions:u                   #    0.51  insn per cycle
> >              64768      branches:u                       #    7.186 M/sec
> >              11680      branch-misses:u                  #   18.03% of all branches
> >
> >        0.022502514 seconds time elapsed
> >
> >        0.000000000 seconds user
> >        0.022946000 seconds sys
> >
> >
> > acme@roc-rk3399-pc:~/git/perf-tools-next$
> > acme@roc-rk3399-pc:~/git/perf-tools-next$ perf record ls
> > COPYING  CREDITS  Documentation  Kbuild  Kconfig  LICENSES  MAINTAINERS  Makefile  README  arch  block  certs  crypto  drivers  fs  include  init  io_uring  ipc  kernel  lib  mm  net  perf.data  rust  samples  scripts  security  sound  tools  usr  virt
> > [ perf record: Woken up 1 times to write data ]
> > [ perf record: Captured and wrote 0.003 MB perf.data (18 samples) ]
> > acme@roc-rk3399-pc:~/git/perf-tools-next$ perf evlist
> > cycles:Pu
> > dummy:HGu
> > acme@roc-rk3399-pc:~/git/perf-tools-next$ ldd ~/bin/perf | grep asan
> >         libasan.so.6 => /lib/aarch64-linux-gnu/libasan.so.6 (0x0000ffffa5a00000)
> > acme@roc-rk3399-pc:~/git/perf-tools-next$
> >
> > With the following patch. Do you want to submit it or may I add it as is
> > using an edited discussion in this thread as the commit log message?
> >
> > Thanks!
> >
> > - Arnaldo
> 
> Hi Arnaldo,
> 
> presumably with the #ifdef you just get 1 PMU - shame. I think rather
> than do an #ifdef we can do something like call is_event_supported:
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/print-events.c?h=perf-tools-next#n232
> so:
> bool perf_pmus__supports_extended_type(void)
> {
>   struct perf_pmu *pmu = NULL;
>
>   if (perf_pmus__num_core_pmus() <= 1)
>     return false;
>   /* Probe a core PMU with an extended-type cycles event. */
>   while ((pmu = perf_pmus__scan_core(pmu)) != NULL) {
>     return is_event_supported(PERF_TYPE_HARDWARE,
>         PERF_COUNT_HW_CPU_CYCLES | ((__u64)pmu->type << PERF_PMU_TYPE_SHIFT));
>   }
>   return false;
> }
> We probably don't want to do this for each call of
> perf_pmus__supports_extended_type, so you could use a static and
> pthread_once, etc.
> 
> This would mean that if this regression is introduced on an architecture
> other than ARM, it will self-heal. It will also mean that when ARM supports
> extended types in the kernel, it will get the normal heterogeneous behavior.

That looks better, I'll try it when I get back to my office.

- Arnaldo
 
> Thanks,
> Ian
> 
> > diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
> > index a2032c1b7644..9af961105a64 100644
> > --- a/tools/perf/util/pmus.c
> > +++ b/tools/perf/util/pmus.c
> > @@ -494,7 +494,13 @@ int perf_pmus__num_core_pmus(void)
> >
> >  bool perf_pmus__supports_extended_type(void)
> >  {
> > +#if defined(__aarch64__)
> > +       // We can't use the extended type information where the PMU number
> > +       // is encoded in the upper perf_event_attr::type bits. (<< 32).
> > +       return false;
> > +#else
> >         return perf_pmus__num_core_pmus() > 1;
> > +#endif
> >  }
> >
> >  struct perf_pmu *evsel__find_pmu(const struct evsel *evsel)

-- 

- Arnaldo


* Re: Hybrid PMU issues on aarch64. was: Re: perf test failures in linux-next on s390
  2023-06-16 16:53                   ` Arnaldo Carvalho de Melo
@ 2023-06-16 21:47                     ` Arnaldo Carvalho de Melo
  2023-06-16 22:09                       ` Ian Rogers
  0 siblings, 1 reply; 15+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-06-16 21:47 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Thomas Richter, linux-perf-use., Sumanth Korikkar, James Clark,
	Leo Yan, Suzuki K Poulose, Mike Leach, Mark Rutland, John Garry,
	Will Deacon

Em Fri, Jun 16, 2023 at 01:53:41PM -0300, Arnaldo Carvalho de Melo escreveu:
> > presumably with the #ifdef you just get 1 PMU - shame. I think rather
> > than do an #ifdef we can do something like call is_event_supported:
> > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/print-events.c?h=perf-tools-next#n232
> > so:
> > bool perf_pmus__supports_extended_type(void)
> > {
> >   struct perf_pmu *pmu = NULL;
> >
> >   if (perf_pmus__num_core_pmus() <= 1)
> >     return false;
> >   /* Probe a core PMU with an extended-type cycles event. */
> >   while ((pmu = perf_pmus__scan_core(pmu)) != NULL) {
> >     return is_event_supported(PERF_TYPE_HARDWARE,
> >         PERF_COUNT_HW_CPU_CYCLES | ((__u64)pmu->type << PERF_PMU_TYPE_SHIFT));
> >   }
> >   return false;
> > }
> > We probably don't want to do this for each call of
> > perf_pmus__supports_extended_type, so you could use a static and
> > pthread_once, etc.
> > 
> > This would mean that if this regression is introduced on an architecture
> > other than ARM, it will self-heal. It will also mean that when ARM supports
> > extended types in the kernel, it will get the normal heterogeneous behavior.
> 
> That looks better, I'll try it when I get back to my office.

End result, Ack?


diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
index a2032c1b7644..d891d72c824e 100644
--- a/tools/perf/util/pmus.c
+++ b/tools/perf/util/pmus.c
@@ -4,6 +4,7 @@
 #include <subcmd/pager.h>
 #include <sys/types.h>
 #include <dirent.h>
+#include <pthread.h>
 #include <string.h>
 #include <unistd.h>
 #include "debug.h"
@@ -492,9 +493,35 @@ int perf_pmus__num_core_pmus(void)
 	return count;
 }
 
+static bool __perf_pmus__supports_extended_type(void)
+{
+	struct perf_pmu *pmu = NULL;
+
+	if (perf_pmus__num_core_pmus() <= 1)
+		return false;
+
+	while ((pmu = perf_pmus__scan_core(pmu)) != NULL) {
+		if (!is_event_supported(PERF_TYPE_HARDWARE, PERF_COUNT_HW_CPU_CYCLES | ((__u64)pmu->type << PERF_PMU_TYPE_SHIFT)))
+			return false;
+	}
+
+	return true;
+}
+
+static bool perf_pmus__do_support_extended_type;
+
+static void perf_pmus__init_supports_extended_type(void)
+{
+	perf_pmus__do_support_extended_type = __perf_pmus__supports_extended_type();
+}
+
 bool perf_pmus__supports_extended_type(void)
 {
-	return perf_pmus__num_core_pmus() > 1;
+	static pthread_once_t extended_type_once = PTHREAD_ONCE_INIT;
+
+	pthread_once(&extended_type_once, perf_pmus__init_supports_extended_type);
+
+	return perf_pmus__do_support_extended_type;
 }
 
 struct perf_pmu *evsel__find_pmu(const struct evsel *evsel)
diff --git a/tools/perf/util/print-events.c b/tools/perf/util/print-events.c
index 7a5f87392720..a7566edc86a3 100644
--- a/tools/perf/util/print-events.c
+++ b/tools/perf/util/print-events.c
@@ -229,7 +229,7 @@ void print_sdt_events(const struct print_callbacks *print_cb, void *print_state)
 	strlist__delete(sdtlist);
 }
 
-static bool is_event_supported(u8 type, u64 config)
+bool is_event_supported(u8 type, u64 config)
 {
 	bool ret = true;
 	int open_return;
diff --git a/tools/perf/util/print-events.h b/tools/perf/util/print-events.h
index e75a3d7e3fe3..d7fab411e75c 100644
--- a/tools/perf/util/print-events.h
+++ b/tools/perf/util/print-events.h
@@ -3,6 +3,7 @@
 #define __PERF_PRINT_EVENTS_H
 
 #include <linux/perf_event.h>
+#include <linux/types.h>
 #include <stdbool.h>
 
 struct event_symbol;
@@ -36,5 +37,6 @@ void print_symbol_events(const struct print_callbacks *print_cb, void *print_sta
 			 unsigned int max);
 void print_tool_events(const struct print_callbacks *print_cb, void *print_state);
 void print_tracepoint_events(const struct print_callbacks *print_cb, void *print_state);
+bool is_event_supported(u8 type, u64 config);
 
 #endif /* __PERF_PRINT_EVENTS_H */


* Re: Hybrid PMU issues on aarch64. was: Re: perf test failures in linux-next on s390
  2023-06-16 21:47                     ` Arnaldo Carvalho de Melo
@ 2023-06-16 22:09                       ` Ian Rogers
  0 siblings, 0 replies; 15+ messages in thread
From: Ian Rogers @ 2023-06-16 22:09 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Thomas Richter, linux-perf-use., Sumanth Korikkar, James Clark,
	Leo Yan, Suzuki K Poulose, Mike Leach, Mark Rutland, John Garry,
	Will Deacon

On Fri, Jun 16, 2023 at 2:47 PM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> Em Fri, Jun 16, 2023 at 01:53:41PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > presumably with the #ifdef you just get 1 PMU - shame. I think rather
> > > than do an #ifdef we can do something like call is_event_supported:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/print-events.c?h=perf-tools-next#n232
> > > so:
> > > > bool perf_pmus__supports_extended_type(void)
> > > > {
> > > >   struct perf_pmu *pmu = NULL;
> > > >
> > > >   if (perf_pmus__num_core_pmus() <= 1)
> > > >     return false;
> > > >   /* Probe a core PMU with an extended-type cycles event. */
> > > >   while ((pmu = perf_pmus__scan_core(pmu)) != NULL) {
> > > >     return is_event_supported(PERF_TYPE_HARDWARE,
> > > >         PERF_COUNT_HW_CPU_CYCLES | ((__u64)pmu->type << PERF_PMU_TYPE_SHIFT));
> > > >   }
> > > >   return false;
> > > > }
> > > > We probably don't want to do this for each call of
> > > > perf_pmus__supports_extended_type, so you could use a static and
> > > > pthread_once, etc.
> > > >
> > > > This would mean that if this regression is introduced on an architecture
> > > > other than ARM, it will self-heal. It will also mean that when ARM supports
> > > > extended types in the kernel, it will get the normal heterogeneous behavior.
> >
> > That looks better, I'll try it when I get back to my office.
>
> End result, Ack?
>

Acked-by: Ian Rogers <irogers@google.com>

Thanks,
Ian

> diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
> index a2032c1b7644..d891d72c824e 100644
> --- a/tools/perf/util/pmus.c
> +++ b/tools/perf/util/pmus.c
> @@ -4,6 +4,7 @@
>  #include <subcmd/pager.h>
>  #include <sys/types.h>
>  #include <dirent.h>
> +#include <pthread.h>
>  #include <string.h>
>  #include <unistd.h>
>  #include "debug.h"
> @@ -492,9 +493,35 @@ int perf_pmus__num_core_pmus(void)
>         return count;
>  }
>
> +static bool __perf_pmus__supports_extended_type(void)
> +{
> +       struct perf_pmu *pmu = NULL;
> +
> +       if (perf_pmus__num_core_pmus() <= 1)
> +               return false;
> +
> +       while ((pmu = perf_pmus__scan_core(pmu)) != NULL) {
> +               if (!is_event_supported(PERF_TYPE_HARDWARE, PERF_COUNT_HW_CPU_CYCLES | ((__u64)pmu->type << PERF_PMU_TYPE_SHIFT)))
> +                       return false;
> +       }
> +
> +       return true;
> +}
> +
> +static bool perf_pmus__do_support_extended_type;
> +
> +static void perf_pmus__init_supports_extended_type(void)
> +{
> +       perf_pmus__do_support_extended_type = __perf_pmus__supports_extended_type();
> +}
> +
>  bool perf_pmus__supports_extended_type(void)
>  {
> -       return perf_pmus__num_core_pmus() > 1;
> +       static pthread_once_t extended_type_once = PTHREAD_ONCE_INIT;
> +
> +       pthread_once(&extended_type_once, perf_pmus__init_supports_extended_type);
> +
> +       return perf_pmus__do_support_extended_type;
>  }
>
>  struct perf_pmu *evsel__find_pmu(const struct evsel *evsel)
> diff --git a/tools/perf/util/print-events.c b/tools/perf/util/print-events.c
> index 7a5f87392720..a7566edc86a3 100644
> --- a/tools/perf/util/print-events.c
> +++ b/tools/perf/util/print-events.c
> @@ -229,7 +229,7 @@ void print_sdt_events(const struct print_callbacks *print_cb, void *print_state)
>         strlist__delete(sdtlist);
>  }
>
> -static bool is_event_supported(u8 type, u64 config)
> +bool is_event_supported(u8 type, u64 config)
>  {
>         bool ret = true;
>         int open_return;
> diff --git a/tools/perf/util/print-events.h b/tools/perf/util/print-events.h
> index e75a3d7e3fe3..d7fab411e75c 100644
> --- a/tools/perf/util/print-events.h
> +++ b/tools/perf/util/print-events.h
> @@ -3,6 +3,7 @@
>  #define __PERF_PRINT_EVENTS_H
>
>  #include <linux/perf_event.h>
> +#include <linux/types.h>
>  #include <stdbool.h>
>
>  struct event_symbol;
> @@ -36,5 +37,6 @@ void print_symbol_events(const struct print_callbacks *print_cb, void *print_sta
>                          unsigned int max);
>  void print_tool_events(const struct print_callbacks *print_cb, void *print_state);
>  void print_tracepoint_events(const struct print_callbacks *print_cb, void *print_state);
> +bool is_event_supported(u8 type, u64 config);
>
>  #endif /* __PERF_PRINT_EVENTS_H */


* Re: Hybrid PMU issues on aarch64. was: Re: perf test failures in linux-next on s390
  2023-06-16 14:36             ` Hybrid PMU issues on aarch64. was: " Arnaldo Carvalho de Melo
  2023-06-16 14:44               ` Arnaldo Carvalho de Melo
@ 2023-06-19 10:04               ` Thomas Richter
  1 sibling, 0 replies; 15+ messages in thread
From: Thomas Richter @ 2023-06-19 10:04 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ian Rogers
  Cc: linux-perf-use., Sumanth Korikkar, James Clark, Leo Yan,
	Suzuki K Poulose, Mike Leach, Mark Rutland, John Garry,
	Will Deacon

On 6/16/23 16:36, Arnaldo Carvalho de Melo wrote:
> Em Fri, Jun 16, 2023 at 07:23:30AM -0700, Ian Rogers escreveu:
>> On Thu, Jun 15, 2023 at 7:35 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>>> Ccing the ARM people too:
>>> Em Thu, Jun 15, 2023 at 11:39:16AM +0200, Thomas Richter escreveu:
>>>> On 6/14/23 16:57, Ian Rogers wrote:
>>>>> On Wed, Jun 14, 2023 at 1:32 AM Thomas Richter <tmricht@linux.ibm.com> wrote:
>>>>> bool is_pmu_core(const char *name)
>>>>> {
>>>>> return !strcmp(name, "cpu") || is_sysfs_pmu_core(name);
>>>>> }
> 
>>>> Maybe we should scan the directory
> 
>>>> [linux-next]# ll /sys/bus/event_source/devices
>>>> total 0
>>>> lrwxrwxrwx 1 root root 0 Jun  2 15:11 cpum_cf -> ../../../devices/cpum_cf
>>>> lrwxrwxrwx 1 root root 0 Jun  2 15:11 cpum_cf_diag -> ../../../devices/cpum_cf_diag
>>>> lrwxrwxrwx 1 root root 0 Jun  2 15:11 cpum_sf -> ../../../devices/cpum_sf
>>>> lrwxrwxrwx 1 root root 0 Jun  2 15:11 kprobe -> ../../../devices/kprobe
>>>> lrwxrwxrwx 1 root root 0 Jun  2 15:11 software -> ../../../devices/software
>>>> lrwxrwxrwx 1 root root 0 Jun  2 15:11 tracepoint -> ../../../devices/tracepoint
>>>> lrwxrwxrwx 1 root root 0 Jun  2 15:11 uprobe -> ../../../devices/uprobe
>>>> [linux-next]#
> 
>>>> This directory lists the PMUs available on s390, maybe this is true for
>>>> other platform...
> 
>>> I noticed this on an arm64 board:
> 
>>> acme@roc-rk3399-pc:~/git/perf-tools-next$ perf stat -e cycles:u,instructions:u ls
>>> COPYING  CREDITS  Documentation  Kbuild  Kconfig  LICENSES  MAINTAINERS  Makefile  README  arch  block  certs  crypto  drivers  fs  include  init  io_uring  ipc  kernel  lib  mm  net  perf.data  rust  samples  scripts  security  sound  tools  usr  virt
> 
>>>  Performance counter stats for 'ls':
> 
>>>    <not supported>      armv8_cortex_a72/cycles:u/
>>>    <not supported>      armv8_cortex_a53/cycles:u/
>>>    <not supported>      armv8_cortex_a72/instructions:u/
>>>    <not supported>      armv8_cortex_a53/instructions:u/
> 
>> I tested on a raspberry pi and perf-tools-next is working there. I
>> suspect the issue here is the heterogeneous PMU. The cycles event is
>> converted into a perf_event_attr with type 0 and config 0. When there
>> are heterogeneous PMUs then we try to use the extended type to say we
>> want armv8_cortex_a72 and armv8_cortex_a53 cycles events. Let's say
>> the type number of armv8_cortex_a72 and armv8_cortex_a53 PMUs are 9
>> and 10 respectively. With heterogeneous encodings the type in the
> 
> The numbers are 8 and 7, and the type is PERF_TYPE_HARDWARE (zero, thus not printed):
> 
> root@roc-rk3399-pc:~# perf stat -vv -e cycles sleep 1
> Using CPUID 0x00000000410fd080
> Control descriptor is not initialized
> ------------------------------------------------------------
> perf_event_attr:
>   size                             136
>   config                           0x800000000
>   sample_type                      IDENTIFIER
>   read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>   disabled                         1
>   inherit                          1
>   enable_on_exec                   1
>   exclude_guest                    1
> ------------------------------------------------------------
> sys_perf_event_open: pid 13885  cpu -1  group_fd -1  flags 0x8
> sys_perf_event_open failed, error -2
> Warning:

On s390, with the above patch applied on top of this morning's git pull of
linux-next, I get this result:

# ./perf test -F 6
  6: Parse event definition strings                                  :
  6.1: Test event parsing                                            : Ok
  6.2: Parsing of all PMU events from sysfs                          : Ok
  6.3: Parsing of given PMU events from sysfs                        : Ok
  6.4: Parsing of aliased events from sysfs                          : Skip (no aliases in sysfs)
  6.5: Parsing of aliased events                                     : Ok
  6.6: Parsing of terms (event modifiers)                            : Ok
#

However, the perf_event_attr::config member does not change,
as can be seen in this trace:
# ./perf stat -e cycles -vv true
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
  size                             136
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 6510  cpu -1  group_fd -1  flags 0x8 = 3
cycles: -1: 2646065 510719 510719
cycles: 2646065 510719 510719

 Performance counter stats for 'true':

         2,646,065      cycles                                                                

       0.002084266 seconds time elapsed

       0.000052000 seconds user
       0.002107000 seconds sys


#

Thanks for fixing this...

-- 
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
--
Vorsitzender des Aufsichtsrats: Gregor Pillen
Geschäftsführung: David Faller
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294



end of thread, other threads:[~2023-06-19 10:04 UTC | newest]

Thread overview: 15+ messages
2023-06-13 12:54 perf test failures in linux-next on s390 Thomas Richter
2023-06-13 14:32 ` Ian Rogers
2023-06-14  8:31   ` Thomas Richter
2023-06-14 14:57     ` Ian Rogers
2023-06-15  8:57       ` Thomas Richter
2023-06-15  9:39       ` Thomas Richter
2023-06-15 14:34         ` Arnaldo Carvalho de Melo
2023-06-16 14:23           ` Ian Rogers
2023-06-16 14:36             ` Hybrid PMU issues on aarch64. was: " Arnaldo Carvalho de Melo
2023-06-16 14:44               ` Arnaldo Carvalho de Melo
2023-06-16 16:28                 ` Ian Rogers
2023-06-16 16:53                   ` Arnaldo Carvalho de Melo
2023-06-16 21:47                     ` Arnaldo Carvalho de Melo
2023-06-16 22:09                       ` Ian Rogers
2023-06-19 10:04               ` Thomas Richter
